Friday, February 14, 2020

The Cashaddr spec claims it can always detect six errored characters. This is incorrect, it cannot. Be careful with anything that claims to correct errors, this mistake makes such tools more dangerous.

Cashaddr is BCH's clone of BIP173.

It is modified to handle detecting additional errors at the expense of making addresses longer. Unfortunately, less care was apparently taken in the selection and validation of the parameters than BIP173. Particularly, the specification claims that "It ensures the detection of up to 6 errors in the address" but this property is not actually achieved for any supported address size.

I hadn't bothered to comment on this previously because handling 5 errors is plenty-- and any sequence of events likely to produce 6 would probably also be fairly likely to produce more in any case. As it was, this mistake wasn't of practical consequence and is just sort of the thing you'd expect from hasty changes "turn it up to 11", but otherwise harmless.

I also think it's unlikely that parameters exist which achieve the claimed behaviour without making the addresses even longer yet-- which I really doubt would be a good trade-off.

However, recently people have posted software to automatically correct errors made in typed addresses. Correction interacts poorly with mistaken beliefs about the error detection ability of the format.

BIP173 specifically recommends against correcting errors because doing so destroys the error detection safety: "An unfortunate side effect of error correction is that it erodes error detection: correction changes invalid inputs into valid inputs, but if more than a few errors were made then the valid input may not be the correct input. Use of an incorrect but valid input can cause funds to be lost irrecoverably. Because of this, implementations SHOULD NOT implement correction beyond potentially suggesting to the user where in the string an error might be found, without suggesting the correction to make."

The cashaddr spec dutifully gives a confusing paraphrase of this warning: "BCH codes allows for error correction. However, it is strongly advised that error correction is not done in an automatic manner as it may cause funds to be lost irrecoverably if done incorrectly. It may however be used to hint a user at a possible error." This text eliminates the justification and instead is easily read as expressing the concern that the correcting software might be written incorrectly rather than the concern being a fundamental trade-off between correction and detection strength.

Essentially every corrected error removes two characters from the detection ability. So if the code actually guaranteed the detection of 6 errors it could correct three errors unambiguously (though if there were more errors it would be undetected and your funds would be destroyed), but cashaddr does not actually have that property though the spec claims it does. Similarly, if some software corrected only two errors it would still have two errors worth of detection remaining-- but it doesn't because the code cannot guarantee the detection of 6 errors.

I'm unsure how this mistake was made. Validating the error handling capacity of these codes is tricky and computationally expensive. If the search were implemented completely naively it would have to evaluate more than 252 candidate error patterns to check for up to 6 errors in length 42 for each candidate set of parameters that were evaluated. To create BIP173 we had to invent a multitude of novel algebraic optimizations to make the search tractable. Deadalnix claims, however, that our published software wasn't used to create cashaddr. I asked deadalnix for the software used to derive cashaddr, but I guess he lost interest in the discussion.

In any case, this flaw isn't a big deal unless you start correcting errors. I recommend taking BIP173's advice and limit any error 'correction' to just hinting to users characters they should double check.

Cheers,


No comments:

Post a Comment