This both saves an explicit comparison and avoids the use of xadd,
which introduces register constraints and other challenges to the
generated code.
The motivating case is from atomic reference counts where 1 is the
sentinel rather than 0 for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag; however, the
existing x86 code only handled the 0 case.
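
As a rough illustration of the kind of source pattern meant here, the
sketch below shows a reference count whose release path compares
against 1. It is one plausible reading, not taken from the patch: the
RefCounted type and its member names are hypothetical, and it assumes
the comparison is made on the value returned by fetch_sub.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical intrusive reference count; all names are illustrative only.
struct RefCounted {
  std::atomic<std::int64_t> refs{1};

  void retain() { refs.fetch_add(1, std::memory_order_relaxed); }

  // Returns true when the caller dropped the last reference.
  // fetch_sub returns the value held *before* the decrement, so the
  // comparison is against 1 rather than 0.
  bool release() {
    return refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }
};
```

The old value equals 1 exactly when the decremented value is 0, so the
flags produced by a locked decrement already carry the answer and no
xadd plus separate compare is needed in principle.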
There remain some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these, with
FIXMEs. However, I don't currently have any production use cases for
them, and they seem substantially harder to address.