This patch has the same motivating example as D48466, but I'm trying to kill the setcc bool math sooner rather than later.
By matching a larger pattern that includes both the low-bit mask and the trailing add/sub, we can create a universally good fold because we always eliminate the condition code intermediate value.
Here are Alive proofs for these (currently instcombine folds the 'add' variants, but misses the 'sub' patterns):
https://rise4fun.com/Alive/Gsyp
  Name: sub of zext cmp mask
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %z = zext i1 %c to i32
  %r = sub i32 C1, %z
  =>
  %optional_cast = zext i8 %a to i32
  %r = add i32 %optional_cast, C1-1

  Name: add of zext cmp mask
  %a = and i32 %x, 1
  %c = icmp eq i32 %a, 0
  %z = zext i1 %c to i8
  %r = add i8 %z, C1
  =>
  %optional_cast = trunc i32 %a to i8
  %r = sub i8 C1+1, %optional_cast
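As an informal sanity check of the two arithmetic folds (not a substitute for the Alive proofs), here is a minimal Python sketch that models the IR with wrap-around integer arithmetic. The function names and the sampled edge values are my own assumptions for illustration; nothing here comes from the patch itself.

```python
def mask(bits):
    """All-ones value for an integer type of the given width."""
    return (1 << bits) - 1

def sub_of_zext_cmp_mask(x, c1, bits):
    """Original pattern: C1 - zext((x & 1) == 0), wrapped to `bits` bits."""
    z = 1 if (x & 1) == 0 else 0
    return (c1 - z) & mask(bits)

def sub_folded(x, c1, bits):
    """Folded pattern: zext(x & 1) + (C1 - 1), wrapped to `bits` bits."""
    return ((x & 1) + (c1 - 1)) & mask(bits)

def add_of_zext_cmp_mask(x, c1, bits):
    """Original pattern: zext((x & 1) == 0) + C1, wrapped to `bits` bits."""
    z = 1 if (x & 1) == 0 else 0
    return (z + c1) & mask(bits)

def add_folded(x, c1, bits):
    """Folded pattern: (C1 + 1) - cast(x & 1), wrapped to `bits` bits."""
    return ((c1 + 1) - (x & 1)) & mask(bits)

def check_arith_folds():
    # 'sub' variant: i8 source, i32 result. All x, sampled edge-case constants.
    edge_c1 = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
    for x in range(256):
        for c1 in edge_c1:
            assert sub_of_zext_cmp_mask(x, c1, 32) == sub_folded(x, c1, 32)
    # 'add' variant: i32 source, i8 result. Sampled x, all constants.
    edge_x = [0, 1, 2, 3, 0x7FFFFFFE, 0xFFFFFFFF]
    for x in edge_x:
        for c1 in range(256):
            assert add_of_zext_cmp_mask(x, c1, 8) == add_folded(x, c1, 8)
    return True
```

Calling `check_arith_folds()` runs both comparisons. Only the low bit of `x` matters, so sampling a handful of wide values for the 'add' variant loses no coverage of the mask itself.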
All of the test diffs look like improvements or neutral to me, but it is possible that the x86 test+set+bitop sequence is better than what the updated tests now show. I suspect we could do better by adding another fold for the 'sub' variants in particular.
We start with select-of-constant in IR in the larger motivating test, so that's why I included tests with selects. Proofs for those variants:
https://rise4fun.com/Alive/Bx1
  Name: true const is bigger
  Pre: C2 == (C1 + 1)
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %r = select i1 %c, i64 C2, i64 C1
  =>
  %z = zext i8 %a to i64
  %r = sub i64 C2, %z

  Name: false const is bigger
  Pre: C2 == (C1 + 1)
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %r = select i1 %c, i64 C1, i64 C2
  =>
  %z = zext i8 %a to i64
  %r = add i64 C1, %z
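The select variants can be spot-checked the same way. The sketch below is only an illustration under my own assumptions (the function names and sampled constants are hypothetical). It models i64 wrap-around and enforces the precondition C2 == C1 + 1 by deriving C2 from C1.

```python
MASK64 = (1 << 64) - 1  # all-ones value for i64

def select_true_bigger(x, c1):
    """Original: select ((x & 1) == 0), C2, C1 with C2 == C1 + 1."""
    c2 = (c1 + 1) & MASK64
    return c2 if (x & 1) == 0 else c1

def folded_true_bigger(x, c1):
    """Folded: C2 - zext(x & 1)."""
    c2 = (c1 + 1) & MASK64
    return (c2 - (x & 1)) & MASK64

def select_false_bigger(x, c1):
    """Original: select ((x & 1) == 0), C1, C2 with C2 == C1 + 1."""
    c2 = (c1 + 1) & MASK64
    return c1 if (x & 1) == 0 else c2

def folded_false_bigger(x, c1):
    """Folded: C1 + zext(x & 1)."""
    return (c1 + (x & 1)) & MASK64

def check_select_folds():
    # Sampled constants, including the value where C1 + 1 wraps to zero.
    edge_c1 = [0, 1, 41, (1 << 63) - 1, 1 << 63, MASK64]
    for x in range(256):  # every i8 input
        for c1 in edge_c1:
            assert select_true_bigger(x, c1) == folded_true_bigger(x, c1)
            assert select_false_bigger(x, c1) == folded_false_bigger(x, c1)
    return True
```

The wrap-around case (C1 is all ones, so C2 is 0) is included on purpose, since that is where a select-to-arithmetic rewrite would be most likely to go wrong.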
I have not stepped through the PPC tests to see how the 3 unchanged select tests escaped any diffs. It's possible that those are not initially folded the same way as on x86, or that something later reverses what we're doing here.
Why not "getValueType() == MVT::i1"?