I noticed the base case in D24527, and that is hopefully simple enough. Please let me know if I've thought about the nsw/nuw cases correctly.
Details
Diff Detail
Event Timeline
Comment Actions
Ping.
Minor update:
Not sure if the use of 'auto' was welcome here, so I changed it to 'BinaryOperator *Add'.
Comment Actions
Patch updated as suggested by Eli:
Favor the simpler zext+add sequence over keeping an 'nuw' flag.
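For anyone following along, here is a minimal sketch of why the zext+add form cannot simply inherit 'nuw'. It assumes the fold under review is sub X, (sext i1 B) -> add X, (zext i1 B), and it models i8 arithmetic exhaustively in Python. The helper names are made up for illustration; this is not the InstCombine code.

```python
BITS = 8                      # model i8 so every input can be enumerated
MASK = (1 << BITS) - 1
SMIN = -(1 << (BITS - 1))
SMAX = (1 << (BITS - 1)) - 1


def sext_i1(b):
    """sext i1 -> iN: true becomes all ones (-1), false becomes 0."""
    return MASK if b else 0


def zext_i1(b):
    """zext i1 -> iN: true becomes 1, false becomes 0."""
    return 1 if b else 0


def to_signed(v):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return v - (1 << BITS) if v > SMAX else v


def sub_no_unsigned_wrap(x, y):
    return x >= y


def add_no_unsigned_wrap(x, y):
    return x + y <= MASK


def sub_no_signed_wrap(x, y):
    return SMIN <= to_signed(x) - to_signed(y) <= SMAX


def add_no_signed_wrap(x, y):
    return SMIN <= to_signed(x) + to_signed(y) <= SMAX


def check_all():
    """Compare 'sub x, sext(b)' with 'add x, zext(b)' for every i8 x and i1 b."""
    same_value = True
    nsw_carries_over = True
    nuw_counterexamples = []
    for x in range(1 << BITS):
        for b in (False, True):
            s, z = sext_i1(b), zext_i1(b)
            if (x - s) & MASK != (x + z) & MASK:
                same_value = False
            # A flag may be kept only if "sub does not wrap" implies "add does not wrap".
            if sub_no_signed_wrap(x, s) and not add_no_signed_wrap(x, z):
                nsw_carries_over = False
            if sub_no_unsigned_wrap(x, s) and not add_no_unsigned_wrap(x, z):
                nuw_counterexamples.append((x, b))
    return same_value, nsw_carries_over, nuw_counterexamples


print(check_all())
```

In this model the wrapped results always agree and 'nsw' carries over, but 'nuw' does not: with x = 255 and B true, the sub is exact while the add wraps. So the add form has to drop 'nuw'.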
Comment Actions
Please make sure we have a test to check that we can correctly reverse the transform in SelectionDAG when we need to (for example, make sure we don't generate an unnecessary AND for a vector compare on x86).
Otherwise LGTM.
Comment Actions
How did you see that one coming? :)
Indeed, the x86 backend doesn't reverse the transform, so the generated code gets worse. This might be the same as the fixes needed before D25485 can proceed...
define <4 x i32> @foo(<4 x i32> %x, <4 x i32> %y, <4 x i32> %z) {
  %cmp = icmp eq <4 x i32> %x, %y
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %sub = sub <4 x i32> %z, %sext
  ret <4 x i32> %sub
}
We have this:
vpcmpeqd %xmm1, %xmm0, %xmm0
vpsubd %xmm0, %xmm2, %xmm0
But with this patch, we get this:
vpcmpeqd %xmm1, %xmm0, %xmm0
vpsrld $31, %xmm0, %xmm0
vpaddd %xmm2, %xmm0, %xmm0
I would favor dropping the nuw flag over keeping IR in a form that isn't canonical.
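As a sanity check on the two x86 sequences above, here is a rough lane-by-lane model in Python. It treats each xmm register as a list of four 32-bit lanes. The function names are illustrative stand-ins for the instructions, not real intrinsics, and the operand order follows my reading of the AT&T syntax (destination last).

```python
LANE_MASK = 0xFFFFFFFF  # each lane is an unsigned 32-bit value


def cmpeq_lanes(a, b):
    """Models vpcmpeqd: all ones where the lanes are equal, zero elsewhere."""
    return [LANE_MASK if x == y else 0 for x, y in zip(a, b)]


def sub_lanes(a, b):
    """Models vpsubd: wrapping 32-bit a - b per lane."""
    return [(x - y) & LANE_MASK for x, y in zip(a, b)]


def add_lanes(a, b):
    """Models vpaddd: wrapping 32-bit a + b per lane."""
    return [(x + y) & LANE_MASK for x, y in zip(a, b)]


def srl_lanes(a, count):
    """Models vpsrld: logical right shift of each lane."""
    return [x >> count for x in a]


def before_patch(x, y, z):
    # vpcmpeqd + vpsubd: subtract the all-ones mask (that is, -1) from z.
    return sub_lanes(z, cmpeq_lanes(x, y))


def after_patch(x, y, z):
    # vpcmpeqd + vpsrld $31 + vpaddd: turn the mask into 0/1, then add it to z.
    return add_lanes(z, srl_lanes(cmpeq_lanes(x, y), 31))


x = [1, 2, 3, 4]
y = [1, 0, 3, 0]
z = [10, 20, LANE_MASK, 0]
print(before_patch(x, y, z))
print(after_patch(x, y, z))
```

Both functions return the same lanes, so the extra vpsrld only converts the compare mask from -1 to 1 before the add. Reversing the transform in the backend would let the mask be consumed directly by the subtract again.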