Fixes PR33364
When we fold vector constants that are operands of phi's that feed into select,
we need to set the correct insertion point for the *new* selects that get generated.
The correct insertion point is the incoming block for the phi.
Such cases can occur with patch rL298845, which fixed folding of
vector constants, but the new selects could be inserted incorrectly (as the added
test case shows).
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
The code change looks right, but the test is not minimized, and it's also not checking the more complicated case of a phi with >2 incoming values.
Please double-check my understanding of the bug - is this a better test for the transform and the fix?
define <2 x i8> @put_selects_in_the_right_blocks(i1 %cond1, i1 %cond2, <2 x i8> %x, <2 x i8> %y, <2 x i8> %z) { entry: br i1 %cond1, label %if1, label %else if1: br i1 %cond2, label %if2, label %else if2: br label %else else: %phi = phi <2 x i8> [ %y, %if2 ], [ <i8 8, i8 -1>, %entry ], [ <i8 4, i8 1>, %if1 ] %cmp = icmp slt <2 x i8> %phi, <i8 7, i8 0> %sel = select <2 x i1> %cmp, <2 x i8> %x, <2 x i8> %z ret <2 x i8> %sel }
I don't think we even need the cmp:
define <2 x i8> @pull_vector_select_constants_through_phi(i1 %cond1, i1 %cond2, <2 x i1> %x, <2 x i8> %y, <2 x i8> %z) { entry: br i1 %cond1, label %if1, label %else if1: br i1 %cond2, label %if2, label %else if2: br label %else else: %phi = phi <2 x i1> [ %x, %if2 ], [ <i1 0, i1 1>, %entry ], [ <i1 1, i1 0>, %if1 ] %sel = select <2 x i1> %phi, <2 x i8> %y, <2 x i8> %z ret <2 x i8> %sel }
This test actually raises a question about this transform independent of the bug. The transform can increase the number of IR instructions - is that expected?
Thanks @spatel , this test showcases the bug as well (also, there is an increase in IR, which is expected because we now have non-constant phi operands). Regarding the transform itself, I *think* the increase in IR can be avoided for non-constant vectors, i.e. we should handle folding of PhiOperands only for constant vectors.
Until rL298845, there was incorrect code gen for the vector constants being handled without a select. The transform was originally written such that the phi node has exactly one non-constant operand. However, this check was different in the bailout condition versus the actual IR transform. When computing the single non-constant block which feeds to the phi, we check for a *constant*. When transforming the IR, we look for *constant int*. This difference contributes the increase in IR. I'm guessing the increase was not expected, since the code was originally miscompiling for vector constants.
Should we go ahead with this fix, and come up with cost model for vector constants?
Sure - definitely kill the bug first. If you minimize the test and/or add the version that I came up with, this patch should be good.
Updated with a better test case from spatel.
Added comments to clarify that there maybe an increase in IR.