Limit this transform to a basic block, so we don't cause chaos.
Hopefully, this fixes the remaining failures in PR25999:
https://llvm.org/bugs/show_bug.cgi?id=25999
Details
Details
- Reviewers
ab
Diff Detail
Diff Detail
Event Timeline
Comment Actions
Looks reasonable; fixes 444.namd/ComputeNonbondedUtil.c for me; so far no other problems in SPEC.
LGTM.
test/Transforms/InstCombine/insert-extract-shuffle.ll | ||
---|---|---|
130 | %c2 isn't actually used, sorry about that. |
Comment Actions
After thinking about it some more, this doesn't work either.
It will fail if the vector that we're extracting from is defined in the same block:
define <4 x double> @t(i1 %c, <2 x double> %a, <4 x double> %b) { bb1: br i1 %c, label %bb2, label %bb3 bb2: %r = call <2 x double> @dummy(<2 x double> %a) br label %bb3 bb3: %tmp1 = phi <2 x double> [ %a, %bb1 ], [ %r, %bb2 ] %tmp2 = phi <4 x double> [ %b, %bb1 ], [ zeroinitializer, %bb2 ] %d = fadd <2 x double> %tmp1, %tmp1 %tmp3 = extractelement <2 x double> %d, i32 0 %tmp4 = insertelement <4 x double> %tmp2, double %tmp3, i32 2 ret <4 x double> %tmp4 } declare <2 x double> @dummy(<2 x double>)
We need a way to insert the shuffle at the later of a safe insertion point and the define of the vector that we're extracting from, but I'm not sure how to code that. This sort of thing must exist somewhere else in LLVM?
Comment Actions
Patch updated:
I don't see any existing APIs to do what we want, so just check for the PHI case explicitly.
Also, add another test case.
%c2 isn't actually used, sorry about that.