This extends the PR50183/D105901 "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" fold to account if the inner Ptr was a base gep, allowing us to merge the geps and select between the base index and the offset'd index.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | ||
---|---|---|
2950–2953 |
... and while we clearly don't have that fold (https://godbolt.org/z/56asoz3ch),
i realize it wouldn't help here, since %gep1 has extra use in select.
llvm/test/Transforms/InstCombine/select-gep.ll | ||
---|---|---|
132 | Please add test where %gep1 has other uses (e.g. call void @use(i32* %gep1)) |
I think Roman is onto something here. And maybe I'm missing something, but I don't see the extra use he's talking about in the *optimized* IR.
Here's the current output of test2c:
; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]]
; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]]
; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 0, i64 6
; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]]
; CHECK-NEXT: ret i32* [[SEL]]
Shouldn't a (gep (gep p, idx1), idx2) -> gep p, (idx1+idx2) rule catch this case just fine? The only real downside I see is that we loose the inbounds on the first gep.
llvm/test/Transforms/InstCombine/select-gep.ll | ||
---|---|---|
111–112 | Roman, I think you're looking at the wrong IR. The interesting IR is the result of running the current transforms, not the input to exercise the current transform. |
I'm going to take a look at folding nested geps in another parallel patch - I agree there's a lot of overlap in the transforms but hopefully we'll end up with similar canonicalizations - I'm hoping the the multi-use cases might not be a major problem.
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | ||
---|---|---|
2958 | Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overlap? |
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | ||
---|---|---|
2958 | [EDIT] Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overflow? |
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | ||
---|---|---|
2958 | Cheers |
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | ||
---|---|---|
2958 | Two more bits:
|
Rebase this?
Did that other patch handle everything here?
What about more complex patterns with more than to indexes?
Yes the other patch handled these cases - I'm going to start investigating how to perform better select-of-geps with more than 2 operands (in foldSelectOpOp), but nested-geps need to be addressed as well.