This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold (select C, (gep (gep Ptr, Idx0), Idx1), (gep Ptr, Idx0)) -> (gep Ptr,(select C, Idx0+Idx1, Idx0)) (PR51069)
AbandonedPublic

Authored by RKSimon on Jul 20 2021, 3:42 AM.

Details

Summary

This extends the PR50183/D105901 "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" fold to account if the inner Ptr was a base gep, allowing us to merge the geps and select between the base index and the offset'd index.

Diff Detail

Event Timeline

RKSimon created this revision.Jul 20 2021, 3:42 AM
RKSimon requested review of this revision.Jul 20 2021, 3:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2021, 3:42 AM

I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2950–2953

I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.

... and while we clearly don't have that fold (https://godbolt.org/z/56asoz3ch),
i realize it wouldn't help here, since %gep1 has extra use in select.

llvm/test/Transforms/InstCombine/select-gep.ll
132

Please add test where %gep1 has other uses (e.g. call void @use(i32* %gep1))

I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.

... and while we clearly don't have that fold (https://godbolt.org/z/56asoz3ch),
i realize it wouldn't help here, since %gep1 has extra use in select.

I think Roman is onto something here. And maybe I'm missing something, but I don't see the extra use he's talking about in the *optimized* IR.

Here's the current output of test2c:
; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]]
; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]]
; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 0, i64 6
; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]]
; CHECK-NEXT: ret i32* [[SEL]]

Shouldn't a (gep (gep p, idx1), idx2) -> gep p, (idx1+idx2) rule catch this case just fine? The only real downside I see is that we loose the inbounds on the first gep.

lebedev.ri added inline comments.Jul 20 2021, 10:20 AM
llvm/test/Transforms/InstCombine/select-gep.ll
111–112

@reames if we look at this test, clearly we can't fold %gep2 into %p + (%x + 6),
because that results in two instructions, but %gep1 sticks around since it's used in select,
so we can't actually do this in instcombine.

reames added inline comments.Jul 20 2021, 10:24 AM
llvm/test/Transforms/InstCombine/select-gep.ll
111–112

Roman, I think you're looking at the wrong IR. The interesting IR is the result of running the current transforms, not the input to exercise the current transform.

I'm going to take a look at folding nested geps in another parallel patch - I agree there's a lot of overlap in the transforms but hopefully we'll end up with similar canonicalizations - I'm hoping the the multi-use cases might not be a major problem.

RKSimon added inline comments.Jul 21 2021, 4:41 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958

Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overlap?

RKSimon added inline comments.Jul 21 2021, 5:01 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958

[EDIT] Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overflow?

lebedev.ri added inline comments.Jul 21 2021, 5:17 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958
RKSimon added inline comments.Jul 21 2021, 5:23 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958

Cheers

lebedev.ri added inline comments.Jul 21 2021, 5:48 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958

Two more bits:

  • inbounds intersect
  • add is never nuw
  • add is nsw iff GEP ends up being inbounds

Rebase this?
Did that other patch handle everything here?
What about more complex patterns with more than to indexes?

RKSimon abandoned this revision.Jul 22 2021, 7:04 AM

Rebase this?
Did that other patch handle everything here?
What about more complex patterns with more than to indexes?

Yes the other patch handled these cases - I'm going to start investigating how to perform better select-of-geps with more than 2 operands (in foldSelectOpOp), but nested-geps need to be addressed as well.