This is an archive of the discontinued LLVM Phabricator instance.

Differential D106352

[InstCombine] Fold (select C, (gep (gep Ptr, Idx0), Idx1), (gep Ptr, Idx0)) -> (gep Ptr,(select C, Idx0+Idx1, Idx0)) (PR51069)
Abandoned · Public

Authored by RKSimon on Jul 20 2021, 3:42 AM.

Details

Reviewers

spatel
reames
lebedev.ri

Summary

This extends the PR50183/D105901 "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" fold to handle the case where the inner Ptr is itself a base gep, allowing us to merge the geps and select between the base index and the offset index.
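
As a rough illustration of the fold described in the summary, the C++ sketch below models the two geps as pointer arithmetic on an `int` array. The function names `selectOfGeps` and `gepOfSelect` are hypothetical and do not appear in the patch; the sketch only shows that selecting between indices and offsetting once reaches the same address as selecting between two already-offset pointers, assuming every intermediate pointer stays inside the array.

```cpp
#include <cstdint>

// Before the fold: both pointers are materialised, then one is selected.
// Mirrors (select C, (gep (gep Ptr, Idx0), Idx1), (gep Ptr, Idx0)).
inline const int *selectOfGeps(bool C, const int *Ptr, std::int64_t Idx0,
                               std::int64_t Idx1) {
  const int *Gep1 = Ptr + Idx0;  // gep Ptr, Idx0
  const int *Gep2 = Gep1 + Idx1; // gep (gep Ptr, Idx0), Idx1
  return C ? Gep2 : Gep1;
}

// After the fold: select between the two indices, then offset Ptr once.
// Mirrors (gep Ptr, (select C, Idx0+Idx1, Idx0)).
inline const int *gepOfSelect(bool C, const int *Ptr, std::int64_t Idx0,
                              std::int64_t Idx1) {
  std::int64_t Idx = C ? Idx0 + Idx1 : Idx0;
  return Ptr + Idx;
}
```

With a 16-element array and indices 3 and 6, both forms return the same element address for either value of the condition, and the folded form no longer needs the intermediate pointer.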

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
2,740 ms	x64 debian > libarcher.critical::critical.c
2,780 ms	x64 debian > libarcher.races::critical-unrelated.c
3,170 ms	x64 debian > libarcher.races::lock-nested-unrelated.c
2,880 ms	x64 debian > libarcher.races::lock-unrelated.c
3,200 ms	x64 debian > libarcher.races::parallel-simple.c
(18 tests failed in total)

Event Timeline

RKSimon created this revision. · Jul 20 2021, 3:42 AM

Herald added a subscriber: hiraditya. · Jul 20 2021, 3:42 AM

RKSimon requested review of this revision. · Jul 20 2021, 3:42 AM

Herald added a project: Restricted Project. · Jul 20 2021, 3:42 AM

Harbormaster completed remote builds in B115061: Diff 360072. · Jul 20 2021, 4:16 AM

I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2950–2953

In D106352#2890021, @lebedev.ri wrote:

I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.

... and while we clearly don't have that fold (https://godbolt.org/z/56asoz3ch),
I realize it wouldn't help here, since %gep1 has an extra use in select.

llvm/test/Transforms/InstCombine/select-gep.ll
132	Please add a test where `%gep1` has other uses (e.g. `call void @use(i32* %gep1)`)

In D106352#2890033, @lebedev.ri wrote:

In D106352#2890021, @lebedev.ri wrote:

I feel like gep (gep Ptr, Idx0), Idx1 --> gep Ptr, Idx0+Idx1 is obviously a standalone fold that should be elsewhere in instcombine.

... and while we clearly don't have that fold (https://godbolt.org/z/56asoz3ch),
I realize it wouldn't help here, since %gep1 has an extra use in select.

I think Roman is onto something here. And maybe I'm missing something, but I don't see the extra use he's talking about in the *optimized* IR.

Here's the current output of test2c:
; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]]
; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]]
; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 0, i64 6
; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]]
; CHECK-NEXT: ret i32* [[SEL]]

Shouldn't a (gep (gep p, idx1), idx2) -> gep p, (idx1+idx2) rule catch this case just fine? The only real downside I see is that we lose the inbounds on the first gep.
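
The suggestion in the comment above can be sketched with a byte-address model of the current test2c output. This is only a model under stated assumptions: 64-bit pointers, a 4-byte `i32`, and geps without `inbounds`, so address arithmetic wraps modulo 2^64. The names `gepI32`, `nested` and `merged` are hypothetical.

```cpp
#include <cstdint>

// Byte-address model of `getelementptr i32, ptr, idx` without inbounds:
// the offset is idx * 4 and the addition wraps modulo 2^64.
constexpr std::uint64_t gepI32(std::uint64_t Ptr, std::uint64_t Idx) {
  return Ptr + Idx * 4u;
}

// Shape of the current test2c output: an outer gep stacked on %gep1.
constexpr std::uint64_t nested(std::uint64_t P, std::uint64_t X, bool Cmp) {
  std::uint64_t Gep1 = gepI32(P, X);
  std::uint64_t SelIdx = Cmp ? 0 : 6;
  return gepI32(Gep1, SelIdx);
}

// Shape after a gep-of-gep merge: a single gep with a summed index.
constexpr std::uint64_t merged(std::uint64_t P, std::uint64_t X, bool Cmp) {
  std::uint64_t SelIdx = Cmp ? 0 : 6;
  return gepI32(P, X + SelIdx);
}

static_assert(nested(0x1000, 5, false) == merged(0x1000, 5, false),
              "both shapes compute the same address");
```

The model says nothing about `inbounds`, which is exactly the flag the comment expects to be lost on the first gep.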

lebedev.ri added inline comments. · Jul 20 2021, 10:20 AM

llvm/test/Transforms/InstCombine/select-gep.ll
111–112	@reames if we look at this test, clearly we can't fold `%gep2` into `%p + (%x + 6)`, because that results in two instructions, but `%gep1` sticks around since it's used in `select`, so we can't actually do this in instcombine.

reames added inline comments. · Jul 20 2021, 10:24 AM

llvm/test/Transforms/InstCombine/select-gep.ll
111–112	Roman, I think you're looking at the wrong IR. The interesting IR is the result of running the current transforms, not the input to exercise the current transform.

I'm going to take a look at folding nested geps in another parallel patch. I agree there's a lot of overlap in the transforms, but hopefully we'll end up with similar canonicalizations; I'm hoping the multi-use cases might not be a major problem.

RKSimon added inline comments. · Jul 21 2021, 4:41 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958	Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overlap?

RKSimon added inline comments. · Jul 21 2021, 5:01 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958	[EDIT] Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overflow?

lebedev.ri added inline comments. · Jul 21 2021, 5:17 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958	No? (https://alive2.llvm.org/ce/z/f8pLfD)

RKSimon added inline comments. · Jul 21 2021, 5:23 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958	Cheers
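
The overflow question in this exchange can be illustrated with a small wrapping-arithmetic model. It is a sketch, not a proof, and it assumes 64-bit pointers, a 4-byte element type and geps without `inbounds`; the names `gepI32`, `twoStep` and `oneStep` are hypothetical. Because the address computation already wraps modulo 2^64, a wrapping add of the two indices reaches the same address as two stacked geps, even when the index sum overflows as a signed value.

```cpp
#include <cstdint>

// Wrapping byte-address model of a non-inbounds `gep i32` on a 64-bit target.
constexpr std::uint64_t gepI32(std::uint64_t Ptr, std::uint64_t Idx) {
  return Ptr + Idx * 4u;
}

// Two stacked geps: gep (gep Ptr, Idx0), Idx1.
constexpr std::uint64_t twoStep(std::uint64_t Ptr, std::uint64_t Idx0,
                                std::uint64_t Idx1) {
  return gepI32(gepI32(Ptr, Idx0), Idx1);
}

// One gep with a plain (wrapping) add of the indices.
constexpr std::uint64_t oneStep(std::uint64_t Ptr, std::uint64_t Idx0,
                                std::uint64_t Idx1) {
  return gepI32(Ptr, Idx0 + Idx1);
}

// Bit pattern of INT64_MAX; adding 1 overflows when read as a signed index.
constexpr std::uint64_t MaxSigned = 0x7fffffffffffffffULL;

static_assert(twoStep(0x1000, MaxSigned, 1) == oneStep(0x1000, MaxSigned, 1),
              "the index add wraps, but the address still agrees");
```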

RKSimon mentioned this in rG59db3a5df918: [InstCombine] Add multiuse test for D106352. · Jul 21 2021, 5:48 AM

lebedev.ri added inline comments. · Jul 21 2021, 5:48 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2958	Two more bits:
* `inbounds` intersect
* `add` is never `nuw`
* `add` is `nsw` iff `GEP` ends up being `inbounds`
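
One plausible reading of the `nuw` point is that gep indices are interpreted as signed, so a negative base index plus a positive offset wraps when viewed as an unsigned add even though the signed sum is perfectly ordinary. The helpers below are hypothetical and only check the two kinds of overflow on 64-bit values; they are not taken from the patch.

```cpp
#include <cstdint>
#include <limits>

// True if A + B wraps as an unsigned 64-bit add (what `nuw` would forbid).
constexpr bool unsignedAddWraps(std::uint64_t A, std::uint64_t B) {
  return A + B < A;
}

// True if A + B overflows as a signed 64-bit add (what `nsw` would forbid).
constexpr bool signedAddOverflows(std::int64_t A, std::int64_t B) {
  return (B > 0 && A > std::numeric_limits<std::int64_t>::max() - B) ||
         (B < 0 && A < std::numeric_limits<std::int64_t>::min() - B);
}

// Idx0 = -1, Idx1 = 2: the signed sum is 1, yet the unsigned add wraps.
static_assert(unsignedAddWraps(0xffffffffffffffffULL, 2), "nuw would not hold");
static_assert(!signedAddOverflows(-1, 2), "no signed overflow here");
```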

RKSimon mentioned this in D106450: [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069). · Jul 21 2021, 8:18 AM

RKSimon mentioned this in rG1c9bec727ab5: [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0…. · Jul 22 2021, 3:16 AM

Rebase this?
Did that other patch handle everything here?
What about more complex patterns with more than two indexes?

In D106352#2896253, @lebedev.ri wrote:

Rebase this?
Did that other patch handle everything here?
What about more complex patterns with more than two indexes?

Yes, the other patch handled these cases. I'm going to start investigating how to perform better select-of-geps with more than 2 operands (in foldSelectOpOp), but nested geps need to be addressed as well.

RKSimon mentioned this in rG10c982e0b3e6: Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep…. · Aug 23 2021, 1:09 PM

Revision Contents

Path	Size
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp	25 lines
llvm/test/Transforms/InstCombine/select-gep.ll	16 lines

Diff 360072

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
(2,931 preceding lines not shown)

   if (Instruction *I = foldSelectExtConst(SI))
     return I;
   // Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))
   // Fold (select C, Ptr, (gep Ptr, Idx)) -> (gep Ptr, (select C, 0, Idx))
   auto SelectGepWithBase = [&](GetElementPtrInst *Gep, Value *Base,
                                bool Swap) -> GetElementPtrInst * {
-    Value *Ptr = Gep->getPointerOperand();
     if (Gep->getNumOperands() != 2 || Gep->getPointerOperand() != Base ||
         !Gep->hasOneUse())
       return nullptr;
+    auto *BaseGep = dyn_cast<GetElementPtrInst>(Base);
     Type *ElementType = Gep->getResultElementType();
     Value *Idx = Gep->getOperand(1);
-    Value *NewT = Idx;
-    Value *NewF = Constant::getNullValue(Idx->getType());
+    Value *Ptr, *NewT, *NewF;
+    // Handle nested geps special case.
+    // Fold (select C, (gep (gep Ptr, Idx0), Idx1), (gep Ptr, Idx0))
+    // --> (gep Ptr,(select C, Idx0+Idx1, Idx0))
+    // Fold (select C, (gep Ptr, Idx0), (gep (gep Ptr, Idx0), Idx1))
+    // --> (gep Ptr,(select C, Idx0, Idx0+Idx1))

lebedev.ri:
  // Handle nested geps special case.
- // Fold (select C, (gep (gep Ptr, Idx0), Idx1), (gep Ptr, Idx0))
- // --> (gep Ptr,(select C, Idx0+Idx1, Idx0))
- // Fold (select C, (gep Ptr, Idx0), (gep (gep Ptr, Idx0), Idx1))
- // --> (gep Ptr,(select C, Idx0, Idx0+Idx1))
+ // Fold (select C, (gep (gep Ptr, Idx0), Idx1), (gep Ptr, Idx2))
+ // --> (gep Ptr,(select C, Idx?+Idx?, Idx?))
+ // Fold (select C, (gep Ptr, Idx0), (gep (gep Ptr, Idx1), Idx2))
+ // --> (gep Ptr,(select C, Idx?, Idx?+Idx?))
  if (BaseGep && BaseGep->getNumOperands() == 2 &&

+    if (BaseGep && BaseGep->getNumOperands() == 2 &&

Lint: Pre-merge checks: clang-format: please reformat the code
-    if (BaseGep && BaseGep->getNumOperands() == 2 && 
+    if (BaseGep && BaseGep->getNumOperands() == 2 &&

+        ElementType == BaseGep->getResultElementType() &&
+        Idx->getType() == BaseGep->getOperand(1)->getType()) {
+      NewT =
+          Builder.CreateAdd(Idx, BaseGep->getOperand(1), SI.getName() + ".add");

RKSimon: Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overlap?

RKSimon: [EDIT] Something that I've noticed while trying to get this tested with alive2 - I think we need to guarantee that the add won't overflow?

lebedev.ri: [[ https://alive2.llvm.org/ce/z/f8pLfD | No? ]]

RKSimon: Cheers

lebedev.ri: Two more bits:
* `inbounds` intersect
* `add` is never `nuw`
* `add` is `nsw` iff `GEP` ends up being `inbounds`

+      NewF = BaseGep->getOperand(1);
+      Ptr = BaseGep->getPointerOperand();
+    } else {
+      NewT = Idx;
+      NewF = Constant::getNullValue(Idx->getType());
+      Ptr = Gep->getPointerOperand();
+    }
     if (Swap)
       std::swap(NewT, NewF);
     Value *NewSI =
         Builder.CreateSelect(CondVal, NewT, NewF, SI.getName() + ".idx", &SI);
     return GetElementPtrInst::Create(ElementType, Ptr, {NewSI});
   };
   if (auto *TrueGep = dyn_cast<GetElementPtrInst>(TrueVal))
     if (auto *NewGep = SelectGepWithBase(TrueGep, FalseVal, false))

(267 following lines not shown)

llvm/test/Transforms/InstCombine/select-gep.ll

(96 preceding lines not shown)

 ;
   %cmp = icmp ugt i64 %x, %y
   %select = select i1 %cmp, i32* %p, i32* %gep
   ret i32* %select
 }

 ; PR51069
 define i32* @test2c(i32* %p, i64 %x, i64 %y) {
 ; CHECK-LABEL: @test2c(
-; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]]
-; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]]
-; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 0, i64 6
-; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]]
+; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[SEL_ADD:%.*]] = add i64 [[X]], 6
+; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 [[X]], i64 [[SEL_ADD]]
+; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[P:%.*]], i64 [[SEL_IDX]]
 ; CHECK-NEXT: ret i32* [[SEL]]
 ;
   %gep1 = getelementptr inbounds i32, i32* %p, i64 %x
   %gep2 = getelementptr inbounds i32, i32* %gep1, i64 6

lebedev.ri: @reames if we look at this test, clearly we can't fold `%gep2` into `%p + (%x + 6)`, because that results in two instructions, but `%gep1` sticks around since it's used in `select`, so we can't actually do this in instcombine.

reames: Roman, I think you're looking at the wrong IR. The interesting IR is the result of running the current transforms, not the input to exercise the current transform.

   %icmp = icmp ugt i64 %x, %y
   %sel = select i1 %icmp, i32* %gep1, i32* %gep2
   ret i32* %sel
 }

 ; PR51069
 define i32* @test2d(i32* %p, i64 %x, i64 %y) {
 ; CHECK-LABEL: @test2d(
-; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 [[X:%.*]]
-; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X]], [[Y:%.*]]
-; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 6, i64 0
-; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[GEP1]], i64 [[SEL_IDX]]
+; CHECK-NEXT: [[ICMP:%.*]] = icmp ugt i64 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[SEL_ADD:%.*]] = add i64 [[X]], 6
+; CHECK-NEXT: [[SEL_IDX:%.*]] = select i1 [[ICMP]], i64 [[SEL_ADD]], i64 [[X]]
+; CHECK-NEXT: [[SEL:%.*]] = getelementptr i32, i32* [[P:%.*]], i64 [[SEL_IDX]]
 ; CHECK-NEXT: ret i32* [[SEL]]
 ;
   %gep1 = getelementptr inbounds i32, i32* %p, i64 %x
   %gep2 = getelementptr inbounds i32, i32* %gep1, i64 6
   %icmp = icmp ugt i64 %x, %y
   %sel = select i1 %icmp, i32* %gep2, i32* %gep1
   ret i32* %sel
 }

lebedev.ri: Please add test where `%gep1` has other uses (e.g. `call void @use(i32* %gep1)`)

 ; Three (or more) operand GEPs are currently expected to not be optimised,
 ; though they could be in principle.

 define i32* @test3a([4 x i32]* %p, i64 %x, i64 %y) {
 ; CHECK-LABEL: @test3a(
 ; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds [4 x i32], [4 x i32]* [[P:%.*]], i64 2, i64 [[X:%.*]]
 ; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds [4 x i32], [4 x i32]* [[P]], i64 2, i64 [[Y:%.*]]

(87 following lines not shown)
