This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Combine opaque pointer single index GEP and with src GEP by matching the types
Needs ReviewPublic

Authored by skc7 on Nov 23 2022, 9:38 PM.

Details

Summary

Removing zero indices opaque pointer gep, link, has caused Instcombine pass to not combine few GepOfGep instructions which were previously possible for non-opaque pointers case.

Consider below IR : link

@ext = external hidden global [2 x [1 x float]]
%a = getelementptr inbounds [2 x [1 x float]], [2 x [1 x float]]* @ext, i64 0, i64 %c
%b = getelementptr inbounds [1 x float], [1 x float]* %a, i64 0, i64 0 
%c = getelementptr inbounds float, float* %b, i64 %in1

would be optimized to

%c = getelementptr inbounds [2 x [1 x float]], [2 x [1 x float]]* @ext, i64 0, i64 %c, i64 %in1

This is currently not happening for opaque pointer case. link

This patch tries to address this issue by combining single index gep with src gep.

Diff Detail

Event Timeline

skc7 created this revision.Nov 23 2022, 9:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 9:38 PM
skc7 requested review of this revision.Nov 23 2022, 9:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 9:38 PM
skc7 updated this revision to Diff 477797.Nov 24 2022, 8:09 AM

Rebase.

skc7 updated this revision to Diff 478953.Nov 30 2022, 7:45 AM

Rebase.

lebedev.ri added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2084

Do we need an explicit PtrTy->isOpaquePointerTy() check?
Do we need an one-use check on the inner (Src) gep?
Do we want to perform the same LICM-appeasing checks about their loop disposition?

llvm/test/Transforms/InstCombine/opaque-ptr-gep-of-gep.ll
2

Is explicit -opaque-pointers needed?
Just write the test to not have typed pointers to begin with.

7

It isn't obvious how these tests differ from each other.
They could use a comments.

skc7 updated this revision to Diff 479155.Nov 30 2022, 8:55 PM
skc7 added a reviewer: lebedev.ri.

Added comments to tests.

skc7 added inline comments.Nov 30 2022, 9:03 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2084

With new changes of making opaque pointer all-zero index gep as no-op lead to these GepOfGep not getting combined. So this change is specific to opaque pointers.

Src gep can have multiple users which can be geps. This patch tries to optimize them aswell. See gepofgep4 test added below.

I'm not sure about LICM checks are required here. Please let me know if this change would cause any issue with that? I would work on fixing it.

jmmartinez added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2091–2097

What do you think about doing the following ?

Instruction *GEP = GetElementPtrInst::Create(Src->getSourceElementType(), Src->getOperand(0), NewIndices, GEP.getName());
GEP->setIsInBounds(IsInBounds);
return GEP;
skc7 updated this revision to Diff 479994.Dec 4 2022, 11:48 PM
skc7 added a reviewer: jmmartinez.

Update code as per @jmmartinez comments.

skc7 added inline comments.Dec 4 2022, 11:51 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2091–2097

Updated. There were previous methods in instcombine pass using specifically GetElementPtrInst::CreateInBounds. I guess, creating GEP instruction and then setting InBounds is fine.

nikic requested changes to this revision.Dec 5 2022, 3:33 AM

I think in terms of general approach, the better way to handle this would be to relax the Src->getResultElementType() != GEP.getSourceElementType() check to allow the transform if the types can be made to match by adding additional zero indices.

(As a meta note, I think we'd probably be better off considering GEPs with multiple dynamic indices to be non-canonical, and try to split them up instead, e.g. because this allows LICM of an invariant part. But I guess that's not our current stance. It would be interesting to hear where you have found this merging to be beneficial.)

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2095

I believe this is missing a check that the GEP source element type matches the array element type of the Src result element type. Your current code would allow combining [1 x float] and double GEPs.

This revision now requires changes to proceed.Dec 5 2022, 3:33 AM
skc7 updated this revision to Diff 480314.Dec 5 2022, 7:06 PM

Thanks @nikic for feedback. Made changes to match Src ResultElementType with GEP SrcElementType by adding additional zero indices to Src gep.

skc7 updated this revision to Diff 480942.Dec 7 2022, 9:06 AM
skc7 set the repository for this revision to rG LLVM Github Monorepo.

Rebase.

skc7 updated this revision to Diff 482354.Dec 12 2022, 9:01 PM

Rebase. Use -passes=instcombine in test.

skc7 updated this revision to Diff 483089.Dec 15 2022, 12:57 AM
skc7 set the repository for this revision to rG LLVM Github Monorepo.

Rebase. Ping

skc7 added a comment.Dec 23 2022, 8:23 AM

I think in terms of general approach, the better way to handle this would be to relax the Src->getResultElementType() != GEP.getSourceElementType() check to allow the transform if the types can be made to match by adding additional zero indices.

(As a meta note, I think we'd probably be better off considering GEPs with multiple dynamic indices to be non-canonical, and try to split them up instead, e.g. because this allows LICM of an invariant part. But I guess that's not our current stance. It would be interesting to hear where you have found this merging to be beneficial.)

Hi @nikic I have updated the patch to match the types by adding additional zero indices. Please review.

nikic added inline comments.Dec 27 2022, 6:36 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2084

The Src->getResultElementType()->isArrayTy() check here is overly strict. Consider that the result type is something like { [0 x float] }. This can be handled the same way as [0 x float] with an extra zero index (which you do already handle, if this check is dropped).

Src gep can have multiple users which can be geps. This patch tries to optimize them aswell. See gepofgep4 test added below.

Hm, it looks like the existing transform does allow GEPs with multiple uses here, but this seems like a bug to me, because this means that we will end up repeating non-trivial address calculations, rather than performing them only once.

2112

Don't we need to only drop the last zero from NewIndices? Consider something like getelementptr [3 x [2 x [1 x float]]], ptr %p, i64 %x followed by getelementptr float, ptr %gep, i64 %y. I think your current code will convert this into getelementptr [3 x [2 x [1 x float]]], ptr %p, i64 %x, i64 %y, while the correct result would be getelementptr [3 x [2 x [1 x float]]], ptr %p, i64 %x, i64 0, i64 %y.

llvm/test/Transforms/InstCombine/opaque-ptr-gep-of-gep.ll
1

The triple can be dropped here. Please also use update_test_checks.py to generate check lines.

skc7 updated this revision to Diff 485845.Jan 2 2023, 1:08 AM
skc7 retitled this revision from [InstCombine] Combine opaque pointer single index GEP and with src GEP which has result of array type to [InstCombine] Combine opaque pointer single index GEP and with src GEP by matching the types.
skc7 edited the summary of this revision. (Show Details)

Remove check for src array type.
Drop last zero from resulting NewIndices after matching the types.
Drop triple from test. Updated test checks with update_test_checks.py.
Added new tests for struct of array type and to check new zero index in resulting gep.

skc7 added inline comments.Jan 2 2023, 1:11 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2084

Previously with non-opaque pointer case aswell, GEPs with multiple uses were optimized. Will there be any issue if we do so now with opaque pointers case ?
example with non opaque pointers and multiple uses case : https://llvm.godbolt.org/z/oEGE3736f

skc7 updated this revision to Diff 488080.Jan 10 2023, 10:21 PM
skc7 set the repository for this revision to rG LLVM Github Monorepo.

Rebase

skc7 added inline comments.Jan 16 2023, 8:20 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2112

Hi @nikic. I have made the changes to check if zeroes have been appended to NewIndices after types matching and if so, last zero will be erased and then GEP indices will be appended. Please review.

skc7 updated this revision to Diff 496925.Feb 13 2023, 4:00 AM

Rebase. Ping

skc7 added a comment.Apr 4 2023, 4:15 AM

Hi @nikic. I have made the changes that were suggested previously. Could you please review?