This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold extractelement + vector GEP with one use
ClosedPublic

Authored by david-arm on May 5 2021, 5:26 AM.

Details

Summary

We sometimes see code like this:

Case 1:

%gep = getelementptr i32, i32* %a, <2 x i64> %splat
%ext = extractelement <2 x i32*> %gep, i32 0

or this:

Case 2:

%gep = getelementptr i32, <4 x i32*> %a, i64 1
%ext = extractelement <4 x i32*> %gep, i32 0

where there is only one use of the GEP. In such cases it makes
sense to fold the two together such that we create a scalar GEP:

Case 1:

%ext = extractelement <2 x i64> %splat, i32 0
%gep = getelementptr i32, i32* %a, i64 %ext

Case 2:

%ext = extractelement <2 x i32*> %a, i32 0
%gep = getelementptr i32, i32* %ext, i64 1

This may create further folding opportunities as a result, i.e.
the extract of a splat vector can be completely eliminated. Also,
even for the general case where the vector operand is not a splat
it seems beneficial to create a scalar GEP and extract the scalar
element from the operand. Therefore, in this patch I've assumed
that a scalar GEP is always preferrable to a vector GEP and have
added code to unconditionally fold the extract + GEP.

I haven't added folds for the case when we have both a vector of
pointers and a vector of indices, since this would require
generating an additional extractelement operation.

Tests have been added here:

Transforms/InstCombine/gep-vector-indices.ll

Diff Detail

Event Timeline

david-arm created this revision.May 5 2021, 5:26 AM
david-arm requested review of this revision.May 5 2021, 5:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2021, 5:26 AM
sdesmalen added inline comments.May 6 2021, 3:18 AM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
434

Isn't the type of GEP guaranteed to be of type VectorType?

440

Does this need to be limited to a vector of indices that is also a splat?

483

unnecessary change.

Hey @david-arm,

I had a looks, but not too sure I have more comments than the ones I've added.

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
450

Can you add a message when asserting?

llvm/test/Transforms/InstCombine/gep-vector-indices.ll
93

Should we add a negative test for this too:
// 3. We have a vector of pointers and a vector of indices.

david-arm updated this revision to Diff 343661.May 7 2021, 6:05 AM
david-arm edited the summary of this revision. (Show Details)
  • Tried to improve the commit message to explain why I've chosed to unconditionall apply the fold for all cases, not just splats.
  • Added extra negative test for GEPs with vector of ptrs + vector of indices.
  • Changed code to assume GEP result is always a VectorType.
david-arm marked 5 inline comments as done.May 7 2021, 6:06 AM
david-arm added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
440

Sorry for not explaining this better in the previous commit message. I've tried to make it clear that I'm applying the fold unconditionall on the basis that in general a scalar GEP is still preferrable to a vector one.

spatel added inline comments.May 10 2021, 6:14 AM
llvm/test/Transforms/InstCombine/vec_demanded_elts-inseltpoison.ll
501–502

Remove stale comment.

508

IIUC, this is 1 of the 2 basic patterns that we want to transform.

It would be better to use different index values in the test though, so we can verify that the indexes are being translated as expected (for example, make the gep index an argument variable?).

If that is a good suggestion, please commit the test change as a preliminary patch (no review needed), so we just see the functional diff from this patch here.

llvm/test/Transforms/InstCombine/vec_demanded_elts.ll
508

This test file is the pre-poison equivalent of the other one, so see earlier test comments.

david-arm marked an inline comment as done.May 12 2021, 3:11 AM
david-arm added inline comments.
llvm/test/Transforms/InstCombine/vec_demanded_elts-inseltpoison.ll
508

Hi @spatel, thanks for taking a look at the patch! I'm just not entirely clear what to do here to be honest - are you suggesting just changing gep_vbase_w_s_idx to take a variable index and use that instead of 1? Or are you thinking of adding another similar test that takes a variable index?

spatel added inline comments.May 12 2021, 5:09 AM
llvm/test/Transforms/InstCombine/vec_demanded_elts-inseltpoison.ll
508

Yes, I'd just change the function to take a variable index.
That generalizes the test without losing anything AFAICT.

david-arm updated this revision to Diff 345078.May 13 2021, 2:54 AM
  • Rebase.
  • Removed TODO comments from tests.
fhahn added a subscriber: fhahn.May 13 2021, 3:45 AM
fhahn added inline comments.
llvm/test/Transforms/InstCombine/gep-vector-indices.ll
93

Do we have test cases where the GEP has other users? And where the extractelement has a variable index?

david-arm updated this revision to Diff 345094.May 13 2021, 5:10 AM
  • Added two more negative tests.
david-arm marked an inline comment as done.May 13 2021, 5:10 AM
sdesmalen added inline comments.May 21 2021, 3:31 AM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
434

This condition feels a bit too restrictive, because it would avoid:

%gep = getelementptr i32, i32** %ptr, i32 0, <2 x i32> <i32 4, i32 4>
extractelement <2 x i32*> %gep, i32 0

to not be simplified.

450

You can probably avoid spelling out the three possibilities and just have a loop that extracts the operand if it's type is a VectorType, as long as you make sure only one of the operands is of type VectorType.

llvm/test/Transforms/InstCombine/gep-vector-indices.ll
11–13

nit: For fixed-width, this pattern is unnecessary and you can write <2 x i64> <i64 4, i64 4> directly in the GEP.

david-arm updated this revision to Diff 347360.May 24 2021, 5:53 AM
  • Generalised the optimisation to more than 2 GEP operands with the requirement that only one operand should be a vector.
  • Tidied up the tests.
david-arm marked 3 inline comments as done.May 24 2021, 5:54 AM
david-arm added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
450

I've changed this to a loop, but I left in some comments as examples of different cases we might hit as I thought that might be useful to explain what's going on?

david-arm updated this revision to Diff 347410.May 24 2021, 8:48 AM
david-arm marked an inline comment as done.
  • After private discussion with @sdesmalen I've updated the patch to count the vector GEP operands using a lambda function with llvm::count_if.

Other than my two nits, the patch looks good to me!

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
462–464

nit: I think this assert is unnecessary, because we can assume the incoming IR was legal to begin with.

llvm/test/Transforms/InstCombine/gep-vector-indices.ll
10

nit: is entry: needed? (I don't see it used in the other tests you had to fix in this patch)

david-arm marked 2 inline comments as done.
sdesmalen accepted this revision.May 25 2021, 2:07 PM

LGTM, thanks @david-arm!

This revision is now accepted and ready to land.May 25 2021, 2:07 PM