This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Limit GEP lists based on width of index computation.
ClosedPublic

Authored by fhahn on Jun 23 2020, 3:43 PM.

Details

Summary

D68667 introduced a tighter limit to the number of GEPs to simplify
together. The limit was based on the vector element size of the pointer,
but the pointers themselves are not actually put in vectors.

IIUC we try to vectorize the index computations here, so we should base
the limit on the vector element size of the computation of the index.

This restores the test regression on AArch64 and also restores the
vectorization for a important pattern in SPEC2006/464.h264ref on
AArch64 (@test_i16_extend). We get a large benefit from doing a single
load up front and then processing the index computations in vectors.

Note that we could probably even further improve the AArch64 codegen, if
we would do zexts to i32 instead of i64 for the sub operands and then do
a single vector sext on the result of the subtractions. AArch64 provides
dedicated vector instructions to do so. Sketch of proof in Alive:
https://alive2.llvm.org/ce/z/A4xYAB

Diff Detail

Event Timeline

fhahn created this revision.Jun 23 2020, 3:43 PM
This revision is now accepted and ready to land.Jun 24 2020, 5:39 AM
spatel accepted this revision.Jun 24 2020, 8:20 AM
spatel added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7505–7506

We should update this comment with something like the text in this patch description:

// We are trying to vectorize the index computations, so we base
// the vectorization factor on the size of the loaded values rather
// than the size of the GEP itself (the target's pointer size).

?

fhahn updated this revision to Diff 273066.Jun 24 2020, 9:16 AM

Update comment stating why the maximum number of elements is based on the size of the index expression, rather than the GEP as suggested.

fhahn marked an inline comment as done.Jun 24 2020, 9:22 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7505–7506

I've updated the comment as suggested, with slight edits to make it if better with the maximum number of elements below. WDYT? I'll commit it in a bit, unless there are further suggestions.

spatel added inline comments.Jun 24 2020, 11:03 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7505–7506

That text works for me - thanks!

This revision was automatically updated to reflect the committed changes.