This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Refine hasIrregularType predicate
ClosedPublic

Authored by LemonBoy on Feb 25 2021, 5:56 AM.

Details

Summary

The hasIrregularType predicate checks whether an array of N values of type Ty is "bitcast-compatible" with a <N x Ty> vector.
The previous check returned invalid results in some cases where there's some padding between the array elements: eg. a 4-element array of u7 values is considered as compatible with <4 x u7>, even though the vector is only loading/storing 28 bits instead of 32.

The problem causes LLVM to generate incorrect code for some targets: for AArch64 the vector loads/stores are lowered in terms of ubfx/bfi, effectively losing the top (N * padding bits).

Diff Detail

Event Timeline

LemonBoy created this revision.Feb 25 2021, 5:56 AM
LemonBoy requested review of this revision.Feb 25 2021, 5:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 25 2021, 5:56 AM
craig.topper added inline comments.Feb 25 2021, 11:50 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
374–375

Should we remove the unused VF argument?

LemonBoy updated this revision to Diff 326488.Feb 25 2021, 1:50 PM

Remove unused parameter.

LemonBoy marked an inline comment as done.Feb 25 2021, 1:50 PM
LemonBoy added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
374–375

Good catch, fixed.

fhahn added inline comments.Mar 2 2021, 3:56 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
374–375

comment needs updating, there's no given vectorization factor any longer, right?

374–375

The comment also needs to be updated to not refer to VF I think, as it is gone now. Something like Determine if an array of type Ty is "bit cast compatible" with a vector with the same number of elements.

llvm/test/Transforms/LoopVectorize/irregular_type.ll
6

can you add a comment explaining what the test checks?

LemonBoy added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
381

CC @david-arm
This slightly different formulation of the check was introduced in c5ba0d33cc060cc06a28a5d9101060afd1c0ee9a.

LemonBoy updated this revision to Diff 327503.Mar 2 2021, 10:03 AM

Update some documentation comments.

lebedev.ri accepted this revision.Mar 16 2021, 1:08 PM
lebedev.ri added a subscriber: lebedev.ri.

Seems good to me.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
380–381

I wonder why can't we still vectorize such cases,
by instead loading <N x DL.getTypeAllocSizeInBits(Ty)> vector and then truncating it?
(beware of endianness)

This revision is now accepted and ready to land.Mar 16 2021, 1:08 PM
LemonBoy added inline comments.Mar 17 2021, 4:31 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
380–381

This was meant to be a hotfix targeting LLVM12.
I've experimented with the widen+truncate strategy and the results are promising (at least on x86), I'll submit a patch once I clean up the code.

lebedev.ri added inline comments.Mar 17 2021, 4:32 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
380–381

This was meant to be a hotfix targeting LLVM12.

Sure, i wasn't even suggesting doing that here.

I've experimented with the widen+truncate strategy and the results are promising (at least on x86), I'll submit a patch once I clean up the code.

Nice!

This revision was landed with ongoing or failed builds.Mar 17 2021, 9:05 AM
This revision was automatically updated to reflect the committed changes.