This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Vectorize alloca thru bitcast
ClosedPublic

Authored by rampitec on May 8 2020, 11:27 AM.

Details

Summary

This is mostly useful when the alloca element type is not an integer
and is then cast to an integer for a load or store. Currently we can
vectorize an [i32] alloca but cannot do so for [float].

A separate patch is also needed to properly lower 64-bit
types after they are vectorized. At the moment these are lowered
via scratch anyway.
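
As a rough illustration of the access pattern described above, the following standalone C++ sketch models a 4-element float alloca that is read as 32-bit integers. It is not LLVM code: `FloatAlloca`, `loadThroughBitcast`, and `extractThenBitcast` are hypothetical names, and the assumption that a vectorized load amounts to "extract the float element, then bitcast the scalar" is one plausible reading of the summary rather than something taken from the diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes 32-bit floats");

// Hypothetical stand-in for a 4-element float alloca.
using FloatAlloca = std::array<float, 4>;

// Memory path: view the float storage as 32-bit integers and load element
// idx, which is what a load through a bitcast pointer does.
inline std::uint32_t loadThroughBitcast(const FloatAlloca &mem,
                                        std::size_t idx) {
  const unsigned char *bytes =
      reinterpret_cast<const unsigned char *>(mem.data());
  std::uint32_t bits;
  std::memcpy(&bits, bytes + idx * sizeof(float), sizeof(bits));
  return bits;
}

// Vector path (assumed): extract the float element from the vector value,
// then reinterpret that scalar's bits as an integer.
inline std::uint32_t extractThenBitcast(const FloatAlloca &vec,
                                        std::size_t idx) {
  float elt = vec[idx];
  std::uint32_t bits;
  std::memcpy(&bits, &elt, sizeof(bits));
  return bits;
}
```

Both functions return the same bits for every in-range index, which is why the integer access does not have to force the float array to stay in memory.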

Event Timeline

rampitec created this revision. May 8 2020, 11:27 AM
Herald added a project: Restricted Project. May 8 2020, 11:27 AM

PSDB passed.

arsenm added inline comments. May 8 2020, 11:53 AM
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
310

I think this needs to be careful around multiple uses

435

Needs test with assume intrinsic

481

IRBuilder does this check for you so you can omit it

llvm/test/CodeGen/AMDGPU/vector-alloca-bitcast.ll
28

Needs tests with multiple uses

rampitec updated this revision to Diff 262944. May 8 2020, 1:36 PM
rampitec marked 7 inline comments as done.

Added more tests.

llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
310

We just need to get to the actual pointer here. It does not matter if there are multiple uses, since bitcasts are not removed anyway. Also note that it is not a problem if not all uses are converted: the alloca itself stays. The pass does partial vectorization even now.

435

It's the same as the lifetime intrinsics in the test, but I will add one. Just note that a pointer cannot be passed into assume; it has to be a compare, so it will prevent alloca removal.

481

Actually it does not: Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
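
The assertion quoted above fires when a value's uses are replaced by a value of a different type. The toy model below only mirrors that one check. `ToyValue`, `toyBitCast`, and `tryReplace` are hypothetical names, not LLVM classes or functions, and what line 481 of the patch actually does is not shown in this thread.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-in for an IR value: just a type name and a replacement link.
struct ToyValue {
  std::string type;
  const ToyValue *replacedBy;

  explicit ToyValue(std::string t) : type(std::move(t)), replacedBy(nullptr) {}

  // Mirrors the quoted check: the replacement must have the same type.
  void replaceAllUsesWith(const ToyValue &replacement) {
    if (replacement.type != type)
      throw std::logic_error(
          "replaceAllUses of value with new value of different type!");
    replacedBy = &replacement;
  }
};

// Toy cast: produces a new value that carries the destination type.
inline ToyValue toyBitCast(const ToyValue &v, const std::string &destType) {
  (void)v; // the source value is irrelevant to this model
  return ToyValue(destType);
}

// Returns true if the replacement was accepted, false if the check fired.
inline bool tryReplace(ToyValue &oldValue, const ToyValue &newValue) {
  try {
    oldValue.replaceAllUsesWith(newValue);
    return true;
  } catch (const std::logic_error &) {
    return false;
  }
}
```

In this model, replacing an `i32` load with a `float` element is rejected, while replacing it with the result of `toyBitCast(elt, "i32")` is accepted.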

arsenm accepted this revision. May 8 2020, 2:47 PM
This revision is now accepted and ready to land. May 8 2020, 2:47 PM
This revision was automatically updated to reflect the committed changes.