This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPUPromoteAlloca] Make compatible with opaque pointers
ClosedPublic

Authored by nikic on Mar 10 2022, 3:56 AM.

Details

Summary

This mainly changes the handling of bitcasts to not check the types being casted from/to -- we should only care about the actual load/store types. The GEP handling is also changed to not care about types, and just make sure that we get an offset corresponding to a vector element.

This was a bit of a struggle for me, because this code seems to be pretty sensitive to small changes. The end result seems to produce strictly better results for the existing test coverage though, because we can now deal with more situations involving bitcasts.

Diff Detail

Event Timeline

nikic created this revision.Mar 10 2022, 3:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 3:56 AM
nikic requested review of this revision.Mar 10 2022, 3:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 3:56 AM
arsenm added inline comments.Mar 10 2022, 7:11 AM
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
433

Doesn't this need to check the exact operand number in case a pointer is being stored to itself?

llvm/test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll
40

Weird this test isn't using the right address space

75–79

I'm not sure what's going on here, but it may be an artifact of the test using the wrong address space for the stack

nikic updated this revision to Diff 414373.Mar 10 2022, 7:18 AM
nikic marked an inline comment as done.

Check exact operand

nikic added inline comments.Mar 10 2022, 7:23 AM
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
433

Yeah, at least far as intent goes. I don't think it's really relevant for correctness, but possibly for profitability.

llvm/test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll
40

Yeah, though I don't think it really matters for the transform. Should I adjust the test to use addrspace(5) for the allocas?

75–79

This is an intended change: bitcast should get the same treatment as gep p, 0, 0. This transform generally relies on SROA and InstCombine to clean up the mess afterwards.

arsenm accepted this revision.Mar 10 2022, 8:04 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
433

A captured pointer shouldn't be handled for correctness so I guess it doesn't matter

llvm/test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll
40

It shouldn't matter. I can fix this test later

This revision is now accepted and ready to land.Mar 10 2022, 8:04 AM
This revision was landed with ongoing or failed builds.Mar 11 2022, 12:24 AM
This revision was automatically updated to reflect the committed changes.