This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Legalize soffset of buffer instruction
AcceptedPublic

Authored by skc7 on Dec 12 2022, 1:15 AM.

Details

Summary

Legalize soffset of buffer instructions. If vgpr is assigned to soffset, use readlaneVGPRToSGPR to copy value from vgpr to sgpr.

Diff Detail

Event Timeline

skc7 created this revision.Dec 12 2022, 1:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 1:15 AM
skc7 requested review of this revision.Dec 12 2022, 1:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 1:15 AM
foad added a comment.Dec 12 2022, 2:46 AM

Looks good.

llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
4

Remove inreg throughout? I don't think it does anything unless you use one of the graphics calling conventionsl like amdgpu_ps.

Also, you are using a non-uniform %rsrc so the compiler is (correctly) generating a waterfall loop, but that's not what you're testing for here. So could you change the test to use a uniform %rsrc, and then the generated code would be much simpler, and you could use update_mir_test_checks.py.

skc7 updated this revision to Diff 482391.Dec 13 2022, 1:31 AM
skc7 added a reviewer: cdevadas.

Removed inreg to params of tests. Made srsrc operand undef.

foad accepted this revision.Dec 13 2022, 5:37 AM

LGTM, thanks.

This revision is now accepted and ready to land.Dec 13 2022, 5:37 AM
foad added a comment.Dec 13 2022, 5:38 AM

Made srsrc operand undef.

We're being told to use poison instead of undef in new tests, wherever possible.

arsenm added inline comments.Dec 13 2022, 5:38 AM
llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
15

I still think this is just a flat out wrong lowering that needs to use a waterfall loop

foad added inline comments.Dec 13 2022, 5:41 AM
llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
15

What's the use case for that? My understanding is that isel will never put a divergent value in soffset, and for the intrinsics we can define that soffset has to be uniform.

arsenm added inline comments.Dec 13 2022, 5:42 AM
llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
15

You cannot define that a value has to be uniform. Every transform would need to be aware of that concept and apply it. load constant 32-bit is similarly broken in pretending to provide a uniformity guarantee

foad added inline comments.Dec 13 2022, 6:08 AM
llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
15

Hmm. I guess you could define that an input had to be uniform if you marked the intrinsic as the-opposite-of-convergent, to say it cannot be hoisted out of an "if"?

arsenm added inline comments.Dec 13 2022, 2:43 PM
llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
15

I don't think we would want to do that. I think we just need to be able to handle waterfall loops for anything and have optimizations to avoid this kind of problem. We don't need a hard constraint

foad added inline comments.Dec 14 2022, 2:06 AM
llvm/test/CodeGen/AMDGPU/legalize-soffset-mbuf.ll
15

OK, well I have no objections to someone working on that approach. But it does feel like the optimization that you would need to implement to avoid hurting performance by introducing a waterfall loop where you don't really need one, would be very very similar to the hard constraint imposed by marking the intrinsic as the-opposite-of-convergent.

skc7 updated this revision to Diff 483102.Dec 15 2022, 2:31 AM
skc7 set the repository for this revision to rG LLVM Github Monorepo.

Rebase. Use poison for srsrc operands.