This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Legalize the operand of SI_INIT_M0
ClosedPublic

Authored by nhaehnle on Apr 19 2018, 10:25 AM.

Details

Summary

This fixes a case where the argument to a sendmsg intrinsic
ends up in a VGPR, for whatever reason.

The underlying performance issue is that a multiplication that
can be an s_mul_i32 is instead needlessly generated as
v_mul_u32_u24, but this is not addressed by this patch.

Change-Id: I61fd4034314d5acdf6074632c30b65364dfa7328

Diff Detail

Repository
rL LLVM

Event Timeline

nhaehnle created this revision. Apr 19 2018, 10:25 AM
rampitec added inline comments. Apr 19 2018, 10:30 AM
lib/Target/AMDGPU/SIInstrInfo.cpp
3303 ↗(On Diff #143122)

Don't you want to return SRegs[0] here (after the loop) if SubRegs == 1, and otherwise move the creation of DstReg after that? It would save some code duplication.

nhaehnle added inline comments. Apr 19 2018, 11:43 AM
lib/Target/AMDGPU/SIInstrInfo.cpp
3303 ↗(On Diff #143122)

That's what I had at first, but it doesn't work because the loop uses RI.getSubRegFromChannel(i), and if the SrcReg is an SGPR32 this fails because it has no subregs...

This was apparently never noticed because so far, the function was only used for resource descriptors and samplers.
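
To illustrate the point about sub-registers, here is a minimal, self-contained toy model in C++. It is not the patch and uses no LLVM APIs: ToyReg, toySubReg and toyReadlaneToScalar are hypothetical names, and the emitted strings are only pseudo-assembly. It assumes that a 32-bit register has no sub-registers, so the single-dword case has to be handled before the per-channel loop rather than after it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these names are LLVM APIs.
struct ToyReg {
    std::string name;    // e.g. "v3" or "v[4:7]"
    unsigned numDwords;  // 1 for a 32-bit register
};

// Stand-in for a sub-register lookup. In this model a 32-bit register
// has no sub-registers, so asking for one is an error.
std::string toySubReg(const ToyReg &reg, unsigned channel) {
    assert(reg.numDwords > 1 && "32-bit register has no sub-registers");
    assert(channel < reg.numDwords && "channel out of range");
    return reg.name + ".sub" + std::to_string(channel);
}

// Returns the pseudo-instructions that copy a vector register into
// scalar registers, one dword at a time.
std::vector<std::string> toyReadlaneToScalar(const ToyReg &src) {
    std::vector<std::string> out;

    // Special case: a single dword is read directly, because the loop
    // below would call toySubReg on a register without sub-registers.
    if (src.numDwords == 1) {
        out.push_back("v_readfirstlane_b32 s_dst, " + src.name);
        return out;
    }

    // General case: read each channel, then combine the pieces.
    std::string seq = "reg_sequence s_dst";
    for (unsigned i = 0; i < src.numDwords; ++i) {
        std::string tmp = "s_tmp" + std::to_string(i);
        out.push_back("v_readfirstlane_b32 " + tmp + ", " + toySubReg(src, i));
        seq += ", " + tmp;
    }
    out.push_back(seq);
    return out;
}
```

In this sketch, calling toyReadlaneToScalar on a one-dword register yields a single read, while a four-dword register yields four reads followed by one combining step. Moving the single-dword return after the loop would hit the assertion in toySubReg, which mirrors the failure described above.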

This revision is now accepted and ready to land. Apr 19 2018, 12:39 PM
This revision was automatically updated to reflect the committed changes.