This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix various issues around the VirtReg2Value mapping
ClosedPublic

Authored by nhaehnle on Nov 9 2018, 11:51 AM.

Details

Summary

The VirtReg2Value mapping is crucial for getting consistently
reliable divergence information into the SelectionDAG. This
patch fixes a bunch of issues that lead to incorrect divergence
info and introduces tight assertions to ensure we don't regress:

  1. VirtReg2Value is generated lazily; there were some cases where a lookup was performed before all relevant virtual registers were created, leading to an out-of-sync mapping. Those cases were:
    • Complex code to lower formal arguments that generated CopyFromReg nodes from live-in registers (fixed by never querying the mapping for live-in registers).
    • Code that generates CopyToReg for formal arguments that are used outside the entry basic block (fixed by never querying the mapping for Register nodes, which don't need the divergence info anyway).
  1. For complex values that are lowered to a sequence of registers, all registers must be reflected in the VirtReg2Value mapping.

I am not adding any new tests, since I'm not actually aware of any
bugs that these problems are causing with trunk as-is. However,
I recently added a test case (in r346423) which fails when D53283 is
applied without this change. Also, the new assertions should provide
most of the effective test coverage.

There is one test change in sdwa-peephole.ll. The underlying issue
is that since the divergence info is now correct, the DAGISel will
select V_OR_B32 directly instead of S_OR_B32. This leads to an extra
COPY which affects the behavior of MachineLICM in a way that ends up
with the S_MOV_B32 with the constant in a different basic block than
the V_OR_B32, which is presumably what defeats the peephole.

Diff Detail

Repository
rL LLVM

Event Timeline

nhaehnle created this revision.Nov 9 2018, 11:51 AM

Is there code for the lazy map that should be cleaned up now?

lib/Target/AMDGPU/SIISelLowering.cpp
9287 ↗(On Diff #173402)

This is only used with asserts enabled, so will warn?

alex-t added inline comments.Nov 14 2018, 3:43 AM
lib/Target/AMDGPU/SIISelLowering.cpp
9305 ↗(On Diff #173402)

Removing the line "case ISD::Register", why don't you remove the related lines either?

nhaehnle updated this revision to Diff 174354.Nov 16 2018, 4:12 AM
  • add LLVM_ATTRIBUTE_UNUSED to prevent warning in optimized builds
  • cleanup isSDNodeSourceOfDivergence a bit more
nhaehnle marked 2 inline comments as done.Nov 16 2018, 4:13 AM

Is there code for the lazy map that should be cleaned up now?

Not that I'm aware of. The lazy mapping is done by FunctionLoweringInfo::getValueFromVirtualReg, which this change modifies to fix the problem with composite values.

lib/Target/AMDGPU/SIISelLowering.cpp
9305 ↗(On Diff #173402)

Good point, cleaned this up a bit more.

nhaehnle marked an inline comment as done.Nov 29 2018, 3:04 AM

ping

rampitec added inline comments.Nov 29 2018, 7:16 AM
lib/Target/AMDGPU/SIISelLowering.cpp
9310 ↗(On Diff #174354)

Can we change to !isSGPRReg() just be on a safe side?

nhaehnle updated this revision to Diff 176104.Nov 30 2018, 5:57 AM
  • prefer to use !TRI.isSGPRReg
nhaehnle marked an inline comment as done.Nov 30 2018, 5:57 AM
This revision is now accepted and ready to land.Nov 30 2018, 7:28 AM
This revision was automatically updated to reflect the committed changes.