This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Improve verifier wrt vcc subregs
AbandonedPublic

Authored by rampitec on Dec 5 2017, 12:35 PM.

Details

Reviewers
arsenm
msearles
Summary

Registers vcc_lo and vcc_hi are added to the checks for constant bus
restriction. This has resulted in two test failures:

  1. attr-amdgpu-num-sgpr.ll fails at instructions: %vgpr4<def> = V_MOV_B32_e32 %vcc_lo, ... %vcc<imp-use> %vgpr5<def> = V_MOV_B32_e32 %vcc_hi, ... %vcc<imp-use,kill> ...

    We were incorrectly assuming that vcc and vcc_lo uses are distinct.
  1. inserted-wait-states.mir fails at the instruction:

    %vgpr4<def> = V_WRITELANE_B32 %sgpr4, %vcc_lo

    v_writelane_b32 is an exception from constant bus restriction: vsrc0 can be sgpr, const or m0 and lane select sgpr, m0 or inline-const.

Both verifier errors are fixed.

Diff Detail

Event Timeline

rampitec created this revision.Dec 5 2017, 12:35 PM
arsenm added inline comments.Dec 5 2017, 1:06 PM
lib/Target/AMDGPU/SIInstrInfo.cpp
2458

I was trying to clean this up recently to avoid having to list VCC here at all by checking SReg_32/SReg_64 classes instead. I think there's an issue here because implicit really means two different things. This really wants to be checking for the hardware implicit physical register uses from the instruction definition, which only applies to VCC. No instructions implicitly read just vcc_lo or vcc_hi. Implicit can also mean extra operands the compiler added on for modeling other constraints, which should not be verified for constant bus usage.

2473–2474

There aren't any instructions that use these for implicit operands.

In the first test you mentioned, I would hope this would be skipping any of the extra implicit operands beyond the defined exec use since they aren't actually going to be used by the hardware and are just for the compiler internals.

2739

Why would noreg count as a constant bus use? I wouldn't expect to see a noreg operand ever reach here

2740

I'm not sure this is correct. Consider:

v_cndmask_b32_e32 v0, vcc_hi, v1, <implicit vcc>

I'm not sure if this is valid or not.

rampitec added inline comments.Dec 5 2017, 1:12 PM
lib/Target/AMDGPU/SIInstrInfo.cpp
2739

Is we had no SGPRUsed before it is initialized with NoRegister. Now as we found a used SGPR and it is not the same as was used before we shall record it. It is in fact the same behavior as before, just NoRegister cannot be passed into regsOverlap().

2740

I think it is valid as we are going to read the whole pair in any way.

Is this still needed?

rampitec abandoned this revision.Feb 21 2019, 7:05 PM