When adding operands to machine instructions in case of RegisterSDNodes,
generate a COPY node in case the register class does not match the one in the instruction definition.
Details
- Reviewers
sdardis zoran.jovanovic
Diff Detail
Event Timeline
I think the approach here is incorrect, as this patch is solving a very specific case when the bug is more general.
During the instruction emission of SDNodes to MachineInstrs, InstrEmitter::AddOperand() fails to account
for handling register operands whose register class differs from the instruction definition.
Normally the initial register class for an incoming register operand is chosen based on the value type it
holds, which in this case is GPR32. The problematic SW16_MM however takes a GPRMM16Zero register
operand.
Can you look at modifying MachineInstrs, InstrEmitter::AddOperand() to use AddRegisterOperand in the
case of RegisterSDNodes?
Thanks,
Simon
Compare the register classes one would get for the machine instruction operand Op as opposed to the register class we would get based on the MCInstrDesc.
If those two would differ, we would emit an additional COPY node before adding the register which now has the desired register class.
Tried using the AddRegisterOperand() directly, but couldn't do so since getVR called inside cannot look up the SDValue Op inside the map -- that value happens to be a COPY node as well for which .isMachineOpcode() returns false.
As far as I can tell, additions to that map are being done in functions called from InstrEmitter::EmitMachineNode() which I assume deals with machine nodes only.
I believe the entire hunk for handling RegisterSDNodes should be sunk into AddRegisterOperand as your change mimics portions of the logic there.
lib/CodeGen/SelectionDAG/InstrEmitter.cpp | ||
---|---|---|
423 | Space after the if, and the line is overly long. |
Please update the title with [SelectionDAG] tag and description to reflect that this is now a selection dag patch. Also, can you restore the code to where it previously was, I was incorrect about that, as it'd just make the code harder to follow in terms of handling virtual and physical registers in the same function.
lib/CodeGen/SelectionDAG/InstrEmitter.cpp | ||
---|---|---|
319–321 | This comment should go above: bool Imp = II && (IIOpNum >= II->getNumOperands() && !II->isVariadic()); | |
330 | Formatting, space after the if. |
It appears this also fixes a machine verifier failure in test/CodeGen/AMDGPU/r600.work-item-intrinsics.ll, but I am unsure on the correctness of the changed assembly output. +CC'ing Jan Vesely who provided the test.
It'd help to see the diff. The test checks that certain registers are reserved and not allocated when thread IDs are used by the shader.
TIDIG.{x,y,z} is in T0.{x,y,z}, TGID.{x,y,z} is in T1.{x,y,z}.
It looks like this patch is OK other than introducing an extra MOV, I haven't looked into why that is.
These look good. An additional MOV is inserted since MEMRAT_CACHELESS STORE_RAW includes constraint of using _X subreg for 32bit stores.
When you edit the test please make sure to test both the MOV from the correct id register and store of the right data.
thank you.
LGTM.
test/CodeGen/AMDGPU/r600.work-item-intrinsics.ll | ||
---|---|---|
1 | Add -verify-machineinstrs here to ensure we don't regress. |
This comment should go above: