This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] All GWS instructions need aligned VGPR on gfx90a
ClosedPublic

Authored by rampitec on May 26 2021, 12:05 PM.

Diff Detail

Event Timeline

rampitec created this revision.May 26 2021, 12:05 PM
rampitec requested review of this revision.May 26 2021, 12:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2021, 12:05 PM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm added inline comments.May 26 2021, 12:13 PM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
4228–4237

Should just constrain the register class to an even aligned 32-bit register class

rampitec added inline comments.May 26 2021, 12:20 PM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
4228–4237

That was the first thing I did. It is much more code to handle just these 3 instructions, and it does not really work, I cannot prevent SIFoldOperands from replacing a register with unaligned one. Then isLegalRegOperand() should also handle this situation, which in turn would mean duplicating ds_gws opcodes with a different operand RC, handling new operand type in asm/disasm and other places.

On top of that such RC must have OtherVT as the only supported VT so that ISel will not use it after a call to getMinimalPhysRegClass(). This RC also will make it a quest to implement something line getCompatibleSubRegClass() and similar when an odd subreg is requested.

This is really an overkill.

rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
4228–4237

You can take a look at D103213 which does that. I was able to get away w/o new operand types, but I still believe that is an overkill to do it that way.

JBTW, this patch directly reflects what's actually happen in HW. Even though these 3 instructions read 32 bit as a source internally they request to read 64 bit which in turn triggered alignment requirement. I assume this is really modeled as "read dword as an operand and have an implicit read of a superreg".

rampitec updated this revision to Diff 349095.Jun 1 2021, 1:51 PM

Added verifier check.

arsenm added inline comments.Jun 1 2021, 2:14 PM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
4228–4237

SIFoldOperands should be respecting operand constraints

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4351–4353

The SSA check doesn't make sense, the class is unrelated

rampitec added inline comments.Jun 1 2021, 2:20 PM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
4228–4237

And for that I had to write a pipe of code in the D103213 to handle it in the isOperandLegal(), which as I said is an overkill.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4351–4353

Any idea how to skip verifier after selection with gisel and before finalize lowering? We should have lot more problems like this as finalizeLowering() calls legalizeOperands() for many more instructions, we just do not have enough tests I guess.

rampitec updated this revision to Diff 349097.Jun 1 2021, 2:21 PM

Fix tidy warning.

rampitec added inline comments.Jun 1 2021, 2:45 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4351–4353

Here is what happens with gisel:

# *** IR Dump After InstructionSelect (instruction-select) ***:
# Machine code for function gws_init_odd_agpr: IsSSA, TracksLiveness, Legalized, RegBankSelected, Selected
Function Live Ins: $sgpr0_sgpr1 in %1, $sgpr2 in %6
...
  DS_GWS_INIT %16:vgpr_32, 0, implicit $m0, implicit $exec :: (store 4 into custom "GWSResource")
...
# *** IR Dump After Finalize ISel and expand pseudo-instructions (finalize-isel) ***:
# Machine code for function gws_init_odd_agpr: IsSSA, TracksLiveness, Legalized, RegBankSelected, Selected
Function Live Ins: $sgpr0_sgpr1 in %1, $sgpr2 in %6
...
  BUNDLE implicit %19:vreg_64_align2, implicit $m0, implicit $exec {
    DS_GWS_INIT %19.sub0:vreg_64_align2, 0, implicit $m0, implicit $exec, implicit %19:vreg_64_align2 :: (store 4 into custom "GWSResource")
    S_WAITCNT 0
  }

If verifier kiks-in before finalizeLowering the instruction is not yet fixed, just like any other instruction which is supposed to be legalized there.

arsenm added inline comments.Jun 1 2021, 3:09 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4351–4353

Oh right, this problem. We currently have an ugly hack where we disable the verifier for selectiondag before finalize lowering, but I would like to avoid spreading that problem.

Can you adjust the class directly in selectDSGWSIntrinsic for gisel?

rampitec updated this revision to Diff 349126.Jun 1 2021, 4:16 PM
rampitec marked 2 inline comments as done.

Same code added to selectDSGWSIntrinsic().

arsenm accepted this revision.Jun 1 2021, 4:29 PM

LGTM plus comment

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1394

Should add a comment explaining this hack

This revision is now accepted and ready to land.Jun 1 2021, 4:29 PM
rampitec updated this revision to Diff 349135.Jun 1 2021, 4:51 PM
rampitec marked an inline comment as done.

Added comment.

This revision was landed with ongoing or failed builds.Jun 1 2021, 5:08 PM
This revision was automatically updated to reflect the committed changes.