This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix size of SOPK instructions to 4 bytes
ClosedPublic

Authored by gandhi21299 on Sep 15 2022, 1:46 PM.

Details

Summary

Instructions in SOPK format may not have 32-bit
literal constants following the instruction.

Diff Detail

Event Timeline

gandhi21299 created this revision.Sep 15 2022, 1:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 1:46 PM
gandhi21299 requested review of this revision.Sep 15 2022, 1:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 1:46 PM

I believe you are correct about SOPK being fixed size. This looks like it could be an NFC change, because the code below would detect if there was a literal used. If you have any reason to think its not NFC, please add a test. That said, I think a better way to do the change is set FixedSize = 1 on SOPP_Pseudo.

arsenm added inline comments.Sep 15 2022, 5:23 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7486 ↗(On Diff #460498)

Shouldn't special case this here, FixedSize should be set.

Set FixedSize for SOPK_Pseudo

@Joe_Nash This patch fixes a bug where a literal is appended to the instruction. It has hardly been observed, however a test will be helpful. I will write one up,

foad added a comment.Sep 16 2022, 1:36 AM

I believe you are correct about SOPK being fixed size. This looks like it could be an NFC change, because the code below would detect if there was a literal used. If you have any reason to think its not NFC, please add a test. That said, I think a better way to do the change is set FixedSize = 1 on SOPP_Pseudo.

Right, FixedSize was originally introduced (D25500) as a special case for instructions that are known not to have a literal operand even though the InOperandList contains something unusual that might look like a literal. It should not be needed in general for all instructions that happen to have a fixed size. So it would be nice to know what unusual case prompted this patch.

gandhi21299 updated this revision to Diff 461029.EditedSep 17 2022, 2:12 PM
  • added a test to check for block sizes through offsets printed during branch relaxation
arsenm added inline comments.Sep 19 2022, 6:17 AM
llvm/test/CodeGen/AMDGPU/sopk-no-literal.ll
2

If you're going to rely on -debug-only, this needs REQUIRES: asserts

12

Don't need attrs comment

13

I'm not seeing where there's a sopk instruction in this function

90–91

Don't need all the instruction flags. Should also use named values

gandhi21299 marked 4 inline comments as done.

Address Matt's comments

llvm/test/CodeGen/AMDGPU/sopk-no-literal.ll
13

s_waitcnt_* are all SOPK instructions

90–91

I will remove them since they are eliminated and do not generate any SOPK anyways.

arsenm added inline comments.Sep 19 2022, 9:01 AM
llvm/test/CodeGen/AMDGPU/sopk-no-literal.ll
6–10

For a function that does not require relaxation, how did it introduce a block with size 0?

gandhi21299 added inline comments.Sep 19 2022, 10:44 AM
llvm/test/CodeGen/AMDGPU/sopk-no-literal.ll
6–10

It was first introduced due to the structurize CFG pass. I think it combines a sequence of instructions into a phi instruction.

arsenm added inline comments.Sep 19 2022, 11:19 AM
llvm/test/CodeGen/AMDGPU/sopk-no-literal.ll
6–10

I'd somewhat prefer if the IR was a more direct match to what comes out here, so pre-structurize. You could also try using s.setreg intrinsics for direct emission of a sopk

  • Simplified sopk-no-literal.ll test. This test checks for the size of the kernel which should be 28 bytes as each instruction is 4 bytes long, except for s_load_b32 which is 8 bytes in size.
  • Adding a newline in the end of the test file.
arsenm accepted this revision.Sep 20 2022, 1:12 PM
This revision is now accepted and ready to land.Sep 20 2022, 1:12 PM
This revision was automatically updated to reflect the committed changes.