This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][MC][GFX11] Add partial NSA format for image sample instructions
ClosedPublic

Authored by mbrkusanin on Feb 14 2023, 10:35 AM.

Details

Summary

Image sample instructions that need more than 5 VGPRs for VAddr can use
partial NSA for NSA encoding format. VGPRs that can not fit into the
encoding are sequential after the last one.
This patch adds assembly and disassembly parts.

Diff Detail

Event Timeline

mbrkusanin created this revision.Feb 14 2023, 10:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2023, 10:35 AM
mbrkusanin requested review of this revision.Feb 14 2023, 10:35 AM

Same could be done for GFX10, however differences between GFX10.1 and GFX10.3 cause inconveniences.
MaxNSA size for 10.1 is 5 and for 10.3 is 13 so _V6, _V7,... opcodes for GFX10 already exist where every vaddr is a VGPR_32.
We would need new versions exclusive for non-10.3.

Not sure if that is the best approach. We would end up with opcodes like:
IMAGE_SAMPLE_D_V1_V6_nsa_gfx10
IMAGE_SAMPLE_D_V1_V6_partial_nsa_gfx10
or
IMAGE_SAMPLE_D_V1_V6_nsa_gfx1030
IMAGE_SAMPLE_D_V1_V6_nsa_gfx1010
or
IMAGE_SAMPLE_D_V1_V6_nsa_navi2_gfx10
IMAGE_SAMPLE_D_V1_V6_nsa_navi1_gfx10

foad added inline comments.Feb 15 2023, 5:18 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
910–916
959

"more than"

971
986–987
foad added reviewers: critson, Restricted Project.Feb 15 2023, 5:20 AM
arsenm added inline comments.Feb 20 2023, 4:30 AM
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
3695

Should try to avoid raw generation checks and have some feature for this

  • new feature FeaturePartialNSAEncoding
  • NSAMaxSize is no longer a tablegen feature
foad added inline comments.
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
10

I think the idea is that the assembler should not include any of the "codegen" parts of the backend. If getNSAMaxSize needs to be shared by codegen and the assembler, it should be moved into AMDGPUBaseInfo - that is why AMDGPUBaseInfo exists.

llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
910–915

This should check !hasPartialNSAEncoding?

llvm/lib/Target/AMDGPU/GCNSubtarget.h
940

No else after return.

mbrkusanin marked an inline comment as done.
mbrkusanin added inline comments.
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
10

Moved to AMDGPUBaseInfo.

llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
910–915

Sure, it can.
Also changed isGFX10Plus() to STI.hasFeature(AMDGPU::FeatureNSAEncoding) below.

foad added a comment.Feb 22 2023, 6:38 AM

LGTM but I'd really like someone familiar with MIMGInstructions.td to take a look too.

llvm/lib/Target/AMDGPU/GCNSubtarget.h
137

Remove NSAMaxSize field?

llvm/lib/Target/AMDGPU/MIMGInstructions.td
1068

"... more VGPRs than ..."

mbrkusanin marked 2 inline comments as done.
foad added inline comments.Feb 22 2023, 7:11 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1950

I'm not sure if it's allowed to call into GCNSubtarget from this code - does anyone else know?

arsenm added inline comments.Feb 22 2023, 7:13 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1950

No, GCNSubtarget is from codegen.

foad added inline comments.Feb 22 2023, 7:20 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1950

I guess it happens to work if you something defined in GCNSubtarget.h. But really I don't think we should be including GCNSubtarget.h here in the first place.

foad added inline comments.Feb 22 2023, 8:00 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1950
mbrkusanin added inline comments.Feb 22 2023, 8:16 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1950

Should I duplicate the code from getNSAMaxSize() here or have GCNSubtarget.h call AMDGPUBaseInfo version?

foad added inline comments.Feb 22 2023, 8:21 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1950

Have GCNSubtarget.h call AMDGPUBaseInfo version. You will have to use IsaVersion for the generation checks. There are lots of examples in AMDGPUBaseInfo.cpp.

mbrkusanin added inline comments.
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1949–1954

or

if (isGFX10(STI))
  return hasGFX10_3Insts(STI) ? 13 : 5;
if (isGFX11(STI))
  return 5;
return 0;
foad accepted this revision.Feb 23 2023, 4:04 AM

LGTM, thanks!

Hopefully @critson has taken a look and is happy too :)

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1949–1954

Either way is fine. But it should probably be Version.Minor >= 3 just in case someone invents GFX 10.4.

This revision is now accepted and ready to land.Feb 23 2023, 4:04 AM
mbrkusanin marked an inline comment as done.
This revision was landed with ongoing or failed builds.Feb 23 2023, 4:38 AM
This revision was automatically updated to reflect the committed changes.