This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][MC][GFX11] Add partial NSA format for image sample instructions
ClosedPublic

Authored by mbrkusanin on Feb 14 2023, 10:35 AM.

Details

Summary

Image sample instructions that need more than 5 VGPRs for VAddr can use
partial NSA for NSA encoding format. VGPRs that can not fit into the
encoding are sequential after the last one.
This patch adds assembly and disassembly parts.

Diff Detail

Event Timeline

mbrkusanin created this revision.Feb 14 2023, 10:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2023, 10:35 AM
mbrkusanin requested review of this revision.Feb 14 2023, 10:35 AM

Same could be done for GFX10, however differences between GFX10.1 and GFX10.3 cause inconveniences.
MaxNSA size for 10.1 is 5 and for 10.3 is 13 so _V6, _V7,... opcodes for GFX10 already exist where every vaddr is a VGPR_32.
We would need new versions exclusive for non-10.3.

Not sure if that is the best approach. We would end up with opcodes like:
IMAGE_SAMPLE_D_V1_V6_nsa_gfx10
IMAGE_SAMPLE_D_V1_V6_partial_nsa_gfx10
or
IMAGE_SAMPLE_D_V1_V6_nsa_gfx1030
IMAGE_SAMPLE_D_V1_V6_nsa_gfx1010
or
IMAGE_SAMPLE_D_V1_V6_nsa_navi2_gfx10
IMAGE_SAMPLE_D_V1_V6_nsa_navi1_gfx10

foad added inline comments.Feb 15 2023, 5:18 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
908–913
957

"more than"

974
989
foad added reviewers: critson, Restricted Project.Feb 15 2023, 5:20 AM
arsenm added inline comments.Feb 20 2023, 4:30 AM
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
3687

Should try to avoid raw generation checks and have some feature for this

  • new feature FeaturePartialNSAEncoding
  • NSAMaxSize is no longer a tablegen feature
foad added inline comments.
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
10

I think the idea is that the assembler should not include any of the "codegen" parts of the backend. If getNSAMaxSize needs to be shared by codegen and the assembler, it should be moved into AMDGPUBaseInfo - that is why AMDGPUBaseInfo exists.

llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
908–913

This should check !hasPartialNSAEncoding?

llvm/lib/Target/AMDGPU/GCNSubtarget.h
940 ↗(On Diff #498894)

No else after return.

mbrkusanin marked an inline comment as done.
mbrkusanin added inline comments.
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
10

Moved to AMDGPUBaseInfo.

llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
908–913

Sure, it can.
Also changed isGFX10Plus() to STI.hasFeature(AMDGPU::FeatureNSAEncoding) below.

foad added a comment.Feb 22 2023, 6:38 AM

LGTM but I'd really like someone familiar with MIMGInstructions.td to take a look too.

llvm/lib/Target/AMDGPU/GCNSubtarget.h
137 ↗(On Diff #499184)

Remove NSAMaxSize field?

llvm/lib/Target/AMDGPU/MIMGInstructions.td
1068

"... more VGPRs than ..."

mbrkusanin marked 2 inline comments as done.
foad added inline comments.Feb 22 2023, 7:11 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1948 ↗(On Diff #499495)

I'm not sure if it's allowed to call into GCNSubtarget from this code - does anyone else know?

arsenm added inline comments.Feb 22 2023, 7:13 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1948 ↗(On Diff #499495)

No, GCNSubtarget is from codegen.

foad added inline comments.Feb 22 2023, 7:20 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1948 ↗(On Diff #499495)

I guess it happens to work if you something defined in GCNSubtarget.h. But really I don't think we should be including GCNSubtarget.h here in the first place.

foad added inline comments.Feb 22 2023, 8:00 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1948 ↗(On Diff #499495)
mbrkusanin added inline comments.Feb 22 2023, 8:16 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1948 ↗(On Diff #499495)

Should I duplicate the code from getNSAMaxSize() here or have GCNSubtarget.h call AMDGPUBaseInfo version?

foad added inline comments.Feb 22 2023, 8:21 AM
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1948 ↗(On Diff #499495)

Have GCNSubtarget.h call AMDGPUBaseInfo version. You will have to use IsaVersion for the generation checks. There are lots of examples in AMDGPUBaseInfo.cpp.

mbrkusanin added inline comments.
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1947–1952 ↗(On Diff #499525)

or

if (isGFX10(STI))
  return hasGFX10_3Insts(STI) ? 13 : 5;
if (isGFX11(STI))
  return 5;
return 0;
foad accepted this revision.Feb 23 2023, 4:04 AM

LGTM, thanks!

Hopefully @critson has taken a look and is happy too :)

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1947–1952 ↗(On Diff #499525)

Either way is fine. But it should probably be Version.Minor >= 3 just in case someone invents GFX 10.4.

This revision is now accepted and ready to land.Feb 23 2023, 4:04 AM
mbrkusanin marked an inline comment as done.
This revision was landed with ongoing or failed builds.Feb 23 2023, 4:38 AM
This revision was automatically updated to reflect the committed changes.