This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][Disassembler] Adjust img instruction address field if a16 present
AbandonedPublic

Authored by dstuttard on Apr 30 2021, 5:29 AM.

Details

Reviewers
foad
rampitec
dp
Summary

A16 support for image instruction disassembly (gfx10) was missing

Diff Detail

Event Timeline

dstuttard created this revision.Apr 30 2021, 5:29 AM
dstuttard requested review of this revision.Apr 30 2021, 5:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2021, 5:29 AM
dstuttard updated this revision to Diff 341861.Apr 30 2021, 5:40 AM

IsA16 now bool - forgot to fix init

foad added a reviewer: dp.Apr 30 2021, 12:50 PM
dp added inline comments.May 4 2021, 7:30 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
737

Your change is definitely a great improvement, a lot of sp3-based MIMG tests now pass. However I'm not sure if a16 should affect size of gradients. gfx10_shader_programming only says that gradients are packed for g16 opcodes. sp3 code does not pack gradients for a16=1 either.

When I remove this condition from your patch, I see some improvements in test pass rate for _d and _cd opcodes. Below are a few tests which fail with your patch but pass if IsA16 condition is removed:

image_sample_cd v[5:6], v[1:8], s[8:15], s[12:15] dmask:0x3 dim:SQ_RSRC_IMG_2D a16
image_sample_cd v[5:6], v[1:8], s[8:15], s[12:15] dmask:0x3 dim:SQ_RSRC_IMG_CUBE a16
image_sample_cd v[5:6], v[1:8], s[8:15], s[12:15] dmask:0x3 dim:SQ_RSRC_IMG_2D_ARRAY a16
dp added inline comments.May 4 2021, 7:47 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
737

My previous comment was intended for your change in parser, sorry. https://reviews.llvm.org/D101619

sebastian-ne added inline comments.
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
737

The code in SIISelLowering suggests that A16 always needs G16 if derivatives are specified explicitely: https://github.com/llvm/llvm-project/blob/8e211bf1c895a31b3e9f49014b5494d8e1dabcf6/llvm/lib/Target/AMDGPU/SIISelLowering.cpp#L6098-L6103
I remember something like A16 implies G16, but I don’t remember where that comes from.

IIRC sp3 often shows larger registers than are actually used. LLVM is a lot stricter there.

dp added inline comments.May 4 2021, 10:38 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
737

IIRC sp3 often shows larger registers than are actually used. LLVM is a lot stricter there.

It is not always true. Actually llvm may align MIMG address size to 8/16: https://github.com/llvm/llvm-project/blob/e1c729c56829d3b9502b9ac2439003f87231db50/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp#L3436

I remember something like A16 implies G16, but I don’t remember where that comes from.

Is this a feature of our compiler or AMD H/W?

I have just checked that the latest sp3 do distinguish g16 and a16. Below are some examples (valid sp3 code):

image_sample_d_g16 v[0:3], [v0, v2, v4, v6],         s[0:7], s[8:11] dmask:0xf dim:SQ_RSRC_IMG_2D    
image_sample_d_g16 v[0:3], [v0, v2, v4],             s[0:7], s[8:11] dmask:0xf dim:SQ_RSRC_IMG_2D a16
image_sample_d     v[0:3], [v0, v2, v4, v6, v8, v9], s[0:7], s[8:11] dmask:0xf dim:SQ_RSRC_IMG_2D
image_sample_d     v[0:3], [v0, v2, v4, v6, v8],     s[0:7], s[8:11] dmask:0xf dim:SQ_RSRC_IMG_2D a16
dstuttard added inline comments.May 5 2021, 1:11 AM
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
737

I've done some more digging and I can confirm that the A16 should not be used for gradient packing. I'll update the code in all locations and re-submit.

dstuttard abandoned this revision.May 11 2021, 5:17 AM

I've combined this with some other changes into D102231 - easier than trying to separate all the various bits