This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add G16 support to image instructions
ClosedPublic

Authored by Flakebi on Mar 26 2020, 4:02 AM.

Details

Summary

Add G16 feature for GFX10 and refactor lowerImage to be easier to read
and support 16 bit derivatives.

Diff Detail

Event Timeline

Flakebi created this revision.Mar 26 2020, 4:02 AM
Flakebi updated this revision to Diff 252793.Mar 26 2020, 4:04 AM

Forgot to run formatter

Can you also add this to the GlobalISel legalization

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
5912–5936

ST->hasR128A16() && ST->hasGFX10A16(). We should probably add a hasA16 that checks both

5968

ST->hasG16()

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.encode.ll
3

Needs gfx9 checks too

Flakebi updated this revision to Diff 252894.Mar 26 2020, 10:22 AM

Thank you for the fast review!

I added G16 to GlobalISel legalization. Is there a part where the image instructions get selected for GlobalISel? I didn’t find it and I guess it doesn’t pick up G16 instructions automatically.

gfx9 does not have G16 (addresses are 32 bit, derivatives 16 bit) only A16 (addresses and derivatives are 16 bit), so it’s tested in unsupported-image-g16.ll.

Thank you for the fast review!

I added G16 to GlobalISel legalization. Is there a part where the image instructions get selected for GlobalISel? I didn’t find it and I guess it doesn’t pick up G16 instructions automatically.

gfx9 does not have G16 (addresses are 32 bit, derivatives 16 bit) only A16 (addresses and derivatives are 16 bit), so it’s tested in unsupported-image-g16.ll.

The GLobalISEl selection is stuck in review in D74316

What is the best way to handle A16 and G16 in instruction selection for GlobalISel?

They are only encoded in the types of arguments and not in the intrinsic id. In legalization, we can simply check the type of derivative and address arguments but afterwards, all arguments are packed into a singe, large value for the non-nsa case or several 32 bit values for nsa.
I think it’s hard to detect if this was originally A16 or even only G16 with 32 bit addresses.

Should I encode this information into metadata in the legalization step and use that in instruction selection?

I think this would look like I.getMetadata("amdgpu.a16") and I.getMetadata("amdgpu.g16").

What is the best way to handle A16 and G16 in instruction selection for GlobalISel?

They are only encoded in the types of arguments and not in the intrinsic id. In legalization, we can simply check the type of derivative and address arguments but afterwards, all arguments are packed into a singe, large value for the non-nsa case or several 32 bit values for nsa.
I think it’s hard to detect if this was originally A16 or even only G16 with 32 bit addresses.

I agree that this is sub-optimal, though metadata isn't the right way to go here.

IIRC the current approach is to intuit what the original intent was based on NoReg left over by GlobalISel legalization and the like. However, now that we have G_AMDGPU_xxx MIR opcodes for image instructions after legalization, one option would be to add a magic operand to those machine instructions which encode the A16- and G16-ness.

Flakebi updated this revision to Diff 265265.May 20 2020, 9:07 AM

As Nicolai suggested, I added an operand which encodes A16- and G16-ness.

A16 and G16 seem to be working now with SDag and GlobalISel.

Flakebi updated this revision to Diff 268123.Jun 3 2020, 3:30 AM

Fixed UseNSA with A16 in GlobalISel, which decided based on the number of address components instead of dwords/registers.

nhaehnle accepted this revision.Jun 10 2020, 3:35 PM

LGTM

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
5756

Maybe state that the comment uses "hi,lo" notation?

This revision is now accepted and ready to land.Jun 10 2020, 3:35 PM
Flakebi updated this revision to Diff 270338.Jun 12 2020, 2:27 AM

Add comment as Nicolai suggested

This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.3d.ll