Page MenuHomePhabricator

[AMDGPU] Fix codegen of image intrinsics for g16 and a16
ClosedPublic

Authored by dstuttard on May 7 2021, 4:41 AM.

Details

Summary

For gfx10 gradient (g16) and address (a16) can be independent. Previous
implementation assumed that a16 implied g16.

There are some other changes that fix the verification (as well as asm/disasm)
that are required for the included test to pass - the XFAIL will be removed in
those changes.

Diff Detail

Event Timeline

dstuttard created this revision.May 7 2021, 4:41 AM
dstuttard requested review of this revision.May 7 2021, 4:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2021, 4:41 AM

D101619, D101620 are related.

I'll upload some changes to those reviews shortly - including removing the XFAIL in the test included here (fails verification without them).

foad added a comment.May 7 2021, 5:00 AM

Does globalisel get this right? Maybe add a -global-isel RUN line to the new test?

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
6114–6115

"16 bit addresses need 16 bit addresses ..." reads wrong. But anyway maybe combine these two cases into a single "if (IsA16 != IsG16)" with an appropriately general error message?

6128

Might be simpler to remove IsTiedG16 and just test !ST->hasG16() again here.

6142

Seems like there are some future cleanup opportunities here: packImageA16AddressToDwords shouldn't really have "A16" in its name since it's used for G16 too, and maybe A16/G16 should be passed in as an argument so the "if (A16)" test can be inside the helper function.

dstuttard updated this revision to Diff 344371.May 11 2021, 5:25 AM
dstuttard marked 3 inline comments as done.

Addressed review comments
Added support for global-isel. Note: global-isel produces less optimal code
presently. Will require some additional work to add extra combines

Note: this requires D102231

See new patch with globalisel implementation

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
6114–6115

I'm in two minds about this - I'd tend to err on the side of providing a more specific error message.
On balance, less code is better I guess.

6128

I was going to make a clarity argument - but I think you're right. It is just clutter.

6142

Yes, you're probably right. I've renamed the function for now (but not done the extra work to unify).

dstuttard updated this revision to Diff 345125.May 13 2021, 6:49 AM

Moved part of this change into D102231 (required for that change).

This change should be submitted after that one

foad accepted this revision.May 13 2021, 7:45 AM

Looks reasonable, though I can't follow all the logic in packImage16bitOpsToDwords.

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1505

"STI.hasG16()"

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4034

Aren't "non-gradient / non-coordinate operands" handled in the "if" above? Also I don't understand why the case above has a bitcast but this case does not. But I see that that was the pre-existing behaviour.

4232

Braces around this multi-line if body please (maybe git clang-format @^ can do this for you?).

4237

Likewise.

This revision is now accepted and ready to land.May 13 2021, 7:45 AM
dstuttard marked 4 inline comments as done.May 13 2021, 9:02 AM
dstuttard added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1505

Meh - ok.

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4034

I'm not 100% sure why it needs to do this either. I tried unifying it, but it failed verification, so resorted to this which more closely matches the original implementation.
I've adjusted the comment so it is less confusing.

dstuttard updated this revision to Diff 345166.May 13 2021, 9:03 AM
dstuttard marked 2 inline comments as done.

Updates based on review comments

dstuttard added inline comments.May 13 2021, 9:04 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4232

Interestingly, git clang-format didn't do this (I usually run it before any commits).

This revision was landed with ongoing or failed builds.May 14 2021, 1:31 AM
This revision was automatically updated to reflect the committed changes.