This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix v3f16 interaction with image store workaround
ClosedPublic

Authored by sebastian-ne on Nov 5 2020, 6:12 AM.

Details

Summary

In some cases, the wrong amount of registers was reserved.

Also enable more v3f16 tests.

@rdomingu, does the generated code look ok?

Diff Detail

Event Timeline

sebastian-ne created this revision.Nov 5 2020, 6:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 5 2020, 6:12 AM
sebastian-ne requested review of this revision.Nov 5 2020, 6:12 AM
rdomingu added inline comments.Nov 5 2020, 12:00 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.1d.d16.ll
540

AFAIK, gfx810 can't pack v3f16 when d16 is set because of a hardware bug. So this instruction should use 3 vgpr (not 2 vgpr).

562

gfx9 has the same hardware bug. So this instruction should use 3 vgpr (not 2 vgpr).

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.store.2d.d16.ll
63

This instruction should use 3 vgpr (not 4 vgpr).

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.d16.dim.ll
79

This instruction should use 3 vgpr (not 2 vgpr).

115

This is instruction should use 3 vgpr (not 2 vgpr).

Fix wrong register counts.

sebastian-ne retitled this revision from [AMDGPU] Enable more v3f16 tests, NFC to [AMDGPU] Fix v3f16 interaction with image store workaround.Nov 10 2020, 8:58 AM
sebastian-ne edited the summary of this revision. (Show Details)
sebastian-ne added inline comments.Nov 10 2020, 9:04 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.1d.d16.ll
540

As far as I know, only image stores are affected and this is an image_load, so the workaround does not need to be applied.
If loads are also affected, we need to change the condition here: https://reviews.llvm.org/D81172#C2200035NL3550

562

Not a store, so this should be ok.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.store.2d.d16.ll
63

Thanks, fixed

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.d16.dim.ll
79

Not a store, so this should be ok.

115

You’re right, I fixed it.
PACKED is gfx9, so 2 vgprs is fine there.

Friendly ping for review

arsenm accepted this revision.Nov 18 2020, 9:14 AM
This revision is now accepted and ready to land.Nov 18 2020, 9:14 AM
This revision was landed with ongoing or failed builds.Nov 18 2020, 9:21 AM
This revision was automatically updated to reflect the committed changes.