This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Avoid using divergent value in mubuf addr64 descriptor
ClosedPublic

Authored by tpr on May 25 2018, 11:33 AM.

Details

Summary

This fixes a problem where a load from global+idx generated incorrect
code on <=gfx7 when the index is divergent.

Change-Id: Ib4d177d6254b1dd3f8ec0203fdddec94bd8bc5ed

Diff Detail

Repository
rL LLVM

Event Timeline

tpr created this revision.May 25 2018, 11:33 AM
arsenm added a comment.Jun 1 2018, 1:34 AM

The test seems pretty lacking. There are a lot more combinations of divergent and non-divergent changed here that aren't tested

lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1002 ↗(On Diff #148641)

SReg_64_XEXEC

arsenm added inline comments.Jun 1 2018, 1:36 AM
lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1002 ↗(On Diff #148641)

Probably should also create a helper that the constant lowering can use as well since this is basically the same

tpr updated this revision to Diff 151159.Jun 13 2018, 7:26 AM

V2: s_mov_b64 helper func; improved test

tpr marked an inline comment as done.Jun 13 2018, 7:27 AM
tpr added inline comments.
lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1002 ↗(On Diff #148641)

I left the SGPR_64RegClassID as that's what the existing code for a 64 bit constant load has. I have done the suggested helper func.

Thanks. I feel like some of this could perhaps be improved with a computeKnownBits if there is a 64-bit uniform base and a 32-bit non-uniform offset, but that doesn't have to be part of this change.

lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
370–374 ↗(On Diff #151159)

Have you run clang-format on this? It looks a bit off.

1006 ↗(On Diff #151159)

Spurious whitespace change.

1083–1098 ↗(On Diff #151159)

This has a lot of redundancy with the isBaseWithConstantOffset case above. Perhaps the cases can be combined?

tpr updated this revision to Diff 155388.Jul 13 2018, 7:56 AM

V3: minor formatting; commoned up two legs of fix

tpr marked 3 inline comments as done.Jul 13 2018, 7:56 AM
tpr updated this revision to Diff 155783.Jul 16 2018, 4:09 PM

V4: fixed bugs in commoning two legs

nhaehnle added inline comments.Jul 27 2018, 6:32 AM
test/CodeGen/AMDGPU/shader-addr64-nonuniform.ll
26–34 ↗(On Diff #155783)

Could you please add a similar test-case, with a non-uniform i64 %arg18 and %offset a constant? I don't think this case is covered by tests, and I'm not sure that the code would do the right thing for that case, where I think Addr64 would also be needed.

tpr updated this revision to Diff 158095.Jul 30 2018, 3:10 PM

V5: Added extra test suggested in review comment, and fixed the bug it showed up.

tpr added inline comments.Jul 30 2018, 3:11 PM
test/CodeGen/AMDGPU/shader-addr64-nonuniform.ll
26–34 ↗(On Diff #155783)

Good spot; that case was not covered.

nhaehnle accepted this revision.Jul 31 2018, 3:25 AM

Great, thanks!

This revision is now accepted and ready to land.Jul 31 2018, 3:25 AM
This revision was automatically updated to reflect the committed changes.