This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions
ClosedPublic

Authored by tstellarAMD on Dec 10 2015, 10:42 AM.

Details

Summary

We were previously selecting all constant loads to SMRD instructions and legalizing
the SMRDs with non-uniform addresses during the SIFixSGPRCopesPass.

This new solution is more simple and also generates much better code, because
the instruction selector is able to take advantage of all the MUBUF addressing
modes that are legalization pass wasn't able to.

We also no longer need to generate v_add_* instructions when we
have a uniform pointer and a non-uniform offset, as this is now folded into the
MUBUF instruction during instruction selection.

Diff Detail

Event Timeline

tstellarAMD retitled this revision from to AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions.
tstellarAMD updated this object.
tstellarAMD added a reviewer: arsenm.
tstellarAMD added a subscriber: llvm-commits.
arsenm added inline comments.Dec 10 2015, 11:10 AM
lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp
38–40

This + AllBranchesUniform don't seem to be used anywhere

88

Should return true as the conservative default if not recording when changes are actually made

lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
285–288

Why is this run after the control flow annotate pass?

lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
95–96 ↗(On Diff #42446)

I think the implementation of isSourceOfDivergence should be committed as a separate patch

111–118 ↗(On Diff #42446)

I think we should just move these into the public header so we get the enum IDs. I think mesa already directly uses these, so it's where they belong anyway.

119 ↗(On Diff #42446)

Do image/sample instructions need to be considered divergent? We also need some of the cross lane intrinsics but I'm not sure we have those yet

125 ↗(On Diff #42446)

s/threads/workitems in a wavefront/

137–138 ↗(On Diff #42446)

I think this should be split into a function called something like isArgPassedInSGPR with a comment about how inreg is interpreted

141–142 ↗(On Diff #42446)

Ty->isInt32Ty() / isFloatTy() would be better

lib/Target/AMDGPU/SIISelLowering.cpp
514

This maybe should check constant isa<Constant> which happens occasionally for LDS

lib/Target/AMDGPU/SIInstructions.td
2111

A smaller number would be better. 10 is probably more than enough

arsenm added inline comments.Dec 10 2015, 11:31 AM
lib/Target/AMDGPU/SIISelLowering.cpp
514

Probably also the various kinds of symbols

Split isSourceOfDivergence() implementation to another patch.

tstellarAMD marked 4 inline comments as done.Dec 11 2015, 7:45 PM
tstellarAMD added inline comments.
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
285–288

No particular reason. I'll probably move it when I finish uniform branching.

lib/Target/AMDGPU/SIInstructions.td
2111

Some of the MUBUF patterns have complexity over 25, so I changed this to 100.

arsenm accepted this revision.Dec 14 2015, 9:57 AM
arsenm edited edge metadata.

LGTM

This revision is now accepted and ready to land.Dec 14 2015, 9:57 AM
This revision was automatically updated to reflect the committed changes.