Page MenuHomePhabricator

[AMDGPU] Fix-up cases where writelane has 2 SGPR operands
AcceptedPublic

Authored by dstuttard on Sep 11 2018, 7:40 AM.

Details

Summary

Even though writelane doesn't have the same constraints as other valu
instructions it still can't violate the >1 SGPR operand constraint

Due to later register propagation (e.g. fixing up vgpr operands via
readfirstlane) changing writelane to only have a single SGPR is tricky.

This implementation puts a new check after SIFixSGPRCopies that prevents
multiple SGPRs being used in any writelane instructions.

The algorithm used is to check for trivial copy prop of suitable constants into
one of the SGPR operands and perform that if possible. If this isn't possible
put an explicit copy of Src1 SGPR into M0 and use that instead (this is
allowable for writelane as the constraint is for SGPR read-port and not
constant-bus access).

Diff Detail

Event Timeline

dstuttard created this revision.Sep 11 2018, 7:40 AM
rampitec added inline comments.Sep 11 2018, 12:39 PM
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
456

Why cannot you use loop in the runOnMachineFunction()?

481

You do not need vector here.

for (auto MO : {&Src0, &Src1})
dstuttard added inline comments.Sep 11 2018, 2:01 PM
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
456

I think we need all the sgpr to vgpr moves to have been completed before applying this fix since in some cases it might not be necessary.
I guess there's an argument for this to be done in a separate pass, or a later pass, then it could go into runOnMachineFunction - any suggestions?

481

Good point, I'll make this one.

rampitec added inline comments.Sep 11 2018, 2:15 PM
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
456

Given the semantics of writelane it is hard to believe its sources will be moved to VALU. Also if that is going to happen in general, it should have already happened by the time iterator would reach the instruction.

dstuttard updated this revision to Diff 165104.Sep 12 2018, 9:37 AM

Made suggested changes

dstuttard marked 5 inline comments as done.Sep 12 2018, 9:38 AM
dstuttard added inline comments.
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
456

Yes - I think you're right. I've changed it part of the main runOnMachineFunction.

rampitec added inline comments.Sep 12 2018, 12:01 PM
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
750

missing break.

dstuttard updated this revision to Diff 165215.Sep 13 2018, 1:32 AM
dstuttard marked 2 inline comments as done.

Added missing break

dstuttard added inline comments.Sep 13 2018, 1:34 AM
lib/Target/AMDGPU/SIFixSGPRCopies.cpp
750

Doh

This revision is now accepted and ready to land.Sep 13 2018, 1:37 AM
arsenm requested changes to this revision.Sep 13 2018, 1:54 AM
arsenm added inline comments.
lib/Target/AMDGPU/Utils/AMDGPUMCUtils.cpp
23–25 ↗(On Diff #165215)

This seems more like something for TII

This revision now requires changes to proceed.Sep 13 2018, 1:54 AM

Should have a special check in the verifier

dstuttard updated this revision to Diff 165274.Sep 13 2018, 6:51 AM

Moved foldToImm into SIInstrInfo as suggested
Implemented check in verifyInstruction and checked that it worked when the fix was removed

dstuttard marked 3 inline comments as done.Sep 13 2018, 6:52 AM

I don't actually understand why this code is where it is? Why is SIFixSGPRCopies doing this? To clarify is this just an optimization? My initial reaction was that it was a fix, but looking at it again it seems like an optimization to me

lib/Target/AMDGPU/SIFixSGPRCopies.cpp
744–746

This should be a COPY?

dstuttard updated this revision to Diff 166681.Sep 24 2018, 7:26 AM

Updated a mov to a copy as per review comment

I don't actually understand why this code is where it is? Why is SIFixSGPRCopies doing this? To clarify is this just an optimization? My initial reaction was that it was a fix, but looking at it again it seems like an optimization to me

It isn't an optimization - it's a bug. We encountered this in graphics shaders - hence the requirement for the fix.

SIFixSGPRCopies does feel like a strange place to put this fix. It has to be somewhere late enough to catch the issue since it's a transformation that happens after isel that causes the problem in this case. Have you got any other suggestions as to where it could go instead? FixSGPR copies isn't perfect, but doesn't seem too unreasonable as it is an SGPR related fix-up.

nhaehnle accepted this revision.Oct 8 2018, 8:41 AM

LGTM, to be honest. Matt?

@arsenm Matt - good to go?

arsenm accepted this revision.Jun 14 2019, 7:20 AM

LGTM

This revision is now accepted and ready to land.Jun 14 2019, 7:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2019, 7:20 AM