This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Optimize S_CBRANCH_VCC[N]Z -> S_CBRANCH_EXEC[N]Z
ClosedPublic

Authored by rampitec on Nov 6 2018, 11:11 AM.

Details

Summary

Sometimes after basic block placement we end up with a code like:

sreg = s_mov_b64 -1
vcc = s_and_b64 exec, sreg
s_cbranch_vccz

This happens as a join of a block assigning -1 to a saved mask and
another block which consumes that saved mask with s_and_b64 and a
branch.

This is essentially a single s_cbranch_execz instruction when moved
into a single new basic block.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision.Nov 6 2018, 11:11 AM
arsenm added inline comments.Nov 6 2018, 12:13 PM
lib/Target/AMDGPU/SIInsertSkips.cpp
341 ↗(On Diff #172798)

This should probably have something to avoid pointlessly scanning through the whole block. Maybe add a 4 instruction limit? The only instruction's you're looking for should only really ever be at the very end

390 ↗(On Diff #172798)

Is this registerDefIsDead really what you need? I would expect to need to use LivePhysRegs and check if it's live out (which would also avoid dependence on dead flags)

test/CodeGen/AMDGPU/insert-skip-from-vcc.mir
20 ↗(On Diff #172798)

The S_ENDPGMs at the end are very weird

302 ↗(On Diff #172798)

Needs a case with scc live out

rampitec updated this revision to Diff 172830.Nov 6 2018, 12:47 PM
rampitec marked 2 inline comments as done.

Added scan threshold and test with liveout scc.

lib/Target/AMDGPU/SIInsertSkips.cpp
390 ↗(On Diff #172798)

This check catches "implicit-def dead $scc", and that is what really produced by the compiler. This is the cheap test comparing to LivePhysRegs use. Any problems with it?

test/CodeGen/AMDGPU/insert-skip-from-vcc.mir
20 ↗(On Diff #172798)

They are here to avoid verifier error:

Bad machine code: MBB conditionally falls through out of function!

302 ↗(On Diff #172798)

If it is liveout it is then not a dead def at s_and_b64, otherwise IR is malformed. But OK, I have added it.
JBTW, is there any case where scc can really be a liveout?

arsenm added inline comments.Nov 7 2018, 9:19 AM
lib/Target/AMDGPU/SIInsertSkips.cpp
390 ↗(On Diff #172798)

You're supposed to avoid using kill flags now, but this is a dead flag which I'm less sure about.

test/CodeGen/AMDGPU/insert-skip-from-vcc.mir
302 ↗(On Diff #172798)

It's possible, but I don't think we produce it now. I know I've had patches I never completed where it appeared before when I was trying to enable if conversion

rampitec added inline comments.Nov 7 2018, 9:27 AM
lib/Target/AMDGPU/SIInsertSkips.cpp
390 ↗(On Diff #172798)

I would say unless there is a really good motivation I would avoid liveness recomputation. In addition we know this is an and with exec which we have produced in a specific place. E.g. the situation with live scc cannot even happen, the check is mostly a precaution.

nhaehnle accepted this revision.Nov 9 2018, 5:33 AM

Two stylistic nitpicks. LGTM with those addressed.

lib/Target/AMDGPU/SIInsertSkips.cpp
69 ↗(On Diff #172830)

Function names should be lowerCamelCase.

338–339 ↗(On Diff #172830)

Rename B to E please, that's more idiomatic.

This revision is now accepted and ready to land.Nov 9 2018, 5:33 AM
rampitec updated this revision to Diff 173399.Nov 9 2018, 11:44 AM
rampitec marked 2 inline comments as done.

Style fixes as requested.

rampitec updated this revision to Diff 173710.Nov 12 2018, 10:17 AM

Rebased to master.

This revision was automatically updated to reflect the committed changes.