This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Partially fix control flow at -O0
ClosedPublic

Authored by arsenm on Sep 3 2016, 12:33 PM.

Details

Summary

Fixes to allow spilling all registers at the end of the block
work with exec modifications. Don't emit s_and_saveexec_b64 for
if lowering, and instead emit copies. Mark control flow mask
instructions as terminators to get correct spill code placement
with fast regalloc, and then have a separate optimization pass
form the saveexec.

This should work if SGPRs are spilled to VGPRs, but
will likely fail in the case that an SGPR spills to memory
and no workitem takes a divergent branch.

Diff Detail

Event Timeline

arsenm updated this revision to Diff 70270.Sep 3 2016, 12:33 PM
arsenm retitled this revision from to AMDGPU: Partially fix control flow at -O0.
arsenm updated this object.
arsenm added subscribers: llvm-commits, Restricted Project.
arsenm updated this revision to Diff 70271.Sep 3 2016, 12:47 PM
arsenm edited edge metadata.

Attach right version of patch

Thanks for doing this. I need it to fix up D22195 for current trunk.

lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
561–564

Since the pass now actually runs even before machine scheduling, wouldn't it be better to add this elsewhere, e.g. addPreRegAlloc? That would also eliminate the duplicate code below.

lib/Target/AMDGPU/SIInstrInfo.cpp
853–871

This looks redundant with what's in the exec-optimize pass.

lib/Target/AMDGPU/SILowerControlFlow.cpp
212–216

What is the reason for SI_ELSE's operands being tied? I think I remember why they were tied while lowering happened after regalloc: it was a trick to make sure the live ranges of the register after lowering were subsets of the live ranges before lowering.

But now that the lowering happens before regalloc, that seems unnecessary. Am I missing something?

221–224

Did you run clang-format-diff on this? I think it will complain about the formatting. (I'm personally rather annoyed at the fact that it does, but we should play by the rules...)

lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
38

That name doesn't match with what's used elsewhere.

274–281

ANDN2 and ORN2 are non-commutative, so we have to bail here for those opcodes.

arsenm added inline comments.Sep 8 2016, 12:49 PM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
561–564

No, this needs to be run as part of regalloc to be post-SSA. It's easier to think about the two reg alloc paths as completely different

lib/Target/AMDGPU/SIInstrInfo.cpp
853–871

It's not, these are still required in the no-opt cases when the pass doesn't run

lib/Target/AMDGPU/SILowerControlFlow.cpp
212–216

I'm not sure, this was just preserving what it already did

nhaehnle added inline comments.Sep 9 2016, 1:21 AM
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
561–564

What gets me is that this is before machine scheduling now, so do people think about machine scheduling as part of register allocation as well? Anyway, it's not a big deal.

lib/Target/AMDGPU/SIInstrInfo.cpp
853–871

Ah yes, that makes sense. Maybe the code can still be shared somehow? But I have to admit that I don't see a great way to do so...

lib/Target/AMDGPU/SILowerControlFlow.cpp
212–216

Logically speaking, I think SI_ELSE should no longer use tied operands. Does anything break if the tie is removed?

arsenm added inline comments.Sep 13 2016, 9:09 PM
lib/Target/AMDGPU/SILowerControlFlow.cpp
221–224

I tried removing the tied and 5 tests fail with verifier errors (all Live segment doesn't end at a valid instruction). I think trying to fix these is a separate patch

nhaehnle accepted this revision.Sep 15 2016, 9:58 AM
nhaehnle added a reviewer: nhaehnle.

Fine with me.

This revision is now accepted and ready to land.Sep 15 2016, 9:58 AM
nhaehnle requested changes to this revision.Sep 15 2016, 10:01 AM
nhaehnle edited edge metadata.

Actually, I forgot about the ANDN2/ORN2 issue and the pass name that I remarked earlier.

This revision now requires changes to proceed.Sep 15 2016, 10:01 AM
arsenm updated this revision to Diff 71725.Sep 16 2016, 11:36 PM
arsenm edited edge metadata.

Check if commutable

nhaehnle accepted this revision.Sep 21 2016, 2:42 AM
nhaehnle edited edge metadata.

Your last change got lost in the mail somehow, sorry about that. LGTM.

This revision is now accepted and ready to land.Sep 21 2016, 2:42 AM

What's the status of this?

arsenm closed this revision.Sep 28 2016, 6:53 PM

r282667