This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Unify uniform return and divergent unreachable blocks
ClosedPublic

Authored by bcahoon on Oct 27 2022, 3:18 PM.

Details

Summary

This patch fixes a failed to annotate CFG error in
SIAnnotateControlFlow. The error indicates that the CFG is not
structurized.

The problem occurs when there are divergent unreachable blocks
and uniform return blocks in the same region. In this case,
AMDGPUUnifyDivergentExitNodes does not create a unified block so
the region contains multiple exits.

StructurizeCFG does not work properly when there are multiple
exits, so the necessary CFG transformations do not occur along
divergent control flow paths.

Subsequently, SIAnnotateControlFlow processes the paths to
divergent unreachable blocks, but may only partially process
blocks along a uniform control flow path to the return block.

This patch fixes the bug by creating a single exit block when
there is uniform return block and divergent unreachable blocks
in the same region. When the region contains a single exit,
then StructurizeCFG and SIAnnotateControlFlow transform the
CFG correctly.

Diff Detail

Event Timeline

bcahoon created this revision.Oct 27 2022, 3:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 27 2022, 3:18 PM
bcahoon requested review of this revision.Oct 27 2022, 3:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 27 2022, 3:18 PM
ruiling added inline comments.Oct 31 2022, 8:23 AM
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
224

I think the key point is we need to make sure to unify the function exits if there is any divergent branch to make sure the structurizer is not skipped when needed. The change does not work if there are both divergent reached return block and uniformly reached return block (no unreachable block). In this case, we still need to unify both the uniform/divergent return blocks. So, I would suggest you change to something like below:

scanning the terminators of all the blocks in the function,
if there is any divergent branch, set HasDivergentBranch to true.

then:
      if (!HasDivergentBranch && isUniformlyReached(DA, *BB))
        ReturningBlocks.push_back(BB);

It is also preferred to process unreachable blocks in the same way:
     if (!HasDivergentBranch && isUniformlyReached(DA, *BB))
        UnreachableBlocks.push_back(BB);
arsenm accepted this revision.Nov 1 2022, 12:36 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
200

If you're splitting the two return cases, should rename this

This revision is now accepted and ready to land.Nov 1 2022, 12:36 PM
bcahoon updated this revision to Diff 473386.Nov 4 2022, 6:48 PM

I changed the patch based upon @ruiling comment. He's correct that my patch doesn't catch the case when there is a uniform exit and a divergent exit. I looked at adding a HasDivergentBranch condition suggested in his comments, but that seemed to be overly conservative and created a lot of lit test changes.. Instead, the patch checks for is there is any exit block (return/unreachable) that is not uniformly reached. If that condition occurs, then all returns and unreachable blocks are unify. That fixes the test case, and only effects a couple of existing test cases. Two of those test cases, bool-legalization and skip-if-dead, are different because unreachable blocks are now added only if there is a divergent exit block. I think this conservatively will create a unified exit block correctly, but I'm interested in feedback.

ruiling accepted this revision.Nov 6 2022, 11:50 PM

LGTM

llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
224

I think the code suggestion I have given is wrong, really sorry for that. It should be: if (HasDivergentBranch) instead of if (...&&...), anyway this new version works for me.

This revision was landed with ongoing or failed builds.Nov 29 2022, 11:34 AM
This revision was automatically updated to reflect the committed changes.