This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SIInsertSkips: Fix the determination of whether early-exit-after-kill is possible
ClosedPublic

Authored by nhaehnle on Feb 20 2020, 9:09 AM.

Details

Summary

The old code made some incorrect assumptions about the order in which
basic blocks are laid out in a function. This could lead to incorrect
early-exits, especially when kills occurred inside of loops.

The new approach is to check whether the point where the conditional
kill occurs dominates all reachable code. If that is the case, there
cannot be any other threads in the wave that are waiting to rejoin
at a later point in the CFG, i.e. if exec=0 at that point, then all
threads really are dead and we can exit the wave.

Make some other minor cleanups to the pass while we're at it.

Change-Id: Ia0d2b113ac944ad642d1c622b6da1b20aa1aabcc

Diff Detail

Event Timeline

nhaehnle created this revision.Feb 20 2020, 9:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2020, 9:09 AM
arsenm added inline comments.Feb 20 2020, 9:17 AM
llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
85

The update to also preserve it shouldn't be too difficult

nhaehnle updated this revision to Diff 245806.Feb 21 2020, 1:51 AM
  • preserve the dominator tree
nhaehnle marked an inline comment as done.Feb 21 2020, 1:58 AM
foad added a comment.Feb 21 2020, 2:46 AM

Could you pre-commit the complex_loop test case, if that's the one that really demonstrates the bug fix?

nhaehnle updated this revision to Diff 245818.Feb 21 2020, 4:39 AM

The new testcase is now in master; rebase this change on top of it.

lgtm except the dead code

llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
341

Dead code

nhaehnle marked an inline comment as done.Feb 26 2020, 6:29 AM

Thanks.

llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
341

Removed.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 26 2020, 6:34 AM
This revision was automatically updated to reflect the committed changes.