This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges.
ClosedPublic

Authored by cfang on Mar 13 2019, 10:50 AM.

Details

Summary

At the exit of the loop, the compiler uses a register to remember and accumulate the number of threads that have already exited. When all active threads exit the loop, this register is used to restore the exec mask, and the execution continues for the post loop code.

When there is a "continue" in the loop, the compiler made a mistake to reset the register to 0 when the "continue" backedge is taken. This will result in some threads not executing the post loop code as they are supposed to.

This patch fixed the issue ( the test is yet to be developed).

Diff Detail

Repository
rL LLVM

Event Timeline

cfang created this revision.Mar 13 2019, 10:50 AM
msearles added inline comments.Mar 13 2019, 10:53 AM
lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
272 ↗(On Diff #190450)

Typo in comment: shouldl

Definitely needs several tests

cfang updated this revision to Diff 190658.Mar 14 2019, 10:18 AM

Fix a typo and add a test.

arsenm added inline comments.Mar 14 2019, 4:42 PM
lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
274 ↗(On Diff #190658)

Could this still break for some block that is in the loop, but doesn't dominate BB?

test/CodeGen/AMDGPU/si-annotatecfg-multiple-backedges.ll
2 ↗(On Diff #190658)

I kind of think we should just start generating the checks for any test that deals with control flow

cfang marked 2 inline comments as done.Mar 14 2019, 9:19 PM
cfang added inline comments.
lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
274 ↗(On Diff #190658)

I don’t think so. Given single entry single exit regions, and inside the loop, when BB and Pred are both backedge blocks, either Pred dominates BB or BB dominates Pred.

test/CodeGen/AMDGPU/si-annotatecfg-multiple-backedges.ll
2 ↗(On Diff #190658)

What do you want me to do here?

cfang marked 2 inline comments as done.Mar 14 2019, 9:22 PM
arsenm added inline comments.Mar 15 2019, 9:39 AM
test/CodeGen/AMDGPU/si-annotatecfg-multiple-backedges.ll
2 ↗(On Diff #190658)

Use utils/update_test_checks.py instead of the manually write checks. You just copied the output anyway, just for a subset of the function

cfang updated this revision to Diff 190857.Mar 15 2019, 11:12 AM

update the test with update_test_checks.py

The test needs an expansive comment describing what this is testing.

LGTM, but it would be better if Nicolai also looked at this

cfang updated this revision to Diff 190891.Mar 15 2019, 1:27 PM

Add comment to the test

Hmm, this is fragile -- I think your reasoning about domination is mostly sound, except when there's uniform control flow inside the loop itself, in which case I'm not sure. That said, it seems to me that the domination check is conservative, and your change shouldn't break anything that wasn't broken before, and a proper fix is potentially much more difficult.

Agree with Matt that using update_test_checks.py is a good idea, apart from that I think this can go in.

This revision is now accepted and ready to land.Mar 15 2019, 1:28 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2019, 2:02 PM