This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Don't remove short branches over kills
ClosedPublic

Authored by foad on Jan 31 2020, 2:48 AM.

Details

Summary

D68092 introduced a new SIRemoveShortExecBranches optimization pass and
broke some graphics shaders. The problem is that it was removing
branches over KILL pseudo instructions, and the fix is to explicitly
check for that in mustRetainExeczBranch.

Diff Detail

Event Timeline

foad created this revision.Jan 31 2020, 2:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2020, 2:48 AM
foad added a comment.Jan 31 2020, 2:49 AM

I realise that this needs a test case, but I wanted to submit it early as a heads-up to anyone else who is having problems with D68092.

Unit tests: pass. 62357 tests passed, 0 failed and 839 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

After some investigation the root cause of this appears to be that in some cases cbranch_execz instructions (relating to kills) are removed which leaves the SIInsertSkips pass unable to determine that a kill is being executed in branching control flow. The result is that SIInsertSkips inserts skipIfDead code where it shouldn't because it is unaware the exec mask does not contain allow active threads. This change is the least invasive solution, but the alternative would be to reengineer SIInsertSkips. In either case this seems to be a blind spot in the skip-if-dead lit tests.

I am moving the kill intrinsic handling into a separate pass. It is partially done. Fixing some corner cases now.
SIInsertSkips will eventually go away.
The handling/optimizaton of cbranch_execz insertion has been moved to the new pass SIRemoveShortExecBranches.

nhaehnle accepted this revision.Feb 2 2020, 10:27 AM

Ugh, this is yet another recent regression caused by this rework, and we should fix it quickly. I think this change should go in. Please do add a test though.

This revision is now accepted and ready to land.Feb 2 2020, 10:27 AM
foad updated this revision to Diff 241987.Feb 3 2020, 1:25 AM

Add test case from @critson.

This revision was automatically updated to reflect the committed changes.

Unit tests: fail. 62406 tests passed, 1 failed and 839 were skipped.

failed: libc++.std/thread/thread_mutex/thread_mutex_requirements/thread_timedmutex_requirements/thread_timedmutex_recursive/try_lock_until.pass.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.