This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z
ClosedPublic

Authored by arsenm on Jul 23 2020, 2:07 PM.

Details

Summary

PHIElimination/createPHISourceCopy inserts non-branch terminators
after the control flow pseudo if a successor phi reads a register
defined by the control flow pseudo. If this happens, we need to split
the expansion of the control flow pseudo to ensure all the branches
are after all of the other mask management instructions.

GlobalISel hit this in testscases that happened to be tail
duplicated. The original testcase still does not work, since the same
problem appears to be present in a later pass.

Diff Detail

Event Timeline

arsenm created this revision.Jul 23 2020, 2:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2020, 2:07 PM
arsenm updated this revision to Diff 280253.Jul 23 2020, 2:08 PM

Missed test update

arsenm updated this revision to Diff 280255.Jul 23 2020, 2:11 PM

Merge tests. This is apparently not my first attempt to fix this

nhaehnle accepted this revision.Jul 24 2020, 8:36 AM

It's hard to say whether this is really correct given how underdefined the control flow pseudos are, but it seems like a pretty reasonable change to me. Assuming you've properly tested this change, LGTM

This revision is now accepted and ready to land.Jul 24 2020, 8:36 AM

It's hard to say whether this is really correct given how underdefined the control flow pseudos are, but it seems like a pretty reasonable change to me. Assuming you've properly tested this change, LGTM

This isn't the only insert point trick we need to play, it was just only really bad for SI_END_CF before. It needs to be moved to insert itself before lowered phi destination copies. This is the complementary problem on the phi source block side

llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir