This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra
ClosedPublic

Authored by critson on Jun 29 2022, 3:39 AM.

Details

Summary

Follow up to D127894, new liveness update code needs to handle
the case where S_ANDN2 input must be extended through loops when
V_CNDMASK_B32 has been hoisted.

Diff Detail

Event Timeline

critson created this revision.Jun 29 2022, 3:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 3:39 AM
critson requested review of this revision.Jun 29 2022, 3:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 3:39 AM
arsenm accepted this revision.Jun 29 2022, 6:18 AM

I think D128315 D128110 would also avoid this. I assumed cndmask in a different block was not useful, but presumably you found this in the wild?

llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
218

s/is/in

This revision is now accepted and ready to land.Jun 29 2022, 6:18 AM
critson marked an inline comment as done.Jun 29 2022, 8:26 PM

I think D128315 D128110 would also avoid this. I assumed cndmask in a different block was not useful, but presumably you found this in the wild?

Thank you for the quick review!

This was found in a Vulkan CTS test on GFX8 and GFX9.
The result was register reallocation within a loop that caused incorrect results (as s_andn2 source was overwritten).

D128315 would fix this issue, but perhaps this change removes the need for D128315?
I am not sure D128110 would fix this bug, but there is some overlap between the two changes.

This revision was landed with ongoing or failed builds.Jun 29 2022, 11:29 PM
This revision was automatically updated to reflect the committed changes.

@arsenm I hope you don't mind, I merged your tests (and some fixes) from D128110 and D128315 into a new review D128882. We can proceed with the merged review, or I am happy to contribute to reviewing on your existing patch sets.