This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Add a pass to fix SGPR liveness
AbandonedPublic

Authored by ruiling on Aug 23 2022, 1:36 AM.

Details

Summary

Sometime, we may get a SGPR-PHI with only one meaningful value from
dominator and undefined in another predecessor. The problem with
such pattern is the SGPR might be overwritten in the undefined
predecessor. The pass added here is to make the SGPR also alive in the
originally dead predecessors.

Diff Detail

Event Timeline

ruiling created this revision.Aug 23 2022, 1:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2022, 1:36 AM
ruiling requested review of this revision.Aug 23 2022, 1:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2022, 1:36 AM
critson added inline comments.Aug 23 2022, 6:04 AM
llvm/lib/Target/AMDGPU/SIFixSGPRLiveness.cpp
89

Stray comment?

94

Do these influence SingleOp?

128

Unnecessary continue?

ruiling updated this revision to Diff 455080.Aug 23 2022, 11:02 PM

address review comments

ruiling marked 3 inline comments as done.Aug 23 2022, 11:08 PM
ruiling added inline comments.
llvm/lib/Target/AMDGPU/SIFixSGPRLiveness.cpp
94

Thanks for point out this, I have refined this part to be more robust.

ruiling marked an inline comment as done.Aug 28 2022, 6:21 PM

any other concern?

I think the code is correct, but this needs a MIR test checking the behaviour of the pass.

foad added a comment.Aug 30 2022, 8:10 AM

My understanding is that this pass only fixes specific liveness problems that are created by D132450. It is not a general purpose pass that could handle any possible pattern of SGPR usage.

llvm/lib/Target/AMDGPU/SIFixSGPRLiveness.cpp
18

The implementation does not check that there is a divergent branch, so it would do the same thing for uniform control flow.

My understanding is that this pass only fixes specific liveness problems that are created by D132450. It is not a general purpose pass that could handle any possible pattern of SGPR usage.

Thank you for your comment @foad. But I don't agree the problem was created by D132450. see the attached sgpr-liveness.ll, a hand-made test can also show the problem (The tricky thing is that the existing structurizer still simplify the IR even it is already structurized, thus it helps hiding the issue, but I don't think it is the expected behavior). Yes the current design of the pass only works for this known pattern. That is just because I don't know if there are any other broken patterns.

llvm/lib/Target/AMDGPU/SIFixSGPRLiveness.cpp
18

Sounds true, this is doing things conservative, but I think it may not cause any regression in real case. Practically, such kind of pattern in source program should already been optimized into the single dominate value. The only possible cases that I can think of are either "opt-none" or "created by structurizeCFG". If such pattern is created by structurizeCFG, BB1 will be very likely to be divergent because structurizeCFG was designed to try to skip uniform branching as much as possible. I am not sure if checking the terminator of BB1 against s_cbranch_scc0/scc1 for uniform branch sounds good idea?

ruiling updated this revision to Diff 456885.Aug 31 2022, 1:12 AM

Don't extend for backedge case.
Add mir tests.

ruiling added inline comments.Aug 31 2022, 7:12 PM
llvm/lib/Target/AMDGPU/SIFixSGPRLiveness.cpp
18
ruiling updated this revision to Diff 457507.Sep 1 2022, 11:06 PM

Make it work in a few more cases.
Detect divergent branch.

My understanding is that this pass only fixes specific liveness problems that are created by D132450. It is not a general purpose pass that could handle any possible pattern of SGPR usage.

I think the new version should work correctly in all possible shapes of post-structurized CFG I can think of now. Please let me know if you have any further concern.