This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Handle multiple occurences of an incoming value in break large PHIs
ClosedPublic

Authored by Pierre-vh on May 22 2023, 2:15 AM.

Details

Summary

We naively broke all incoming values, assuming they'd be unique.
However it's not illegal to have multiple occurences of, e.g. [BB0, V0]
in a PHI node. What's illegal though is having the same basic block
multiple times but with different values, and it's exactly what the
transform caused. This broke in some rare applications where the pattern
arised.

Now we cache the BasicBlock, Value pairs we're breaking so we can reuse the values and preserve this invariant.

Solves SWDEV-399460

Diff Detail

Event Timeline

Pierre-vh created this revision.May 22 2023, 2:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2023, 2:15 AM
Pierre-vh requested review of this revision.May 22 2023, 2:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2023, 2:15 AM
rovka added a subscriber: rovka.May 22 2023, 3:49 AM
rovka added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
1575

Nit: I would assert that NumElts matches Ty.

1590

Maybe add a test for this case, so we don't regress in the future?

1595
1599

Nit: Reformat the rest of the comment?

llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis-heuristics.ll
15

Nit: Commit the irrelevant test updates (i.e. the variable renamings) separately?

Pierre-vh updated this revision to Diff 524230.May 22 2023, 4:04 AM
Pierre-vh marked 5 inline comments as done.

Comments

llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
1575

It's not really necessary because NumElts doesn't relate to Ty, but to the original vector type. Currently yes, if NumElts == 1 then Ty is scalar, and if NumElts > 1 then Ty if FixedVectorType, but it's really not required.

e.g. We could even decide to bitcast the slice to another type (e.g. a 4 x i8 slice to a i32) and that'd still be fine (NumElts would be 4 for a scalar Ty, but it's ok because 4 elements of the original vector can be bitcasted to Ty).

1590

phi_v7i16_switch already covers it (it's what caught the issue when I was just checking Values without BBs at first)

llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis-heuristics.ll
15

I would have done that if I could, but the naming changes are unfortunately a side-effect of restructuring the code.

I could have tried harder to preserve the old names but IMO it's just debug names that don't really matter, so I thought it'd be fine as is. Is that ok?

rovka accepted this revision.May 22 2023, 4:15 AM

LGTM

llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
1575

Ok, thanks for explaining. Maybe add that to that slice examples then? (optional)

llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis-heuristics.ll
15

Fine either way :)

This revision is now accepted and ready to land.May 22 2023, 4:15 AM
This revision was landed with ongoing or failed builds.May 22 2023, 4:40 AM
This revision was automatically updated to reflect the committed changes.