This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix EmitCopyFromReg for cloned nodes
ClosedPublic

Authored by foad on Dec 20 2022, 9:54 AM.

Details

Summary

Change EmitCopyFromReg to check all users of cloned nodes (as well as
non-cloned nodes) instead of assuming that they all copy the defined
value back to the same physical register.

This partially reverts 968e2e7b3db1 (svn r62356) which claimed:

CreateVirtualRegisters does trivial copy coalescing. If a node def is
used by a single CopyToReg, it reuses the virtual register assigned to
the CopyToReg. This won't work for SDNode that is a clone or is itself
cloned. Disable this optimization for those nodes or it can end up
with non-SSA machine instructions.

This is true for CreateVirtualRegisters but r62356 also updated
EmitCopyFromReg where it is not true. Firstly EmitCopyFromReg only
coalesces physical register copies, so the concern about SSA form does
not apply. Secondly making the loop over users in EmitCopyFromReg
conditional on !IsClone && !IsCloned breaks the handling of cloned
nodes, because it leaves MatchReg set to true by default, so it assumes
that all users will copy the defined value back to the same physical
register instead of actually checking.

Diff Detail

Event Timeline

foad created this revision.Dec 20 2022, 9:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2022, 9:54 AM
foad requested review of this revision.Dec 20 2022, 9:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2022, 9:54 AM
arsenm added a subscriber: arsenm.Dec 20 2022, 9:58 AM
arsenm added inline comments.
llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
83–84

I think IsCloned is now a dead argument

llvm/test/CodeGen/AMDGPU/bug-sdag-emitcopyfromreg.ll
1

I am struggling to come up with a good test case. This one doesn't really demonstrate the problem. Immediately after isel there is a problem, because an S_AND_B32 instruction clobbers $scc when it should not. But then after isel the S_AND_B32 gets converted to V_AND_B32, so the problem goes away.

I could change the test to dump MIR with -stop-before=si-fix-sgpr-copies if that's acceptable.

arsenm added inline comments.Dec 20 2022, 10:00 AM
llvm/test/CodeGen/AMDGPU/bug-sdag-emitcopyfromreg.ll
1

I think having both IR and MIR tests would be good here. -stop-before=si-fix-sgpr-copies is the right way to test a lot of these things

foad updated this revision to Diff 484310.Dec 20 2022, 10:05 AM

Remove unused argument.

foad marked an inline comment as done.Dec 20 2022, 10:06 AM
foad updated this revision to Diff 484327.Dec 20 2022, 10:55 AM

Test both ISA and MIR.

arsenm accepted this revision.Dec 20 2022, 10:56 AM
This revision is now accepted and ready to land.Dec 20 2022, 10:56 AM
foad added inline comments.Dec 20 2022, 10:56 AM
llvm/test/CodeGen/AMDGPU/bug-sdag-emitcopyfromreg.ll
75

Here's the bug in the old code. scc is clobbered here...

78

... and then used here.

This revision was landed with ongoing or failed builds.Dec 21 2022, 2:47 AM
This revision was automatically updated to reflect the committed changes.