This is an archive of the discontinued LLVM Phabricator instance.

Optimize cross block gc.relocate lowering. NFC.
Abandoned · Public

Authored by dantrushin on Apr 11 2022, 11:35 AM.

Details

Reviewers
skatkov
reames
Summary

At the IR level, gc.relocate expresses the new value of a GC pointer
that may have changed during the call wrapped by a statepoint (IR instructions
have only a single def; gc.relocate is a workaround for this limitation).
A machine instruction, however, can have multiple results (DEF registers),
so gc.relocate is not needed here: it is equal to the corresponding
STATEPOINT DEF operand (when GC pointers are lowered via VRegs).

This means that when lowering a gc.relocate that has uses outside
its basic block, we can simply use the virtual register exported by STATEPOINT
instead of generating CopyFromReg/CopyToReg SDNodes that copy one virtual
register to another and produce a redundant COPY instruction.

This is purely a compile-time optimization. On big methods it can
improve compile time by up to 10%.
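
The effect described in the summary can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyFunction`, `exportViaCopy`, `exportDirect`, and `copiesFor` are hypothetical names invented for illustration and are not LLVM classes or part of the patch. It shows only that reusing the vreg already defined by STATEPOINT avoids one COPY per cross-block gc.relocate.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these are NOT LLVM classes, just enough state to count COPYs.
using VReg = unsigned;

struct ToyFunction {
  VReg NextVReg = 0;
  std::vector<std::string> Instrs; // emitted pseudo machine instructions

  VReg createVReg() { return NextVReg++; }

  // A STATEPOINT defines one vreg per relocated GC pointer.
  std::vector<VReg> emitStatepoint(unsigned NumGCPtrs) {
    std::vector<VReg> Defs;
    for (unsigned I = 0; I < NumGCPtrs; ++I)
      Defs.push_back(createVReg());
    Instrs.push_back("STATEPOINT");
    return Defs;
  }

  // Old scheme: export the relocated value through a fresh vreg,
  // which costs one COPY per cross-block gc.relocate.
  VReg exportViaCopy(VReg StatepointDef) {
    assert(StatepointDef < NextVReg && "unknown vreg");
    VReg Exported = createVReg();
    Instrs.push_back("COPY %" + std::to_string(Exported) + " = %" +
                     std::to_string(StatepointDef));
    return Exported;
  }

  // Proposed scheme: reuse the vreg that STATEPOINT already defines.
  VReg exportDirect(VReg StatepointDef) {
    assert(StatepointDef < NextVReg && "unknown vreg");
    return StatepointDef;
  }
};

// Lower NumGCPtrs cross-block relocates and count the COPYs emitted.
unsigned copiesFor(bool ReuseStatepointDef, unsigned NumGCPtrs) {
  ToyFunction F;
  for (VReg Def : F.emitStatepoint(NumGCPtrs)) {
    if (ReuseStatepointDef)
      F.exportDirect(Def);
    else
      F.exportViaCopy(Def);
  }
  unsigned Copies = 0;
  for (const std::string &I : F.Instrs)
    if (I.rfind("COPY", 0) == 0) // line starts with "COPY"
      ++Copies;
  return Copies;
}
```

In this model, calling `copiesFor(false, N)` emits one COPY per relocated pointer, while `copiesFor(true, N)` emits none. The real lowering involves SelectionDAG nodes and export maps that the sketch deliberately leaves out.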

Diff Detail

Event Timeline

dantrushin created this revision. Apr 11 2022, 11:35 AM
Herald added a project: Restricted Project. Apr 11 2022, 11:35 AM
dantrushin requested review of this revision. Apr 11 2022, 11:35 AM

Rebase on tip

Generally looks good. Could you please land the test first, against the current implementation, so we can see the difference caused by the patch?

llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
1270–1271

please use spaces...

Test update.

Update test comment lost during merge

Well... Do I understand correctly that the original problem comes from the fact that, during statepoint lowering, if a value is relocated through a register, that register is exported unconditionally?

So if, during statepoint lowering, we export the value only when there is a relocate outside the current basic block, the extra copy disappears.
gc.relocate should be handled the same way: if there is a use outside its basic block, an export is needed; otherwise no export at all. As I understand it, this corresponds to the usual handling of an instruction, so no special handling of gc.relocate is required?

Am I missing anything?
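
The rule proposed in this comment (export a value only when it has a use outside its defining block) can be sketched as a simple predicate. This is a minimal illustration, assuming a toy representation in which integer ids stand in for basic blocks. `ToyValue` and `needsExport` are hypothetical names, not LLVM APIs.

```cpp
#include <algorithm>
#include <vector>

// Toy model only: integer ids stand in for basic blocks.
struct ToyValue {
  int DefBlock;               // block that defines the value
  std::vector<int> UseBlocks; // block of each use of the value
};

// A value needs a cross-block export only if at least one use
// lives outside its defining block.
bool needsExport(const ToyValue &V) {
  return std::any_of(V.UseBlocks.begin(), V.UseBlocks.end(),
                     [&](int B) { return B != V.DefBlock; });
}
```

Under this rule, a gc.relocate whose uses all sit in its own block would trigger no export, and therefore no extra copy.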

> Well... Do I understand correctly that the original problem comes from the fact that, during statepoint lowering, if a value is relocated through a register, that register is exported unconditionally?
>
> So if, during statepoint lowering, we export the value only when there is a relocate outside the current basic block, the extra copy disappears.
> gc.relocate should be handled the same way: if there is a use outside its basic block, an export is needed; otherwise no export at all. As I understand it, this corresponds to the usual handling of an instruction, so no special handling of gc.relocate is required?

Yes.

> Am I missing anything?

Well, the keyword here is simplicity. Making it work that way requires some tricks to avoid a complicated implementation (it is already not that simple). Please have a look at an alternative solution: https://reviews.llvm.org/D124444

dantrushin abandoned this revision. Apr 27 2022, 7:54 AM

Superseded by D124444