This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Relax restriction on folding immediates into physregs
ClosedPublic

Authored by arsenm on Jul 20 2020, 5:28 AM.

Details

Summary

I never completed the work on the patches referenced by
f8bf7d7f42f28fa18144091022236208e199f331, but this was intended to
avoid folding immediate writes into m0 which the coalescer doesn't
understand very well. Relax this to allow simple SGPR immediates to
fold directly into VGPR copies. This pattern shows up routinely in
current GlobalISel code since nothing is smart enough to emit VGPR
constants yet.

Diff Detail

Event Timeline

arsenm created this revision.Jul 20 2020, 5:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2020, 5:28 AM

I do not think this can work with physregs given the CFG.

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
654

Source must be strictly scalar. You can have a copy of vector to vector (say a to v) and it can be initialized with a different mask.

In fact, how can we be sure even an SGPR was not initialized in a different CFG path? This shall never happen to a virtual register in SSA, but can easily happen to a physreg.

658

This does fit into a single line.

arsenm added inline comments.Jul 28 2020, 4:16 PM
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
654

This is folding a constant. The mask doesn't matter. This is also not analyzing physreg defs, it's searching the uses of an SSA virtual register.

rampitec added inline comments.Jul 28 2020, 4:25 PM
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
654

This is folding a constant. The mask doesn't matter. This is also not analyzing physreg defs, it's searching the uses of an SSA virtual register.

Ah, right! It is destination physical, not source.

It can probably create a worse code sometimes. Assume you have a constant which cannot be directly assigned to a register (v_accvgpr_write_b32 only accepts inline literals). If source def will not become dead it will create an extra copy. You probably need some legality checks here.

rampitec accepted this revision.Jul 28 2020, 4:36 PM

It can probably create a worse code sometimes. Assume you have a constant which cannot be directly assigned to a register (v_accvgpr_write_b32 only accepts inline literals). If source def will not become dead it will create an extra copy. You probably need some legality checks here.

But that's really orthogonal to what you are doing and is a separate change. It equally related to virtual registers.

LGTM, just fix formatting at line 657.

This revision is now accepted and ready to land.Jul 28 2020, 4:36 PM