This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Make some VOP1 instructions rematerializable
ClosedPublic

Authored by rampitec on Jul 9 2021, 3:00 PM.

Details

Summary

This is a pilot change to verify the logic. The rest will be
done in the same way, at least the rest of VOP1.

Event Timeline

rampitec created this revision. Jul 9 2021, 3:00 PM
rampitec requested review of this revision. Jul 9 2021, 3:00 PM
Herald added a project: Restricted Project. Jul 9 2021, 3:00 PM
Herald added a subscriber: wdng.
arsenm added inline comments. Jul 9 2021, 3:02 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
110

Braces

110–119

This should probably get a comment explaining that exec would normally block rematerialization but should be OK here, plus one about mode.

rampitec updated this revision to Diff 357653. Jul 9 2021, 3:16 PM
rampitec marked 2 inline comments as done.

Addressed review comments.

RA has a VirtRegAuxInfo::weightCalcHelper() function which halves the weight of a LI if it is rematerializable, which in turn leads to different RA decisions. We will generally see a lot of small codegen changes, not just rematerialization instead of spilling:

// If all of the definitions of the interval are re-materializable,
// it is a preferred candidate for spilling.
// FIXME: this gets much more complicated once we support non-trivial
// re-materialization.
if (isRematerializable(LI, LIS, VRM, *MF.getSubtarget().getInstrInfo()))
  TotalWeight *= 0.5F;

It may be beneficial to do this only if we have isAsCheapAsAMove, or at least to use a different weight multiplier. This is, however, a different and much more intrusive change.
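
To make the trade-off concrete, here is a minimal standalone sketch of the weighting idea. It does not use LLVM's classes: `ToyInterval`, `currentWeight`, `alternativeWeight`, and the 0.75 multiplier are hypothetical names and values chosen for illustration only. Only the 0.5 factor comes from the snippet quoted above.

```cpp
// Toy stand-in for a live interval; this is not LLVM's LiveInterval.
struct ToyInterval {
  float Weight;          // spill weight before the remat adjustment
  bool Rematerializable; // all defs can be recomputed instead of reloaded
  bool CheapAsMove;      // hypothetical flag mirroring isAsCheapAsAMove
};

// Behaviour as quoted above: any rematerializable interval has its weight
// halved, which makes it a preferred spill candidate.
float currentWeight(const ToyInterval &LI) {
  return LI.Rematerializable ? LI.Weight * 0.5F : LI.Weight;
}

// Hypothetical alternative: halve only the cheap remat candidates and apply
// a milder multiplier (0.75 is an arbitrary illustrative value) to the rest.
float alternativeWeight(const ToyInterval &LI) {
  if (!LI.Rematerializable)
    return LI.Weight;
  return LI.Weight * (LI.CheapAsMove ? 0.5F : 0.75F);
}
```

With such a split, a conversion that is rematerializable but not as cheap as a move would keep more of its weight, so it would be less eagerly picked for spilling than a plain move.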

rampitec updated this revision to Diff 357656. Jul 9 2021, 3:35 PM

Updated mca test.

arsenm added inline comments. Jul 9 2021, 4:05 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
110

What about VOP2?

llvm/test/CodeGen/AMDGPU/remat-vop.mir
704

Can you add a test with a mode def that blocks rematerialization?

rampitec marked an inline comment as done. Jul 9 2021, 4:09 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
110

There are no supported VOP2 instructions yet. Only two VOP1 conversions and their promoted VOP3 forms.

llvm/test/CodeGen/AMDGPU/remat-vop.mir
704

It already exists in remat-vop.mir: see the test test_no_remat_v_cvt_i32_f64_e32_mode_def.

rampitec marked an inline comment as done. Jul 9 2021, 4:10 PM
rampitec added inline comments.
llvm/test/tools/llvm-mca/AMDGPU/gfx10-double.s
63 ↗(On Diff #357656)

Something is wrong here, latency should not have changed...

rampitec planned changes to this revision. Jul 9 2021, 4:12 PM

I have misplaced braces in the td file; that's why there are so many codegen changes.

rampitec updated this revision to Diff 357663. Jul 9 2021, 4:25 PM

Fixed braces in the td. There are no parasitic codegen changes anymore and no regressions. Only the rematerialization/spilling has actually changed.

rampitec edited the summary of this revision. Jul 9 2021, 4:34 PM
rampitec removed a reviewer: andreadb.
rampitec updated this revision to Diff 358095. Jul 12 2021, 4:18 PM
rampitec retitled this revision from [AMDGPU] Make V_CVT_I32_F64/V_CVT_F64_I32 rematerializable. to [AMDGPU] Make some VOP1 instructions rematerializable.
  • Make more VOP1 rematerializable.
  • Added instructions with SDWA and verified SDWA handling (unused_preserve cannot be rematerialized).
  • Dropped the old logic that specifically listed and handled move instructions; it is covered by the new, more generic code.
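
The rule the thread converges on can be sketched as a small standalone predicate. This is a toy model, not the code from the patch: `ToyInstr`, its fields, and `isToyRematerializable` are hypothetical, and the idea that an SDWA unused_preserve instruction shows up as one extra implicit use beyond the declared ones is an assumption made for illustration.

```cpp
// Toy description of a VALU instruction; this is not LLVM's MachineInstr.
struct ToyInstr {
  unsigned DeclaredImplicitUses; // implicit uses in the definition (e.g. exec, mode)
  unsigned ActualImplicitUses;   // implicit uses present on this instruction
  bool HasImplicitDef;           // implicitly writes some register
};

// Implicit reads that are part of the instruction definition (exec, mode) are
// tolerated. Any implicit def, or any implicit use beyond the declared ones,
// blocks rematerialization in this model.
bool isToyRematerializable(const ToyInstr &MI) {
  return !MI.HasImplicitDef &&
         MI.ActualImplicitUses == MI.DeclaredImplicitUses;
}
```

In this model a plain conversion with only its declared exec and mode reads passes, while an instruction carrying an additional implicit operand (the assumed unused_preserve case) is rejected.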
arsenm accepted this revision. Jul 12 2021, 5:46 PM
This revision is now accepted and ready to land. Jul 12 2021, 5:46 PM
This revision was landed with ongoing or failed builds. Jul 12 2021, 11:44 PM
This revision was automatically updated to reflect the committed changes.