This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GlobalISel] Legalize memcpy family of intrinsics
ClosedPublic

Authored by mbrkusanin on Aug 19 2021, 2:29 AM.

Details

Summary

Legalize G_MEMCPY, G_MEMMOVE, G_MEMSET and G_MEMCPY_INLINE.

Corresponding intrinsics are replaced by a loop that uses loads/stores in
AMDGPULowerIntrinsics pass unless their length is a constant lower then
MemIntrinsicExpandSizeThresholdOpt (default 1024). Any G_MEM* instruction that
reaches legalizer should have a const length argument and should be expanded
into appropriate number of loads + stores.

Diff Detail

Event Timeline

mbrkusanin created this revision.Aug 19 2021, 2:29 AM
mbrkusanin requested review of this revision.Aug 19 2021, 2:29 AM
mbrkusanin added a comment.EditedAug 19 2021, 2:29 AM

This is the same solution from CombinerHelper with minor changes. However it makes more sense to use it in legalizer since it is required for correctness.

AArch64 currently uses those combines in PreLegalizerCombiner. We have 3 options:

  1. Custom AMDGPU lowering (current solution, copies code)
  2. Use same combines in AMDGPUPreLegalizerCombiner (correctness issue)
  3. Move combines into LegalizerHelper (AArch64 will be using them in legalizer, I don't see a nice way of using a LegalizerHelper method in combiner).

Also which tests should I keep, .ll or .mir?

This is the same solution from CombinerHelper with minor changes. However it makes more sense to use it in legalizer since it is required for correctness.

AArch64 currently uses those combines in PreLegalizerCombiner. We have 3 options:

  1. Custom AMDGPU lowering (current solution, copies code)
  2. Use same combines in AMDGPUPreLegalizerCombiner (correctness issue)
  3. Move combines into LegalizerHelper (AArch64 will be using them in legalizer, I don't see a nice way of using a LegalizerHelper method in combiner).

I think this is a lowering action that belongs in the legalizer helper

Also which tests should I keep, .ll or .mir?

Both?

Since these combines don't use KnownBitAnalysis or anything similar we can easily use them from legalizer without moving code through CombinerHelper
Our custom legalization now simply calls CombinerHelper methods.

Opposite is also possible: Move everything to LegalizerHelper and have AArch and Mips instantiate LegalizerHelper. @aemerson, @paquette do you have any objection to doing it that way instead?

  • update .mir tests

Since these combines don't use KnownBitAnalysis or anything similar we can easily use them from legalizer without moving code through CombinerHelper
Our custom legalization now simply calls CombinerHelper methods.

Opposite is also possible: Move everything to LegalizerHelper and have AArch and Mips instantiate LegalizerHelper. @aemerson, @paquette do you have any objection to doing it that way instead?

You can move the code to the LegalizerHelper and have the combiner call into it.

mbrkusanin updated this revision to Diff 370606.Sep 3 2021, 9:29 AM
  • Moved combines to LegalizerHelper (CombinerHelper now calls into LegalizerHelper).
foad added a comment.Sep 6 2021, 2:01 AM

Typo in summary "AMDGPULowerIntinics".

Patch looks pretty nice to me but please wait for comments from others (it's no-labor day in North America).

mbrkusanin edited the summary of this revision. (Show Details)Sep 6 2021, 8:44 AM
aemerson accepted this revision.Sep 6 2021, 10:30 PM

LGTM.

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
7936–7937 ↗(On Diff #370606)

This comment isn't accurate anymore now that the code lives in the legalizer as well.

This revision is now accepted and ready to land.Sep 6 2021, 10:30 PM
This revision was landed with ongoing or failed builds.Sep 7 2021, 3:33 AM
This revision was automatically updated to reflect the committed changes.
mbrkusanin marked an inline comment as done.Sep 7 2021, 3:34 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.memset.ll