This is an archive of the discontinued LLVM Phabricator instance.

Extend memcpy expansion in Transform/Utils to handle wider operand types.
ClosedPublic

Authored by sfertile on Apr 26 2017, 7:01 AM.

Details

Summary

Generalizes the current transformation of memcpy calls into a loop expansion to use wider operand types, and specializes between compile-time constant sizes and unknown size expansions.

A TTI call back is used to allow the target to provide the operand type for the main loop expansion based on copy size and alignment of source and destination arguments. The residual block of memory after the main loop is then copied byte-wise in a loop for unknown sizes, and a TTI callback allows the target to provide the exact memory ops to use for the residual when the size is known at compile time.

The new implementations are turned off by default and can be enabled with '-use-wide-memcpy-loop-lowering=true'.
The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target.

Diff Detail

Repository
rL LLVM

Event Timeline

sfertile created this revision.Apr 26 2017, 7:01 AM
arsenm added inline comments.Apr 26 2017, 4:12 PM
lib/Transforms/Utils/LowerMemIntrinsics.cpp
445 ↗(On Diff #96726)

I'm not sure this is a useful hook. If it's worth keeping the old path that had no target information, making TTI an optional pointer input and using it if available makes more sense

sfertile added inline comments.Apr 27 2017, 10:16 AM
lib/Transforms/Utils/LowerMemIntrinsics.cpp
445 ↗(On Diff #96726)

Sorry, I meant to explain how I broke this up in the description but forgot to. I added the new implementations but left them off by default. I've tested these with a PowerPC pass but I can't do any functional testing with the current consumers since I don't have a gpu setup. The idea was that the current consumers can do functional testing between the 2 implementations easily if they want. Once any issues or concerns are addressed then I will add another patch (or update this one) to remove the option and the old implementation.

syzaara added inline comments.Apr 28 2017, 8:39 AM
lib/Transforms/Utils/LowerMemIntrinsics.cpp
71 ↗(On Diff #96726)

The second argument is a hint for the number of incoming edges that this phi node will have. It should be 2 in this case.

177 ↗(On Diff #96726)

The second argument is a hint for the number of incoming edges that this phi node will have. It should be 2 in this case.

220 ↗(On Diff #96726)

Same note about second argument.

jtony added a subscriber: jtony.Apr 28 2017, 2:40 PM
jtony added inline comments.
include/llvm/Analysis/TargetTransformInfo.h
704 ↗(On Diff #96726)

I think the \returns is used for function return values. If I understand correctly, here we are using parameter OpsOut to return the operand types, maybe we should use \param OpsOut?

lib/Transforms/Utils/LowerMemIntrinsics.cpp
186 ↗(On Diff #96726)

Why do we use 1 here and use 0U at line 203?

236 ↗(On Diff #96726)

same issue with line 186.

465 ↗(On Diff #96726)

This is optional. Is it better to just call createMemCpyLoopKnownSize once and set the CopyLen parameter using a ternary expression? So we can save a few lines of code.

sfertile updated this revision to Diff 97321.EditedMay 1 2017, 11:45 AM

Updated calls to CreatePHI to pass correct value to NumReservedValues.
Updated literal arguments to to ConstantInt::get to be unsigned.
Fixed comment that use \returns to describe an output argument.
Rebased to top of trunk.

sfertile marked 6 inline comments as done.May 1 2017, 11:58 AM
sfertile added inline comments.
lib/Transforms/Utils/LowerMemIntrinsics.cpp
186 ↗(On Diff #96726)

I'm not sure why I only added the suffix in one call. I've updated the other calls to use unsigned literals also.

sfertile marked an inline comment as done.May 1 2017, 7:52 PM
sfertile added inline comments.
lib/Transforms/Utils/LowerMemIntrinsics.cpp
465 ↗(On Diff #96726)

I don't see how that would help, these are 2 different functions we are calling based on if the size argument is a constant or not.

sfertile updated this revision to Diff 97381.May 1 2017, 7:54 PM

Forgot to run clang-format over some of the changes.

dneilson added inline comments.
lib/Analysis/TargetTransformInfo.cpp
29 ↗(On Diff #97381)

Typo: "Transforms" -> "Transforms"

nemanjai edited edge metadata.May 15 2017, 11:02 AM
nemanjai added a subscriber: tstellarAMD.

The patch looks fine to me. I am reluctant to accept it since it would be good to get feedback from the maintainers of the AMDGPU and NVPTX (@tstellarAMD and @jholewinski) since the plan is to eventually remove the function they were using and replace it with this. Also, it would be good to hear from @arsenm about whether he's OK with keeping this TTI hook in the meantime.

sfertile updated this revision to Diff 99051.May 15 2017, 1:01 PM

Fixed spelling in option-description and removed some trailing white-space.

sfertile marked an inline comment as done.May 15 2017, 1:02 PM

@arsenm Would this be easier to review if it didn't keep both the old and new functionality? I can remove the option and make it switch over completely to the new behavior.

kbarton edited edge metadata.Jun 5 2017, 11:30 AM

I'm OK with this patch and then a follow-up patch to enable the new behaviour by default.
@arsenm Are you OK with this approach or do you prefer a different approach?

arsenm accepted this revision.Jun 20 2017, 12:06 PM

LGTM

This revision is now accepted and ready to land.Jun 20 2017, 12:06 PM
This revision was automatically updated to reflect the committed changes.