This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] Use memcpy source directly if dest is known to be immutable from attributes
ClosedPublic

Authored by khei4 on May 19 2023, 8:00 AM.

Details

Summary

This patch aims to remove memcpy on immutable arguments from attributes.

We can safely remove src-to-dest memcpy dest is used as only the call parameter if following conditions are satisfied

This is called on memcpy dest pointer arguments attributed as immutable
during call. Try to use memcpy source directly if all of the following
conditions are satisfied.

  1. The memcpy dst is neither modified during the call nor captured by the call. (if readonly, noalias, nocapture attributes on call-site.)
  2. The memcpy dst is an alloca with known alignment & size.
    1. The memcpy length == the alloca size which ensures that the new pointer is dereferenceable for the required range
    2. The src pointer has alignment >= the alloca alignment or can be enforced so.
  3. The memcpy dst and src is not modified between the memcpy and the call. (if MSSA clobber check is safe.)
  4. The memcpy src is not modified during the call. (ModRef check shows no Mod.)

(from https://reviews.llvm.org/D150967#4359222 and https://reviews.llvm.org/D150967#4365466 )

precommit tests: https://reviews.llvm.org/D150967
Motivated to resolve https://github.com/rust-lang/rust/issues/107436

Diff Detail

Event Timeline

khei4 created this revision.May 19 2023, 8:00 AM
khei4 requested review of this revision.May 19 2023, 8:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2023, 8:00 AM
khei4 updated this revision to Diff 524098.May 21 2023, 5:43 AM

update (currently attribute checking not working correctly

nikic added a subscriber: nikic.May 21 2023, 10:02 AM
nikic added inline comments.
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1750

This whole can can be replaced with CB->onlyReadsMemory(i). This will check readonly and readnone and will also correctly handle inheritance of attributes from the function.

1755

CB->paramHasAttr(i, Kind) is enough, it will handle attribute inheritance.

khei4 updated this revision to Diff 524190.May 22 2023, 1:22 AM

apply feedbacks, remove (most of) debug streams.

khei4 planned changes to this revision.May 22 2023, 1:23 AM

@nikic Thank you for the review! They saved me!

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1755

Thank you for saving me! I haven't found the attribute inheritance behavior and am confused about that!

nikic added inline comments.May 23 2023, 11:31 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1737

This check looks unnecessary.

1738–1743

This seems simpler than the loop :)

khei4 updated this revision to Diff 525043.May 24 2023, 12:51 AM
khei4 edited the summary of this revision. (Show Details)
khei4 added inline comments.
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1737

Thanks! That's remaining for the past diff!

1738–1743

Yeah, agreed, conditions are simpler than before.

khei4 updated this revision to Diff 525045.May 24 2023, 12:53 AM

attach Differential Revision on comment

khei4 retitled this revision from (WIP) [MemCpyOpt]remove memcpy on noalias readonly attributed arguments to (WIP) [MemCpyOpt]remove memcpy on immutable arguments from attributes.May 26 2023, 8:54 PM
khei4 edited the summary of this revision. (Show Details)
khei4 edited the summary of this revision. (Show Details)May 26 2023, 8:55 PM
khei4 updated this revision to Diff 526516.May 29 2023, 10:51 PM

process alignment, handle unescaped alloca and arg with noalias, clobber check on Dest of MDep(MemCpy)

khei4 planned changes to this revision.May 29 2023, 10:52 PM

need to change test on https://reviews.llvm.org/D150967#4365466 and refine description

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1704–1705

dest check might be redundant because the first clobber check might already show dest is not modified between call and memcpy.

khei4 edited the summary of this revision. (Show Details)May 29 2023, 11:18 PM
khei4 added a reviewer: nikic.
khei4 edited the summary of this revision. (Show Details)May 29 2023, 11:58 PM
khei4 edited the summary of this revision. (Show Details)May 30 2023, 12:01 AM
khei4 edited the summary of this revision. (Show Details)
khei4 updated this revision to Diff 526549.May 30 2023, 2:02 AM

apply feedback on call with @nikic

  1. remove noalias argument or unescaped alloca checks
  2. add ModRefCheck for memcpy source
khei4 edited the summary of this revision. (Show Details)May 31 2023, 5:59 AM
khei4 edited the summary of this revision. (Show Details)Jun 1 2023, 10:07 AM
nikic added inline comments.Jun 1 2023, 12:11 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1627

nit: Start comments with uppercase letter.

1640

nit: auto * when using dyn_cast.

1661

see -> check that

1663

ImmutArg->stripPointerCasts() is same as AI.

1666

Outdated comment?

1667

byval -> alloca?

1678

Is the explicit MaybeAlign() here needed?

1693

"not modified by the call" maybe?

1740

nit: Unnecessary braces.

nikic added inline comments.Jun 1 2023, 12:11 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1695

MSSA->getMemoryAccess(&CB) is CallAccess.

1697

This second writtenBetween looks redundant to me. This is already ensured by getClobberingMemoryAccess().

1698

Hm, shouldn't this be checking the source? We already know the dest is not modified via readonly noalias.

1708

You can drop the bitcast code, not needed with opaque pointers.

khei4 updated this revision to Diff 528080.Jun 3 2023, 1:30 AM

apply feedbacks

khei4 added a comment.Jun 3 2023, 1:33 AM

@nikic
Thanks for the review! Too much obsolete comments!

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1636–1642

I decided to postpone to the next Revision to collect all possible pointers and check that all ptr are alloca. (it might be a little conservative.)

1693

"not modified by the call" maybe?

Literally right,(writtenBetween guarantees so). TBH, I want to ensure the memcpy source is not modified by another alias during the call.
That part may be done by isModSet I guess.

1695

Good catch!! The same goes for ByVal.

1698

Right! Thanks, I was confused with them.

1708

Thanks! I believe ByVal has it too! Including other comments I'll fix them on ByVal side also!

khei4 updated this revision to Diff 528082.Jun 3 2023, 1:41 AM
khei4 marked 3 inline comments as done.

fix typo

khei4 updated this revision to Diff 528277.Jun 4 2023, 9:35 PM
khei4 marked 14 inline comments as done.
khei4 retitled this revision from (WIP) [MemCpyOpt]remove memcpy on immutable arguments from attributes to [MemCpyOpt]remove memcpy on immutable arguments from attributes.

apply feedbacks.

  • fix styles
khei4 added inline comments.Jun 4 2023, 9:37 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1678

That was just to adjust arg types to the off-the-shelf util functions! Although it was wrong and removed!

khei4 retitled this revision from [MemCpyOpt]remove memcpy on immutable arguments from attributes to [MemCpyOpt] Use memcpy source directly if dest is known to be immutable from attributes.Jun 4 2023, 10:20 PM
khei4 edited the summary of this revision. (Show Details)
khei4 updated this revision to Diff 528283.Jun 4 2023, 10:22 PM

fix styles and comments and postpone detailed case

khei4 edited the summary of this revision. (Show Details)Jun 4 2023, 10:25 PM
nikic requested changes to this revision.Jun 6 2023, 12:55 PM

It looks like the alignment checks are inverted.

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1624

<= should be >= here?

1639

"Even if the arg can be based on multiple allocas" or so.

1647

&& !AllocaSize->isScalable() (and a test for this)

1662

"equals to" -> "equals the" or "is equal to the"

1673

Does that work? getZextValue() may assert for wide integers (though probably not really relevant here).

1676

smaller -> larger than or equal

1679

Then you don't need to check !MemDepAlign

1681–1683
1704

stray byval

llvm/test/Transforms/MemCpyOpt/memcpy.ll
455–456

This is not a VLA. You need to do something like alloca i8, i64 %n.

486–487

This case is fine, the problematic case is if the %val alignment is smaller.

This revision now requires changes to proceed.Jun 6 2023, 12:55 PM
khei4 marked an inline comment as done.Jun 6 2023, 11:02 PM
khei4 added inline comments.
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1624

Right!

1647

&& !AllocaSize->isScalable()

Sorry, I'm not sure about a scalable vectors pointer could be transformed (maybe ||, not &&?). Do you mean scalable vectors TypeSize are available, but memcpy length cannot be scalable for it, so we couldn't handle it, right?

1673

Works fine! Thanks!

1679

Thank you!

llvm/test/Transforms/MemCpyOpt/memcpy.ll
486–487

Yeah, you seem right. Thank you for the comment!
I also found the alive2 has similar issues :)

https://alive2.llvm.org/ce/z/NZivTQ

khei4 updated this revision to Diff 529169.Jun 6 2023, 11:03 PM
khei4 marked an inline comment as done.

apply feedback and inverse alignment vs memcpy length comparison.

also rebase for

  • smaller alignment failure case
  • address space failure case
  • scalable vector failure case
nikic added inline comments.Jun 7 2023, 1:04 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1639

the all -> all the

1647

Yes, exactly.

1664

Add test with volatile memcpy.

1682

Can drop the extra () here.

1705

byval -> immut

llvm/test/Transforms/MemCpyOpt/memcpy.ll
464–465

I'd just write alloca <vscale x 2 x i32> here. You're using a "homogeneous scalable aggregate" which also tests the right thing, but is a more advanced feature...

471

We also need a test where dst is modified between call and memcpy, unless I missed it.

486–487

This comment is outdated.

500–501

We're missing a test where the alignment can be enforced. For that you want to copy from another alloca with lower alignment, and the alignment will be increased by the transform.

khei4 updated this revision to Diff 529242.Jun 7 2023, 3:58 AM
khei4 marked 15 inline comments as done.

rebase
remove redundancy,
fix wrong comments

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1647

Understood. Thanks!

llvm/test/Transforms/MemCpyOpt/memcpy.ll
464–465

Thanks! Yeah as you said in this case, unnecessary complexity!

500–501

Sorry, I'm not sure how this could be. I now found that MemDepAlign is just memcpy's arguments alignment attributes value, so basically enforcing from parameter attributes works, but I couldn't find the way to enforce lower alignment actually!
(I might too naive, but I tried for example,

@g = global ptr null, align 4

define void @immut_param_enforced_alignment(ptr align 16 noalias %val) {
  %val1 = alloca i8, align 16
  store ptr %val, ptr @g
  %p = load ptr, ptr @g
  call void @llvm.memcpy.p0.p0.i64(ptr align 16 %val1, ptr %p, i64 1, i1 false)
  call void @f(ptr nocapture noalias readonly %val1)
  ret void
}

this can't enforce the alignment.)

nikic added inline comments.Jun 7 2023, 4:01 AM
llvm/test/Transforms/MemCpyOpt/memcpy.ll
500–501

I believe you need something like this:

define void @test() {
  %val = alloca i8, align 1
  store i32 42, ptr %val
  %val1 = alloca i8, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr %val1, ptr %val, i64 1, i1 false)
  call void @f_full_readonly(ptr %val1)
  ret void
}

Here the align 1 on the first alloca will be changed to align 4 by the transform.

khei4 edited the summary of this revision. (Show Details)Jun 7 2023, 4:21 AM
khei4 updated this revision to Diff 529281.Jun 7 2023, 6:24 AM

rebase for enforced alignment

khei4 marked an inline comment as done.Jun 7 2023, 6:25 AM
khei4 added inline comments.
llvm/test/Transforms/MemCpyOpt/memcpy.ll
500–501

Thank you! Now I see DataLayout specify the alignment for stored types!

nikic added inline comments.Jun 7 2023, 8:14 AM
llvm/test/Transforms/MemCpyOpt/memcpy.ll
500–501

Hm, so I expected that for this test case the transform would happen and change to align 4 here. Do you know why it doesn't?

nikic added inline comments.Jun 7 2023, 8:21 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1694

I believe these MSSA->getMemoryAccess(MDep) calls can be replaced by Clobber.

llvm/test/Transforms/MemCpyOpt/memcpy.ll
500–501

Hm, this does work when I test the patch locally, so maybe just outdated test result?

khei4 marked an inline comment as done.Jun 7 2023, 9:34 AM
khei4 added inline comments.
llvm/test/Transforms/MemCpyOpt/memcpy.ll
500–501

Hm, this does work when I test the patch locally, so maybe just outdated test result?

Yeah, Sorry... I have forgotten to build before the test update. I believe this works fine.

khei4 updated this revision to Diff 529346.Jun 7 2023, 9:35 AM

update tests

nikic accepted this revision.Jun 8 2023, 12:11 AM

LGTM

This revision is now accepted and ready to land.Jun 8 2023, 12:11 AM
This revision was landed with ongoing or failed builds.Jun 9 2023, 11:47 PM
This revision was automatically updated to reflect the committed changes.