Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[MemCpyOpt] Use memcpy source directly if dest is known to be immutable from attributes
ClosedPublic

Authored by khei4 on May 19 2023, 8:00 AM.

Details

Summary

This patch aims to remove memcpy on immutable arguments from attributes.

We can safely remove src-to-dest memcpy dest is used as only the call parameter if following conditions are satisfied

This is called on memcpy dest pointer arguments attributed as immutable
during call. Try to use memcpy source directly if all of the following
conditions are satisfied.

  1. The memcpy dst is neither modified during the call nor captured by the call. (if readonly, noalias, nocapture attributes on call-site.)
  2. The memcpy dst is an alloca with known alignment & size.
    1. The memcpy length == the alloca size which ensures that the new pointer is dereferenceable for the required range
    2. The src pointer has alignment >= the alloca alignment or can be enforced so.
  3. The memcpy dst and src is not modified between the memcpy and the call. (if MSSA clobber check is safe.)
  4. The memcpy src is not modified during the call. (ModRef check shows no Mod.)

(from https://reviews.llvm.org/D150967#4359222 and https://reviews.llvm.org/D150967#4365466 )

precommit tests: https://reviews.llvm.org/D150967
Motivated to resolve https://github.com/rust-lang/rust/issues/107436

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
nikic added inline comments.Jun 1 2023, 12:11 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1686

MSSA->getMemoryAccess(&CB) is CallAccess.

1688

This second writtenBetween looks redundant to me. This is already ensured by getClobberingMemoryAccess().

1689

Hm, shouldn't this be checking the source? We already know the dest is not modified via readonly noalias.

1699

You can drop the bitcast code, not needed with opaque pointers.

khei4 updated this revision to Diff 528080.Jun 3 2023, 1:30 AM

apply feedbacks

khei4 added a comment.Jun 3 2023, 1:33 AM

@nikic
Thanks for the review! Too much obsolete comments!

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1627–1633

I decided to postpone to the next Revision to collect all possible pointers and check that all ptr are alloca. (it might be a little conservative.)

1684

"not modified by the call" maybe?

Literally right,(writtenBetween guarantees so). TBH, I want to ensure the memcpy source is not modified by another alias during the call.
That part may be done by isModSet I guess.

1686

Good catch!! The same goes for ByVal.

1689

Right! Thanks, I was confused with them.

1699

Thanks! I believe ByVal has it too! Including other comments I'll fix them on ByVal side also!

khei4 updated this revision to Diff 528082.Jun 3 2023, 1:41 AM
khei4 marked 3 inline comments as done.

fix typo

khei4 updated this revision to Diff 528277.Jun 4 2023, 9:35 PM
khei4 marked 14 inline comments as done.
khei4 retitled this revision from (WIP) [MemCpyOpt]remove memcpy on immutable arguments from attributes to [MemCpyOpt]remove memcpy on immutable arguments from attributes.

apply feedbacks.

  • fix styles
khei4 added inline comments.Jun 4 2023, 9:37 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1669

That was just to adjust arg types to the off-the-shelf util functions! Although it was wrong and removed!

khei4 retitled this revision from [MemCpyOpt]remove memcpy on immutable arguments from attributes to [MemCpyOpt] Use memcpy source directly if dest is known to be immutable from attributes.Jun 4 2023, 10:20 PM
khei4 edited the summary of this revision. (Show Details)
khei4 updated this revision to Diff 528283.Jun 4 2023, 10:22 PM

fix styles and comments and postpone detailed case

khei4 edited the summary of this revision. (Show Details)Jun 4 2023, 10:25 PM
nikic requested changes to this revision.Jun 6 2023, 12:55 PM

It looks like the alignment checks are inverted.

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1615

<= should be >= here?

1630

"Even if the arg can be based on multiple allocas" or so.

1638

&& !AllocaSize->isScalable() (and a test for this)

1653

"equals to" -> "equals the" or "is equal to the"

1664

Does that work? getZextValue() may assert for wide integers (though probably not really relevant here).

1667

smaller -> larger than or equal

1670

Then you don't need to check !MemDepAlign

1672–1674
1695

stray byval

llvm/test/Transforms/MemCpyOpt/memcpy.ll
486

This is not a VLA. You need to do something like alloca i8, i64 %n.

545–573

This case is fine, the problematic case is if the %val alignment is smaller.

This revision now requires changes to proceed.Jun 6 2023, 12:55 PM
khei4 marked an inline comment as done.Jun 6 2023, 11:02 PM
khei4 added inline comments.
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1615

Right!

1638

&& !AllocaSize->isScalable()

Sorry, I'm not sure about a scalable vectors pointer could be transformed (maybe ||, not &&?). Do you mean scalable vectors TypeSize are available, but memcpy length cannot be scalable for it, so we couldn't handle it, right?

1664

Works fine! Thanks!

1670

Thank you!

llvm/test/Transforms/MemCpyOpt/memcpy.ll
545–573

Yeah, you seem right. Thank you for the comment!
I also found the alive2 has similar issues :)

https://alive2.llvm.org/ce/z/NZivTQ

khei4 updated this revision to Diff 529169.Jun 6 2023, 11:03 PM
khei4 marked an inline comment as done.

apply feedback and inverse alignment vs memcpy length comparison.

also rebase for

  • smaller alignment failure case
  • address space failure case
  • scalable vector failure case
nikic added inline comments.Jun 7 2023, 1:04 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1630

the all -> all the

1638

Yes, exactly.

1655

Add test with volatile memcpy.

1673

Can drop the extra () here.

1696

byval -> immut

llvm/test/Transforms/MemCpyOpt/memcpy.ll
508

I'd just write alloca <vscale x 2 x i32> here. You're using a "homogeneous scalable aggregate" which also tests the right thing, but is a more advanced feature...

529

We also need a test where dst is modified between call and memcpy, unless I missed it.

572–573

This comment is outdated.

596

We're missing a test where the alignment can be enforced. For that you want to copy from another alloca with lower alignment, and the alignment will be increased by the transform.

khei4 updated this revision to Diff 529242.Jun 7 2023, 3:58 AM
khei4 marked 15 inline comments as done.

rebase
remove redundancy,
fix wrong comments

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1638

Understood. Thanks!

llvm/test/Transforms/MemCpyOpt/memcpy.ll
508

Thanks! Yeah as you said in this case, unnecessary complexity!

596

Sorry, I'm not sure how this could be. I now found that MemDepAlign is just memcpy's arguments alignment attributes value, so basically enforcing from parameter attributes works, but I couldn't find the way to enforce lower alignment actually!
(I might too naive, but I tried for example,

@g = global ptr null, align 4

define void @immut_param_enforced_alignment(ptr align 16 noalias %val) {
  %val1 = alloca i8, align 16
  store ptr %val, ptr @g
  %p = load ptr, ptr @g
  call void @llvm.memcpy.p0.p0.i64(ptr align 16 %val1, ptr %p, i64 1, i1 false)
  call void @f(ptr nocapture noalias readonly %val1)
  ret void
}

this can't enforce the alignment.)

nikic added inline comments.Jun 7 2023, 4:01 AM
llvm/test/Transforms/MemCpyOpt/memcpy.ll
596

I believe you need something like this:

define void @test() {
  %val = alloca i8, align 1
  store i32 42, ptr %val
  %val1 = alloca i8, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr %val1, ptr %val, i64 1, i1 false)
  call void @f_full_readonly(ptr %val1)
  ret void
}

Here the align 1 on the first alloca will be changed to align 4 by the transform.

khei4 edited the summary of this revision. (Show Details)Jun 7 2023, 4:21 AM
khei4 updated this revision to Diff 529281.Jun 7 2023, 6:24 AM

rebase for enforced alignment

khei4 marked an inline comment as done.Jun 7 2023, 6:25 AM
khei4 added inline comments.
llvm/test/Transforms/MemCpyOpt/memcpy.ll
596

Thank you! Now I see DataLayout specify the alignment for stored types!

nikic added inline comments.Jun 7 2023, 8:14 AM
llvm/test/Transforms/MemCpyOpt/memcpy.ll
600

Hm, so I expected that for this test case the transform would happen and change to align 4 here. Do you know why it doesn't?

nikic added inline comments.Jun 7 2023, 8:21 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
1685

I believe these MSSA->getMemoryAccess(MDep) calls can be replaced by Clobber.

llvm/test/Transforms/MemCpyOpt/memcpy.ll
600

Hm, this does work when I test the patch locally, so maybe just outdated test result?

khei4 marked an inline comment as done.Jun 7 2023, 9:34 AM
khei4 added inline comments.
llvm/test/Transforms/MemCpyOpt/memcpy.ll
600

Hm, this does work when I test the patch locally, so maybe just outdated test result?

Yeah, Sorry... I have forgotten to build before the test update. I believe this works fine.

khei4 updated this revision to Diff 529346.Jun 7 2023, 9:35 AM

update tests

nikic accepted this revision.Jun 8 2023, 12:11 AM

LGTM

This revision is now accepted and ready to land.Jun 8 2023, 12:11 AM
This revision was landed with ongoing or failed builds.Jun 9 2023, 11:47 PM
This revision was automatically updated to reflect the committed changes.