This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Allow memcpys from constant memory to readonly nocapture parameters to be elided.
ClosedPublic

Authored by pcwalton on Oct 27 2022, 1:51 AM.

Details

Summary

Currently, InstCombine can elide a memcpy from a constant to a local alloca if
that alloca is passed as a nocapture parameter to a *function* that's readnone
or readonly, but it can't forward the memcpy if the *argument* is marked
readonly nocapture, even though readonly guarantees that the callee won't
mutate the pointee through that pointer. This patch adds support for detecting
and handling such situations, which arise relatively frequently in Rust, a
frontend that liberally emits readonly.

A more general version of this optimization would use alias analysis to check
the call's ModRef info for the pointee, but I was concerned about blowing up
compile time, so for now I'm just checking for one of readnone on the function,
readonly on the function, or readonly on the parameter.

Diff Detail

Event Timeline

pcwalton created this revision.Oct 27 2022, 1:51 AM
pcwalton requested review of this revision.Oct 27 2022, 1:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 27 2022, 1:51 AM
nikic added inline comments.Oct 27 2022, 3:07 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
92

Use onlyReadsMemory(DataOpNo) here, to also handle readnone implicitly.

93

Do we actually need the noalias requirement here? The code in this function should visit all uses of the alloca and bail out on any captures. So if there is a second argument based on the alloca that is not readonly, then we would reject it at that point. Does that sound right?

llvm/test/Transforms/InstCombine/forward-memcpy-to-readonly-noalias.ll
4 ↗(On Diff #471071)

Use update_test_checks.py please.

nikic added inline comments.Oct 27 2022, 3:50 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
92

Noticed a pre-existing issue while reviewing this, fixed in https://reviews.llvm.org/D136834.

nikic added inline comments.Oct 27 2022, 9:05 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
91–93

I think this change introduces a subtle bug wrt capturing: If we have call without return value that is not-nocapture and not-readonly, and only the parameter is readonly, then the parameter might be captured by writing to some other location (e.g. a global) and then the captured pointer might be used after the call in a way we can't inspect.

So I think if only the parameter is readonly, then we need to require that it is nocapture as well.

pcwalton updated this revision to Diff 471270.Oct 27 2022, 12:56 PM

Addressed review comments.

pcwalton marked 5 inline comments as done.Oct 27 2022, 12:56 PM
pcwalton added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
93

Good point! I'll remove that.

nikic added inline comments.Oct 28 2022, 4:00 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
91–93

Nit: It's kind of odd that you negate doesNotCapture here, and then negate it again at both uses.

93

Why the IsArgOperand check here?

llvm/test/Transforms/InstCombine/forward-memcpy-to-readonly.ll
4 ↗(On Diff #471270)

It looks like the existing tests for this transform are in memcpy-from-global.ll, I'd suggest adding the new test cases there as well. test3-test9 have call-related cases.

17 ↗(On Diff #471270)

Drop noundef/nonnull here, as they are not relevant to the transform.

There should be two additional negative tests, where the transform does not occur:

  • Argument is only readonly, without nocapture.
  • The pointer is also passed to a second argument that is not readonly.
nikic added a comment.Oct 28 2022, 4:08 AM

By the way, the asan test failure in printf-5.c may be legitimate. Sometimes, this kind of change ends up optimizing away the code causing the asan failure. In that case it's necessary to adjust the test case to prevent the optimization.

pcwalton marked an inline comment as done.Oct 28 2022, 9:03 PM
pcwalton added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
93

Hmm, it was necessary to avoid an assert in varargs in earlier versions of the patch that checked the attributes directly, but it doesn't seem necessary anymore. I'll remove it.

pcwalton updated this revision to Diff 471720.Oct 28 2022, 11:22 PM

Review comments addressed. I tried to fix the asan test as well; let me know if that approach looks good.

Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2022, 11:22 PM
Herald added subscribers: Restricted Project, Enna1. · View Herald Transcript
pcwalton marked 2 inline comments as done.Oct 28 2022, 11:22 PM
nikic added inline comments.Oct 29 2022, 2:26 AM
compiler-rt/test/asan/TestCases/printf-5.c
22 ↗(On Diff #471720)

I'm really confused by what this test is trying to do with the memcpy here. Would the test pass with just volatile char fmt[2] = "%c"? That'll produce a volatile memcpy, and that shouldn't get optimized away.

llvm/test/Transforms/InstCombine/memcpy-from-global.ll
350

This can be call void @bar(ptr nocapture readonly %A). We don't need separate functions and can vary the attributes at the call-site only. Same below.

pcwalton updated this revision to Diff 471758.Oct 29 2022, 10:44 AM

Addressed review comments.

pcwalton marked 2 inline comments as done.Oct 29 2022, 10:45 AM
pcwalton updated this revision to Diff 471786.Oct 29 2022, 2:58 PM

Addresses comments.

I actually found that this optimization was kicking in even for volatile
allocas. I assume this was a bug, so I fixed it and added a test case.

pcwalton added inline comments.Oct 29 2022, 3:00 PM
compiler-rt/test/asan/TestCases/printf-5.c
22 ↗(On Diff #471720)

Turns out that InstCombine was optimizing away volatile memcpys. I fixed that, added a new test for it, and then made the change you suggested.

nikic added a comment.Oct 29 2022, 3:23 PM

Can you please precommit your test cases (they look fine) and split off the volatile change into a separate patch? I agree that this was a bug, but it's not directly related to the change here.

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
133

There is another volatile check above, just drop the operand check there?

I split the tests out to D137033.

pcwalton updated this revision to Diff 471826.Oct 30 2022, 2:47 AM

Rebased. This should be ready for sign off now.

nikic accepted this revision.Oct 30 2022, 3:46 AM

LGTM, but the patch description could use an update (no longer relies on noalias).

This revision is now accepted and ready to land.Oct 30 2022, 3:46 AM
pcwalton retitled this revision from [InstCombine] Allow memcpys from constant memory to readonly noalias parameters to be elided. to [InstCombine] Allow memcpys from constant memory to readonly nocapture parameters to be elided..Oct 30 2022, 10:23 AM
pcwalton edited the summary of this revision. (Show Details)
pcwalton updated this revision to Diff 471862.Oct 30 2022, 1:22 PM

Rebased to make sure CI succeeds.

Failure seems like an unrelated Windows Fortran build error.

This revision was landed with ongoing or failed builds.Oct 30 2022, 2:41 PM
This revision was automatically updated to reflect the committed changes.