This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] Check for throwing calls during call slot optimization
ClosedPublic

Authored by nikic on Oct 4 2020, 9:13 AM.

Details

Summary

When performing call slot optimization for a non-local target, we need to check whether there may be throwing calls between the call and the copy. Otherwise, the copy may never be reached.

This was already done for call slot optimization of load/store, but not for memcpys. For the sake of clarity, I'm moving this check into the common optimization function, even if that does need an additional instruction scan for the load/store case.

Diff Detail

Event Timeline

nikic created this revision.Oct 4 2020, 9:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2020, 9:13 AM
nikic requested review of this revision.Oct 4 2020, 9:13 AM
nikic added inline comments.Oct 4 2020, 2:58 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
835

Replying to @efriedma's comment on https://reviews.llvm.org/D88805#inline-824472: I believe checking for throwing instructions rather than guaranteed-to-transfer is correct here, because cpyDest cannot be read between the call and the copy, as a precondition for the call-slot optimization. So even if one of the instructions does not return, an early write to cpyDest will not be observable, with two caveats:

  • The call itself might read cpyDest. We explicitly protect against this below (see the getModRefInfo and callCapturesBefore checks).
  • We might throw and the catching code might read cpyDest. That's what we protect against here.
efriedma added inline comments.Oct 4 2020, 3:28 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
835

The case I'm concerned about is a multithreaded context. Say there's an infinite loop between the call and the write to cpyDest. Then the current thread never touches cpyDest at all, so some other thread could legally use the memory.

If we're sure no other thread can access the memory, ensuring we don't throw/longjmp would be sufficient, for the reasons you describe.

nikic updated this revision to Diff 296284.Oct 5 2020, 1:22 PM

Add more detailed comment and TODO.

nikic added inline comments.Oct 5 2020, 1:30 PM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
835

I think you are right, and there is nothing that prevents that from happening right now. Unfortunately this also seems like a more significant limitation (not just because of the willreturn requirement, but also because this affects captured allocas, unlike the unwinding case). For now I've added a more extensive comment listing the different cases and marked this part as a TODO.

efriedma accepted this revision.Oct 5 2020, 6:05 PM

LGTM; this seems like a step in the right direction in any case.

This revision is now accepted and ready to land.Oct 5 2020, 6:05 PM
jdoerfert added inline comments.
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
847

You could use the fn attributes to avoid this loop all together. nounwind should imply this just fine.