This is an archive of the discontinued LLVM Phabricator instance.

[DSE] Remove calls with known writes to dead memory
ClosedPublic

Authored by reames on Dec 16 2021, 1:44 PM.

Details

Summary

The majority of this change is sinking logic from instcombine into MemoryLocation such that it can be generically reused. If we have a call with a single analyzable write to an argument, we can treat that as-if it were a store of unknown size.

Merging the code in this was unblocks DSE in the store to dead memory code paths. In theory, it should also enable classic DSE of such calls, but the code appears to not know how to use object sizes to refine unknown access bounds (yet).

In addition, this does make the isAllocRemovable path slightly stronger by reusing the libfunc and additional intrinsics bits which are already in getForDest.

A couple ideas for follow up (which I don't plan to do):

  • The more I look at this, the more I'm starting to think the capture handling is conservative. It feels like the write and no-return handling should be enough to disallow capture in the problematic cases. Maybe we can relax this? Or adjust inference rules to simply infer?
  • We should be able to use known object size info in DSE per the added test cases.
  • We should be able to remove empty lifetime start/end ranges in DSE, and thus kill off remaining alloca uses. Today, this will take another instcombine run. Adding the DSE support is most useful when we have two distinct live ranges, one live and one dead. DSE can recognize the dead one cheaply (even if large), where instcombine really can't.

Diff Detail

Event Timeline

reames created this revision.Dec 16 2021, 1:44 PM
reames requested review of this revision.Dec 16 2021, 1:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 16 2021, 1:44 PM
nikic added inline comments.Dec 17 2021, 12:21 AM
llvm/include/llvm/Analysis/MemoryLocation.h
260

This might want to highlight that the function can still read other memory.

llvm/lib/Analysis/MemoryLocation.cpp
154

I'm not sure the willreturn and nounwind checks really belong in here. We should try to separate the modelling of different effects, and MemoryLocation should only be modelling memory effects. The willreturn/nounwind checks can be done in the caller.

After all, the statement that only this one location is written remains true regardless of whether the call unwind or diverge, the latter only affect whether it can be removed.

Please check failing ASAN tests

Please check failing ASAN tests

These are almost certainly spurious. The pre-commit CI picks a random base commit when doing builds. This change has nothing to do with compiler-rt, so this breakage is probably in the base commit chosen.

llvm/include/llvm/Analysis/MemoryLocation.h
260

I thought the wording about side effects covered this, but will do so if desired.

llvm/lib/Analysis/MemoryLocation.cpp
154

I'm sympathetic to this concern, but this appears to already be the implicit contract of this method. In particular, DSE does not otherwise check these properties for any of the intrinsics and libfuncs.

I propose that we go with this, and if desired, try to annotate all the intrinsics/libfuncs separately so that these checks can be lifted out to the two callers.

nikic added inline comments.Dec 17 2021, 9:41 AM
llvm/lib/Analysis/MemoryLocation.cpp
154

This code was only recently moved from DSE into here, and it makes sense that DSE did not check this while it was still working on a hardcoded list of functions. We should already be specifying/inferring the necessary attributes (though possibly the tests don't have the full set of inferred attributes).

186

Thinking about libcalls, I just realized that this should probably not returning a getBeforeOrAfter(UsedV), but rather remembering which argument the write is on and then calling MemoryLocation::getForArgument() on that. That would likely subsume the libcall cases by returning a more accurate location if getForArgument() implements something more specific.

This would lose the case where the same argument is passed to the function twice, but I don't think we particularly care about that (as it would have to be literally the same argument as implemented).

@nikic I don't disagree with either of your recent suggestions, but the reorg your asking for is a huge amount of churn. Do you mind if this goes in, and then we try to cleanup the attribute handling in the direction you've indicated? I'm happy to do some of the work, but frankly, I suspect that's a lot more involved than you're making it out to be and I don't want to go boil the ocean. :)

On the single argument vs multiple arguments, I want to preserve the functionality of returning a conservative result if passed twice, but supporting both the precise and less precise results would not be hard to do. My strong preference would for a follow up though.

@nikic Looks like you proved my concern about scoping wrong with D115962. Once that lands, I'll rebase to include the restructure you requested on the attribute checking side.

reames updated this revision to Diff 395199.Dec 17 2021, 12:33 PM

Incorporate review comments.

Note: I tried removing the libfunc block, but that caused a test failure. I suspect there's just some attributes in the test which need updated, but I'd prefer to do that as a follow up.

reames updated this revision to Diff 395201.Dec 17 2021, 12:46 PM

re-rebase - I managed to drop tests last time.

nikic accepted this revision.Dec 17 2021, 12:55 PM

LGTM

This revision is now accepted and ready to land.Dec 17 2021, 12:55 PM
This revision was landed with ongoing or failed builds.Dec 17 2021, 1:42 PM
This revision was automatically updated to reflect the committed changes.
nikic reopened this revision.Dec 18 2021, 12:32 AM

Reverted this because it does break the strncpy-overflow.cpp test case in asan (https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/asan/TestCases/strncpy-overflow.cpp). The strncpy is now getting optimized away (correctly). Not sure what the policy in this case is, but I assume it's updating the test case to be more robust rather than trying to avoid the optimization in sanitized functions?

I found this change a bit surprising because I thought we'd already do this based on the previous attribute code in InstCombine. The reason we didn't is that strncpy returns the first argument, which means that it's not nocapture. But we do provide getForDest() information for this libcall, and that got exposed to InstCombine now.

I guess this is also why you say failures when dropping the libcall handling: We do still need it for libcalls that return the argument.

This revision is now accepted and ready to land.Dec 18 2021, 12:32 AM
xbolva00 added a comment.EditedDec 18 2021, 12:42 AM

Please check failing ASAN tests

Well I told you so…

Use -fno-builtin-strncpy in RUN commands

lkail added a subscriber: lkail.Dec 18 2021, 2:06 AM

Reverted this because it does break the strncpy-overflow.cpp test case in asan (https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/asan/TestCases/strncpy-overflow.cpp). The strncpy is now getting optimized away (correctly). Not sure what the policy in this case is, but I assume it's updating the test case to be more robust rather than trying to avoid the optimization in sanitized functions?

First, thank you for reverting. I'd seen the build failures, but since we all coming from bots I've mentally marked as unstable (as in, they fail often enough there's no signal in the messages), I didn't see the real failure here.

Second, yeah, that test is simply wrong/unstable. I will take a shot at trying to stabilize it, but I see at least two other transform paths which can break it as written. I'll probably just end up deferring the problem to the next person, but hey, that's how this sometimes works.

I found this change a bit surprising because I thought we'd already do this based on the previous attribute code in InstCombine. The reason we didn't is that strncpy returns the first argument, which means that it's not nocapture. But we do provide getForDest() information for this libcall, and that got exposed to InstCombine now.

Well, not quite. So long as the return value is unused, the strncpy is in fact nocapture on both arguments.

This is confirming my suspicion that nocapture is too conservative for this case. If the only capture can be a write to the dead memory, then the callee is effectively nocapture.

Given DSE wasn't previously checking for the use on the return value, I suspect we may have had a latent miscompile before your recent change

I guess this is also why you say failures when dropping the libcall handling: We do still need it for libcalls that return the argument.

I went back and skimmed the tests in libcalls.ll (the ones which failed without the libcall handling). None of them cover the case where a strncpy actually captures by returning. It's possibly I'd have noticed that while updating tests, but equally likely I'd have missed it.

This revision was automatically updated to reflect the committed changes.