Page MenuHomePhabricator

[InstCombine] Improve TryToSink for side-effecting calls that would be trivially dead
Needs RevisionPublic

Authored by anna on Sep 16 2021, 1:19 PM.

Details

Summary

This patch adds support to sink side-effecting calls that are legal to
sink to the successor use block. We can sink such calls as long as these
two conditions are satisfied:

  1. The instruction would be trivially dead (i.e. if there were no uses of the instruction, we can remove the instruction).
  2. There are no side-effecting instructions between the current one until the end of the block.

This helps with sinking allocations down to use blocks when safe to do
so (see added testcases).

Diff Detail

Event Timeline

anna created this revision.Sep 16 2021, 1:19 PM
anna requested review of this revision.Sep 16 2021, 1:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 16 2021, 1:19 PM
mkazantsev added inline comments.Sep 23 2021, 11:08 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3667

What if I has implicit control flow, e.g. may call system exit?

anna added inline comments.Sep 27 2021, 6:42 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3667

So, after this transform, we would have some non side-effecting instructions that are executed before this I that may call system exit. Why would this matter?

Note that if we were moving this instruction past some side-effecting (example not marked willreturn), there is a problem. But that isn't the case here.

Skimming the code for this and the prerequisite patch, I'm a bit uncomfortable. It really feels like you're trying to a) do several things in one change, and b) possibly abusing an interface for a purpose it wasn't intended.

I would suggest moving the no-alias-call bit to a separate patch as that seems unrelated to the core functionality.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3718

Hm, I am not sure here, but I think you should check for potential reads here as well.

My reasoning is as follows. While the instruction we're moving might be trivially dead if there are no users, it may *not* be trivially dead if there are users. There may be some externally arranged invariant that (say) the result is stored with a volatile variable *if-and-only-if* the call actually has a side effect.

As an example, consider the following:
a = malloc()
stat = malloc_library_counter()
if (c)

free(a)

Basically, is it legal to sink the malloc in this case? I legitimately don't know what the right answer is here.

Another example might be:
a = malloc()
stat = malloc_library_counter()
if (c)

(volatile i8*)g = a;
reames added inline comments.Sep 28 2021, 9:57 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3718

JFYI, my potential cases just described are definitely disallowed by the function comment in the header which describes trivial-deadness. Unfortunately, so is malloc. As such, that comment is definitely wrong, and needs updated. :)

anna added a comment.Sep 30 2021, 8:54 AM

Skimming the code for this and the prerequisite patch, I'm a bit uncomfortable. It really feels like you're trying to a) do several things in one change, and b) possibly abusing an interface for a purpose it wasn't intended.

For others following along: we plan to discuss offline. Will summarize result.

I would suggest moving the no-alias-call bit to a separate patch as that seems unrelated to the core functionality.

Unfortunately, the no-alias bit is required for the malloc, since the Scan instruction *starts* from the instruction we try to sink (i.e. the malloc call) and we fail at mayWriteToMemory. We can frame this as possibly a CT win and land it separately.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3718

I'm not sure if trivial-deadness is the property used in DSE for removing the malloc, but the case above is handled by DSE even if we have an externally arranged invariant through malloc_library_counter for example:

cat simple.ll

declare noalias i8* @malloc(i32) willreturn
declare i32 @malloc_library_counter()
define void @test20A() {
  %m = call i8* @malloc(i32 24) 
   %stat = call i32 @malloc_library_counter()
   store i8 0, i8* %m
   ret void
}

` opt -basic-aa -dse -S simple.ll` :

define void @test20A() {

%stat = call i32 @malloc_library_counter()
ret void

}

anna planned changes to this revision.Oct 1 2021, 7:05 AM

Had an offline discussion with Philip. Plan to separate out two APIs showing the difference between "isTriviallyDeadOnAllPaths" versus "isTriviallyDeadOnSomePaths".

anna updated this revision to Diff 390089.Fri, Nov 26, 9:49 AM

separated out API

anna updated this revision to Diff 391088.Wed, Dec 1, 11:10 AM

rebased

nikic requested changes to this revision.Wed, Dec 1, 11:36 AM
nikic added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3709

Comparing with the previous code block, the DestBlock->getUniquePredecessor() != I->getParent() check is missing. With an unreachable/return terminator, there may be intervening blocks. Please add a test for this.

This revision now requires changes to proceed.Wed, Dec 1, 11:36 AM
nikic added inline comments.Wed, Dec 1, 11:55 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3714

I find the reasoning in terms of mayHaveSideEffects() in this patch confusing. A "side effect" in LLVM can either be a memory write, unwinding or divergence. I think the side effects you have in mind here are only of the "memory write" variety. I'd prefer if this code was more explicit about the actual correctness requirements.

llvm/test/Transforms/InstCombine/sink_instruction.ll
235

This test doesn't make sense to me, at least as a test for the new functionality. It would already fold beforehand because @log() is marked side-effect free. I think you were trying to test the isMathLibCallNoop() case, but I don't think it is relevant to your patch, because it requires constant arguments -- in which case the result will be constant folded anyway and the call removed.

nikic added a comment.Wed, Dec 1, 11:57 AM

Do you have a use case for this apart from sinking allocations? If not, I would recommend restricting this to allocations only, because the generalized reasoning seems pretty subtle to me, and I'm not sure in which cases (apart from allocations) it would help.

anna added a comment.Wed, Dec 1, 12:29 PM

Do you have a use case for this apart from sinking allocations? If not, I would recommend restricting this to allocations only, because the generalized reasoning seems pretty subtle to me, and I'm not sure in which cases (apart from allocations) it would help.

To be honest, I ended up writing a sinking allocations test support, just so that we have this code upstream :)
In our downstream use case, we have support for additional "trivially dead" instructions in java world, such as object.hashCode and identityHashCode. These modify state, so they are not readnone, but they can be sunk to uses (which is beneficial especially if use is on cold path).

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3714

I did mean instructions that are side-effecting, not just the subset of "memory write". For an allocation, it may be enough to worry about "memory writes" only, but we're looking at the entire set of Instructions that are trivially dead on unused paths (see main comment about downstream use cases for example).

bool Instruction::mayHaveSideEffects() const {
    return mayWriteToMemory() || mayThrow() || !willReturn();
  }

Consider this:

  %x = call @A() <-- mayThrow
  %y = call @B() <-- mayThrow`
...

BB:
  use (%x)

If the original behaviour of the program was such that call to @A can throw an exception and we can handle the exception and call @B also throws an exception, by sinking the call @A we have changed the order of exceptions.
Or even the case if call @B does not return.

I am not sure why we are making it more general here by relaxing the restrictions?

llvm/test/Transforms/InstCombine/sink_instruction.ll
235

Yes, you're right. It is not relevant. Makes sense to land separately.
I do not have a way to showcase a test since there's no test which would trigger in this patch upstream. At least pretty much everything we already have triggers without this patch (either they are side-effect free or the instruction itself has no uses such as true guards and assumes).

Unfortunately, the only testcase which works with this patch is the alloc case, but there was a request to separate out from this patch because it looks slightly orthogonal (D114648).

reames added a comment.Wed, Dec 1, 1:56 PM

Thinking about it a bit, I think there are some interesting cases where we can sink something with an analyzeable side effect. An example I came up with is a large function with out-params which are unused in the caller.

Here's an attempt at a C example:

int foo();

extern int test(int *out) __attribute__((noinline));
int test(int *out) {
  *out = foo();
  return foo();
}

int wrapper() {

  int notdead;
  if (test(&notdead))
    return 0;

  int dead;
  int tmp = test(&dead);
  if (notdead)
    return tmp;
  return foo();
}

Key points in the example:

  • foo is used to represent a large *no-throw* block of code with hard to IPO results.
  • We have a parameter which is only written. (Today, we fail to infer this fact from IR oddly. We should fix that.)
  • We have an unused temporary created to pass to said out-param.
  • We have two or more callers, some of which *do* use the out-param.

In the IR, we'd look for something like:

  • An argmemonly nothrow willreturn function, with one or more write only params.
  • A callsite to said function where the memory passed is an alloca used only by lifetime markers and the call.

Such a call site can be sunk to the dataflow uses without changing the visibility of the side effect.

To be careful, this comment is meant to motivate the basic idea, not any particular implementation detail of this patch.

anna added a comment.Fri, Dec 3, 6:23 AM

Thinking about it a bit, I think there are some interesting cases where we can sink something with an analyzeable side effect. An example I came up with is a large function with out-params which are unused in the caller.

Thanks Philip. This is a definitely interesting example worth handling. Let me update the code to consider this use case and precommit some of the tests. The use case we have downstream (as mentioned is object.hashCode), but this example makes a good usecase for upstream IMO.

nikic added inline comments.Fri, Dec 3, 11:54 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3714

If wouldInstructionBeTriviallyDeadOnUnusedPaths() is true, then you are basically promising that call @A cannot throw, or at least we're allowed to pretend it can't. Otherwise it would clearly be illegal to sink the call past control flow in any case.

Now, do we care about whether call @B throws? If we sink call @A past a call that throws or diverges, then side-effects from call @A may not occur, but the whole premise of the optimization is that they do not matter. You are already sinking the call past explicit control flow into a different block, so I don't see why sinking it past implicit control flow would be relevant.