This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] erase instructions leading up to unreachable
ClosedPublic

Authored by spatel on Sep 4 2020, 8:44 AM.

Details

Summary

Normal dead code elimination ignores assume intrinsics, so we fail to delete assumes that are not meaningful (and potentially worse if they cause conflicts with other assumptions).

The motivating example in https://llvm.org/PR47416 suggests that we might have problems upstream from here (difference between C and C++), but this should be a cheap way to make sure we remove more dead code.

Diff Detail

Event Timeline

spatel created this revision.Sep 4 2020, 8:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2020, 8:44 AM
spatel requested review of this revision.Sep 4 2020, 8:44 AM
nikic added a comment.EditedSep 4 2020, 11:53 AM

This doesn't seem assume specific -- shouldn't we instead make this a visitUnreachableInst fold that removes preceding guaranteed-to-transfer instructions?

Also, as implemented this may remove assumes that are relevant, if we're not guaranteed-to-transfer from the assume to the unreachable terminator.

This doesn't seem assume specific -- shouldn't we instead make this a visitUnreachableInst fold that removes preceding guaranteed-to-transfer instructions?

Also, as implemented this may remove assumes that are relevant, if we're not guaranteed-to-transfer from the assume to the unreachable terminator.

The first point is the important one. unreachable should eat any preceding instruction that is guaranteed to transfer execution.
We should not remove assumes for other reasons.
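The fold described here can be sketched on a toy model (hypothetical names; not the actual InstCombine code, which operates on real LLVM IR):

```cpp
#include <string>
#include <vector>

// Toy model of a basic block that ends in unreachable: each "instruction"
// carries a flag standing in for what LLVM's
// isGuaranteedToTransferExecutionToSuccessor() would answer.
struct Inst {
  std::string Name;
  bool TransfersToSuccessor; // false for calls that may not return, etc.
};

// Walk backwards from the unreachable terminator (the last element) and
// erase every instruction that is guaranteed to transfer execution to its
// successor. Stop at the first instruction that is not, since execution
// may legitimately halt there before the unreachable is ever reached.
void eraseUpToUnreachable(std::vector<Inst> &Block) {
  while (Block.size() > 1 && Block[Block.size() - 2].TransfersToSuccessor)
    Block.erase(Block.end() - 2);
}
```

Note that the loop stops at a possibly-noreturn call: everything after that point is only dead if the call actually returns, which the model cannot assume.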

spatel updated this revision to Diff 290082.Sep 5 2020, 6:01 AM
spatel retitled this revision from [InstCombine] erase assume in block that ends in unreachable to [InstCombine] erase instructions leading up to unreachable.

Patch updated:
Reimplemented as a "must-reach-unreachable" transform. This shows removing a store from another test.
I only realized the EHPad restriction was needed by seeing regression test crashes, so I'm not sure if that is a sufficient guard.

lebedev.ri added inline comments.Sep 5 2020, 6:11 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2805

I think this should be a loop, otherwise it may take extra worklist iterations to catch everything

nikic added inline comments.Sep 5 2020, 8:01 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2805

Right, it should either be a loop, or use

eraseInstFromFunction(*Prev);
return &I;

to reprocess the UnreachableInst.

spatel updated this revision to Diff 290254.Sep 7 2020, 5:24 AM
spatel marked 2 inline comments as done.

Patch updated:

  1. Re-process the unreachable to avoid extra iterations. I tried the loop variant of that too, but it triggered infinite looping on some tests, and I haven't tracked down why.
  2. Added a test to verify that we are not iterating extra times.
  3. Added a use_empty() check to try to make this safer - anything can happen in an unreachable block.
lebedev.ri added inline comments.Sep 7 2020, 5:29 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2806

Is there any testcase with non-empty use list?
I'd think we could RAUW it with undef.

nikic added inline comments.Sep 7 2020, 5:47 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2806

I don't see how the last instruction can have a non-empty use-list if followed by unreachable and not part of a properly unreachable block (which is ensured by instcombine early). I think it can be an assert.

llvm/test/Transforms/InstCombine/assume.ll
2

This should be -instcombine-infinite-loop-threshold=2. Otherwise InstCombine will just stop after two iterations without performing further folds -- but we may not see that easily.

spatel updated this revision to Diff 290279.Sep 7 2020, 6:45 AM
spatel marked 3 inline comments as done.

Patch updated:

  1. Removed unused check.
  2. Updated test option to "-instcombine-infinite-loop-threshold=2".
lebedev.ri accepted this revision.Sep 7 2020, 6:50 AM

LGTM, thanks

This revision is now accepted and ready to land.Sep 7 2020, 6:50 AM
spatel added inline comments.Sep 7 2020, 6:54 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
2806

Yes, I'm not seeing a way to get bogus IR (eg, that references itself) into here unless it's truly in a block that can't be reached from entry. I'll remove the check - the assert already exists in eraseInstFromFunction().

This revision was automatically updated to reflect the committed changes.

This change affected a case like this:

code with side effects, for example, volatile stores
an unreachable instruction __builtin_unreachable

Although executing __builtin_unreachable is undefined, removing the code before it deletes their side effects.

Can we temporarily revert this change and add side-effect checking for the code to be removed?

This change affected a case like this:

code with side effects, for example, volatile stores
an unreachable instruction __builtin_unreachable

Although executing __builtin_unreachable is undefined, removing the code before it deletes their side effects.

Oh, hm.
While volatile stores aren't a great example, we certainly shouldn't be dropping e.g. printf() calls.

Can we temporarily revert this change and add side-effect checking for the code to be removed?

It doesn't appear we have such a check handy (isAssumedSideEffectFree() in Attributor seems like the best match),
so i guess it would be fine.

nikic added a comment.Sep 9 2020, 9:27 AM

Although executing __builtin_unreachable is undefined, removing the code before it deletes their side effects.

I believe this behavior is correct. LangRef explicitly states that execution continues past a volatile store. As such, unreachable must be reached, which is undefined behavior. As such, we are free to optimize as we wish, including removing a preceding volatile store.

nikic added a comment.Sep 9 2020, 9:32 AM

@lebedev.ri I don't think the printf() case is any different. Keep in mind that undefined behavior acts retroactively, so if we know that unreachable will be executed later, we are free to optimize away side-effects before it as well. (If printf is not considered willreturn, then we cannot optimize it away, and won't.)

@lebedev.ri I don't think the printf() case is any different. Keep in mind that undefined behavior acts retroactively, so if we know that unreachable will be executed later, we are free to optimize away side-effects before it as well. (If printf is not considered willreturn, then we cannot optimize it away, and won't.)

IOW you are saying that in https://godbolt.org/z/TEo6oj tail call void @_Z5errorv() should ideally be optimized out?

@lebedev.ri I don't think the printf() case is any different. Keep in mind that undefined behavior acts retroactively, so if we know that unreachable will be executed later, we are free to optimize away side-effects before it as well. (If printf is not considered willreturn, then we cannot optimize it away, and won't.)

I agree. Without analyzing the format of printf we should not be able to make it willreturn in the first place. That said, there is no good way to ensure an effect is kept unless you ensure the effect does not inevitably lead to UB.
This patch doesn't change that in principle, it just makes it more obvious. Take https://godbolt.org/z/4qMb5G, which shows how we hoist a division if we assume printf is willreturn and nounwind, "eliminating" one printf call in the case d == 0 (which leads to UB).

@lebedev.ri I don't think the printf() case is any different. Keep in mind that undefined behavior acts retroactively, so if we know that unreachable will be executed later, we are free to optimize away side-effects before it as well. (If printf is not considered willreturn, then we cannot optimize it away, and won't.)

IOW you are saying that in https://godbolt.org/z/TEo6oj tail call void @_Z5errorv() should ideally be optimized out?

If _Z5errorv was willreturn and nounwind, yes. Otherwise, no. Basically, if isGuaranteedToTransferExecutionToSuccessor given the call returns true.

This comment was removed by lebedev.ri.
stephan.yichao.zhao added a comment.EditedSep 9 2020, 10:04 AM

Although executing __builtin_unreachable is undefined, removing the code before it deletes their side effects.

I believe this behavior is correct. LangRef explicitly states that execution continues past a volatile store. As such, unreachable must be reached, which is undefined behavior. As such, we are free to optimize as we wish, including removing a preceding volatile store.

I think the question is whether "reaching an unreachable instruction is undefined" means only that the behavior when a program hits __builtin_unreachable is undefined, or that the behavior before __builtin_unreachable is hit is also undefined.

The behavior when a program hits __builtin_unreachable is definitely undefined, because a runtime can abort or hang or keep on going. The LangRef does not define it.

However, is the behavior before __builtin_unreachable is hit also undefined? Does LLVM have any existing transformation or optimization examples that change behavior before an undefined instruction?
I found some blogs: https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/ and https://blog.regehr.org/archives/213 They do not answer the question explicitly, although I found the first link says "If you're using an LLVM-based compiler, you can dereference a "volatile" null pointer to get a crash if that's what you're looking for, since volatile loads and stores are generally not touched by the optimizer."

If dereferencing a null volatile pointer has a defined crash behavior, it is not safe to remove volatile accesses before unreachable, because execution may never reach the unreachable if the access crashes first.

I checked llvm::isGuaranteedToTransferExecutionToSuccessor (https://llvm.org/doxygen/ValueTracking_8cpp_source.html).
Yes, calls that do not throw and do return are considered to transfer execution. But it does not treat volatile accesses specially with respect to whether such an access can abort the program.

Although executing __builtin_unreachable is undefined, removing the code before it deletes their side effects.

I believe this behavior is correct. LangRef explicitly states that execution continues past a volatile store. As such, unreachable must be reached, which is undefined behavior. As such, we are free to optimize as we wish, including removing a preceding volatile store.

I think the question is whether "reaching an unreachable instruction is undefined" means only that the behavior when a program hits __builtin_unreachable is undefined, or that the behavior before __builtin_unreachable is hit is also undefined.

The behavior when a program hits __builtin_unreachable is definitely undefined, because a runtime can abort or hang or keep on going. The LangRef does not define it.

A program execution either has defined behavior or it does not. If you cannot derive defined behavior for your *entire* execution, you do not have defined behavior for any part of your execution.
The fact that you see some side effects that you might expect before "UB is hit" is coincidental. Take this code:

if (A)
  printf("A is not null!\n");

*A = 0;

If you would execute this with A = nullptr you would cause UB. The compiler knows that and removes the conditional. Thus, when you run this with A = nullptr you will see the message before, you know, something else happens (=UB).
You can invert the conditional, i.e. !A, and remove the side effect with the same justification.

However, is the behavior before __builtin_unreachable is hit also undefined? Does LLVM have any existing transformation or optimization examples that change behavior before an undefined instruction?
I found some blogs: https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/ and https://blog.regehr.org/archives/213 They do not answer the question explicitly, although I found the first link says "If you're using an LLVM-based compiler, you can dereference a "volatile" null pointer to get a crash if that's what you're looking for, since volatile loads and stores are generally not touched by the optimizer."

If dereferencing a null volatile pointer has a defined crash behavior, it is not safe to remove volatile accesses before unreachable, because execution may never reach the unreachable if the access crashes first.

I do not believe this is defined at all. A volatile access might not return and might throw, but it is, IMHO, never *known* to "crash" (w/o target knowledge).

Regarding access to volatile memory...

If volatile memory access is known to not trap, then the guidance on LLVM's own blog is wrong:
https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/

And in fact, Clang's error message is wrong as it specifically suggests re-writing a dereference of a null pointer to use volatile in order to ensure a trap.

If we insist on following the LangRef rules, we would need to change the lowering of volatile memory accesses in Clang, and that doesn't seem especially useful. I think the LangRef is simply wrong on this point, and we need to admit that access to volatile memory, much like a call to an unknown external function, may trap, unwind, never return, etc. There is a significant amount of real C and C++ code written that relies on these rules, no doubt because LLVM and Clang advertised them. =]

(And to be clear, IMO we should revert this patch while the discussion concludes as this is breaking real code.)

Regarding access to volatile memory...

If volatile memory access is known to not trap, then the guidance on LLVM's own blog is wrong:
https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/

To be precise, I think the entire paragraph (below) is not correct (from today's perspective). The flag exists (in IR) and the behavior of volatile accesses, e.g. of nullptr, is target specific (in IR).

If you're using an LLVM-based compiler, you can dereference a "volatile" null pointer to get a crash if that's what you're looking for, since volatile loads and stores are generally not touched by the optimizer. There is currently no flag that enables random NULL pointer loads to be treated as valid accesses or to make random loads know that their pointer is "allowed to be null".

And in fact, Clang's error message is wrong as it specifically suggests re-writing a dereference of a null pointer to use volatile in order to ensure a trap.

We should have a trap intrinsic for that, and I thought we do actually... let's be honest, that is also a way nicer way to ensure a trap so we should advertise it either way.

If we insist on following the LangRef rules, we would need to change the lowering of volatile memory accesses in Clang, and that doesn't seem especially useful. I think the LangRef is simply wrong on this point, and we need to admit that access to volatile memory, much like a call to an unknown external function, may trap, unwind, never return, etc. There is a significant amount of real C and C++ code written that relies on these rules, no doubt because LLVM and Clang advertised them. =]

I would argue that scrubbing

The compiler may assume execution will continue after a volatile operation, so operations which modify memory or may have undefined behavior can be hoisted past a volatile operation.

from the LangRef and teaching isGuaranteedToTransferExecutionToSuccessor about this is the best option here.
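The tightening suggested here could be sketched as follows, on a toy model rather than the real ValueTracking code (names and flags are hypothetical):

```cpp
// Toy stand-in for the instruction properties relevant to the proposal.
struct Inst {
  bool IsVolatileAccess = false;
  bool MayNotTransfer = false; // e.g. a call to a possibly-noreturn function
};

// Sketch of the proposed semantics: a volatile access may trap or run a
// non-returning signal handler, so it would no longer be guaranteed to
// transfer execution to its successor -- just like an unknown call.
bool transfersExecutionToSuccessor(const Inst &I) {
  if (I.IsVolatileAccess)
    return false;
  return !I.MayNotTransfer;
}
```

Under this rule, the unreachable-erasure fold would automatically stop at a volatile store, since the backwards walk only erases instructions that are guaranteed to transfer execution.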

nikic added a comment.Sep 9 2020, 12:15 PM

Regarding access to volatile memory...

If volatile memory access is known to not trap, then the guidance on LLVM's own blog is wrong:
https://blog.llvm.org/posts/2011-05-13-what-every-c-programmer-should-know/

This is a (non-normative) blog post from 2011. The requirements on volatile operations have been clarified in D53184 after a llvm-dev discussion. Of course we can re-evaluate this decision, in which case a LangRef patch needs to be proposed and a new RFC on llvm-dev started.

The requirements on volatile operations have been clarified in D53184 after a llvm-dev discussion. Of course we can re-evaluate this decision, in which case a LangRef patch needs to be proposed and a new RFC on llvm-dev started.

In D53184, about trap, the reasoning is "If we don't restrict the effects of a volatile load/store, we have to assume it could, at worst, modify any memory that isn't local to the function and unwind the stack, like a call to an unknown function. That would block optimizations around volatile accesses. That's not what LLVM implements, and it's not what other compilers implement."

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120218.html talked about a hoisting issue. The question is: to allow moving a non-volatile operation across a volatile access, do we need all of the restrictions -- memory non-aliasing, non-looping, non-exiting? These properties are like a negation of a function with conservative assumptions. The first two are definitely necessary to swap a volatile access and a non-volatile access. But non-exiting is not necessary. For example,

volatile-access;
ordinary-access;

If ordinary-access is defined, then no matter whether volatile-access returns or exits, it is fine to swap them, because there are no other external events to distinguish the two orders.
If ordinary-access is undefined, it is okay to swap because undefined behavior is retroactive.

So if we rewrite "The compiler may assume execution will continue after a volatile operation, so operations which modify memory or may have undefined behavior can be hoisted past a volatile operation." in LangRef into "The compiler may assume execution will continue after a volatile operation, or that a volatile operation may exit", hoisting still works.

Do other optimizations need all of these restrictions to work? https://reviews.llvm.org/D65375 removed checks of volatile from isGuaranteedToTransferExecutionToSuccessor. After that, some optimizations like this diff may have been enabled.

(Continuing the discussion here as I'm not really aware of a better place... feel free to point to a different thread if better... I don't actively follow llvm-dev at this point though, sorry if I've missed a post there.)

I don't have super strong opinions about exactly what semantics LLVM implements.

I have somewhat strong opinions about the semantics that users of C++ need from volatile memory accesses, and would like for Clang to have a clear way to lower that into LLVM IR.

The semantics I think are that the memory access is observable *outside* of the C++ abstract machine. I think that needs to include things like an emulator tracing memory accesses, IPC, MMIO, hardware exception handling (in the kernel), signal handlers (in userspace), and other external observation of memory access.

One specific example from the above: I think that C and C++ volatile accesses need to be usable to reliably trigger a fault due to failed virtual address translation or bad permissions on the mapped page on Linux and similar virtual memory operating systems. This is, for example, heavily used by kernels and other systems level code. It is even used in userspace for fast bounds checking and in some cases deoptimization. On many systems (certainly Linux userspace) these accesses should reliably trigger a signal handler. Signal handlers are not, AFAIK, guaranteed to return.

Given that, it seems very hard to argue that a volatile memory access must not trap, and must not unwind the stack -- signal handlers are allowed to do that, and it seems like a long held valid use case for volatile access to memory is to trigger a specific signal handler. Perhaps this use case just didn't come up in the prior discussions?

Another use case I have is very specifically to make a store not dead to the compiler if it is reachable. Benchmark utilities have relied on this usage of volatile for many years, and it would seem painful to try to re-teach people to use a different facility.

The last use case I have is to make a store not-dead to the compiler even if it is followed by something unreachable. This comes up in a few rare cases in benchmarks, but also comes up in a few other cases. It is perhaps the least important use case. But it happens to be easily satisfied by the combination of volatile accesses potentially intentionally triggering signal handlers, and not being considered a "dead" store even if unused.

I will point out that of all the modern compilers I could test, only Clang trunk deletes volatile stores, even followed by clearly unreachable code: https://compiler-explorer.com/z/EGTofn

I think Clang deviating in the semantics it provides for volatile stores is a serious mistake and bug. So my argument for reverting would be "we need to fix Clang to use some other lowering of volatile stores given the clarified LLVM semantics before landing patches that implement them" as otherwise we will regress users of Clang. If folks want to revisit the LLVM semantics in light of this, as I said, I don't have an especially strong opinion either way. My mild opinion is that it would be more useful for volatile accesses in LLVM to have a superset of the possible behaviors (and thus restrictions on optimizations) placed upon a call of an unknown function, in that execution might not proceed past the store. That would allow Clang and other frontends to use it for lowering these kinds of operations.

[...]

Given that, it seems very hard to argue that a volatile memory access must not trap, and must not unwind the stack -- signal handlers are allowed to do that, and it seems like a long held valid use case for volatile access to memory is to trigger a specific signal handler. Perhaps this use case just didn't come up in the prior discussions?

I think @efriedma was initially proposing to allow volatile accesses to trap, and that is what we should do here. It is not willreturn, for lack of a better "attribute". That solves the problem this patch exposes, matches the expectations expressed so far, and does not require us to force additional restrictions that we might not want. The change makes the IR semantics retroactively stricter, which in turn makes some optimizations we did before illegal, but keeps old IR valid under the new semantics.

I would say let's write an RFC and see if there are other opinions. Also, @nlopes what does alive2 think of such a proposal?

I would say let's write an RFC and see if there are other opinions. Also, @nlopes what does alive2 think of such a proposal?

Alive2 implements whatever semantics make sense for LLVM. We are not here to constrain the solutions, but to help :)
Right now Alive2 doesn't support volatile stores, and personally I haven't studied their semantics.

unreachable allows the compiler to delete any instruction that is executed beforehand, as long as that instruction is guaranteed to return. So noreturn function calls and __builtin_trap() are stopping points.
If volatile stores are guaranteed to trap, then they should be considered stopping points as well. I wouldn't know what to say about volatile stores that are guaranteed not to trap (i.e., when the compiler can prove the address is dereferenceable), but I'm inclined to say they can be removed, as UB allows externally observable behavior to be removed. If there's a compelling "niceness" argument because it's needed in practice, then we can leave volatile stores alone (e.g., are people writing *(volatile int*)0 = 0x42; __builtin_unreachable() in the Linux kernel rather than __builtin_trap()??).
Regular stores can always be removed.

For example, this is ok:

printf("foo")
store 42, @glb
icmp foo
unreachable
  =>
unreachable

I'm happy to implement those stricter volatile semantics (i.e., assume volatile accesses may trap) in Alive2 to help catch optimizations that are wrong w.r.t. that "new" semantics.

Ok, let me make it more concrete.
it seems we have 3 possible semantics:

  1. volatile accesses never trap, but rather trigger UB when the address is not dereferenceable
  2. they trap if the address is not dereferenceable
  3. they may trap regardless (i.e., they can never be removed). Alternatively we can state that the load/store address traces are externally observable and can't change

Regular accesses follow semantics 1).
Option 3) makes me uncomfortable. Proving refinement of that is non-trivial and I'm scared of the implications. I can study that option if there's demand.
Option 2) seems reasonable to me if needed to make LLVM nicer for some applications.

spatel added a comment.EditedSep 10 2020, 6:48 AM

We have an LLVM-internal example of the problem here:
https://llvm.org/PR47480

nikic added a comment.Sep 10 2020, 7:22 AM

I've applied https://github.com/llvm/llvm-project/commit/4e413e16216d0c94ada2171f3c59e0a85f4fa4b6 to temporarily disable the removal of volatile stores in particular. A LangRef change (and corresponding ValueTracking patch to align with new semantics) will be needed to make this permanent.

dmajor added a subscriber: dmajor.Sep 10 2020, 7:41 AM

I will point out that of all the modern compilers I could test, only Clang trunk deletes volatile stores, even followed by clearly unreachable code: https://compiler-explorer.com/z/EGTofn

Other compilers make effectively equivalent assumptions, even if they don't actively delete volatile operations before an unreachable. See https://compiler-explorer.com/z/7a81Gj : every non-clang compiler will hoist a divide by zero past a volatile operation. So I'm not sure we can claim there's any uniformity here.

One specific example from the above: I think that C and C++ volatile accesses need to be usable to reliably trigger a fault due to failed virtual address translation or bad permissions on the mapped page on Linux and similar virtual memory operating systems. This is, for example, heavily used by kernels and other systems level code. It is even used in userspace for fast bounds checking and in some cases deoptimization. On many systems (certainly Linux userspace) these accesses should reliably trigger a signal handler. Signal handlers are not, AFAIK, guaranteed to return.

Given that, it seems very hard to argue that a volatile memory access must not trap, and must not unwind the stack -- signal handlers are allowed to do that, and it seems like a long held valid use case for volatile access to memory is to trigger a specific signal handler. Perhaps this use case just didn't come up in the prior discussions?

A lot of the things you might want to do in a signal handler are things code on another thread could do anyway. But there's some design space here, I guess.

Another use case I have is very specifically to make a store not dead to the compiler if it is reachable. Benchmark utilities have relied on this usage of volatile for many years, and it would seem painful to try to re-teach people to use a different facility.

This should be covered by the existing semantics.


Assuming a volatile operation can terminate the thread, or maybe siglongjmp, probably doesn't hurt optimization too badly. (A little, but isGuaranteedToTransferExecutionToSuccessor() isn't necessary for most important optimizations.)

The thing that's probably most important from an optimization perspective is that we don't want to treat volatile as a synchronization barrier (so it could modify arbitrary memory); that would seriously hurt optimizations around volatile operations.

Hi!

We've run into a case where instcombine crashes with this patch:

opt -o /dev/null bbi-47401.ll -instcombine

on

@c = external global i16*, align 1

define void @main() {
entry:
  br label %for.cond

for.cond:                                         ; preds = %g.exit, %entry
  %conv1 = phi double [ %conv, %g.exit ], [ undef, %entry ]
  br i1 undef, label %for.end, label %for.body

for.body:                                         ; preds = %for.cond
  %0 = load i16*, i16** @c, align 1
  %1 = load i16, i16* %0, align 1
  %conv = sitofp i16 %1 to double
  unreachable

g.exit:                                           ; No predecessors!
  br label %for.cond

for.end:                                          ; preds = %for.cond
  %conv1.lcssa = phi double [ %conv1, %for.cond ]
  store double %conv1.lcssa, double* undef, align 1
  ret void
}

results in

opt: ../lib/Transforms/InstCombine/InstCombineInternal.h:448: virtual llvm::Instruction *llvm::InstCombinerImpl::eraseInstFromFunction(llvm::Instruction &): Assertion `I.use_empty() && "Cannot erase instruction that is used!"' failed.

I suppose it's the dead block %g.exit that messes up things.

We've run into a case where instcombine crashes with this patch: [...]

I suppose it's the dead block %g.exit that messes up things.

We had a use_empty() guard in an earlier draft of this patch and decided that could not happen. :)
https://reviews.llvm.org/D87149#2259207
Thanks for the test case!

We had a use_empty() guard in an earlier draft of this patch and decided that could not happen. :)
https://reviews.llvm.org/D87149#2259207
Thanks for the test case!

Unreachable BBs were the obvious potential source of a non-empty use list, but I thought instcombine didn't combine instructions in unreachable blocks.
Maybe that is the bug?

Unreachable BBs were the obvious potential source of a non-empty use list, but I thought instcombine didn't combine instructions in unreachable blocks.
Maybe that is the bug?

Instcombine tries to remove instructions in unreachable blocks, but we can create circular dependencies that instcombine can't reduce because it doesn't change the CFG. Here's a reduced version of the test (not sure if it's minimal):

define void @main(i16 %x) {
entry:
  br label %for.cond

for.cond:
  %p = phi double [ %conv, %g.exit ], [ undef, %entry ]
  br i1 undef, label %for.end, label %for.body

for.body:
  %conv = sitofp i16 %x to double
  unreachable

g.exit:
  br label %for.cond

for.end:
  store double %p, double* undef
  ret void
}

I guess we could try to eliminate phi operands if an incoming block (g.exit) is unreachable? But simplifycfg should do that already, so I don't know if we want instcombine trying to do that too.

I guess we could try to eliminate phi operands if an incoming block (g.exit) is unreachable? But simplifycfg should do that already, so I don't know if we want instcombine trying to do that too.

Actually, that would be out-of-bounds for instcombine. Removing a phi operand would mean removing the incoming block too, and we don't do that in instcombine. So I think we just bail out if a value still has uses. SimplifyCFG will eventually get it.

Actually, that would be out-of-bounds for instcombine. Removing a phi operand would mean removing the incoming block too, and we don't do that in instcombine. So I think we just bail out if a value still has uses. SimplifyCFG will eventually get it.

Replacing with undef before erasing would do as well.

Actually, that would be out-of-bounds for instcombine. Removing a phi operand would mean removing the incoming block too, and we don't do that in instcombine. So I think we just bail out if a value still has uses. SimplifyCFG will eventually get it.

Replacing with undef before erasing would do as well.

Draft of that idea:
D87965
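The two cleanup strategies discussed at the end -- bailing out when a value still has uses versus replacing remaining uses with undef before erasing (the idea drafted in D87965) -- can be sketched on a toy model (hypothetical names; not the actual InstCombine code):

```cpp
#include <vector>

// Toy instruction: a use count plus the guaranteed-to-transfer flag.
struct Inst {
  int Uses = 0;          // other instructions still using this value
  bool Transfers = true; // isGuaranteedToTransferExecutionToSuccessor()
  bool ReplacedWithUndef = false;
};

// Erase instructions backwards from the unreachable terminator (the last
// element). With RauwUndef == false we bail out when a candidate still
// has uses, leaving the dead cycle for a later CFG cleanup; with
// RauwUndef == true we model the D87965 idea: point remaining uses at
// undef first, then erase, so circular uses in dead code cannot trip the
// "Cannot erase instruction that is used!" assertion.
void eraseBeforeUnreachable(std::vector<Inst> &Block, bool RauwUndef) {
  while (Block.size() > 1) {
    Inst &Prev = Block[Block.size() - 2];
    if (!Prev.Transfers)
      return; // execution may halt here; everything before stays
    if (Prev.Uses != 0) {
      if (!RauwUndef)
        return; // bail out; SimplifyCFG will eventually get it
      Prev.ReplacedWithUndef = true;
      Prev.Uses = 0; // all uses now refer to undef instead
    }
    Block.erase(Block.end() - 2);
  }
}
```

On the crashing test case above, the bail-out variant stops at %conv (still used by the phi in the dead cycle), while the RAUW-undef variant removes it as well.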