This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors
ClosedPublic

Authored by mkazantsev on Sep 5 2017, 3:08 AM.

Details

Summary

This patch fixes a miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
from hoisting across such instructions, because a load that is valid after such an instruction is
not guaranteed to be valid before it. For example, a load from under a guard may be
invalid before the guard in the following case:

int array[LEN];
...
guard(0 <= index && index < LEN);
use(array[index]);

Diff Detail

Repository
rL LLVM

Event Timeline

mkazantsev created this revision.Sep 5 2017, 3:08 AM
mkazantsev updated this revision to Diff 113820.Sep 5 2017, 3:11 AM

Fixed formatting.

dberlin requested changes to this revision.Sep 5 2017, 6:42 AM
  1. This seems wrong for assumes (I don't know enough about guards).

It is not incorrect to hoist through an assume, nor should doing so generate wrong code.
Assumes are essentially metadata, and documented as such.

Can you provide an example where the code generated is incorrect?

(guards I don't know enough about to say anything).

  2. You've done this in an N^2 way when it should be amortized constant time (using OrderedInstructions)
This revision now requires changes to proceed.Sep 5 2017, 6:42 AM
mkazantsev added a comment.EditedSep 5 2017, 8:09 PM

Hi Daniel,

  1. The motivating example is the following situation:

int array[LEN];

void accessArray(int index) {
guard(0 <= index && index < LEN);
do_something(array[index]);
}

Here we guard on the condition that our array access is within bounds; otherwise we reject the compiled code and deoptimize. Hoisting the array access above the guard means that we try to access the array without knowing whether the index is within bounds or not. I will add a corresponding test.

  2. It seems that you're right about assumes, I will remove them.
  3. Ok, I will try to do it in a faster manner.
mkazantsev updated this revision to Diff 113971.Sep 6 2017, 2:25 AM
mkazantsev edited edge metadata.
mkazantsev retitled this revision from [GVN] Prevent LoadPRE from hoisting through guards and assumes to [GVN] Prevent LoadPRE from hoisting through guards.
mkazantsev edited the summary of this revision. (Show Details)
  1. Removed assumes-related stuff.
  2. Added a motivating test and comment.
  3. Daniel, I didn't get what you mean by using ordered instructions to do this. OrderedInstructions can only answer whether a particular guard dominates our load. It will still take linear time to gather the guards, and for those we already know that they dominate the load, so we basically don't need to ask OrderedInstructions anything. I used any_of to speed things up, but I don't see any way to reduce the complexity.

Could you please explain what you mean by using OrderedInstructions to gain amortized constant complexity?

dberlin added a comment.EditedSep 6 2017, 1:36 PM

The problem:
Right now, for each load, you go through the LoadBB and see if you hit a guard.
This information does not change for each load. At most it changes as GVN removes things (and even then, no).
The number of guards is smaller than the number of blocks.

Thus, you could use OrderedInstructions to make this O(guards) * O(number of loads) instead of O(number of instructions in each load block) * O(number of loads)
Trivially.
But you can do even better, and this is the constant time i refer to:
To make it constant time to check, and O(guards) to compute:

For each guard in the program:
   // Make FirstGuard point to the first guard in each block.
   auto Iter = FirstGuard.find(guard->getParent());
   if (Iter == FirstGuard.end())
     FirstGuard.insert({guard->getParent(), guard});
   else if (OrderedInstructions->dominates(Iter->second, guard))
     Iter->second = guard;

In processLoad, replace your entire loop with:

if the bb doesn't exist in FirstGuard
    you are fine
if the bb does, check whether OrderedInstructions->dominates(FirstGuard[BB], load)

Basically, this is: Build a table of where the first guard in the block is. Since you are trying to see if there is a guard above a given instruction in the block, the only thing that matters is where the first guard in the block is, because eventually you will hit that guard if you go up. OrderedInstructions will tell you whether that first guard is before or after the load in the block.

You already walk all the instructions in the main GVN loop in RPO order. You can fill in FirstGuard as you go, since the walk will see all the guards before anything one could try to lift above them, except in the case of loops.

mkazantsev planned changes to this revision.Sep 6 2017, 10:45 PM

Thanks for the detailed clarification! I will do that.

Of course. Just in case you try to use the code, I'm pretty sure if (OrderedInstructions->dominates(Iter->second, guard))
should be if (OrderedInstructions->dominates(guard, Iter->second))

(At least one of the calls is reversed)
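
For reference, a minimal C++ sketch of the scheme with that argument order fixed; the names (FirstGuard, recordGuard, guardAbove) are illustrative, not the committed code, and the header path for OrderedInstructions is assumed from its location at the time:

#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Transforms/Utils/OrderedInstructions.h"
using namespace llvm;

// Map each block to the earliest guard it contains; built in O(guards).
static void recordGuard(
    const Instruction *Guard,
    DenseMap<const BasicBlock *, const Instruction *> &FirstGuard,
    OrderedInstructions &OI) {
  auto Iter = FirstGuard.find(Guard->getParent());
  if (Iter == FirstGuard.end())
    FirstGuard.insert({Guard->getParent(), Guard});
  else if (OI.dominates(Guard, Iter->second)) // Guard is earlier in the block.
    Iter->second = Guard;
}

// Constant-time query in processLoad: is there a guard above Load in its block?
static bool guardAbove(
    const LoadInst *Load,
    const DenseMap<const BasicBlock *, const Instruction *> &FirstGuard,
    OrderedInstructions &OI) {
  auto Iter = FirstGuard.find(Load->getParent());
  return Iter != FirstGuard.end() && OI.dominates(Iter->second, Load);
}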

mkazantsev updated this revision to Diff 114311.Sep 8 2017, 1:21 AM

Actually we don't even need OrderedInstructions to do that. It's enough to maintain a set of blocks where we have ever seen a guard. We iterate through blocks in reverse postorder, which guarantees that if a block has exactly one predecessor, that predecessor is visited before the block. We also traverse the instructions of a block in program order. Using this, we just mark a block as having a guard when we meet one, and we know for sure that:

  1. if the current block is marked, there is a guard above the current instruction, and we cannot hoist;
  2. if the only predecessor (recursively) is marked, it also blocks us from hoisting.

I've updated the patch using this solution. Please let me know if it makes sense to you.
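
A minimal sketch of that bookkeeping, with illustrative names, not the patch itself (as dberlin notes in the reply below, the set must also be cleared between GVN iterations):

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Blocks in which a guard has been seen so far during the current RPO walk.
static SmallPtrSet<const BasicBlock *, 8> BlocksWithGuards;

// Called while visiting the instructions of a block in program order.
static void noteGuard(const Instruction *Guard) {
  BlocksWithGuards.insert(Guard->getParent());
}

// True if a guard is known to sit above the instruction currently being
// visited in BB. Relies on RPO visiting a sole predecessor before its
// successor; a cycle of sole predecessors would be unreachable from the
// entry block, so the walk below terminates for reachable blocks.
static bool guardBlocksHoist(const BasicBlock *BB) {
  if (BlocksWithGuards.count(BB)) // A guard earlier in this very block.
    return true;
  for (const BasicBlock *Pred = BB->getSinglePredecessor(); Pred;
       Pred = Pred->getSinglePredecessor())
    if (BlocksWithGuards.count(Pred))
      return true;
  return false;
}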

Sure, this will work, but it requires clearing the guard set between iterations :)
Given how much faster either solution is than the original option, seems fine to me. I doubt it will ever be a real performance issue.

The expected size of the set of blocks with guards is actually small, so clearing it should be cheap. Could you please approve the patch for merge?

mkazantsev planned changes to this revision.Sep 8 2017, 1:54 AM

Need to clear the guard set between iterations, not once at the end.

mkazantsev updated this revision to Diff 114316.Sep 8 2017, 1:57 AM

Cleaning between iterations now.

efriedma added inline comments.
lib/Transforms/Scalar/GVN.cpp
2042 ↗(On Diff #114316)

Why does this specifically apply to guards, as opposed to anything where isGuaranteedToTransferExecutionToSuccessor() is false?

dberlin added inline comments.Sep 8 2017, 12:55 PM
lib/Transforms/Scalar/GVN.cpp
2042 ↗(On Diff #114316)

Yeah, it looks like GVN presumes that in straight-line code with no infinite loops, all instructions have the same "guaranteed to execute" properties.

As you correctly point out, since LLVM has implicit control flow, this is not always true.
I suspect GVN's PRE is quite broken here in that respect in theory.

However, most of the effect of isGuaranteed* is on memory instructions, and those almost certainly get corrected by memdep (i.e., it ends up respecting the same things that isGuaranteed* is respecting).
Going through the rest:

if (const auto *CRI = dyn_cast<CleanupReturnInst>(I))
  return !CRI->unwindsToCaller();
if (const auto *CatchSwitch = dyn_cast<CatchSwitchInst>(I))
  return !CatchSwitch->unwindsToCaller();
if (isa<ResumeInst>(I))
  return false;
if (isa<ReturnInst>(I))
  return false;
if (isa<UnreachableInst>(I))
  return false;

None of these will have successors, so there shouldn't be a predecessor we hit to check anyway.
You can hoist above them, just not sink below them.

So those should be fine.

The callsite checking, except for throwing, is probably also handled by memdep's insertion of memory dependencies.

So I suspect you may be able to rig up a testcase where memdep and isGuaranteed* end up disagreeing in a way that causes this to break. But I also suspect it's hard.

Thoughts?

(and using isGuaranteed* would handle guards, since guard is marked as throwing)

So I suspect you may be able to rig up a testcase where memdep and isGuaranteed* end up disagreeing in a way that causes this to break.

See https://reviews.llvm.org/D21041. :)

mkazantsev added inline comments.Sep 11 2017, 12:10 AM
lib/Transforms/Scalar/GVN.cpp
2042 ↗(On Diff #114316)

Hi Eli,

I tried to use isGuaranteedToTransferExecutionToSuccessor, and it works for my example, but another test starts failing. The test is test7 from Transforms/GVN/PRE/volatile.ll, which is:

; Does cross block PRE work with volatiles?
define i32 @test7(i1 %c, i32* noalias nocapture %p, i32* noalias nocapture %q) {
; CHECK-LABEL: test7
; CHECK-LABEL: entry.header_crit_edge:
; CHECK:       %y.pre = load i32, i32* %p
; CHECK-LABEL: skip:
; CHECK:       %y1 = load i32, i32* %p
; CHECK-LABEL: header:
; CHECK:      %y = phi i32
; CHECK-NEXT: %x = load volatile i32, i32* %q
; CHECK-NEXT: %add = sub i32 %y, %x
entry:
  br i1 %c, label %header, label %skip
skip:
  %y1 = load i32, i32* %p
  call void @use(i32 %y1)
  br label %header
header:
  %x = load volatile i32, i32* %q
  %y = load i32, i32* %p
  %add = sub i32 %y, %x
  %cnd = icmp eq i32 %add, 0
  br i1 %cnd, label %exit, label %header
exit:
  ret i32 %add
}

It now fails because this function returns false for volatile loads and stores, since they are allowed to trap. This logic was added in https://reviews.llvm.org/D21167, and Sanjoy's comment asking why we do that remained unanswered. I don't clearly understand why you added it; could you please clarify why it was done, and is it OK if this test now fails?

Thanks,
Max

We consider volatile loads and stores as potentially trapping as a courtesy to our users. Strictly speaking, it's undefined behavior in C, but people sometimes intentionally trigger SIGSEGV using volatile loads, so we don't want to cause an unrelated crash before that happens. Breaking that testcase should be fine.

Eli, I presume an updated version with isGuaranteed... should fix your testcases from D21041, right?
If so, Max, can you add them to this patch?

I think it will fix those testcases... at least, it will fix the load testcase (I don't think this patch addresses the divide-by-zero problem).

reames edited edge metadata.Sep 11 2017, 4:17 PM

We consider volatile loads and stores as potentially trapping as a courtesy to our users. Strictly speaking, it's undefined behavior in C, but people sometimes intentionally trigger SIGSEGV using volatile loads, so we don't want to cause an unrelated crash before that happens. Breaking that testcase should be fine.

I disagree. This would be breaking optimizations *across* a volatile load, not breaking optimizations *of* a volatile load. I think the isGuaranteedTo... scheme is probably the right one, but I think we got our semantics for that function wrong for volatile loads.

Given we've not documented volatile loads as potentially trapping, I don't think we should *partially* implement that semantic. I'd actually like to have a faulting load concept in the IR, but hacking that in for volatile seems misguided.

For the moment, I'll suggest a compromise to unblock this review from the broader discussion. Can we use the isGuaranteedTo... scheme, but additionally whitelist skipping over volatile loads? That lets us make progress on this correctness issue while taking the volatile semantics discussion elsewhere.
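
A sketch of what that compromise could look like, assuming the then-current behavior of isGuaranteedToTransferExecutionToSuccessor (returning false for volatile accesses); the function name here is illustrative, not the committed code:

#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// An instruction blocks hoisting only if it may fail to transfer execution
// to its successor and is not a volatile load/store, which
// isGuaranteedToTransferExecutionToSuccessor conservatively rejects.
static bool blocksHoisting(const Instruction *I) {
  if (isGuaranteedToTransferExecutionToSuccessor(I))
    return false;
  // Whitelist: skip over volatile accesses while their semantics are
  // being discussed elsewhere.
  if (const auto *LI = dyn_cast<LoadInst>(I))
    return !LI->isVolatile();
  if (const auto *SI = dyn_cast<StoreInst>(I))
    return !SI->isVolatile();
  return true;
}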

lib/Transforms/Scalar/GVN.cpp
2049 ↗(On Diff #114316)

I'm not sure building this set in RPO is sufficient in the face of cycles. I believe the current PRE code would bail out of said cycles, but at minimum, that assumption needs to be carefully documented and asserted for.

i.e. how do we know that BlocksWithGuards has been populated for each block we traverse backwards through?

In particular, if we extend this to isGuaranteedTo... and the scalar PRE case from https://reviews.llvm.org/D21041, it's not obvious to me this code would be correct.

The volatile load thing never really got discussed in depth, as far as I remember... I wanted to be conservative because volatile memory access can be weird, but I guess my interpretation in isGuaranteedToTransferExecutionToSuccessor() is a little suspect given the way alias analysis treats volatile operations.

I think that it would be good to revisit this. As you say, we'd also need to model this in AA (by essentially making all volatile access universally aliasing because the trap handlers could be arbitrary functions). Moreover, it would essentially mean modeling volatile loads as both reading and writing to memory.

What I consider to be "regular" uses of volatile, handling memory-mapped I/O and communicating with signal/interrupt handlers, I don't think require this "loads also write" treatment. These use cases might want universally-aliasing stores (because the stores might trigger some device or signal/interrupt handler to do something arbitrary), but can allow more-precise AA for volatile loads.

I don't know exactly what use cases we want to support with volatile accesses. If it's essentially just memory-mapped I/O (for device drivers and the like), then it might make sense to do that. If we have other use cases, then it might not make sense (and, as Philip says, we might want some other mechanism).

If we do want to make volatile accesses universally aliasing and writing, then I think we should remove them from the IR all together and add some intrinsics instead for this use case.

dberlin added inline comments.Sep 11 2017, 7:53 PM
lib/Transforms/Scalar/GVN.cpp
2049 ↗(On Diff #114316)

It's definitely not sufficient in the case of cycles.

The only reason it's thought to be sufficient here is that you are guaranteed that any straight line predecessors have already been processed.

Since it's only walking back through straight line predecessors, ...

So yes, if it tried to use this info in random predecessors, it would not be correct the way it is being computed here.

mkazantsev added inline comments.Sep 11 2017, 10:02 PM
lib/Transforms/Scalar/GVN.cpp
2049 ↗(On Diff #114316)

Yes, we can only rely on this set if we have visited all the predecessors before. This is true for the case where we use it (because there is only one predecessor) and not true in general. I'll add a comment to make that clear.

mkazantsev edited the summary of this revision. (Show Details)

Reused isGuaranteedToTransferExecutionToSuccessor and added some clarifying comments. In this version of the patch, we explicitly allow hoisting across volatile loads/stores. Given the discussion here, we might want to modify isGuaranteedToTransferExecutionToSuccessor so that it returns true for volatile loads/stores; once the ongoing discussion on volatiles comes to some conclusion, this part can be reworked.

Could you add the testcases from D21041?

mkazantsev retitled this revision from [GVN] Prevent LoadPRE from hoisting through guards to [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.Sep 12 2017, 9:50 PM
mkazantsev edited the summary of this revision. (Show Details)

Added some tests, changed commit message.

dberlin added inline comments.Sep 12 2017, 10:08 PM
lib/Transforms/Scalar/GVN.cpp
2070 ↗(On Diff #114967)

It's worth noting:

The RPO iteration order only guarantees that checking straight-line predecessors is safe. [1]
It would not be safe to check other predecessors unless the program is loop-free
(i.e., such a thing would have to be computed separately from this loop).

[1] I can prove this if you like - a visit-order DFS walk of the dominator tree is a valid RPO order [2]. Any straight-line single predecessor must be your idom, because there is no path around it to your node. Because the dominator tree is acyclic and that walk is a valid RPO order, any RPO order must guarantee the idom is visited before your node.
The only way something else could be visited first is if there were another path to your node (which would in turn prove that your idom is not really your idom).

[2] LLVM does not happen to keep it in that sort order, but you can see that, for example, NewGVN sorts it back into the "right" order.

mkazantsev marked an inline comment as done.

Factored the filling of the implicit control flow map out of the main iteration loop.

dberlin added inline comments.Sep 13 2017, 7:39 AM
lib/Transforms/Scalar/GVN.cpp
2351 ↗(On Diff #114995)

This is expensive to compute, you should pass it in.

2365 ↗(On Diff #114995)

for (auto &I : BB)

(the &*BI will just be &I below)

2368 ↗(On Diff #114995)

This map could be const BasicBlock * -> const Instruction *, FWIW.

mkazantsev marked 3 inline comments as done.

Nits.

efriedma added inline comments.Sep 13 2017, 11:25 AM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

GVN got smarter, so this isn't actually testing what it's supposed to test anymore. Maybe try replacing "icmp eq i32 %p, %q" with "icmp eq i32 %p, %r", where "%r" is a parameter. And I guess pass %a to may_exit(), so it won't get moved if we come up with some new form of sinking in the future.

mkazantsev added inline comments.Sep 13 2017, 8:03 PM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

What is the difference between icmp eq i32 %p, %q where %q is a parameter and icmp eq i32 %p, %r where %r is a parameter? If you mean that we should branch on an unknown condition, then we could just add a parameter i5 %cond.

Anyway, I don't think that extending the test set should block us from merging this functional fix, and I don't see the conceptual difference between this test and what you describe: the transformation across may_exit still doesn't happen. Do you mind if we consider adding the new tests as an NFC later?

mkazantsev added inline comments.Sep 13 2017, 8:04 PM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

i1 %cond, of course :)

efriedma added inline comments.Sep 14 2017, 12:30 PM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

i1 %cond probably does the same thing, yes. The important part is that we don't simplify sdiv i32 %p, %q to 1.

Fixing the underlying issue here doesn't necessarily block committing this patch, I guess... but it's probably something you want to address at some point, and it's very closely related to this patch.

mkazantsev added inline comments.Sep 15 2017, 4:08 AM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

I want to clearly distinguish between fixing the miscompile with guards and a missed opportunity to turn this div into 1. Do we have a bug filed to address this missed opportunity? You could file one if you think this case is important. But I don't think it is part of the problem being addressed by this patch.

efriedma added inline comments.Sep 15 2017, 11:06 AM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

Sorry, probably should have written this out in the first place.

GVN currently performs an incorrect optimization for the following testcase:

define i32 @test2(i32 %p, i32 %q, i1 %r) {
block1:
  br i1 %r, label %block2, label %block3

block2:
  %a = sdiv i32 %p, %q
  br label %block4

block3:
  br label %block4

block4:
  %phi = phi i32 [ 0, %block3 ], [ %a, %block2 ]
  call void @may_exit(i32 %phi) nounwind
  %b = sdiv i32 %p, %q
  ret i32 %b
}

declare void @may_exit(i32) nounwind
mkazantsev marked an inline comment as done.Sep 15 2017, 8:17 PM
mkazantsev added inline comments.
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

Ah, now I get what you mean. :) Thanks for pointing that out! Fixed.

mkazantsev marked an inline comment as done.

Fixed one more case with PRE, updated tests accordingly.

dberlin added inline comments.Sep 15 2017, 9:07 PM
lib/Transforms/Scalar/GVN.cpp
2170 ↗(On Diff #115529)

Please use OrderedInstructions, which caches this, and invalidate it as needed.

mkazantsev updated this revision to Diff 117313.Oct 2 2017, 2:30 AM
mkazantsev marked an inline comment as done.

Reworked with OrderedInstructions.

reames requested changes to this revision.Oct 2 2017, 8:21 PM

The loadPRE part of this seems very close to being ready to submit.

lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

This is overly conservative for any instruction which cannot ever fault. For instance, PRE of a load over a throwing call is perfectly sound. Note that preserving the nsw/nuw flags is questionable in the same case, but that's a hard discussion. :)

I'd prefer to see the scalarPRE part separated into its own patch. I know I'm saying the exact opposite of Eli here, but I think this needs more discussion, and trying to combine them will delay the loadPRE patch unnecessarily.

2406 ↗(On Diff #117313)

This is scarily permissive. It (for instance) allows ordered loads and stores, which (I think) is valid. It'd be much safer to add a special matcher for volatile-only loads/stores here.

This revision now requires changes to proceed.Oct 2 2017, 8:21 PM
mkazantsev added inline comments.Oct 2 2017, 9:44 PM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

I disagree that PRE over a throwing call is invalid. Imagine the following situation:

int arr[LEN];

void foo(int index) {
  ...
  rangeCheck(index, LEN);
  int x = arr[index];
  print(x);
}

void rangeCheck(int index, int len) {
  if (index < 0 || index >= len)
    throw exception;
}

We cannot PRE across the throwing call here.

2399 ↗(On Diff #117313)

typo: volatile

2406 ↗(On Diff #117313)

Will do.

mkazantsev added inline comments.Oct 2 2017, 9:46 PM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

I disagree that prohibiting PRE over a throwing call is invalid

:)

mkazantsev planned changes to this revision.Oct 2 2017, 10:00 PM

Need rework: address comments & fix a failure found in internal testing.

reames added inline comments.Oct 3 2017, 9:57 AM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

Sorry, I should have said "PRE of a *known safe to speculate* load over a throwing call is perfectly sound". My sloppy wording here was really confusing, sorry!

efriedma added inline comments.Oct 3 2017, 11:24 AM
lib/Transforms/Scalar/GVN.cpp
2406 ↗(On Diff #117313)

Hoisting a load across an Acquire operation is often not allowed, yes... but alias analysis will catch that; we don't need to handle it here.

efriedma added inline comments.Oct 3 2017, 11:53 AM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

I'd prefer to see the scalarPRE part separated into its own patch. I know I'm saying the exact opposite of Eli here, but I think this needs more discussion, and trying to combine them will delay the loadPRE patch unnecessarily.

I mostly pointed out the issue with scalars here because it makes sense to write the patches together; they're very similar issues, and the solution requires the same infrastructure. If you want to separate it out for review/bisectability, that's fine.

And yes, we should probably be checking for isSafeToSpeculativelyExecute(), for both scalar and load PRE.
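
A sketch of how the two checks could compose, using illustrative names (the map type mirrors the one this patch introduces; this is not the committed code, and the OrderedInstructions header path is assumed from its location at the time):

#include "llvm/ADT/DenseMap.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Transforms/Utils/OrderedInstructions.h"
using namespace llvm;

// PRE may hoist CurInst only if executing it early cannot introduce UB, or
// if no instruction above it in its block may fail to transfer execution
// (guards, throwing calls, ...).
static bool icfBlocksPRE(
    const Instruction *CurInst,
    const DenseMap<const BasicBlock *, const Instruction *> &FirstICF,
    OrderedInstructions &OI) {
  if (isSafeToSpeculativelyExecute(CurInst))
    return false; // Speculating it can never trap or fault.
  const Instruction *ICF = FirstICF.lookup(CurInst->getParent());
  // Blocked iff some may-not-transfer instruction precedes CurInst.
  return ICF && OI.dominates(ICF, CurInst);
}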

mkazantsev marked 3 inline comments as done.Oct 6 2017, 4:22 AM
mkazantsev added inline comments.
lib/Transforms/Scalar/GVN.cpp
2406 ↗(On Diff #117313)

Added an assert that the said load and store are volatile; otherwise, isGuaranteedToTransferExecutionToSuccessor is true for them.

mkazantsev updated this revision to Diff 117978.Oct 6 2017, 4:23 AM
mkazantsev edited edge metadata.
mkazantsev marked an inline comment as done.

ScalarPRE will go in a follow-up patch

reames accepted this revision.Oct 10 2017, 12:04 PM

LGTM

This revision was automatically updated to reflect the committed changes.
mppf added a subscriber: mppf.Oct 12 2017, 7:25 AM

Hi @mkazantsev, this change seems to have caused a regression. Before this change, the following program compiled with opt -gvn minimal64.ll -S, but after it, it causes an assertion failure in the compiler. I've verified that this failure is still present on trunk as of r315579.

GVN.cpp:1073: bool llvm::GVN::PerformLoadPRE(llvm::LoadInst*, llvm::GVN::AvailValInBlkVect&, llvm::GVN::UnavailBlkVect&): Assertion `It->second->getParent() == TmpBB && "Implicit control flow map broken?"' failed.

Here is a carefully minimized test case (from a failure compiling with our frontend):

; ModuleID = 'simpler.ll'
source_filename = "bugpoint-output-5fd3360.bc"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%ArrayImpl = type { i64, i64 addrspace(100)*, [1 x i64], [1 x i64], [1 x i64], i64, i64, double addrspace(100)*, double addrspace(100)*, i8, i64 }

; Function Attrs: readnone
declare %ArrayImpl* @getaddr_ArrayImpl(%ArrayImpl addrspace(100)*) #0

; Function Attrs: readnone
declare i64* @getaddr_i64(i64 addrspace(100)*) #0

define hidden void @wrapon_fn173() {
entry:
  %0 = call %ArrayImpl* @getaddr_ArrayImpl(%ArrayImpl addrspace(100)* undef)
  br label %loop

loop:
  %1 = call %ArrayImpl* @getaddr_ArrayImpl(%ArrayImpl addrspace(100)* undef)
  %2 = load i64 addrspace(100)*, i64 addrspace(100)** null, align 8
  %3 = call i64* @getaddr_i64(i64 addrspace(100)* %2)
  br label %loop
}

attributes #0 = { readnone }

To reproduce, put the above into minimal64.ll and run opt -gvn minimal64.ll -S.

Note that if the "readnone" attribute is removed from the getaddr functions, the test will pass.

I'd greatly appreciate any attention you can give to this matter. (I'd make a bug report but don't have an account on the bug report server yet).

Thanks!

mppf added a comment.Oct 12 2017, 6:03 PM

FWIW I've also reported this as this bug: https://bugs.llvm.org/show_bug.cgi?id=34937

This is a use-after-free.
When the first implicit control flow instruction for a bb gets erased, we now have wrong info.

Previously, when this was filled in on the fly, this would be easy to fix.
But now it's much harder.

I attached a patch for this to PR 34937.
I don't have time ATM to perform testing and shepherding of it through review, however.
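
A sketch of the general shape of such a fix, with illustrative names - this is not the patch attached to the PR: when erasing an instruction, keep the cached "first implicit control flow instruction" entry for its block in sync so the map never holds a dangling pointer.

#include "llvm/ADT/DenseMap.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Transforms/Utils/OrderedInstructions.h"
using namespace llvm;

static void eraseAndUpdate(
    Instruction *I,
    DenseMap<const BasicBlock *, const Instruction *> &FirstICF,
    OrderedInstructions &OI) {
  BasicBlock *BB = I->getParent();
  auto It = FirstICF.find(BB);
  if (It != FirstICF.end() && It->second == I) {
    FirstICF.erase(It);
    // Rescan for the next instruction that may not transfer execution,
    // skipping the one being erased.
    for (const Instruction &J : *BB)
      if (&J != I && !isGuaranteedToTransferExecutionToSuccessor(&J)) {
        FirstICF[BB] = &J;
        break;
      }
  }
  OI.invalidateBlock(BB); // The cached local ordering is stale after erasure.
  I->eraseFromParent();
}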

Thanks guys for looking into the problem! I've submitted a more conservative version of the fix proposed by Daniel for review as https://reviews.llvm.org/D38944 (it fixes the map but does not rework GVN's deletion mechanism, because that leads to further troubles).

mkazantsev edited the summary of this revision. (Show Details)Oct 26 2017, 10:11 PM
mkazantsev planned changes to this revision.Oct 27 2017, 3:07 AM

I am going to abandon https://reviews.llvm.org/D38944 and combine its changes into this one, with all possible NFC changes factored out. It would be more logical to keep them together.

mkazantsev edited the summary of this revision. (Show Details)

Changes from https://reviews.llvm.org/D38944 (fixed version) have been included in this one, since they must go together.

reames requested changes to this revision.Oct 27 2017, 9:51 AM

Ok, we're once more very close. Thank you for working through the issues raised and splitting out the various bits. Now that that's done, one more round of updates and we should be good for a resubmit.

lib/Transforms/Scalar/GVN.cpp
2101 ↗(On Diff #120572)

Are we guaranteed this entry exists in the map? Or are you relying on default construction of the value type? I think the second, in which case it might be worth renaming the variable to something like FirstICFOrNull for clarity.
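
For what the suggestion relies on (an illustrative aside, not the diff itself): DenseMap::lookup returns a default-constructed mapped value for a missing key - nullptr for a pointer - while operator[] would insert one, so lookup is the right call here and a name like FirstICFOrNull documents the possibly-null result.

#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

static const Instruction *firstICFOrNull(
    const DenseMap<const BasicBlock *, const Instruction *> &FirstICF,
    const BasicBlock *BB) {
  // lookup() does not insert; a missing BB yields nullptr here.
  return FirstICF.lookup(BB);
}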

2116 ↗(On Diff #120572)

This check is implied by one further up in the same scope.

2312 ↗(On Diff #120572)

It just occurred to me, are you sure loads - which are the only instruction type in use here - ever have implicit control flow? If not, this piece is dead code and should be replaced with an assertion instead.

In particular:
assert(!MayNotTransferExecutionToSuccessor(CurInst) && "loads don't have implicit control flow");

test/Transforms/GVN/PRE/pre-load.ll
434 ↗(On Diff #120572)

across *potentially throwing* calls

436 ↗(On Diff #120572)

Required - Add an analogous test case where @f is in a block other than the origin block.

Required - Add a pair of analogous test cases where %x is marked as dereferenceable to exercise the speculation logic in each case.

This revision now requires changes to proceed.Oct 27 2017, 9:51 AM
reames added inline comments.Oct 29 2017, 11:28 PM
lib/Transforms/Scalar/GVN.cpp
2312 ↗(On Diff #120572)

Talking to Max, we realized my comment here is wrong. This is the *Scalar* PRE block, not the *Load* PRE block I thought I was looking at.

mkazantsev added inline comments.Oct 29 2017, 11:32 PM
lib/Transforms/Scalar/GVN.cpp
2312 ↗(On Diff #120572)

Yeah, and in general we mark all instructions for deletion (loads or not) and then process them uniformly. That it is not done here is just messy (due to some assertion failures if we do); I think it should be fixed in the future.

mkazantsev edited edge metadata.
mkazantsev marked 4 inline comments as done.

Addressed comments.

reames requested changes to this revision.Oct 30 2017, 10:47 AM
reames added inline comments.
lib/Transforms/Scalar/GVN.cpp
2318 ↗(On Diff #120773)

Caught one last bug. The invalidation needs to be conditional, since the rebuild is. As currently structured, this will leave the entry empty, allowing a miscompile.

Do we have a test case for the scalarPRE invalidation? If so, why didn't it catch this? If not, please add one.

This revision now requires changes to proceed.Oct 30 2017, 10:47 AM
mkazantsev added inline comments.Oct 30 2017, 9:13 PM
lib/Transforms/Scalar/GVN.cpp
2318 ↗(On Diff #120773)

It is not a bug. Here we invalidate OrderedInstructions for this block because we have just erased an instruction; we should do that for all instructions, not only for implicit control flow. The re-fill of the implicit control flow map below is conditional because we only need it for ICF instructions. So from my point of view, it's OK.

reames accepted this revision.Oct 30 2017, 9:14 PM

LGTM

lib/Transforms/Scalar/GVN.cpp
2318 ↗(On Diff #120773)

You are correct. I misread the code (again).

This revision was automatically updated to reflect the committed changes.