[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors
ClosedPublic

Authored by mkazantsev on Sep 5 2017, 3:08 AM.

Details

Summary

This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:

int array[LEN];
...
guard(0 <= index && index < LEN);
use(array[index]);

Diff Detail

Repository
rL LLVM
There are a very large number of changes, so older changes are hidden. Show Older Changes
mkazantsev retitled this revision from [GVN] Prevent LoadPRE from hoisting through guards to [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors.
mkazantsev edited the summary of this revision. (Show Details)

Added some tests, changed commit message.

dberlin added inline comments.Sep 12 2017, 10:08 PM
lib/Transforms/Scalar/GVN.cpp
2070 ↗(On Diff #114967)

It's worth noting:

The RPO iteration order only guarantees that checking straight line predecessors is safe[1]
It would not be safe to check other predecessors unless the program is loop free
(IE such a thing would have to be computed separately from this loop).

[1] I can prove this if you like - A visit order dfs walk of the dominator tree is a valid RPO order [2]. Any straight line single predecessor must be your idom, because there is no path around it to your node. Because the dominator tree is acylic, and a valid RPO order, RPO order must guarantee you visit this node before you.
The only way something else can be visited first is if there was another path to your node (which would in turn prove that that your idom is not really your idom)
,
[2] LLVM does not happen to keep it in the sort order, but you can see that, for example, NewGVN sorts it back into the "right" order.

mkazantsev marked an inline comment as done.

Factored the filling of implicit control flow map out of main iteration loop.

dberlin added inline comments.Sep 13 2017, 7:39 AM
lib/Transforms/Scalar/GVN.cpp
2351 ↗(On Diff #114995)

This is expensive to compute, you should pass it in.

2365 ↗(On Diff #114995)

for (auto &I : BB)

(the &*BI will just be &I below)

2368 ↗(On Diff #114995)

This map could be const *BasicBlock->const *Instruction, FWIW.

mkazantsev marked 3 inline comments as done.

Nits.

efriedma added inline comments.Sep 13 2017, 11:25 AM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

GVN got smarter, so this isn't actually testing what it's supposed to test anymore. Maybe try replacing "icmp eq i32 %p, %q" with "icmp eq i32 %p, %r", where "%r" is a parameter. And I guess pass %a to may_exit(), so it won't get moved if we come up with some new form of sinking in the future.

mkazantsev added inline comments.Sep 13 2017, 8:03 PM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

What is the difference between icmp eq i32 %p, %q where %q is a parameter and icmp eq i32 %p, %r where %r is a parameter? If you mean that we should branch by unknown condition then we could just add parameter i5 %cond.

Anyways, I don't think that extension of the test set can somehow block us from merging this functional fix. I also don't see what is the conceptual difference between this test and what you describe. The transformation across may_exit still doesn't happen. Do you mind if we consider adding new tests as NFC later?

mkazantsev added inline comments.Sep 13 2017, 8:04 PM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

%i1 cond off course :)

efriedma added inline comments.Sep 14 2017, 12:30 PM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

i1 %cond probably does the same thing, yes. The important part is that we don't simplify sdiv i32 %p, %q to 1.

Fixing the underlying issue here doesn't necessarily block committing this patch, I guess... but it's probably something you want to address at some point, and it's very closely related to this patch.

mkazantsev added inline comments.Sep 15 2017, 4:08 AM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

I want to clearly distinguish fixing miscompile with guards and a missing opportunity for turning this div into 1. Do we have a bug to address this missing opportunity? You could file it if you think this case is important. But I don't think it is a part of the problem being addressed by this patch.

efriedma added inline comments.Sep 15 2017, 11:06 AM
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

Sorry, probably should have written this out in the first place.

GVN currently performs an incorrect optimization for the following testcase:

define i32 @test2(i32 %p, i32 %q, i1 %r) {
block1:
  br i1 %r, label %block2, label %block3

block2:
 %a = sdiv i32 %p, %q
 br label %block4

block3:
  br label %block4

block4:
  %phi = phi i32 [ 0, %block3 ], [ %a, %block2 ]
  call void @may_exit(i32 %phi) nounwind
  %b = sdiv i32 %p, %q
  ret i32 %b

}
declare void @may_exit(i32) nounwind
mkazantsev marked an inline comment as done.Sep 15 2017, 8:17 PM
mkazantsev added inline comments.
test/Transforms/GVN/PRE/local-pre.ll
48 ↗(On Diff #115055)

Ah, now I got what you mean. :) Thanks for pointing out! Fixed.

mkazantsev marked an inline comment as done.

Fixed one more case with PRE, updated tests accordingly.

dberlin added inline comments.Sep 15 2017, 9:07 PM
lib/Transforms/Scalar/GVN.cpp
2170 ↗(On Diff #115529)

Please use ordered instructions, which caches this, and invalidate it as needed.

mkazantsev updated this revision to Diff 117313.Oct 2 2017, 2:30 AM
mkazantsev marked an inline comment as done.

Reworked with OrderedInstructions.

reames requested changes to this revision.Oct 2 2017, 8:21 PM
  • List Item

The loadPRE part of this seems very close to being ready to submit.

lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

This is overly conservative for any instruction which can not ever fault. For instance, PRE a load over a throwing call is perfectly sound. Note that preserving the nsw/nuw flags is questionable in the same case, but that's a hard discussion. :)

I'd prefer to see the scalarPRE part separate into it's own patch. I know I'm saying the exact opposite of Eli here, but I think this needs more discussion and trying to combine them will delay the loadPRE patch unnecessarily.

2406 ↗(On Diff #117313)

This is scarily permissive. It (for instance) allows ordered loads and stores which (I think) is valid. It'd be much safer to add a special matcher for volatile-only loads/stores here.

This revision now requires changes to proceed.Oct 2 2017, 8:21 PM
mkazantsev added inline comments.Oct 2 2017, 9:44 PM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

I disagree that PRE over a throwing call is invalid. Imagine the following situation:

int arr[LEN];

void foo(index) {
  ...
  call rangeCheck(index, LEN);
  %x = load %arr, %index
  print(%x);
}

void rangeCheck(int index, int len) {
  if (index < 0 || index >= len)
    throw exception;
}

We cannot PRE across the throwing call here.

2399 ↗(On Diff #117313)

typo: volatile

2406 ↗(On Diff #117313)

Will do.

mkazantsev added inline comments.Oct 2 2017, 9:46 PM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

I disagree that prohibiting PRE over a throwing call is invalid

:)

mkazantsev planned changes to this revision.Oct 2 2017, 10:00 PM

Need rework: address comments & fix failure found on internal test.

reames added inline comments.Oct 3 2017, 9:57 AM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

Sorry, I should have said "PRE of a *known safe to speculate* load over a throwing call is perfectly sound". My sloppy wording here was really confusing, sorry!

efriedma added inline comments.Oct 3 2017, 11:24 AM
lib/Transforms/Scalar/GVN.cpp
2406 ↗(On Diff #117313)

Hoisting a load across an Acquire operation is often not allowed, yes... but alias analysis will catch that; we don't need to handle it here.

efriedma added inline comments.Oct 3 2017, 11:53 AM
lib/Transforms/Scalar/GVN.cpp
2189 ↗(On Diff #117313)

I'd prefer to see the scalarPRE part separate into it's own patch. I know I'm saying the exact opposite of Eli here, but I think this needs more discussion and trying to combine them will delay the loadPRE patch unnecessarily.

I mostly pointed out the issue with scalars here because it makes sense to write the patches together; they're very similar issues, and the solution requires the same infrastructure. If you want to separate it out for review/bisectability, that's fine.

And yes, we should probably be checking for isSafeToSpeculativelyExecute(), for both scalar and load PRE.

mkazantsev marked 3 inline comments as done.Oct 6 2017, 4:22 AM
mkazantsev added inline comments.
lib/Transforms/Scalar/GVN.cpp
2406 ↗(On Diff #117313)

Added assert that the said load and store are volatile. Otherwise isGuaranteedToTransferExecutionToSuccessor is true for them.

mkazantsev updated this revision to Diff 117978.Oct 6 2017, 4:23 AM
mkazantsev marked an inline comment as done.

ScalarPRE will go in follow-up patch

reames accepted this revision.Oct 10 2017, 12:04 PM

LGTM

This revision was automatically updated to reflect the committed changes.
mppf added a subscriber: mppf.Oct 12 2017, 7:25 AM

Hi @mkazantsev , this change seems to have caused a regression. Before this change, the following program compiled with opt -gvn minimal64.ll -S but after, it causes an assertion failure in the compiler. I've verified that this failure is still present on trunk as of r315579.

GVN.cpp:1073: bool llvm::GVN::PerformLoadPRE(llvm::LoadInst*, llvm::GVN::AvailValInBlkVect&, llvm::GVN::UnavailBlkVect&): Assertion `It->second->getParent() == TmpBB && "Implicit control flow map broken?"' failed.

Here is a carefully minimized test case (from a failure compiling with our frontend):

; ModuleID = 'simpler.ll'
source_filename = "bugpoint-output-5fd3360.bc"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%ArrayImpl = type { i64, i64 addrspace(100)*, [1 x i64], [1 x i64], [1 x i64], i64, i64, double addrspace(100)*, double addrspace(100)*, i8, i64 }

; Function Attrs: readnone
declare %ArrayImpl* @getaddr_ArrayImpl(%ArrayImpl addrspace(100)*) #0

; Function Attrs: readnone
declare i64* @getaddr_i64(i64 addrspace(100)*) #0

define hidden void @wrapon_fn173() {
entry:
  %0 = call %ArrayImpl* @getaddr_ArrayImpl(%ArrayImpl addrspace(100)* undef)
  br label %loop

loop:
  %1 = call %ArrayImpl* @getaddr_ArrayImpl(%ArrayImpl addrspace(100)* undef)
  %2 = load i64 addrspace(100)*, i64 addrspace(100)** null, align 8
  %3 = call i64* @getaddr_i64(i64 addrspace(100)* %2)
  br label %loop
}

attributes #0 = { readnone }

To reproduce, put the above into minimal64.ll and run opt -gvn minimal64.ll -S.

Note that if the "readnone" attribute is removed from the getaddr functions, the test will pass.

I'd greatly appreciate any attention you can give to this matter. (I'd make a bug report but don't have an account on the bug report server yet).

Thanks!

mppf added a comment.Oct 12 2017, 6:03 PM

FWIW I've also reported this as this bug: https://bugs.llvm.org/show_bug.cgi?id=34937

This is use after free.
When the first implicit control flow instruction for a bb gets erased, we now have wrong info.

Previously, when this was filled in on the fly, this would be easy to fix.
But now it's much harder.

I attached a patch for this to PR 34937.
I don't have time ATM to perform testing and shepherding of it through review, however.

Thanks guys for looking into the problem! I've submitted a more conservative version of fix proposed by Daniel for review as https://reviews.llvm.org/D38944 (it fixes the map but does not rework GVN's deletion mechanism because it leads to further troubles).

mkazantsev edited the summary of this revision. (Show Details)Thu, Oct 26, 10:11 PM
mkazantsev planned changes to this revision.Fri, Oct 27, 3:07 AM

I am going to abandon https://reviews.llvm.org/D38944 and combine its changes into here with all possible NFC factored out. It would be more logical to keep them together.

mkazantsev edited the summary of this revision. (Show Details)

Changes from https://reviews.llvm.org/D38944 (fixed version) have been included into this one since they must go together.

reames requested changes to this revision.Fri, Oct 27, 9:51 AM

Ok, once more very close. Thank you for working through the issues raised and splitting out the various bits. Now that's done, one more round of updates and we should be good for a resubmit.

lib/Transforms/Scalar/GVN.cpp
2101 ↗(On Diff #120572)

Are we guaranteed this entry exists in the map? Or are you relying on default construction of the value type? I think the second, in which case it might be worth renaming the variable to something like FirstICFOrNull for clarity.

2116 ↗(On Diff #120572)

This check is implied by one further up in the same scope.

2312 ↗(On Diff #120572)

It just occurred to me, are you sure loads - which are the only instruction type in use here - every have implicit control flow? If not, this piece is dead code and should be replaced with an assertion instead.

In particular:
assert(!MayNotTransferExecutionToSuccessor(CurInst), "loads don't have implicit control flow");

test/Transforms/GVN/PRE/pre-load.ll
434 ↗(On Diff #120572)

across *potentially throwing* calls

436 ↗(On Diff #120572)

Required - Add an analogous test case where @f is in a block other than the origin block.

Required - Add a pair of analogous test cases where %x is marked as dereferenceable to exercise the speculation logic in each case.

This revision now requires changes to proceed.Fri, Oct 27, 9:51 AM
reames added inline comments.Sun, Oct 29, 11:28 PM
lib/Transforms/Scalar/GVN.cpp
2312 ↗(On Diff #120572)

Talking to Max, we realized my comment here is wrong. This is the *Scalar* PRE block, not the *Load* PRE block I thought I was looking at.

mkazantsev added inline comments.Sun, Oct 29, 11:32 PM
lib/Transforms/Scalar/GVN.cpp
2312 ↗(On Diff #120572)

Yeah, and in general we mark all instructions for deletion (no matter loads or not) and then process them uniformly. That it is not done here is just a mess (due to some assertion failures if we do); I think it is to be fixed in future.

mkazantsev marked 4 inline comments as done.

Addressed comments.

reames requested changes to this revision.Mon, Oct 30, 10:47 AM
reames added inline comments.
lib/Transforms/Scalar/GVN.cpp
2318 ↗(On Diff #120773)

Caught one last bug. The invalidation needs to be conditional since the rebuild is. As currently structured, this will leave the entry empty allowing a miscompile.

Do we have a test case for the scalarPRE invalidation? If so, why didn't it catch this? If not, please add one.

This revision now requires changes to proceed.Mon, Oct 30, 10:47 AM
mkazantsev added inline comments.Mon, Oct 30, 9:13 PM
lib/Transforms/Scalar/GVN.cpp
2318 ↗(On Diff #120773)

It is not a bug. Here we invalidate OrderedInstructions for this block because we have just erased an instruction. We should do it for all instructions, not only for implicit control flow. The re-fill of the implicit control flow map below is conditional because we only need it for ICF instructions. So at my point, it's OK.

reames accepted this revision.Mon, Oct 30, 9:14 PM

LGTM

lib/Transforms/Scalar/GVN.cpp
2318 ↗(On Diff #120773)

You are correct. I misread the code (again).

This revision was automatically updated to reflect the committed changes.