This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Improve PRE on load instructions
ClosedPublic

Authored by Carrot on Jan 13 2023, 11:12 AM.

Details

Summary

This patch implements the enhancement proposed by https://github.com/llvm/llvm-project/issues/59312.

Suppose we have following code

   v0 = load %addr
   br %LoadBB

LoadBB:
   v1 = load %addr
   ...

PredBB:
   ...
   br %cond, label %LoadBB, label %SuccBB

SuccBB:
   v2 = load %addr
   ...

Instruction v1 in LoadBB is partially redundant, edge (PredBB, LoadBB) is a critical edge. SuccBB is another successor of PredBB, it contains another load v2 which is identical to v1. Current GVN splits the critical edge (PredBB, LoadBB) and inserts a new load in it. A better method is move the load of v2 into PredBB, then v1 can be changed to a PHI instruction.

If there are two or more similar predecessors, like the test case in the bug entry, current GVN simply gives up because otherwise it needs to split multiple critical edges. But we can move all loads in successor blocks into predecessors.

This is the second try of D139582, with the following enhancement.

  • Don't try to move load instructions across exception handling instructions.
  • In function replaceValuesPerBlockEntry ValuesPerBlock[BB] may not be the moved loaded value, in this case we should not replace its value.

Diff Detail

Event Timeline

Carrot created this revision.Jan 13 2023, 11:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2023, 11:12 AM
Carrot requested review of this revision.Jan 13 2023, 11:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2023, 11:12 AM
Carrot edited the summary of this revision. (Show Details)Jan 13 2023, 11:14 AM
mkazantsev requested changes to this revision.Jan 16 2023, 9:41 PM

Some minor nits & style proposals; could you please add more tests for weird corner cases, specifically same block being a pred multiple times?

llvm/lib/Transforms/Scalar/GVN.cpp
1395

Do you care about switch with 2 branches? If not, maybe then match(m_Br(m_Value(), m_BasicBlock(IfTrue), m_BasicBlock(IfFalse)))?

1398

Will there be a problem if Term->getSuccessor(0) == Term->getSuccessor(1)? Any tests for it?

1400

Do you really need to check EHPad after you've already checked for isExceptionalTerminator?

1403

nit: unsigned to signed conversion, might be warning in compiler

1404

Maybe

for (Instruction &Inst : *SuccBB) {
  if (!Inst.isIdenticalTo(Load))
    continue;
  // same code with reduced nest
}
1412

Add a comment that, if one identical load already depends on something, then there is no point to look further?

1488

Add some statistic here?

1591

What if the same block goes into LoadBB multiple times? Smth like

switch cond
   case 1: LoadBB
   case 2: LoadBB
   case 3: LoadBB
   default: LoadBB

Will this work correctly for this case? Please add some tests for situations like this.

1631–1634

nit: { } not needed

llvm/test/Transforms/GVN/PRE/2017-06-28-pre-load-dbgloc.ll
3

Why change that?

This revision now requires changes to proceed.Jan 16 2023, 9:41 PM
Carrot updated this revision to Diff 490011.Jan 17 2023, 6:19 PM
Carrot marked 5 inline comments as done.
Carrot added inline comments.
llvm/lib/Transforms/Scalar/GVN.cpp
1395

I prefer to include switch case, this is a benefit without any extra cost.

1398

If Term->getSuccessor(0) == Term->getSuccessor(1), then SuccBB will have two predecessors, the next statement should exit immediately. Added a test case for it.

1400

You are right, this is not necessary.

1591

It looks similar to your comment in findLoadToHoistIntoPred. Either the switch BB contains an identical load, nothing should be handled. Or switch BB doesn't contains an identical load, findLoadToHoistIntoPred returns nullptr because of too many edges. Test added.

llvm/test/Transforms/GVN/PRE/2017-06-28-pre-load-dbgloc.ll
3

Because with this optimization, both cases generate same result. The PRE of load %desc now can be detected and moved to entry block.

mkazantsev requested changes to this revision.Jan 18 2023, 4:12 AM
mkazantsev added inline comments.
llvm/lib/Transforms/Scalar/GVN.cpp
97

More specific name, smth related to critical edges maybe?

1643

Is it really needed? There is literally same check just below.

2703

I think this assert should not fail, and if it fails, you have a bug. ICF may keep cached information for it bound to its old parent, and will try to remove it from its new parent. You may get inconsistent state of ICF because of it. At least I don't see how you update it.

You should not move instructions. The right approach is to create a new one.

This revision now requires changes to proceed.Jan 18 2023, 4:12 AM

I have a question regarding code example: it looks like a case of load hoisting from SuccBB and basic block after splitting of (PredBB, LoadBB) critical edge. If so, the same result can be achieved by:

  1. running SimplifyCFGOpt::HoistThenElseCodeToIf after GVN. Its implementation is very constrained (can hoist only first N exactly the same instructions from ThenBB and ElseBB, so very sensitive to instruction order)
  2. Run GVNHoist pass after GVN -- the most general solution. Unfortunately it's disabled by default and placed before GVN in pass pipeline (although GVN exposes lots of hoisting opportunities, and its authors advise to run it before and after GVN-PRE).

Also there was an attempt to implement intermediate solution (hoist only instructions from 2 BBs to their common successors, but match them with value numbers): https://discourse.llvm.org/t/rfc-simple-gvn-hoist, so there is a chance that https://reviews.llvm.org/D110822 also covers the same problem.

In general, I'm concerned that this patch tries to combine 2 different transformations: load PRE and hoisting. It can omit critical edge splitting in some cases, but after hoisting we can also cleanup CFG.

In general, I'm concerned that this patch tries to combine 2 different transformations: load PRE and hoisting. It can omit critical edge splitting in some cases, but after hoisting we can also cleanup CFG.

Sort of. But most of the code are used to detect the optimization opportunities, some changes to reuse current code to insert load into PredBB, only the changes in eliminatePartiallyRedundantLoad are specifically for the hoisting part. I think it's a too heavy hammer for this purpose, especially the expected following pass is still unavailable.

Carrot updated this revision to Diff 490288.Jan 18 2023, 2:13 PM
Carrot marked 2 inline comments as done.
Carrot added inline comments.
llvm/lib/Transforms/Scalar/GVN.cpp
2703

Add a call to ICF->insertInstructionTo for the new created load instruction. For the deleted instruction ICF->removeInstruction is called in line 2711.

Maybe a naive question, does Load instruction impact implicit control flow?

mkazantsev added inline comments.Jan 18 2023, 8:46 PM
llvm/lib/Transforms/Scalar/GVN.cpp
2703

Sorry, I didn't formulate the problem I'm seeing correctly. It's not that loads create implicit control flow. It's that by removing this assert, you allow instruction motion here. There is no check that you could only have moved a load, right? So potentially it creates a room for this kind of bugs.

mkazantsev added inline comments.Jan 18 2023, 8:50 PM
llvm/lib/Transforms/Scalar/GVN.cpp
2703

Let's not scatter cache updates across the code. There can be new caches added in the future, not only ICF or whatever it is now. We don't want multiple places where we need to update them. This opens doors for bugs.

Any serious reasons to move instruction rather than create a new one?

mkazantsev added inline comments.Jan 18 2023, 8:54 PM
llvm/lib/Transforms/Scalar/GVN.cpp
2703

Just imagine that someone will accidentally move an ICF instruction in later change. Currently, the assert is protecting us from it. By giving it up, we make mistakes like this harder to find.

Carrot updated this revision to Diff 490663.Jan 19 2023, 2:32 PM
Carrot added inline comments.
llvm/lib/Transforms/Scalar/GVN.cpp
2703

Sounds reasonable.

So now I just create a new load, replace all uses of old load with the new load. The dead old load instruction can be deleted in the next iteration of GVN.

mkazantsev accepted this revision.Jan 22 2023, 8:26 PM

Looks good now, thanks for your efforts!

This revision is now accepted and ready to land.Jan 22 2023, 8:26 PM
This revision was landed with ongoing or failed builds.Jan 25 2023, 11:48 AM
This revision was automatically updated to reflect the committed changes.
nikic added a comment.Jan 26 2023, 3:14 AM

It looks like the new version of the patch ended up being more expensive in terms of compile-time: Original, New

I wonder whether this is because not removing the load essentially always forces an extra GVN iteration?

llvm/lib/Transforms/Scalar/GVN.cpp
1495

If we're not removing the load, we probably shouldn't be removing it from the leader table either?

It looks like the new version of the patch ended up being more expensive in terms of compile-time: Original, New

I wonder whether this is because not removing the load essentially always forces an extra GVN iteration?

I think so, it causes all triggering cases need another iteration.

llvm/lib/Transforms/Scalar/GVN.cpp
1495

Because we expect it to be deleted in the next iteration, and the same value is also available in the NewLoad instruction, so I think it should not be used by other optimizations, and assume it's not available in the leader table.

Hello!

It seems that with this patch GVN may yield different results when debug info is present.
I suspect the MaxNumInsnsPerBlock limit includes also debug instructions?
This is seen with the following example:

opt -passes=gvn bbi-78272.ll -S -o -

If you remove some dbg.declare from the input you get different output.

dstenb added a subscriber: dstenb.Jan 27 2023, 8:06 AM

@uabelho, thank you for the reporting! I think you are right. I'm preparing the patch.

hans added a subscriber: hans.Feb 1 2023, 2:47 AM

Just a heads up that we've bisected a test failure to this revision: https://bugs.chromium.org/p/chromium/issues/detail?id=1411693
We're still investigating, but it would be interesting to know if others have also hit any issues.

hans added a subscriber: reames.Feb 1 2023, 9:30 AM

I'm not sure it explains our test failure, but @reames raised a concern on the llvm-commits email:

I believe this patch to be incorrect, and need reverted.

The case I'm concerned about is that this patch appears to hoist a load 
over instructions which may throw.  The load at the original location 
may be dynamically dead because the exception path is always taken.  
Moving said load into the executed path can introduce a new fault in the 
program.

@hans, do you have a reduced test case to demonstrate the problem?

I exchanged emails with @reames

A throwing call is a terminator, can we find a load below it ?

On Mon, Jan 30, 2023 at 11:49 AM Philip Reames


<listmail@philipreames.com> wrote:
>
> You continue on any non-identical instruction.  That instruction can be
> a throwing call.
>
> Philip
>
> On 1/30/23 11:23, Carrot Wei wrote:
> > Hi Philip
> >
> > Do you have a reduced test case to show your problem?
> >
> > This patch tries to check exception throwing instructions in the
> > function findLoadToHoistIntoPred. What's your case that I missed?
> >
> > thanks
> > Guozhi Wei

At this point, please revert the patch immediately.

There's been a probable bug reported.  There's been a failure reported. 
These are probably the same.  You should revert, investigate, and fix
offline.

Philip

Carrot added a comment.Feb 1 2023, 1:27 PM

@reames, I'm happy to revert it once I get your test case, otherwise I have nothing to investigate.

slightly modified from one of the test cases:

declare i1 @foo()
declare void @maybethrow() readnone

; %v3 is partially redundant, bb3 has multiple predecessors coming through
; critical edges. The other successors of those predecessors have same loads.
; We can move all loads into predecessors.

define void @test17(ptr %p1, ptr %p2, ptr %p3, ptr %p4) {
entry:
  %v1 = load i64, ptr %p1, align 8
  %cond1 = icmp sgt i64 %v1, 200
  br i1 %cond1, label %bb200, label %bb1

bb1:
  %cond2 = icmp sgt i64 %v1, 100
  br i1 %cond2, label %bb100, label %bb2

bb2:
  %v2 = add nsw i64 %v1, 1
  store i64 %v2, ptr %p1, align 8
  br label %bb3

bb3:
  %v3 = load i64, ptr %p1, align 8
  store i64 %v3, ptr %p2, align 8
  ret void

bb100:
  %cond3 = call i1 @foo()
  br i1 %cond3, label %bb3, label %bb101

bb101:
  %v4 = load i64, ptr %p1, align 8
  store i64 %v4, ptr %p3, align 8
  ret void

bb200:
  %cond4 = call i1 @foo()
  br i1 %cond4, label %bb3, label %bb201

bb201:
  %_ = call i1 @maybethrow()
  %v5 = load i64, ptr %p1, align 8
  store i64 %v5, ptr %p4, align 8
  ret void
}

becomes

declare i1 @foo()

; Function Attrs: memory(none)
declare void @maybethrow() #0

define void @test17(ptr %p1, ptr %p2, ptr %p3, ptr %p4) {
entry:
  %v1 = load i64, ptr %p1, align 8
  %cond1 = icmp sgt i64 %v1, 200
  br i1 %cond1, label %bb200, label %bb1

bb1:                                              ; preds = %entry
  %cond2 = icmp sgt i64 %v1, 100
  br i1 %cond2, label %bb100, label %bb2

bb2:                                              ; preds = %bb1
  %v2 = add nsw i64 %v1, 1
  store i64 %v2, ptr %p1, align 8
  br label %bb3

bb3:                                              ; preds = %bb200, %bb100, %bb2
  %v3 = phi i64 [ %v3.pre, %bb200 ], [ %v3.pre1, %bb100 ], [ %v2, %bb2 ]
  store i64 %v3, ptr %p2, align 8
  ret void

bb100:                                            ; preds = %bb1
  %cond3 = call i1 @foo()
  %v3.pre1 = load i64, ptr %p1, align 8
  br i1 %cond3, label %bb3, label %bb101

bb101:                                            ; preds = %bb100
  store i64 %v3.pre1, ptr %p3, align 8
  ret void

bb200:                                            ; preds = %entry
  %cond4 = call i1 @foo()
  %v3.pre = load i64, ptr %p1, align 8
  br i1 %cond4, label %bb3, label %bb201

bb201:                                            ; preds = %bb200
  %_ = call i1 @maybethrow()
  store i64 %v3.pre, ptr %p4, align 8
  ret void
}

attributes #0 = { memory(none) }

%v5 is hoisted above call @maybethrow()

Carrot added a comment.Feb 1 2023, 2:50 PM

It's reverted now. Thanks for the test case. Will improve it in another version.

Carrot reopened this revision.Feb 2 2023, 8:08 PM
This revision is now accepted and ready to land.Feb 2 2023, 8:08 PM
Carrot updated this revision to Diff 494499.Feb 2 2023, 8:10 PM

Check for implicit control flow instruction in function findLoadToHoistIntoPred, so we can avoid moving load across unexpected control flow.

@reames and @hans please try to see if this patch works for your code.

reames added a comment.Feb 3 2023, 7:54 AM

This fixes (via the call to isDominatedByICFIFromSame) the issue I had identified in post commit review. I have not closely examined the patch more generally.

hans added a comment.Feb 3 2023, 12:02 PM

Check for implicit control flow instruction in function findLoadToHoistIntoPred, so we can avoid moving load across unexpected control flow.

@reames and @hans please try to see if this patch works for your code.

Sadly, our test fails also with this version.

We don't have a reduced repro.

Would it be possible to try this out on some code internally in case any test there exposes the problem?

In private email communication, @hans told me he couldn't find any problem in his optimized code, and his fail disappeared after updating his source code. So there is no known problem with this version.

Could anybody take another look at this patch? Compared to last version there is only one line change in function findLoadToHoistIntoPred.

chapuni accepted this revision.Feb 23 2023, 11:46 PM

Works for me.

I believe the updated patch doesn't address the compile-time regression yet -- it's still running a whole extra GVN iteration to remove the load that is left behind.

Carrot updated this revision to Diff 500326.Feb 24 2023, 4:23 PM

To improve compile time, now I record the basic blocks contain dead load instructions, at the end of the same iteration, process these blocks again to delete load instructions. So we can avoid the extra iteration on the whole function.

@nikic, the original version of this patch is 0.07% slower because more triggered optimizations caused more iterations on huge functions.
http://llvm-compile-time-tracker.com/compare.php?from=d38d6065584ee5dd837e9a629f90c731d8a7dffc&to=94fc2022ff32b63d6c744d3eeff4f304e1a81618&stat=instructions:u

The second version is 0.16% slower because it leaves the dead instructions to next iteration, so it causes an extra iteration on whole function. http://llvm-compile-time-tracker.com/compare.php?from=287508cd9c4396c8845d92310d258879202a179e&to=5f1448fe1585b5677d5f0064e4eeac3b493d8a18&stat=instructions%3Au

The current version is 0.08% slower, very close to the first version. This time I record the BBs contain dead stores, instead of revisit the whole function, I just revisit these recorded BBs. http://llvm-compile-time-tracker.com/compare.php?from=c8e5354f0063b090b6fa7ad4cfc2b9df78038454&to=d21dbfa65423bbd9251ab5948f6cc1baa2db61c8&stat=instructions:u.

Is it OK now?
Thanks.

Carrot added a comment.Apr 3 2023, 2:30 PM

@nikic, do you have any other concern on this patch?

@mkazantsev, could you help to take a look at the latest version?

@nikic expressed his concern on compile time, then I sent out this version to improve compile time. But @nikic didn't response for several weeks.

The change is the use of a new variable AdditionalWorkSet for tracking BBs contain dead load, so we can process these BBs from AdditionalWorkSet at the end of each iteration, and avoid an extra iteration on the whole function.

@nikic, the original version of this patch is 0.07% slower because more triggered optimizations caused more iterations on huge functions.
http://llvm-compile-time-tracker.com/compare.php?from=d38d6065584ee5dd837e9a629f90c731d8a7dffc&to=94fc2022ff32b63d6c744d3eeff4f304e1a81618&stat=instructions:u

The second version is 0.16% slower because it leaves the dead instructions to next iteration, so it causes an extra iteration on whole function. http://llvm-compile-time-tracker.com/compare.php?from=287508cd9c4396c8845d92310d258879202a179e&to=5f1448fe1585b5677d5f0064e4eeac3b493d8a18&stat=instructions%3Au

The current version is 0.08% slower, very close to the first version. This time I record the BBs contain dead stores, instead of revisit the whole function, I just revisit these recorded BBs. http://llvm-compile-time-tracker.com/compare.php?from=c8e5354f0063b090b6fa7ad4cfc2b9df78038454&to=d21dbfa65423bbd9251ab5948f6cc1baa2db61c8&stat=instructions:u.

Is it OK now?
Thanks.

0.08% compile time difference is too small to be considered significant. To double check, what is the clang self build time diff?

nikic added a comment.Apr 27 2023, 1:44 PM

Sorry for the long silence. I think it's fine in terms of compile time, but I guess I still don't fully get why it is better to reprocess the block than remove the load. I understand that InstrsToErase doesn't work for this case because it's only for the current BB, but we basically have the same code copied at the end of performScalarPRE() as well, so it seems like it should be possible to extract that into a common helper and use it here as well. Maybe I'm missing some subtlety.

Carrot updated this revision to Diff 518049.Apr 28 2023, 2:09 PM

Extract the common code for deleting an instruction to a function removeInstruction, call it from several places as required.

nikic requested changes to this revision.May 9 2023, 6:24 AM

I believe the change to metadata.ll is a miscompile. The new range metadata in that case should be a union of both old ranges. This is missing a combineMetadataForCSE() somewhere.

This revision now requires changes to proceed.May 9 2023, 6:24 AM
Carrot updated this revision to Diff 520874.May 9 2023, 5:45 PM

Combine the old load's metadata before deleting it.

nikic accepted this revision.May 14 2023, 7:45 AM

LGTM (Haven't reviewed the whole thing, but my concerns have been addressed.)

This revision is now accepted and ready to land.May 14 2023, 7:45 AM
This revision was landed with ongoing or failed builds.May 16 2023, 6:12 PM
This revision was automatically updated to reflect the committed changes.
TimNN added a subscriber: TimNN.May 16 2023, 10:14 PM

I strongly suspect that this patch is breaking the Rust compiler when building against LLVM head:

Instruction does not dominate all uses!
  %11 = load i32, ptr %10, align 4, !range !10, !alias.scope !6314, !noalias !6319
  %19 = phi i32 [ %8, %9 ], [ %11, %32 ]
in function _RINvNtCsbJsLLO9YF7d_9rustc_hir10intravisit9walk_bodyINtNtCs7c017R9yI2R_10rustc_lint4late18LateContextAndPassNtBT_33BuiltinCombinedModuleLateLintPassEEBT_
LLVM ERROR: Broken function found, compilation aborted!
error: could not compile `rustc_lint` (lib)

From https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/19373

Changes since the last successful run of that build bot: https://github.com/llvm/llvm-project/compare/d3b9d8b28f8e9fde41a4136c28f458347d9bb292...6a2f52a3a00bdfc8bd4edf6592099a7a749d324e

(I don't have a minimized repro yet).

@TimNN, I have reverted it, LLVM buildbot also reported a sanitized bootstrap failure.

If you have simple repro, it will be a great help to me.

Thanks.

TimNN added a comment.May 17 2023, 9:16 AM

@TimNN, I have reverted it, LLVM buildbot also reported a sanitized bootstrap failure.

If you have simple repro, it will be a great help to me.

Thanks for reverting! I sadly don't have a good repro, and likely won't have the time to try and get one in the near future.

The compiler crash didn't occur in my last commit, but occurred in this commit. It is caused by the interaction between the new instruction deletion method and the implementation of replaceValuesPerBlockEntry.

Consider the following case

; case1.

bb1:
   ...
   ; NewLoad is added here.
   br %cond1, label %bb2, label %bb3

bb2:
   OldLoad
   ...
   br label %bb3

bb3:
   Load
   ...

Once we decide to do the transformation, we take the following steps

  • Insert NewLoad into bb1.
  • Replace ValuesPerBlock(OldLoad->getParent(), OldLoad) with ValuesPerBlock(OldLoad->getParent(), NewLoad), so we'll get ValuesPerBlock(%bb2, NewLoad).
  • Delete instruction OldLoad, and replace all its uses with NewLoad.
  • Replace instruction Load with a new PHI instruction, required information is from ValuesPerBlock.

If we change the case to

; case2.

bb1:
   ...
   ; NewLoad is added here.
   br %cond, label %bb2, label %bb3

bb2:
   OldLoad
   ...
   br %cond2, label %bb5, label %bb4

bb4:
   ...
   br label %bb3

bb3:
   Load
   ...

We take the following steps to do the transformation

  • Insert NewLoad into bb1.
  • Try to replace ValuesPerBlock(OldLoad->getParent(), OldLoad) with ValuesPerBlock(OldLoad->getParent(), NewLoad). But we don't have an entry for ValuesPerBlock(%bb2, OldLoad), instead we have an entry ValuesPerBlock(%bb4, OldLoad), and it is not changed.
  • Delete instruction OldLoad, and replace all its uses with NewLoad.
  • Replace instruction Load with a new PHI instruction, required information is from ValuesPerBlock. Then we get the value OldLoad from %bb4 again, because OldLoad is already deleted, it's an invalid value, and causes crash.

In my last version, OldLoad isn't deleted immediately, the reference to it in the PHI instruction is valid and correct. In the next iteration of GVN, OldLoad can be found as fully redundant, then it is deleted and the use in PHI instruction is updated to use NewLoad.

It should be fixed in replaceValuesPerBlockEntry, every entry with the value OldLoad should be changed to NewLoad. Because OldLoad is dominated by NewLoad, so this change is safe.

Carrot reopened this revision.May 26 2023, 1:22 PM
This revision is now accepted and ready to land.May 26 2023, 1:22 PM
Carrot updated this revision to Diff 526181.May 26 2023, 1:29 PM

Updated the function replaceValuesPerBlockEntry to replace all OldLoad with NewLoad. So the deleted OldLoad instruction will not be used by later PHI instruction.

@TimNN, @antmo, @nickdesaulniers, @maxim-kuvyrkov could you help to test if this version work for you?

thanks

antmo added a comment.May 30 2023, 8:33 AM

the new version looks ok here (no more crash on ByteCode.cpp)

Updated the function replaceValuesPerBlockEntry to replace all OldLoad with NewLoad. So the deleted OldLoad instruction will not be used by later PHI instruction.

@TimNN, @antmo, @nickdesaulniers, @maxim-kuvyrkov could you help to test if this version work for you?

thanks

Sorry, before even being able to test the new patch, I'm running into numerous regressions that I need to sort out first:

  1. https://github.com/ClangBuiltLinux/linux/issues/1855
  2. https://github.com/ClangBuiltLinux/linux/issues/1856
  3. https://github.com/ClangBuiltLinux/linux/issues/1857

It's unusual to have that many regressions over the weekend, but may we please have some time to resolve those so that we can re-test this patch properly (before you resubmit)?

@antmo, thank you for your verification.

@nickdesaulniers, no problem to me, thanks!

nickdesaulniers accepted this revision.May 31 2023, 1:59 PM

Thanks for your patience. I was able to verify that Diff 526181 no longer ICE's as was observed by @maxim-kuvyrkov in linux stable 6.3.y for ARCH=arm allyesconfig https://ci.linaro.org/job/tcwg_kernel--llvm-master-arm-stable-allyesconfig-build/29/artifact/artifacts/06-build_linux/console.log.xz.

TimNN added a comment.Jun 2 2023, 8:15 AM

I could _not_ reproduce the problems I previously saw when building the Rust compiler, so seems to be all good now :).

Carrot added a comment.Jun 2 2023, 3:52 PM

@nickdesaulniers, @TimNN, thank you for your verification.

@mkazantsev, @nikic, could you take a look at this version, the only modification is in function replaceValuesPerBlockEntry and a new test case.

nikic accepted this revision.Jun 5 2023, 12:06 AM

LGTM

This revision was landed with ongoing or failed builds.Jun 6 2023, 12:47 PM
This revision was automatically updated to reflect the committed changes.

Hi,

The following starts crashing with this patch:

opt -passes="inline,function<eager-inv;no-rerun>(gvn<>)" bbi-83499.ll -o /dev/null -debug

It requires -debug to crash since there is a verifyRemoved call hidden in a LLVM_DEBUG.

llvm/lib/Transforms/Scalar/GVN.cpp
3153

A bit strange to hide verification in LLVM_DEBUG? So we only run that with debug printouts turned on?

nikic added a comment.EditedJun 12 2023, 6:18 AM

@uabelho Thanks for the report, should be fixed by https://github.com/llvm/llvm-project/commit/282324aa4a6c29d5ce31c66f8def15d9bd8e84e4.

(I did not add an additional test, because plenty of other tests already fail if we don't require -debug for this.)

@uabelho Thanks for the report, should be fixed by https://github.com/llvm/llvm-project/commit/282324aa4a6c29d5ce31c66f8def15d9bd8e84e4.

(I did not add an additional test, because plenty of other tests already fail if we don't require -debug for this.)

@nikic : Yes, I verified it doesn't crash anymore. Thanks, that was fast :)

@nikic Thanks for fixing!

It is not a fix for debug but a fix for actual unstable bug.
I saw the case that phantoms were found in VN and caused nondeterministic behavior with a low probability.
(I guess new Inst might point deleted one, not sure)

@uabelho Thanks for the report, should be fixed by https://github.com/llvm/llvm-project/commit/282324aa4a6c29d5ce31c66f8def15d9bd8e84e4.

(I did not add an additional test, because plenty of other tests already fail if we don't require -debug for this.)

Does it make sense to use AssertingVH instead of bare Value* in GVNPass::ValueTable::valueNumbering (and probably in some other places as well) to prevent such bugs?