This is an archive of the discontinued LLVM Phabricator instance.

[LSR] Don't break a critical edge if parent ends with "callbr"
AbandonedPublic

Authored by void on Jan 11 2021, 10:05 PM.

Details

Summary

The 'callbr' instruction doesn't have a good place to put values from a
split critical edge. This is similar to indirect branch and case switch
instructions.

Diff Detail

Event Timeline

void created this revision.Jan 11 2021, 10:05 PM
void requested review of this revision.Jan 11 2021, 10:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 11 2021, 10:05 PM

I guess the presence of the profile data makes it difficult to use bugpoint to reduce the test case more?

The test should be added in test/Transforms/PGOProfile/, see D87435

void added a comment.Jan 12 2021, 12:07 PM

The test should be added in test/Transforms/PGOProfile/, see D87435

The error isn't in opt but in llc though. In fact, I don't think it's specific to PGO, but only triggered by it.

void updated this revision to Diff 316193.Jan 12 2021, 12:10 PM

Reduce the testcase.

void added a comment.Jan 12 2021, 12:11 PM

I guess the presence of the profile data makes it difficult to use bugpoint to reduce the test case more?

It looks like I was able to reduce it. PTAL.

I guess the presence of the profile data makes it difficult to use bugpoint to reduce the test case more?

It looks like I was able to reduce it. PTAL.

The test case passes for me on ToT; was it over-reduced perhaps?

void added a comment.Jan 12 2021, 1:10 PM

I guess the presence of the profile data makes it difficult to use bugpoint to reduce the test case more?

It looks like I was able to reduce it. PTAL.

The test case passes for me on ToT; was it over-reduced perhaps?

It fails for me. Are you running it with assertions enabled?

ah! probably not, let me rebuild

nickdesaulniers accepted this revision.Jan 12 2021, 1:40 PM

Patch LGTM; just curious about the comment, since that's not what's occurring in the test AFAICT. Thanks for the patch!

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5594–5595

Can this comment be made more precise, I don't think what's described is what's being tested?

llvm/test/CodeGen/Generic/callbr-critical-edge-splitting.ll
21 ↗(On Diff #316193)

(I'm surprised we can refer to values in the IR before they've been assigned to).

This revision is now accepted and ready to land.Jan 12 2021, 1:40 PM
MaskRay requested changes to this revision.Jan 12 2021, 1:44 PM

The test should be added in test/Transforms/PGOProfile/, see D87435

The error isn't in opt but in llc though. In fact, I don't think it's specific to PGO, but only triggered by it.

Then it should be in llvm/test/Transforms/LoopStrengthReduce/

This revision now requires changes to proceed.Jan 12 2021, 1:44 PM
MaskRay added inline comments.Jan 12 2021, 1:46 PM
llvm/test/CodeGen/Generic/callbr-critical-edge-splitting.ll
1 ↗(On Diff #316193)

Can you write a opt -passes='loop(loop-reduce)' test?

Checking a command returns 0 is not a useful test. Checking some desired instructions makes the test not only useful for one particular regression, but also useful for some other scenarios.

void added a comment.Jan 12 2021, 2:00 PM

The test should be added in test/Transforms/PGOProfile/, see D87435

The error isn't in opt but in llc though. In fact, I don't think it's specific to PGO, but only triggered by it.

Then it should be in llvm/test/Transforms/LoopStrengthReduce/

This is a code-gen bug. I'm not sure how else to put this. If I run it through opt, it doesn't reproduce.

void added inline comments.Jan 12 2021, 2:03 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5594–5595

It's pretty accurate as is. :) I'm not sure how else to word it.

llvm/test/CodeGen/Generic/callbr-critical-edge-splitting.ll
1 ↗(On Diff #316193)

It replicates only with llc, not with opt.

21 ↗(On Diff #316193)

Might be some PHI node weirdness.

MaskRay added inline comments.Jan 12 2021, 2:08 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5594–5595

The change only affects llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp, not any codegen part, so I hope this can be tested with opt. Usually this should check the produced instructions.

If the resulting IR is correct but codegen somehow crashes, then it is probably a codegen bug which should be separately fixed. If the output IR is actually incorrect, then the IR verifier should be fixed instead.

Anyway I think the test should be added in an appropriate place and if this uncovers other issues those other issues should be reported as well.

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5594–5595

Ah, yes sorry, I was looking at the first phi thinking WTF; didn't notice the second phi until just now.

void added inline comments.Jan 12 2021, 2:18 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5594–5595

It appears that this pass is added only during code-gen:

llvm/lib/CodeGen/TargetPassConfig.cpp:      addPass(createLoopStrengthReducePass());
llvm/lib/Passes/PassBuilder.cpp:#include "llvm/Transforms/Scalar/LoopStrengthReduce.h"
llvm/lib/Passes/PassRegistry.def:LOOP_PASS("loop-reduce", LoopStrengthReducePass())
llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:/// (LoopStrengthReduce.cpp) and memory optimization for address mode
llvm/lib/Target/NVPTX/NVPTXISelLowering.h:  /// reduction (LoopStrengthReduce.cpp) and memory optimization for
llvm/lib/Transforms/Scalar/CMakeLists.txt:  LoopStrengthReduce.cpp
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp://===- LoopStrengthReduce.cpp - Strength Reduce IVs in Loops --------------===//
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:#include "llvm/Transforms/Scalar/LoopStrengthReduce.h"
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:class LoopStrengthReduce : public LoopPass {
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:  LoopStrengthReduce();
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:LoopStrengthReduce::LoopStrengthReduce() : LoopPass(ID) {
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:  initializeLoopStrengthReducePass(*PassRegistry::getPassRegistry());
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:void LoopStrengthReduce::getAnalysisUsage(AnalysisUsage &AU) const {
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:bool LoopStrengthReduce::runOnLoop(Loop *L, LPPassManager & /*LPM*/) {
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:PreservedAnalyses LoopStrengthReducePass::run(Loop &L, LoopAnalysisManager &AM,
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:char LoopStrengthReduce::ID = 0;
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:INITIALIZE_PASS_BEGIN(LoopStrengthReduce, "loop-reduce",
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:INITIALIZE_PASS_END(LoopStrengthReduce, "loop-reduce",
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:Pass *llvm::createLoopStrengthReducePass() { return new LoopStrengthReduce(); }
llvm/lib/Transforms/Scalar/Scalar.cpp:  initializeLoopStrengthReducePass(Registry);
llvm/test/CodeGen/Generic/callbr-critical-edge-splitting.ll
1 ↗(On Diff #316193)

I'm also having trouble reproducing with opt. If this relies on CodeGen specific transforms, do we need a target triple for the module? https://lists.llvm.org/pipermail/llvm-dev/2020-July/143666.html seems to allude to this problem (I haven't read the whole thread yet to see whether/how it's been resolved).

llvm/test/CodeGen/Generic/callbr-critical-edge-splitting.ll
1 ↗(On Diff #316193)

adding

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

allows

opt -loop-reduce /tmp/foo.ll -S -o -

to repro.

llvm/test/CodeGen/Generic/callbr-critical-edge-splitting.ll
1 ↗(On Diff #316193)

opt -mcpu=alderlake -loop-reduce -S /tmp/foo.ll repros also works; guessing that cpu specifies a datalayout?

void updated this revision to Diff 316249.Jan 12 2021, 2:52 PM

Move testcase.

void added inline comments.Jan 12 2021, 3:14 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5594–5595

Sorry. Fixed.

MaskRay added inline comments.Jan 12 2021, 3:35 PM
llvm/test/Transforms/LoopStrengthReduce/callbr-critical-edge-splitting.ll
10

@fhahn

The exiting edge (%do.body.i.i.rdrand_int.exit.i_crit_edge -> %for.end) is a critical edge.

SplitBlockPredecessors is called with LoopPreds being BBs other than TIBB.
SplitBlockPredecessors asserts if one such BB (%for.cond) has an IndirectBr/CallBr terminator.

24

Perhaps simplify the BB names. For example _crit_edge can be dropped as it is the name introduced by a previous critical edge breaking step.

void added inline comments.Jan 12 2021, 3:57 PM
llvm/test/Transforms/LoopStrengthReduce/callbr-critical-edge-splitting.ll
24

No. Please stop. I'm not going to rename BB names. That's just busy work.

@nathanchance also pointed out I had https://reviews.llvm.org/D88438 to fix a similar issue (I forgot about that patch, I should try it with this test case, too).

void updated this revision to Diff 316761.Jan 14 2021, 1:03 PM
  • Add FileCheck to testcase.

This patch also fixes the test case in D88438 (and D88438 also fixes this test case). So two different ways of solving similar bugs.
(ie. either would fix https://github.com/ClangBuiltLinux/linux/issues/1161 and https://github.com/ClangBuiltLinux/linux/issues/1252). I don't feel strongly either way; maybe reviewers could take a look at both approaches?

This patch also fixes the test case in D88438 (and D88438 also fixes this test case).

Does that other review fix this test case intrinsically -- in that this patch is definitely not necessary if that is committed? Or might it just be fixed by accident -- e.g. this particular test-case doesn't trigger the problem anymore, but something else might, without this patch?

This patch also fixes the test case in D88438 (and D88438 also fixes this test case).

Does that other review fix this test case intrinsically -- in that this patch is definitely not necessary if that is committed? Or might it just be fixed by accident -- e.g. this particular test-case doesn't trigger the problem anymore, but something else might, without this patch?

From the github links posted above; both traces from the unwind post-failed-assertion look the same. I assume this patch terminates LoopReduce sooner than D88438 would (though SplitCriticalEdge is called from many other passes than just LoopReduce).

void abandoned this revision.Jan 15 2021, 2:06 PM