This is an archive of the discontinued LLVM Phabricator instance.

Differential D108138

[WIP] Remove switch statements before vectorization
Abandoned · Public

Authored by kmclaughlin on Aug 16 2021, 8:35 AM.

Details

Reviewers

david-arm
fhahn
dmgreen
craig.topper
lebedev.ri

Summary

This patch changes the LowerSwitch pass so that, when a flag
(LoopUnswitch) is passed, the pass only attempts to unswitch simple
switch statements (i.e. there are no ranges and each destination block
is unique) that are part of a loop. The purpose is to allow
vectorization of loops that currently cannot be vectorized because
they contain switch statements.

The LowerSwitch pass is now run just before the vectorizer, with
LoopUnswitch set to true. For simple switches, we create a series of
compares and branches with a simpler structure, which SimplifyCFG
can later replace with a switch again if the vectorizer made no changes.
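
As a rough picture of what "a series of compares and branches" means here, the sketch below models a simple switch as a list of (case value, destination) pairs and lowers it to a linear chain of equality tests that ends at the default. It is not the pass itself, which rewrites LLVM IR; `SwitchModel`, `CompareBlock`, `lowerToChain`, `dispatchSwitch` and `dispatchChain` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of a "simple" switch: single case values (no ranges),
// each mapped to a destination id, plus a default destination.
struct SwitchModel {
  std::vector<std::pair<int64_t, int>> Cases; // (case value, destination id)
  int DefaultDest;
};

// Reference semantics of the switch itself.
int dispatchSwitch(const SwitchModel &S, int64_t Cond) {
  for (const auto &C : S.Cases)
    if (C.first == Cond)
      return C.second;
  return S.DefaultDest;
}

// One compare-and-branch block of the lowered form:
//   if (Cond == Value) goto TrueDest; else continue at block FalseNext.
// FalseNext == -1 stands for "branch to the default destination".
struct CompareBlock {
  int64_t Value;
  int TrueDest;
  int FalseNext;
};

// Build the chain so that each new block falls through to the block built
// just before it; the first block built falls through to the default.
std::vector<CompareBlock> lowerToChain(const SwitchModel &S) {
  std::vector<CompareBlock> Chain;
  int Next = -1;
  for (const auto &C : S.Cases) {
    Chain.push_back({C.first, C.second, Next});
    Next = static_cast<int>(Chain.size()) - 1;
  }
  return Chain;
}

// Walk the chain from its entry block (the last one built).
int dispatchChain(const std::vector<CompareBlock> &Chain, int DefaultDest,
                  int64_t Cond) {
  int Node = static_cast<int>(Chain.size()) - 1;
  while (Node != -1) {
    const CompareBlock &B = Chain[Node];
    assert(B.FalseNext < Node && "chain must move toward the default");
    if (Cond == B.Value)
      return B.TrueDest;
    Node = B.FalseNext;
  }
  return DefaultDest;
}
```

For any input, `dispatchChain(lowerToChain(S), S.DefaultDest, X)` should agree with `dispatchSwitch(S, X)`; only the final block of the chain branches to the default.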

The following tests have been added:

LowerSwitch/simple-switches.ll: Tests the changes to LowerSwitch to replace switch statements in loops.
LoopVectorize/AArch64/sve-remove-switches.ll: Tests that we can vectorize loops containing switch statements with scalable vectors. Also tests that, where vectorization is not possible, the switch statement is created again.
LoopVectorize/remove-switches.ll: Ensures that we do not vectorize the loop if the target doesn't support masked loads & stores, where the cost would be too high.

Event Timeline

kmclaughlin created this revision. Aug 16 2021, 8:35 AM

Herald added subscribers: ctetreau, ormris, wenlei and 3 others. Aug 16 2021, 8:35 AM

kmclaughlin requested review of this revision. Aug 16 2021, 8:35 AM

Herald added projects: Restricted Project, Restricted Project. Aug 16 2021, 8:35 AM

Herald added subscribers: llvm-commits, cfe-commits.

Harbormaster completed remote builds in B119725: Diff 366637. Aug 16 2021, 9:13 AM

I'm not sure i'm sold on this, even though i'm aware that selects hurt vectorization.
How does this Simplify the CFG? I think it would be best to teach LV selects,
or at worst do this in LV itself.

In D108138#2947229, @lebedev.ri wrote:

I'm not sure i'm sold on this, even though i'm aware that selects hurt vectorization.
How does this Simplify the CFG? I think it would be best to teach LV selects,
or at worst do this in LV itself.

Hi @lebedev.ri, I'm under the impression that the vectoriser has a policy of never making scalar transformations so I doubt it would be acceptable to do this in the vectoriser pass. I think the only realistic alternative is to teach LV how to vectorise switch statements and create the vector compares and selects directly in the code, or scalarise them in the vector loop with creation of new blocks. @fhahn and @craig.topper do you have any thoughts on this or preference?

The alternative to teaching looprotate/LV about switches is to make switches non-canonical in the first half of the pipeline, before LV.
That is, don't form them, and aggressively expand any and all existing switches.

I'm under the impression that the vectoriser has a policy of never making scalar transformations

I'm not sure what you mean. I've not looked into the details, but it could presumably be done as some sort of VPlan transformation, possibly in the constructions of vplans to treat switches like multiple icmp's/branches?

In D108138#2948975, @dmgreen wrote:

I'm under the impression that the vectoriser has a policy of never making scalar transformations

I'm not sure what you mean. I've not looked into the details, but it could presumably be done as some sort of VPlan transformation, possibly in the constructions of vplans to treat switches like multiple icmp's/branches?

Hi @dmgreen, I just meant that if LV makes a scalar transformation prior to legality/cost-model checks, then for some reason we don't vectorise, we then end up with a changed scalar body without any vectorisation.

In D108138#2948995, @david-arm wrote:

In D108138#2948975, @dmgreen wrote:

I'm under the impression that the vectoriser has a policy of never making scalar transformations

I'm not sure what you mean. I've not looked into the details, but it could presumably be done as some sort of VPlan transformation, possibly in the constructions of vplans to treat switches like multiple icmp's/branches?

Hi @dmgreen, I just meant that if LV makes a scalar transformation prior to legality/cost-model checks, then for some reason we don't vectorise, we then end up with a changed scalar body without any vectorisation.

Oh yeah, that makes sense. I was wondering if we could teach VPlan to treat them as ICmp/Br without having to actually transform the IR, just doing it as part of constructing the VPlan.

Also check @nikic’s https://reviews.llvm.org/D95296

Matt added a subscriber: Matt. Aug 17 2021, 9:23 AM

junparser added a subscriber: junparser. Aug 17 2021, 11:33 PM

Since we already have LowerSwitchPass to transform SwitchInst, can we add a cost model and run it before vectorization?

Thanks all for the suggestions on this patch :)

I had a look at the LowerSwitch pass as suggested by @junparser, and I did find that running it before vectorisation transforms the switch and allows the same loops to be vectorised. However, I did find that if the loop is not vectorised then the switch is not created again later by SimplifyCFG (possibly because the pass is also arbitrarily splitting cases into ranges and creating multiple branches to the default block?). Tests such as Transforms/PhaseOrdering/X86/simplifycfg-late.ll, which attempts to convert a switch statement into a lookup table, then fail.

For example, running the @switch_no_vectorize test (from remove-switches.ll) with -lowerswitch results in:

for.body:                                         ; preds = %L3, %entry
  %i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
  %sum.033 = phi float [ %conv20, %L3 ], [ 2.000000e+00, %entry ]
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
  %0 = load i32, i32* %arrayidx, align 4
  br label %NodeBlock

NodeBlock:                                        ; preds = %for.body
  %Pivot = icmp slt i32 %0, 3
  br i1 %Pivot, label %LeafBlock, label %LeafBlock1

LeafBlock1:                                       ; preds = %NodeBlock
  %SwitchLeaf2 = icmp eq i32 %0, 3
  br i1 %SwitchLeaf2, label %L3, label %NewDefault

LeafBlock:                                        ; preds = %NodeBlock
  %SwitchLeaf = icmp eq i32 %0, 2
  br i1 %SwitchLeaf, label %L2, label %NewDefault

NewDefault:                                       ; preds = %LeafBlock1, %LeafBlock
  br label %L1
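
Read as ordinary control flow, the blocks above form a small binary search tree rather than a linear chain. The sketch below transcribes that structure into C++ for the two cases visible in the dump (2 and 3). The enum and function names are hypothetical, and the mapping of the default to `L1` is an assumption read off the `NewDefault` block.

```cpp
#include <cassert>

enum class Dest { L1, L2, L3 };

// Semantics of the original switch as implied by the dump (an assumption):
// 2 -> L2, 3 -> L3, anything else -> L1.
Dest originalSwitch(int X) {
  switch (X) {
  case 2:
    return Dest::L2;
  case 3:
    return Dest::L3;
  default:
    return Dest::L1;
  }
}

// The lowered form: NodeBlock splits on a pivot, then each leaf block
// performs a single equality test and otherwise goes to the default.
Dest loweredTree(int X) {
  bool Pivot = X < 3;           // NodeBlock:  icmp slt i32 %0, 3
  if (Pivot) {
    bool SwitchLeaf = X == 2;   // LeafBlock:  icmp eq i32 %0, 2
    return SwitchLeaf ? Dest::L2 : Dest::L1;
  }
  bool SwitchLeaf2 = X == 3;    // LeafBlock1: icmp eq i32 %0, 3
  return SwitchLeaf2 ? Dest::L3 : Dest::L1;
}

// Check that both forms agree on every value in [Lo, Hi].
void checkAgree(int Lo, int Hi) {
  for (int X = Lo; X <= Hi; ++X)
    assert(originalSwitch(X) == loweredTree(X));
}
```

Both leaves branch to the default here, which matches the "multiple branches to the default block" shape mentioned above as a possible reason the switch is not re-formed.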

I also found that any weights assigned to the switch statement are ignored when creating the new branches in LowerSwitch.
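
One way to picture what carrying the weights over could involve, as an illustration only and not something this patch or LowerSwitch does: for a linear chain that tests cases in a fixed order, each compare's taken weight is that case's own weight, and its not-taken weight is the sum of everything still reachable afterwards. `chainBranchWeights` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Given per-case weights (in the order the chain tests them) and the weight
// of the default, return (taken, not-taken) weights for each compare.
std::vector<std::pair<uint64_t, uint64_t>>
chainBranchWeights(const std::vector<uint64_t> &CaseWeights,
                   uint64_t DefaultWeight) {
  std::vector<std::pair<uint64_t, uint64_t>> Out(CaseWeights.size());
  // Weight that flows past the current test; starts as the default weight
  // and accumulates case weights while walking the chain backwards.
  uint64_t Remaining = DefaultWeight;
  for (std::size_t I = CaseWeights.size(); I-- > 0;) {
    assert(Remaining <= UINT64_MAX - CaseWeights[I] && "weight sum overflows");
    Out[I] = {CaseWeights[I], Remaining};
    Remaining += CaseWeights[I];
  }
  return Out;
}
```

With case weights 10, 30, 60 and a default weight of 100, the first compare gets (10, 190), the second (30, 160) and the third (60, 100).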

I'm not sure what the best approach to this is - I could try to change LowerSwitch to create branches which SimplifyCFG will be able to recognise and replace with a switch, or try to change SimplifyCFG to recognise this pattern of compares & branches. Alternatively, the changes in this patch could be used as the basis for a new pass which runs before the vectoriser. I wondered if anyone has any thoughts or preferences on which would be the best option here?

IMO anything other than enhancing LV is wrong.

In D108138#2967100, @lebedev.ri wrote:

IMO anything other than enhancing LV is wrong.

Hi @lebedev.ri, I personally disagree here. Adding support to LV for this is significantly more work (and IMO unnecessary) because there are cases when LV has to handle a lot more than just the obvious flattened vectorisation case using vector comparisons and select instructions. We will also need to add support for vectorisation factors of 1 (with interleaving) and cases where VF>1, but we have to scalarise the switch statement. These latter two cases require basically doing exactly the same thing as @kmclaughlin's patch does here, i.e. unswitching the switch statement into compares/branches and new blocks. It seems far simpler to have a small pass that runs prior to the vectoriser (when enabled) that unswitches.

Not sure what others think here?
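
To make the "flattened" case mentioned above concrete, here is a scalar sketch of a loop whose switch has been rewritten into compares and selects. Each iteration is then branch-free, which is the shape that can be widened into vector compares and vector selects. The loop body and all names are made up for illustration and are not taken from the patch's tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Loop with a switch in the body: control flow depends on each element.
std::vector<float> withSwitch(const std::vector<int> &A) {
  std::vector<float> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    switch (A[I]) {
    case 2:
      Out[I] = 1.0f;
      break;
    case 3:
      Out[I] = 0.5f;
      break;
    default:
      Out[I] = 0.0f;
      break;
    }
  }
  return Out;
}

// Same computation expressed with compares and selects only.
std::vector<float> withSelects(const std::vector<int> &A) {
  std::vector<float> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    bool Is2 = A[I] == 2;          // compare for case 2
    bool Is3 = A[I] == 3;          // compare for case 3
    float V = Is3 ? 0.5f : 0.0f;   // select between case 3 and default
    Out[I] = Is2 ? 1.0f : V;       // select between case 2 and the rest
  }
  return Out;
}

// Check that both loops produce identical results for the given input.
void checkSameResults(const std::vector<int> &A) {
  assert(withSwitch(A) == withSelects(A));
}
```

The VF=1 and scalarised cases described above do not fit this shape, since they keep real branches and blocks.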

How is it conceptually different to break apart IR in LV itself, or do the same in a special pass just before that?
If we want to go this road, we need to completely make switches illegal/non-canonical before LV.

In D108138#2967133, @lebedev.ri wrote:

How is it conceptually different to break apart IR in LV itself, or do the same in a special pass just before that?
If we want to go this road, we need to completely make switches illegal/non-canonical before LV.

If I understand correctly you're suggesting that LV makes a scalar transformation prior to legalisation checks/cost model analysis? If that's the case then I don't think we can do that as this is beyond LV's remit and I don't see how that's any different to making a scalar transformation in a separate pass prior to LV.

In D108138#2967156, @david-arm wrote:

In D108138#2967133, @lebedev.ri wrote:

How is it conceptually different to break apart IR in LV itself, or do the same in a special pass just before that?
If we want to go this road, we need to completely make switches illegal/non-canonical before LV.

If I understand correctly you're suggesting that LV makes a scalar transformation prior to legalisation checks/cost model analysis? If that's the case then I don't think we can do that as this is beyond LV's remit and I don't see how that's any different to making a scalar transformation in a separate pass prior to LV.

Actually no, i'm saying that since LV is not allowed to do such scalar transformations,
doing the same scalar transformation, but just outside of LV, doesn't change the fact
that we've just made a preparatory transformation in the hope that it will allow LV,
without actually knowing that. If it doesn't, we now need to undo it.

I could try to change LowerSwitch to create branches which SimplifyCFG will be able to recognise and replace with a switch, or try to change SimplifyCFG to recognise this pattern of compares & branches.

option is better.

I had a look at the LowerSwitch pass as suggested by @junparser, and I did find that running it before vectorisation transforms the switch and allows the same loops to be vectorised. However, I did find that if the loop is not vectorised then the switch is not created again later by SimplifyCFG

Maybe always lower switch in loops before LV? And some very late (simplifycfg) pass to form switches from branches? icmps are more friendly for further optimizations than switches anyway, or?

Removed changes to SimplifyCFG and instead run LowerSwitch before vectorisation.
Added SimpleSwitchConvert to LowerSwitch which is used if the pass is run before vectorisation - this only considers simple switches (where each destination block is unique) which are also part of a loop.
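
A minimal sketch of the eligibility condition described in this update, assuming "each destination block is unique" means that no two cases share a successor (whether the default must also be distinct is not stated, so it is not checked here). `CaseModel` and `isSimpleLoopSwitch` are hypothetical names; the real pass would query the `SwitchInst` and `LoopInfo` instead.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for one switch case: its value and the id of the
// block it branches to.
struct CaseModel {
  long Value;
  int Dest;
};

// Returns true when the switch sits in a loop and no two cases branch to
// the same destination block.
bool isSimpleLoopSwitch(const std::vector<CaseModel> &Cases, bool InLoop) {
  if (!InLoop)
    return false;
  std::set<int> Seen;
  for (const CaseModel &C : Cases) {
    assert(C.Dest >= 0 && "destination ids are non-negative in this model");
    if (!Seen.insert(C.Dest).second)
      return false; // two case values share a destination block
  }
  return true;
}
```

A switch with cases 0 and 1 going to different blocks inside a loop would qualify; the same switch outside a loop, or with both cases going to one block, would not.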

Herald added subscribers: kerbowa, nhaehnle, jvesely. Sep 15 2021, 8:20 AM

Hi all, I've updated this to take a different approach - the new patch runs LowerSwitch just before the vectoriser, where it will only consider simple switches which are part of a loop. For these switches, the pass will create a series of branches and compares which SimplifyCFG is able to replace with a switch again later if the vectoriser did not make any changes.

I'm happy to split this patch up to make it easier to review, but I thought I would first post the changes I have so far to gather some thoughts on whether this is a better direction than before? Thanks!

Harbormaster completed remote builds in B124015: Diff 372706. Sep 15 2021, 9:02 AM

Hi. I'm personally still not very okay with the approach as it currently is.

Do you need to run LoopRotate after lowering switches? Anything else?
But then you don't actually know that after spending all this compile time,
the vectorization will actually happen, and you won't just now need to undo all this,
correct? This seems conceptually wrong to me.

Will LV never have to learn to deal with switches properly?
I would assume it will, in which case what is the urgency of this temporary approach?

If you really don't want to fix this properly, i'm looking forward to an RFC on llvm-dev.

I just wanted to give an update on this patch, which I'm abandoning for the time being:

@lebedev.ri raised some good questions about the approach taken and whether the additional compile time spent would be worth the additional opportunities for vectorisation. After posting the last update, I collected some benchmark results using SPEC2017 to get a better understanding of the impact of these changes and found that several benchmarks showed performance regressions for fixed-width vectorisation.

The biggest outliers (in terms of percentage runtime change) were:
520.omnetpp_r: -3.00%
500.perlbench_r: -2.00%
502.gcc_r: -1.52%

I also collected the results after adding in a threshold number of cases to be unswitched (set to 4), as was included in the first draft of this patch. This also showed some regressions in the benchmarks run and no significant improvements. Both sets of results showed increased compile times for many benchmarks.

The same benchmarks as above, with the threshold of 4 set:
520.omnetpp_r: -3.46%
500.perlbench_r: -1.20%
502.gcc_r: -1.22%

Results were collected on a Neoverse-N1 machine. Given that these results indicate this isn't the best approach to take, I'm abandoning the patch for now. When this is picked up in future, it will likely be better to follow either the suggestion to prevent canonicalisation of branches & compares into switch statements (under a given number of cases) in the first place, or to teach the loop vectoriser to recognise switches.

:(
I'm sorry for derailing this.
I still think proper switch handling for loops would be nice.

lebedev.ri mentioned this in D116309: [WIP][LoopVectorize] Convert switch blocks into branch sequence. Dec 27 2021, 7:41 AM

Revision Contents

Path                                                                Size
clang/test/Frontend/optimization-remark-analysis.c                  4 lines
llvm/include/llvm/Transforms/Utils/LowerSwitch.h                    6 lines
llvm/lib/Passes/PassBuilder.cpp                                     1 line
llvm/lib/Transforms/Scalar/StructurizeCFG.cpp                       1 line
llvm/lib/Transforms/Utils/FixIrreducible.cpp                        2 lines
llvm/lib/Transforms/Utils/LowerSwitch.cpp                           127 lines
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp                        2 lines
llvm/test/CodeGen/AMDGPU/llc-pipeline.ll                            34 lines
llvm/test/Other/new-pm-defaults.ll                                  2 lines
llvm/test/Other/new-pm-lto-defaults.ll                              1 line
llvm/test/Other/new-pm-thinlto-defaults.ll                          2 lines
llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll             2 lines
llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll       2 lines
llvm/test/Transforms/LoopVectorize/AArch64/sve-remove-switches.ll   413 lines
llvm/test/Transforms/LoopVectorize/remove-switches.ll               467 lines
llvm/test/Transforms/LowerSwitch/simple-switches.ll                 250 lines
llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll     92 lines

Diff 372706

clang/test/Frontend/optimization-remark-analysis.c

 // RUN: %clang -O1 -fvectorize -target x86_64-unknown-unknown -emit-llvm -Rpass-analysis -S %s -o - 2>&1 | FileCheck %s --check-prefix=RPASS
 // RUN: %clang -O1 -fvectorize -target x86_64-unknown-unknown -emit-llvm -S %s -o - 2>&1 | FileCheck %s

-// RPASS: {{.*}}:7:8: remark: loop not vectorized: loop contains a switch statement
-// CHECK-NOT: {{.*}}:7:8: remark: loop not vectorized: loop contains a switch statement
+// RPASS: {{.*}}:7:8: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop
+// CHECK-NOT: {{.*}}:7:8: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop

 double foo(int N, int *Array) {
   double v = 0.0;

 #pragma clang loop vectorize(enable)
   for (int i = 0; i < N; i++) {
     switch(Array[i]) {
     case 0: v += 1.0f; break;
     case 1: v -= 0.5f; break;
     case 2: v *= 2.0f; break;
     default: v = 0.0f;
     }
   }

   return v;
 }

llvm/include/llvm/Transforms/Utils/LowerSwitch.h

@@ 12 lines hidden @@
 //===----------------------------------------------------------------------===//

 #ifndef LLVM_TRANSFORMS_UTILS_LOWERSWITCH_H
 #define LLVM_TRANSFORMS_UTILS_LOWERSWITCH_H

 #include "llvm/IR/PassManager.h"

 namespace llvm {
+class LoopInfo;
+
 struct LowerSwitchPass : public PassInfoMixin<LowerSwitchPass> {
+  bool LoopUnswitch;
+  LoopInfo *LI = nullptr;
+  LowerSwitchPass() : LoopUnswitch(false) {}
+  LowerSwitchPass(bool LoopUnswitch) : LoopUnswitch(LoopUnswitch) {}
   PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
 };
 } // namespace llvm

 #endif // LLVM_TRANSFORMS_UTILS_LOWERSWITCH_H

llvm/lib/Passes/PassBuilder.cpp

@@ 1,197 lines hidden @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   }

   return MPM;
 }

 /// TODO: Should LTO cause any differences to this set of passes?
 void PassBuilder::addVectorPasses(OptimizationLevel Level,
                                   FunctionPassManager &FPM, bool IsFullLTO) {
+  FPM.addPass(LowerSwitchPass(true));
   FPM.addPass(LoopVectorizePass(
       LoopVectorizeOptions(!PTO.LoopInterleaving, !PTO.LoopVectorization)));

   if (IsFullLTO) {
     // The vectorizer may have significantly shortened a loop body; unroll
     // again. Unroll small loops to hide loop backedge latency and saturate any
     // parallel execution resources of an out-of-order processor. We also then
     // need to clean up redundancies and loop invariant code.
@@ 2,108 lines hidden @@

llvm/lib/Transforms/Scalar/StructurizeCFG.cpp

@@ 342 lines hidden @@ public:
   StringRef getPassName() const override { return "Structurize control flow"; }

   void getAnalysisUsage(AnalysisUsage &AU) const override {
     if (SkipUniformRegions)
       AU.addRequired<LegacyDivergenceAnalysis>();
     AU.addRequiredID(LowerSwitchID);
     AU.addRequired<DominatorTreeWrapperPass>();

-    AU.addPreserved<DominatorTreeWrapperPass>();
     RegionPass::getAnalysisUsage(AU);
   }
 };

 } // end anonymous namespace

 char StructurizeCFGLegacyPass::ID = 0;

@@ 757 lines hidden @@

llvm/lib/Transforms/Utils/FixIrreducible.cpp

@@ 84 lines hidden @@ FixIrreducible() : FunctionPass(ID) {
     initializeFixIrreduciblePass(*PassRegistry::getPassRegistry());
   }

   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.addRequiredID(LowerSwitchID);
     AU.addRequired<DominatorTreeWrapperPass>();
     AU.addRequired<LoopInfoWrapperPass>();
     AU.addPreservedID(LowerSwitchID);
-    AU.addPreserved<DominatorTreeWrapperPass>();
-    AU.addPreserved<LoopInfoWrapperPass>();
   }

   bool runOnFunction(Function &F) override;
 };
 } // namespace

 char FixIrreducible::ID = 0;

@@ 248 lines hidden @@

llvm/lib/Transforms/Utils/LowerSwitch.cpp

@@ 13 lines hidden @@

 #include "llvm/Transforms/Utils/LowerSwitch.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/LazyValueInfo.h"
+#include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/CFG.h"
 #include "llvm/IR/ConstantRange.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/Value.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/Casting.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/KnownBits.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Transforms/Utils.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include <algorithm>
 #include <cassert>
 #include <cstdint>
 #include <iterator>
 #include <limits>
 #include <vector>

 using namespace llvm;

 #define DEBUG_TYPE "lower-switch"

+static cl::opt<bool>
+    ForceLoopUnswitch("force-loop-unswitch", cl::Hidden, cl::init(false),
+                      cl::desc("Unswitch simple switches in loops"));
+
 namespace {

 struct IntRange {
   int64_t Low, High;
 };

 } // end anonymous namespace

@@ 41 lines hidden @@ for (CaseVector::const_iterator B = C.begin(), E = C.end(); B != E;) {
     O << "[" << B->Low->getValue() << ", " << B->High->getValue() << "]";
     if (++B != E)
       O << ", ";
   }

   return O << "]";
 }

+namespace {
+class LowerSwitch {
+
+private:
+  LoopInfo *LI;
+  bool LoopUnswitch;
+
+public:
+  LowerSwitch(LoopInfo *LI, bool &LoopUnswitch)
+      : LI(LI), LoopUnswitch(LoopUnswitch) {}
+
+  bool run();
+
+  void FixPhis(BasicBlock *SuccBB, BasicBlock *OrigBB, BasicBlock *NewBB,
+               const unsigned NumMergedCases);
+
+  BasicBlock *NewLeafBlock(CaseRange &Leaf, Value *Val, ConstantInt *LowerBound,
+                           ConstantInt *UpperBound, BasicBlock *OrigBlock,
+                           BasicBlock *Default);
+
+  BasicBlock *SimpleSwitchConvert(SwitchInst *SI, BasicBlock *OrigBlock,
+                                  BasicBlock *DefaultBlock);
+
+  BasicBlock *SwitchConvert(CaseItr Begin, CaseItr End, ConstantInt *LowerBound,
+                            ConstantInt *UpperBound, Value *Val,
+                            BasicBlock *Predecessor, BasicBlock *OrigBlock,
+                            BasicBlock *Default,
+                            const std::vector<IntRange> &UnreachableRanges);
+
+  unsigned Clusterify(CaseVector &Cases, SwitchInst *SI);
+
+  void ProcessSwitchInst(SwitchInst *SI,
+                         SmallPtrSetImpl<BasicBlock *> &DeleteList,
+                         AssumptionCache *AC, LazyValueInfo *LVI);
+
+  bool LowerSwitches(Function &F, LazyValueInfo *LVI, AssumptionCache *AC);
+};
+} // namespace

 /// Update the first occurrence of the "switch statement" BB in the PHI
 /// node with the "new" BB. The other occurrences will:
 ///
 /// 1) Be updated by subsequent calls to this function. Switch statements may
 /// have more than one outcoming edge into the same BB if they all have the same
 /// value. When the switch statement is converted these incoming edges are now
 /// coming from multiple BBs.
 /// 2) Removed if subsequent incoming values now share the same case, i.e.,
 /// multiple outcome edges are condensed into one. This is necessary to keep the
 /// number of phi values equal to the number of branches to SuccBB.
-void FixPhis(
+void LowerSwitch::FixPhis(
     BasicBlock *SuccBB, BasicBlock *OrigBB, BasicBlock *NewBB,
     const unsigned NumMergedCases = std::numeric_limits<unsigned>::max()) {
   for (BasicBlock::iterator I = SuccBB->begin(),
                             IE = SuccBB->getFirstNonPHI()->getIterator();
        I != IE; ++I) {
     PHINode *PN = cast<PHINode>(I);

     // Only update the first occurrence.
@@ 20 lines hidden @@ for (unsigned III : llvm::reverse(Indices))
       PN->removeIncomingValue(III);
   }
 }

 /// Create a new leaf block for the binary lookup tree. It checks if the
 /// switch's value == the case's value. If not, then it jumps to the default
 /// branch. At this point in the tree, the value can't be another valid case
 /// value, so the jump to the "default" branch is warranted.
-BasicBlock *NewLeafBlock(CaseRange &Leaf, Value *Val, ConstantInt *LowerBound,
+BasicBlock *LowerSwitch::NewLeafBlock(CaseRange &Leaf, Value *Val, ConstantInt *LowerBound,
                          ConstantInt *UpperBound, BasicBlock *OrigBlock,
                          BasicBlock *Default) {
   Function *F = OrigBlock->getParent();
   BasicBlock *NewLeaf = BasicBlock::Create(Val->getContext(), "LeafBlock");
   F->getBasicBlockList().insert(++OrigBlock->getIterator(), NewLeaf);

   // Emit comparison
   ICmpInst *Comp = nullptr;
@@ 43 lines hidden @@ for (BasicBlock::iterator I = Succ->begin(); isa<PHINode>(I); ++I) {
     int BlockIdx = PN->getBasicBlockIndex(OrigBlock);
     assert(BlockIdx != -1 && "Switch didn't go to this successor??");
     PN->setIncomingBlock((unsigned)BlockIdx, NewLeaf);
   }

   return NewLeaf;
 }

		BasicBlock *LowerSwitch::SimpleSwitchConvert(SwitchInst *SI,
		BasicBlock *OrigBlock,
		BasicBlock *DefaultBlock) {
		BasicBlock *FalseDest = DefaultBlock;

		for (auto CI : SI->cases()) {
		BasicBlock *TrueDest = CI.getCaseSuccessor();
		CaseRange Case = CaseRange(CI.getCaseValue(), CI.getCaseValue(), TrueDest);
		FalseDest = NewLeafBlock(Case, SI->getCondition(), Case.Low, Case.High,
		OrigBlock, FalseDest);
		}

		return FalseDest;
		}

/// Convert the switch statement into a binary lookup of the case values.		/// Convert the switch statement into a binary lookup of the case values.
/// The function recursively builds this tree. LowerBound and UpperBound are		/// The function recursively builds this tree. LowerBound and UpperBound are
/// used to keep track of the bounds for Val that have already been checked by		/// used to keep track of the bounds for Val that have already been checked by
/// a block emitted by one of the previous calls to switchConvert in the call		/// a block emitted by one of the previous calls to switchConvert in the call
/// stack.		/// stack.
BasicBlock *SwitchConvert(CaseItr Begin, CaseItr End, ConstantInt *LowerBound,		BasicBlock *
		LowerSwitch::SwitchConvert(CaseItr Begin, CaseItr End, ConstantInt *LowerBound,
ConstantInt *UpperBound, Value *Val,		ConstantInt *UpperBound, Value *Val,
BasicBlock *Predecessor, BasicBlock *OrigBlock,		BasicBlock *Predecessor, BasicBlock *OrigBlock,
BasicBlock *Default,		BasicBlock *Default,
const std::vector<IntRange> &UnreachableRanges) {		const std::vector<IntRange> &UnreachableRanges) {
assert(LowerBound && UpperBound && "Bounds must be initialized");		assert(LowerBound && UpperBound && "Bounds must be initialized");
unsigned Size = End - Begin;		unsigned Size = End - Begin;

if (Size == 1) {		if (Size == 1) {
// Check if the Case Range is perfectly squeezed in between		// Check if the Case Range is perfectly squeezed in between
// already checked Upper and Lower bounds. If it is then we can avoid		// already checked Upper and Lower bounds. If it is then we can avoid
// emitting the code that checks if the value actually falls in the range		// emitting the code that checks if the value actually falls in the range
// because the bounds already tell us so.		// because the bounds already tell us so.
[... 62 lines not shown ...]	LowerSwitch::SwitchConvert(CaseItr Begin, CaseItr End, ConstantInt *LowerBound,

BranchInst::Create(LBranch, RBranch, Comp, NewNode);		BranchInst::Create(LBranch, RBranch, Comp, NewNode);
return NewNode;		return NewNode;
}		}

/// Transform simple list of \p SI's cases into list of CaseRange's \p Cases.		/// Transform simple list of \p SI's cases into list of CaseRange's \p Cases.
/// \post \p Cases wouldn't contain references to \p SI's default BB.		/// \post \p Cases wouldn't contain references to \p SI's default BB.
/// \returns Number of \p SI's cases that do not reference \p SI's default BB.		/// \returns Number of \p SI's cases that do not reference \p SI's default BB.
unsigned Clusterify(CaseVector &Cases, SwitchInst *SI) {		unsigned LowerSwitch::Clusterify(CaseVector &Cases, SwitchInst *SI) {
unsigned NumSimpleCases = 0;		unsigned NumSimpleCases = 0;

// Start with "simple" cases		// Start with "simple" cases
for (auto Case : SI->cases()) {		for (auto Case : SI->cases()) {
if (Case.getCaseSuccessor() == SI->getDefaultDest())		if (Case.getCaseSuccessor() == SI->getDefaultDest())
continue;		continue;
Cases.push_back(CaseRange(Case.getCaseValue(), Case.getCaseValue(),		Cases.push_back(CaseRange(Case.getCaseValue(), Case.getCaseValue(),
Case.getCaseSuccessor()));		Case.getCaseSuccessor()));
[... 24 lines not shown ...]	if (Cases.size() >= 2) {
Cases.erase(std::next(I), Cases.end());		Cases.erase(std::next(I), Cases.end());
}		}

return NumSimpleCases;		return NumSimpleCases;
}		}

/// Replace the specified switch instruction with a sequence of chained if-then		/// Replace the specified switch instruction with a sequence of chained if-then
/// insts in a balanced binary search.		/// insts in a balanced binary search.
void ProcessSwitchInst(SwitchInst *SI,		void LowerSwitch::ProcessSwitchInst(SwitchInst *SI,
SmallPtrSetImpl<BasicBlock *> &DeleteList,		SmallPtrSetImpl<BasicBlock *> &DeleteList,
AssumptionCache *AC, LazyValueInfo *LVI) {		AssumptionCache *AC, LazyValueInfo *LVI) {
BasicBlock *OrigBlock = SI->getParent();		BasicBlock *OrigBlock = SI->getParent();
Function *F = OrigBlock->getParent();		Function *F = OrigBlock->getParent();
Value *Val = SI->getCondition(); // The value we are switching on...		Value *Val = SI->getCondition(); // The value we are switching on...
BasicBlock* Default = SI->getDefaultDest();		BasicBlock* Default = SI->getDefaultDest();

// Don't handle unreachable blocks. If there are successors with phis, this		// Don't handle unreachable blocks. If there are successors with phis, this
// would leave them behind with missing predecessors.		// would leave them behind with missing predecessors.
if ((OrigBlock != &F->getEntryBlock() && pred_empty(OrigBlock)) ||		if ((OrigBlock != &F->getEntryBlock() && pred_empty(OrigBlock)) ||
[... 13 lines not shown ...]	void LowerSwitch::ProcessSwitchInst(SwitchInst *SI,
if (Cases.empty()) {		if (Cases.empty()) {
BranchInst::Create(Default, OrigBlock);		BranchInst::Create(Default, OrigBlock);
// Remove all the references from Default's PHIs to OrigBlock, but one.		// Remove all the references from Default's PHIs to OrigBlock, but one.
FixPhis(Default, OrigBlock, OrigBlock);		FixPhis(Default, OrigBlock, OrigBlock);
SI->eraseFromParent();		SI->eraseFromParent();
return;		return;
}		}

		bool SimpleSwitch = true;
		for (auto Case : SI->cases())
		if (!SI->findCaseDest(Case.getCaseSuccessor()))
		SimpleSwitch = false;

		// If we're running this pass before loop vectorise, we should only
		// attempt to convert simple switches which are in a loop
		if ((LoopUnswitch || ForceLoopUnswitch) &&
		(!SimpleSwitch || !LI->getLoopFor(OrigBlock)))
		return;

ConstantInt *LowerBound = nullptr;		ConstantInt *LowerBound = nullptr;
ConstantInt *UpperBound = nullptr;		ConstantInt *UpperBound = nullptr;
bool DefaultIsUnreachableFromSwitch = false;		bool DefaultIsUnreachableFromSwitch = false;

if (isa<UnreachableInst>(Default->getFirstNonPHIOrDbg())) {		if (isa<UnreachableInst>(Default->getFirstNonPHIOrDbg())) {
// Make the bounds tightly fitted around the case value range, because we		// Make the bounds tightly fitted around the case value range, because we
// know that the value passed to the switch must be exactly one of the case		// know that the value passed to the switch must be exactly one of the case
// values.		// values.
[... 28 lines not shown ...]	if (isa<UnreachableInst>(Default->getFirstNonPHIOrDbg())) {

LowerBound = ConstantInt::get(SI->getContext(), Min);		LowerBound = ConstantInt::get(SI->getContext(), Min);
UpperBound = ConstantInt::get(SI->getContext(), Max);		UpperBound = ConstantInt::get(SI->getContext(), Max);
DefaultIsUnreachableFromSwitch = (Min + (NumSimpleCases - 1) == Max);		DefaultIsUnreachableFromSwitch = (Min + (NumSimpleCases - 1) == Max);
}		}

std::vector<IntRange> UnreachableRanges;		std::vector<IntRange> UnreachableRanges;

if (DefaultIsUnreachableFromSwitch) {		if (DefaultIsUnreachableFromSwitch && !(LoopUnswitch || ForceLoopUnswitch)) {
DenseMap<BasicBlock *, unsigned> Popularity;		DenseMap<BasicBlock *, unsigned> Popularity;
unsigned MaxPop = 0;		unsigned MaxPop = 0;
BasicBlock *PopSucc = nullptr;		BasicBlock *PopSucc = nullptr;

IntRange R = {std::numeric_limits<int64_t>::min(),		IntRange R = {std::numeric_limits<int64_t>::min(),
std::numeric_limits<int64_t>::max()};		std::numeric_limits<int64_t>::max()};
UnreachableRanges.push_back(R);		UnreachableRanges.push_back(R);
for (const auto &I : Cases) {		for (const auto &I : Cases) {
[... 65 lines not shown ...]	#endif
}		}

// Create a new, empty default block so that the new hierarchy of		// Create a new, empty default block so that the new hierarchy of
// if-then statements go to this and the PHI nodes are happy.		// if-then statements go to this and the PHI nodes are happy.
BasicBlock *NewDefault = BasicBlock::Create(SI->getContext(), "NewDefault");		BasicBlock *NewDefault = BasicBlock::Create(SI->getContext(), "NewDefault");
F->getBasicBlockList().insert(Default->getIterator(), NewDefault);		F->getBasicBlockList().insert(Default->getIterator(), NewDefault);
BranchInst::Create(Default, NewDefault);		BranchInst::Create(Default, NewDefault);

BasicBlock *SwitchBlock =		BasicBlock *SwitchBlock;
		if ((LoopUnswitch || ForceLoopUnswitch) && SimpleSwitch)
		SwitchBlock = SimpleSwitchConvert(SI, OrigBlock, NewDefault);
		else
		SwitchBlock =
SwitchConvert(Cases.begin(), Cases.end(), LowerBound, UpperBound, Val,		SwitchConvert(Cases.begin(), Cases.end(), LowerBound, UpperBound, Val,
OrigBlock, OrigBlock, NewDefault, UnreachableRanges);		OrigBlock, OrigBlock, NewDefault, UnreachableRanges);

// If there are entries in any PHI nodes for the default edge, make sure		// If there are entries in any PHI nodes for the default edge, make sure
// to update them as well.		// to update them as well.
FixPhis(Default, OrigBlock, NewDefault);		FixPhis(Default, OrigBlock, NewDefault);

// Branch to our shiny new if-then stuff...		// Branch to our shiny new if-then stuff...
BranchInst::Create(SwitchBlock, OrigBlock);		BranchInst::Create(SwitchBlock, OrigBlock);

// We are now done with the switch instruction, delete it.		// We are now done with the switch instruction, delete it.
BasicBlock *OldDefault = SI->getDefaultDest();		BasicBlock *OldDefault = SI->getDefaultDest();
OrigBlock->getInstList().erase(SI);		OrigBlock->getInstList().erase(SI);

// If the Default block has no more predecessors just add it to DeleteList.		// If the Default block has no more predecessors just add it to DeleteList.
if (pred_empty(OldDefault))		if (pred_empty(OldDefault))
DeleteList.insert(OldDefault);		DeleteList.insert(OldDefault);
}		}

bool LowerSwitch(Function &F, LazyValueInfo *LVI, AssumptionCache *AC) {		bool LowerSwitch::LowerSwitches(Function &F, LazyValueInfo *LVI,
		AssumptionCache *AC) {
bool Changed = false;		bool Changed = false;
SmallPtrSet<BasicBlock *, 8> DeleteList;		SmallPtrSet<BasicBlock *, 8> DeleteList;

for (Function::iterator I = F.begin(), E = F.end(); I != E;) {		for (Function::iterator I = F.begin(), E = F.end(); I != E;) {
BasicBlock *Cur =		BasicBlock *Cur =
&*I++; // Advance over block so we don't traverse new blocks		&*I++; // Advance over block so we don't traverse new blocks

// If the block is a dead Default block that will be deleted later, don't		// If the block is a dead Default block that will be deleted later, don't
[... 15 lines not shown ...]	bool LowerSwitch::LowerSwitches(Function &F, LazyValueInfo *LVI,
return Changed;		return Changed;
}		}

/// Replace all SwitchInst instructions with chained branch instructions.		/// Replace all SwitchInst instructions with chained branch instructions.
class LowerSwitchLegacyPass : public FunctionPass {		class LowerSwitchLegacyPass : public FunctionPass {
public:		public:
// Pass identification, replacement for typeid		// Pass identification, replacement for typeid
static char ID;		static char ID;
		bool LoopUnswitch;

LowerSwitchLegacyPass() : FunctionPass(ID) {		LowerSwitchLegacyPass(bool LoopUnswitch = false)
		: FunctionPass(ID), LoopUnswitch(LoopUnswitch) {
initializeLowerSwitchLegacyPassPass(*PassRegistry::getPassRegistry());		initializeLowerSwitchLegacyPassPass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<LazyValueInfoWrapperPass>();		AU.addRequired<LazyValueInfoWrapperPass>();
		AU.addRequired<LoopInfoWrapperPass>();
		AU.addRequired<DominatorTreeWrapperPass>();
}		}
};		};

} // end anonymous namespace		} // end anonymous namespace

char LowerSwitchLegacyPass::ID = 0;		char LowerSwitchLegacyPass::ID = 0;

// Publicly exposed interface to pass...		// Publicly exposed interface to pass...
char &llvm::LowerSwitchID = LowerSwitchLegacyPass::ID;		char &llvm::LowerSwitchID = LowerSwitchLegacyPass::ID;

INITIALIZE_PASS_BEGIN(LowerSwitchLegacyPass, "lowerswitch",		INITIALIZE_PASS_BEGIN(LowerSwitchLegacyPass, "lowerswitch",
"Lower SwitchInst's to branches", false, false)		"Lower SwitchInst's to branches", false, false)
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)		INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
INITIALIZE_PASS_DEPENDENCY(LazyValueInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(LazyValueInfoWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_END(LowerSwitchLegacyPass, "lowerswitch",		INITIALIZE_PASS_END(LowerSwitchLegacyPass, "lowerswitch",
"Lower SwitchInst's to branches", false, false)		"Lower SwitchInst's to branches", false, false)

// createLowerSwitchPass - Interface to this file...		// createLowerSwitchPass - Interface to this file...
FunctionPass *llvm::createLowerSwitchPass() {		FunctionPass *llvm::createLowerSwitchPass() {
return new LowerSwitchLegacyPass();		return new LowerSwitchLegacyPass();
}		}

bool LowerSwitchLegacyPass::runOnFunction(Function &F) {		bool LowerSwitchLegacyPass::runOnFunction(Function &F) {
LazyValueInfo *LVI = &getAnalysis<LazyValueInfoWrapperPass>().getLVI();		LazyValueInfo *LVI = &getAnalysis<LazyValueInfoWrapperPass>().getLVI();
		auto *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
auto *ACT = getAnalysisIfAvailable<AssumptionCacheTracker>();		auto *ACT = getAnalysisIfAvailable<AssumptionCacheTracker>();
AssumptionCache *AC = ACT ? &ACT->getAssumptionCache(F) : nullptr;		AssumptionCache *AC = ACT ? &ACT->getAssumptionCache(F) : nullptr;
return LowerSwitch(F, LVI, AC);		LowerSwitch LS = LowerSwitch(LI, LoopUnswitch);
		return LS.LowerSwitches(F, LVI, AC);
}		}

PreservedAnalyses LowerSwitchPass::run(Function &F,		PreservedAnalyses LowerSwitchPass::run(Function &F,
FunctionAnalysisManager &AM) {		FunctionAnalysisManager &AM) {
LazyValueInfo *LVI = &AM.getResult<LazyValueAnalysis>(F);		LazyValueInfo *LVI = &AM.getResult<LazyValueAnalysis>(F);
		LoopInfo *LI = &AM.getResult<LoopAnalysis>(F);
AssumptionCache *AC = AM.getCachedResult<AssumptionAnalysis>(F);		AssumptionCache *AC = AM.getCachedResult<AssumptionAnalysis>(F);
return LowerSwitch(F, LVI, AC) ? PreservedAnalyses::none()		LowerSwitch LS = LowerSwitch(LI, LoopUnswitch);
		return LS.LowerSwitches(F, LVI, AC) ? PreservedAnalyses::none()
: PreservedAnalyses::all();		: PreservedAnalyses::all();
}		}

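Note on the shape of the chain built by SimpleSwitchConvert above: each new leaf takes the previously created leaf as its false destination, so the entry block of the chain tests the last case first and the default block is reached only after every compare has failed. The following is a minimal standalone sketch of that ordering and of the "simple switch" test, written against plain C++ containers rather than LLVM IR. The names Case, Leaf, isSimpleSwitch, buildChain and dispatch are hypothetical and do not appear in the patch; isSimpleSwitch assumes the check is meant to reject any destination that is shared between cases or equal to the default.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one switch case: a value and the label it jumps to.
struct Case {
  int64_t value;
  std::string dest;
};

// Hypothetical model of one leaf block: "if (v == value) goto trueDest;
// else goto leaf[falseLeaf]", where falseLeaf == -1 means the default block.
struct Leaf {
  int64_t value;
  std::string trueDest;
  int falseLeaf;
};

// Assumed meaning of "simple": every case destination is used by exactly one
// case and is not the default destination.
bool isSimpleSwitch(const std::vector<Case> &cases,
                    const std::string &defaultDest) {
  std::map<std::string, int> uses;
  for (const Case &c : cases)
    ++uses[c.dest];
  for (const Case &c : cases)
    if (c.dest == defaultDest || uses.at(c.dest) != 1)
      return false;
  return true;
}

// Mirrors the loop in the patch: the false edge of each new leaf points at
// the leaf created just before it (or at the default for the first leaf).
std::vector<Leaf> buildChain(const std::vector<Case> &cases) {
  std::vector<Leaf> leaves;
  int falseLeaf = -1; // start with the default block as the fallthrough
  for (const Case &c : cases) {
    leaves.push_back({c.value, c.dest, falseLeaf});
    falseLeaf = static_cast<int>(leaves.size()) - 1;
  }
  return leaves;
}

// Walks the chain from its entry, which is the last leaf that was created.
std::string dispatch(const std::vector<Leaf> &leaves,
                     const std::string &defaultDest, int64_t v) {
  int cur = static_cast<int>(leaves.size()) - 1;
  while (cur >= 0) {
    if (leaves[cur].value == v)
      return leaves[cur].trueDest;
    cur = leaves[cur].falseLeaf;
  }
  return defaultDest;
}
```

With N cases this chain performs up to N compares, whereas the existing SwitchConvert path builds a balanced tree; the linear form is the one the summary describes as having "a simpler structure" for the vectorizer.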
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp

[... 34 lines not shown ...]	UnifyLoopExitsLegacyPass() : FunctionPass(ID) {
initializeUnifyLoopExitsLegacyPassPass(*PassRegistry::getPassRegistry());		initializeUnifyLoopExitsLegacyPassPass(*PassRegistry::getPassRegistry());
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequiredID(LowerSwitchID);		AU.addRequiredID(LowerSwitchID);
AU.addRequired<LoopInfoWrapperPass>();		AU.addRequired<LoopInfoWrapperPass>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addPreservedID(LowerSwitchID);		AU.addPreservedID(LowerSwitchID);
AU.addPreserved<LoopInfoWrapperPass>();
AU.addPreserved<DominatorTreeWrapperPass>();
}		}

bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;
};		};
} // namespace		} // namespace

char UnifyLoopExitsLegacyPass::ID = 0;		char UnifyLoopExitsLegacyPass::ID = 0;

[... 193 lines not shown ...]

llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

	[... 50 lines not shown ...]
	; GCN-O0-NEXT: Scalarize Masked Memory Intrinsics			; GCN-O0-NEXT: Scalarize Masked Memory Intrinsics
	; GCN-O0-NEXT: Expand reduction intrinsics			; GCN-O0-NEXT: Expand reduction intrinsics
	; GCN-O0-NEXT: CallGraph Construction			; GCN-O0-NEXT: CallGraph Construction
	; GCN-O0-NEXT: Call Graph SCC Pass Manager			; GCN-O0-NEXT: Call Graph SCC Pass Manager
	; GCN-O0-NEXT: AMDGPU Annotate Kernel Features			; GCN-O0-NEXT: AMDGPU Annotate Kernel Features
	; GCN-O0-NEXT: FunctionPass Manager			; GCN-O0-NEXT: FunctionPass Manager
	; GCN-O0-NEXT: AMDGPU Lower Kernel Arguments			; GCN-O0-NEXT: AMDGPU Lower Kernel Arguments
	; GCN-O0-NEXT: Lazy Value Information Analysis			; GCN-O0-NEXT: Lazy Value Information Analysis
				; GCN-O0-NEXT: Dominator Tree Construction
				; GCN-O0-NEXT: Natural Loop Information
	; GCN-O0-NEXT: Lower SwitchInst's to branches			; GCN-O0-NEXT: Lower SwitchInst's to branches
	; GCN-O0-NEXT: Lower invoke and unwind, for unwindless code generators			; GCN-O0-NEXT: Lower invoke and unwind, for unwindless code generators
	; GCN-O0-NEXT: Remove unreachable blocks from the CFG			; GCN-O0-NEXT: Remove unreachable blocks from the CFG
	; GCN-O0-NEXT: Post-Dominator Tree Construction			; GCN-O0-NEXT: Post-Dominator Tree Construction
	; GCN-O0-NEXT: Dominator Tree Construction			; GCN-O0-NEXT: Dominator Tree Construction
	; GCN-O0-NEXT: Natural Loop Information			; GCN-O0-NEXT: Natural Loop Information
	; GCN-O0-NEXT: Legacy Divergence Analysis			; GCN-O0-NEXT: Legacy Divergence Analysis
	; GCN-O0-NEXT: Unify divergent function exit nodes			; GCN-O0-NEXT: Unify divergent function exit nodes
	; GCN-O0-NEXT: Lazy Value Information Analysis			; GCN-O0-NEXT: Lazy Value Information Analysis
				; GCN-O0-NEXT: Dominator Tree Construction
				; GCN-O0-NEXT: Natural Loop Information
	; GCN-O0-NEXT: Lower SwitchInst's to branches			; GCN-O0-NEXT: Lower SwitchInst's to branches
	; GCN-O0-NEXT: Dominator Tree Construction			; GCN-O0-NEXT: Dominator Tree Construction
	; GCN-O0-NEXT: Natural Loop Information			; GCN-O0-NEXT: Natural Loop Information
	; GCN-O0-NEXT: Convert irreducible control-flow into natural loops			; GCN-O0-NEXT: Convert irreducible control-flow into natural loops
				; GCN-O0-NEXT: Dominator Tree Construction
				; GCN-O0-NEXT: Natural Loop Information
	; GCN-O0-NEXT: Fixup each natural loop to have a single exit block			; GCN-O0-NEXT: Fixup each natural loop to have a single exit block
				; GCN-O0-NEXT: Dominator Tree Construction
	; GCN-O0-NEXT: Post-Dominator Tree Construction			; GCN-O0-NEXT: Post-Dominator Tree Construction
	; GCN-O0-NEXT: Dominance Frontier Construction			; GCN-O0-NEXT: Dominance Frontier Construction
	; GCN-O0-NEXT: Detect single entry single exit regions			; GCN-O0-NEXT: Detect single entry single exit regions
	; GCN-O0-NEXT: Region Pass Manager			; GCN-O0-NEXT: Region Pass Manager
	; GCN-O0-NEXT: Structurize control flow			; GCN-O0-NEXT: Structurize control flow
				; GCN-O0-NEXT: Dominator Tree Construction
	; GCN-O0-NEXT: Post-Dominator Tree Construction			; GCN-O0-NEXT: Post-Dominator Tree Construction
	; GCN-O0-NEXT: Natural Loop Information			; GCN-O0-NEXT: Natural Loop Information
	; GCN-O0-NEXT: Legacy Divergence Analysis			; GCN-O0-NEXT: Legacy Divergence Analysis
	; GCN-O0-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O0-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O0-NEXT: Function Alias Analysis Results			; GCN-O0-NEXT: Function Alias Analysis Results
	; GCN-O0-NEXT: Memory SSA			; GCN-O0-NEXT: Memory SSA
	; GCN-O0-NEXT: AMDGPU Annotate Uniform Values			; GCN-O0-NEXT: AMDGPU Annotate Uniform Values
	; GCN-O0-NEXT: SI annotate control flow			; GCN-O0-NEXT: SI annotate control flow
	[... 132 lines not shown ...]
	; GCN-O1-NEXT: Call Graph SCC Pass Manager			; GCN-O1-NEXT: Call Graph SCC Pass Manager
	; GCN-O1-NEXT: AMDGPU Annotate Kernel Features			; GCN-O1-NEXT: AMDGPU Annotate Kernel Features
	; GCN-O1-NEXT: FunctionPass Manager			; GCN-O1-NEXT: FunctionPass Manager
	; GCN-O1-NEXT: AMDGPU Lower Kernel Arguments			; GCN-O1-NEXT: AMDGPU Lower Kernel Arguments
	; GCN-O1-NEXT: Dominator Tree Construction			; GCN-O1-NEXT: Dominator Tree Construction
	; GCN-O1-NEXT: Natural Loop Information			; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: CodeGen Prepare			; GCN-O1-NEXT: CodeGen Prepare
	; GCN-O1-NEXT: Lazy Value Information Analysis			; GCN-O1-NEXT: Lazy Value Information Analysis
				; GCN-O1-NEXT: Dominator Tree Construction
				; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: Lower SwitchInst's to branches			; GCN-O1-NEXT: Lower SwitchInst's to branches
	; GCN-O1-NEXT: Lower invoke and unwind, for unwindless code generators			; GCN-O1-NEXT: Lower invoke and unwind, for unwindless code generators
	; GCN-O1-NEXT: Remove unreachable blocks from the CFG			; GCN-O1-NEXT: Remove unreachable blocks from the CFG
	; GCN-O1-NEXT: Dominator Tree Construction			; GCN-O1-NEXT: Dominator Tree Construction
	; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O1-NEXT: Function Alias Analysis Results			; GCN-O1-NEXT: Function Alias Analysis Results
	; GCN-O1-NEXT: Flatten the CFG			; GCN-O1-NEXT: Flatten the CFG
	; GCN-O1-NEXT: Dominator Tree Construction			; GCN-O1-NEXT: Dominator Tree Construction
	; GCN-O1-NEXT: Post-Dominator Tree Construction			; GCN-O1-NEXT: Post-Dominator Tree Construction
	; GCN-O1-NEXT: Natural Loop Information			; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: Legacy Divergence Analysis			; GCN-O1-NEXT: Legacy Divergence Analysis
	; GCN-O1-NEXT: AMDGPU IR late optimizations			; GCN-O1-NEXT: AMDGPU IR late optimizations
	; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O1-NEXT: Function Alias Analysis Results			; GCN-O1-NEXT: Function Alias Analysis Results
	; GCN-O1-NEXT: Code sinking			; GCN-O1-NEXT: Code sinking
	; GCN-O1-NEXT: Legacy Divergence Analysis			; GCN-O1-NEXT: Legacy Divergence Analysis
	; GCN-O1-NEXT: Unify divergent function exit nodes			; GCN-O1-NEXT: Unify divergent function exit nodes
	; GCN-O1-NEXT: Lazy Value Information Analysis			; GCN-O1-NEXT: Lazy Value Information Analysis
				; GCN-O1-NEXT: Dominator Tree Construction
				; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: Lower SwitchInst's to branches			; GCN-O1-NEXT: Lower SwitchInst's to branches
	; GCN-O1-NEXT: Dominator Tree Construction			; GCN-O1-NEXT: Dominator Tree Construction
	; GCN-O1-NEXT: Natural Loop Information			; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: Convert irreducible control-flow into natural loops			; GCN-O1-NEXT: Convert irreducible control-flow into natural loops
				; GCN-O1-NEXT: Dominator Tree Construction
				; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: Fixup each natural loop to have a single exit block			; GCN-O1-NEXT: Fixup each natural loop to have a single exit block
				; GCN-O1-NEXT: Dominator Tree Construction
	; GCN-O1-NEXT: Post-Dominator Tree Construction			; GCN-O1-NEXT: Post-Dominator Tree Construction
	; GCN-O1-NEXT: Dominance Frontier Construction			; GCN-O1-NEXT: Dominance Frontier Construction
	; GCN-O1-NEXT: Detect single entry single exit regions			; GCN-O1-NEXT: Detect single entry single exit regions
	; GCN-O1-NEXT: Region Pass Manager			; GCN-O1-NEXT: Region Pass Manager
	; GCN-O1-NEXT: Structurize control flow			; GCN-O1-NEXT: Structurize control flow
				; GCN-O1-NEXT: Dominator Tree Construction
	; GCN-O1-NEXT: Post-Dominator Tree Construction			; GCN-O1-NEXT: Post-Dominator Tree Construction
	; GCN-O1-NEXT: Natural Loop Information			; GCN-O1-NEXT: Natural Loop Information
	; GCN-O1-NEXT: Legacy Divergence Analysis			; GCN-O1-NEXT: Legacy Divergence Analysis
	; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O1-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O1-NEXT: Function Alias Analysis Results			; GCN-O1-NEXT: Function Alias Analysis Results
	; GCN-O1-NEXT: Memory SSA			; GCN-O1-NEXT: Memory SSA
	; GCN-O1-NEXT: AMDGPU Annotate Uniform Values			; GCN-O1-NEXT: AMDGPU Annotate Uniform Values
	; GCN-O1-NEXT: SI annotate control flow			; GCN-O1-NEXT: SI annotate control flow
	[... 250 lines not shown ...]
	; GCN-O1-OPTS-NEXT: Legacy Divergence Analysis			; GCN-O1-OPTS-NEXT: Legacy Divergence Analysis
	; GCN-O1-OPTS-NEXT: AMDGPU IR late optimizations			; GCN-O1-OPTS-NEXT: AMDGPU IR late optimizations
	; GCN-O1-OPTS-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O1-OPTS-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O1-OPTS-NEXT: Function Alias Analysis Results			; GCN-O1-OPTS-NEXT: Function Alias Analysis Results
	; GCN-O1-OPTS-NEXT: Code sinking			; GCN-O1-OPTS-NEXT: Code sinking
	; GCN-O1-OPTS-NEXT: Legacy Divergence Analysis			; GCN-O1-OPTS-NEXT: Legacy Divergence Analysis
	; GCN-O1-OPTS-NEXT: Unify divergent function exit nodes			; GCN-O1-OPTS-NEXT: Unify divergent function exit nodes
	; GCN-O1-OPTS-NEXT: Lazy Value Information Analysis			; GCN-O1-OPTS-NEXT: Lazy Value Information Analysis
				; GCN-O1-OPTS-NEXT: Dominator Tree Construction
				; GCN-O1-OPTS-NEXT: Natural Loop Information
	; GCN-O1-OPTS-NEXT: Lower SwitchInst's to branches			; GCN-O1-OPTS-NEXT: Lower SwitchInst's to branches
	; GCN-O1-OPTS-NEXT: Dominator Tree Construction			; GCN-O1-OPTS-NEXT: Dominator Tree Construction
	; GCN-O1-OPTS-NEXT: Natural Loop Information			; GCN-O1-OPTS-NEXT: Natural Loop Information
	; GCN-O1-OPTS-NEXT: Convert irreducible control-flow into natural loops			; GCN-O1-OPTS-NEXT: Convert irreducible control-flow into natural loops
				; GCN-O1-OPTS-NEXT: Dominator Tree Construction
				; GCN-O1-OPTS-NEXT: Natural Loop Information
	; GCN-O1-OPTS-NEXT: Fixup each natural loop to have a single exit block			; GCN-O1-OPTS-NEXT: Fixup each natural loop to have a single exit block
				; GCN-O1-OPTS-NEXT: Dominator Tree Construction
	; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction			; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction
	; GCN-O1-OPTS-NEXT: Dominance Frontier Construction			; GCN-O1-OPTS-NEXT: Dominance Frontier Construction
	; GCN-O1-OPTS-NEXT: Detect single entry single exit regions			; GCN-O1-OPTS-NEXT: Detect single entry single exit regions
	; GCN-O1-OPTS-NEXT: Region Pass Manager			; GCN-O1-OPTS-NEXT: Region Pass Manager
	; GCN-O1-OPTS-NEXT: Structurize control flow			; GCN-O1-OPTS-NEXT: Structurize control flow
				; GCN-O1-OPTS-NEXT: Dominator Tree Construction
	; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction			; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction
	; GCN-O1-OPTS-NEXT: Natural Loop Information			; GCN-O1-OPTS-NEXT: Natural Loop Information
	; GCN-O1-OPTS-NEXT: Legacy Divergence Analysis			; GCN-O1-OPTS-NEXT: Legacy Divergence Analysis
	; GCN-O1-OPTS-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O1-OPTS-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O1-OPTS-NEXT: Function Alias Analysis Results			; GCN-O1-OPTS-NEXT: Function Alias Analysis Results
	; GCN-O1-OPTS-NEXT: Memory SSA			; GCN-O1-OPTS-NEXT: Memory SSA
	; GCN-O1-OPTS-NEXT: AMDGPU Annotate Uniform Values			; GCN-O1-OPTS-NEXT: AMDGPU Annotate Uniform Values
	; GCN-O1-OPTS-NEXT: SI annotate control flow			; GCN-O1-OPTS-NEXT: SI annotate control flow
	[... 258 lines not shown ...]
	; GCN-O2-NEXT: Legacy Divergence Analysis			; GCN-O2-NEXT: Legacy Divergence Analysis
	; GCN-O2-NEXT: AMDGPU IR late optimizations			; GCN-O2-NEXT: AMDGPU IR late optimizations
	; GCN-O2-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O2-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O2-NEXT: Function Alias Analysis Results			; GCN-O2-NEXT: Function Alias Analysis Results
	; GCN-O2-NEXT: Code sinking			; GCN-O2-NEXT: Code sinking
	; GCN-O2-NEXT: Legacy Divergence Analysis			; GCN-O2-NEXT: Legacy Divergence Analysis
	; GCN-O2-NEXT: Unify divergent function exit nodes			; GCN-O2-NEXT: Unify divergent function exit nodes
	; GCN-O2-NEXT: Lazy Value Information Analysis			; GCN-O2-NEXT: Lazy Value Information Analysis
				; GCN-O2-NEXT: Dominator Tree Construction
				; GCN-O2-NEXT: Natural Loop Information
	; GCN-O2-NEXT: Lower SwitchInst's to branches			; GCN-O2-NEXT: Lower SwitchInst's to branches
	; GCN-O2-NEXT: Dominator Tree Construction			; GCN-O2-NEXT: Dominator Tree Construction
	; GCN-O2-NEXT: Natural Loop Information			; GCN-O2-NEXT: Natural Loop Information
	; GCN-O2-NEXT: Convert irreducible control-flow into natural loops			; GCN-O2-NEXT: Convert irreducible control-flow into natural loops
				; GCN-O2-NEXT: Dominator Tree Construction
				; GCN-O2-NEXT: Natural Loop Information
	; GCN-O2-NEXT: Fixup each natural loop to have a single exit block			; GCN-O2-NEXT: Fixup each natural loop to have a single exit block
				; GCN-O2-NEXT: Dominator Tree Construction
	; GCN-O2-NEXT: Post-Dominator Tree Construction			; GCN-O2-NEXT: Post-Dominator Tree Construction
	; GCN-O2-NEXT: Dominance Frontier Construction			; GCN-O2-NEXT: Dominance Frontier Construction
	; GCN-O2-NEXT: Detect single entry single exit regions			; GCN-O2-NEXT: Detect single entry single exit regions
	; GCN-O2-NEXT: Region Pass Manager			; GCN-O2-NEXT: Region Pass Manager
	; GCN-O2-NEXT: Structurize control flow			; GCN-O2-NEXT: Structurize control flow
				; GCN-O2-NEXT: Dominator Tree Construction
	; GCN-O2-NEXT: Post-Dominator Tree Construction			; GCN-O2-NEXT: Post-Dominator Tree Construction
	; GCN-O2-NEXT: Natural Loop Information			; GCN-O2-NEXT: Natural Loop Information
	; GCN-O2-NEXT: Legacy Divergence Analysis			; GCN-O2-NEXT: Legacy Divergence Analysis
	; GCN-O2-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O2-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O2-NEXT: Function Alias Analysis Results			; GCN-O2-NEXT: Function Alias Analysis Results
	; GCN-O2-NEXT: Memory SSA			; GCN-O2-NEXT: Memory SSA
	; GCN-O2-NEXT: AMDGPU Annotate Uniform Values			; GCN-O2-NEXT: AMDGPU Annotate Uniform Values
	; GCN-O2-NEXT: SI annotate control flow			; GCN-O2-NEXT: SI annotate control flow
	[... 273 lines not shown ...]
	; GCN-O3-NEXT: Legacy Divergence Analysis			; GCN-O3-NEXT: Legacy Divergence Analysis
	; GCN-O3-NEXT: AMDGPU IR late optimizations			; GCN-O3-NEXT: AMDGPU IR late optimizations
	; GCN-O3-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O3-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O3-NEXT: Function Alias Analysis Results			; GCN-O3-NEXT: Function Alias Analysis Results
	; GCN-O3-NEXT: Code sinking			; GCN-O3-NEXT: Code sinking
	; GCN-O3-NEXT: Legacy Divergence Analysis			; GCN-O3-NEXT: Legacy Divergence Analysis
	; GCN-O3-NEXT: Unify divergent function exit nodes			; GCN-O3-NEXT: Unify divergent function exit nodes
	; GCN-O3-NEXT: Lazy Value Information Analysis			; GCN-O3-NEXT: Lazy Value Information Analysis
				; GCN-O3-NEXT: Dominator Tree Construction
				; GCN-O3-NEXT: Natural Loop Information
	; GCN-O3-NEXT: Lower SwitchInst's to branches			; GCN-O3-NEXT: Lower SwitchInst's to branches
	; GCN-O3-NEXT: Dominator Tree Construction			; GCN-O3-NEXT: Dominator Tree Construction
	; GCN-O3-NEXT: Natural Loop Information			; GCN-O3-NEXT: Natural Loop Information
	; GCN-O3-NEXT: Convert irreducible control-flow into natural loops			; GCN-O3-NEXT: Convert irreducible control-flow into natural loops
				; GCN-O3-NEXT: Dominator Tree Construction
				; GCN-O3-NEXT: Natural Loop Information
	; GCN-O3-NEXT: Fixup each natural loop to have a single exit block			; GCN-O3-NEXT: Fixup each natural loop to have a single exit block
				; GCN-O3-NEXT: Dominator Tree Construction
	; GCN-O3-NEXT: Post-Dominator Tree Construction			; GCN-O3-NEXT: Post-Dominator Tree Construction
	; GCN-O3-NEXT: Dominance Frontier Construction			; GCN-O3-NEXT: Dominance Frontier Construction
	; GCN-O3-NEXT: Detect single entry single exit regions			; GCN-O3-NEXT: Detect single entry single exit regions
	; GCN-O3-NEXT: Region Pass Manager			; GCN-O3-NEXT: Region Pass Manager
	; GCN-O3-NEXT: Structurize control flow			; GCN-O3-NEXT: Structurize control flow
				; GCN-O3-NEXT: Dominator Tree Construction
	; GCN-O3-NEXT: Post-Dominator Tree Construction			; GCN-O3-NEXT: Post-Dominator Tree Construction
	; GCN-O3-NEXT: Natural Loop Information			; GCN-O3-NEXT: Natural Loop Information
	; GCN-O3-NEXT: Legacy Divergence Analysis			; GCN-O3-NEXT: Legacy Divergence Analysis
	; GCN-O3-NEXT: Basic Alias Analysis (stateless AA impl)			; GCN-O3-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O3-NEXT: Function Alias Analysis Results			; GCN-O3-NEXT: Function Alias Analysis Results
	; GCN-O3-NEXT: Memory SSA			; GCN-O3-NEXT: Memory SSA
	; GCN-O3-NEXT: AMDGPU Annotate Uniform Values			; GCN-O3-NEXT: AMDGPU Annotate Uniform Values
	; GCN-O3-NEXT: SI annotate control flow			; GCN-O3-NEXT: SI annotate control flow

llvm/test/Other/new-pm-defaults.ll

	; CHECK-EP-VECTORIZER-START-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-VECTORIZER-START-NEXT: Running pass: NoOpFunctionPass
	; CHECK-EXT: Running pass: {{.*}}::Bye on foo			; CHECK-EXT: Running pass: {{.*}}::Bye on foo
	; CHECK-NOEXT: {{^}}			; CHECK-NOEXT: {{^}}
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O-NEXT: Running pass: LCSSAPass
	; CHECK-O-NEXT: Running pass: LoopRotatePass			; CHECK-O-NEXT: Running pass: LoopRotatePass
	; CHECK-O-NEXT: Running pass: LoopDistributePass			; CHECK-O-NEXT: Running pass: LoopDistributePass
	; CHECK-O-NEXT: Running pass: InjectTLIMappings			; CHECK-O-NEXT: Running pass: InjectTLIMappings
				; CHECK-O-NEXT: Running pass: LowerSwitchPass
				; CHECK-O-NEXT: Running analysis: LazyValueAnalysis
	; CHECK-O-NEXT: Running pass: LoopVectorizePass			; CHECK-O-NEXT: Running pass: LoopVectorizePass
	; CHECK-O-NEXT: Running analysis: BlockFrequencyAnalysis			; CHECK-O-NEXT: Running analysis: BlockFrequencyAnalysis
	; CHECK-O-NEXT: Running analysis: BranchProbabilityAnalysis			; CHECK-O-NEXT: Running analysis: BranchProbabilityAnalysis
	; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass			; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass
	; CHECK-O-NEXT: Running analysis: LoopAccessAnalysis			; CHECK-O-NEXT: Running analysis: LoopAccessAnalysis
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O2-NEXT: Running pass: SLPVectorizerPass			; CHECK-O2-NEXT: Running pass: SLPVectorizerPass

llvm/test/Other/new-pm-lto-defaults.ll

	; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis on foo			; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis on foo
	; CHECK-O23SZ-NEXT: Running pass: MergedLoadStoreMotionPass on foo			; CHECK-O23SZ-NEXT: Running pass: MergedLoadStoreMotionPass on foo
	; CHECK-O23SZ-NEXT: Running pass: LoopSimplifyPass on foo			; CHECK-O23SZ-NEXT: Running pass: LoopSimplifyPass on foo
	; CHECK-O23SZ-NEXT: Running pass: LCSSAPass on foo			; CHECK-O23SZ-NEXT: Running pass: LCSSAPass on foo
	; CHECK-O23SZ-NEXT: Running pass: IndVarSimplifyPass on Loop			; CHECK-O23SZ-NEXT: Running pass: IndVarSimplifyPass on Loop
	; CHECK-O23SZ-NEXT: Running pass: LoopDeletionPass on Loop			; CHECK-O23SZ-NEXT: Running pass: LoopDeletionPass on Loop
	; CHECK-O23SZ-NEXT: Running pass: LoopFullUnrollPass on Loop			; CHECK-O23SZ-NEXT: Running pass: LoopFullUnrollPass on Loop
	; CHECK-O23SZ-NEXT: Running pass: LoopDistributePass on foo			; CHECK-O23SZ-NEXT: Running pass: LoopDistributePass on foo
				; CHECK-O23SZ-NEXT: Running pass: LowerSwitchPass on foo
	; CHECK-O23SZ-NEXT: Running pass: LoopVectorizePass on foo			; CHECK-O23SZ-NEXT: Running pass: LoopVectorizePass on foo
	; CHECK-O23SZ-NEXT: Running analysis: BlockFrequencyAnalysis on foo			; CHECK-O23SZ-NEXT: Running analysis: BlockFrequencyAnalysis on foo
	; CHECK-O23SZ-NEXT: Running analysis: BranchProbabilityAnalysis on foo			; CHECK-O23SZ-NEXT: Running analysis: BranchProbabilityAnalysis on foo
	; CHECK-O23SZ-NEXT: Running analysis: DemandedBitsAnalysis on foo			; CHECK-O23SZ-NEXT: Running analysis: DemandedBitsAnalysis on foo
	; CHECK-O23SZ-NEXT: Running pass: LoopUnrollPass on foo			; CHECK-O23SZ-NEXT: Running pass: LoopUnrollPass on foo
	; CHECK-O23SZ-NEXT: WarnMissedTransformationsPass on foo			; CHECK-O23SZ-NEXT: WarnMissedTransformationsPass on foo
	; CHECK-O23SZ-NEXT: Running pass: InstCombinePass on foo			; CHECK-O23SZ-NEXT: Running pass: InstCombinePass on foo
	; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass on foo			; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass on foo

llvm/test/Other/new-pm-thinlto-defaults.ll

	; CHECK-POSTLINK-O-NEXT: Running pass: Float2IntPass			; CHECK-POSTLINK-O-NEXT: Running pass: Float2IntPass
	; CHECK-POSTLINK-O-NEXT: Running pass: LowerConstantIntrinsicsPass			; CHECK-POSTLINK-O-NEXT: Running pass: LowerConstantIntrinsicsPass
	; CHECK-EXT: Running pass: {{.*}}::Bye			; CHECK-EXT: Running pass: {{.*}}::Bye
	; CHECK-POSTLINK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-POSTLINK-O-NEXT: Running pass: LoopSimplifyPass
	; CHECK-POSTLINK-O-NEXT: Running pass: LCSSAPass			; CHECK-POSTLINK-O-NEXT: Running pass: LCSSAPass
	; CHECK-POSTLINK-O-NEXT: Running pass: LoopRotatePass			; CHECK-POSTLINK-O-NEXT: Running pass: LoopRotatePass
	; CHECK-POSTLINK-O-NEXT: Running pass: LoopDistributePass			; CHECK-POSTLINK-O-NEXT: Running pass: LoopDistributePass
	; CHECK-POSTLINK-O-NEXT: Running pass: InjectTLIMappings			; CHECK-POSTLINK-O-NEXT: Running pass: InjectTLIMappings
				; CHECK-POSTLINK-O-NEXT: Running pass: LowerSwitchPass
				; CHECK-POSTLINK-O-NEXT: Running analysis: LazyValueAnalysis
	; CHECK-POSTLINK-O-NEXT: Running pass: LoopVectorizePass			; CHECK-POSTLINK-O-NEXT: Running pass: LoopVectorizePass
	; CHECK-POSTLINK-O-NEXT: Running analysis: BlockFrequencyAnalysis			; CHECK-POSTLINK-O-NEXT: Running analysis: BlockFrequencyAnalysis
	; CHECK-POSTLINK-O-NEXT: Running analysis: BranchProbabilityAnalysis			; CHECK-POSTLINK-O-NEXT: Running analysis: BranchProbabilityAnalysis
	; CHECK-POSTLINK-O-NEXT: Running pass: LoopLoadEliminationPass			; CHECK-POSTLINK-O-NEXT: Running pass: LoopLoadEliminationPass
	; CHECK-POSTLINK-O-NEXT: Running analysis: LoopAccessAnalysis			; CHECK-POSTLINK-O-NEXT: Running analysis: LoopAccessAnalysis
	; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass			; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
	; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-POSTLINK-O2-NEXT: Running pass: SLPVectorizerPass			; CHECK-POSTLINK-O2-NEXT: Running pass: SLPVectorizerPass

llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll

	; CHECK-O-NEXT: Running pass: Float2IntPass			; CHECK-O-NEXT: Running pass: Float2IntPass
	; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass			; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass
	; CHECK-EXT: Running pass: {{.*}}::Bye			; CHECK-EXT: Running pass: {{.*}}::Bye
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass on foo			; CHECK-O-NEXT: Running pass: LoopSimplifyPass on foo
	; CHECK-O-NEXT: Running pass: LCSSAPass on foo			; CHECK-O-NEXT: Running pass: LCSSAPass on foo
	; CHECK-O-NEXT: Running pass: LoopRotatePass			; CHECK-O-NEXT: Running pass: LoopRotatePass
	; CHECK-O-NEXT: Running pass: LoopDistributePass			; CHECK-O-NEXT: Running pass: LoopDistributePass
	; CHECK-O-NEXT: Running pass: InjectTLIMappings			; CHECK-O-NEXT: Running pass: InjectTLIMappings
				; CHECK-O-NEXT: Running pass: LowerSwitchPass
				; CHECK-O-NEXT: Running analysis: LazyValueAnalysis
	; CHECK-O-NEXT: Running pass: LoopVectorizePass			; CHECK-O-NEXT: Running pass: LoopVectorizePass
	; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass			; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass
	; CHECK-O-NEXT: Running analysis: LoopAccessAnalysis			; CHECK-O-NEXT: Running analysis: LoopAccessAnalysis
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O2-NEXT: Running pass: SLPVectorizerPass			; CHECK-O2-NEXT: Running pass: SLPVectorizerPass
	; CHECK-O3-NEXT: Running pass: SLPVectorizerPass			; CHECK-O3-NEXT: Running pass: SLPVectorizerPass
	; CHECK-Os-NEXT: Running pass: SLPVectorizerPass			; CHECK-Os-NEXT: Running pass: SLPVectorizerPass

llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll

	; CHECK-O-NEXT: Running pass: Float2IntPass			; CHECK-O-NEXT: Running pass: Float2IntPass
	; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass			; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass
	; CHECK-EXT: Running pass: {{.*}}::Bye			; CHECK-EXT: Running pass: {{.*}}::Bye
	; CHECK-O-NEXT: Running pass: LoopSimplifyPass			; CHECK-O-NEXT: Running pass: LoopSimplifyPass
	; CHECK-O-NEXT: Running pass: LCSSAPass			; CHECK-O-NEXT: Running pass: LCSSAPass
	; CHECK-O-NEXT: Running pass: LoopRotatePass			; CHECK-O-NEXT: Running pass: LoopRotatePass
	; CHECK-O-NEXT: Running pass: LoopDistributePass			; CHECK-O-NEXT: Running pass: LoopDistributePass
	; CHECK-O-NEXT: Running pass: InjectTLIMappings			; CHECK-O-NEXT: Running pass: InjectTLIMappings
				; CHECK-O-NEXT: Running pass: LowerSwitchPass
				; CHECK-O-NEXT: Running analysis: LazyValueAnalysis
	; CHECK-O-NEXT: Running pass: LoopVectorizePass			; CHECK-O-NEXT: Running pass: LoopVectorizePass
	; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass			; CHECK-O-NEXT: Running pass: LoopLoadEliminationPass
	; CHECK-O-NEXT: Running analysis: LoopAccessAnalysis			; CHECK-O-NEXT: Running analysis: LoopAccessAnalysis
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O2-NEXT: Running pass: SLPVectorizerPass			; CHECK-O2-NEXT: Running pass: SLPVectorizerPass
	; CHECK-O3-NEXT: Running pass: SLPVectorizerPass			; CHECK-O3-NEXT: Running pass: SLPVectorizerPass
	; CHECK-Os-NEXT: Running pass: SLPVectorizerPass			; CHECK-Os-NEXT: Running pass: SLPVectorizerPass
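The pipeline tests above place LowerSwitchPass directly before LoopVectorizePass. As a rough illustration of what the summary calls "a series of compares and branches", the following C++ sketch writes one loop body both ways. The function names with_switch and with_compare_chain are hypothetical and do not appear in the patch. The arithmetic is modelled on the input IR of @switch_VF1_UF2 in sve-remove-switches.ll and is an assumption about the source that test was derived from, not something the patch states.

```cpp
// Switch form: the default case falls through into case 2, which falls
// through into case 3, so every path ends with the "c" update.
int with_switch(int a, int b, int c) {
  int acc = a;
  switch (a) {
  default:
    acc = a * 3;
    // fall through
  case 2:
    acc += b * 3;
    // fall through
  case 3:
    acc += c * 4;
  }
  return acc;
}

// Compare-and-branch form: the same control flow expressed as a chain of
// equality tests, which is the shape the summary says the vectorizer
// receives once simple switches have been lowered.
int with_compare_chain(int a, int b, int c) {
  int acc = a;
  if (a != 3) {
    if (a != 2)
      acc = a * 3;
    acc += b * 3;
  }
  acc += c * 4;
  return acc;
}
```

Both functions return the same value for any input that does not overflow. That equivalence is what lets SimplifyCFG fold the chain back into a switch when the vectorizer leaves the loop unchanged.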

llvm/test/Transforms/LoopVectorize/AArch64/sve-remove-switches.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -O3 -loop-vectorize -mtriple aarch64-linux-gnu -mattr=+sve -scalable-vectorization=on -S | FileCheck %s

				define void @switch(i32* noalias %a, i32* noalias %b, i32* noalias %c, i64 %N) #0 {
				; CHECK-LABEL: @switch(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
				; CHECK-NEXT: [[TMP1:%.*]] = shl i64 [[TMP0]], 2
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ugt i64 [[TMP1]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.vscale.i64()
				; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 2
				; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N]], [[TMP3]]
				; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[N]], [[N_MOD_VF]]
				; CHECK-NEXT: [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
				; CHECK-NEXT: [[TMP5:%.*]] = shl i64 [[TMP4]], 2
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <vscale x 4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[TMP7]], align 4
				; CHECK-NEXT: [[TMP8:%.*]] = icmp eq <vscale x 4 x i32> [[WIDE_LOAD]], shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 3, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP9:%.*]] = icmp eq <vscale x 4 x i32> [[WIDE_LOAD]], shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 2, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP10:%.*]] = icmp eq <vscale x 4 x i32> [[WIDE_LOAD]], shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 4, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP12:%.*]] = xor <vscale x 4 x i1> [[TMP9]], shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i32 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP13:%.*]] = select <vscale x 4 x i1> [[TMP8]], <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 false, i32 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer), <vscale x 4 x i1> [[TMP12]]
				; CHECK-NEXT: [[TMP14:%.*]] = xor <vscale x 4 x i1> [[TMP10]], shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i32 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP15:%.*]] = select <vscale x 4 x i1> [[TMP13]], <vscale x 4 x i1> [[TMP14]], <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 false, i32 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP16:%.*]] = bitcast i32* [[TMP11]] to <vscale x 4 x i32>*
				; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <vscale x 4 x i32> @llvm.masked.load.nxv4i32.p0nxv4i32(<vscale x 4 x i32>* [[TMP16]], i32 4, <vscale x 4 x i1> [[TMP15]], <vscale x 4 x i32> poison)
				; CHECK-NEXT: [[TMP17:%.*]] = mul nsw <vscale x 4 x i32> [[WIDE_MASKED_LOAD]], [[WIDE_LOAD]]
				; CHECK-NEXT: [[TMP18:%.*]] = add nsw <vscale x 4 x i32> [[TMP17]], [[WIDE_LOAD]]
				; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP20:%.*]] = select <vscale x 4 x i1> [[TMP13]], <vscale x 4 x i1> [[TMP10]], <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 false, i32 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP21:%.*]] = bitcast i32* [[TMP19]] to <vscale x 4 x i32>*
				; CHECK-NEXT: [[WIDE_MASKED_LOAD6:%.*]] = call <vscale x 4 x i32> @llvm.masked.load.nxv4i32.p0nxv4i32(<vscale x 4 x i32>* [[TMP21]], i32 4, <vscale x 4 x i1> [[TMP20]], <vscale x 4 x i32> poison)
				; CHECK-NEXT: [[TMP22:%.*]] = select <vscale x 4 x i1> [[TMP8]], <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 false, i32 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer), <vscale x 4 x i1> [[TMP9]]
				; CHECK-NEXT: [[TMP23:%.*]] = bitcast i32* [[TMP11]] to <vscale x 4 x i32>*
				; CHECK-NEXT: [[WIDE_MASKED_LOAD7:%.*]] = call <vscale x 4 x i32> @llvm.masked.load.nxv4i32.p0nxv4i32(<vscale x 4 x i32>* [[TMP23]], i32 4, <vscale x 4 x i1> [[TMP22]], <vscale x 4 x i32> poison)
				; CHECK-NEXT: [[PREDPHI:%.*]] = select <vscale x 4 x i1> [[TMP15]], <vscale x 4 x i32> [[WIDE_MASKED_LOAD]], <vscale x 4 x i32> [[WIDE_MASKED_LOAD7]]
				; CHECK-NEXT: [[PREDPHI8:%.*]] = select <vscale x 4 x i1> [[TMP15]], <vscale x 4 x i32> [[TMP18]], <vscale x 4 x i32> shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 2, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP24:%.*]] = mul nsw <vscale x 4 x i32> [[PREDPHI]], [[PREDPHI]]
				; CHECK-NEXT: [[TMP25:%.*]] = add nsw <vscale x 4 x i32> [[TMP24]], [[PREDPHI8]]
				; CHECK-NEXT: [[TMP26:%.*]] = or <vscale x 4 x i1> [[TMP22]], [[TMP15]]
				; CHECK-NEXT: [[PREDPHI9:%.*]] = select <vscale x 4 x i1> [[TMP26]], <vscale x 4 x i32> [[TMP25]], <vscale x 4 x i32> shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 3, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP27:%.*]] = or <vscale x 4 x i1> [[TMP8]], [[TMP26]]
				; CHECK-NEXT: [[TMP28:%.*]] = bitcast i32* [[TMP19]] to <vscale x 4 x i32>*
				; CHECK-NEXT: [[WIDE_MASKED_LOAD10:%.*]] = call <vscale x 4 x i32> @llvm.masked.load.nxv4i32.p0nxv4i32(<vscale x 4 x i32>* [[TMP28]], i32 4, <vscale x 4 x i1> [[TMP27]], <vscale x 4 x i32> poison)
				; CHECK-NEXT: [[TMP29:%.*]] = mul nsw <vscale x 4 x i32> [[WIDE_MASKED_LOAD10]], [[PREDPHI9]]
				; CHECK-NEXT: [[TMP30:%.*]] = add nsw <vscale x 4 x i32> [[TMP29]], [[PREDPHI9]]
				; CHECK-NEXT: [[PREDPHI11:%.*]] = select <vscale x 4 x i1> [[TMP27]], <vscale x 4 x i32> [[WIDE_MASKED_LOAD10]], <vscale x 4 x i32> [[WIDE_MASKED_LOAD6]]
				; CHECK-NEXT: [[PREDPHI12:%.*]] = select <vscale x 4 x i1> [[TMP27]], <vscale x 4 x i32> [[TMP30]], <vscale x 4 x i32> shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 4, i32 0), <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
				; CHECK-NEXT: [[TMP31:%.*]] = mul nsw <vscale x 4 x i32> [[PREDPHI11]], [[PREDPHI11]]
				; CHECK-NEXT: [[TMP32:%.*]] = add nsw <vscale x 4 x i32> [[TMP31]], [[PREDPHI12]]
				; CHECK-NEXT: [[TMP33:%.*]] = bitcast i32* [[TMP6]] to <vscale x 4 x i32>*
				; CHECK-NEXT: store <vscale x 4 x i32> [[TMP32]], <vscale x 4 x i32>* [[TMP33]], align 4
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP5]]
				; CHECK-NEXT: [[TMP34:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP34]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_MOD_VF]], 0
				; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[FOR_BODY_PREHEADER]]
				; CHECK: for.body.preheader:
				; CHECK-NEXT: [[I_PH:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L4:%.*]] ], [ [[I_PH]], [[FOR_BODY_PREHEADER]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[I]]
				; CHECK-NEXT: [[TMP35:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP35]], label [[L1:%.*]] [
				; CHECK-NEXT: i32 3, label [[L3:%.*]]
				; CHECK-NEXT: i32 2, label [[FOR_BODY_L2_CRIT_EDGE:%.*]]
				; CHECK-NEXT: i32 4, label [[FOR_BODY_L4_CRIT_EDGE:%.*]]
				; CHECK-NEXT: ]
				; CHECK: for.body.L4_crit_edge:
				; CHECK-NEXT: [[ARRAYIDX17_PHI_TRANS_INSERT:%.*]] = getelementptr inbounds i32, i32* [[C]], i64 [[I]]
				; CHECK-NEXT: [[DOTPRE1:%.*]] = load i32, i32* [[ARRAYIDX17_PHI_TRANS_INSERT]], align 4
				; CHECK-NEXT: br label [[L4]]
				; CHECK: for.body.L2_crit_edge:
				; CHECK-NEXT: [[ARRAYIDX7_PHI_TRANS_INSERT:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[I]]
				; CHECK-NEXT: [[DOTPRE:%.*]] = load i32, i32* [[ARRAYIDX7_PHI_TRANS_INSERT]], align 4
				; CHECK-NEXT: br label [[L2:%.*]]
				; CHECK: L1:
				; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[I]]
				; CHECK-NEXT: [[TMP36:%.*]] = load i32, i32* [[ARRAYIDX5]], align 4
				; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP36]], [[TMP35]]
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[MUL]], [[TMP35]]
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[TMP37:%.*]] = phi i32 [ [[DOTPRE]], [[FOR_BODY_L2_CRIT_EDGE]] ], [ [[TMP36]], [[L1]] ]
				; CHECK-NEXT: [[TMP38:%.*]] = phi i32 [ 2, [[FOR_BODY_L2_CRIT_EDGE]] ], [ [[ADD]], [[L1]] ]
				; CHECK-NEXT: [[MUL9:%.*]] = mul nsw i32 [[TMP37]], [[TMP37]]
				; CHECK-NEXT: [[ADD11:%.*]] = add nsw i32 [[MUL9]], [[TMP38]]
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[TMP39:%.*]] = phi i32 [ [[TMP35]], [[FOR_BODY]] ], [ [[ADD11]], [[L2]] ]
				; CHECK-NEXT: [[ARRAYIDX13:%.*]] = getelementptr inbounds i32, i32* [[C]], i64 [[I]]
				; CHECK-NEXT: [[TMP40:%.*]] = load i32, i32* [[ARRAYIDX13]], align 4
				; CHECK-NEXT: [[MUL14:%.*]] = mul nsw i32 [[TMP40]], [[TMP39]]
				; CHECK-NEXT: [[ADD16:%.*]] = add nsw i32 [[MUL14]], [[TMP39]]
				; CHECK-NEXT: br label [[L4]]
				; CHECK: L4:
				; CHECK-NEXT: [[TMP41:%.*]] = phi i32 [ [[DOTPRE1]], [[FOR_BODY_L4_CRIT_EDGE]] ], [ [[TMP40]], [[L3]] ]
				; CHECK-NEXT: [[TMP42:%.*]] = phi i32 [ 4, [[FOR_BODY_L4_CRIT_EDGE]] ], [ [[ADD16]], [[L3]] ]
				; CHECK-NEXT: [[MUL19:%.*]] = mul nsw i32 [[TMP41]], [[TMP41]]
				; CHECK-NEXT: [[ADD21:%.*]] = add nsw i32 [[MUL19]], [[TMP42]]
				; CHECK-NEXT: store i32 [[ADD21]], i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]
				; CHECK: for.end.loopexit:
				; CHECK-NEXT: br label [[FOR_END]]
				; CHECK: for.end:
				; CHECK-NEXT: ret void
				;

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L4 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 4, label %L4
				i32 2, label %L2
				i32 3, label %L3
				]

				L1:
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx5
				%mul = mul nsw i32 %1, %0
				%add = add nsw i32 %mul, %0
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%2 = phi i32 [ 2, %for.body ], [ %add, %L1 ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%3 = load i32, i32* %arrayidx7
				%mul9 = mul nsw i32 %3, %3
				%add11 = add nsw i32 %2, %mul9
				store i32 %add11, i32* %arrayidx
				br label %L3

				L3:
				%4 = phi i32 [ 3, %for.body ], [ %add11, %L2 ]
				%arrayidx13 = getelementptr inbounds i32, i32* %c, i64 %i
				%5 = load i32, i32* %arrayidx13
				%mul14 = mul nsw i32 %5, %4
				%add16 = add nsw i32 %mul14, %4
				store i32 %add16, i32* %arrayidx
				br label %L4

				L4:
				%6 = phi i32 [ 4, %for.body ], [ %add16, %L3 ]
				%arrayidx17 = getelementptr inbounds i32, i32* %c, i64 %i
				%7 = load i32, i32* %arrayidx17
				%mul19 = mul nsw i32 %7, %7
				%add21 = add nsw i32 %6, %mul19
				store i32 %add21, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret void
				}

				define void @switch_VF1_UF2(i32* noalias %a, i32* noalias %b, i32* noalias %c, i64 %N) #0 {
				; CHECK-LABEL: @switch_VF1_UF2(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N:%.*]], 2
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[N]], -2
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE6:%.*]] ]
				; CHECK-NEXT: [[INDUCTION3:%.*]] = or i64 [[INDEX]], 1
				; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP1:%.*]] = bitcast i32* [[TMP0]] to <2 x i32>*
				; CHECK-NEXT: [[TMP2:%.*]] = load <2 x i32>, <2 x i32>* [[TMP1]], align 4
				; CHECK-NEXT: [[TMP3:%.*]] = icmp eq <2 x i32> [[TMP2]], <i32 2, i32 2>
				; CHECK-NEXT: [[TMP4:%.*]] = icmp eq <2 x i32> [[TMP2]], <i32 3, i32 3>
				; CHECK-NEXT: [[TMP5:%.*]] = mul nsw <2 x i32> [[TMP2]], <i32 3, i32 3>
				; CHECK-NEXT: [[TMP6:%.*]] = select <2 x i1> [[TMP3]], <2 x i32> <i32 2, i32 2>, <2 x i32> [[TMP5]]
				; CHECK-NEXT: [[TMP7:%.*]] = extractelement <2 x i1> [[TMP4]], i32 0
				; CHECK-NEXT: br i1 [[TMP7]], label [[PRED_LOAD_CONTINUE:%.*]], label [[PRED_LOAD_IF:%.*]]
				; CHECK: pred.load.if:
				; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[TMP8]], align 4
				; CHECK-NEXT: br label [[PRED_LOAD_CONTINUE]]
				; CHECK: pred.load.continue:
				; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP9]], [[PRED_LOAD_IF]] ]
				; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i1> [[TMP4]], i32 1
				; CHECK-NEXT: br i1 [[TMP11]], label [[PRED_LOAD_CONTINUE6]], label [[PRED_LOAD_IF5:%.*]]
				; CHECK: pred.load.if5:
				; CHECK-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[INDUCTION3]]
				; CHECK-NEXT: [[TMP13:%.*]] = load i32, i32* [[TMP12]], align 4
				; CHECK-NEXT: br label [[PRED_LOAD_CONTINUE6]]
				; CHECK: pred.load.continue6:
				; CHECK-NEXT: [[TMP14:%.*]] = phi i32 [ poison, [[PRED_LOAD_CONTINUE]] ], [ [[TMP13]], [[PRED_LOAD_IF5]] ]
				; CHECK-NEXT: [[TMP15:%.*]] = insertelement <2 x i32> poison, i32 [[TMP10]], i32 0
				; CHECK-NEXT: [[TMP16:%.*]] = insertelement <2 x i32> [[TMP15]], i32 [[TMP14]], i32 1
				; CHECK-NEXT: [[TMP17:%.*]] = mul nsw <2 x i32> [[TMP16]], <i32 3, i32 3>
				; CHECK-NEXT: [[TMP18:%.*]] = add nsw <2 x i32> [[TMP17]], [[TMP6]]
				; CHECK-NEXT: [[TMP19:%.*]] = select <2 x i1> [[TMP4]], <2 x i32> <i32 3, i32 3>, <2 x i32> [[TMP18]]
				; CHECK-NEXT: [[TMP20:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP21:%.*]] = bitcast i32* [[TMP20]] to <2 x i32>*
				; CHECK-NEXT: [[TMP22:%.*]] = load <2 x i32>, <2 x i32>* [[TMP21]], align 4
				; CHECK-NEXT: [[TMP23:%.*]] = shl nsw <2 x i32> [[TMP22]], <i32 2, i32 2>
				; CHECK-NEXT: [[TMP24:%.*]] = add nsw <2 x i32> [[TMP23]], [[TMP19]]
				; CHECK-NEXT: [[TMP25:%.*]] = bitcast i32* [[TMP0]] to <2 x i32>*
				; CHECK-NEXT: store <2 x i32> [[TMP24]], <2 x i32>* [[TMP25]], align 4
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
				; CHECK-NEXT: [[TMP26:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP26]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[N]]
				; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[FOR_BODY_PREHEADER]]
				; CHECK: for.body.preheader:
				; CHECK-NEXT: [[I_PH:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ [[I_PH]], [[FOR_BODY_PREHEADER]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[I]]
				; CHECK-NEXT: [[TMP27:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP27]], label [[FOR_BODY_SWITCH2:%.*]] [
				; CHECK-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-NEXT: i32 3, label [[L3]]
				; CHECK-NEXT: ]
				; CHECK: for.body.switch2:
				; CHECK-NEXT: [[ADD:%.*]] = mul nsw i32 [[TMP27]], 3
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[TMP28:%.*]] = phi i32 [ [[ADD]], [[FOR_BODY_SWITCH2]] ], [ [[TMP27]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[I]]
				; CHECK-NEXT: [[TMP29:%.*]] = load i32, i32* [[ARRAYIDX5]], align 4
				; CHECK-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP29]], 3
				; CHECK-NEXT: [[ADD8:%.*]] = add nsw i32 [[MUL6]], [[TMP28]]
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[TMP30:%.*]] = phi i32 [ [[ADD8]], [[L2]] ], [ [[TMP27]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX9:%.*]] = getelementptr inbounds i32, i32* [[C]], i64 [[I]]
				; CHECK-NEXT: [[TMP31:%.*]] = load i32, i32* [[ARRAYIDX9]], align 4
				; CHECK-NEXT: [[MUL10:%.*]] = shl nsw i32 [[TMP31]], 2
				; CHECK-NEXT: [[ADD12:%.*]] = add nsw i32 [[MUL10]], [[TMP30]]
				; CHECK-NEXT: store i32 [[ADD12]], i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
				; CHECK: for.end.loopexit:
				; CHECK-NEXT: br label [[FOR_END]]
				; CHECK: for.end:
				; CHECK-NEXT: ret void
				;

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				%switch = icmp eq i32 %0, 3
				br i1 %switch, label %L3, label %for.body.switch

				for.body.switch:
				%switch1 = icmp eq i32 %0, 2
				br i1 %switch1, label %L2, label %for.body.switch2

				for.body.switch2:
				%add = mul nsw i32 %0, 3
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%1 = phi i32 [ %add, %for.body.switch2 ], [ %0, %for.body.switch ]
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%2 = load i32, i32* %arrayidx5
				%mul6 = mul nsw i32 %2, 3
				%add8 = add nsw i32 %1, %mul6
				store i32 %add8, i32* %arrayidx
				br label %L3

				L3:
				%3 = phi i32 [ %0, %for.body ], [ %add8, %L2 ]
				%arrayidx9 = getelementptr inbounds i32, i32* %c, i64 %i
				%4 = load i32, i32* %arrayidx9
				%mul10 = shl nsw i32 %4, 2
				%add12 = add nsw i32 %3, %mul10
				store i32 %add12, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !0

				for.end:
				ret void
				}

				; This loop will not vectorize due to unsafe FP ops; check that the switch statement is recreated in for.body.
				define float @switch_no_vectorize(i32* noalias %a, i32* noalias %b, i32* noalias %c, float %val, i64 %N) {
				; CHECK-LABEL: @switch_no_vectorize(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[SUM_033:%.*]] = phi float [ [[CONV20:%.*]], [[L3]] ], [ 2.000000e+00, [[ENTRY]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP0]], label [[L1:%.*]] [
				; CHECK-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-NEXT: i32 3, label [[L3]]
				; CHECK-NEXT: ]
				; CHECK: L1:
				; CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP0]] to float
				; CHECK-NEXT: [[CONV4:%.*]] = fpext float [[CONV]] to double
				; CHECK-NEXT: [[ADD:%.*]] = fadd double [[CONV4]], 1.000000e+00
				; CHECK-NEXT: [[CONV5:%.*]] = fpext float [[SUM_033]] to double
				; CHECK-NEXT: [[MUL:%.*]] = fmul double [[ADD]], [[CONV5]]
				; CHECK-NEXT: [[CONV6:%.*]] = fptrunc double [[MUL]] to float
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[SUM_1:%.*]] = phi float [ [[CONV6]], [[L1]] ], [ [[SUM_033]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX7]], align 4
				; CHECK-NEXT: [[CONV8:%.*]] = sitofp i32 [[TMP1]] to float
				; CHECK-NEXT: [[CONV9:%.*]] = fpext float [[CONV8]] to double
				; CHECK-NEXT: [[ADD10:%.*]] = fadd double [[CONV9]], 2.000000e+00
				; CHECK-NEXT: [[CONV11:%.*]] = fpext float [[SUM_1]] to double
				; CHECK-NEXT: [[MUL12:%.*]] = fmul double [[ADD10]], [[CONV11]]
				; CHECK-NEXT: [[CONV13:%.*]] = fptrunc double [[MUL12]] to float
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[SUM_2:%.*]] = phi float [ [[CONV13]], [[L2]] ], [ [[SUM_033]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX14:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[ARRAYIDX14]], align 4
				; CHECK-NEXT: [[CONV15:%.*]] = sitofp i32 [[TMP2]] to float
				; CHECK-NEXT: [[CONV16:%.*]] = fpext float [[CONV15]] to double
				; CHECK-NEXT: [[ADD17:%.*]] = fadd double [[CONV16]], 3.000000e+00
				; CHECK-NEXT: [[CONV18:%.*]] = fpext float [[SUM_2]] to double
				; CHECK-NEXT: [[MUL19:%.*]] = fmul double [[ADD17]], [[CONV18]]
				; CHECK-NEXT: [[CONV20]] = fptrunc double [[MUL19]] to float
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END:%.*]], label [[FOR_BODY]]
				; CHECK: for.end:
				; CHECK-NEXT: [[CONV20_LCSSA:%.*]] = phi float [ [[CONV20]], [[L3]] ]
				; CHECK-NEXT: ret float [[CONV20_LCSSA]]
				;

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
				%sum.033 = phi float [ %conv20, %L3 ], [ 2.000000e+00, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 3, label %L3
				i32 2, label %L2
				]

				L1:
				%conv = sitofp i32 %0 to float
				%conv4 = fpext float %conv to double
				%add = fadd double %conv4, 1.000000e+00
				%conv5 = fpext float %sum.033 to double
				%mul = fmul double %add, %conv5
				%conv6 = fptrunc double %mul to float
				br label %L2

				L2:
				%sum.1 = phi float [ %conv6, %L1 ], [ %sum.033, %for.body ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx7
				%conv8 = sitofp i32 %1 to float
				%conv9 = fpext float %conv8 to double
				%add10 = fadd double %conv9, 2.000000e+00
				%conv11 = fpext float %sum.1 to double
				%mul12 = fmul double %add10, %conv11
				%conv13 = fptrunc double %mul12 to float
				br label %L3

				L3:
				%sum.2 = phi float [ %conv13, %L2 ], [ %sum.033, %for.body ]
				%arrayidx14 = getelementptr inbounds i32, i32* %c, i64 %i
				%2 = load i32, i32* %arrayidx14
				%conv15 = sitofp i32 %2 to float
				%conv16 = fpext float %conv15 to double
				%add17 = fadd double %conv16, 3.000000e+00
				%conv18 = fpext float %sum.2 to double
				%mul19 = fmul double %add17, %conv18
				%conv20 = fptrunc double %mul19 to float
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret float %conv20
				}

				!0 = distinct !{!0, !1, !2, !3, !4}
				!1 = !{!"llvm.loop.vectorize.width", i32 1}
				!2 = !{!"llvm.loop.interleave.count", i32 2}
				!3 = !{!"llvm.loop.vectorize.enable", i1 true}
				!4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
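
The comment on @switch_no_vectorize attributes the failure to vectorize to unsafe FP ops. One plausible reading, which is an assumption here and not something the patch states, is that the float recurrence on %sum.033 carries no fast-math flags, so its operations may not be reordered. The C sketch below only illustrates why reordering floating-point operations can change a result; the function names are hypothetical and are not part of the patch.

```c
#include <assert.h>

/* Add three floats left to right, as a scalar loop would. */
static float sum_in_order(float x, float y, float z) {
    volatile float t = x + y;   /* volatile forces the partial sum to be rounded to float */
    return t + z;
}

/* Add the same three floats with the last two grouped first,
 * as a reassociating transformation might. */
static float sum_reassociated(float x, float y, float z) {
    volatile float t = y + z;   /* a small z is absorbed when y is huge */
    return x + t;
}

/* Self-check: the two groupings disagree for these inputs. */
static void check_reordering_changes_result(void) {
    assert(sum_in_order(1e20f, -1e20f, 1.0f) == 1.0f);
    assert(sum_reassociated(1e20f, -1e20f, 1.0f) == 0.0f);
}
```

Calling `check_reordering_changes_result()` from a `main` exercises both groupings. Since the two results differ, a compiler needs explicit permission (for example fast-math flags) before reordering such operations.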

llvm/test/Transforms/LoopVectorize/remove-switches.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -O3 -loop-vectorize -pass-remarks-analysis=loop-vectorize -S 2>%t | FileCheck %s
				; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARKS

				; We should not vectorize this loop since we do not have masked loads and stores
				; CHECK-REMARKS: remark: <unknown>:0:0: the cost-model indicates that vectorization is not beneficial
				define void @switch_cost(i32* noalias %a, i32* noalias readonly %b, i32* noalias readonly %c, i64 %N) #0 {
				; CHECK-LABEL: @switch_cost(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L4:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP0]], label [[L1:%.*]] [
				; CHECK-NEXT: i32 3, label [[L3:%.*]]
				; CHECK-NEXT: i32 2, label [[FOR_BODY_L2_CRIT_EDGE:%.*]]
				; CHECK-NEXT: i32 4, label [[FOR_BODY_L4_CRIT_EDGE:%.*]]
				; CHECK-NEXT: ]
				; CHECK: for.body.L4_crit_edge:
				; CHECK-NEXT: [[ARRAYIDX17_PHI_TRANS_INSERT:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 [[I]]
				; CHECK-NEXT: [[DOTPRE1:%.*]] = load i32, i32* [[ARRAYIDX17_PHI_TRANS_INSERT]], align 4
				; CHECK-NEXT: br label [[L4]]
				; CHECK: for.body.L2_crit_edge:
				; CHECK-NEXT: [[ARRAYIDX7_PHI_TRANS_INSERT:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[I]]
				; CHECK-NEXT: [[DOTPRE:%.*]] = load i32, i32* [[ARRAYIDX7_PHI_TRANS_INSERT]], align 4
				; CHECK-NEXT: br label [[L2:%.*]]
				; CHECK: L1:
				; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[I]]
				; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX5]], align 4
				; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP1]], [[TMP0]]
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[MUL]], [[TMP0]]
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[TMP2:%.*]] = phi i32 [ [[DOTPRE]], [[FOR_BODY_L2_CRIT_EDGE]] ], [ [[TMP1]], [[L1]] ]
				; CHECK-NEXT: [[TMP3:%.*]] = phi i32 [ 2, [[FOR_BODY_L2_CRIT_EDGE]] ], [ [[ADD]], [[L1]] ]
				; CHECK-NEXT: [[MUL9:%.*]] = mul nsw i32 [[TMP2]], [[TMP2]]
				; CHECK-NEXT: [[ADD11:%.*]] = add nsw i32 [[MUL9]], [[TMP3]]
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[TMP4:%.*]] = phi i32 [ [[TMP0]], [[FOR_BODY]] ], [ [[ADD11]], [[L2]] ]
				; CHECK-NEXT: [[ARRAYIDX13:%.*]] = getelementptr inbounds i32, i32* [[C]], i64 [[I]]
				; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ARRAYIDX13]], align 4
				; CHECK-NEXT: [[MUL14:%.*]] = mul nsw i32 [[TMP5]], [[TMP4]]
				; CHECK-NEXT: [[ADD16:%.*]] = add nsw i32 [[MUL14]], [[TMP4]]
				; CHECK-NEXT: br label [[L4]]
				; CHECK: L4:
				; CHECK-NEXT: [[TMP6:%.*]] = phi i32 [ [[DOTPRE1]], [[FOR_BODY_L4_CRIT_EDGE]] ], [ [[TMP5]], [[L3]] ]
				; CHECK-NEXT: [[TMP7:%.*]] = phi i32 [ 4, [[FOR_BODY_L4_CRIT_EDGE]] ], [ [[ADD16]], [[L3]] ]
				; CHECK-NEXT: [[MUL19:%.*]] = mul nsw i32 [[TMP6]], [[TMP6]]
				; CHECK-NEXT: [[ADD21:%.*]] = add nsw i32 [[MUL19]], [[TMP7]]
				; CHECK-NEXT: store i32 [[ADD21]], i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END:%.*]], label [[FOR_BODY]]
				; CHECK: for.end:
				; CHECK-NEXT: ret void
				;

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L4 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 4, label %L4
				i32 2, label %L2
				i32 3, label %L3
				]

				L1:
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx5
				%mul = mul nsw i32 %1, %0
				%add = add nsw i32 %mul, %0
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%2 = phi i32 [ 2, %for.body ], [ %add, %L1 ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%3 = load i32, i32* %arrayidx7
				%mul9 = mul nsw i32 %3, %3
				%add11 = add nsw i32 %2, %mul9
				store i32 %add11, i32* %arrayidx
				br label %L3

				L3:
				%4 = phi i32 [ 3, %for.body ], [ %add11, %L2 ]
				%arrayidx13 = getelementptr inbounds i32, i32* %c, i64 %i
				%5 = load i32, i32* %arrayidx13
				%mul14 = mul nsw i32 %5, %4
				%add16 = add nsw i32 %mul14, %4
				store i32 %add16, i32* %arrayidx
				br label %L4

				L4:
				%6 = phi i32 [ 4, %for.body ], [ %add16, %L3 ]
				%arrayidx17 = getelementptr inbounds i32, i32* %c, i64 %i
				%7 = load i32, i32* %arrayidx17
				%mul19 = mul nsw i32 %7, %7
				%add21 = add nsw i32 %6, %mul19
				store i32 %add21, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret void
				}

				define void @switch(i32* noalias %a, i32* noalias %b, i64 %N) {
				; CHECK-LABEL: @switch(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[CMP14:%.*]] = icmp sgt i64 [[N:%.*]], 0
				; CHECK-NEXT: br i1 [[CMP14]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_COND_CLEANUP:%.*]]
				; CHECK: for.body.preheader:
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N]], 4
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER5:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[N]], -4
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP1:%.*]] = bitcast i32* [[TMP0]] to <4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP1]], align 4
				; CHECK-NEXT: [[TMP2:%.*]] = icmp eq <4 x i32> [[WIDE_LOAD]], <i32 2, i32 2, i32 2, i32 2>
				; CHECK-NEXT: [[TMP3:%.*]] = icmp eq <4 x i32> [[WIDE_LOAD]], <i32 3, i32 3, i32 3, i32 3>
				; CHECK-NEXT: [[PREDPHI_OP:%.*]] = select <4 x i1> [[TMP2]], <4 x i32> <i32 9, i32 9, i32 9, i32 9>, <4 x i32> <i32 16, i32 16, i32 16, i32 16>
				; CHECK-NEXT: [[TMP4:%.*]] = select <4 x i1> [[TMP3]], <4 x i32> <i32 7, i32 7, i32 7, i32 7>, <4 x i32> [[PREDPHI_OP]]
				; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP6:%.*]] = bitcast i32* [[TMP5]] to <4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD4:%.*]] = load <4 x i32>, <4 x i32>* [[TMP6]], align 4
				; CHECK-NEXT: [[TMP7:%.*]] = mul nsw <4 x i32> [[WIDE_LOAD4]], [[TMP4]]
				; CHECK-NEXT: [[TMP8:%.*]] = bitcast i32* [[TMP5]] to <4 x i32>*
				; CHECK-NEXT: store <4 x i32> [[TMP7]], <4 x i32>* [[TMP8]], align 4
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[TMP9:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP9]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[N]]
				; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY_PREHEADER5]]
				; CHECK: for.body.preheader5:
				; CHECK-NEXT: [[I_015_PH:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.cond.cleanup.loopexit:
				; CHECK-NEXT: br label [[FOR_COND_CLEANUP]]
				; CHECK: for.cond.cleanup:
				; CHECK-NEXT: ret void
				; CHECK: for.body:
				; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ [[I_015_PH]], [[FOR_BODY_PREHEADER5]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[I_015]]
				; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP10]], label [[L1:%.*]] [
				; CHECK-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-NEXT: i32 3, label [[L3]]
				; CHECK-NEXT: ]
				; CHECK: L1:
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[R_0:%.*]] = phi i32 [ 12, [[L1]] ], [ 5, [[FOR_BODY]] ]
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[R_1:%.*]] = phi i32 [ [[R_0]], [[L2]] ], [ [[TMP10]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ADD4:%.*]] = add nuw nsw i32 [[R_1]], 4
				; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[I_015]]
				; CHECK-NEXT: [[TMP11:%.*]] = load i32, i32* [[ARRAYIDX5]], align 4
				; CHECK-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP11]], [[ADD4]]
				; CHECK-NEXT: store i32 [[MUL]], i32* [[ARRAYIDX5]], align 4
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_015]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]
				;

				entry:
				%cmp14 = icmp sgt i64 %N, 0
				br i1 %cmp14, label %for.body.preheader, label %for.cond.cleanup

				for.body.preheader: ; preds = %entry
				br label %for.body

				for.cond.cleanup.loopexit: ; preds = %L3
				br label %for.cond.cleanup

				for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
				ret void

				for.body: ; preds = %for.body.preheader, %L3
				%i.015 = phi i64 [ %inc, %L3 ], [ 0, %for.body.preheader ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.015
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 3, label %L3
				i32 2, label %L2
				]

				L1: ; preds = %for.body
				br label %L2

				L2: ; preds = %for.body, %L1
				%r.0 = phi i32 [ 12, %L1 ], [ 5, %for.body ]
				br label %L3

				L3: ; preds = %for.body, %L2
				%r.1 = phi i32 [ %r.0, %L2 ], [ 3, %for.body ]
				%add4 = add nuw nsw i32 %r.1, 4
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i.015
				%1 = load i32, i32* %arrayidx5
				%mul = mul nsw i32 %1, %add4
				store i32 %mul, i32* %arrayidx5
				%inc = add nuw nsw i64 %i.015, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.cond.cleanup.loopexit, label %for.body, !llvm.loop !0
				}

				define void @switch_VF1_UF2(i32* noalias %a, i32* noalias readonly %b, i32* noalias readonly %c, i64 %N) {
				; CHECK-LABEL: @switch_VF1_UF2(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N:%.*]], 2
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[N]], -2
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE6:%.*]] ]
				; CHECK-NEXT: [[INDUCTION3:%.*]] = or i64 [[INDEX]], 1
				; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[INDUCTION3]]
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[TMP0]], align 4
				; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[TMP1]], align 4
				; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i32 [[TMP2]], 2
				; CHECK-NEXT: [[TMP5:%.*]] = icmp eq i32 [[TMP3]], 2
				; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i32 [[TMP2]], 3
				; CHECK-NEXT: [[DOTNOT9:%.*]] = icmp eq i32 [[TMP3]], 3
				; CHECK-NEXT: [[TMP6:%.*]] = mul nsw i32 [[TMP2]], 3
				; CHECK-NEXT: [[TMP7:%.*]] = mul nsw i32 [[TMP3]], 3
				; CHECK-NEXT: [[PREDPHI:%.*]] = select i1 [[TMP4]], i32 2, i32 [[TMP6]]
				; CHECK-NEXT: [[PREDPHI4:%.*]] = select i1 [[TMP5]], i32 2, i32 [[TMP7]]
				; CHECK-NEXT: br i1 [[DOTNOT]], label [[PRED_LOAD_CONTINUE:%.*]], label [[PRED_LOAD_IF:%.*]]
				; CHECK: pred.load.if:
				; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[TMP8]], align 4
				; CHECK-NEXT: br label [[PRED_LOAD_CONTINUE]]
				; CHECK: pred.load.continue:
				; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP9]], [[PRED_LOAD_IF]] ]
				; CHECK-NEXT: br i1 [[DOTNOT9]], label [[PRED_LOAD_CONTINUE6]], label [[PRED_LOAD_IF5:%.*]]
				; CHECK: pred.load.if5:
				; CHECK-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[INDUCTION3]]
				; CHECK-NEXT: [[TMP12:%.*]] = load i32, i32* [[TMP11]], align 4
				; CHECK-NEXT: br label [[PRED_LOAD_CONTINUE6]]
				; CHECK: pred.load.continue6:
				; CHECK-NEXT: [[TMP13:%.*]] = phi i32 [ poison, [[PRED_LOAD_CONTINUE]] ], [ [[TMP12]], [[PRED_LOAD_IF5]] ]
				; CHECK-NEXT: [[TMP14:%.*]] = mul nsw i32 [[TMP10]], 3
				; CHECK-NEXT: [[TMP15:%.*]] = mul nsw i32 [[TMP13]], 3
				; CHECK-NEXT: [[TMP16:%.*]] = add nsw i32 [[TMP14]], [[PREDPHI]]
				; CHECK-NEXT: [[TMP17:%.*]] = add nsw i32 [[TMP15]], [[PREDPHI4]]
				; CHECK-NEXT: [[PREDPHI7:%.*]] = select i1 [[DOTNOT]], i32 3, i32 [[TMP16]]
				; CHECK-NEXT: [[PREDPHI8:%.*]] = select i1 [[DOTNOT9]], i32 3, i32 [[TMP17]]
				; CHECK-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, i32* [[C]], i64 [[INDUCTION3]]
				; CHECK-NEXT: [[TMP20:%.*]] = load i32, i32* [[TMP18]], align 4
				; CHECK-NEXT: [[TMP21:%.*]] = load i32, i32* [[TMP19]], align 4
				; CHECK-NEXT: [[TMP22:%.*]] = shl nsw i32 [[TMP20]], 2
				; CHECK-NEXT: [[TMP23:%.*]] = shl nsw i32 [[TMP21]], 2
				; CHECK-NEXT: [[TMP24:%.*]] = add nsw i32 [[TMP22]], [[PREDPHI7]]
				; CHECK-NEXT: [[TMP25:%.*]] = add nsw i32 [[TMP23]], [[PREDPHI8]]
				; CHECK-NEXT: store i32 [[TMP24]], i32* [[TMP0]], align 4
				; CHECK-NEXT: store i32 [[TMP25]], i32* [[TMP1]], align 4
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
				; CHECK-NEXT: [[TMP26:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP26]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[N]]
				; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[FOR_BODY_PREHEADER]]
				; CHECK: for.body.preheader:
				; CHECK-NEXT: [[I_PH:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ [[I_PH]], [[FOR_BODY_PREHEADER]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[I]]
				; CHECK-NEXT: [[TMP27:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP27]], label [[FOR_BODY_SWITCH2:%.*]] [
				; CHECK-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-NEXT: i32 3, label [[L3]]
				; CHECK-NEXT: ]
				; CHECK: for.body.switch2:
				; CHECK-NEXT: [[ADD:%.*]] = mul nsw i32 [[TMP27]], 3
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[TMP28:%.*]] = phi i32 [ [[ADD]], [[FOR_BODY_SWITCH2]] ], [ [[TMP27]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, i32* [[B]], i64 [[I]]
				; CHECK-NEXT: [[TMP29:%.*]] = load i32, i32* [[ARRAYIDX5]], align 4
				; CHECK-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP29]], 3
				; CHECK-NEXT: [[ADD8:%.*]] = add nsw i32 [[MUL6]], [[TMP28]]
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[TMP30:%.*]] = phi i32 [ [[ADD8]], [[L2]] ], [ [[TMP27]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX9:%.*]] = getelementptr inbounds i32, i32* [[C]], i64 [[I]]
				; CHECK-NEXT: [[TMP31:%.*]] = load i32, i32* [[ARRAYIDX9]], align 4
				; CHECK-NEXT: [[MUL10:%.*]] = shl nsw i32 [[TMP31]], 2
				; CHECK-NEXT: [[ADD12:%.*]] = add nsw i32 [[MUL10]], [[TMP30]]
				; CHECK-NEXT: store i32 [[ADD12]], i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
				; CHECK: for.end.loopexit:
				; CHECK-NEXT: br label [[FOR_END]]
				; CHECK: for.end:
				; CHECK-NEXT: ret void
				;

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				%switch = icmp eq i32 %0, 3
				br i1 %switch, label %L3, label %for.body.switch

				for.body.switch:
				%switch1 = icmp eq i32 %0, 2
				br i1 %switch1, label %L2, label %for.body.switch2

				for.body.switch2:
				%add = mul nsw i32 %0, 3
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%1 = phi i32 [ %add, %for.body.switch2 ], [ %0, %for.body.switch ]
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%2 = load i32, i32* %arrayidx5
				%mul6 = mul nsw i32 %2, 3
				%add8 = add nsw i32 %1, %mul6
				store i32 %add8, i32* %arrayidx
				br label %L3

				L3:
				%3 = phi i32 [ %0, %for.body ], [ %add8, %L2 ]
				%arrayidx9 = getelementptr inbounds i32, i32* %c, i64 %i
				%4 = load i32, i32* %arrayidx9
				%mul10 = shl nsw i32 %4, 2
				%add12 = add nsw i32 %3, %mul10
				store i32 %add12, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !1

				for.end:
				ret void
				}

				; This loop will not vectorize due to unsafe FP ops. Ensure the switch statement is created again in for.body.
				define float @switch_no_vectorize(i32* noalias %a, i32* noalias readonly %b, i32* noalias readonly %c, float %val, i64 %N) {
				; CHECK-LABEL: @switch_no_vectorize(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[SUM_033:%.*]] = phi float [ [[CONV20:%.*]], [[L3]] ], [ 2.000000e+00, [[ENTRY]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP0]], label [[L1:%.*]] [
				; CHECK-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-NEXT: i32 3, label [[L3]]
				; CHECK-NEXT: ]
				; CHECK: L1:
				; CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP0]] to float
				; CHECK-NEXT: [[CONV4:%.*]] = fpext float [[CONV]] to double
				; CHECK-NEXT: [[ADD:%.*]] = fadd double [[CONV4]], 1.000000e+00
				; CHECK-NEXT: [[CONV5:%.*]] = fpext float [[SUM_033]] to double
				; CHECK-NEXT: [[MUL:%.*]] = fmul double [[ADD]], [[CONV5]]
				; CHECK-NEXT: [[CONV6:%.*]] = fptrunc double [[MUL]] to float
				; CHECK-NEXT: br label [[L2]]
				; CHECK: L2:
				; CHECK-NEXT: [[SUM_1:%.*]] = phi float [ [[CONV6]], [[L1]] ], [ [[SUM_033]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX7]], align 4
				; CHECK-NEXT: [[CONV8:%.*]] = sitofp i32 [[TMP1]] to float
				; CHECK-NEXT: [[CONV9:%.*]] = fpext float [[CONV8]] to double
				; CHECK-NEXT: [[ADD10:%.*]] = fadd double [[CONV9]], 2.000000e+00
				; CHECK-NEXT: [[CONV11:%.*]] = fpext float [[SUM_1]] to double
				; CHECK-NEXT: [[MUL12:%.*]] = fmul double [[ADD10]], [[CONV11]]
				; CHECK-NEXT: [[CONV13:%.*]] = fptrunc double [[MUL12]] to float
				; CHECK-NEXT: br label [[L3]]
				; CHECK: L3:
				; CHECK-NEXT: [[SUM_2:%.*]] = phi float [ [[CONV13]], [[L2]] ], [ [[SUM_033]], [[FOR_BODY]] ]
				; CHECK-NEXT: [[ARRAYIDX14:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[ARRAYIDX14]], align 4
				; CHECK-NEXT: [[CONV15:%.*]] = sitofp i32 [[TMP2]] to float
				; CHECK-NEXT: [[CONV16:%.*]] = fpext float [[CONV15]] to double
				; CHECK-NEXT: [[ADD17:%.*]] = fadd double [[CONV16]], 3.000000e+00
				; CHECK-NEXT: [[CONV18:%.*]] = fpext float [[SUM_2]] to double
				; CHECK-NEXT: [[MUL19:%.*]] = fmul double [[ADD17]], [[CONV18]]
				; CHECK-NEXT: [[CONV20]] = fptrunc double [[MUL19]] to float
				; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I]], 1
				; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END:%.*]], label [[FOR_BODY]]
				; CHECK: for.end:
				; CHECK-NEXT: [[CONV20_LCSSA:%.*]] = phi float [ [[CONV20]], [[L3]] ]
				; CHECK-NEXT: ret float [[CONV20_LCSSA]]
				;

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
				%sum.033 = phi float [ %conv20, %L3 ], [ 2.000000e+00, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 3, label %L3
				i32 2, label %L2
				]

				L1:
				%conv = sitofp i32 %0 to float
				%conv4 = fpext float %conv to double
				%add = fadd double %conv4, 1.000000e+00
				%conv5 = fpext float %sum.033 to double
				%mul = fmul double %add, %conv5
				%conv6 = fptrunc double %mul to float
				br label %L2

				L2:
				%sum.1 = phi float [ %conv6, %L1 ], [ %sum.033, %for.body ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx7
				%conv8 = sitofp i32 %1 to float
				%conv9 = fpext float %conv8 to double
				%add10 = fadd double %conv9, 2.000000e+00
				%conv11 = fpext float %sum.1 to double
				%mul12 = fmul double %add10, %conv11
				%conv13 = fptrunc double %mul12 to float
				br label %L3

				L3:
				%sum.2 = phi float [ %conv13, %L2 ], [ %sum.033, %for.body ]
				%arrayidx14 = getelementptr inbounds i32, i32* %c, i64 %i
				%2 = load i32, i32* %arrayidx14
				%conv15 = sitofp i32 %2 to float
				%conv16 = fpext float %conv15 to double
				%add17 = fadd double %conv16, 3.000000e+00
				%conv18 = fpext float %sum.2 to double
				%mul19 = fmul double %add17, %conv18
				%conv20 = fptrunc double %mul19 to float
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret float %conv20
				}

				!0 = distinct !{!0, !2, !4, !6}
				!1 = distinct !{!1, !3, !5, !6}
				!2 = !{!"llvm.loop.vectorize.width", i32 4}
				!3 = !{!"llvm.loop.vectorize.width", i32 1}
				!4 = !{!"llvm.loop.interleave.count", i32 1}
				!5 = !{!"llvm.loop.interleave.count", i32 2}
				!6 = !{!"llvm.loop.vectorize.enable", i1 true}
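
The vector.body CHECK lines for @switch above express the switch as two compares feeding two selects: 16 by default, 9 when the loaded value is 2, and 7 when it is 3. The C sketch below mirrors that relationship. It is only an illustration, not code from the patch, and the function names are hypothetical. `r_switch` follows the scalar control flow of the test input, `r_select` follows the if-converted form, and `apply` is the surrounding `b[i] *= f(a[i])` loop.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar form: follows the test input's control flow (default -> L1 -> L2 -> L3). */
static int r_switch(int x) {
    int r;
    switch (x) {
    case 3:  r = 3;  break;   /* for.body -> L3 */
    case 2:  r = 5;  break;   /* for.body -> L2 -> L3 */
    default: r = 12; break;   /* L1 -> L2 -> L3 */
    }
    return r + 4;             /* %add4 */
}

/* If-converted form: follows the two selects in vector.body, one lane at a time. */
static int r_select(int x) {
    int inner = (x == 2) ? 9 : 16;   /* PREDPHI_OP */
    return (x == 3) ? 7 : inner;     /* TMP4 */
}

/* The loop body of the test: b[i] = b[i] * f(a[i]). */
static void apply(const int *a, int *b, size_t n, int (*f)(int)) {
    for (size_t i = 0; i < n; ++i)
        b[i] *= f(a[i]);
}

/* Self-check: both forms agree on the case values and on several default values. */
static void check_forms_agree(void) {
    for (int x = -2; x <= 6; ++x)
        assert(r_switch(x) == r_select(x));
}
```

Because `r_select` has no branches, every lane of a vector can evaluate it independently, which is what allows the loop body to be widened once the switch has been replaced by compares.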

llvm/test/Transforms/LowerSwitch/simple-switches.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -lowerswitch -force-loop-unswitch -S | FileCheck %s
				; RUN: opt < %s -lowerswitch -force-loop-unswitch -simplifycfg -S | FileCheck --check-prefix=CHECK-SIMPLIFY-CFG %s

				define void @unswitch(i32* nocapture %a, i32* nocapture readonly %b, i32* nocapture readonly %c, i64 %N){
				; CHECK-LABEL: @unswitch(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L4:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: br label [[LEAFBLOCK3:%.*]]
				; CHECK: LeafBlock3:
				; CHECK-NEXT: [[SWITCHLEAF4:%.*]] = icmp eq i32 [[TMP0]], 3
				; CHECK-NEXT: br i1 [[SWITCHLEAF4]], label [[L3:%.*]], label [[LEAFBLOCK1:%.*]]
				; CHECK: LeafBlock1:
				; CHECK-NEXT: [[SWITCHLEAF2:%.*]] = icmp eq i32 [[TMP0]], 2
				; CHECK-NEXT: br i1 [[SWITCHLEAF2]], label [[L2:%.*]], label [[LEAFBLOCK:%.*]]
				; CHECK: LeafBlock:
				; CHECK-NEXT: [[SWITCHLEAF:%.*]] = icmp eq i32 [[TMP0]], 4
				; CHECK-NEXT: br i1 [[SWITCHLEAF]], label [[L4]], label [[NEWDEFAULT:%.*]]
				; CHECK: NewDefault:
				; CHECK-NEXT: br label [[L1:%.*]]
				;
				; CHECK-SIMPLIFY-CFG-LABEL: @unswitch(
				; CHECK-SIMPLIFY-CFG-NEXT: entry:
				; CHECK-SIMPLIFY-CFG-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK-SIMPLIFY-CFG: for.body:
				; CHECK-SIMPLIFY-CFG-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L4:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-SIMPLIFY-CFG-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-SIMPLIFY-CFG-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-SIMPLIFY-CFG-NEXT: switch i32 [[TMP0]], label [[L1:%.*]] [
				; CHECK-SIMPLIFY-CFG-NEXT: i32 3, label [[L3:%.*]]
				; CHECK-SIMPLIFY-CFG-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-SIMPLIFY-CFG-NEXT: i32 4, label [[L4]]
				; CHECK-SIMPLIFY-CFG-NEXT: ]

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L4 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 4, label %L4
				i32 2, label %L2
				i32 3, label %L3
				]

				L1:
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx5
				%mul = mul nsw i32 %1, %0
				%add = add nsw i32 %mul, %0
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%2 = phi i32 [ %0, %for.body ], [ %add, %L1 ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%3 = load i32, i32* %arrayidx7, align 4
				%mul9 = mul nsw i32 %3, %3
				%add11 = add nsw i32 %2, %mul9
				store i32 %add11, i32* %arrayidx
				br label %L3

				L3:
				%4 = phi i32 [ %0, %for.body ], [ %add11, %L2 ]
				%arrayidx13 = getelementptr inbounds i32, i32* %c, i64 %i
				%5 = load i32, i32* %arrayidx13
				%mul14 = mul nsw i32 %5, %4
				%add16 = add nsw i32 %mul14, %4
				store i32 %add16, i32* %arrayidx
				br label %L4

				L4:
				%6 = phi i32 [ %0, %for.body ], [ %add16, %L3 ]
				%arrayidx17 = getelementptr inbounds i32, i32* %c, i64 %i
				%7 = load i32, i32* %arrayidx17
				%mul19 = mul nsw i32 %7, %7
				%add21 = add nsw i32 %6, %mul19
				store i32 %add21, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret void
				}

				; This test should not replace the switch statement as multiple cases have the same destination block
				define dso_local void @switch2(i32* nocapture %a, i32* nocapture readonly %b, i32* nocapture readonly %c, i64 %N) {
				; CHECK-LABEL: @switch2(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: switch i32 [[TMP0]], label [[L1:%.*]] [
				; CHECK-NEXT: i32 4, label [[L3]]
				; CHECK-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-NEXT: i32 3, label [[L3]]
				; CHECK-NEXT: ]
				;
				; CHECK-SIMPLIFY-CFG-LABEL: @switch2(
				; CHECK-SIMPLIFY-CFG-NEXT: entry:
				; CHECK-SIMPLIFY-CFG-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK-SIMPLIFY-CFG: for.body:
				; CHECK-SIMPLIFY-CFG-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-SIMPLIFY-CFG-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-SIMPLIFY-CFG-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-SIMPLIFY-CFG-NEXT: switch i32 [[TMP0]], label [[L1:%.*]] [
				; CHECK-SIMPLIFY-CFG-NEXT: i32 4, label [[L3]]
				; CHECK-SIMPLIFY-CFG-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-SIMPLIFY-CFG-NEXT: i32 3, label [[L3]]
				; CHECK-SIMPLIFY-CFG-NEXT: ]

				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %L1 [
				i32 4, label %L3
				i32 2, label %L2
				i32 3, label %L3
				]

				L1:
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx5
				%mul = mul nsw i32 %1, %0
				%add = add nsw i32 %mul, %0
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%2 = phi i32 [ %0, %for.body ], [ %add, %L1 ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%3 = load i32, i32* %arrayidx7
				%mul9 = mul nsw i32 %3, %3
				%add11 = add nsw i32 %2, %mul9
				store i32 %add11, i32* %arrayidx
				br label %L3

				L3:
				%4 = phi i32 [ %0, %for.body ], [ %0, %for.body ], [ %add11, %L2 ]
				%arrayidx13 = getelementptr inbounds i32, i32* %c, i64 %i
				%5 = load i32, i32* %arrayidx13
				%mul14 = mul nsw i32 %5, %4
				%add16 = add nsw i32 %mul14, %4
				store i32 %add16, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret void
				}

				define dso_local void @unreachable(i32* nocapture %a, i32* nocapture readonly %b, i32* nocapture readonly %c, i64 %N) {
				; CHECK-LABEL: @unreachable(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.body:
				; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-NEXT: br label [[LEAFBLOCK3:%.*]]
				; CHECK: LeafBlock3:
				; CHECK-NEXT: [[SWITCHLEAF4:%.*]] = icmp eq i32 [[TMP0]], 3
				; CHECK-NEXT: br i1 [[SWITCHLEAF4]], label [[L3]], label [[LEAFBLOCK1:%.*]]
				; CHECK: LeafBlock1:
				; CHECK-NEXT: [[SWITCHLEAF2:%.*]] = icmp eq i32 [[TMP0]], 2
				; CHECK-NEXT: br i1 [[SWITCHLEAF2]], label [[L2:%.*]], label [[LEAFBLOCK:%.*]]
				; CHECK: LeafBlock:
				; CHECK-NEXT: [[SWITCHLEAF:%.*]] = icmp eq i32 [[TMP0]], 4
				; CHECK-NEXT: br i1 [[SWITCHLEAF]], label [[L1:%.*]], label [[NEWDEFAULT:%.*]]
				; CHECK: NewDefault:
				; CHECK-NEXT: br label [[DEFAULT:%.*]]
				; CHECK: Default:
				; CHECK-NEXT: unreachable
				;
				; CHECK-SIMPLIFY-CFG-LABEL: @unreachable(
				; CHECK-SIMPLIFY-CFG-NEXT: entry:
				; CHECK-SIMPLIFY-CFG-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK-SIMPLIFY-CFG: for.body:
				; CHECK-SIMPLIFY-CFG-NEXT: [[I:%.*]] = phi i64 [ [[INC:%.*]], [[L3:%.*]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-SIMPLIFY-CFG-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[I]]
				; CHECK-SIMPLIFY-CFG-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
				; CHECK-SIMPLIFY-CFG-NEXT: switch i32 [[TMP0]], label [[DEFAULT:%.*]] [
				; CHECK-SIMPLIFY-CFG-NEXT: i32 3, label [[L3]]
				; CHECK-SIMPLIFY-CFG-NEXT: i32 2, label [[L2:%.*]]
				; CHECK-SIMPLIFY-CFG-NEXT: i32 4, label [[L1:%.*]]
				; CHECK-SIMPLIFY-CFG-NEXT: ]
				; CHECK-SIMPLIFY-CFG: Default:
				; CHECK-SIMPLIFY-CFG-NEXT: unreachable
				;
				entry:
				br label %for.body

				for.body:
				%i = phi i64 [ %inc, %L3 ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %i
				%0 = load i32, i32* %arrayidx
				switch i32 %0, label %Default [
				i32 4, label %L1
				i32 2, label %L2
				i32 3, label %L3
				]

				Default:
				unreachable

				L1:
				%arrayidx5 = getelementptr inbounds i32, i32* %b, i64 %i
				%1 = load i32, i32* %arrayidx5
				%mul = mul nsw i32 %1, %0
				%add = add nsw i32 %mul, %0
				store i32 %add, i32* %arrayidx
				br label %L2

				L2:
				%2 = phi i32 [ %0, %for.body ], [ %add, %L1 ]
				%arrayidx7 = getelementptr inbounds i32, i32* %b, i64 %i
				%3 = load i32, i32* %arrayidx7
				%mul9 = mul nsw i32 %3, %3
				%add11 = add nsw i32 %2, %mul9
				store i32 %add11, i32* %arrayidx
				br label %L3

				L3:
				%4 = phi i32 [ %0, %for.body ], [ %add11, %L2 ]
				%arrayidx13 = getelementptr inbounds i32, i32* %c, i64 %i
				%5 = load i32, i32* %arrayidx13
				%mul14 = mul nsw i32 %5, %4
				%add16 = add nsw i32 %mul14, %4
				store i32 %add16, i32* %arrayidx
				%inc = add nuw nsw i64 %i, 1
				%exitcond.not = icmp eq i64 %inc, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret void
				}

llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt < %s -fix-irreducible -unify-loop-exits -structurizecfg -enable-new-pm=0 -S | FileCheck %s			; RUN: opt < %s -fix-irreducible -unify-loop-exits -structurizecfg -enable-new-pm=0 -S | FileCheck %s
	define void @irreducible_mountain_bug(i1 %Pred0, i1 %Pred1, i1 %Pred2, i1 %Pred3, i1 %Pred4, i1 %Pred5, i1 %Pred6, i1 %Pred7, i1 %Pred8, i1 %Pred9, i1 %Pred10, i1 %Pred11, i1 %Pred12, i1 %Pred13) {			define void @irreducible_mountain_bug(i1 %Pred0, i1 %Pred1, i1 %Pred2, i1 %Pred3, i1 %Pred4, i1 %Pred5, i1 %Pred6, i1 %Pred7, i1 %Pred8, i1 %Pred9, i1 %Pred10, i1 %Pred11, i1 %Pred12, i1 %Pred13) {
	; CHECK-LABEL: @irreducible_mountain_bug(			; CHECK-LABEL: @irreducible_mountain_bug(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[PRED0_INV:%.*]] = xor i1 [[PRED0:%.*]], true			; CHECK-NEXT: [[PRED0_INV:%.*]] = xor i1 [[PRED0:%.*]], true
	; CHECK-NEXT: [[PRED1_INV:%.*]] = xor i1 [[PRED1:%.*]], true			; CHECK-NEXT: [[PRED1_INV:%.*]] = xor i1 [[PRED1:%.*]], true
	; CHECK-NEXT: [[PRED2_INV:%.*]] = xor i1 [[PRED2:%.*]], true			; CHECK-NEXT: [[PRED2_INV:%.*]] = xor i1 [[PRED2:%.*]], true
	; CHECK-NEXT: [[PRED3_INV:%.*]] = xor i1 [[PRED3:%.*]], true			; CHECK-NEXT: [[PRED3_INV:%.*]] = xor i1 [[PRED3:%.*]], true
	; CHECK-NEXT: [[PRED5_INV:%.*]] = xor i1 [[PRED5:%.*]], true			; CHECK-NEXT: [[PRED5_INV:%.*]] = xor i1 [[PRED5:%.*]], true
	; CHECK-NEXT: [[PRED4_INV:%.*]] = xor i1 [[PRED4:%.*]], true			; CHECK-NEXT: [[PRED4_INV:%.*]] = xor i1 [[PRED4:%.*]], true
	; CHECK-NEXT: [[PRED10_INV:%.*]] = xor i1 [[PRED10:%.*]], true			; CHECK-NEXT: [[PRED10_INV:%.*]] = xor i1 [[PRED10:%.*]], true
	; CHECK-NEXT: [[PRED11_INV:%.*]] = xor i1 [[PRED11:%.*]], true			; CHECK-NEXT: [[PRED11_INV:%.*]] = xor i1 [[PRED11:%.*]], true
	; CHECK-NEXT: [[PRED12_INV:%.*]] = xor i1 [[PRED12:%.*]], true			; CHECK-NEXT: [[PRED12_INV:%.*]] = xor i1 [[PRED12:%.*]], true
	; CHECK-NEXT: [[PRED13_INV:%.*]] = xor i1 [[PRED13:%.*]], true			; CHECK-NEXT: [[PRED13_INV:%.*]] = xor i1 [[PRED13:%.*]], true
	; CHECK-NEXT: br i1 [[PRED0_INV]], label [[IF_THEN:%.*]], label [[FLOW19:%.*]]			; CHECK-NEXT: br i1 [[PRED0_INV]], label [[IF_THEN:%.*]], label [[FLOW18:%.*]]
	; CHECK: Flow19:			; CHECK: Flow18:
	; CHECK-NEXT: [[TMP0:%.*]] = phi i1 [ false, [[FLOW3:%.*]] ], [ true, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[TMP0:%.*]] = phi i1 [ false, [[FLOW3:%.*]] ], [ true, [[ENTRY:%.*]] ]
	; CHECK-NEXT: br i1 [[TMP0]], label [[IF_END:%.*]], label [[FLOW20:%.*]]			; CHECK-NEXT: br i1 [[TMP0]], label [[IF_END:%.*]], label [[FLOW19:%.*]]
	; CHECK: if.end:			; CHECK: if.end:
	; CHECK-NEXT: br i1 [[PRED1_INV]], label [[IF_ELSE:%.*]], label [[FLOW18:%.*]]			; CHECK-NEXT: br i1 [[PRED1_INV]], label [[IF_ELSE:%.*]], label [[FLOW17:%.*]]
	; CHECK: Flow18:			; CHECK: Flow17:
	; CHECK-NEXT: [[TMP1:%.*]] = phi i1 [ false, [[IF_ELSE]] ], [ true, [[IF_END]] ]			; CHECK-NEXT: [[TMP1:%.*]] = phi i1 [ false, [[IF_ELSE]] ], [ true, [[IF_END]] ]
	; CHECK-NEXT: br i1 [[TMP1]], label [[IF_THEN7:%.*]], label [[IF_END16:%.*]]			; CHECK-NEXT: br i1 [[TMP1]], label [[IF_THEN7:%.*]], label [[IF_END16:%.*]]
	; CHECK: if.then7:			; CHECK: if.then7:
	; CHECK-NEXT: br label [[IF_END16]]			; CHECK-NEXT: br label [[IF_END16]]
	; CHECK: if.else:			; CHECK: if.else:
	; CHECK-NEXT: br label [[FLOW18]]			; CHECK-NEXT: br label [[FLOW17]]
	; CHECK: Flow20:			; CHECK: Flow19:
	; CHECK-NEXT: br label [[EXIT:%.*]]			; CHECK-NEXT: br label [[EXIT:%.*]]
	; CHECK: if.end16:			; CHECK: if.end16:
	; CHECK-NEXT: br i1 [[PRED2_INV]], label [[IF_THEN39:%.*]], label [[FLOW16:%.*]]			; CHECK-NEXT: br i1 [[PRED2_INV]], label [[IF_THEN39:%.*]], label [[FLOW15:%.*]]
	; CHECK: Flow16:			; CHECK: Flow15:
	; CHECK-NEXT: [[TMP2:%.*]] = phi i1 [ false, [[FLOW5:%.*]] ], [ true, [[IF_END16]] ]			; CHECK-NEXT: [[TMP2:%.*]] = phi i1 [ false, [[FLOW5:%.*]] ], [ true, [[IF_END16]] ]
	; CHECK-NEXT: br i1 [[TMP2]], label [[WHILE_COND_PREHEADER:%.*]], label [[FLOW17:%.*]]			; CHECK-NEXT: br i1 [[TMP2]], label [[WHILE_COND_PREHEADER:%.*]], label [[FLOW16:%.*]]
	; CHECK: while.cond.preheader:			; CHECK: while.cond.preheader:
	; CHECK-NEXT: br label [[WHILE_COND:%.*]]			; CHECK-NEXT: br label [[WHILE_COND:%.*]]
	; CHECK: Flow17:			; CHECK: Flow16:
	; CHECK-NEXT: br label [[FLOW20]]			; CHECK-NEXT: br label [[FLOW19]]
	; CHECK: while.cond:			; CHECK: while.cond:
	; CHECK-NEXT: br i1 [[PRED3_INV]], label [[LOR_RHS:%.*]], label [[FLOW12:%.*]]			; CHECK-NEXT: br i1 [[PRED3_INV]], label [[LOR_RHS:%.*]], label [[FLOW11:%.*]]
	; CHECK: Flow7:			; CHECK: Flow7:
	; CHECK-NEXT: [[TMP3:%.*]] = phi i1 [ [[PRED7:%.*]], [[COND_END61:%.*]] ], [ false, [[IRR_GUARD:%.*]] ]			; CHECK-NEXT: [[TMP3:%.*]] = phi i1 [ [[PRED7:%.*]], [[COND_END61:%.*]] ], [ false, [[IRR_GUARD:%.*]] ]
	; CHECK-NEXT: [[TMP4:%.*]] = phi i1 [ false, [[COND_END61]] ], [ true, [[IRR_GUARD]] ]			; CHECK-NEXT: [[TMP4:%.*]] = phi i1 [ false, [[COND_END61]] ], [ true, [[IRR_GUARD]] ]
	; CHECK-NEXT: br i1 [[TMP4]], label [[COND_TRUE49:%.*]], label [[FLOW8:%.*]]			; CHECK-NEXT: br i1 [[TMP4]], label [[COND_TRUE49:%.*]], label [[FLOW8:%.*]]
	; CHECK: cond.true49:			; CHECK: cond.true49:
	; CHECK-NEXT: br label [[FLOW8]]			; CHECK-NEXT: br label [[FLOW8]]
	; CHECK: Flow8:			; CHECK: Flow8:
	; CHECK-NEXT: [[TMP5:%.*]] = phi i1 [ false, [[COND_TRUE49]] ], [ true, [[FLOW7:%.*]] ]			; CHECK-NEXT: [[TMP5:%.*]] = phi i1 [ true, [[COND_TRUE49]] ], [ false, [[FLOW7:%.*]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = phi i1 [ [[PRED4_INV]], [[COND_TRUE49]] ], [ [[TMP3]], [[FLOW7]] ]			; CHECK-NEXT: [[TMP6:%.*]] = phi i1 [ false, [[COND_TRUE49]] ], [ true, [[FLOW7]] ]
	; CHECK-NEXT: br i1 [[TMP6]], label [[WHILE_BODY63:%.*]], label [[FLOW9:%.*]]			; CHECK-NEXT: [[TMP7:%.*]] = phi i1 [ [[PRED4_INV]], [[COND_TRUE49]] ], [ [[TMP3]], [[FLOW7]] ]
				; CHECK-NEXT: br i1 [[TMP7]], label [[WHILE_BODY63:%.*]], label [[FLOW9:%.*]]
	; CHECK: while.body63:			; CHECK: while.body63:
	; CHECK-NEXT: br i1 [[PRED5_INV]], label [[WHILE_COND47:%.*]], label [[FLOW10:%.*]]			; CHECK-NEXT: br i1 [[PRED5_INV]], label [[WHILE_COND47:%.*]], label [[FLOW10:%.*]]
	; CHECK: Flow9:			; CHECK: Flow9:
	; CHECK-NEXT: [[TMP7:%.*]] = phi i1 [ true, [[FLOW10]] ], [ false, [[FLOW8]] ]
	; CHECK-NEXT: [[TMP8:%.*]] = phi i1 [ false, [[FLOW10]] ], [ [[TMP5]], [[FLOW8]] ]			; CHECK-NEXT: [[TMP8:%.*]] = phi i1 [ false, [[FLOW10]] ], [ [[TMP5]], [[FLOW8]] ]
	; CHECK-NEXT: [[TMP9:%.*]] = phi i1 [ [[TMP15:%.*]], [[FLOW10]] ], [ true, [[FLOW8]] ]			; CHECK-NEXT: [[TMP9:%.*]] = phi i1 [ false, [[FLOW10]] ], [ [[TMP6]], [[FLOW8]] ]
	; CHECK-NEXT: [[DOTINV11:%.*]] = xor i1 [[TMP7]], true			; CHECK-NEXT: [[TMP10:%.*]] = phi i1 [ [[TMP16:%.*]], [[FLOW10]] ], [ true, [[FLOW8]] ]
	; CHECK-NEXT: [[DOTINV:%.*]] = xor i1 [[TMP8]], true			; CHECK-NEXT: [[DOTINV:%.*]] = xor i1 [[TMP9]], true
	; CHECK-NEXT: br i1 [[TMP9]], label [[LOOP_EXIT_GUARD1:%.*]], label [[IRR_GUARD]]			; CHECK-NEXT: br i1 [[TMP10]], label [[LOOP_EXIT_GUARD1:%.*]], label [[IRR_GUARD]]
	; CHECK: while.cond47:			; CHECK: while.cond47:
	; CHECK-NEXT: br label [[FLOW10]]			; CHECK-NEXT: br label [[FLOW10]]
	; CHECK: cond.end61:			; CHECK: cond.end61:
	; CHECK-NEXT: br label [[FLOW7]]			; CHECK-NEXT: br label [[FLOW7]]
	; CHECK: Flow14:			; CHECK: Flow13:
	; CHECK-NEXT: [[TMP10:%.*]] = phi i1 [ false, [[FLOW15:%.*]] ], [ true, [[LOOP_EXIT_GUARD1]] ]			; CHECK-NEXT: [[TMP11:%.*]] = phi i1 [ false, [[FLOW14:%.*]] ], [ true, [[LOOP_EXIT_GUARD1]] ]
	; CHECK-NEXT: [[TMP11:%.*]] = phi i1 [ [[TMP14:%.*]], [[FLOW15]] ], [ [[DOTINV]], [[LOOP_EXIT_GUARD1]] ]			; CHECK-NEXT: [[TMP12:%.*]] = phi i1 [ [[TMP15:%.*]], [[FLOW14]] ], [ [[DOTINV]], [[LOOP_EXIT_GUARD1]] ]
	; CHECK-NEXT: br label [[FLOW13:%.*]]			; CHECK-NEXT: br label [[FLOW12:%.*]]
	; CHECK: if.then69:			; CHECK: if.then69:
	; CHECK-NEXT: br label [[FLOW15]]			; CHECK-NEXT: br label [[FLOW14]]
	; CHECK: lor.rhs:			; CHECK: lor.rhs:
	; CHECK-NEXT: br label [[FLOW12]]			; CHECK-NEXT: br label [[FLOW11]]
	; CHECK: while.end76:			; CHECK: while.end76:
	; CHECK-NEXT: br label [[FLOW6:%.*]]			; CHECK-NEXT: br label [[FLOW6:%.*]]
	; CHECK: if.then39:			; CHECK: if.then39:
	; CHECK-NEXT: br i1 [[PRED10_INV]], label [[IF_END_I145:%.*]], label [[FLOW5]]			; CHECK-NEXT: br i1 [[PRED10_INV]], label [[IF_END_I145:%.*]], label [[FLOW5]]
	; CHECK: if.end.i145:			; CHECK: if.end.i145:
	; CHECK-NEXT: br i1 [[PRED11_INV]], label [[IF_END8_I149:%.*]], label [[FLOW4:%.*]]			; CHECK-NEXT: br i1 [[PRED11_INV]], label [[IF_END8_I149:%.*]], label [[FLOW4:%.*]]
	; CHECK: if.end8.i149:			; CHECK: if.end8.i149:
	; CHECK-NEXT: br label [[FLOW4]]			; CHECK-NEXT: br label [[FLOW4]]
	; CHECK: if.then:			; CHECK: if.then:
	; CHECK-NEXT: br i1 [[PRED12_INV]], label [[IF_END_I:%.*]], label [[FLOW3]]			; CHECK-NEXT: br i1 [[PRED12_INV]], label [[IF_END_I:%.*]], label [[FLOW3]]
	; CHECK: if.end.i:			; CHECK: if.end.i:
	; CHECK-NEXT: br i1 [[PRED13_INV]], label [[IF_END8_I:%.*]], label [[FLOW:%.*]]			; CHECK-NEXT: br i1 [[PRED13_INV]], label [[IF_END8_I:%.*]], label [[FLOW:%.*]]
	; CHECK: if.end8.i:			; CHECK: if.end8.i:
	; CHECK-NEXT: br label [[FLOW]]			; CHECK-NEXT: br label [[FLOW]]
	; CHECK: Flow:			; CHECK: Flow:
	; CHECK-NEXT: br label [[FLOW3]]			; CHECK-NEXT: br label [[FLOW3]]
	; CHECK: Flow3:			; CHECK: Flow3:
	; CHECK-NEXT: br label [[FLOW19]]			; CHECK-NEXT: br label [[FLOW18]]
	; CHECK: Flow4:			; CHECK: Flow4:
	; CHECK-NEXT: br label [[FLOW5]]			; CHECK-NEXT: br label [[FLOW5]]
	; CHECK: Flow5:			; CHECK: Flow5:
	; CHECK-NEXT: br label [[FLOW16]]			; CHECK-NEXT: br label [[FLOW15]]
	; CHECK: Flow6:			; CHECK: Flow6:
	; CHECK-NEXT: br label [[FLOW17]]			; CHECK-NEXT: br label [[FLOW16]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK: Flow12:			; CHECK: Flow11:
	; CHECK-NEXT: [[TMP12:%.*]] = phi i1 [ false, [[LOR_RHS]] ], [ true, [[WHILE_COND]] ]			; CHECK-NEXT: [[TMP13:%.*]] = phi i1 [ false, [[LOR_RHS]] ], [ true, [[WHILE_COND]] ]
	; CHECK-NEXT: [[TMP13:%.*]] = phi i1 [ [[PRED9:%.*]], [[LOR_RHS]] ], [ [[PRED3]], [[WHILE_COND]] ]			; CHECK-NEXT: [[TMP14:%.*]] = phi i1 [ [[PRED9:%.*]], [[LOR_RHS]] ], [ [[PRED3]], [[WHILE_COND]] ]
	; CHECK-NEXT: br i1 [[TMP13]], label [[IRR_GUARD]], label [[FLOW13]]			; CHECK-NEXT: br i1 [[TMP14]], label [[IRR_GUARD]], label [[FLOW12]]
	; CHECK: irr.guard:			; CHECK: irr.guard:
	; CHECK-NEXT: [[GUARD_COND_TRUE49:%.*]] = phi i1 [ [[PRED6:%.*]], [[FLOW9]] ], [ [[TMP12]], [[FLOW12]] ]			; CHECK-NEXT: [[GUARD_COND_TRUE49:%.*]] = phi i1 [ [[PRED6:%.*]], [[FLOW9]] ], [ [[TMP13]], [[FLOW11]] ]
	; CHECK-NEXT: [[GUARD_COND_TRUE49_INV:%.*]] = xor i1 [[GUARD_COND_TRUE49]], true			; CHECK-NEXT: [[GUARD_COND_TRUE49_INV:%.*]] = xor i1 [[GUARD_COND_TRUE49]], true
	; CHECK-NEXT: br i1 [[GUARD_COND_TRUE49_INV]], label [[COND_END61]], label [[FLOW7]]			; CHECK-NEXT: br i1 [[GUARD_COND_TRUE49_INV]], label [[COND_END61]], label [[FLOW7]]
	; CHECK: Flow15:			; CHECK: Flow14:
	; CHECK-NEXT: [[TMP14]] = phi i1 [ [[PRED8:%.*]], [[IF_THEN69:%.*]] ], [ [[DOTINV]], [[LOOP_EXIT_GUARD2:%.*]] ]			; CHECK-NEXT: [[TMP15]] = phi i1 [ [[PRED8:%.*]], [[IF_THEN69:%.*]] ], [ [[DOTINV]], [[LOOP_EXIT_GUARD2:%.*]] ]
	; CHECK-NEXT: br label [[FLOW14:%.*]]			; CHECK-NEXT: br label [[FLOW13:%.*]]
	; CHECK: loop.exit.guard:			; CHECK: loop.exit.guard:
	; CHECK-NEXT: br i1 [[TMP16:%.*]], label [[WHILE_END76:%.*]], label [[FLOW6]]			; CHECK-NEXT: br i1 [[TMP17:%.*]], label [[WHILE_END76:%.*]], label [[FLOW6]]
	; CHECK: Flow10:			; CHECK: Flow10:
	; CHECK-NEXT: [[TMP15]] = phi i1 [ false, [[WHILE_COND47]] ], [ true, [[WHILE_BODY63]] ]			; CHECK-NEXT: [[TMP16]] = phi i1 [ false, [[WHILE_COND47]] ], [ true, [[WHILE_BODY63]] ]
	; CHECK-NEXT: br label [[FLOW9]]			; CHECK-NEXT: br label [[FLOW9]]
	; CHECK: Flow13:			; CHECK: Flow12:
	; CHECK-NEXT: [[TMP16]] = phi i1 [ [[TMP10]], [[FLOW14]] ], [ true, [[FLOW12]] ]			; CHECK-NEXT: [[TMP17]] = phi i1 [ [[TMP11]], [[FLOW13]] ], [ true, [[FLOW11]] ]
	; CHECK-NEXT: [[TMP17:%.*]] = phi i1 [ [[TMP11]], [[FLOW14]] ], [ true, [[FLOW12]] ]			; CHECK-NEXT: [[TMP18:%.*]] = phi i1 [ [[TMP12]], [[FLOW13]] ], [ true, [[FLOW11]] ]
	; CHECK-NEXT: br i1 [[TMP17]], label [[LOOP_EXIT_GUARD:%.*]], label [[WHILE_COND]]			; CHECK-NEXT: br i1 [[TMP18]], label [[LOOP_EXIT_GUARD:%.*]], label [[WHILE_COND]]
	; CHECK: loop.exit.guard1:			; CHECK: loop.exit.guard1:
	; CHECK-NEXT: br i1 [[DOTINV]], label [[LOOP_EXIT_GUARD2]], label [[FLOW14]]			; CHECK-NEXT: br i1 [[DOTINV]], label [[LOOP_EXIT_GUARD2]], label [[FLOW13]]
	; CHECK: loop.exit.guard2:			; CHECK: loop.exit.guard2:
	; CHECK-NEXT: br i1 [[DOTINV11]], label [[IF_THEN69]], label [[FLOW15]]			; CHECK-NEXT: br i1 [[TMP8]], label [[IF_THEN69]], label [[FLOW14]]
	;			;
	entry:			entry:
	br i1 %Pred0, label %if.end, label %if.then			br i1 %Pred0, label %if.end, label %if.then

	if.end:			if.end:
	br i1 %Pred1, label %if.then7, label %if.else			br i1 %Pred1, label %if.then7, label %if.else

	if.then7:			if.then7:
	; [remaining 56 lines of this file are not shown]


Revision Contents

Diff 372706

clang/test/Frontend/optimization-remark-analysis.c

llvm/include/llvm/Transforms/Utils/LowerSwitch.h

llvm/lib/Passes/PassBuilder.cpp

llvm/lib/Transforms/Scalar/StructurizeCFG.cpp

llvm/lib/Transforms/Utils/FixIrreducible.cpp

llvm/lib/Transforms/Utils/LowerSwitch.cpp

llvm/lib/Transforms/Utils/UnifyLoopExits.cpp

llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

llvm/test/Other/new-pm-defaults.ll

llvm/test/Other/new-pm-lto-defaults.ll

llvm/test/Other/new-pm-thinlto-defaults.ll

llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll

llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll

llvm/test/Transforms/LoopVectorize/AArch64/sve-remove-switches.ll

llvm/test/Transforms/LoopVectorize/remove-switches.ll

llvm/test/Transforms/LowerSwitch/simple-switches.ll

llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll
