This is an archive of the discontinued LLVM Phabricator instance.

Differential D63036

LLVM IR constant expressions never trap.
Changes Planned · Public

Authored by efriedma on Jun 7 2019, 5:30 PM.

Details

Reviewers

hfinkel
chandlerc
jdoerfert
aemerson

Summary

Currently, constants have a property "Constant::canTrap", which is whether they contain a division that might have undefined behavior. If an instruction has a canTrap constant expression as an operand, and that constant expression contains a division with undefined behavior, the instruction has undefined behavior. For PHI nodes, the behavior is only undefined along the corresponding edges. This isn't documented anywhere in LangRef, but we use it to avoid certain transforms in a few optimization passes. For example, isSafeToSpeculativelyExecute checks whether instructions have a canTrap operand.
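
As an illustration only, the rule described above can be modeled on a toy expression tree. This is a minimal sketch, not LLVM's implementation: the `Expr` type, the `makeExpr` helper, and the exact denominator test (a division is flagged unless its denominator is a non-zero integer literal) are assumptions made for the example, and only unsigned division is modeled.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical miniature constant-expression tree; these are not LLVM classes.
struct Expr {
  enum Kind { IntLiteral, GlobalAddr, Add, UDiv };
  Kind kind;
  long value;                                   // meaningful for IntLiteral only
  std::vector<std::shared_ptr<Expr>> operands;  // used by Add and UDiv
};
using ExprPtr = std::shared_ptr<Expr>;

// Hypothetical helper that builds a node of the toy tree.
ExprPtr makeExpr(Expr::Kind kind, long value = 0,
                 std::vector<ExprPtr> operands = {}) {
  return std::make_shared<Expr>(Expr{kind, value, std::move(operands)});
}

// Toy version of the property: an expression "can trap" if any operand can,
// or if it is a division whose denominator is not a known non-zero literal.
bool canTrap(const Expr &e) {
  for (const ExprPtr &op : e.operands)
    if (canTrap(*op))
      return true;
  if (e.kind != Expr::UDiv)
    return false;
  const Expr &den = *e.operands[1];
  return den.kind != Expr::IntLiteral || den.value == 0;
}
```

In this model, dividing by a literal such as 2 is never flagged, while dividing by an expression built from a global's address is, and the flag propagates to every expression that contains the division.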

In practice, canTrap is almost never true: the only way to create such an expression is to do something strange with the address of a global, so that the denominator of a division is a complex constant expression. This means we have a lot of complexity with very little test coverage, so it would be nice if we could simplify the rules here.

This patch proposes to give up on the whole "canTrap" thing, and redefine the meaning of division in constant expressions. With this patch, if a constant expression divides by zero, or contains an overflowing divide, the result is poison. This simplifies a bunch of code. It also fixes an infinite loop bug involving a canTrap constant, a PHI, and an unsplittable critical edge. The downside is a slight performance hit: if we do end up with a divide constant expression with a complex denominator, we generate extra code to avoid the trap.
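
To make the "extra code" concrete, here is a scalar C++ sketch of the denominator guard, modeled on the `visitDivRem` change in this diff. The names `guardedUDiv` and `guardedSDiv` are hypothetical, and the 32-bit width is an assumption; the patch itself builds the equivalent SelectionDAG nodes rather than C++ arithmetic.

```cpp
#include <cstdint>
#include <limits>

// Unsigned case: clamp the denominator with umax(den, 1), so a zero
// denominator becomes 1 and every other value is left unchanged.
uint32_t guardedUDiv(uint32_t num, uint32_t den) {
  uint32_t safeDen = den > 1u ? den : 1u;
  return num / safeDen;
}

// Signed case: replace the denominator with 1 when the division would be
// x / 0 or INT_MIN / -1, the two cases that can trap in hardware.
int32_t guardedSDiv(int32_t num, int32_t den) {
  const int32_t intMin = std::numeric_limits<int32_t>::min();
  bool isInvalid = den == 0 || (den == -1 && num == intMin);
  int32_t safeDen = isInvalid ? 1 : den;
  return num / safeDen;
}
```

Substituting 1 is acceptable under the proposed semantics because the result in those cases is poison, so any value may be produced. The cost is a compare-and-select (or a max) in front of each such division, which is the performance hit mentioned above.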

There are a few ways to reduce the performance hit that I haven't tried to implement. On architectures where division never traps, we could avoid generating extra code. We could also try to avoid constant-folding divide instructions that would result in a complex constant expression.

Diff Detail

Repository: rL LLVM

Event Timeline

efriedma created this revision. Jun 7 2019, 5:30 PM

Herald added a project: Restricted Project. Jun 7 2019, 5:30 PM

Overall, I think that this makes sense. Thanks for proposing this.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3224	Please add a comment here explaining that you're guarding against both x/0 and INT_MIN/-1.
test/CodeGen/X86/divide-constant.ll
301 ↗	(On Diff #203646)	Can you check known bits? I feel like we should somehow know that `ptrtoint(@g)` isn't zero. For a test case, we can always do `ptrtoint(@g1)/(ptrtoint(@g2)-123456)` or similar.

jdoerfert added inline comments. Jun 8 2019, 9:29 AM

docs/LangRef.rst
3441	I'm not really happy with this formulation. It basically says that we somewhere define things to be UB and here we say that it is not UB if it is a constant.
lib/Analysis/ValueTracking.cpp
489	`V == I` implies `isa<Instruction>(V)` and you can use `isSafeToSpeculativelyExecute(I)`

Looks much better, the GISel changes are fine.

efriedma marked 2 inline comments as done. Jun 10 2019, 2:06 PM

efriedma added inline comments.

lib/Analysis/ValueTracking.cpp
489	I think you're reading this backwards; isSafeToSpeculativelyExecute only executes if `V != I`
test/CodeGen/X86/divide-constant.ll
301 ↗	(On Diff #203646)	I don't think there's any way to prove the value is non-zero here; it's extern_weak.
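
The evaluation-order point in the ValueTracking.cpp thread above can be checked with a small standalone model. `Probe` and `passesGuard` are hypothetical stand-ins that only log which sub-check ran; nothing here is LLVM API.

```cpp
#include <string>
#include <vector>

// Hypothetical logger: records the name of each sub-check that is evaluated,
// then returns the canned answer it was given.
struct Probe {
  std::vector<std::string> calls;
  bool check(const std::string &name, bool answer) {
    calls.push_back(name);
    return answer;
  }
};

// Same shape as the patched condition:
//   V == I || !isa<Instruction>(V) ||
//   isSafeToSpeculativelyExecute(cast<Instruction>(V))
// Short-circuit evaluation stops at the first sub-check that is true.
bool passesGuard(Probe &p, bool vIsI, bool vIsInstruction, bool safe) {
  return p.check("V == I", vIsI) ||
         p.check("!isa<Instruction>(V)", !vIsInstruction) ||
         p.check("isSafeToSpeculativelyExecute", safe);
}
```

When `V == I` holds, only the first sub-check runs; the speculation query is reached only when `V != I` and `V` is an instruction, which is also what makes the `cast<Instruction>` in the patched line safe.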

cameron.mcinally added a subscriber: cameron.mcinally. Jun 10 2019, 2:14 PM

hfinkel added inline comments. Jun 10 2019, 2:30 PM

test/CodeGen/X86/divide-constant.ll
301 ↗	(On Diff #203646)	Indeed. I missed that. I did mean the comment more generally - it seems like we should know that non-weak globals aren't zero. That having been said, looking at the implementation of SelectionDAG::isKnownNeverZero and SelectionDAG::computeKnownBits, etc. they don't seem to know anything about globals, so I suppose that enhancement there would also be needed in order to have an impact on this lowering.

nikic added a subscriber: nikic. Jun 10 2019, 2:33 PM

jdoerfert added inline comments. Jun 10 2019, 3:10 PM

lib/Analysis/ValueTracking.cpp
489	I was, :(

Address review comments, minor code cleanup, fix ConstantExpr::getAsInstruction, fix regression tests.

Is this still an RFC, or by now an accumulation of changes we actually want to make? I added two comments assuming the latter.

lib/IR/Constants.cpp
2966	This can be reasonably separated or you could probably work with the `ValueOperands` or `Ops` array. At least I don't (immediately) see that we need to keep this connected to the RFC patch.
test/Transforms/LoopVectorize/X86/masked_load_store.ll
1505–1506	I'm unsure about the sentence that talks about history. I think I'd prefer a statement about the semantics we have, thus, "constant expressions never trap, check ...". But I don't feel strongly about this.

Addressed review comments. Added release note. I think this patch contains all the changes necessary to reflect the change to IR semantics.

I'll send a brief email to llvmdev now, so everyone's aware this is changing.

I commented on the llvmdev thread, but instead of moving this complexity around, I'd really rather see it go away. We should never have supported div/rem constant expressions in the first place...

-Chris

mcberg2017 added a subscriber: mcberg2017. Jun 14 2019, 4:47 PM

efriedma planned changes to this revision. Mar 3 2022, 10:28 AM

Herald added a project: Restricted Project. Mar 3 2022, 10:28 AM

Herald added a subscriber: pengfei.

Revision Contents

Path	Size
docs/LangRef.rst	4 lines
docs/ReleaseNotes.rst	4 lines
include/llvm/Analysis/ValueTracking.h	2 lines
include/llvm/CodeGen/GlobalISel/IRTranslator.h	12 lines
include/llvm/IR/Constant.h	4 lines
include/llvm/IR/Constants.h	3 lines
lib/Analysis/CodeMetrics.cpp	3 lines
lib/Analysis/ValueTracking.cpp	14 lines
lib/CodeGen/SelectionDAG/FastISel.cpp	12 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h	9 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	47 lines
lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp	47 lines
lib/IR/Constants.cpp	69 lines
lib/Transforms/Utils/SimplifyCFG.cpp	61 lines
lib/Transforms/Vectorize/LoopVectorizationLegality.cpp	27 lines
test/CodeGen/X86/critical-edge-split-2.ll	2 lines
test/CodeGen/X86/divide-constant-expression.ll	390 lines
test/Transforms/LoopVectorize/X86/masked_load_store.ll	250 lines
test/Transforms/LoopVectorize/if-conversion.ll	15 lines
test/Transforms/SimplifyCFG/2006-10-19-UncondDiv.ll	44 lines
test/Transforms/SimplifyCFG/ConditionalTrappingConstantExpr.ll	26 lines
test/Transforms/SimplifyCFG/PR16069.ll	4 lines
test/Transforms/SimplifyCFG/PR17073.ll	13 lines
unittests/IR/ConstantsTest.cpp	24 lines

Diff 204362

docs/LangRef.rst

Show First 20 Lines • Show All 3,430 Lines • ▼ Show 20 Lines	``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
The index list is interpreted in a similar manner as indices in a		The index list is interpreted in a similar manner as indices in a
':ref:`getelementptr <i_getelementptr>`' operation. At least one index		':ref:`getelementptr <i_getelementptr>`' operation. At least one index
value must be specified.		value must be specified.
``OPCODE (LHS, RHS)``		``OPCODE (LHS, RHS)``
Perform the specified operation of the LHS and RHS constants. OPCODE		Perform the specified operation of the LHS and RHS constants. OPCODE
may be any of the :ref:`binary <binaryops>` or :ref:`bitwise		may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
binary <bitwiseops>` operations. The constraints on operands are		binary <bitwiseops>` operations. The constraints on operands are
the same as those for the corresponding instruction (e.g. no bitwise		the same as those for the corresponding instruction (e.g. no bitwise
operations on floating-point values are allowed).		operations on floating-point values are allowed). Division by zero
		and overflowing signed division produce poison (unlike division
		and remainder instructions, which have undefined behavior).

Other Values		Other Values
============		============

.. _inlineasmexprs:		.. _inlineasmexprs:

Inline Assembler Expressions		Inline Assembler Expressions
----------------------------		----------------------------

docs/ReleaseNotes.rst

Show First 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	* The 2-field form of global variables ``@llvm.global_ctors`` and
type is now mandatory. Specify `i8* null` to migrate from the obsoleted		type is now mandatory. Specify `i8* null` to migrate from the obsoleted
2-field form.		2-field form.

* The ``byval`` attribute can now take a type parameter:		* The ``byval`` attribute can now take a type parameter:
``byval(<ty>)``. If present it must be identical to the argument's		``byval(<ty>)``. If present it must be identical to the argument's
pointee type. In the next release we intend to make this parameter		pointee type. In the next release we intend to make this parameter
mandatory in preparation for opaque pointer types.		mandatory in preparation for opaque pointer types.

		* The semantics of constant expressions have changed so it is no longer
		possible for a constant expression to have undefined behavior. The
		``Constant::canTrap()`` C++ API has been removed.

Changes to the ARM Backend		Changes to the ARM Backend
--------------------------		--------------------------

During this release ...		During this release ...


Changes to the MIPS Target		Changes to the MIPS Target
--------------------------		--------------------------

include/llvm/Analysis/ValueTracking.h

Show First 20 Lines • Show All 382 Lines • ▼ Show 20 Lines	class Value;
///		///
/// If the CtxI is NOT specified this method only looks at the instruction		/// If the CtxI is NOT specified this method only looks at the instruction
/// itself and its operands, so if this method returns true, it is safe to		/// itself and its operands, so if this method returns true, it is safe to
/// move the instruction as long as the correct dominance relationships for		/// move the instruction as long as the correct dominance relationships for
/// the operands and users hold.		/// the operands and users hold.
///		///
/// This method can return true for instructions that read memory;		/// This method can return true for instructions that read memory;
/// for such instructions, moving them may change the resulting value.		/// for such instructions, moving them may change the resulting value.
bool isSafeToSpeculativelyExecute(const Value *V,		bool isSafeToSpeculativelyExecute(const Instruction *I,
const Instruction *CtxI = nullptr,		const Instruction *CtxI = nullptr,
const DominatorTree *DT = nullptr);		const DominatorTree *DT = nullptr);

/// Returns true if the result or effects of the given instructions \p I		/// Returns true if the result or effects of the given instructions \p I
/// depend on or influence global memory.		/// depend on or influence global memory.
/// Memory dependence arises for example if the instruction reads from		/// Memory dependence arises for example if the instruction reads from
/// memory or may produce effects or undefined behaviour. Memory dependent		/// memory or may produce effects or undefined behaviour. Memory dependent
/// instructions generally cannot be reorderd with respect to other memory		/// instructions generally cannot be reorderd with respect to other memory

include/llvm/CodeGen/GlobalISel/IRTranslator.h

Show First 20 Lines • Show All 331 Lines • ▼ Show 20 Lines	private:
bool translateOr(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateOr(const User &U, MachineIRBuilder &MIRBuilder) {
return translateBinaryOp(TargetOpcode::G_OR, U, MIRBuilder);		return translateBinaryOp(TargetOpcode::G_OR, U, MIRBuilder);
}		}
bool translateXor(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateXor(const User &U, MachineIRBuilder &MIRBuilder) {
return translateBinaryOp(TargetOpcode::G_XOR, U, MIRBuilder);		return translateBinaryOp(TargetOpcode::G_XOR, U, MIRBuilder);
}		}

bool translateUDiv(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateUDiv(const User &U, MachineIRBuilder &MIRBuilder) {
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(U))
		return false;
return translateBinaryOp(TargetOpcode::G_UDIV, U, MIRBuilder);		return translateBinaryOp(TargetOpcode::G_UDIV, U, MIRBuilder);
}		}
bool translateSDiv(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateSDiv(const User &U, MachineIRBuilder &MIRBuilder) {
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(U))
		return false;
return translateBinaryOp(TargetOpcode::G_SDIV, U, MIRBuilder);		return translateBinaryOp(TargetOpcode::G_SDIV, U, MIRBuilder);
}		}
bool translateURem(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateURem(const User &U, MachineIRBuilder &MIRBuilder) {
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(U))
		return false;
return translateBinaryOp(TargetOpcode::G_UREM, U, MIRBuilder);		return translateBinaryOp(TargetOpcode::G_UREM, U, MIRBuilder);
}		}
bool translateSRem(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateSRem(const User &U, MachineIRBuilder &MIRBuilder) {
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(U))
		return false;
return translateBinaryOp(TargetOpcode::G_SREM, U, MIRBuilder);		return translateBinaryOp(TargetOpcode::G_SREM, U, MIRBuilder);
}		}
bool translateIntToPtr(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateIntToPtr(const User &U, MachineIRBuilder &MIRBuilder) {
return translateCast(TargetOpcode::G_INTTOPTR, U, MIRBuilder);		return translateCast(TargetOpcode::G_INTTOPTR, U, MIRBuilder);
}		}
bool translatePtrToInt(const User &U, MachineIRBuilder &MIRBuilder) {		bool translatePtrToInt(const User &U, MachineIRBuilder &MIRBuilder) {
return translateCast(TargetOpcode::G_PTRTOINT, U, MIRBuilder);		return translateCast(TargetOpcode::G_PTRTOINT, U, MIRBuilder);
}		}

include/llvm/IR/Constant.h

Show First 20 Lines • Show All 88 Lines • ▼ Show 20 Lines	public:
/// Return true if this is a vector constant that includes any undefined		/// Return true if this is a vector constant that includes any undefined
/// elements.		/// elements.
bool containsUndefElement() const;		bool containsUndefElement() const;

/// Return true if this is a vector constant that includes any constant		/// Return true if this is a vector constant that includes any constant
/// expressions.		/// expressions.
bool containsConstantExpression() const;		bool containsConstantExpression() const;

/// Return true if evaluation of this constant could trap. This is true for
/// things like constant expressions that could divide by zero.
bool canTrap() const;

/// Return true if the value can vary between threads.		/// Return true if the value can vary between threads.
bool isThreadDependent() const;		bool isThreadDependent() const;

/// Return true if the value is dependent on a dllimport variable.		/// Return true if the value is dependent on a dllimport variable.
bool isDLLImportDependent() const;		bool isDLLImportDependent() const;

/// Return true if the constant has users other than constant expressions and		/// Return true if the constant has users other than constant expressions and
/// other dangling things.		/// other dangling things.

include/llvm/IR/Constants.h

Show First 20 Lines • Show All 1,240 Lines • ▼ Show 20 Lines	public:
/// canonicalized. This parameter should almost always be \c false.		/// canonicalized. This parameter should almost always be \c false.
Constant *getWithOperands(ArrayRef<Constant *> Ops, Type *Ty,		Constant *getWithOperands(ArrayRef<Constant *> Ops, Type *Ty,
bool OnlyIfReduced = false,		bool OnlyIfReduced = false,
Type *SrcTy = nullptr) const;		Type *SrcTy = nullptr) const;

/// Returns an Instruction which implements the same operation as this		/// Returns an Instruction which implements the same operation as this
/// ConstantExpr. The instruction is not linked to any basic block.		/// ConstantExpr. The instruction is not linked to any basic block.
///		///
		/// For division operations, the denominator may be rewritten to avoid
		/// generating a division which would trap.
		///
/// A better approach to this could be to have a constructor for Instruction		/// A better approach to this could be to have a constructor for Instruction
/// which would take a ConstantExpr parameter, but that would have spread		/// which would take a ConstantExpr parameter, but that would have spread
/// implementation details of ConstantExpr outside of Constants.cpp, which		/// implementation details of ConstantExpr outside of Constants.cpp, which
/// would make it harder to remove ConstantExprs altogether.		/// would make it harder to remove ConstantExprs altogether.
Instruction *getAsInstruction();		Instruction *getAsInstruction();

/// Methods for support type inquiry through isa, cast, and dyn_cast:		/// Methods for support type inquiry through isa, cast, and dyn_cast:
static bool classof(const Value *V) {		static bool classof(const Value *V) {

lib/Analysis/CodeMetrics.cpp

Show All 28 Lines	appendSpeculatableOperands(const Value *V,
SmallPtrSetImpl<const Value *> &Visited,		SmallPtrSetImpl<const Value *> &Visited,
SmallVectorImpl<const Value *> &Worklist) {		SmallVectorImpl<const Value *> &Worklist) {
const User *U = dyn_cast<User>(V);		const User *U = dyn_cast<User>(V);
if (!U)		if (!U)
return;		return;

for (const Value *Operand : U->operands())		for (const Value *Operand : U->operands())
if (Visited.insert(Operand).second)		if (Visited.insert(Operand).second)
if (isSafeToSpeculativelyExecute(Operand))		if (!isa<Instruction>(Operand) ||
		isSafeToSpeculativelyExecute(cast<Instruction>(Operand)))
Worklist.push_back(Operand);		Worklist.push_back(Operand);
}		}

static void completeEphemeralValues(SmallPtrSetImpl<const Value *> &Visited,		static void completeEphemeralValues(SmallPtrSetImpl<const Value *> &Visited,
SmallVectorImpl<const Value *> &Worklist,		SmallVectorImpl<const Value *> &Worklist,
SmallPtrSetImpl<const Value *> &EphValues) {		SmallPtrSetImpl<const Value *> &EphValues) {
// Note: We don't speculate PHIs here, so we'll miss instruction chains kept		// Note: We don't speculate PHIs here, so we'll miss instruction chains kept
// alive only by ephemeral values.		// alive only by ephemeral values.

lib/Analysis/ValueTracking.cpp

Show First 20 Lines • Show All 479 Lines • ▼ Show 20 Lines	while (!WorkSet.empty()) {

// If all uses of this value are ephemeral, then so is this value.		// If all uses of this value are ephemeral, then so is this value.
if (llvm::all_of(V->users(), [&](const User *U) {		if (llvm::all_of(V->users(), [&](const User *U) {
return EphValues.count(U);		return EphValues.count(U);
})) {		})) {
if (V == E)		if (V == E)
return true;		return true;

if (V == I || isSafeToSpeculativelyExecute(V)) {		if (V == I || !isa<Instruction>(V) ||
		isSafeToSpeculativelyExecute(cast<Instruction>(V))) {
EphValues.insert(V);		EphValues.insert(V);
if (const User *U = dyn_cast<User>(V))		if (const User *U = dyn_cast<User>(V))
for (User::const_op_iterator J = U->op_begin(), JE = U->op_end();		for (User::const_op_iterator J = U->op_begin(), JE = U->op_end();
J != JE; ++J)		J != JE; ++J)
WorkSet.push_back(*J);		WorkSet.push_back(*J);
}		}
}		}
}		}
▲ Show 20 Lines • Show All 3,393 Lines • ▼ Show 20 Lines	for (const User *U : V->users()) {
if (!II) return false;		if (!II) return false;

if (!II->isLifetimeStartOrEnd())		if (!II->isLifetimeStartOrEnd())
return false;		return false;
}		}
return true;		return true;
}		}

bool llvm::isSafeToSpeculativelyExecute(const Value *V,		bool llvm::isSafeToSpeculativelyExecute(const Instruction *Inst,
const Instruction *CtxI,		const Instruction *CtxI,
const DominatorTree *DT) {		const DominatorTree *DT) {
const Operator *Inst = dyn_cast<Operator>(V);
if (!Inst)
return false;

for (unsigned i = 0, e = Inst->getNumOperands(); i != e; ++i)
if (Constant *C = dyn_cast<Constant>(Inst->getOperand(i)))
if (C->canTrap())
return false;

switch (Inst->getOpcode()) {		switch (Inst->getOpcode()) {
default:		default:
return true;		return true;
case Instruction::UDiv:		case Instruction::UDiv:
case Instruction::URem: {		case Instruction::URem: {
// x / y is undefined if y == 0.		// x / y is undefined if y == 0.
const APInt *V;		const APInt *V;
if (match(Inst->getOperand(1), m_APInt(V)))		if (match(Inst->getOperand(1), m_APInt(V)))

lib/CodeGen/SelectionDAG/FastISel.cpp

Show First 20 Lines • Show All 1,811 Lines • ▼ Show 20 Lines	if (match(I, m_FNeg(m_Value(X))))
return selectFNeg(I, X);		return selectFNeg(I, X);
return selectBinaryOp(I, ISD::FSUB);		return selectBinaryOp(I, ISD::FSUB);
}		}
case Instruction::Mul:		case Instruction::Mul:
return selectBinaryOp(I, ISD::MUL);		return selectBinaryOp(I, ISD::MUL);
case Instruction::FMul:		case Instruction::FMul:
return selectBinaryOp(I, ISD::FMUL);		return selectBinaryOp(I, ISD::FMUL);
case Instruction::SDiv:		case Instruction::SDiv:
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(I))
		return false;
return selectBinaryOp(I, ISD::SDIV);		return selectBinaryOp(I, ISD::SDIV);
case Instruction::UDiv:		case Instruction::UDiv:
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(I))
		return false;
return selectBinaryOp(I, ISD::UDIV);		return selectBinaryOp(I, ISD::UDIV);
case Instruction::FDiv:		case Instruction::FDiv:
return selectBinaryOp(I, ISD::FDIV);		return selectBinaryOp(I, ISD::FDIV);
case Instruction::SRem:		case Instruction::SRem:
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(I))
		return false;
return selectBinaryOp(I, ISD::SREM);		return selectBinaryOp(I, ISD::SREM);
case Instruction::URem:		case Instruction::URem:
		// Non-trapping div for ConstantExpr not yet implemented.
		if (isa<ConstantExpr>(I))
		return false;
return selectBinaryOp(I, ISD::UREM);		return selectBinaryOp(I, ISD::UREM);
case Instruction::FRem:		case Instruction::FRem:
return selectBinaryOp(I, ISD::FREM);		return selectBinaryOp(I, ISD::FREM);
case Instruction::Shl:		case Instruction::Shl:
return selectBinaryOp(I, ISD::SHL);		return selectBinaryOp(I, ISD::SHL);
case Instruction::LShr:		case Instruction::LShr:
return selectBinaryOp(I, ISD::SRL);		return selectBinaryOp(I, ISD::SRL);
case Instruction::AShr:		case Instruction::AShr:

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h

Show First 20 Lines • Show All 872 Lines • ▼ Show 20 Lines	private:
void visitCallBr(const CallBrInst &I);		void visitCallBr(const CallBrInst &I);
void visitResume(const ResumeInst &I);		void visitResume(const ResumeInst &I);

void visitUnary(const User &I, unsigned Opcode);		void visitUnary(const User &I, unsigned Opcode);
void visitFNeg(const User &I) { visitUnary(I, ISD::FNEG); }		void visitFNeg(const User &I) { visitUnary(I, ISD::FNEG); }

void visitBinary(const User &I, unsigned Opcode);		void visitBinary(const User &I, unsigned Opcode);
void visitShift(const User &I, unsigned Opcode);		void visitShift(const User &I, unsigned Opcode);
		void visitDivRem(const User &I, unsigned Opcode);
void visitAdd(const User &I) { visitBinary(I, ISD::ADD); }		void visitAdd(const User &I) { visitBinary(I, ISD::ADD); }
void visitFAdd(const User &I) { visitBinary(I, ISD::FADD); }		void visitFAdd(const User &I) { visitBinary(I, ISD::FADD); }
void visitSub(const User &I) { visitBinary(I, ISD::SUB); }		void visitSub(const User &I) { visitBinary(I, ISD::SUB); }
void visitFSub(const User &I);		void visitFSub(const User &I);
void visitMul(const User &I) { visitBinary(I, ISD::MUL); }		void visitMul(const User &I) { visitBinary(I, ISD::MUL); }
void visitFMul(const User &I) { visitBinary(I, ISD::FMUL); }		void visitFMul(const User &I) { visitBinary(I, ISD::FMUL); }
void visitURem(const User &I) { visitBinary(I, ISD::UREM); }		void visitURem(const User &I) { visitDivRem(I, ISD::UREM); }
void visitSRem(const User &I) { visitBinary(I, ISD::SREM); }		void visitSRem(const User &I) { visitDivRem(I, ISD::SREM); }
void visitFRem(const User &I) { visitBinary(I, ISD::FREM); }		void visitFRem(const User &I) { visitBinary(I, ISD::FREM); }
void visitUDiv(const User &I) { visitBinary(I, ISD::UDIV); }		void visitUDiv(const User &I) { visitDivRem(I, ISD::UDIV); }
void visitSDiv(const User &I);		void visitSDiv(const User &I) { visitDivRem(I, ISD::SDIV); }
void visitFDiv(const User &I) { visitBinary(I, ISD::FDIV); }		void visitFDiv(const User &I) { visitBinary(I, ISD::FDIV); }
void visitAnd (const User &I) { visitBinary(I, ISD::AND); }		void visitAnd (const User &I) { visitBinary(I, ISD::AND); }
void visitOr (const User &I) { visitBinary(I, ISD::OR); }		void visitOr (const User &I) { visitBinary(I, ISD::OR); }
void visitXor (const User &I) { visitBinary(I, ISD::XOR); }		void visitXor (const User &I) { visitBinary(I, ISD::XOR); }
void visitShl (const User &I) { visitShift(I, ISD::SHL); }		void visitShl (const User &I) { visitShift(I, ISD::SHL); }
void visitLShr(const User &I) { visitShift(I, ISD::SRL); }		void visitLShr(const User &I) { visitShift(I, ISD::SRL); }
void visitAShr(const User &I) { visitShift(I, ISD::SRA); }		void visitAShr(const User &I) { visitShift(I, ISD::SRA); }
void visitICmp(const User &I);		void visitICmp(const User &I);

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 3,200 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitShift(const User &I, unsigned Opcode) {
Flags.setExact(exact);		Flags.setExact(exact);
Flags.setNoSignedWrap(nsw);		Flags.setNoSignedWrap(nsw);
Flags.setNoUnsignedWrap(nuw);		Flags.setNoUnsignedWrap(nuw);
SDValue Res = DAG.getNode(Opcode, getCurSDLoc(), Op1.getValueType(), Op1, Op2,		SDValue Res = DAG.getNode(Opcode, getCurSDLoc(), Op1.getValueType(), Op1, Op2,
Flags);		Flags);
setValue(&I, Res);		setValue(&I, Res);
}		}

void SelectionDAGBuilder::visitSDiv(const User &I) {		void SelectionDAGBuilder::visitDivRem(const User &I, unsigned Opcode) {
		if (!isa<Constant>(I))
		return visitBinary(I, Opcode);

		// Constants aren't allowed to trap, so we have to do something
		// a bit trickier.
		//
		// FIXME: Some targets have a cheap non-trapping div.
SDValue Op1 = getValue(I.getOperand(0));		SDValue Op1 = getValue(I.getOperand(0));
SDValue Op2 = getValue(I.getOperand(1));		SDValue Op2 = getValue(I.getOperand(1));
		SDLoc dl(getCurSDLoc());
SDNodeFlags Flags;		EVT VT = Op1.getValueType();
Flags.setExact(isa<PossiblyExactOperator>(&I) &&		if (Opcode == ISD::UDIV || Opcode == ISD::UREM) {
cast<PossiblyExactOperator>(&I)->isExact());		// Ensure the denominator is not zero.
setValue(&I, DAG.getNode(ISD::SDIV, getCurSDLoc(), Op1.getValueType(), Op1,		Op2 = DAG.getNode(ISD::UMAX, dl, VT, Op2, DAG.getConstant(1, dl, VT));
Op2, Flags));		} else {
		// Ensure the denominator is not zero, and we are not dividing INT_MIN
		// by -1.
		auto &TLI = DAG.getTargetLoweringInfo();
		EVT CCVT =
		TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
		SDValue IsZero =
		DAG.getSetCC(dl, CCVT, Op2, DAG.getConstant(0, dl, VT), ISD::SETEQ);
		SDValue IsNegOne =
		DAG.getSetCC(dl, CCVT, Op2, DAG.getAllOnesConstant(dl, VT), ISD::SETEQ);
		auto IntMin = APInt::getSignedMinValue(VT.getScalarSizeInBits());
		SDValue IsIntMin = DAG.getSetCC(
		dl, CCVT, Op1, DAG.getConstant(IntMin, dl, VT), ISD::SETEQ);
		SDValue IsIntMinOverNegOne =
		DAG.getNode(ISD::AND, dl, CCVT, IsNegOne, IsIntMin);
		SDValue IsInvalid =
		DAG.getNode(ISD::OR, dl, CCVT, IsZero, IsIntMinOverNegOne);
		ISD::NodeType SelectOpCode = VT.isVector() ? ISD::VSELECT : ISD::SELECT;
		Op2 = DAG.getNode(SelectOpCode, dl, VT, IsInvalid,
		DAG.getConstant(1, dl, VT), Op2);
		}

		SDNodeFlags DivFlags;
		if (auto *ExactOp = dyn_cast<PossiblyExactOperator>(&I))
		DivFlags.setExact(ExactOp->isExact());
		SDValue BinNodeValue = DAG.getNode(Opcode, dl, VT, Op1, Op2, DivFlags);
		setValue(&I, BinNodeValue);
}		}

void SelectionDAGBuilder::visitICmp(const User &I) {		void SelectionDAGBuilder::visitICmp(const User &I) {
ICmpInst::Predicate predicate = ICmpInst::BAD_ICMP_PREDICATE;		ICmpInst::Predicate predicate = ICmpInst::BAD_ICMP_PREDICATE;
if (const ICmpInst *IC = dyn_cast<ICmpInst>(&I))		if (const ICmpInst *IC = dyn_cast<ICmpInst>(&I))
predicate = IC->getPredicate();		predicate = IC->getPredicate();
else if (const ConstantExpr *IC = dyn_cast<ConstantExpr>(&I))		else if (const ConstantExpr *IC = dyn_cast<ConstantExpr>(&I))
predicate = ICmpInst::Predicate(IC->getPredicate());		predicate = ICmpInst::Predicate(IC->getPredicate());

lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp

Show First 20 Lines • Show All 336 Lines • ▼ Show 20 Lines	void SelectionDAGISel::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addPreserved<GCModuleInfo>();		AU.addPreserved<GCModuleInfo>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
AU.addRequired<TargetTransformInfoWrapperPass>();		AU.addRequired<TargetTransformInfoWrapperPass>();
if (UseMBPI && OptLevel != CodeGenOpt::None)		if (UseMBPI && OptLevel != CodeGenOpt::None)
AU.addRequired<BranchProbabilityInfoWrapperPass>();		AU.addRequired<BranchProbabilityInfoWrapperPass>();
MachineFunctionPass::getAnalysisUsage(AU);		MachineFunctionPass::getAnalysisUsage(AU);
}		}

/// SplitCriticalSideEffectEdges - Look for critical edges with a PHI value that
/// may trap on it. In this case we have to split the edge so that the path
/// through the predecessor block that doesn't go to the phi block doesn't
/// execute the possibly trapping instruction. If available, we pass domtree
/// and loop info to be updated when we split critical edges. This is because
/// SelectionDAGISel preserves these analyses.
/// This is required for correctness, so it must be done at -O0.
///
static void SplitCriticalSideEffectEdges(Function &Fn, DominatorTree *DT,
LoopInfo *LI) {
// Loop for blocks with phi nodes.
for (BasicBlock &BB : Fn) {
PHINode *PN = dyn_cast<PHINode>(BB.begin());
if (!PN) continue;

ReprocessBlock:
// For each block with a PHI node, check to see if any of the input values
// are potentially trapping constant expressions. Constant expressions are
// the only potentially trapping value that can occur as the argument to a
// PHI.
for (BasicBlock::iterator I = BB.begin(); (PN = dyn_cast<PHINode>(I)); ++I)
for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
ConstantExpr *CE = dyn_cast<ConstantExpr>(PN->getIncomingValue(i));
if (!CE || !CE->canTrap()) continue;

// The only case we have to worry about is when the edge is critical.
// Since this block has a PHI Node, we assume it has multiple input
// edges: check to see if the pred has multiple successors.
BasicBlock *Pred = PN->getIncomingBlock(i);
if (Pred->getTerminator()->getNumSuccessors() == 1)
continue;

// Okay, we have to split this edge.
SplitCriticalEdge(
Pred->getTerminator(), GetSuccessorNumber(Pred, &BB),
CriticalEdgeSplittingOptions(DT, LI).setMergeIdenticalEdges());
goto ReprocessBlock;
}
}
}
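
As an illustration of the critical-edge test used by the removed `SplitCriticalSideEffectEdges` above: an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. The following is a minimal sketch on a toy graph, not LLVM code; `Cfg`, `numPreds`, and `isCriticalEdge` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy control-flow graph (hypothetical, not an LLVM type):
// succs[b] lists the successor blocks of block b.
struct Cfg {
  std::vector<std::vector<int>> succs;

  // Count the edges that enter `block`.
  std::size_t numPreds(int block) const {
    std::size_t n = 0;
    for (const auto &out : succs)
      for (int s : out)
        if (s == block)
          ++n;
    return n;
  }

  // An edge is critical if its source has several successors and its
  // destination has several predecessors.
  bool isCriticalEdge(int from, int to) const {
    assert(from >= 0 && static_cast<std::size_t>(from) < succs.size());
    return succs[from].size() > 1 && numPreds(to) > 1;
  }
};
```

In the removed function, the destination is already known to hold a PHI node, so it is assumed to have multiple incoming edges and only the predecessor's successor count is checked.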

static void computeUsesMSVCFloatingPoint(const Triple &TT, const Function &F,		static void computeUsesMSVCFloatingPoint(const Triple &TT, const Function &F,
MachineModuleInfo &MMI) {		MachineModuleInfo &MMI) {
// Only needed for MSVC		// Only needed for MSVC
if (!TT.isKnownWindowsMSVCEnvironment())		if (!TT.isKnownWindowsMSVCEnvironment())
return;		return;

// If it's already set, nothing to do.		// If it's already set, nothing to do.
if (MMI.usesMSVCFloatingPoint())		if (MMI.usesMSVCFloatingPoint())
Show All 38 Lines	bool SelectionDAGISel::runOnMachineFunction(MachineFunction &mf) {
OptLevelChanger OLC(*this, NewOptLevel);		OptLevelChanger OLC(*this, NewOptLevel);

TII = MF->getSubtarget().getInstrInfo();		TII = MF->getSubtarget().getInstrInfo();
TLI = MF->getSubtarget().getTargetLowering();		TLI = MF->getSubtarget().getTargetLowering();
RegInfo = &MF->getRegInfo();		RegInfo = &MF->getRegInfo();
LibInfo = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();		LibInfo = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
GFI = Fn.hasGC() ? &getAnalysis<GCModuleInfo>().getFunctionInfo(Fn) : nullptr;		GFI = Fn.hasGC() ? &getAnalysis<GCModuleInfo>().getFunctionInfo(Fn) : nullptr;
ORE = make_unique<OptimizationRemarkEmitter>(&Fn);		ORE = make_unique<OptimizationRemarkEmitter>(&Fn);
auto *DTWP = getAnalysisIfAvailable<DominatorTreeWrapperPass>();
DominatorTree *DT = DTWP ? &DTWP->getDomTree() : nullptr;
auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();
LoopInfo *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;

LLVM_DEBUG(dbgs() << "\n\n\n=== " << Fn.getName() << "\n");		LLVM_DEBUG(dbgs() << "\n\n\n=== " << Fn.getName() << "\n");

SplitCriticalSideEffectEdges(const_cast<Function &>(Fn), DT, LI);

CurDAG->init(MF, ORE, this, LibInfo,		CurDAG->init(MF, ORE, this, LibInfo,
getAnalysisIfAvailable<LegacyDivergenceAnalysis>());		getAnalysisIfAvailable<LegacyDivergenceAnalysis>());
FuncInfo->set(Fn, *MF, CurDAG);		FuncInfo->set(Fn, *MF, CurDAG);
SwiftError->setFunction(*MF);		SwiftError->setFunction(*MF);

// Now get the optional analyzes if we want to.		// Now get the optional analyzes if we want to.
// This is based on the possibly changed OptLevel (after optnone is taken		// This is based on the possibly changed OptLevel (after optnone is taken
// into account). That's unfortunate but OK because it just means we won't		// into account). That's unfortunate but OK because it just means we won't

lib/IR/Constants.cpp

Show First 20 Lines • Show All 402 Lines • ▼ Show 20 Lines	#endif
// The constant should remove itself from our use list...		// The constant should remove itself from our use list...
assert((use_empty() || user_back() != V) && "Constant not removed!");		assert((use_empty() || user_back() != V) && "Constant not removed!");
}		}

// Value has no outstanding references it is safe to delete it now...		// Value has no outstanding references it is safe to delete it now...
delete this;		delete this;
}		}

static bool canTrapImpl(const Constant *C,
SmallPtrSetImpl<const ConstantExpr *> &NonTrappingOps) {
assert(C->getType()->isFirstClassType() && "Cannot evaluate aggregate vals!");
// The only thing that could possibly trap are constant exprs.
const ConstantExpr *CE = dyn_cast<ConstantExpr>(C);
if (!CE)
return false;

// ConstantExpr traps if any operands can trap.
for (unsigned i = 0, e = C->getNumOperands(); i != e; ++i) {
if (ConstantExpr *Op = dyn_cast<ConstantExpr>(CE->getOperand(i))) {
if (NonTrappingOps.insert(Op).second && canTrapImpl(Op, NonTrappingOps))
return true;
}
}

// Otherwise, only specific operations can trap.
switch (CE->getOpcode()) {
default:
return false;
case Instruction::UDiv:
case Instruction::SDiv:
case Instruction::URem:
case Instruction::SRem:
// Div and rem can trap if the RHS is not known to be non-zero.
if (!isa<ConstantInt>(CE->getOperand(1)) ||CE->getOperand(1)->isNullValue())
return true;
return false;
}
}

bool Constant::canTrap() const {
SmallPtrSet<const ConstantExpr *, 4> NonTrappingOps;
return canTrapImpl(this, NonTrappingOps);
}
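
The removed `canTrapImpl` logic above can be sketched outside LLVM as a walk over a small expression tree: an expression traps if any nested expression traps, or if it is a division or remainder whose right-hand side is not a nonzero integer literal. This is a simplified model under stated assumptions; `Expr`, `Op`, and `isIntLiteral` are hypothetical stand-ins for `ConstantExpr`, the opcode, and `isa<ConstantInt>`.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for constant-expression opcodes.
enum class Op { Leaf, Add, UDiv, SDiv, URem, SRem };

// Hypothetical stand-in for a constant. A Leaf with isIntLiteral == false
// models something like the address of a global.
struct Expr {
  Op op = Op::Leaf;
  bool isIntLiteral = false;
  long value = 0;
  std::vector<const Expr *> operands;
};

bool canTrapImpl(const Expr *e, std::unordered_set<const Expr *> &visited) {
  // Leaves never trap; only compound expressions can.
  if (e->op == Op::Leaf)
    return false;

  // An expression traps if any nested expression traps. The visited set
  // keeps shared subexpressions from being walked more than once.
  for (const Expr *operand : e->operands)
    if (operand->op != Op::Leaf && visited.insert(operand).second &&
        canTrapImpl(operand, visited))
      return true;

  switch (e->op) {
  case Op::UDiv:
  case Op::SDiv:
  case Op::URem:
  case Op::SRem: {
    assert(e->operands.size() == 2 && "division needs two operands");
    const Expr *rhs = e->operands[1];
    // Div and rem can trap unless the RHS is a nonzero integer literal.
    bool rhsIsLiteral = rhs->op == Op::Leaf && rhs->isIntLiteral;
    return !rhsIsLiteral || rhs->value == 0;
  }
  default:
    return false;
  }
}

bool canTrap(const Expr *e) {
  std::unordered_set<const Expr *> visited;
  return canTrapImpl(e, visited);
}
```

As in the original, the check is conservative: a division by a non-literal right-hand side is reported as possibly trapping even if that operand could never be zero at run time.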

/// Check if C contains a GlobalValue for which Predicate is true.		/// Check if C contains a GlobalValue for which Predicate is true.
static bool		static bool
ConstHasGlobalValuePredicate(const Constant *C,		ConstHasGlobalValuePredicate(const Constant *C,
bool (*Predicate)(const GlobalValue *)) {		bool (*Predicate)(const GlobalValue *)) {
SmallPtrSet<const Constant *, 8> Visited;		SmallPtrSet<const Constant *, 8> Visited;
SmallVector<const Constant *, 8> WorkList;		SmallVector<const Constant *, 8> WorkList;
WorkList.push_back(C);		WorkList.push_back(C);
Visited.insert(C);		Visited.insert(C);
▲ Show 20 Lines • Show All 2,503 Lines • ▼ Show 20 Lines	Value ConstantExpr::handleOperandChangeImpl(Value From, Value *ToV) {
// Update to the new value.		// Update to the new value.
return getContext().pImpl->ExprConstants.replaceOperandsInPlace(		return getContext().pImpl->ExprConstants.replaceOperandsInPlace(
NewOps, this, From, To, NumUpdated, OperandNo);		NewOps, this, From, To, NumUpdated, OperandNo);
}		}

Instruction *ConstantExpr::getAsInstruction() {		Instruction *ConstantExpr::getAsInstruction() {
SmallVector<Value *, 4> ValueOperands(op_begin(), op_end());		SmallVector<Value *, 4> ValueOperands(op_begin(), op_end());
ArrayRef<Value*> Ops(ValueOperands);		ArrayRef<Value*> Ops(ValueOperands);

jdoerfert: This can be reasonably separated, or you could probably work with the `ValueOperands` or `Ops` array. At least I don't (immediately) see that we need to keep this connected to the RFC patch.
switch (getOpcode()) {		switch (getOpcode()) {
case Instruction::Trunc:		case Instruction::Trunc:
case Instruction::ZExt:		case Instruction::ZExt:
case Instruction::SExt:		case Instruction::SExt:
case Instruction::FPTrunc:		case Instruction::FPTrunc:
case Instruction::FPExt:		case Instruction::FPExt:
case Instruction::UIToFP:		case Instruction::UIToFP:
case Instruction::SIToFP:		case Instruction::SIToFP:
Show All 12 Lines	Instruction *ConstantExpr::getAsInstruction() {
case Instruction::ExtractElement:		case Instruction::ExtractElement:
return ExtractElementInst::Create(Ops[0], Ops[1]);		return ExtractElementInst::Create(Ops[0], Ops[1]);
case Instruction::InsertValue:		case Instruction::InsertValue:
return InsertValueInst::Create(Ops[0], Ops[1], getIndices());		return InsertValueInst::Create(Ops[0], Ops[1], getIndices());
case Instruction::ExtractValue:		case Instruction::ExtractValue:
return ExtractValueInst::Create(Ops[0], getIndices());		return ExtractValueInst::Create(Ops[0], getIndices());
case Instruction::ShuffleVector:		case Instruction::ShuffleVector:
return new ShuffleVectorInst(Ops[0], Ops[1], Ops[2]);		return new ShuffleVectorInst(Ops[0], Ops[1], Ops[2]);

case Instruction::GetElementPtr: {		case Instruction::GetElementPtr: {
const auto *GO = cast<GEPOperator>(this);		const auto *GO = cast<GEPOperator>(this);
if (GO->isInBounds())		if (GO->isInBounds())
return GetElementPtrInst::CreateInBounds(GO->getSourceElementType(),		return GetElementPtrInst::CreateInBounds(GO->getSourceElementType(),
Ops[0], Ops.slice(1));		Ops[0], Ops.slice(1));
return GetElementPtrInst::Create(GO->getSourceElementType(), Ops[0],		return GetElementPtrInst::Create(GO->getSourceElementType(), Ops[0],
Ops.slice(1));		Ops.slice(1));
}		}
case Instruction::ICmp:		case Instruction::ICmp:
case Instruction::FCmp:		case Instruction::FCmp:
return CmpInst::Create((Instruction::OtherOps)getOpcode(),		return CmpInst::Create((Instruction::OtherOps)getOpcode(),
(CmpInst::Predicate)getPredicate(), Ops[0], Ops[1]);		(CmpInst::Predicate)getPredicate(), Ops[0], Ops[1]);
case Instruction::FNeg:		case Instruction::FNeg:
return UnaryOperator::Create((Instruction::UnaryOps)getOpcode(), Ops[0]);		return UnaryOperator::Create((Instruction::UnaryOps)getOpcode(), Ops[0]);
default:		default:
assert(getNumOperands() == 2 && "Must be binary operator?");		assert(getNumOperands() == 2 && "Must be binary operator?");
		Constant *Op0 = getOperand(0);
		Constant *Op1 = getOperand(1);
		if (getOpcode() == Instruction::UDiv || getOpcode() == Instruction::URem) {
		// Ensure the denominator is not zero.
		Constant *Zero = Constant::getNullValue(getType());
		Constant *One = ConstantInt::get(getType(), 1);
		Constant *IsZero = ConstantExpr::getICmp(CmpInst::ICMP_EQ, Op1, Zero);
		Op1 = ConstantExpr::getSelect(IsZero, One, Op1);
		}
		if (getOpcode() == Instruction::SDiv || getOpcode() == Instruction::SRem) {
		// Ensure the denominator is not zero, and we are not dividing INT_MIN
		// by -1.
		unsigned BitWidth = getType()->getScalarSizeInBits();
		assert(BitWidth != 1 && "One-bit divide should be folded away");
		Constant *Zero = Constant::getNullValue(getType());
		Constant *NegOne = Constant::getAllOnesValue(getType());
		Constant *One = ConstantInt::get(getType(), 1);
		Constant *SignedMin =
		ConstantInt::get(getType(), APInt::getSignedMinValue(BitWidth));
		Constant *IsZero =
		ConstantExpr::getICmp(CmpInst::ICMP_EQ, Op1, Zero);
		Constant *IsNegOne =
		ConstantExpr::getICmp(CmpInst::ICMP_EQ, Op1, NegOne);
		Constant *IsIntMin =
		ConstantExpr::getICmp(CmpInst::ICMP_EQ, Op0, SignedMin);
		Constant *IsOverflowing = ConstantExpr::getAnd(IsNegOne, IsIntMin);
		Constant *IsUndefined = ConstantExpr::getOr(IsOverflowing, IsZero);
		Op1 = ConstantExpr::getSelect(IsUndefined, One, Op1);
		}
BinaryOperator *BO =		BinaryOperator *BO =
BinaryOperator::Create((Instruction::BinaryOps)getOpcode(),		BinaryOperator::Create((Instruction::BinaryOps)getOpcode(), Op0, Op1);
Ops[0], Ops[1]);
if (isa<OverflowingBinaryOperator>(BO)) {		if (isa<OverflowingBinaryOperator>(BO)) {
BO->setHasNoUnsignedWrap(SubclassOptionalData &		BO->setHasNoUnsignedWrap(SubclassOptionalData &
OverflowingBinaryOperator::NoUnsignedWrap);		OverflowingBinaryOperator::NoUnsignedWrap);
BO->setHasNoSignedWrap(SubclassOptionalData &		BO->setHasNoSignedWrap(SubclassOptionalData &
OverflowingBinaryOperator::NoSignedWrap);		OverflowingBinaryOperator::NoSignedWrap);
}		}
if (isa<PossiblyExactOperator>(BO))		if (isa<PossiblyExactOperator>(BO))
BO->setIsExact(SubclassOptionalData & PossiblyExactOperator::IsExact);		BO->setIsExact(SubclassOptionalData & PossiblyExactOperator::IsExact);
return BO;		return BO;
}		}
}		}
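
The denominator guard that the new `getAsInstruction` code builds out of `icmp` and `select` constant expressions can be illustrated with plain integers. The sketch below assumes 32-bit operands; `guardedUDiv` and `guardedSDiv` are hypothetical helper names, not LLVM API. In both cases the denominator is swapped for 1 whenever the original division would have undefined behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Unsigned case: the only problematic denominator is zero.
inline uint32_t guardedUDiv(uint32_t num, uint32_t den) {
  uint32_t safeDen = (den == 0) ? 1u : den;
  return num / safeDen;
}

// Signed case: guard against a zero denominator and against INT_MIN / -1,
// which overflows.
inline int32_t guardedSDiv(int32_t num, int32_t den) {
  const int32_t intMin = std::numeric_limits<int32_t>::min();
  bool isZero = (den == 0);
  bool isOverflowing = (den == -1) && (num == intMin);
  int32_t safeDen = (isZero || isOverflowing) ? 1 : den;
  assert(safeDen != 0 && "guarded denominator is never zero");
  return num / safeDen;
}
```

When the guard fires, the result is simply the numerator divided by 1. This sketch does not claim that value is meaningful, only that the division no longer traps.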

lib/Transforms/Utils/SimplifyCFG.cpp

Show First 20 Lines • Show All 302 Lines • ▼ Show 20 Lines
}		}

/// Compute an abstract "cost" of speculating the given instruction,		/// Compute an abstract "cost" of speculating the given instruction,
/// which is assumed to be safe to speculate. TCC_Free means cheap,		/// which is assumed to be safe to speculate. TCC_Free means cheap,
/// TCC_Basic means less cheap, and TCC_Expensive means prohibitively		/// TCC_Basic means less cheap, and TCC_Expensive means prohibitively
/// expensive.		/// expensive.
static unsigned ComputeSpeculationCost(const User *I,		static unsigned ComputeSpeculationCost(const User *I,
const TargetTransformInfo &TTI) {		const TargetTransformInfo &TTI) {
assert(isSafeToSpeculativelyExecute(I) &&		assert(!isa<Instruction>(I) ||
		isSafeToSpeculativelyExecute(cast<Instruction>(I)) &&
"Instruction is not safe to speculatively execute!");		"Instruction is not safe to speculatively execute!");
return TTI.getUserCost(I);		return TTI.getUserCost(I);
}		}

/// If we have a merge point of an "if condition" as accepted above,		/// If we have a merge point of an "if condition" as accepted above,
/// return true if the specified value dominates the block. We		/// return true if the specified value dominates the block. We
/// don't handle the true generality of domination here, just a special case		/// don't handle the true generality of domination here, just a special case
/// which works well enough for us.		/// which works well enough for us.
Show All 18 Lines	static bool DominatesMergePoint(Value V, BasicBlock BB,
// It is possible to hit a zero-cost cycle (phi/gep instructions for example),		// It is possible to hit a zero-cost cycle (phi/gep instructions for example),
// so limit the recursion depth.		// so limit the recursion depth.
// TODO: While this recursion limit does prevent pathological behavior, it		// TODO: While this recursion limit does prevent pathological behavior, it
// would be better to track visited instructions to avoid cycles.		// would be better to track visited instructions to avoid cycles.
if (Depth == MaxSpeculationDepth)		if (Depth == MaxSpeculationDepth)
return false;		return false;

Instruction *I = dyn_cast<Instruction>(V);		Instruction *I = dyn_cast<Instruction>(V);
if (!I) {		if (!I)
// Non-instructions all dominate instructions, but not all constantexprs
// can be executed unconditionally.
if (ConstantExpr *C = dyn_cast<ConstantExpr>(V))
if (C->canTrap())
return false;
return true;		return true;
}
BasicBlock *PBB = I->getParent();		BasicBlock *PBB = I->getParent();

// We don't want to allow weird loops that might have the "if condition" in		// We don't want to allow weird loops that might have the "if condition" in
// the bottom of this block.		// the bottom of this block.
if (PBB == BB)		if (PBB == BB)
return false;		return false;

// If this instruction is defined in a block that contains an unconditional		// If this instruction is defined in a block that contains an unconditional
▲ Show 20 Lines • Show All 1,011 Lines • ▼ Show 20 Lines	for (PHINode &PN : Succ->phis()) {
if (BB1V == BB2V)		if (BB1V == BB2V)
continue;		continue;

// Check for passingValueIsAlwaysUndefined here because we would rather		// Check for passingValueIsAlwaysUndefined here because we would rather
// eliminate undefined control flow then converting it to a select.		// eliminate undefined control flow then converting it to a select.
if (passingValueIsAlwaysUndefined(BB1V, &PN) ||		if (passingValueIsAlwaysUndefined(BB1V, &PN) ||
passingValueIsAlwaysUndefined(BB2V, &PN))		passingValueIsAlwaysUndefined(BB2V, &PN))
return Changed;		return Changed;

if (isa<ConstantExpr>(BB1V) && !isSafeToSpeculativelyExecute(BB1V))
return Changed;
if (isa<ConstantExpr>(BB2V) && !isSafeToSpeculativelyExecute(BB2V))
return Changed;
}		}
}		}

// Okay, it is safe to hoist the terminator.		// Okay, it is safe to hoist the terminator.
Instruction *NT = I1->clone();		Instruction *NT = I1->clone();
BIParent->getInstList().insert(BI->getIterator(), NT);		BIParent->getInstList().insert(BI->getIterator(), NT);
if (!NT->getType()->isVoidTy()) {		if (!NT->getType()->isVoidTy()) {
I1->replaceAllUsesWith(NT);		I1->replaceAllUsesWith(NT);
▲ Show 20 Lines • Show All 657 Lines • ▼ Show 20 Lines	if (passingValueIsAlwaysUndefined(OrigV, &PN) \|\|
return false;		return false;

HaveRewritablePHIs = true;		HaveRewritablePHIs = true;
ConstantExpr *OrigCE = dyn_cast<ConstantExpr>(OrigV);		ConstantExpr *OrigCE = dyn_cast<ConstantExpr>(OrigV);
ConstantExpr *ThenCE = dyn_cast<ConstantExpr>(ThenV);		ConstantExpr *ThenCE = dyn_cast<ConstantExpr>(ThenV);
if (!OrigCE && !ThenCE)		if (!OrigCE && !ThenCE)
continue; // Known safe and cheap.		continue; // Known safe and cheap.

if ((ThenCE && !isSafeToSpeculativelyExecute(ThenCE)) ||
(OrigCE && !isSafeToSpeculativelyExecute(OrigCE)))
return false;
unsigned OrigCost = OrigCE ? ComputeSpeculationCost(OrigCE, TTI) : 0;		unsigned OrigCost = OrigCE ? ComputeSpeculationCost(OrigCE, TTI) : 0;
unsigned ThenCost = ThenCE ? ComputeSpeculationCost(ThenCE, TTI) : 0;		unsigned ThenCost = ThenCE ? ComputeSpeculationCost(ThenCE, TTI) : 0;
unsigned MaxCost =		unsigned MaxCost =
2 * PHINodeFoldingThreshold * TargetTransformInfo::TCC_Basic;		2 * PHINodeFoldingThreshold * TargetTransformInfo::TCC_Basic;
if (OrigCost + ThenCost > MaxCost)		if (OrigCost + ThenCost > MaxCost)
return false;		return false;

// Account for the cost of an unfolded ConstantExpr which could end up		// Account for the cost of an unfolded ConstantExpr which could end up
▲ Show 20 Lines • Show All 392 Lines • ▼ Show 20 Lines	static bool SimplifyCondBranchToTwoReturns(BranchInst *BI,
// Unwrap any PHI nodes in the return blocks.		// Unwrap any PHI nodes in the return blocks.
if (PHINode *TVPN = dyn_cast_or_null<PHINode>(TrueValue))		if (PHINode *TVPN = dyn_cast_or_null<PHINode>(TrueValue))
if (TVPN->getParent() == TrueSucc)		if (TVPN->getParent() == TrueSucc)
TrueValue = TVPN->getIncomingValueForBlock(BI->getParent());		TrueValue = TVPN->getIncomingValueForBlock(BI->getParent());
if (PHINode *FVPN = dyn_cast_or_null<PHINode>(FalseValue))		if (PHINode *FVPN = dyn_cast_or_null<PHINode>(FalseValue))
if (FVPN->getParent() == FalseSucc)		if (FVPN->getParent() == FalseSucc)
FalseValue = FVPN->getIncomingValueForBlock(BI->getParent());		FalseValue = FVPN->getIncomingValueForBlock(BI->getParent());

// In order for this transformation to be safe, we must be able to
// unconditionally execute both operands to the return. This is
// normally the case, but we could have a potentially-trapping
// constant expression that prevents this transformation from being
// safe.
if (ConstantExpr *TCV = dyn_cast_or_null<ConstantExpr>(TrueValue))
if (TCV->canTrap())
return false;
if (ConstantExpr *FCV = dyn_cast_or_null<ConstantExpr>(FalseValue))
if (FCV->canTrap())
return false;

// Okay, we collected all the mapped values and checked them for sanity, and		// Okay, we collected all the mapped values and checked them for sanity, and
// defined to really do this transformation. First, update the CFG.		// defined to really do this transformation. First, update the CFG.
TrueSucc->removePredecessor(BI->getParent());		TrueSucc->removePredecessor(BI->getParent());
FalseSucc->removePredecessor(BI->getParent());		FalseSucc->removePredecessor(BI->getParent());

// Insert select instructions where needed.		// Insert select instructions where needed.
Value *BrCond = BI->getCondition();		Value *BrCond = BI->getCondition();
if (TrueValue) {		if (TrueValue) {
▲ Show 20 Lines • Show All 139 Lines • ▼ Show 20 Lines	for (auto I = BB->begin(); Cond != &*I; ++I) {
// Account for the cost of duplicating this instruction into each		// Account for the cost of duplicating this instruction into each
// predecessor.		// predecessor.
NumBonusInsts += PredCount;		NumBonusInsts += PredCount;
// Early exits once we reach the limit.		// Early exits once we reach the limit.
if (NumBonusInsts > BonusInstThreshold)		if (NumBonusInsts > BonusInstThreshold)
return false;		return false;
}		}

// Cond is known to be a compare or binary operator. Check to make sure that
// neither operand is a potentially-trapping constant expression.
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Cond->getOperand(0)))
if (CE->canTrap())
return false;
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(Cond->getOperand(1)))
if (CE->canTrap())
return false;

// Finally, don't infinitely unroll conditional loops.		// Finally, don't infinitely unroll conditional loops.
BasicBlock *TrueDest = BI->getSuccessor(0);		BasicBlock *TrueDest = BI->getSuccessor(0);
BasicBlock *FalseDest = (BI->isConditional()) ? BI->getSuccessor(1) : nullptr;		BasicBlock *FalseDest = (BI->isConditional()) ? BI->getSuccessor(1) : nullptr;
if (TrueDest == BB || FalseDest == BB)		if (TrueDest == BB || FalseDest == BB)
return false;		return false;

for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {		for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
BasicBlock *PredBlock = *PI;		BasicBlock *PredBlock = *PI;
▲ Show 20 Lines • Show All 585 Lines • ▼ Show 20 Lines	if (BlockIsSimpleEnoughToThreadThrough(BB)) {
}		}
}		}

BI->setCondition(NewPN);		BI->setCondition(NewPN);
return true;		return true;
}		}
}		}

if (auto *CE = dyn_cast<ConstantExpr>(BI->getCondition()))
if (CE->canTrap())
return false;

// If both branches are conditional and both contain stores to the same		// If both branches are conditional and both contain stores to the same
// address, remove the stores from the conditionals and create a conditional		// address, remove the stores from the conditionals and create a conditional
// merged store at the end.		// merged store at the end.
if (MergeCondStores && mergeConditionalStores(PBI, BI, DL))		if (MergeCondStores && mergeConditionalStores(PBI, BI, DL))
return true;		return true;

// If this is a conditional branch in an empty block, and if any		// If this is a conditional branch in an empty block, and if any
// predecessors are a conditional branch to one of our destinations,		// predecessors are a conditional branch to one of our destinations,
Show All 24 Lines	static bool SimplifyCondBranchToCondBranch(BranchInst PBI, BranchInst BI,
// isn't BB itself. If so, this is an infinite loop that will		// isn't BB itself. If so, this is an infinite loop that will
// keep getting unwound.		// keep getting unwound.
if (PBI->getSuccessor(PBIOp) == BB)		if (PBI->getSuccessor(PBIOp) == BB)
return false;		return false;

// Do not perform this transformation if it would require		// Do not perform this transformation if it would require
// insertion of a large number of select instructions. For targets		// insertion of a large number of select instructions. For targets
// without predication/cmovs, this is a big pessimization.		// without predication/cmovs, this is a big pessimization.

// Also do not perform this transformation if any phi node in the common
// destination block can trap when reached by BB or PBB (PR17073). In that
// case, it would be unsafe to hoist the operation into a select instruction.

BasicBlock *CommonDest = PBI->getSuccessor(PBIOp);		BasicBlock *CommonDest = PBI->getSuccessor(PBIOp);
unsigned NumPhis = 0;		unsigned NumPhis = 0;
for (BasicBlock::iterator II = CommonDest->begin(); isa<PHINode>(II);		for (BasicBlock::iterator II = CommonDest->begin(); isa<PHINode>(II);
++II, ++NumPhis) {		++II, ++NumPhis) {
if (NumPhis > 2) // Disable this xform.		if (NumPhis > 2) // Disable this xform.
return false;		return false;

PHINode *PN = cast<PHINode>(II);
Value *BIV = PN->getIncomingValueForBlock(BB);
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(BIV))
if (CE->canTrap())
return false;

unsigned PBBIdx = PN->getBasicBlockIndex(PBI->getParent());
Value *PBIV = PN->getIncomingValue(PBBIdx);
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(PBIV))
if (CE->canTrap())
return false;
}		}

// Finally, if everything is ok, fold the branches to logical ops.		// Finally, if everything is ok, fold the branches to logical ops.
BasicBlock *OtherDest = BI->getSuccessor(BIOp ^ 1);		BasicBlock *OtherDest = BI->getSuccessor(BIOp ^ 1);

LLVM_DEBUG(dbgs() << "FOLDING BRs:" << *PBI->getParent()		LLVM_DEBUG(dbgs() << "FOLDING BRs:" << *PBI->getParent()
<< "AND: " << *BI->getParent());		<< "AND: " << *BI->getParent());


lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

Show First 20 Lines • Show All 372 Lines • ▼ Show 20 Lines	static bool isUniformLoopNest(Loop Lp, Loop OuterLp) {
// Check if nested loops are uniform.		// Check if nested loops are uniform.
for (Loop *SubLp : *Lp)		for (Loop *SubLp : *Lp)
if (!isUniformLoopNest(SubLp, OuterLp))		if (!isUniformLoopNest(SubLp, OuterLp))
return false;		return false;

return true;		return true;
}		}

/// Check whether it is safe to if-convert this phi node.
///
/// Phi nodes with constant expressions that can trap are not safe to if
/// convert.
static bool canIfConvertPHINodes(BasicBlock *BB) {
for (PHINode &Phi : BB->phis()) {
for (Value *V : Phi.incoming_values())
if (auto *C = dyn_cast<Constant>(V))
if (C->canTrap())
return false;
}
return true;
}

static Type *convertPointerToIntegerType(const DataLayout &DL, Type *Ty) {		static Type *convertPointerToIntegerType(const DataLayout &DL, Type *Ty) {
if (Ty->isPointerTy())		if (Ty->isPointerTy())
return DL.getIntPtrType(Ty);		return DL.getIntPtrType(Ty);

// It is possible that char's or short's overflow when we ask for the loop's		// It is possible that char's or short's overflow when we ask for the loop's
// trip count, work around this by changing the type size.		// trip count, work around this by changing the type size.
if (Ty->getScalarSizeInBits() < 32)		if (Ty->getScalarSizeInBits() < 32)
return Type::getInt32Ty(Ty->getContext());		return Type::getInt32Ty(Ty->getContext());
▲ Show 20 Lines • Show All 469 Lines • ▼ Show 20 Lines	bool LoopVectorizationLegality::blockNeedsPredication(BasicBlock *BB) {
return LoopAccessInfo::blockNeedsPredication(BB, TheLoop, DT);		return LoopAccessInfo::blockNeedsPredication(BB, TheLoop, DT);
}		}

bool LoopVectorizationLegality::blockCanBePredicated(		bool LoopVectorizationLegality::blockCanBePredicated(
BasicBlock *BB, SmallPtrSetImpl<Value *> &SafePtrs) {		BasicBlock *BB, SmallPtrSetImpl<Value *> &SafePtrs) {
const bool IsAnnotatedParallel = TheLoop->isAnnotatedParallel();		const bool IsAnnotatedParallel = TheLoop->isAnnotatedParallel();

for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
// Check that we don't have a constant expression that can trap as operand.
for (Value *Operand : I.operands()) {
if (auto *C = dyn_cast<Constant>(Operand))
if (C->canTrap())
return false;
}
// We might be able to hoist the load.		// We might be able to hoist the load.
if (I.mayReadFromMemory()) {		if (I.mayReadFromMemory()) {
auto *LI = dyn_cast<LoadInst>(&I);		auto *LI = dyn_cast<LoadInst>(&I);
if (!LI)		if (!LI)
return false;		return false;
if (!SafePtrs.count(LI->getPointerOperand())) {		if (!SafePtrs.count(LI->getPointerOperand())) {
// !llvm.mem.parallel_loop_access implies if-conversion safety.		// !llvm.mem.parallel_loop_access implies if-conversion safety.
// Otherwise, record that the load needs (real or emulated) masking		// Otherwise, record that the load needs (real or emulated) masking
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	if (blockNeedsPredication(BB))
continue;		continue;

for (Instruction &I : *BB)		for (Instruction &I : *BB)
if (auto *Ptr = getLoadStorePointerOperand(&I))		if (auto *Ptr = getLoadStorePointerOperand(&I))
SafePointes.insert(Ptr);		SafePointes.insert(Ptr);
}		}

// Collect the blocks that need predication.		// Collect the blocks that need predication.
BasicBlock *Header = TheLoop->getHeader();
for (BasicBlock *BB : TheLoop->blocks()) {		for (BasicBlock *BB : TheLoop->blocks()) {
// We don't support switch statements inside loops.		// We don't support switch statements inside loops.
if (!isa<BranchInst>(BB->getTerminator())) {		if (!isa<BranchInst>(BB->getTerminator())) {
reportVectorizationFailure("Loop contains a switch statement",		reportVectorizationFailure("Loop contains a switch statement",
"loop contains a switch statement",		"loop contains a switch statement",
"LoopContainsSwitch", BB->getTerminator());		"LoopContainsSwitch", BB->getTerminator());
return false;		return false;
}		}

// We must be able to predicate all blocks that need to be predicated.		// We must be able to predicate all blocks that need to be predicated.
if (blockNeedsPredication(BB)) {		if (blockNeedsPredication(BB)) {
if (!blockCanBePredicated(BB, SafePointes)) {		if (!blockCanBePredicated(BB, SafePointes)) {
reportVectorizationFailure(		reportVectorizationFailure(
"Control flow cannot be substituted for a select",		"Control flow cannot be substituted for a select",
"control flow cannot be substituted for a select",		"control flow cannot be substituted for a select",
"NoCFGForSelect", BB->getTerminator());		"NoCFGForSelect", BB->getTerminator());
return false;		return false;
}		}
} else if (BB != Header && !canIfConvertPHINodes(BB)) {
reportVectorizationFailure(
"Control flow cannot be substituted for a select",
"control flow cannot be substituted for a select",
"NoCFGForSelect", BB->getTerminator());
return false;
}		}
}		}

// We can if-convert this loop.		// We can if-convert this loop.
return true;		return true;
}		}

// Helper function to canVectorizeLoopNestCFG.		// Helper function to canVectorizeLoopNestCFG.

test/CodeGen/X86/critical-edge-split-2.ll

	Show All 15 Lines
	; CHECK-NEXT: jne .LBB0_2			; CHECK-NEXT: jne .LBB0_2
	; CHECK-NEXT: # %bb.1: # %cond.false.i			; CHECK-NEXT: # %bb.1: # %cond.false.i
	; CHECK-NEXT: movl $g_4, %eax			; CHECK-NEXT: movl $g_4, %eax
	; CHECK-NEXT: movl $g_2+4, %ecx			; CHECK-NEXT: movl $g_2+4, %ecx
	; CHECK-NEXT: xorl %esi, %esi			; CHECK-NEXT: xorl %esi, %esi
	; CHECK-NEXT: cmpq %rax, %rcx			; CHECK-NEXT: cmpq %rax, %rcx
	; CHECK-NEXT: sete %sil			; CHECK-NEXT: sete %sil
	; CHECK-NEXT: movl $1, %eax			; CHECK-NEXT: movl $1, %eax
				; CHECK-NEXT: cmovnel %eax, %esi
				; CHECK-NEXT: movl $1, %eax
	; CHECK-NEXT: xorl %edx, %edx			; CHECK-NEXT: xorl %edx, %edx
	; CHECK-NEXT: divl %esi			; CHECK-NEXT: divl %esi
	; CHECK-NEXT: movl %edx, %eax			; CHECK-NEXT: movl %edx, %eax
	; CHECK-NEXT: .LBB0_2: # %cond.end.i			; CHECK-NEXT: .LBB0_2: # %cond.end.i
	; CHECK-NEXT: # kill: def $ax killed $ax killed $eax			; CHECK-NEXT: # kill: def $ax killed $ax killed $eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	br i1 %C, label %cond.end.i, label %cond.false.i			br i1 %C, label %cond.end.i, label %cond.false.i
	Show All 9 Lines

test/CodeGen/X86/divide-constant-expression.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-linux-gnu -verify-machineinstrs | FileCheck %s -check-prefix=SDAG
				; RUN: llc < %s -mtriple=x86_64-linux-gnu -fast-isel -verify-machineinstrs | FileCheck %s -check-prefix=FAST
				; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=0 -verify-machineinstrs | FileCheck %s -check-prefix=GLOBAL

				@g1 = extern_weak global i8
				@g2 = extern_weak global i8

				define i32 @test1(i1 %c) {
				; SDAG-LABEL: test1:
				; SDAG: # %bb.0: # %entry
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; SDAG-NEXT: movl $g2, %esi
				; SDAG-NEXT: movl $g2, %ecx
				; SDAG-NEXT: notl %ecx
				; SDAG-NEXT: orl %eax, %ecx
				; SDAG-NEXT: sete %al
				; SDAG-NEXT: testl %esi, %esi
				; SDAG-NEXT: sete %cl
				; SDAG-NEXT: orb %al, %cl
				; SDAG-NEXT: movl $1, %ecx
				; SDAG-NEXT: cmovnel %ecx, %esi
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: cltd
				; SDAG-NEXT: idivl %esi
				; SDAG-NEXT: testb $1, %dil
				; SDAG-NEXT: je .LBB0_2
				; SDAG-NEXT: # %bb.1:
				; SDAG-NEXT: movl %eax, %ecx
				; SDAG-NEXT: .LBB0_2: # %cond.end.i
				; SDAG-NEXT: movl %ecx, %eax
				; SDAG-NEXT: retq
				;
				; FAST-LABEL: test1:
				; FAST: # %bb.0: # %entry
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; FAST-NEXT: movl $g2, %esi
				; FAST-NEXT: movl $g2, %ecx
				; FAST-NEXT: notl %ecx
				; FAST-NEXT: orl %eax, %ecx
				; FAST-NEXT: sete %al
				; FAST-NEXT: testl %esi, %esi
				; FAST-NEXT: sete %cl
				; FAST-NEXT: orb %al, %cl
				; FAST-NEXT: movl $1, %ecx
				; FAST-NEXT: cmovnel %ecx, %esi
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: cltd
				; FAST-NEXT: idivl %esi
				; FAST-NEXT: testb $1, %dil
				; FAST-NEXT: je .LBB0_2
				; FAST-NEXT: # %bb.1:
				; FAST-NEXT: movl %eax, %ecx
				; FAST-NEXT: .LBB0_2: # %cond.end.i
				; FAST-NEXT: movl %ecx, %eax
				; FAST-NEXT: retq
				;
				; GLOBAL-LABEL: test1:
				; GLOBAL: # %bb.0: # %entry
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; GLOBAL-NEXT: movl $g2, %esi
				; GLOBAL-NEXT: movl $g2, %ecx
				; GLOBAL-NEXT: notl %ecx
				; GLOBAL-NEXT: orl %eax, %ecx
				; GLOBAL-NEXT: sete %al
				; GLOBAL-NEXT: testl %esi, %esi
				; GLOBAL-NEXT: sete %cl
				; GLOBAL-NEXT: orb %al, %cl
				; GLOBAL-NEXT: movl $1, %ecx
				; GLOBAL-NEXT: cmovnel %ecx, %esi
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: cltd
				; GLOBAL-NEXT: idivl %esi
				; GLOBAL-NEXT: testb $1, %dil
				; GLOBAL-NEXT: je .LBB0_2
				; GLOBAL-NEXT: # %bb.1:
				; GLOBAL-NEXT: movl %eax, %ecx
				; GLOBAL-NEXT: .LBB0_2: # %cond.end.i
				; GLOBAL-NEXT: movl %ecx, %eax
				; GLOBAL-NEXT: retq
				entry:
				br i1 %c, label %cond.end.i, label %cond.false.i

				cond.false.i:
				br label %cond.end.i

				cond.end.i:
				%r = phi i32 [ sdiv (i32 ptrtoint (i8* @g1 to i32), i32 ptrtoint (i8* @g2 to i32)), %entry ], [ 1, %cond.false.i ]
				ret i32 %r
				}

				define i32 @test2(i1 %c) {
				; SDAG-LABEL: test2:
				; SDAG: # %bb.0: # %entry
				; SDAG-NEXT: movl $g2, %esi
				; SDAG-NEXT: cmpl $1, %esi
				; SDAG-NEXT: movl $1, %ecx
				; SDAG-NEXT: cmovbel %ecx, %esi
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: xorl %edx, %edx
				; SDAG-NEXT: divl %esi
				; SDAG-NEXT: testb $1, %dil
				; SDAG-NEXT: je .LBB1_2
				; SDAG-NEXT: # %bb.1:
				; SDAG-NEXT: movl %eax, %ecx
				; SDAG-NEXT: .LBB1_2: # %cond.end.i
				; SDAG-NEXT: movl %ecx, %eax
				; SDAG-NEXT: retq
				;
				; FAST-LABEL: test2:
				; FAST: # %bb.0: # %entry
				; FAST-NEXT: movl $g2, %esi
				; FAST-NEXT: cmpl $1, %esi
				; FAST-NEXT: movl $1, %ecx
				; FAST-NEXT: cmovbel %ecx, %esi
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: xorl %edx, %edx
				; FAST-NEXT: divl %esi
				; FAST-NEXT: testb $1, %dil
				; FAST-NEXT: je .LBB1_2
				; FAST-NEXT: # %bb.1:
				; FAST-NEXT: movl %eax, %ecx
				; FAST-NEXT: .LBB1_2: # %cond.end.i
				; FAST-NEXT: movl %ecx, %eax
				; FAST-NEXT: retq
				;
				; GLOBAL-LABEL: test2:
				; GLOBAL: # %bb.0: # %entry
				; GLOBAL-NEXT: movl $g2, %esi
				; GLOBAL-NEXT: cmpl $1, %esi
				; GLOBAL-NEXT: movl $1, %ecx
				; GLOBAL-NEXT: cmovbel %ecx, %esi
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: xorl %edx, %edx
				; GLOBAL-NEXT: divl %esi
				; GLOBAL-NEXT: testb $1, %dil
				; GLOBAL-NEXT: je .LBB1_2
				; GLOBAL-NEXT: # %bb.1:
				; GLOBAL-NEXT: movl %eax, %ecx
				; GLOBAL-NEXT: .LBB1_2: # %cond.end.i
				; GLOBAL-NEXT: movl %ecx, %eax
				; GLOBAL-NEXT: retq
				entry:
				br i1 %c, label %cond.end.i, label %cond.false.i

				cond.false.i:
				br label %cond.end.i

				cond.end.i:
				%r = phi i32 [ udiv (i32 ptrtoint (i8* @g1 to i32), i32 ptrtoint (i8* @g2 to i32)), %entry ], [ 1, %cond.false.i ]
				ret i32 %r
				}

				define i32 @test3(i1 %c) {
				; SDAG-LABEL: test3:
				; SDAG: # %bb.0: # %entry
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; SDAG-NEXT: movl $g2, %esi
				; SDAG-NEXT: movl $g2, %ecx
				; SDAG-NEXT: notl %ecx
				; SDAG-NEXT: orl %eax, %ecx
				; SDAG-NEXT: sete %al
				; SDAG-NEXT: testl %esi, %esi
				; SDAG-NEXT: sete %cl
				; SDAG-NEXT: orb %al, %cl
				; SDAG-NEXT: movl $1, %ecx
				; SDAG-NEXT: cmovnel %ecx, %esi
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: cltd
				; SDAG-NEXT: idivl %esi
				; SDAG-NEXT: testb $1, %dil
				; SDAG-NEXT: je .LBB2_2
				; SDAG-NEXT: # %bb.1:
				; SDAG-NEXT: movl %edx, %ecx
				; SDAG-NEXT: .LBB2_2: # %cond.end.i
				; SDAG-NEXT: movl %ecx, %eax
				; SDAG-NEXT: retq
				;
				; FAST-LABEL: test3:
				; FAST: # %bb.0: # %entry
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; FAST-NEXT: movl $g2, %esi
				; FAST-NEXT: movl $g2, %ecx
				; FAST-NEXT: notl %ecx
				; FAST-NEXT: orl %eax, %ecx
				; FAST-NEXT: sete %al
				; FAST-NEXT: testl %esi, %esi
				; FAST-NEXT: sete %cl
				; FAST-NEXT: orb %al, %cl
				; FAST-NEXT: movl $1, %ecx
				; FAST-NEXT: cmovnel %ecx, %esi
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: cltd
				; FAST-NEXT: idivl %esi
				; FAST-NEXT: testb $1, %dil
				; FAST-NEXT: je .LBB2_2
				; FAST-NEXT: # %bb.1:
				; FAST-NEXT: movl %edx, %ecx
				; FAST-NEXT: .LBB2_2: # %cond.end.i
				; FAST-NEXT: movl %ecx, %eax
				; FAST-NEXT: retq
				;
				; GLOBAL-LABEL: test3:
				; GLOBAL: # %bb.0: # %entry
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; GLOBAL-NEXT: movl $g2, %esi
				; GLOBAL-NEXT: movl $g2, %ecx
				; GLOBAL-NEXT: notl %ecx
				; GLOBAL-NEXT: orl %eax, %ecx
				; GLOBAL-NEXT: sete %al
				; GLOBAL-NEXT: testl %esi, %esi
				; GLOBAL-NEXT: sete %cl
				; GLOBAL-NEXT: orb %al, %cl
				; GLOBAL-NEXT: movl $1, %ecx
				; GLOBAL-NEXT: cmovnel %ecx, %esi
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: cltd
				; GLOBAL-NEXT: idivl %esi
				; GLOBAL-NEXT: testb $1, %dil
				; GLOBAL-NEXT: je .LBB2_2
				; GLOBAL-NEXT: # %bb.1:
				; GLOBAL-NEXT: movl %edx, %ecx
				; GLOBAL-NEXT: .LBB2_2: # %cond.end.i
				; GLOBAL-NEXT: movl %ecx, %eax
				; GLOBAL-NEXT: retq
				entry:
				br i1 %c, label %cond.end.i, label %cond.false.i

				cond.false.i:
				br label %cond.end.i

				cond.end.i:
				%r = phi i32 [ srem (i32 ptrtoint (i8* @g1 to i32), i32 ptrtoint (i8* @g2 to i32)), %entry ], [ 1, %cond.false.i ]
				ret i32 %r
				}

				define i32 @test4(i1 %c) {
				; SDAG-LABEL: test4:
				; SDAG: # %bb.0: # %entry
				; SDAG-NEXT: movl $g2, %esi
				; SDAG-NEXT: cmpl $1, %esi
				; SDAG-NEXT: movl $1, %ecx
				; SDAG-NEXT: cmovbel %ecx, %esi
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: xorl %edx, %edx
				; SDAG-NEXT: divl %esi
				; SDAG-NEXT: testb $1, %dil
				; SDAG-NEXT: je .LBB3_2
				; SDAG-NEXT: # %bb.1:
				; SDAG-NEXT: movl %edx, %ecx
				; SDAG-NEXT: .LBB3_2: # %cond.end.i
				; SDAG-NEXT: movl %ecx, %eax
				; SDAG-NEXT: retq
				;
				; FAST-LABEL: test4:
				; FAST: # %bb.0: # %entry
				; FAST-NEXT: movl $g2, %esi
				; FAST-NEXT: cmpl $1, %esi
				; FAST-NEXT: movl $1, %ecx
				; FAST-NEXT: cmovbel %ecx, %esi
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: xorl %edx, %edx
				; FAST-NEXT: divl %esi
				; FAST-NEXT: testb $1, %dil
				; FAST-NEXT: je .LBB3_2
				; FAST-NEXT: # %bb.1:
				; FAST-NEXT: movl %edx, %ecx
				; FAST-NEXT: .LBB3_2: # %cond.end.i
				; FAST-NEXT: movl %ecx, %eax
				; FAST-NEXT: retq
				;
				; GLOBAL-LABEL: test4:
				; GLOBAL: # %bb.0: # %entry
				; GLOBAL-NEXT: movl $g2, %esi
				; GLOBAL-NEXT: cmpl $1, %esi
				; GLOBAL-NEXT: movl $1, %ecx
				; GLOBAL-NEXT: cmovbel %ecx, %esi
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: xorl %edx, %edx
				; GLOBAL-NEXT: divl %esi
				; GLOBAL-NEXT: testb $1, %dil
				; GLOBAL-NEXT: je .LBB3_2
				; GLOBAL-NEXT: # %bb.1:
				; GLOBAL-NEXT: movl %edx, %ecx
				; GLOBAL-NEXT: .LBB3_2: # %cond.end.i
				; GLOBAL-NEXT: movl %ecx, %eax
				; GLOBAL-NEXT: retq
				entry:
				br i1 %c, label %cond.end.i, label %cond.false.i

				cond.false.i:
				br label %cond.end.i

				cond.end.i:
				%r = phi i32 [ urem (i32 ptrtoint (i8* @g1 to i32), i32 ptrtoint (i8* @g2 to i32)), %entry ], [ 1, %cond.false.i ]
				ret i32 %r
				}

				define i32 @test5(i32 %c) {
				; SDAG-LABEL: test5:
				; SDAG: # %bb.0: # %entry
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; SDAG-NEXT: movl $g2, %ecx
				; SDAG-NEXT: movl $g2, %edx
				; SDAG-NEXT: notl %edx
				; SDAG-NEXT: orl %eax, %edx
				; SDAG-NEXT: sete %al
				; SDAG-NEXT: testl %ecx, %ecx
				; SDAG-NEXT: sete %dl
				; SDAG-NEXT: orb %al, %dl
				; SDAG-NEXT: movl $1, %eax
				; SDAG-NEXT: cmovnel %eax, %ecx
				; SDAG-NEXT: movl $g1, %eax
				; SDAG-NEXT: cltd
				; SDAG-NEXT: idivl %ecx
				; SDAG-NEXT: #APP
				; SDAG-NEXT: #NO_APP
				; SDAG-NEXT: .Ltmp0: # Block address taken
				; SDAG-NEXT: .LBB4_1: # %cond.false.i
				; SDAG-NEXT: movl $1, %eax
				; SDAG-NEXT: .LBB4_2: # %cond.end.i
				; SDAG-NEXT: retq
				;
				; FAST-LABEL: test5:
				; FAST: # %bb.0: # %entry
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; FAST-NEXT: movl $g2, %ecx
				; FAST-NEXT: movl $g2, %edx
				; FAST-NEXT: notl %edx
				; FAST-NEXT: orl %eax, %edx
				; FAST-NEXT: sete %al
				; FAST-NEXT: testl %ecx, %ecx
				; FAST-NEXT: sete %dl
				; FAST-NEXT: orb %al, %dl
				; FAST-NEXT: movl $1, %eax
				; FAST-NEXT: cmovnel %eax, %ecx
				; FAST-NEXT: movl $g1, %eax
				; FAST-NEXT: cltd
				; FAST-NEXT: idivl %ecx
				; FAST-NEXT: #APP
				; FAST-NEXT: #NO_APP
				; FAST-NEXT: .Ltmp0: # Block address taken
				; FAST-NEXT: .LBB4_1: # %cond.false.i
				; FAST-NEXT: movl $1, %eax
				; FAST-NEXT: .LBB4_2: # %cond.end.i
				; FAST-NEXT: retq
				;
				; GLOBAL-LABEL: test5:
				; GLOBAL: # %bb.0: # %entry
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; GLOBAL-NEXT: movl $g2, %ecx
				; GLOBAL-NEXT: movl $g2, %edx
				; GLOBAL-NEXT: notl %edx
				; GLOBAL-NEXT: orl %eax, %edx
				; GLOBAL-NEXT: sete %al
				; GLOBAL-NEXT: testl %ecx, %ecx
				; GLOBAL-NEXT: sete %dl
				; GLOBAL-NEXT: orb %al, %dl
				; GLOBAL-NEXT: movl $1, %eax
				; GLOBAL-NEXT: cmovnel %eax, %ecx
				; GLOBAL-NEXT: movl $g1, %eax
				; GLOBAL-NEXT: cltd
				; GLOBAL-NEXT: idivl %ecx
				; GLOBAL-NEXT: #APP
				; GLOBAL-NEXT: #NO_APP
				; GLOBAL-NEXT: .Ltmp0: # Block address taken
				; GLOBAL-NEXT: .LBB4_1: # %cond.false.i
				; GLOBAL-NEXT: movl $1, %eax
				; GLOBAL-NEXT: .LBB4_2: # %cond.end.i
				; GLOBAL-NEXT: retq
				entry:
				callbr void asm "", "r,X"(i32 %c, i8 *blockaddress(@test5, %cond.false.i))
				to label %cond.false.i [label %cond.end.i]

				cond.false.i:
				br label %cond.end.i

				cond.end.i:
				%r = phi i32 [ sdiv (i32 ptrtoint (i8* @g1 to i32), i32 ptrtoint (i8* @g2 to i32)), %entry ], [ 1, %cond.false.i ]
				ret i32 %r
				}
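
Reading the SDAG/FAST/GLOBAL check lines above, the lowering appears to replace a divisor that would trap (zero, or -1 paired with an INT_MIN dividend for the signed forms) with 1 before issuing the divide, so the instruction can be executed unconditionally. The C++ below is a minimal sketch of that guard, not code from the patch. The helper names (safeSDiv, safeSRem, safeUDiv, safeURem) are hypothetical and only model the arithmetic on 32-bit values.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: pick a signed divisor that cannot trap.
// A signed divide traps on b == 0 and on INT_MIN / -1 (overflow).
inline int32_t guardSignedDivisor(int32_t a, int32_t b) {
  const bool divByZero = (b == 0);
  const bool overflow =
      (a == std::numeric_limits<int32_t>::min()) && (b == -1);
  return (divByZero || overflow) ? 1 : b;
}

// Hypothetical helper: an unsigned divide only traps on b == 0.
// (The generated code tests b <= 1, which selects the same divisor.)
inline uint32_t guardUnsignedDivisor(uint32_t b) {
  return (b == 0) ? 1u : b;
}

// Models the sdiv/srem/udiv/urem constant expressions in test1..test4.
inline int32_t safeSDiv(int32_t a, int32_t b) {
  return a / guardSignedDivisor(a, b);
}
inline int32_t safeSRem(int32_t a, int32_t b) {
  return a % guardSignedDivisor(a, b);
}
inline uint32_t safeUDiv(uint32_t a, uint32_t b) {
  return a / guardUnsignedDivisor(b);
}
inline uint32_t safeURem(uint32_t a, uint32_t b) {
  return a % guardUnsignedDivisor(b);
}
```

With this model, the result for a trapping input is simply whatever division by 1 yields. That is only one possible choice; the sketch does not claim this is the value LangRef assigns to such an expression.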

test/Transforms/LoopVectorize/X86/masked_load_store.ll


	for.end: ; preds = %for.inc			for.end: ; preds = %for.inc
	ret void			ret void
	}			}

	@a = common global [1 x i32*] zeroinitializer, align 8			@a = common global [1 x i32*] zeroinitializer, align 8
	@c = common global i32* null, align 8			@c = common global i32* null, align 8

	; The loop here should not be vectorized due to trapping			; Constant expressions never trap; check that we perform the transform
	; constant expression			; consistently.
				jdoerfert: I'm unsure about the sentence that talks about history. I think I'd prefer a statement about the semantic we have, thus, "constant expressions never trap, check ...". But I don't feel strongly about this.

	define void @foo5(i32* nocapture %A, i32* nocapture readnone %B, i32* nocapture readonly %trigger) local_unnamed_addr #0 {			define void @foo5(i32* nocapture %A, i32* nocapture readnone %B, i32* nocapture readonly %trigger) local_unnamed_addr #0 {
	; AVX-LABEL: @foo5(			; AVX-LABEL: @foo5(
	; AVX-NEXT: entry:			; AVX-NEXT: entry:
				; AVX-NEXT: [[A1:%.*]] = bitcast i32* [[A:%.*]] to i8*
				; AVX-NEXT: [[TRIGGER3:%.*]] = bitcast i32* [[TRIGGER:%.*]] to i8*
				; AVX-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
				; AVX: vector.memcheck:
				; AVX-NEXT: [[SCEVGEP:%.*]] = getelementptr i32, i32* [[A]], i64 10000
				; AVX-NEXT: [[SCEVGEP2:%.*]] = bitcast i32* [[SCEVGEP]] to i8*
				; AVX-NEXT: [[SCEVGEP4:%.*]] = getelementptr i32, i32* [[TRIGGER]], i64 10000
				; AVX-NEXT: [[SCEVGEP45:%.*]] = bitcast i32* [[SCEVGEP4]] to i8*
				; AVX-NEXT: [[BOUND0:%.*]] = icmp ult i8* [[A1]], [[SCEVGEP45]]
				; AVX-NEXT: [[BOUND1:%.*]] = icmp ult i8* [[TRIGGER3]], [[SCEVGEP2]]
				; AVX-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
				; AVX-NEXT: [[MEMCHECK_CONFLICT:%.*]] = and i1 [[FOUND_CONFLICT]], true
				; AVX-NEXT: br i1 [[MEMCHECK_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
				; AVX: vector.ph:
				; AVX-NEXT: br label [[VECTOR_BODY:%.*]]
				; AVX: vector.body:
				; AVX-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; AVX-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <8 x i64> undef, i64 [[INDEX]], i32 0
				; AVX-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <8 x i64> [[BROADCAST_SPLATINSERT]], <8 x i64> undef, <8 x i32> zeroinitializer
				; AVX-NEXT: [[INDUCTION:%.*]] = add <8 x i64> [[BROADCAST_SPLAT]], <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7>
				; AVX-NEXT: [[INDUCTION6:%.*]] = add <8 x i64> [[BROADCAST_SPLAT]], <i64 8, i64 9, i64 10, i64 11, i64 12, i64 13, i64 14, i64 15>
				; AVX-NEXT: [[INDUCTION7:%.*]] = add <8 x i64> [[BROADCAST_SPLAT]], <i64 16, i64 17, i64 18, i64 19, i64 20, i64 21, i64 22, i64 23>
				; AVX-NEXT: [[INDUCTION8:%.*]] = add <8 x i64> [[BROADCAST_SPLAT]], <i64 24, i64 25, i64 26, i64 27, i64 28, i64 29, i64 30, i64 31>
				; AVX-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
				; AVX-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 8
				; AVX-NEXT: [[TMP2:%.*]] = add i64 [[INDEX]], 16
				; AVX-NEXT: [[TMP3:%.*]] = add i64 [[INDEX]], 24
				; AVX-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP0]]
				; AVX-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP1]]
				; AVX-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP2]]
				; AVX-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP3]]
				; AVX-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 0
				; AVX-NEXT: [[TMP9:%.*]] = bitcast i32* [[TMP8]] to <8 x i32>*
				; AVX-NEXT: [[WIDE_LOAD:%.*]] = load <8 x i32>, <8 x i32>* [[TMP9]], align 4, !alias.scope !41
				; AVX-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 8
				; AVX-NEXT: [[TMP11:%.*]] = bitcast i32* [[TMP10]] to <8 x i32>*
				; AVX-NEXT: [[WIDE_LOAD9:%.*]] = load <8 x i32>, <8 x i32>* [[TMP11]], align 4, !alias.scope !41
				; AVX-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 16
				; AVX-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <8 x i32>*
				; AVX-NEXT: [[WIDE_LOAD10:%.*]] = load <8 x i32>, <8 x i32>* [[TMP13]], align 4, !alias.scope !41
				; AVX-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 24
				; AVX-NEXT: [[TMP15:%.*]] = bitcast i32* [[TMP14]] to <8 x i32>*
				; AVX-NEXT: [[WIDE_LOAD11:%.*]] = load <8 x i32>, <8 x i32>* [[TMP15]], align 4, !alias.scope !41
				; AVX-NEXT: [[TMP16:%.*]] = icmp slt <8 x i32> [[WIDE_LOAD]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX-NEXT: [[TMP17:%.*]] = icmp slt <8 x i32> [[WIDE_LOAD9]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX-NEXT: [[TMP18:%.*]] = icmp slt <8 x i32> [[WIDE_LOAD10]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX-NEXT: [[TMP19:%.*]] = icmp slt <8 x i32> [[WIDE_LOAD11]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX-NEXT: [[TMP20:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP0]]
				; AVX-NEXT: [[TMP21:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP1]]
				; AVX-NEXT: [[TMP22:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP2]]
				; AVX-NEXT: [[TMP23:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP3]]
				; AVX-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 0
				; AVX-NEXT: [[TMP25:%.*]] = bitcast i32* [[TMP24]] to <8 x i32>*
				; AVX-NEXT: call void @llvm.masked.store.v8i32.p0v8i32(<8 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32))>, <8 x i32>* [[TMP25]], i32 4, <8 x i1> [[TMP16]]), !alias.scope !44, !noalias !41
				; AVX-NEXT: [[TMP26:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 8
				; AVX-NEXT: [[TMP27:%.*]] = bitcast i32* [[TMP26]] to <8 x i32>*
				; AVX-NEXT: call void @llvm.masked.store.v8i32.p0v8i32(<8 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32))>, <8 x i32>* [[TMP27]], i32 4, <8 x i1> [[TMP17]]), !alias.scope !44, !noalias !41
				; AVX-NEXT: [[TMP28:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 16
				; AVX-NEXT: [[TMP29:%.*]] = bitcast i32* [[TMP28]] to <8 x i32>*
				; AVX-NEXT: call void @llvm.masked.store.v8i32.p0v8i32(<8 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32))>, <8 x i32>* [[TMP29]], i32 4, <8 x i1> [[TMP18]]), !alias.scope !44, !noalias !41
				; AVX-NEXT: [[TMP30:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 24
				; AVX-NEXT: [[TMP31:%.*]] = bitcast i32* [[TMP30]] to <8 x i32>*
				; AVX-NEXT: call void @llvm.masked.store.v8i32.p0v8i32(<8 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32))>, <8 x i32>* [[TMP31]], i32 4, <8 x i1> [[TMP19]]), !alias.scope !44, !noalias !41
				; AVX-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32
				; AVX-NEXT: [[TMP32:%.*]] = icmp eq i64 [[INDEX_NEXT]], 9984
				; AVX-NEXT: br i1 [[TMP32]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !46
				; AVX: middle.block:
				; AVX-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 9984
				; AVX-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
				; AVX: scalar.ph:
				; AVX-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 9984, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX-NEXT: br label [[FOR_BODY:%.*]]			; AVX-NEXT: br label [[FOR_BODY:%.*]]
	; AVX: for.body:			; AVX: for.body:
	; AVX-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER:%.*]], i64 [[INDVARS_IV]]			; AVX-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4			; AVX-NEXT: [[TMP33:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
	; AVX-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP0]], 100			; AVX-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP33]], 100
	; AVX-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX: if.then:			; AVX: if.then:
	; AVX-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]]			; AVX-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[INDVARS_IV]]
	; AVX-NEXT: store i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32* [[ARRAYIDX7]], align 4			; AVX-NEXT: store i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32* [[ARRAYIDX7]], align 4
	; AVX-NEXT: br label [[FOR_INC]]			; AVX-NEXT: br label [[FOR_INC]]
	; AVX: for.inc:			; AVX: for.inc:
	; AVX-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX-NEXT: br i1 [[EXITCOND]], label [[FOR_END:%.*]], label [[FOR_BODY]]			; AVX-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !47
	; AVX: for.end:			; AVX: for.end:
	; AVX-NEXT: ret void			; AVX-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo5(			; AVX512-LABEL: @foo5(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
				; AVX512-NEXT: [[A1:%.*]] = bitcast i32* [[A:%.*]] to i8*
				; AVX512-NEXT: [[TRIGGER3:%.*]] = bitcast i32* [[TRIGGER:%.*]] to i8*
				; AVX512-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
				; AVX512: vector.memcheck:
				; AVX512-NEXT: [[SCEVGEP:%.*]] = getelementptr i32, i32* [[A]], i64 10000
				; AVX512-NEXT: [[SCEVGEP2:%.*]] = bitcast i32* [[SCEVGEP]] to i8*
				; AVX512-NEXT: [[SCEVGEP4:%.*]] = getelementptr i32, i32* [[TRIGGER]], i64 10000
				; AVX512-NEXT: [[SCEVGEP45:%.*]] = bitcast i32* [[SCEVGEP4]] to i8*
				; AVX512-NEXT: [[BOUND0:%.*]] = icmp ult i8* [[A1]], [[SCEVGEP45]]
				; AVX512-NEXT: [[BOUND1:%.*]] = icmp ult i8* [[TRIGGER3]], [[SCEVGEP2]]
				; AVX512-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
				; AVX512-NEXT: [[MEMCHECK_CONFLICT:%.*]] = and i1 [[FOUND_CONFLICT]], true
				; AVX512-NEXT: br i1 [[MEMCHECK_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
				; AVX512: vector.ph:
				; AVX512-NEXT: br label [[VECTOR_BODY:%.*]]
				; AVX512: vector.body:
				; AVX512-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; AVX512-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <16 x i64> undef, i64 [[INDEX]], i32 0
				; AVX512-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <16 x i64> [[BROADCAST_SPLATINSERT]], <16 x i64> undef, <16 x i32> zeroinitializer
				; AVX512-NEXT: [[INDUCTION:%.*]] = add <16 x i64> [[BROADCAST_SPLAT]], <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9, i64 10, i64 11, i64 12, i64 13, i64 14, i64 15>
				; AVX512-NEXT: [[INDUCTION6:%.*]] = add <16 x i64> [[BROADCAST_SPLAT]], <i64 16, i64 17, i64 18, i64 19, i64 20, i64 21, i64 22, i64 23, i64 24, i64 25, i64 26, i64 27, i64 28, i64 29, i64 30, i64 31>
				; AVX512-NEXT: [[INDUCTION7:%.*]] = add <16 x i64> [[BROADCAST_SPLAT]], <i64 32, i64 33, i64 34, i64 35, i64 36, i64 37, i64 38, i64 39, i64 40, i64 41, i64 42, i64 43, i64 44, i64 45, i64 46, i64 47>
				; AVX512-NEXT: [[INDUCTION8:%.*]] = add <16 x i64> [[BROADCAST_SPLAT]], <i64 48, i64 49, i64 50, i64 51, i64 52, i64 53, i64 54, i64 55, i64 56, i64 57, i64 58, i64 59, i64 60, i64 61, i64 62, i64 63>
				; AVX512-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
				; AVX512-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 16
				; AVX512-NEXT: [[TMP2:%.*]] = add i64 [[INDEX]], 32
				; AVX512-NEXT: [[TMP3:%.*]] = add i64 [[INDEX]], 48
				; AVX512-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP0]]
				; AVX512-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP1]]
				; AVX512-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP2]]
				; AVX512-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP3]]
				; AVX512-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 0
				; AVX512-NEXT: [[TMP9:%.*]] = bitcast i32* [[TMP8]] to <16 x i32>*
				; AVX512-NEXT: [[WIDE_LOAD:%.*]] = load <16 x i32>, <16 x i32>* [[TMP9]], align 4, !alias.scope !51
				; AVX512-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 16
				; AVX512-NEXT: [[TMP11:%.*]] = bitcast i32* [[TMP10]] to <16 x i32>*
				; AVX512-NEXT: [[WIDE_LOAD9:%.*]] = load <16 x i32>, <16 x i32>* [[TMP11]], align 4, !alias.scope !51
				; AVX512-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 32
				; AVX512-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <16 x i32>*
				; AVX512-NEXT: [[WIDE_LOAD10:%.*]] = load <16 x i32>, <16 x i32>* [[TMP13]], align 4, !alias.scope !51
				; AVX512-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 48
				; AVX512-NEXT: [[TMP15:%.*]] = bitcast i32* [[TMP14]] to <16 x i32>*
				; AVX512-NEXT: [[WIDE_LOAD11:%.*]] = load <16 x i32>, <16 x i32>* [[TMP15]], align 4, !alias.scope !51
				; AVX512-NEXT: [[TMP16:%.*]] = icmp slt <16 x i32> [[WIDE_LOAD]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX512-NEXT: [[TMP17:%.*]] = icmp slt <16 x i32> [[WIDE_LOAD9]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX512-NEXT: [[TMP18:%.*]] = icmp slt <16 x i32> [[WIDE_LOAD10]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX512-NEXT: [[TMP19:%.*]] = icmp slt <16 x i32> [[WIDE_LOAD11]], <i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100, i32 100>
				; AVX512-NEXT: [[TMP20:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP0]]
				; AVX512-NEXT: [[TMP21:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP1]]
				; AVX512-NEXT: [[TMP22:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP2]]
				; AVX512-NEXT: [[TMP23:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP3]]
				; AVX512-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 0
				; AVX512-NEXT: [[TMP25:%.*]] = bitcast i32* [[TMP24]] to <16 x i32>*
				; AVX512-NEXT: call void @llvm.masked.store.v16i32.p0v16i32(<16 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32))>, <16 x i32>* [[TMP25]], i32 4, <16 x i1> [[TMP16]]), !alias.scope !54, !noalias !51
				; AVX512-NEXT: [[TMP26:%.]] = getelementptr inbounds i32, i32 [[TMP20]], i32 16
				; AVX512-NEXT: [[TMP27:%.]] = bitcast i32 [[TMP26]] to <16 x i32>*
				; AVX512-NEXT: call void @llvm.masked.store.v16i32.p0v16i32(<16 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 
icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32** @c) to i32))>, <16 x i32>* [[TMP27]], i32 4, <16 x i1> [[TMP17]]), !alias.scope !54, !noalias !51
				; AVX512-NEXT: [[TMP28:%.]] = getelementptr inbounds i32, i32 [[TMP20]], i32 32
				; AVX512-NEXT: [[TMP29:%.]] = bitcast i32 [[TMP28]] to <16 x i32>*
				; AVX512-NEXT: call void @llvm.masked.store.v16i32.p0v16i32(<16 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 
icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32** @c) to i32))>, <16 x i32>* [[TMP29]], i32 4, <16 x i1> [[TMP18]]), !alias.scope !54, !noalias !51
				; AVX512-NEXT: [[TMP30:%.]] = getelementptr inbounds i32, i32 [[TMP20]], i32 48
				; AVX512-NEXT: [[TMP31:%.]] = bitcast i32 [[TMP30]] to <16 x i32>*
				; AVX512-NEXT: call void @llvm.masked.store.v16i32.p0v16i32(<16 x i32> <i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32 @c) to i32)), i32 sdiv (i32 1, i32 zext (i1 
icmp eq (i32 getelementptr inbounds ([1 x i32], [1 x i32]* @a, i64 1, i64 0), i32** @c) to i32))>, <16 x i32>* [[TMP31]], i32 4, <16 x i1> [[TMP19]]), !alias.scope !54, !noalias !51
				; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 64
				; AVX512-NEXT: [[TMP32:%.*]] = icmp eq i64 [[INDEX_NEXT]], 9984
				; AVX512-NEXT: br i1 [[TMP32]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !56
				; AVX512: middle.block:
				; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 9984
				; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
				; AVX512: scalar.ph:
				; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 9984, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX512-NEXT: br label [[FOR_BODY:%.*]]			; AVX512-NEXT: br label [[FOR_BODY:%.*]]
	; AVX512: for.body:			; AVX512: for.body:
	; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER:%.*]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4			; AVX512-NEXT: [[TMP33:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
	; AVX512-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP0]], 100			; AVX512-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP33]], 100
	; AVX512-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX512-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX512: if.then:			; AVX512: if.then:
	; AVX512-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: store i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32* [[ARRAYIDX7]], align 4			; AVX512-NEXT: store i32 sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 1, i64 0), i32** @c) to i32)), i32* [[ARRAYIDX7]], align 4
	; AVX512-NEXT: br label [[FOR_INC]]			; AVX512-NEXT: br label [[FOR_INC]]
	; AVX512: for.inc:			; AVX512: for.inc:
	; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX512-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX512-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX512-NEXT: br i1 [[EXITCOND]], label [[FOR_END:%.*]], label [[FOR_BODY]]			; AVX512-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !57
	; AVX512: for.end:			; AVX512: for.end:
	; AVX512-NEXT: ret void			; AVX512-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body: ; preds = %for.inc, %entry			for.body: ; preds = %for.inc, %entry
	%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.inc ]			%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.inc ]
	; AVX2-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -12			; AVX2-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -12
	; AVX2-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP0]]			; AVX2-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP0]]
	; AVX2-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP1]]			; AVX2-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP1]]
	; AVX2-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP2]]			; AVX2-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP2]]
	; AVX2-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP3]]			; AVX2-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP3]]
	; AVX2-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 0			; AVX2-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 0
	; AVX2-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP8]], i32 -3			; AVX2-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP8]], i32 -3
	; AVX2-NEXT: [[TMP10:%.*]] = bitcast i32* [[TMP9]] to <4 x i32>*			; AVX2-NEXT: [[TMP10:%.*]] = bitcast i32* [[TMP9]] to <4 x i32>*
	; AVX2-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP10]], align 4, !alias.scope !41			; AVX2-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP10]], align 4, !alias.scope !48
	; AVX2-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -4			; AVX2-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -4
	; AVX2-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP11]], i32 -3			; AVX2-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP11]], i32 -3
	; AVX2-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <4 x i32>*			; AVX2-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <4 x i32>*
	; AVX2-NEXT: [[WIDE_LOAD15:%.*]] = load <4 x i32>, <4 x i32>* [[TMP13]], align 4, !alias.scope !41			; AVX2-NEXT: [[WIDE_LOAD15:%.*]] = load <4 x i32>, <4 x i32>* [[TMP13]], align 4, !alias.scope !48
	; AVX2-NEXT: [[REVERSE16:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD15]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE16:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD15]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -8			; AVX2-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -8
	; AVX2-NEXT: [[TMP15:%.*]] = getelementptr inbounds i32, i32* [[TMP14]], i32 -3			; AVX2-NEXT: [[TMP15:%.*]] = getelementptr inbounds i32, i32* [[TMP14]], i32 -3
	; AVX2-NEXT: [[TMP16:%.*]] = bitcast i32* [[TMP15]] to <4 x i32>*			; AVX2-NEXT: [[TMP16:%.*]] = bitcast i32* [[TMP15]] to <4 x i32>*
	; AVX2-NEXT: [[WIDE_LOAD17:%.*]] = load <4 x i32>, <4 x i32>* [[TMP16]], align 4, !alias.scope !41			; AVX2-NEXT: [[WIDE_LOAD17:%.*]] = load <4 x i32>, <4 x i32>* [[TMP16]], align 4, !alias.scope !48
	; AVX2-NEXT: [[REVERSE18:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD17]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE18:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD17]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -12			; AVX2-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -12
	; AVX2-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 -3			; AVX2-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 -3
	; AVX2-NEXT: [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <4 x i32>*			; AVX2-NEXT: [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <4 x i32>*
	; AVX2-NEXT: [[WIDE_LOAD19:%.*]] = load <4 x i32>, <4 x i32>* [[TMP19]], align 4, !alias.scope !41			; AVX2-NEXT: [[WIDE_LOAD19:%.*]] = load <4 x i32>, <4 x i32>* [[TMP19]], align 4, !alias.scope !48
	; AVX2-NEXT: [[REVERSE20:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD19]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE20:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD19]], <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP20:%.*]] = icmp sgt <4 x i32> [[REVERSE]], zeroinitializer			; AVX2-NEXT: [[TMP20:%.*]] = icmp sgt <4 x i32> [[REVERSE]], zeroinitializer
	; AVX2-NEXT: [[TMP21:%.*]] = icmp sgt <4 x i32> [[REVERSE16]], zeroinitializer			; AVX2-NEXT: [[TMP21:%.*]] = icmp sgt <4 x i32> [[REVERSE16]], zeroinitializer
	; AVX2-NEXT: [[TMP22:%.*]] = icmp sgt <4 x i32> [[REVERSE18]], zeroinitializer			; AVX2-NEXT: [[TMP22:%.*]] = icmp sgt <4 x i32> [[REVERSE18]], zeroinitializer
	; AVX2-NEXT: [[TMP23:%.*]] = icmp sgt <4 x i32> [[REVERSE20]], zeroinitializer			; AVX2-NEXT: [[TMP23:%.*]] = icmp sgt <4 x i32> [[REVERSE20]], zeroinitializer
	; AVX2-NEXT: [[TMP24:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP0]]			; AVX2-NEXT: [[TMP24:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP0]]
	; AVX2-NEXT: [[TMP25:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP1]]			; AVX2-NEXT: [[TMP25:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP1]]
	; AVX2-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP2]]			; AVX2-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP2]]
	; AVX2-NEXT: [[TMP27:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP3]]			; AVX2-NEXT: [[TMP27:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP3]]
	; AVX2-NEXT: [[TMP28:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 0			; AVX2-NEXT: [[TMP28:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 0
	; AVX2-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[TMP28]], i32 -3			; AVX2-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[TMP28]], i32 -3
	; AVX2-NEXT: [[REVERSE21:%.*]] = shufflevector <4 x i1> [[TMP20]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE21:%.*]] = shufflevector <4 x i1> [[TMP20]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <4 x double>*			; AVX2-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <4 x double>*
	; AVX2-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP30]], i32 8, <4 x i1> [[REVERSE21]], <4 x double> undef), !alias.scope !44			; AVX2-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP30]], i32 8, <4 x i1> [[REVERSE21]], <4 x double> undef), !alias.scope !51
	; AVX2-NEXT: [[REVERSE22:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE22:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP31:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -4			; AVX2-NEXT: [[TMP31:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -4
	; AVX2-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[TMP31]], i32 -3			; AVX2-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[TMP31]], i32 -3
	; AVX2-NEXT: [[REVERSE23:%.*]] = shufflevector <4 x i1> [[TMP21]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE23:%.*]] = shufflevector <4 x i1> [[TMP21]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <4 x double>*			; AVX2-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <4 x double>*
	; AVX2-NEXT: [[WIDE_MASKED_LOAD24:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP33]], i32 8, <4 x i1> [[REVERSE23]], <4 x double> undef), !alias.scope !44			; AVX2-NEXT: [[WIDE_MASKED_LOAD24:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP33]], i32 8, <4 x i1> [[REVERSE23]], <4 x double> undef), !alias.scope !51
	; AVX2-NEXT: [[REVERSE25:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD24]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE25:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD24]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP34:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -8			; AVX2-NEXT: [[TMP34:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -8
	; AVX2-NEXT: [[TMP35:%.*]] = getelementptr inbounds double, double* [[TMP34]], i32 -3			; AVX2-NEXT: [[TMP35:%.*]] = getelementptr inbounds double, double* [[TMP34]], i32 -3
	; AVX2-NEXT: [[REVERSE26:%.*]] = shufflevector <4 x i1> [[TMP22]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE26:%.*]] = shufflevector <4 x i1> [[TMP22]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP36:%.*]] = bitcast double* [[TMP35]] to <4 x double>*			; AVX2-NEXT: [[TMP36:%.*]] = bitcast double* [[TMP35]] to <4 x double>*
	; AVX2-NEXT: [[WIDE_MASKED_LOAD27:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP36]], i32 8, <4 x i1> [[REVERSE26]], <4 x double> undef), !alias.scope !44			; AVX2-NEXT: [[WIDE_MASKED_LOAD27:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP36]], i32 8, <4 x i1> [[REVERSE26]], <4 x double> undef), !alias.scope !51
	; AVX2-NEXT: [[REVERSE28:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD27]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE28:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD27]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP37:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -12			; AVX2-NEXT: [[TMP37:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -12
	; AVX2-NEXT: [[TMP38:%.*]] = getelementptr inbounds double, double* [[TMP37]], i32 -3			; AVX2-NEXT: [[TMP38:%.*]] = getelementptr inbounds double, double* [[TMP37]], i32 -3
	; AVX2-NEXT: [[REVERSE29:%.*]] = shufflevector <4 x i1> [[TMP23]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE29:%.*]] = shufflevector <4 x i1> [[TMP23]], <4 x i1> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP39:%.*]] = bitcast double* [[TMP38]] to <4 x double>*			; AVX2-NEXT: [[TMP39:%.*]] = bitcast double* [[TMP38]] to <4 x double>*
	; AVX2-NEXT: [[WIDE_MASKED_LOAD30:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP39]], i32 8, <4 x i1> [[REVERSE29]], <4 x double> undef), !alias.scope !44			; AVX2-NEXT: [[WIDE_MASKED_LOAD30:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* [[TMP39]], i32 8, <4 x i1> [[REVERSE29]], <4 x double> undef), !alias.scope !51
	; AVX2-NEXT: [[REVERSE31:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD30]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE31:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD30]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP40:%.*]] = fadd <4 x double> [[REVERSE22]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP40:%.*]] = fadd <4 x double> [[REVERSE22]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP41:%.*]] = fadd <4 x double> [[REVERSE25]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP41:%.*]] = fadd <4 x double> [[REVERSE25]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP42:%.*]] = fadd <4 x double> [[REVERSE28]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP42:%.*]] = fadd <4 x double> [[REVERSE28]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP43:%.*]] = fadd <4 x double> [[REVERSE31]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP43:%.*]] = fadd <4 x double> [[REVERSE31]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP44:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP0]]			; AVX2-NEXT: [[TMP44:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP0]]
	; AVX2-NEXT: [[TMP45:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP1]]			; AVX2-NEXT: [[TMP45:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP1]]
	; AVX2-NEXT: [[TMP46:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP2]]			; AVX2-NEXT: [[TMP46:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP2]]
	; AVX2-NEXT: [[TMP47:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP3]]			; AVX2-NEXT: [[TMP47:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP3]]
	; AVX2-NEXT: [[REVERSE32:%.*]] = shufflevector <4 x double> [[TMP40]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE32:%.*]] = shufflevector <4 x double> [[TMP40]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP48:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 0			; AVX2-NEXT: [[TMP48:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 0
	; AVX2-NEXT: [[TMP49:%.*]] = getelementptr inbounds double, double* [[TMP48]], i32 -3			; AVX2-NEXT: [[TMP49:%.*]] = getelementptr inbounds double, double* [[TMP48]], i32 -3
	; AVX2-NEXT: [[TMP50:%.*]] = bitcast double* [[TMP49]] to <4 x double>*			; AVX2-NEXT: [[TMP50:%.*]] = bitcast double* [[TMP49]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE32]], <4 x double>* [[TMP50]], i32 8, <4 x i1> [[REVERSE21]]), !alias.scope !46, !noalias !48			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE32]], <4 x double>* [[TMP50]], i32 8, <4 x i1> [[REVERSE21]]), !alias.scope !53, !noalias !55
	; AVX2-NEXT: [[REVERSE34:%.*]] = shufflevector <4 x double> [[TMP41]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE34:%.*]] = shufflevector <4 x double> [[TMP41]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP51:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -4			; AVX2-NEXT: [[TMP51:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -4
	; AVX2-NEXT: [[TMP52:%.*]] = getelementptr inbounds double, double* [[TMP51]], i32 -3			; AVX2-NEXT: [[TMP52:%.*]] = getelementptr inbounds double, double* [[TMP51]], i32 -3
	; AVX2-NEXT: [[TMP53:%.*]] = bitcast double* [[TMP52]] to <4 x double>*			; AVX2-NEXT: [[TMP53:%.*]] = bitcast double* [[TMP52]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE34]], <4 x double>* [[TMP53]], i32 8, <4 x i1> [[REVERSE23]]), !alias.scope !46, !noalias !48			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE34]], <4 x double>* [[TMP53]], i32 8, <4 x i1> [[REVERSE23]]), !alias.scope !53, !noalias !55
	; AVX2-NEXT: [[REVERSE36:%.*]] = shufflevector <4 x double> [[TMP42]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE36:%.*]] = shufflevector <4 x double> [[TMP42]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP54:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -8			; AVX2-NEXT: [[TMP54:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -8
	; AVX2-NEXT: [[TMP55:%.*]] = getelementptr inbounds double, double* [[TMP54]], i32 -3			; AVX2-NEXT: [[TMP55:%.*]] = getelementptr inbounds double, double* [[TMP54]], i32 -3
	; AVX2-NEXT: [[TMP56:%.*]] = bitcast double* [[TMP55]] to <4 x double>*			; AVX2-NEXT: [[TMP56:%.*]] = bitcast double* [[TMP55]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE36]], <4 x double>* [[TMP56]], i32 8, <4 x i1> [[REVERSE26]]), !alias.scope !46, !noalias !48			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE36]], <4 x double>* [[TMP56]], i32 8, <4 x i1> [[REVERSE26]]), !alias.scope !53, !noalias !55
	; AVX2-NEXT: [[REVERSE38:%.*]] = shufflevector <4 x double> [[TMP43]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE38:%.*]] = shufflevector <4 x double> [[TMP43]], <4 x double> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP57:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -12			; AVX2-NEXT: [[TMP57:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -12
	; AVX2-NEXT: [[TMP58:%.*]] = getelementptr inbounds double, double* [[TMP57]], i32 -3			; AVX2-NEXT: [[TMP58:%.*]] = getelementptr inbounds double, double* [[TMP57]], i32 -3
	; AVX2-NEXT: [[TMP59:%.*]] = bitcast double* [[TMP58]] to <4 x double>*			; AVX2-NEXT: [[TMP59:%.*]] = bitcast double* [[TMP58]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE38]], <4 x double>* [[TMP59]], i32 8, <4 x i1> [[REVERSE29]]), !alias.scope !46, !noalias !48			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> [[REVERSE38]], <4 x double>* [[TMP59]], i32 8, <4 x i1> [[REVERSE29]]), !alias.scope !53, !noalias !55
	; AVX2-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16			; AVX2-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16
	; AVX2-NEXT: [[TMP60:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096			; AVX2-NEXT: [[TMP60:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
	; AVX2-NEXT: br i1 [[TMP60]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !49			; AVX2-NEXT: br i1 [[TMP60]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !56
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 4096, 4096			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 4096, 4096
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 4095, [[ENTRY:%.*]] ], [ 4095, [[VECTOR_MEMCHECK]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 4095, [[ENTRY:%.*]] ], [ 4095, [[VECTOR_MEMCHECK]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP61:%.*]] = load i32, i32* [[ARRAYIDX]], align 4			; AVX2-NEXT: [[TMP61:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
	; AVX2-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP61]], 0			; AVX2-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP61]], 0
	; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP62:%.*]] = load double, double* [[ARRAYIDX3]], align 8			; AVX2-NEXT: [[TMP62:%.*]] = load double, double* [[ARRAYIDX3]], align 8
	; AVX2-NEXT: [[ADD:%.*]] = fadd double [[TMP62]], 5.000000e-01			; AVX2-NEXT: [[ADD:%.*]] = fadd double [[TMP62]], 5.000000e-01
	; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store double [[ADD]], double* [[ARRAYIDX5]], align 8			; AVX2-NEXT: store double [[ADD]], double* [[ARRAYIDX5]], align 8
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
	; AVX2-NEXT: [[CMP:%.*]] = icmp eq i64 [[INDVARS_IV]], 0			; AVX2-NEXT: [[CMP:%.*]] = icmp eq i64 [[INDVARS_IV]], 0
	; AVX2-NEXT: br i1 [[CMP]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !50			; AVX2-NEXT: br i1 [[CMP]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !57
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo6(			; AVX512-LABEL: @foo6(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
	; AVX512-NEXT: [[OUT1:%.*]] = bitcast double* [[OUT:%.*]] to i8*			; AVX512-NEXT: [[OUT1:%.*]] = bitcast double* [[OUT:%.*]] to i8*
	; AVX512-NEXT: [[TRIGGER3:%.*]] = bitcast i32* [[TRIGGER:%.*]] to i8*			; AVX512-NEXT: [[TRIGGER3:%.*]] = bitcast i32* [[TRIGGER:%.*]] to i8*
	; AVX512-NEXT: [[IN6:%.*]] = bitcast double* [[IN:%.*]] to i8*			; AVX512-NEXT: [[IN6:%.*]] = bitcast double* [[IN:%.*]] to i8*
	; AVX512-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -24			; AVX512-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -24
	; AVX512-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP0]]			; AVX512-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP0]]
	; AVX512-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP1]]			; AVX512-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP1]]
	; AVX512-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP2]]			; AVX512-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP2]]
	; AVX512-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP3]]			; AVX512-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[TMP3]]
	; AVX512-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 0			; AVX512-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 0
	; AVX512-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP8]], i32 -7			; AVX512-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP8]], i32 -7
	; AVX512-NEXT: [[TMP10:%.*]] = bitcast i32* [[TMP9]] to <8 x i32>*			; AVX512-NEXT: [[TMP10:%.*]] = bitcast i32* [[TMP9]] to <8 x i32>*
	; AVX512-NEXT: [[WIDE_LOAD:%.*]] = load <8 x i32>, <8 x i32>* [[TMP10]], align 4, !alias.scope !51			; AVX512-NEXT: [[WIDE_LOAD:%.*]] = load <8 x i32>, <8 x i32>* [[TMP10]], align 4, !alias.scope !58
	; AVX512-NEXT: [[REVERSE:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -8			; AVX512-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -8
	; AVX512-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP11]], i32 -7			; AVX512-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[TMP11]], i32 -7
	; AVX512-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <8 x i32>*			; AVX512-NEXT: [[TMP13:%.*]] = bitcast i32* [[TMP12]] to <8 x i32>*
	; AVX512-NEXT: [[WIDE_LOAD15:%.*]] = load <8 x i32>, <8 x i32>* [[TMP13]], align 4, !alias.scope !51			; AVX512-NEXT: [[WIDE_LOAD15:%.*]] = load <8 x i32>, <8 x i32>* [[TMP13]], align 4, !alias.scope !58
	; AVX512-NEXT: [[REVERSE16:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD15]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE16:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD15]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -16			; AVX512-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -16
	; AVX512-NEXT: [[TMP15:%.*]] = getelementptr inbounds i32, i32* [[TMP14]], i32 -7			; AVX512-NEXT: [[TMP15:%.*]] = getelementptr inbounds i32, i32* [[TMP14]], i32 -7
	; AVX512-NEXT: [[TMP16:%.*]] = bitcast i32* [[TMP15]] to <8 x i32>*			; AVX512-NEXT: [[TMP16:%.*]] = bitcast i32* [[TMP15]] to <8 x i32>*
	; AVX512-NEXT: [[WIDE_LOAD17:%.*]] = load <8 x i32>, <8 x i32>* [[TMP16]], align 4, !alias.scope !51			; AVX512-NEXT: [[WIDE_LOAD17:%.*]] = load <8 x i32>, <8 x i32>* [[TMP16]], align 4, !alias.scope !58
	; AVX512-NEXT: [[REVERSE18:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD17]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE18:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD17]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -24			; AVX512-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i32 -24
	; AVX512-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 -7			; AVX512-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 -7
	; AVX512-NEXT: [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <8 x i32>*			; AVX512-NEXT: [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <8 x i32>*
	; AVX512-NEXT: [[WIDE_LOAD19:%.*]] = load <8 x i32>, <8 x i32>* [[TMP19]], align 4, !alias.scope !51			; AVX512-NEXT: [[WIDE_LOAD19:%.*]] = load <8 x i32>, <8 x i32>* [[TMP19]], align 4, !alias.scope !58
	; AVX512-NEXT: [[REVERSE20:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD19]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE20:%.*]] = shufflevector <8 x i32> [[WIDE_LOAD19]], <8 x i32> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP20:%.*]] = icmp sgt <8 x i32> [[REVERSE]], zeroinitializer			; AVX512-NEXT: [[TMP20:%.*]] = icmp sgt <8 x i32> [[REVERSE]], zeroinitializer
	; AVX512-NEXT: [[TMP21:%.*]] = icmp sgt <8 x i32> [[REVERSE16]], zeroinitializer			; AVX512-NEXT: [[TMP21:%.*]] = icmp sgt <8 x i32> [[REVERSE16]], zeroinitializer
	; AVX512-NEXT: [[TMP22:%.*]] = icmp sgt <8 x i32> [[REVERSE18]], zeroinitializer			; AVX512-NEXT: [[TMP22:%.*]] = icmp sgt <8 x i32> [[REVERSE18]], zeroinitializer
	; AVX512-NEXT: [[TMP23:%.*]] = icmp sgt <8 x i32> [[REVERSE20]], zeroinitializer			; AVX512-NEXT: [[TMP23:%.*]] = icmp sgt <8 x i32> [[REVERSE20]], zeroinitializer
	; AVX512-NEXT: [[TMP24:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP0]]			; AVX512-NEXT: [[TMP24:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP0]]
	; AVX512-NEXT: [[TMP25:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP1]]			; AVX512-NEXT: [[TMP25:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP1]]
	; AVX512-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP2]]			; AVX512-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP2]]
	; AVX512-NEXT: [[TMP27:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP3]]			; AVX512-NEXT: [[TMP27:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[TMP3]]
	; AVX512-NEXT: [[TMP28:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 0			; AVX512-NEXT: [[TMP28:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 0
	; AVX512-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[TMP28]], i32 -7			; AVX512-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[TMP28]], i32 -7
	; AVX512-NEXT: [[REVERSE21:%.*]] = shufflevector <8 x i1> [[TMP20]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE21:%.*]] = shufflevector <8 x i1> [[TMP20]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <8 x double>*			; AVX512-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <8 x double>*
	; AVX512-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP30]], i32 8, <8 x i1> [[REVERSE21]], <8 x double> undef), !alias.scope !54			; AVX512-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP30]], i32 8, <8 x i1> [[REVERSE21]], <8 x double> undef), !alias.scope !61
	; AVX512-NEXT: [[REVERSE22:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE22:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP31:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -8			; AVX512-NEXT: [[TMP31:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -8
	; AVX512-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[TMP31]], i32 -7			; AVX512-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[TMP31]], i32 -7
	; AVX512-NEXT: [[REVERSE23:%.*]] = shufflevector <8 x i1> [[TMP21]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE23:%.*]] = shufflevector <8 x i1> [[TMP21]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <8 x double>*			; AVX512-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <8 x double>*
	; AVX512-NEXT: [[WIDE_MASKED_LOAD24:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP33]], i32 8, <8 x i1> [[REVERSE23]], <8 x double> undef), !alias.scope !54			; AVX512-NEXT: [[WIDE_MASKED_LOAD24:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP33]], i32 8, <8 x i1> [[REVERSE23]], <8 x double> undef), !alias.scope !61
	; AVX512-NEXT: [[REVERSE25:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD24]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE25:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD24]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP34:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -16			; AVX512-NEXT: [[TMP34:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -16
	; AVX512-NEXT: [[TMP35:%.*]] = getelementptr inbounds double, double* [[TMP34]], i32 -7			; AVX512-NEXT: [[TMP35:%.*]] = getelementptr inbounds double, double* [[TMP34]], i32 -7
	; AVX512-NEXT: [[REVERSE26:%.*]] = shufflevector <8 x i1> [[TMP22]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE26:%.*]] = shufflevector <8 x i1> [[TMP22]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP36:%.*]] = bitcast double* [[TMP35]] to <8 x double>*			; AVX512-NEXT: [[TMP36:%.*]] = bitcast double* [[TMP35]] to <8 x double>*
	; AVX512-NEXT: [[WIDE_MASKED_LOAD27:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP36]], i32 8, <8 x i1> [[REVERSE26]], <8 x double> undef), !alias.scope !54			; AVX512-NEXT: [[WIDE_MASKED_LOAD27:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP36]], i32 8, <8 x i1> [[REVERSE26]], <8 x double> undef), !alias.scope !61
	; AVX512-NEXT: [[REVERSE28:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD27]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE28:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD27]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP37:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -24			; AVX512-NEXT: [[TMP37:%.*]] = getelementptr inbounds double, double* [[TMP24]], i32 -24
	; AVX512-NEXT: [[TMP38:%.*]] = getelementptr inbounds double, double* [[TMP37]], i32 -7			; AVX512-NEXT: [[TMP38:%.*]] = getelementptr inbounds double, double* [[TMP37]], i32 -7
	; AVX512-NEXT: [[REVERSE29:%.*]] = shufflevector <8 x i1> [[TMP23]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE29:%.*]] = shufflevector <8 x i1> [[TMP23]], <8 x i1> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP39:%.*]] = bitcast double* [[TMP38]] to <8 x double>*			; AVX512-NEXT: [[TMP39:%.*]] = bitcast double* [[TMP38]] to <8 x double>*
	; AVX512-NEXT: [[WIDE_MASKED_LOAD30:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP39]], i32 8, <8 x i1> [[REVERSE29]], <8 x double> undef), !alias.scope !54			; AVX512-NEXT: [[WIDE_MASKED_LOAD30:%.*]] = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* [[TMP39]], i32 8, <8 x i1> [[REVERSE29]], <8 x double> undef), !alias.scope !61
	; AVX512-NEXT: [[REVERSE31:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD30]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE31:%.*]] = shufflevector <8 x double> [[WIDE_MASKED_LOAD30]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP40:%.*]] = fadd <8 x double> [[REVERSE22]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX512-NEXT: [[TMP40:%.*]] = fadd <8 x double> [[REVERSE22]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX512-NEXT: [[TMP41:%.*]] = fadd <8 x double> [[REVERSE25]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX512-NEXT: [[TMP41:%.*]] = fadd <8 x double> [[REVERSE25]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX512-NEXT: [[TMP42:%.*]] = fadd <8 x double> [[REVERSE28]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX512-NEXT: [[TMP42:%.*]] = fadd <8 x double> [[REVERSE28]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX512-NEXT: [[TMP43:%.*]] = fadd <8 x double> [[REVERSE31]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX512-NEXT: [[TMP43:%.*]] = fadd <8 x double> [[REVERSE31]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX512-NEXT: [[TMP44:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP0]]			; AVX512-NEXT: [[TMP44:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP0]]
	; AVX512-NEXT: [[TMP45:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP1]]			; AVX512-NEXT: [[TMP45:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP1]]
	; AVX512-NEXT: [[TMP46:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP2]]			; AVX512-NEXT: [[TMP46:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP2]]
	; AVX512-NEXT: [[TMP47:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP3]]			; AVX512-NEXT: [[TMP47:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[TMP3]]
	; AVX512-NEXT: [[REVERSE32:%.*]] = shufflevector <8 x double> [[TMP40]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE32:%.*]] = shufflevector <8 x double> [[TMP40]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP48:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 0			; AVX512-NEXT: [[TMP48:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 0
	; AVX512-NEXT: [[TMP49:%.*]] = getelementptr inbounds double, double* [[TMP48]], i32 -7			; AVX512-NEXT: [[TMP49:%.*]] = getelementptr inbounds double, double* [[TMP48]], i32 -7
	; AVX512-NEXT: [[TMP50:%.*]] = bitcast double* [[TMP49]] to <8 x double>*			; AVX512-NEXT: [[TMP50:%.*]] = bitcast double* [[TMP49]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE32]], <8 x double>* [[TMP50]], i32 8, <8 x i1> [[REVERSE21]]), !alias.scope !56, !noalias !58			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE32]], <8 x double>* [[TMP50]], i32 8, <8 x i1> [[REVERSE21]]), !alias.scope !63, !noalias !65
	; AVX512-NEXT: [[REVERSE34:%.*]] = shufflevector <8 x double> [[TMP41]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE34:%.*]] = shufflevector <8 x double> [[TMP41]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP51:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -8			; AVX512-NEXT: [[TMP51:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -8
	; AVX512-NEXT: [[TMP52:%.*]] = getelementptr inbounds double, double* [[TMP51]], i32 -7			; AVX512-NEXT: [[TMP52:%.*]] = getelementptr inbounds double, double* [[TMP51]], i32 -7
	; AVX512-NEXT: [[TMP53:%.*]] = bitcast double* [[TMP52]] to <8 x double>*			; AVX512-NEXT: [[TMP53:%.*]] = bitcast double* [[TMP52]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE34]], <8 x double>* [[TMP53]], i32 8, <8 x i1> [[REVERSE23]]), !alias.scope !56, !noalias !58			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE34]], <8 x double>* [[TMP53]], i32 8, <8 x i1> [[REVERSE23]]), !alias.scope !63, !noalias !65
	; AVX512-NEXT: [[REVERSE36:%.*]] = shufflevector <8 x double> [[TMP42]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE36:%.*]] = shufflevector <8 x double> [[TMP42]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP54:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -16			; AVX512-NEXT: [[TMP54:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -16
	; AVX512-NEXT: [[TMP55:%.*]] = getelementptr inbounds double, double* [[TMP54]], i32 -7			; AVX512-NEXT: [[TMP55:%.*]] = getelementptr inbounds double, double* [[TMP54]], i32 -7
	; AVX512-NEXT: [[TMP56:%.*]] = bitcast double* [[TMP55]] to <8 x double>*			; AVX512-NEXT: [[TMP56:%.*]] = bitcast double* [[TMP55]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE36]], <8 x double>* [[TMP56]], i32 8, <8 x i1> [[REVERSE26]]), !alias.scope !56, !noalias !58			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE36]], <8 x double>* [[TMP56]], i32 8, <8 x i1> [[REVERSE26]]), !alias.scope !63, !noalias !65
	; AVX512-NEXT: [[REVERSE38:%.*]] = shufflevector <8 x double> [[TMP43]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>			; AVX512-NEXT: [[REVERSE38:%.*]] = shufflevector <8 x double> [[TMP43]], <8 x double> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
	; AVX512-NEXT: [[TMP57:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -24			; AVX512-NEXT: [[TMP57:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 -24
	; AVX512-NEXT: [[TMP58:%.*]] = getelementptr inbounds double, double* [[TMP57]], i32 -7			; AVX512-NEXT: [[TMP58:%.*]] = getelementptr inbounds double, double* [[TMP57]], i32 -7
	; AVX512-NEXT: [[TMP59:%.*]] = bitcast double* [[TMP58]] to <8 x double>*			; AVX512-NEXT: [[TMP59:%.*]] = bitcast double* [[TMP58]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE38]], <8 x double>* [[TMP59]], i32 8, <8 x i1> [[REVERSE29]]), !alias.scope !56, !noalias !58			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> [[REVERSE38]], <8 x double>* [[TMP59]], i32 8, <8 x i1> [[REVERSE29]]), !alias.scope !63, !noalias !65
	; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32			; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32
	; AVX512-NEXT: [[TMP60:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096			; AVX512-NEXT: [[TMP60:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
	; AVX512-NEXT: br i1 [[TMP60]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !59			; AVX512-NEXT: br i1 [[TMP60]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !66
	; AVX512: middle.block:			; AVX512: middle.block:
	; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 4096, 4096			; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 4096, 4096
	; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX512: scalar.ph:			; AVX512: scalar.ph:
	; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 4095, [[ENTRY:%.*]] ], [ 4095, [[VECTOR_MEMCHECK]] ]			; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 4095, [[ENTRY:%.*]] ], [ 4095, [[VECTOR_MEMCHECK]] ]
	; AVX512-NEXT: br label [[FOR_BODY:%.*]]			; AVX512-NEXT: br label [[FOR_BODY:%.*]]
	; AVX512: for.body:			; AVX512: for.body:
	; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: [[TMP61:%.*]] = load i32, i32* [[ARRAYIDX]], align 4			; AVX512-NEXT: [[TMP61:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
	; AVX512-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP61]], 0			; AVX512-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP61]], 0
	; AVX512-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX512-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX512: if.then:			; AVX512: if.then:
	; AVX512-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, double* [[IN]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: [[TMP62:%.*]] = load double, double* [[ARRAYIDX3]], align 8			; AVX512-NEXT: [[TMP62:%.*]] = load double, double* [[ARRAYIDX3]], align 8
	; AVX512-NEXT: [[ADD:%.*]] = fadd double [[TMP62]], 5.000000e-01			; AVX512-NEXT: [[ADD:%.*]] = fadd double [[TMP62]], 5.000000e-01
	; AVX512-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: store double [[ADD]], double* [[ARRAYIDX5]], align 8			; AVX512-NEXT: store double [[ADD]], double* [[ARRAYIDX5]], align 8
	; AVX512-NEXT: br label [[FOR_INC]]			; AVX512-NEXT: br label [[FOR_INC]]
	; AVX512: for.inc:			; AVX512: for.inc:
	; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1			; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
	; AVX512-NEXT: [[CMP:%.*]] = icmp eq i64 [[INDVARS_IV]], 0			; AVX512-NEXT: [[CMP:%.*]] = icmp eq i64 [[INDVARS_IV]], 0
	; AVX512-NEXT: br i1 [[CMP]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !60			; AVX512-NEXT: br i1 [[CMP]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !67
	; AVX512: for.end:			; AVX512: for.end:
	; AVX512-NEXT: ret void			; AVX512-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body: ; preds = %for.inc, %entry			for.body: ; preds = %for.inc, %entry
	%indvars.iv = phi i64 [ 4095, %entry ], [ %indvars.iv.next, %for.inc ]			%indvars.iv = phi i64 [ 4095, %entry ], [ %indvars.iv.next, %for.inc ]
	[121 lines not shown]
	; AVX1-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8			; AVX1-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8
	; AVX1-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*			; AVX1-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])
	; AVX1-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12			; AVX1-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12
	; AVX1-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*			; AVX1-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])
	; AVX1-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16			; AVX1-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16
	; AVX1-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX1-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX1-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !41			; AVX1-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !48
	; AVX1: middle.block:			; AVX1: middle.block:
	; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX1: scalar.ph:			; AVX1: scalar.ph:
	; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX1-NEXT: br label [[FOR_BODY:%.*]]			; AVX1-NEXT: br label [[FOR_BODY:%.*]]
	; AVX1: for.body:			; AVX1: for.body:
	; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	[9 lines not shown]
	; AVX1-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX1-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX1: if.then:			; AVX1: if.then:
	; AVX1-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8			; AVX1-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8
	; AVX1-NEXT: br label [[FOR_INC]]			; AVX1-NEXT: br label [[FOR_INC]]
	; AVX1: for.inc:			; AVX1: for.inc:
	; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !42			; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !49
	; AVX1: for.end.loopexit:			; AVX1: for.end.loopexit:
	; AVX1-NEXT: br label [[FOR_END]]			; AVX1-NEXT: br label [[FOR_END]]
	; AVX1: for.end:			; AVX1: for.end:
	; AVX1-NEXT: ret void			; AVX1-NEXT: ret void
	;			;
	; AVX2-LABEL: @foo7(			; AVX2-LABEL: @foo7(
	; AVX2-NEXT: entry:			; AVX2-NEXT: entry:
	; AVX2-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0			; AVX2-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0
	[87 lines not shown]
	; AVX2-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8			; AVX2-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8
	; AVX2-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*			; AVX2-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])
	; AVX2-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12			; AVX2-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12
	; AVX2-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*			; AVX2-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])
	; AVX2-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16			; AVX2-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16
	; AVX2-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX2-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX2-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !51			; AVX2-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !58
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	[9 lines not shown]
	; AVX2-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX2-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8			; AVX2-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !52			; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !59
	; AVX2: for.end.loopexit:			; AVX2: for.end.loopexit:
	; AVX2-NEXT: br label [[FOR_END]]			; AVX2-NEXT: br label [[FOR_END]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo7(			; AVX512-LABEL: @foo7(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
	; AVX512-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0			; AVX512-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0
	[87 lines not shown]
	; AVX512-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 16			; AVX512-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 16
	; AVX512-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <8 x double>*			; AVX512-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP61]], i32 8, <8 x i1> [[TMP54]])			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP61]], i32 8, <8 x i1> [[TMP54]])
	; AVX512-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 24			; AVX512-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 24
	; AVX512-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <8 x double>*			; AVX512-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP63]], i32 8, <8 x i1> [[TMP55]])			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP63]], i32 8, <8 x i1> [[TMP55]])
	; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32			; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32
	; AVX512-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX512-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX512-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !61			; AVX512-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !68
	; AVX512: middle.block:			; AVX512: middle.block:
	; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX512: scalar.ph:			; AVX512: scalar.ph:
	; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX512-NEXT: br label [[FOR_BODY:%.*]]			; AVX512-NEXT: br label [[FOR_BODY:%.*]]
	; AVX512: for.body:			; AVX512: for.body:
	; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	Show All 9 Lines
	; AVX512-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX512-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX512: if.then:			; AVX512: if.then:
	; AVX512-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8			; AVX512-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8
	; AVX512-NEXT: br label [[FOR_INC]]			; AVX512-NEXT: br label [[FOR_INC]]
	; AVX512: for.inc:			; AVX512: for.inc:
	; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX512-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX512-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX512-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !62			; AVX512-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !69
	; AVX512: for.end.loopexit:			; AVX512: for.end.loopexit:
	; AVX512-NEXT: br label [[FOR_END]]			; AVX512-NEXT: br label [[FOR_END]]
	; AVX512: for.end:			; AVX512: for.end:
	; AVX512-NEXT: ret void			; AVX512-NEXT: ret void
	;			;
	entry:			entry:
	%cmp5 = icmp eq i32 %size, 0			%cmp5 = icmp eq i32 %size, 0
	br i1 %cmp5, label %for.end, label %for.body.preheader			br i1 %cmp5, label %for.end, label %for.body.preheader
	▲ Show 20 Lines • Show All 132 Lines • ▼ Show 20 Lines
	; AVX1-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8			; AVX1-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8
	; AVX1-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*			; AVX1-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])
	; AVX1-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12			; AVX1-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12
	; AVX1-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*			; AVX1-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])
	; AVX1-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16			; AVX1-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16
	; AVX1-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX1-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX1-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !44			; AVX1-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !51
	; AVX1: middle.block:			; AVX1: middle.block:
	; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX1: scalar.ph:			; AVX1: scalar.ph:
	; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX1-NEXT: br label [[FOR_BODY:%.*]]			; AVX1-NEXT: br label [[FOR_BODY:%.*]]
	; AVX1: for.body:			; AVX1: for.body:
	; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	Show All 9 Lines
	; AVX1-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX1-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX1: if.then:			; AVX1: if.then:
	; AVX1-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8			; AVX1-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8
	; AVX1-NEXT: br label [[FOR_INC]]			; AVX1-NEXT: br label [[FOR_INC]]
	; AVX1: for.inc:			; AVX1: for.inc:
	; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !45			; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !52
	; AVX1: for.end.loopexit:			; AVX1: for.end.loopexit:
	; AVX1-NEXT: br label [[FOR_END]]			; AVX1-NEXT: br label [[FOR_END]]
	; AVX1: for.end:			; AVX1: for.end:
	; AVX1-NEXT: ret void			; AVX1-NEXT: ret void
	;			;
	; AVX2-LABEL: @foo8(			; AVX2-LABEL: @foo8(
	; AVX2-NEXT: entry:			; AVX2-NEXT: entry:
	; AVX2-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0			; AVX2-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0
	▲ Show 20 Lines • Show All 87 Lines • ▼ Show 20 Lines
	; AVX2-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8			; AVX2-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 8
	; AVX2-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*			; AVX2-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP61]], i32 8, <4 x i1> [[TMP54]])
	; AVX2-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12			; AVX2-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 12
	; AVX2-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*			; AVX2-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <4 x double>*
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0v4f64(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <4 x double>* [[TMP63]], i32 8, <4 x i1> [[TMP55]])
	; AVX2-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16			; AVX2-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 16
	; AVX2-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX2-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX2-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !54			; AVX2-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !61
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	Show All 9 Lines
	; AVX2-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX2-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8			; AVX2-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !55			; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !62
	; AVX2: for.end.loopexit:			; AVX2: for.end.loopexit:
	; AVX2-NEXT: br label [[FOR_END]]			; AVX2-NEXT: br label [[FOR_END]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo8(			; AVX512-LABEL: @foo8(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
	; AVX512-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0			; AVX512-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0
	▲ Show 20 Lines • Show All 87 Lines • ▼ Show 20 Lines
	; AVX512-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 16			; AVX512-NEXT: [[TMP60:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 16
	; AVX512-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <8 x double>*			; AVX512-NEXT: [[TMP61:%.*]] = bitcast double* [[TMP60]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP61]], i32 8, <8 x i1> [[TMP54]])			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP61]], i32 8, <8 x i1> [[TMP54]])
	; AVX512-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 24			; AVX512-NEXT: [[TMP62:%.*]] = getelementptr inbounds double, double* [[TMP44]], i32 24
	; AVX512-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <8 x double>*			; AVX512-NEXT: [[TMP63:%.*]] = bitcast double* [[TMP62]] to <8 x double>*
	; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP63]], i32 8, <8 x i1> [[TMP55]])			; AVX512-NEXT: call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, <8 x double>* [[TMP63]], i32 8, <8 x i1> [[TMP55]])
	; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32			; AVX512-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 32
	; AVX512-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX512-NEXT: [[TMP64:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX512-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !64			; AVX512-NEXT: br i1 [[TMP64]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !71
	; AVX512: middle.block:			; AVX512: middle.block:
	; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX512-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX512-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX512: scalar.ph:			; AVX512: scalar.ph:
	; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX512-NEXT: br label [[FOR_BODY:%.*]]			; AVX512-NEXT: br label [[FOR_BODY:%.*]]
	; AVX512: for.body:			; AVX512: for.body:
	; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	Show All 9 Lines
	; AVX512-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX512-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX512: if.then:			; AVX512: if.then:
	; AVX512-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, double* [[OUT]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8			; AVX512-NEXT: store double 5.000000e-01, double* [[ARRAYIDX5]], align 8
	; AVX512-NEXT: br label [[FOR_INC]]			; AVX512-NEXT: br label [[FOR_INC]]
	; AVX512: for.inc:			; AVX512: for.inc:
	; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX512-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX512-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX512-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !65			; AVX512-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop !72
	; AVX512: for.end.loopexit:			; AVX512: for.end.loopexit:
	; AVX512-NEXT: br label [[FOR_END]]			; AVX512-NEXT: br label [[FOR_END]]
	; AVX512: for.end:			; AVX512: for.end:
	; AVX512-NEXT: ret void			; AVX512-NEXT: ret void
	;			;
	entry:			entry:
	%cmp5 = icmp eq i32 %size, 0			%cmp5 = icmp eq i32 %size, 0
	br i1 %cmp5, label %for.end, label %for.body.preheader			br i1 %cmp5, label %for.end, label %for.body.preheader
	Show All 34 Lines

test/Transforms/LoopVectorize/if-conversion.ll

	Show First 20 Lines • Show All 102 Lines • ▼ Show 20 Lines
	for.end: ; preds = %for.inc, %entry			for.end: ; preds = %for.inc, %entry
	%sum.0.lcssa = phi i32 [ 0, %entry ], [ %sum.1, %for.inc ]			%sum.0.lcssa = phi i32 [ 0, %entry ], [ %sum.1, %for.inc ]
	ret i32 %sum.0.lcssa			ret i32 %sum.0.lcssa
	}			}

	@a = common global [1 x i32*] zeroinitializer, align 8			@a = common global [1 x i32*] zeroinitializer, align 8
	@c = common global i32* null, align 8			@c = common global i32* null, align 8

	; We use to if convert this loop. This is not safe because there is a trapping			; Constant expressions never trap; check that we perform the transform
	; constant expression.			; consistently.
	; PR16729			; PR16729

	; CHECK-LABEL: trapping_constant_expression			; CHECK-LABEL: trapping_constant_expression
	; CHECK-NOT: or <4 x i32>			; CHECK: or <4 x i32>

	define i32 @trapping_constant_expression() {			define i32 @trapping_constant_expression() {
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	%inc3 = phi i32 [ 0, %entry ], [ %inc, %cond.end ]			%inc3 = phi i32 [ 0, %entry ], [ %inc, %cond.end ]
	%or2 = phi i32 [ 0, %entry ], [ %or, %cond.end ]			%or2 = phi i32 [ 0, %entry ], [ %or, %cond.end ]
	br i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 0, i64 0), i32** @c), label %cond.false, label %cond.end			br i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 0, i64 0), i32** getelementptr inbounds (i32*, i32** @c, i64 1)), label %cond.false, label %cond.end

	cond.false:			cond.false:
	br label %cond.end			br label %cond.end

	cond.end:			cond.end:
	%cond = phi i32 [ sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 0, i64 0), i32** @c) to i32)), %cond.false ], [ 0, %for.body ]			%cond = phi i32 [ sdiv (i32 1, i32 zext (i1 icmp eq (i32** getelementptr inbounds ([1 x i32*], [1 x i32*]* @a, i64 0, i64 0), i32** getelementptr inbounds (i32*, i32** @c, i64 1)) to i32)), %cond.false ], [ 0, %for.body ]
	%or = or i32 %or2, %cond			%or = or i32 %or2, %cond
	%inc = add nsw i32 %inc3, 1			%inc = add nsw i32 %inc3, 1
	%cmp = icmp slt i32 %inc, 128			%cmp = icmp slt i32 %inc, 128
	br i1 %cmp, label %for.body, label %for.end			br i1 %cmp, label %for.body, label %for.end

	for.end:			for.end:
	ret i32 %or			ret i32 %or
	}			}

	; Neither should we if-convert if there is an instruction operand that is a			; Constant expressions never trap; check that we perform the transform consistently.
	; trapping constant expression.
	; PR16729			; PR16729

	; CHECK-LABEL: trapping_constant_expression2			; CHECK-LABEL: trapping_constant_expression2
	; CHECK-NOT: or <4 x i32>			; CHECK: or <4 x i32>

	define i32 @trapping_constant_expression2() {			define i32 @trapping_constant_expression2() {
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	%inc3 = phi i32 [ 0, %entry ], [ %inc, %cond.end ]			%inc3 = phi i32 [ 0, %entry ], [ %inc, %cond.end ]
	%or2 = phi i32 [ 0, %entry ], [ %or, %cond.end ]			%or2 = phi i32 [ 0, %entry ], [ %or, %cond.end ]
	▲ Show 20 Lines • Show All 43 Lines • Show Last 20 Lines

test/Transforms/SimplifyCFG/2006-10-19-UncondDiv.ll

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; PR957			; PR957
	; RUN: opt < %s -simplifycfg -S | FileCheck %s			; RUN: opt < %s -simplifycfg -S | FileCheck %s

	; CHECK-NOT: select

	@G = extern_weak global i32			@G = extern_weak global i32

	define i32 @test(i32 %tmp) {			define i32 @test(i32 %tmp) {
				; CHECK-LABEL: @test(
				; CHECK-NEXT: cond_false179:
				; CHECK-NEXT: [[TMP181:%.*]] = icmp eq i32 [[TMP:%.*]], 0
				; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[TMP181]], i32 udiv (i32 1, i32 ptrtoint (i32* @G to i32)), i32 [[TMP]]
				; CHECK-NEXT: ret i32 [[SPEC_SELECT]]
				;
	cond_false179:			cond_false179:
	%tmp181 = icmp eq i32 %tmp, 0 ; <i1> [#uses=1]			%tmp181 = icmp eq i32 %tmp, 0
	br i1 %tmp181, label %cond_true182, label %cond_next185			br i1 %tmp181, label %cond_true182, label %cond_next185
	cond_true182: ; preds = %cond_false179			cond_true182:
	br label %cond_next185			br label %cond_next185
	cond_next185: ; preds = %cond_true182, %cond_false179			cond_next185:
	%d0.3 = phi i32 [ udiv (i32 1, i32 ptrtoint (i32* @G to i32)), %cond_true182 ], [ %tmp, %cond_false179 ] ; <i32> [#uses=1]			%d0.3 = phi i32 [ udiv (i32 1, i32 ptrtoint (i32* @G to i32)), %cond_true182 ], [ %tmp, %cond_false179 ]
	ret i32 %d0.3			ret i32 %d0.3
	}			}

	define i32 @test2(i32 %tmp) {			define i32 @test2(i32 %tmp) {
				; CHECK-LABEL: @test2(
				; CHECK-NEXT: cond_false179:
				; CHECK-NEXT: [[TMP181:%.*]] = icmp eq i32 [[TMP:%.*]], 0
				; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[TMP181]], i32 udiv (i32 1, i32 ptrtoint (i32* @G to i32)), i32 [[TMP]]
				; CHECK-NEXT: [[TMP0:%.*]] = call i32 @test(i32 4)
				; CHECK-NEXT: ret i32 [[SPEC_SELECT]]
				;
	cond_false179:			cond_false179:
	%tmp181 = icmp eq i32 %tmp, 0 ; <i1> [#uses=1]			%tmp181 = icmp eq i32 %tmp, 0
	br i1 %tmp181, label %cond_true182, label %cond_next185			br i1 %tmp181, label %cond_true182, label %cond_next185
	cond_true182: ; preds = %cond_false179			cond_true182: ; preds = %cond_false179
	br label %cond_next185			br label %cond_next185
	cond_next185: ; preds = %cond_true182, %cond_false179			cond_next185:
	%d0.3 = phi i32 [ udiv (i32 1, i32 ptrtoint (i32* @G to i32)), %cond_true182 ], [ %tmp, %cond_false179 ] ; <i32> [#uses=1]			%d0.3 = phi i32 [ udiv (i32 1, i32 ptrtoint (i32* @G to i32)), %cond_true182 ], [ %tmp, %cond_false179 ]
	call i32 @test( i32 4 ) ; <i32>:0 [#uses=0]			call i32 @test( i32 4 )
	ret i32 %d0.3			ret i32 %d0.3
	}			}

test/Transforms/SimplifyCFG/ConditionalTrappingConstantExpr.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt < %s -simplifycfg -S | FileCheck %s			; RUN: opt < %s -simplifycfg -S | FileCheck %s

	@G = extern_weak global i32			@G = extern_weak global i32

	; PR3354			; PR3354
	; Do not merge bb1 into the entry block, it might trap.			; Constant expressions never trap; check that we perform the transform consistently.

	define i32 @admiral(i32 %a, i32 %b) {			define i32 @admiral(i32 %a, i32 %b) {
	; CHECK-LABEL: @admiral(			; CHECK-LABEL: @admiral(
	; CHECK-NEXT: [[C:%.*]] = icmp sle i32 %a, %b			; CHECK-NEXT: bb2:
	; CHECK-NEXT: br i1 [[C]], label %bb2, label %bb1			; CHECK-NEXT: [[C:%.*]] = icmp sgt i32 [[A:%.*]], [[B:%.*]]
	; CHECK: bb1:
	; CHECK-NEXT: [[D:%.*]] = icmp sgt i32 sdiv (i32 -32768, i32 ptrtoint (i32* @G to i32)), 0			; CHECK-NEXT: [[D:%.*]] = icmp sgt i32 sdiv (i32 -32768, i32 ptrtoint (i32* @G to i32)), 0
	; CHECK-NEXT: [[DOT:%.*]] = select i1 [[D]], i32 927, i32 42			; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[C]], [[D]]
	; CHECK-NEXT: br label %bb2			; CHECK-NEXT: [[MERGE:%.*]] = select i1 [[OR_COND]], i32 927, i32 42
	; CHECK: bb2:
	; CHECK-NEXT: [[MERGE:%.*]] = phi i32 [ 42, %0 ], [ [[DOT]], %bb1 ]
	; CHECK-NEXT: ret i32 [[MERGE]]			; CHECK-NEXT: ret i32 [[MERGE]]
	;			;
	%c = icmp sle i32 %a, %b			%c = icmp sle i32 %a, %b
	br i1 %c, label %bb2, label %bb1			br i1 %c, label %bb2, label %bb1
	bb1:			bb1:
	%d = icmp sgt i32 sdiv (i32 -32768, i32 ptrtoint (i32* @G to i32)), 0			%d = icmp sgt i32 sdiv (i32 -32768, i32 ptrtoint (i32* @G to i32)), 0
	br i1 %d, label %bb6, label %bb2			br i1 %d, label %bb6, label %bb2
	bb2:			bb2:
	ret i32 42			ret i32 42
	bb6:			bb6:
	ret i32 927			ret i32 927
	}			}

	define i32 @ackbar(i1 %c) {			define i32 @ackbar(i1 %c) {
	; CHECK-LABEL: @ackbar(			; CHECK-LABEL: @ackbar(
	; CHECK-NEXT: br i1 %c, label %bb5, label %bb6			; CHECK-NEXT: bb6:
	; CHECK: bb5:			; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 icmp sgt (i32 sdiv (i32 32767, i32 ptrtoint (i32* @G to i32)), i32 0), i32 42, i32 927
	; CHECK-NEXT: [[DOT:%.*]] = select i1 icmp sgt (i32 sdiv (i32 32767, i32 ptrtoint (i32* @G to i32)), i32 0), i32 42, i32 927			; CHECK-NEXT: [[MERGE:%.*]] = select i1 [[C:%.*]], i32 [[SPEC_SELECT]], i32 42
	; CHECK-NEXT: br label %bb6
	; CHECK: bb6:
	; CHECK-NEXT: [[MERGE:%.*]] = phi i32 [ 42, %0 ], [ [[DOT]], %bb5 ]
	; CHECK-NEXT: ret i32 [[MERGE]]			; CHECK-NEXT: ret i32 [[MERGE]]
	;			;
	br i1 %c, label %bb5, label %bb6			br i1 %c, label %bb5, label %bb6
	bb5:			bb5:
	br i1 icmp sgt (i32 sdiv (i32 32767, i32 ptrtoint (i32* @G to i32)), i32 0), label %bb6, label %bb7			br i1 icmp sgt (i32 sdiv (i32 32767, i32 ptrtoint (i32* @G to i32)), i32 0), label %bb6, label %bb7
	bb6:			bb6:
	ret i32 42			ret i32 42
	bb7:			bb7:
	ret i32 927			ret i32 927
	}			}

	; FP ops don't trap by default, so this is safe to hoist.			; FP ops don't trap by default, so this is safe to hoist.

	define i32 @tarp(i1 %c) {			define i32 @tarp(i1 %c) {
	; CHECK-LABEL: @tarp(			; CHECK-LABEL: @tarp(
	; CHECK-NEXT: bb9:			; CHECK-NEXT: bb9:
	; CHECK-NEXT: [[DOT:%.*]] = select i1 fcmp oeq (float fdiv (float 3.000000e+00, float sitofp (i32 ptrtoint (i32* @G to i32) to float)), float 1.000000e+00), i32 42, i32 927			; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 fcmp oeq (float fdiv (float 3.000000e+00, float sitofp (i32 ptrtoint (i32* @G to i32) to float)), float 1.000000e+00), i32 42, i32 927
	; CHECK-NEXT: [[MERGE:%.*]] = select i1 %c, i32 [[DOT]], i32 42			; CHECK-NEXT: [[MERGE:%.*]] = select i1 [[C:%.*]], i32 [[SPEC_SELECT]], i32 42
	; CHECK-NEXT: ret i32 [[MERGE]]			; CHECK-NEXT: ret i32 [[MERGE]]
	;			;
	br i1 %c, label %bb8, label %bb9			br i1 %c, label %bb8, label %bb9
	bb8:			bb8:
	br i1 fcmp oeq (float fdiv (float 3.0, float sitofp (i32 ptrtoint (i32* @G to i32) to float)), float 1.0), label %bb9, label %bb10			br i1 fcmp oeq (float fdiv (float 3.0, float sitofp (i32 ptrtoint (i32* @G to i32) to float)), float 1.0), label %bb9, label %bb10
	bb9:			bb9:
	ret i32 42			ret i32 42
	bb10:			bb10:
	ret i32 927			ret i32 927
	}			}

test/Transforms/SimplifyCFG/PR16069.ll

	; NOTE: Assertions have been autogenerated by update_test_checks.py			; NOTE: Assertions have been autogenerated by update_test_checks.py
	; RUN: opt < %s -simplifycfg -S | FileCheck %s			; RUN: opt < %s -simplifycfg -S | FileCheck %s

	@b = extern_weak global i32			@b = extern_weak global i32

	define i32 @foo(i1 %y) {			define i32 @foo(i1 %y) {
	; CHECK-LABEL: @foo(			; CHECK-LABEL: @foo(
	; CHECK: [[COND_I:%.*]] = phi i32 [ srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32)), %bb2 ], [ 0, %0 ]			; CHECK: [[COND_I:%.*]] = select i1 %y, i32 0, i32 srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32))
	; CHECK-NEXT: ret i32 [[COND_I]]			; CHECK-NEXT: ret i32 [[COND_I]]
	;			;
	br i1 %y, label %bb1, label %bb2			br i1 %y, label %bb1, label %bb2
	bb1:			bb1:
	br label %bb3			br label %bb3
	bb2:			bb2:
	br label %bb3			br label %bb3
	bb3:			bb3:
	%cond.i = phi i32 [ 0, %bb1 ], [ srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32)), %bb2 ]			%cond.i = phi i32 [ 0, %bb1 ], [ srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32)), %bb2 ]
	ret i32 %cond.i			ret i32 %cond.i
	}			}

	define i32 @foo2(i1 %x) {			define i32 @foo2(i1 %x) {
	; CHECK-LABEL: @foo2(			; CHECK-LABEL: @foo2(
	; CHECK: [[COND:%.*]] = phi i32 [ 0, %bb1 ], [ srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32)), %bb0 ]			; CHECK: [[COND:%.*]] = select i1 %x, i32 0, i32 srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32))
	; CHECK-NEXT: ret i32 [[COND]]			; CHECK-NEXT: ret i32 [[COND]]
	;			;
	bb0:			bb0:
	br i1 %x, label %bb1, label %bb2			br i1 %x, label %bb1, label %bb2
	bb1:			bb1:
	br label %bb2			br label %bb2
	bb2:			bb2:
	%cond = phi i32 [ 0, %bb1 ], [ srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32)), %bb0 ]			%cond = phi i32 [ 0, %bb1 ], [ srem (i32 1, i32 zext (i1 icmp eq (i32* @b, i32* null) to i32)), %bb0 ]
	ret i32 %cond			ret i32 %cond
	}			}
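
The PR16069.ll change above replaces a phi whose srem operand was only reachable along one edge with a select that evaluates the srem unconditionally. The following C++ fragment is a rough, illustrative model of why that is acceptable once constant expressions never trap. It is not LLVM code: the names Value, sremNoTrap, fooBranchy, fooSelect and formsAgree are hypothetical, and representing the result of a division by zero as std::nullopt (an unspecified, non-trapping value) is an assumption made for this sketch, not something the patch text shown here specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model: std::nullopt stands for "some unspecified value that
// does not trap" (an assumption made for this sketch only).
using Value = std::optional<int32_t>;

// Models a constant-expression srem that never traps.
Value sremNoTrap(int32_t lhs, int32_t rhs) {
  if (rhs == 0)
    return std::nullopt; // division by zero: no trap, unspecified result
  if (lhs == INT32_MIN && rhs == -1)
    return std::nullopt; // signed overflow: no trap, unspecified result
  return lhs % rhs;
}

// Old CHECK line: the phi only observes the srem on the edge from bb2,
// which is taken when %y is false.
Value fooBranchy(bool y, bool bIsNull) {
  if (y)
    return 0;                            // bb1 -> bb3
  return sremNoTrap(1, bIsNull ? 1 : 0); // bb2 -> bb3
}

// New CHECK line: the select computes the srem on every path, then picks.
Value fooSelect(bool y, bool bIsNull) {
  Value speculated = sremNoTrap(1, bIsNull ? 1 : 0);
  return y ? Value(0) : speculated;
}

// Compares both forms over every combination of inputs.
bool formsAgree() {
  for (int y = 0; y < 2; ++y)
    for (int n = 0; n < 2; ++n)
      if (fooBranchy(y != 0, n != 0) != fooSelect(y != 0, n != 0))
        return false;
  return true;
}
```

In this model both forms return the same result for every input, because evaluating the division early has no side effect. Under the old canTrap rule the speculated srem could have had undefined behavior on a path that previously never reached it, which is why the old CHECK lines expected the phi to survive.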

test/Transforms/SimplifyCFG/PR17073.ll

	; RUN: opt < %s -simplifycfg -S | FileCheck %s			; RUN: opt < %s -simplifycfg -S | FileCheck %s

	; In PR17073 ( http://llvm.org/pr17073 ), we illegally hoisted an operation that can trap.			; Constant expressions never trap; check that we perform the transform consistently.
	; The first test confirms that we don't do that when the trapping op is reached by the current BB (block1).
	; The second test confirms that we don't do that when the trapping op is reached by the previous BB (entry).
	; The third test confirms that we can still do this optimization for an operation (add) that doesn't trap.
	; The tests must be complicated enough to prevent previous SimplifyCFG actions from optimizing away
	; the instructions that we're checking for.

	target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"			target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"
	target triple = "i386-apple-macosx10.9.0"			target triple = "i386-apple-macosx10.9.0"

	@a = common global i32 0, align 4			@a = common global i32 0, align 4
	@b = common global i8 0, align 1			@b = common global i8 0, align 1

	; CHECK-LABEL: can_trap1			; CHECK-LABEL: can_trap1
	; CHECK-NOT: or i1 %tobool, icmp eq (i32* bitcast (i8* @b to i32*), i32* @a)			; CHECK: select i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a), i32* select (i1 icmp eq (i64 urem (i64 2, i64 zext (i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a) to i64)), i64 0), i32* null, i32* @a), i32* null
	; CHECK-NOT: select i1 %tobool, i32* null, i32* select (i1 icmp eq (i64 urem (i64 2, i64 zext (i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a) to i64)), i64 0), i32* null, i32* @a)
	define i32* @can_trap1() {			define i32* @can_trap1() {
	entry:			entry:
	%0 = load i32, i32* @a, align 4			%0 = load i32, i32* @a, align 4
	%tobool = icmp eq i32 %0, 0			%tobool = icmp eq i32 %0, 0
	br i1 %tobool, label %exit, label %block1			br i1 %tobool, label %exit, label %block1

	block1:			block1:
	br i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a), label %exit, label %block2			br i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a), label %exit, label %block2

	block2:			block2:
	br label %exit			br label %exit

	exit:			exit:
	%storemerge = phi i32* [ null, %entry ],[ null, %block2 ], [ select (i1 icmp eq (i64 urem (i64 2, i64 zext (i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a) to i64)), i64 0), i32* null, i32* @a), %block1 ]			%storemerge = phi i32* [ null, %entry ],[ null, %block2 ], [ select (i1 icmp eq (i64 urem (i64 2, i64 zext (i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a) to i64)), i64 0), i32* null, i32* @a), %block1 ]
	ret i32* %storemerge			ret i32* %storemerge
	}			}

	; CHECK-LABEL: can_trap2			; CHECK-LABEL: can_trap2
	; CHECK-NOT: or i1 %tobool, icmp eq (i32* bitcast (i8* @b to i32*), i32* @a)			; CHECK: select i1 %tobool, i32* select (i1 icmp eq (i64 urem (i64 2, i64 zext (i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a) to i64)), i64 0), i32* null, i32* @a), i32* null
	; CHECK-NOT: select i1 %tobool, i32* select (i1 icmp eq (i64 urem (i64 2, i64 zext (i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a) to i64)), i64 0), i32* null, i32* @a), i32* null
	define i32* @can_trap2() {			define i32* @can_trap2() {
	entry:			entry:
	%0 = load i32, i32* @a, align 4			%0 = load i32, i32* @a, align 4
	%tobool = icmp eq i32 %0, 0			%tobool = icmp eq i32 %0, 0
	br i1 %tobool, label %exit, label %block1			br i1 %tobool, label %exit, label %block1

	block1:			block1:
	br i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a), label %exit, label %block2			br i1 icmp eq (i32* bitcast (i8* @b to i32*), i32* @a), label %exit, label %block2

unittests/IR/ConstantsTest.cpp

CHECK(ConstantExpr::getAdd(P0, P0, false, true), "add nsw i32 " P0STR ", "		CHECK(ConstantExpr::getAdd(P0, P0, false, true), "add nsw i32 " P0STR ", "
P0STR);		P0STR);
CHECK(ConstantExpr::getAdd(P0, P0, true, true), "add nuw nsw i32 " P0STR ", "		CHECK(ConstantExpr::getAdd(P0, P0, true, true), "add nuw nsw i32 " P0STR ", "
P0STR);		P0STR);
CHECK(ConstantExpr::getFAdd(P1, P1), "fadd float " P1STR ", " P1STR);		CHECK(ConstantExpr::getFAdd(P1, P1), "fadd float " P1STR ", " P1STR);
CHECK(ConstantExpr::getSub(P0, P0), "sub i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getSub(P0, P0), "sub i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getFSub(P1, P1), "fsub float " P1STR ", " P1STR);		CHECK(ConstantExpr::getFSub(P1, P1), "fsub float " P1STR ", " P1STR);
CHECK(ConstantExpr::getMul(P0, P0), "mul i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getMul(P0, P0), "mul i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getFMul(P1, P1), "fmul float " P1STR ", " P1STR);		CHECK(ConstantExpr::getFMul(P1, P1), "fmul float " P1STR ", " P1STR);
CHECK(ConstantExpr::getUDiv(P0, P0), "udiv i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getUDiv(P0, P0),
CHECK(ConstantExpr::getSDiv(P0, P0), "sdiv i32 " P0STR ", " P0STR);		"udiv i32 " P0STR ", select (i1 icmp eq (i32 " P0STR
		", i32 0), i32 1, i32 " P0STR ")");
		CHECK(ConstantExpr::getSDiv(P4, P0),
		"sdiv i32 " P4STR
		", select (i1 or (i1 and (i1 icmp eq (i32 " P0STR
		", i32 -1), i1 icmp eq (i32 " P4STR
		", i32 -2147483648)), i1 icmp eq (i32 " P0STR ", i32 0)), i32 1, i32 "
		P0STR ")");
CHECK(ConstantExpr::getFDiv(P1, P1), "fdiv float " P1STR ", " P1STR);		CHECK(ConstantExpr::getFDiv(P1, P1), "fdiv float " P1STR ", " P1STR);
CHECK(ConstantExpr::getURem(P0, P0), "urem i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getURem(P0, P0), "urem i32 " P0STR ", select (i1 icmp eq (i32 " P0STR ", i32 0), i32 1, i32 " P0STR ")");
CHECK(ConstantExpr::getSRem(P0, P0), "srem i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getSRem(P4, P0),
		"srem i32 " P4STR
		", select (i1 or (i1 and (i1 icmp eq (i32 " P0STR
		", i32 -1), i1 icmp eq (i32 " P4STR
		", i32 -2147483648)), i1 icmp eq (i32 " P0STR ", i32 0)), i32 1, i32 "
		P0STR ")");
CHECK(ConstantExpr::getFRem(P1, P1), "frem float " P1STR ", " P1STR);		CHECK(ConstantExpr::getFRem(P1, P1), "frem float " P1STR ", " P1STR);
CHECK(ConstantExpr::getAnd(P0, P0), "and i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getAnd(P0, P0), "and i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getOr(P0, P0), "or i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getOr(P0, P0), "or i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getXor(P0, P0), "xor i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getXor(P0, P0), "xor i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getShl(P0, P0), "shl i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getShl(P0, P0), "shl i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getShl(P0, P0, true), "shl nuw i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getShl(P0, P0, true), "shl nuw i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getShl(P0, P0, false, true), "shl nsw i32 " P0STR ", "		CHECK(ConstantExpr::getShl(P0, P0, false, true), "shl nsw i32 " P0STR ", "
P0STR);		P0STR);
CHECK(ConstantExpr::getLShr(P0, P0, false), "lshr i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getLShr(P0, P0, false), "lshr i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getLShr(P0, P0, true), "lshr exact i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getLShr(P0, P0, true), "lshr exact i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getAShr(P0, P0, false), "ashr i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getAShr(P0, P0, false), "ashr i32 " P0STR ", " P0STR);
CHECK(ConstantExpr::getAShr(P0, P0, true), "ashr exact i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getAShr(P0, P0, true), "ashr exact i32 " P0STR ", " P0STR);

CHECK(ConstantExpr::getSExt(P0, Int64Ty), "sext i32 " P0STR " to i64");		CHECK(ConstantExpr::getSExt(P0, Int64Ty), "sext i32 " P0STR " to i64");
CHECK(ConstantExpr::getZExt(P0, Int64Ty), "zext i32 " P0STR " to i64");		CHECK(ConstantExpr::getZExt(P0, Int64Ty), "zext i32 " P0STR " to i64");
CHECK(ConstantExpr::getFPTrunc(P2, FloatTy), "fptrunc double " P2STR		CHECK(ConstantExpr::getFPTrunc(P2, FloatTy), "fptrunc double " P2STR
" to float");		" to float");
CHECK(ConstantExpr::getFPExtend(P1, DoubleTy), "fpext float " P1STR		CHECK(ConstantExpr::getFPExtend(P1, DoubleTy), "fpext float " P1STR
" to double");		" to double");

CHECK(ConstantExpr::getExactUDiv(P0, P0), "udiv exact i32 " P0STR ", " P0STR);		CHECK(ConstantExpr::getExactUDiv(P0, P0),
		"udiv exact i32 " P0STR ", select (i1 icmp eq (i32 " P0STR
		", i32 0), i32 1, i32 " P0STR ")");

CHECK(ConstantExpr::getSelect(P3, P0, P4), "select i1 " P3STR ", i32 " P0STR		CHECK(ConstantExpr::getSelect(P3, P0, P4), "select i1 " P3STR ", i32 " P0STR
", i32 " P4STR);		", i32 " P4STR);
CHECK(ConstantExpr::getICmp(CmpInst::ICMP_EQ, P0, P4), "icmp eq i32 " P0STR		CHECK(ConstantExpr::getICmp(CmpInst::ICMP_EQ, P0, P4), "icmp eq i32 " P0STR
", " P4STR);		", " P4STR);
CHECK(ConstantExpr::getFCmp(CmpInst::FCMP_ULT, P1, P5), "fcmp ult float "		CHECK(ConstantExpr::getFCmp(CmpInst::FCMP_ULT, P1, P5), "fcmp ult float "
P1STR ", " P5STR);		P1STR ", " P5STR);
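
Reading the updated expectations above, the integer division and remainder constant expressions appear to wrap their divisor in a `select` that substitutes 1 whenever the original divisor would trap: a zero divisor for `udiv`/`urem`, and additionally the `INT_MIN / -1` overflow case for `sdiv`/`srem`. The following is a minimal standalone sketch of that guard in plain C++, assuming 32-bit operands as in the test. The helper names (`guardedUDivisor`, `guardedSDivisor`, `guardedSDiv`, `guardedSRem`) are hypothetical and are not part of the patch or of the LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers: they mirror the select expressions in the expected
// strings of the unit test, not any function in the patch itself.

// Unsigned case: the only trapping divisor is zero, so replace it with 1.
constexpr uint32_t guardedUDivisor(uint32_t rhs) {
  return rhs == 0 ? 1u : rhs;
}

// Signed case: guard both a zero divisor and the INT_MIN / -1 overflow.
// Corresponds to:
//   select (or (and (rhs == -1), (lhs == INT_MIN)), (rhs == 0)), 1, rhs
constexpr int32_t guardedSDivisor(int32_t lhs, int32_t rhs) {
  const bool overflows =
      rhs == -1 && lhs == std::numeric_limits<int32_t>::min();
  return (overflows || rhs == 0) ? 1 : rhs;
}

// Division and remainder built on the guarded divisor never trap.
constexpr int32_t guardedSDiv(int32_t lhs, int32_t rhs) {
  return lhs / guardedSDivisor(lhs, rhs);
}

constexpr int32_t guardedSRem(int32_t lhs, int32_t rhs) {
  return lhs % guardedSDivisor(lhs, rhs);
}
```

Under this reading, a divisor that cannot trap is passed through unchanged, so the guard only alters the result in cases where the unguarded operation would have had undefined behavior.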

