Many targets have optimizations for encoding constants with a repeated bit pattern. Sign extension is one of the more trivial/common ones, so avoid breaking it.
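As a concrete illustration (a standalone sketch, not code from the patch; the constants mirror the x86-64 example discussed later in this review):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // -16 sign-extends from 8 bits, so x86-64 can encode "and rax, -16"
  // with a one-byte immediate.
  int64_t Mask = -16;                      // 0xfffffffffffffff0
  // Suppose analysis proves the top two bits of the other operand are
  // zero (as in the llvm.assume example below).
  uint64_t KnownZero = 0xc000000000000000; // bits 62-63
  // Clearing those known-zero bits from the mask is semantically fine...
  int64_t Shrunk = Mask & ~(int64_t)KnownZero; // 0x3ffffffffffffff0
  // ...but the result no longer sign-extends from a small width, so the
  // target now needs a full 64-bit immediate (movabs on x86-64).
  printf("%016llx -> %016llx\n", (unsigned long long)Mask,
         (unsigned long long)Shrunk);
}
```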
Event Timeline
My first question here would be whether we should be undoing this in the backend instead. We perform demanded bits analysis on SDAG, and there is an existing targetShrinkDemandedConstant() hook. The name suggests that it should shrink the constant, but I see no reason why it couldn't do the reverse if that is profitable for the target. From a quick look, it seems like RISCV already does something like that.
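For reference, here is a hedged, untested sketch of how a target could use that hook to *widen* an AND mask back to a sign-extended immediate rather than shrink it. The class name MyTargetLowering and the 12-bit immediate width are illustrative, loosely modeled on what RISCV does:

```cpp
#include "llvm/CodeGen/TargetLowering.h"
using namespace llvm;

// Sketch only: widen an AND mask so its high bits become a sign extension
// of the demanded part, if the widened value fits the target's cheap
// sign-extended immediate form (12 bits here, as on RISCV).
bool MyTargetLowering::targetShrinkDemandedConstant(
    SDValue Op, const APInt &DemandedBits, const APInt &DemandedElts,
    TargetLoweringOpt &TLO) const {
  if (Op.getOpcode() != ISD::AND)
    return false;
  auto *C = dyn_cast<ConstantSDNode>(Op.getOperand(1));
  if (!C)
    return false;
  const APInt &Mask = C->getAPIntValue();
  // For an AND, setting mask bits that are not demanded cannot change any
  // demanded result bit, so this rewrite is always sound.
  APInt Expanded = Mask | ~DemandedBits;
  if (Expanded == Mask || !Expanded.isSignedIntN(12))
    return false;
  SDLoc DL(Op);
  EVT VT = Op.getValueType();
  SDValue NewC = TLO.DAG.getConstant(Expanded, DL, VT);
  SDValue NewAnd = TLO.DAG.getNode(ISD::AND, DL, VT, Op.getOperand(0), NewC);
  return TLO.CombineTo(Op, NewAnd);
}
```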
Of course, doing this in InstCombine may also make sense -- the test diffs look like this might be slightly beneficial at the IR layer for the pattern emitted by LoopVectorize (e.g. see diffs in LoopVectorize/X86/small-size.ll). If we do want this in InstCombine, I'd expect a more principled approach that does not hardcode specific sizes. E.g. we could generally avoid masking off any sign bits, to avoid increasing the number of significant (signed) bits.
+1 to investigating whether this can be handled in the DAG before adding something so specific to InstCombine. If value tracking can confirm that the upper bits are already zero, there's no reason why the various SimplifyDemanded* calls shouldn't determine whether a sext-imm is a better option for a specific target.
The diffs are similar to what I was going for in:
d6498abc241b
...and I'm not aware of any regressions from that change.
So this seems fine to try in IR even before looking at codegen (and then maybe we won't need to do anything in codegen).
But I agree it should not be hard-coded based on i32: avoid shrinking that increases the number of significant bits in general, possibly in combination with InstCombinerImpl::isDesirableIntType().
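Something like the following guard would capture that rule (a sketch using APInt; note that getSignificantBits() is called getMinSignedBits() in older releases):

```cpp
#include "llvm/ADT/APInt.h"
using namespace llvm;

// Clear known-zero bits from C only if that does not increase the number
// of significant (signed) bits, i.e. never turn a small-magnitude
// negative constant into a large-magnitude one.
static bool okToShrink(const APInt &C, const APInt &KnownZero) {
  APInt Shrunk = C & ~KnownZero;
  return Shrunk.getSignificantBits() <= C.getSignificantBits();
}
```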
So I took a look at X86 which already has X86DAGToDAGISel::shrinkAndImmediate. Tested with:
```c
long td(long a) {
  if (a & (3L << 62)) { __builtin_unreachable(); }
  return a & (-16L);
}
```
Which gets:
```llvm
%tobool.not = icmp ult i64 %a, 4611686018427387904
tail call void @llvm.assume(i1 %tobool.not)
%and1 = and i64 %a, 4611686018427387888
ret i64 %and1
```
For some reason, however, the known-bits information is lost by the time X86DAGToDAGISel::shrinkAndImmediate is called. In SelectionDAG::MaskedValueIsZero earlier in the pipeline we get Known Zeros/Ones: c00000000000000f / 0, but in X86ISelDAGToDAG.cpp it's Zeros/Ones: 0 / 0. Likewise, I checked around TargetLowering::SimplifyDemandedBits in the ISD::AND case, and by then LHSKnown doesn't have the needed information either: Zeros/Ones: 0 / 0.
My general (absolutely non-expert) opinion is that handling this in InstCombine with TTI.getIntImmCostInst makes the most sense. It avoids having to re-implement this logic in every target backend, and InstCombine runs before the information loss that makes undoing the shrink difficult.
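If TTI were allowed here, the query being discussed would look something like this (a sketch only; the helper name and cost kind are my choice, not part of the patch):

```cpp
#include "llvm/Analysis/TargetTransformInfo.h"
using namespace llvm;

// Keep the shrunk immediate only if the target considers it no more
// expensive to materialize in this instruction than the original.
static bool shrunkImmIsCheaper(const TargetTransformInfo &TTI,
                               unsigned Opcode, unsigned OpIdx,
                               const APInt &OldImm, const APInt &NewImm,
                               Type *Ty) {
  auto Kind = TargetTransformInfo::TCK_SizeAndLatency;
  return TTI.getIntImmCostInst(Opcode, OpIdx, NewImm, Ty, Kind) <=
         TTI.getIntImmCostInst(Opcode, OpIdx, OldImm, Ty, Kind);
}
```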
Edit:
If I change the code to:
```c
long foo();
long td(long a) {
  if (a & (3L << 62)) { return foo(); }
  return a & (-16L);
}
```
It works. What I expect is happening is that by the time the code reaches the DAG, the llvm.assume has been removed. Unless we want to propagate llvm.assume to the backends, there seems to be a genuine reason to keep this in InstCombine.
See my other comment: when I tested this in the X86 backend, if the known bits are determined via llvm.assume, that information is lost by the time it reaches the backend.
> Of course, doing this in InstCombine may also make sense -- the test diffs look like this might be slightly beneficial at the IR layer for the pattern emitted by LoopVectorize (e.g. see diffs in LoopVectorize/X86/small-size.ll). If we do want this in InstCombine, I'd expect a more principled approach that does not hardcode specific sizes. E.g. we could generally avoid masking off any sign bits, to avoid increasing the number of significant (signed) bits.
Can you expand on the rationale behind not using TTI.getIntImmCostInst here? As written this isn't a target-specific transform, and yet "is this imm better than that one?" seems like an inherently target-specific question.
Edit:
There already seems to be a target-specific exception for nearly this exact case in InstCombinerImpl::isDesirableIntType, which AFAICT queries isLegalInteger, which is target-dependent.
I don't think InstCombinerImpl::isDesirableIntType() is what we want. For example, on X86 64-bit integers are legal (so isDesirableIntType() will return true), so it would never help avoid the extra instruction incurred by movabs.
Yes, that's correct, llvm.assume is not preserved in SDAG. If the motivation here are assumes in particular, then we cannot undo in the backend.
InstCombine is a target-independent canonicalization pass. It produces IR in a canonical form that other passes (or other transforms in InstCombine for that matter) can rely on. This canonical IR is independent of the target, beyond an unavoidable dependence on the DataLayout.
We could loosen this restriction (this would require an RFC on discourse), but at this point I don't really see evidence that it would be necessary to handle this case.
If we avoid shrinking that increases significant bits, the regressions I see are in @scalar_i32_lshr_and_negC_eq_X_is_constant1 and @scalar_i32_lshr_and_negC_eq_X_is_constant2, so we'd want to think about whether these can be avoided. (This would be necessary regardless of restrictions on the transform, it's just essentially impossible to find these cases if target-dependence is involved.)
llvm.assume is lost when we go to the DAG - but don't we generate a suitable AssertZext node?
Make it so that we never shrink if it breaks sign extension.
Fix an optimization that was relying on there being no common bits between Known.Zero and the value.
Fair enough, will drop it.
> If we avoid shrinking that increases significant bits, the regressions I see are in @scalar_i32_lshr_and_negC_eq_X_is_constant1 and @scalar_i32_lshr_and_negC_eq_X_is_constant2, so we'd want to think about whether these can be avoided. (This would be necessary regardless of restrictions on the transform, it's just essentially impossible to find these cases if target-dependence is involved.)
Fixed those two cases. The issue was an assumption that Known.Zero & Value == 0, which let the code use ~Known.Zero + Value interchangeably with ~Known.Zero | Value (the two are equal only when the operands have no common bits). The original change would have broken on this had it been tested with i64 in addition to i32.
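For context, a minimal standalone demonstration of the broken identity (the specific constants are mine, chosen to mirror the i64 case):

```cpp
#include <cassert>
#include <cstdint>

int main() {
  // A + B == A | B holds only when A and B share no set bits.
  uint64_t NotKnownZero = 0x3ffffffffffffff0; // plays the role of ~Known.Zero
  uint64_t DisjointVal  = 0x000000000000000f; // no common bits: identity holds
  assert(NotKnownZero + DisjointVal == (NotKnownZero | DisjointVal));

  uint64_t OverlapVal = 0xfffffffffffffff0;   // shares bits: identity breaks
  assert(NotKnownZero + OverlapVal != (NotKnownZero | OverlapVal));
}
```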
There is one more case in llvm/test/Transforms/LoopVectorize/induction.ll where we regress from zext -> sext.
I took a bit of a look; the issue is that by unsetting bits, we are essentially propagating analysis information. In InstCombinerImpl::visitSExt, the check isKnownNonNegative(Src, DL, 0, &AC, &CI, &DT) never finds the prior comparison (which was available in InstCombineCompares), and since the and itself doesn't prohibit the sign bit, it evaluates to false.
My feeling is we have three choices:
- Allow removal of the sign bit if it transforms imm8 -> imm16/imm32 or imm16 -> imm32, as those are all the cases where we can get the sext -> zext optimization (see the sketch after this list).
- Allow arbitrary bit simplification in InstCombine and undo it in a later pass that runs while we still have llvm.assume (where maybe we can use TTI).
- If we add llvm.assume(%cond) at the head of all taken BBs and llvm.assume(!%cond) at the head of all non-taken BBs, this case is fixed (the analysis doesn't find the br, but it finds the assumes). In general this seems sensible, as there is a lot of duplicate logic for proving bits between the code that evaluates the CFG and the code that evaluates the assumes. No idea what this would do to compile time, though; my guess is it would absolutely explode.
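A minimal sketch of what the first option's guard could look like (bucket widths and helper names are mine, not from the patch):

```cpp
#include "llvm/ADT/APInt.h"
using namespace llvm;

// Smallest standard immediate width the constant sign-extends from.
static unsigned immBucket(const APInt &C) {
  unsigned Bits = C.getSignificantBits();
  return Bits <= 8 ? 8 : Bits <= 16 ? 16 : Bits <= 32 ? 32 : 64;
}

// Permit clearing the sign bit only for imm8 -> imm16/imm32 and
// imm16 -> imm32, the cases where a sext can become a zext without
// falling into an expensive 64-bit immediate form.
static bool signBitRemovalOK(const APInt &Old, const APInt &New) {
  return immBucket(Old) <= 16 && immBucket(New) <= 32;
}
```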
I'm fine with this change, but let's wait for @spatel. The zext->sext changes in induction.ll don't look particularly problematic to me.
LGTM - see inline for some code comment adjustments.
Inline comments:

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
- Line 1725: Adding is a bug, but I can't find a way to expose it since we always call SimplifyDemandedBits before we reach here. It's possible this patch will expose other bugs like that, so we'll need to watch for fallout.

llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
- Lines 47–53: For IR, the target effects are a secondary concern. We should just say something like:
- Line 57: undoinging -> undoing
- Line 59: propegated -> propagated
- Lines 64–65: I'd change this sentence to make the outcome we're looking for explicit:
  // For anything larger than an 8-bit (byte) constant, avoid changing a
  // small magnitude negative number into a larger magnitude number.
Forgot to mention - please edit the patch title/description to reflect the updated code (not i32/i64 specific any more).
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
- Line 68: I think we are missing an inverse case.
Hmm, I was generally in favor of trying to preserve this. Is there opposition to allowing removal of the sign bit if it can be used for sext -> zext (so imm8 -> imm16/imm32 and imm16 -> imm32)?
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
- Line 68: Good catch, will check that out in V3.
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
- Line 68: I implemented this but saw zero cases where it could apply.
I found another issue. I changed the code to just disable ShrinkDemandedConstant and noticed that quite a few more optimizations break. We are just getting "lucky" with this patch because the test coverage doesn't have constants that cross type size boundaries. From what I can tell, the following tests need to be fixed:
llvm/test/Transforms/InstCombine/demand_shrink_nsw.ll:
- foo

llvm/test/Transforms/InstCombine/icmp-and-shift.ll:
- icmp_eq_and_pow2_shl_pow2_negative1
- icmp_eq_and_pow2_shl_pow2_negative3
- icmp_eq_and_pow2_minus1_shl1
- icmp_eq_and_pow2_minus1_shl1_vec
- icmp_ne_and_pow2_minus1_shl1
- icmp_ne_and_pow2_minus1_shl1_vec
- icmp_eq_and_pow2_minus1_shl_pow2
- icmp_eq_and_pow2_minus1_shl_pow2_vec
- icmp_ne_and_pow2_minus1_shl_pow2

llvm/test/Transforms/InstCombine/icmp-mul-and.ll:
- mul_mask_pow2_sgt0
- mul_mask_pow2_eq4
- pr40493
- pr40493_neg1 (maybe)
- pr51551_neg1

llvm/test/Transforms/InstCombine/icmp.ll:
- test17vec
- test17a
- test20vec
- test20a

llvm/test/Transforms/InstCombine/select-ctlz-to-cttz.ll:
- PR45762
- PR45762_logical

llvm/test/Transforms/InstCombine/sub.ll:
- demand_low_bits_uses_commute
And then there are a variety of cases where we will lose cmp r, 0, which is often better than cmp r, C. A lot of analyses/transforms seem to use the actual constant values rather than the known-needed constant values. I think maybe adding a separate pass that runs right before llvm.assume is lost would be a better way to do this. Otherwise, I think we should fix all those cases first.