This is an archive of the discontinued LLVM Phabricator instance.

Differential D115261

[LV] Disable runtime unrolling for vectorized loops.
Closed · Public

Authored by fhahn on Dec 7 2021, 9:21 AM.

Details

Reviewers

Ayal
gilr
reames
dmgreen
lebedev.ri
nikic

Commits

rG68469a80cb74: [LV] Disable runtime unrolling for vectorized loops.

Summary

This patch adds metadata to disable runtime unrolling to the vectorized
loop. If runtime unrolling/interleaving is considered profitable, LV
will interleave the loop directly. There should be no need to perform
runtime unrolling at a later stage.

Note that we already add metadata to disable runtime unrolling to the
scalar loop after vectorization.
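
As a rough illustration of what the summary describes, the sketch below models a loop's hint list as plain strings and adds the runtime-unroll-disable hint to it. This is a toy model and not the LLVM `MDNode` API: `LoopHints`, `UnrollKind`, `unrollPermitted` and `addRuntimeUnrollDisable` are hypothetical names, and skipping loops that already carry an unroll-disable hint is an assumption about the intended behaviour. Only the two metadata strings are taken from LLVM.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the hints attached to a loop; NOT the LLVM MDNode API.
using LoopHints = std::vector<std::string>;

enum class UnrollKind { Full, Runtime };

// Reports whether the hints still permit the given kind of unrolling:
// "llvm.loop.unroll.disable" blocks everything, while
// "llvm.loop.unroll.runtime.disable" blocks only runtime unrolling.
inline bool unrollPermitted(const LoopHints &Hints, UnrollKind Kind) {
  for (const std::string &H : Hints) {
    if (H == "llvm.loop.unroll.disable")
      return false;
    if (H == "llvm.loop.unroll.runtime.disable" && Kind == UnrollKind::Runtime)
      return false;
  }
  return true;
}

// Hypothetical helper: append the runtime-disable hint unless the loop is
// already marked with one of the two unroll-disable hints (assumed behaviour).
inline void addRuntimeUnrollDisable(LoopHints &Hints) {
  for (const std::string &H : Hints)
    if (H == "llvm.loop.unroll.disable" ||
        H == "llvm.loop.unroll.runtime.disable")
      return;
  Hints.push_back("llvm.loop.unroll.runtime.disable");
  // Postcondition: runtime unrolling is now ruled out for this loop.
  assert(!unrollPermitted(Hints, UnrollKind::Runtime));
}
```

In this model, calling `addRuntimeUnrollDisable` on a vector loop's hints leaves `unrollPermitted(Hints, UnrollKind::Full)` true, which matches the distinction between runtime and full unrolling that the review discussion keeps returning to.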

The additional unrolling unnecessarily increases code size and compile
time. In addition to that we have several bug reports of unnecessary
runtime unrolling for vectorized loops, e.g. PR40961.

Compile-time improvements:

NewPM-O3: -2.08%
NewPM-ReleaseThinLTO: -1.09%
NewPM-ReleaseLTO-g: -1.59%

http://llvm-compile-time-tracker.com/compare.php?from=398dffd4ffc7cbc320e15207c8c04ca682d821c4&to=86c8f5499f5c9ef34d018491f8eb4364579dca16&stat=instructions

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. Dec 7 2021, 9:21 AM

Herald added a subscriber: hiraditya. Dec 7 2021, 9:21 AM

fhahn requested review of this revision. Dec 7 2021, 9:21 AM

Herald added a project: Restricted Project. Dec 7 2021, 9:21 AM

Compile time is misleading.
What about run time impact?

Harbormaster completed remote builds in B137933: Diff 392435. Dec 7 2021, 10:06 AM

Even if both of the unrollers are right as per their model
(LU duplicates whole loop body; while LV duplicates each instruction,
increasing live ranges, i believe), i'm mainly just worried
that two unroll strategies disagree in the end.

Which one is actually right? LV?
Is there some analysis that can be extracted from LV that LU could use
to deduce better unroll factor? (which would be 1x (no further unroll) after LV)

All that being said, i don't have any concrete examples that regress with this.

In D115261#3177417, @lebedev.ri wrote:

Even if both of the unrollers are right as per their model
(LU duplicates whole loop body; while LV duplicates each instruction,
increasing live ranges, i believe), i'm mainly just worried
that two unroll strategies disagree in the end.

Which one is actually right? LV?
Is there some analysis that can be extracted from LV that LU could use
to deduce better unroll factor? (which would be 1x (no further unroll) after LV)

All that being said, i don't have any concrete examples that regress with this.

Runtime loop unrolling doesn't really have anything that deserves the name of "cost model", at least if there is no profile data. It basically just unrolls the loop as many times as fits into a threshold. I don't know what kind of cost modelling LV does in this area, but I can only assume it's better than that ;)

I believe many targets already disable runtime unrolling for loops that contain vector instructions. For example AArch64 does that, though X86 currently does not. This is the principal alternative I would see, to move that logic up into the generic unroll preferences. It would be the difference between not unrolling loops that LLVM vectorized and not unrolling vector loops in general -- I assume the preference would be the former, as this patch does?
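
To make the "unrolls as many times as fits into a threshold" remark concrete, here is a deliberately simplified caricature of a purely size-based choice. It is not LoopUnroll's actual code: the function name and the `Threshold` and `MaxCount` parameters are made up for illustration, and no LLVM default values are implied.

```cpp
#include <cassert>

// Caricature of a size-only unroll decision: start from a maximum count and
// halve it until the unrolled body fits within the size budget. Nothing here
// looks at execution units, register pressure, or trip counts.
inline unsigned sizeBasedUnrollCount(unsigned BodySize, unsigned Threshold,
                                     unsigned MaxCount) {
  assert(BodySize > 0 && MaxCount > 0);
  unsigned Count = MaxCount;
  while (Count > 1 && BodySize * Count > Threshold)
    Count /= 2;
  return Count;
}
```

With these hypothetical numbers, a 10-instruction body under a budget of 150 is unrolled 8 times and a 40-instruction body twice; the only input that matters is how large the body is.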

I would still want to see some numbers from a run on an affected arch/cpu, where there previously would be unrolling and now there won't be.
Lack of change will be great, presence will mainly be a canary test only, not a blocker.

In D115261#3182932, @nikic wrote:

In D115261#3177417, @lebedev.ri wrote:

Even if both of the unrollers are right as per their model
(LU duplicates whole loop body; while LV duplicates each instruction,
increasing live ranges, i believe), i'm mainly just worried
that two unroll strategies disagree in the end.

Which one is actually right? LV?
Is there some analysis that can be extracted from LV that LU could use
to deduce better unroll factor? (which would be 1x (no further unroll) after LV)

All that being said, i don't have any concrete examples that regress with this.

Runtime loop unrolling doesn't really have anything that deserves the name of "cost model", at least if there is no profile data. It basically just unrolls the loop as many times as fits into a threshold. I don't know what kind of cost modelling LV does in this area, but I can only assume it's better than that ;)

LV at least tries to limit interleaving based on the number of execution units, so in that respect it should be a more realistic heuristic than the purely size-based one in the unroller. I guess one reason why the size-based thresholds for unrolling are still in place is that one of the main benefits from aggressive unrolling in LLVM is increasing the context for later local optimizations. This point shouldn't really apply for vectorized loops in most cases.

One interesting point that @lebedev.ri raised is that in some cases interleaving by LV won't happen due to it causing spills of vector registers, whereas this isn't a problem with runtime unrolling. But I'd assume in practice such loops should already be 'large enough'.

I believe many targets already disable runtime unrolling for loops that contain vector instructions. For example AArch64 does that, though X86 currently does not. This is the principal alternative I would see, to move that logic up into the generic unroll preferences. It would be the difference between not unrolling loops that LLVM vectorized and not unrolling vector loops in general -- I assume the preference would be the former, as this patch does?

IIRC AArch64 only enables runtime unrolling for in-order targets at the moment. Disabling unrolling for loops with vector instructions in general seems like a workaround.

In D115261#3185543, @lebedev.ri wrote:

I would still want to see some numbers from a run on an affected arch/cpu, where there previously would be unrolling and now there won't be.
Lack of change will be great, presence will mainly be a canary test only, not a blocker.

Agreed & fair point! I ran SPEC2006 on X86 and the only notable change is sphinx3, but interestingly enough in the function with the biggest runtime difference there are no codegen changes. Still taking a closer look, but it may be down to code alignment changes.

In theory, I'm not a fan of this patch for many of the reasons @lebedev.ri spelled out. Our cost model really should be good enough to avoid unrolling something which is not profitable without needing to have one pass overrule the heuristic in another. In practice, I think this is a reasonable workaround for a very deep and hard to fix problem (poor unroll heuristics). Assuming perf results are neutral, I'd be fine with this going in.

@fhahn Can you point to a couple other cases where this is needed for profitability? I took a look at PR40961, and that really looks more like a problem with us figuring out small trip counts and restricting transforms appropriately (i.e. the unroller should never be unrolling longer than the total loop trip count). That's "easily" fixable, and we should do so.

In D115261#3185784, @reames wrote:

@fhahn Can you point to a couple other cases where this is needed for profitability? I took a look at PR40961, and that really looks more like a problem with us figuring out small trip counts and restricting transforms appropriately (i.e. the unroller should never be unrolling longer than the total loop trip count). That's "easily" fixable, and we should do so.

Yeah some of the issues are related to SCEV missing trip counts and a couple of those have been fixed. That was one of the original motivations for the applyLoopGuards logic in SCEV. Unfortunately I am not able to find the other issue in the bug tracker I was thinking about just now.

But one different example is https://godbolt.org/z/7cE9soTa9 with a dependency on the accumulator registers

In D115261#3185855, @fhahn wrote:

In D115261#3185784, @reames wrote:

@fhahn Can you point to a couple other cases where this is needed for profitability? I took a look at PR40961, and that really looks more like a problem with us figuring out small trip counts and restricting transforms appropriately (i.e. the unroller should never be unrolling longer than the total loop trip count). That's "easily" fixable, and we should do so.

Yeah some of the issues are related to SCEV missing trip counts and a couple of those have been fixed. That was one of the original motivations for the applyLoopGuards logic in SCEV. Unfortunately I am not able to find the other issue in the bug tracker I was thinking about just now.

But one different example is https://godbolt.org/z/7cE9soTa9 with a dependency on the accumulator registers

So, looking at the codegen for that one, the unrolled form doesn't really look terrible. For an unknown trip count loop without profiling available, unrolling here seems fairly neutral perf wise (or at least I'd guess so). Codesize is definitely a regression. Can you explain why it is that unrolling is strictly unprofitable here? I have a bad feeling that this is only indirectly an interaction with the vectorizer, and there's some alternate framing here. I want to see your explanation to see what might fall out.

I believe many targets already disable runtime unrolling for loops that contain vector instructions. For example AArch64 does that, though X86 currently does not. This is the principal alternative I would see, to move that logic up into the generic unroll preferences. It would be the difference between not unrolling loops that LLVM vectorized and not unrolling vector loops in general -- I assume the preference would be the former, as this patch does?

In ARM-MVE we do already, both for vectorized code from the vectorizer and vector code from the frontend (intrinsics for example). That is largely to do with the way MVE handles tail-predication and hardware loops, which makes runtime unrolling generally detrimental.

For AArch64 we inherit some of the unrolling preferences from the base class, so I would expect this to alter in-order and out-of-order unrolling decisions to some degree. The main reason we disabled the extra unrolling of loops with vectors was to not over-unroll loops past the point that it is useful: if you have a loop acting on i8's that is vectorized x16, then "interleaved" x2 or x4, then further runtime unrolled, you end up with a fast loop body that handles 64 or 128 or even 256 elements at a time. For too many trip counts you just don't enter that loop or do most of the processing in the remainder loop. (Also we didn't want to break a lot of the hand-crafted intrinsic code that is out there that already unrolls the loop to carefully fit into the number of registers available.)

Note that we already add metadata to disable runtime unrolling to the scalar loop after vectorization.

We only currently add no-unroll metadata to the remainder when there are no runtime-checks added. I wouldn't be surprised if a lot of the codesize gain from this patch is due to this patch adding no-unroll to the remainder, not to the vector body. It is quite easy to construct cases where the remainder really should be allowed to unroll.

LLVM has never had very good end-to-end testing for this kind of thing and has relied upon benchmarks like the llvm-test-suite to fill that gap. The noise makes it difficult, but I would expect a decent amount of benchmarking to prove a change like this is better overall and to catch a lot of the cases where it is not.
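
The over-unrolling arithmetic in the comment above (i8 elements vectorized x16, interleaved, then runtime unrolled again) can be sketched as follows. The struct and function names are hypothetical; the only inputs are the vector width, LV's interleave count, and an extra runtime unroll factor.

```cpp
#include <cassert>
#include <cstdint>

struct LoopSplit {
  std::uint64_t ElemsPerIter;  // elements consumed by one fast-body iteration
  std::uint64_t FastIters;     // how often the fast body actually runs
  std::uint64_t LeftoverElems; // elements handled outside the fast body
};

// VF: vector width in elements, IC: LV interleave count,
// UF: additional runtime unroll factor applied after vectorization.
inline LoopSplit splitTripCount(std::uint64_t TripCount, unsigned VF,
                                unsigned IC, unsigned UF) {
  assert(VF > 0 && IC > 0 && UF > 0);
  const std::uint64_t Step = std::uint64_t(VF) * IC * UF;
  return {Step, TripCount / Step, TripCount % Step};
}
```

For a trip count of 200 with VF=16 and IC=4, leaving the loop alone (UF=1) gives a 64-element step, three fast iterations and 8 leftover elements. Runtime unrolling by a further 4 raises the step to 256, so the fast body never runs and all 200 elements fall to the slower paths.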

In D115261#3188312, @dmgreen wrote:

I believe many targets already disable runtime unrolling for loops that contain vector instructions. For example AArch64 does that, though X86 currently does not. This is the principal alternative I would see, to move that logic up into the generic unroll preferences. It would be the difference between not unrolling loops that LLVM vectorized and not unrolling vector loops in general -- I assume the preference would be the former, as this patch does?

In ARM-MVE we do already, both for vectorized code from the vectorizer and vector code from the frontend (intrinsics for example). That is largely to do with the way MVE handles tail-predication and hardware loops, which makes runtime unrolling generally detrimental.

For AArch64 we inherit some of the unrolling preferences from the base class, so I would expect this to alter in-order and out-of-order unrolling decisions to some degree. The main reason we disabled the extra unrolling of loops with vectors was to not over-unroll loops past the point that it is useful: if you have a loop acting on i8's that is vectorized x16, then "interleaved" x2 or x4, then further runtime unrolled, you end up with a fast loop body that handles 64 or 128 or even 256 elements at a time. For too many trip counts you just don't enter that loop or do most of the processing in the remainder loop. (Also we didn't want to break a lot of the hand-crafted intrinsic code that is out there that already unrolls the loop to carefully fit into the number of registers available.)

That's a good point. AFAICT AArch64 disables unrolling of loops with vector instructions in general (not just runtime unrolling, which is still not enabled by default in general IIRC). I guess an alternative would be to do this generally (or at least for x86 as well), as I sketched in D131972. It would probably also make sense to only do it for out-of-order CPUs for now.

But the general assumption remains: If LV didn't decide to interleave to increase parallelism, it is extremely unlikely LoopUnroll will make a better informed decision later on.

Note that we already add metadata to disable runtime unrolling to the scalar loop after vectorization.

We only currently add no-unroll metadata to the remainder when there are no runtime-checks added. I wouldn't be surprised if a lot of the codesize gain from this patch is due to this patch adding no-unroll to the remainder, not to the vector body. It is quite easy to construct cases where the remainder really should be allowed to unroll.

LLVM has never had very good end-to-end testing for this kind of thing and has relied upon benchmarks like the llvm-test-suite to fill that gap. The noise makes it difficult, but I would expect a decent amount of benchmarking to prove a change like this is better overall and to catch a lot of the cases where it is not.

I updated the patch to leave the metadata of the scalar loop untouched and add metadata to disable any unrolling of the vector loop. This should effectively have the same effect as D131972, although compile-time is slightly better as LoopUnroll exits earlier (geomean reduction with this patch for -O3 is -1.38% vs -1.18%).

If we decide to predicate this on out-of-order vs in-order, I am currently leaning towards making this decision in getUnrollingPreferences in TTI.

There is further feedback that aggressive unrolling of vector code on X86 isn't beneficial: https://github.com/llvm/llvm-project/issues/42332

https://llvm-compile-time-tracker.com/compare.php?from=a4a2ac5d1878177b57b76b109fda3820c6939a28&to=dca1dc4332b14064cf7f7618de58f2407b52c805&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=94d21a94d90db8bc0e983bde672790843f81ddca&to=a908a561c4639f45d29b43dd921fee0b24b42dfb&stat=instructions

Herald added a project: Restricted Project. Aug 16 2022, 9:10 AM

Herald added a subscriber: zzheng.

Harbormaster completed remote builds in B181548: Diff 453036. Aug 16 2022, 9:40 AM

LGTM. I think we should start with this patch, as the more conservative variant to D131972. We can then explore applying that one on top.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7575	Should probably drop Runtime from the name?
10564	Broken indent?

This revision is now accepted and ready to land. Aug 29 2022, 5:03 AM

Herald added a subscriber: pcwang-thead. Aug 29 2022, 5:03 AM

Yeah, this looks like a great start. Better performance & compile times.

Maybe also add a release note that LLVM no longer unrolls vectorized loops?

Hello. Sorry I have been travelling, I had not seen this had returned.

This looks like it disabled full unrolling after vectorization too? I think it is fairly important for performance to be able to simplify loops away where the runtime trip count is small enough by unrolling them. Either where the loop count becomes constant propagated through LTO or the trip count is just low enough. As far as I understand straight-line code will almost always be better than looping, and preventing full unrolling would lead to performance regressions.

There is an example in https://godbolt.org/z/8YnsWfGGo where.. something is going wrong.

fhahn mentioned this in rGfaad567589a3: [LV] Add test case where SCEV is needed to remove vector backedge. Aug 31 2022, 6:02 AM

In D115261#3760460, @dmgreen wrote:

There is an example in https://godbolt.org/z/8YnsWfGGo where.. something is going wrong.

Yeah this is a case where the current logic to eliminate single-iteration vector loops doesn't trigger. Should be fixed by D133017

This looks like it disabled full unrolling after vectorization too? I think it is fairly important for performance to be able to simplify loops away where the runtime trip count is small enough by unrolling them. Either where the loop count becomes constant propagated through LTO or the trip count is just low enough. As far as I understand straight-line code will almost always be better than looping, and preventing full unrolling would lead to performance regressions.

The above patch doesn't solve the issue where the loop will need unrolling twice or more times to completely remove the loop overhead. This shouldn't be an issue for X86/most AArch64 cores and full unrolling leads to excessive unrolling as in https://github.com/llvm/llvm-project/issues/42332. But it may still be an issue for some other architectures. So maybe targets should be able to opt-in/out?

In D115261#3761027, @fhahn wrote:

In D115261#3760460, @dmgreen wrote:

There is an example in https://godbolt.org/z/8YnsWfGGo where.. something is going wrong.

Yeah this is a case where the current logic to eliminate single-iteration vector loops doesn't trigger. Should be fixed by D133017

Thanks

This looks like it disabled full unrolling after vectorization too? I think it is fairly important for performance to be able to simplify loops away where the runtime trip count is small enough by unrolling them. Either where the loop count becomes constant propagated through LTO or the trip count is just low enough. As far as I understand straight-line code will almost always be better than looping, and preventing full unrolling would lead to performance regressions.

The above patch doesn't solve the issue where the loop will need unrolling twice or more times to completely remove the loop overhead. This shouldn't be an issue for X86/most AArch64 cores and full unrolling leads to excessive unrolling as in https://github.com/llvm/llvm-project/issues/42332. But it may still be an issue for some other architectures. So maybe targets should be able to opt-in/out?

Thanks for the extra context. The reasoning given (micro-op buffers) does sound very (micro-)architectural. I'm surprised if the loop body is the bottleneck that decode couldn't keep up with a fully unrolled version. A (relatively small) number of vector operations in a single basic block will often be faster than a loop. Just having the loop is going to cause a certain amount of overhead.

Disabling extra runtime unrolling after vectorization makes sense - otherwise the loop body can get too big and end up never executed. We do the same thing already on certain targets. There will be places where it is worse for performance, but the benefits are likely to be more common.

But full unrolling sounds like it should be mostly beneficial, due to the extra simplification it can provide. We are going from too much unrolling to too little. I don't think I have anything that shows it in the benchmarks I usually run, but the cases I've heard from customers in the past were DSP routines like those from https://github.com/ARM-software/CMSIS-DSP/tree/main/Source. They can get compiled with LTO, so during the normal compile step they are vectorized with unknown trip counts. During LTO they get inlined or const-propagated and a lot of the loop bounds become constant. It is expected that the compiler can simplify the result nicely, including the predicated vector loops we might have produced (which is why patches like https://reviews.llvm.org/D94103 were useful).

So I have no evidence of full unrolling being a problem, but my gut says that if it is useful to unroll scalar loops then vector loops shouldn't really be treated any differently.

Are you gonna land this patch? Or any blockers?

Matt added a subscriber: Matt. Sep 27 2022, 11:41 AM

Does this disable all unrolling, or only runtime unrolling?
I strongly suspect that the full unrolling should still be allowed.
Consider e.g. https://godbolt.org/z/fsdMhETh3 from D136806,
there we still need to full-unroll, after vectorization, to get the SROA to trigger.

I'm also a bit wary of the fact that LV unrolling
is functionally different to the normal unrolling.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7701	Looks like this is not runtime-specific?

nikic mentioned this in D140698: [LoopUnroll] Be more permissive to high-cost loop trip count SCEV's. Dec 27 2022, 1:12 PM

Rebased and adjusted code back to be for runtime-unrolling only for now.

In D115261#3807958, @xbolva00 wrote:

Are you gonna land this patch? Or any blockers?

Not any longer with 9758242046b3 landed.

In D115261#4001190, @lebedev.ri wrote:

Does this disable all unrolling, or only runtime unrolling?

This went through a couple of iterations. Updated the code to limit to runtime unrolling only as the description/title says.

I strongly suspect that the full unrolling should still be allowed.
Consider e.g. https://godbolt.org/z/fsdMhETh3 from D136806,
there we still need to full-unroll, after vectorization, to get the SROA to trigger.

Full unrolling is back to being allowed for this patch.

I'm also a bit wary of the fact that LV unrolling
is functionally different to the normal unrolling.

It is different, but LV's cost-modeling should be more realistic than the one for runtime unrolling. Runtime unrolling can actively be harmful (https://github.com/llvm/llvm-project/issues/40306) after LV and it adds substantial compile-time for little/no gain in the benchmark runs I did. If there are cases where it actively helps, I am happy to analyze those.

In D115261#4021955, @fhahn wrote:

Rebased and adjusted code back to be for runtime-unrolling only for now.

In D115261#3807958, @xbolva00 wrote:

Are you gonna land this patch? Or any blockers?

Not any longer with 9758242046b3 landed.

(not any longer what? :)

In D115261#4001190, @lebedev.ri wrote:

Does this disable all unrolling, or only runtime unrolling?

This went through a couple of iterations. Updated the code to limit to runtime unrolling only as the description/title says.

I strongly suspect that the full unrolling should still be allowed.
Consider e.g. https://godbolt.org/z/fsdMhETh3 from D136806,
there we still need to full-unroll, after vectorization, to get the SROA to trigger.

Full unrolling is back to being allowed for this patch.

Okay.

I'm also a bit wary of the fact that LV unrolling
is functionally different to the normal unrolling.

It is different, but LV's cost-modeling should be more realistic than the one for runtime unrolling. Runtime unrolling can actively be harmful (https://github.com/llvm/llvm-project/issues/40306) after LV and it adds substantial compile-time for little/no gain in the benchmark runs I did. If there are cases where it actively helps, I am happy to analyze those.

I'm talking about the fact that LV unrolling increases vector sizes,
(i mean, unless i'm grossly misremembering things?) while normal
runtime unrolling just executes N vector iterations at once,
allowing for speculative execution. So ignoring the cost model question,
they *are* different.

Look e.g. at the actual codegen for interleaving,
with higher VF's we often start running out of registers and spill,
and by that point any vectorization gains are almost lost,
while if we'd stayed at some smaller VF, we'd be fine.

Harbormaster completed remote builds in B205365: Diff 485873. Jan 2 2023, 11:29 AM

Hmm.

LV: IC is 4
LV: VF is 8
LV: Interleaving to saturate store or load ports.
LV: Minimum required TC for runtime checks to be profitable:0
LV: Found a vectorizable loop (8) in <stdin>
LV: Interleave Count is 4
LEV: Epilogue vectorization is not profitable for this loop
Executing best plan with VF=8, UF=4
LV: vectorizing VPBB:vector.ph in BB:vector.ph
LV: filled BB:
vector.ph:                                        ; preds = %.lr.ph.preheader
  %n.mod.vf = urem i64 %3, 32
  %n.vec = sub i64 %3, %n.mod.vf
  br label %middle.block
LV: VPBlock in RPO vector.body
LV: created vector.body
LV: draw edge fromvector.ph
LV: vectorizing VPBB:vector.body in BB:vector.body
LV: filled BB:
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ]
  %4 = add i64 %index, 0
  %5 = add i64 %index, 8
  %6 = add i64 %index, 16
  %7 = add i64 %index, 24
  %8 = getelementptr inbounds i32, ptr %0, i64 %4
  %9 = getelementptr inbounds i32, ptr %0, i64 %5
  %10 = getelementptr inbounds i32, ptr %0, i64 %6
  %11 = getelementptr inbounds i32, ptr %0, i64 %7
  %12 = getelementptr inbounds i32, ptr %8, i32 0
  %wide.load = load <8 x i32>, ptr %12, align 4, !tbaa !5
  %13 = getelementptr inbounds i32, ptr %8, i32 8
  %wide.load7 = load <8 x i32>, ptr %13, align 4, !tbaa !5
  %14 = getelementptr inbounds i32, ptr %8, i32 16
  %wide.load8 = load <8 x i32>, ptr %14, align 4, !tbaa !5
  %15 = getelementptr inbounds i32, ptr %8, i32 24
  %wide.load9 = load <8 x i32>, ptr %15, align 4, !tbaa !5
  %16 = mul nsw <8 x i32> %wide.load, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
  %17 = mul nsw <8 x i32> %wide.load7, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
  %18 = mul nsw <8 x i32> %wide.load8, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
  %19 = mul nsw <8 x i32> %wide.load9, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
  %20 = getelementptr inbounds i32, ptr %8, i32 0
  store <8 x i32> %16, ptr %20, align 4, !tbaa !5
  %21 = getelementptr inbounds i32, ptr %8, i32 8
  store <8 x i32> %17, ptr %21, align 4, !tbaa !5
  %22 = getelementptr inbounds i32, ptr %8, i32 16
  store <8 x i32> %18, ptr %22, align 4, !tbaa !5
  %23 = getelementptr inbounds i32, ptr %8, i32 24
  store <8 x i32> %19, ptr %23, align 4, !tbaa !5
  %index.next = add nuw i64 %index, 32
  %24 = icmp eq i64 %index.next, %n.vec
  br i1 %24, <null operand!>, label %vector.body
LV: vectorizing VPBB:middle.block in BB:middle.block

So i *was* thinking of something else.
It's possible that LV's unroll heuristic
may need further tuning, but in general
please proceed with this.
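
For readers following the debug output above, here is a scalar C++ model of the loop shape it shows: with VF=8 and UF=4 the vector body keeps 8-wide operations and clones them four times, rather than widening the vectors. That the original source loop was `a[i] *= 42` is an assumption inferred from the load, multiply and store on the same pointer; `scaleBy42` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of the vectorized loop in the log (VF=8, UF=4): each
// vector.body iteration covers 32 elements as four independent 8-wide parts.
inline void scaleBy42(std::vector<std::int32_t> &A) {
  constexpr std::size_t VF = 8, UF = 4, Step = VF * UF;
  const std::size_t N = A.size();
  const std::size_t NModVF = N % Step; // %n.mod.vf = urem i64 %3, 32
  const std::size_t NVec = N - NModVF; // %n.vec = sub i64 %3, %n.mod.vf
  assert(NVec % Step == 0);

  for (std::size_t Index = 0; Index < NVec; Index += Step) // vector.body
    for (std::size_t Part = 0; Part < UF; ++Part)          // four cloned ops
      for (std::size_t Lane = 0; Lane < VF; ++Lane)        // one <8 x i32>
        A[Index + Part * VF + Lane] *= 42;

  for (std::size_t I = NVec; I < N; ++I) // scalar remainder loop
    A[I] *= 42;
}
```

The four `Part` iterations are independent of each other, which is what gives an out-of-order core room to overlap them; the vector width itself stays at 8 lanes.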

In D115261#3807958, @xbolva00 wrote:

Are you gonna land this patch? Or any blockers?

Not any longer with 9758242046b3 landed.

(not any longer what? :)

In reply to @xbolva00: no pending blocker patches that I was aware of :)

In D115261#4023078, @lebedev.ri wrote:

Hmm.
So i *was* thinking of something else.
It's possible that LV's unroll heuristic
may need further tuning, but in general
please proceed with this.

Thanks for taking a look! Interleaving in LV clones vector operations. If interleaving causes spilling, this is a bug in the cost model and should be fixed. There will be some cases where we won't perform interleaving but could do runtime unrolling, but it is unlikely to give performance benefits (at least on recentish OOO CPUs). If there are cases where this causes regressions, I'll take a look!

Rebase, I am planning on landing this soon now :)

Herald added a subscriber: nemanjai. Jan 6 2023, 12:39 AM

Harbormaster completed remote builds in B206047: Diff 486765. Jan 6 2023, 1:33 AM

Closed by commit rG68469a80cb74: [LV] Disable runtime unrolling for vectorized loops. (authored by fhahn). Jan 6 2023, 2:56 AM

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rG68469a80cb74: [LV] Disable runtime unrolling for vectorized loops.

Allen added a subscriber: Allen. Apr 13 2023, 10:24 PM

Herald added a subscriber: StephenFan. Apr 13 2023, 10:24 PM

alex-t mentioned this in D149281: Don't disable loop unroll for vectorized loops on AMDGPU target. Apr 26 2023, 12:10 PM

Alexander Timofeev <alexander.timofeev@amd.com> mentioned this in rGbad4de1ae7fa: Don't disable loop unroll for vectorized loops on AMDGPU target. May 25 2023, 1:55 PM

Revision Contents

Path                                                        Size
llvm/
  lib/Transforms/Vectorize/
    LoopVectorize.cpp                                       4 lines
  test/Transforms/
    LoopVectorize/
      ARM/
        pointer_iv.ll                                       6 lines
        tail-folding-loop-hint.ll                           6 lines
      PowerPC/
        optimal-epilog-vectorization.ll                     8 lines
      X86/
        already-vectorized.ll                               5 lines
        float-induction-x86.ll                              341 lines
        gather_scatter.ll                                   4 lines
        invariant-load-gather.ll                            6 lines
        invariant-store-vectorization.ll                    6 lines
        masked_load_store.ll                                86 lines
        metadata-enable.ll                                  4 lines
        tail_loop_folding.ll                                2 lines
        uniform_mem_op.ll                                   2 lines
      followup.ll                                           3 lines
      if-pred-non-void.ll                                   96 lines
      induction.ll                                          10 lines
      interleaved-accesses.ll                               2 lines
      invariant-store-vectorization-2.ll                    18 lines
      invariant-store-vectorization.ll                      32 lines
      memdep-fold-tail.ll                                   4 lines
      optsize.ll                                            2 lines
      pointer-select-runtime-checks.ll                      42 lines
      reduction-with-invariant-store.ll                     16 lines
      runtime-check.ll                                      48 lines
      vectorize-once.ll                                     4 lines
    PhaseOrdering/
      AArch64/
        hoisting-sinking-required-for-vectorization.ll      14 lines
      X86/
        excessive-unrolling.ll                              182 lines
        vdiv.ll                                             191 lines

Diff 486789

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 7,566 Lines • ▼ Show 20 Lines	VPlan &LoopVectorizationPlanner::getBestPlanFor(ElementCount VF) const {

for (const VPlanPtr &Plan : VPlans) {		for (const VPlanPtr &Plan : VPlans) {
if (Plan->hasVF(VF))		if (Plan->hasVF(VF))
return *Plan.get();		return *Plan.get();
}		}
llvm_unreachable("No plan found!");		llvm_unreachable("No plan found!");
}		}

static void AddRuntimeUnrollDisableMetaData(Loop *L) {		static void AddRuntimeUnrollDisableMetaData(Loop *L) {
		nikicUnsubmitted Not Done Reply Inline Actions Should probably drop Runtime from the name? nikic: Should probably drop Runtime from the name?
SmallVector<Metadata *, 4> MDs;		SmallVector<Metadata *, 4> MDs;
// Reserve first location for self reference to the LoopID metadata node.		// Reserve first location for self reference to the LoopID metadata node.
MDs.push_back(nullptr);		MDs.push_back(nullptr);
bool IsUnrollMetadata = false;		bool IsUnrollMetadata = false;
MDNode *LoopID = L->getLoopID();		MDNode *LoopID = L->getLoopID();
if (LoopID) {		if (LoopID) {
// First find existing loop unrolling disable metadata.		// First find existing loop unrolling disable metadata.
for (unsigned i = 1, ie = LoopID->getNumOperands(); i < ie; ++i) {		for (unsigned i = 1, ie = LoopID->getNumOperands(); i < ie; ++i) {
[109 lines not shown]	else {
// Keep all loop hints from the original loop on the vector loop (we'll		// Keep all loop hints from the original loop on the vector loop (we'll
// replace the vectorizer-specific hints below).		// replace the vectorizer-specific hints below).
if (MDNode *LID = OrigLoop->getLoopID())		if (MDNode *LID = OrigLoop->getLoopID())
L->setLoopID(LID);		L->setLoopID(LID);

LoopVectorizeHints Hints(L, true, *ORE);		LoopVectorizeHints Hints(L, true, *ORE);
Hints.setAlreadyVectorized();		Hints.setAlreadyVectorized();
}		}
// Disable runtime unrolling when vectorizing the epilogue loop.
if (CanonicalIVStartValue)
AddRuntimeUnrollDisableMetaData(L);		AddRuntimeUnrollDisableMetaData(L);
		lebedev.ri: Looks like this is not runtime-specific?

// 3. Fix the vectorized code: take care of header phi's, live-outs,		// 3. Fix the vectorized code: take care of header phi's, live-outs,
// predication, updating analyses.		// predication, updating analyses.
ILV.fixVectorizedLoop(State, BestVPlan);		ILV.fixVectorizedLoop(State, BestVPlan);

ILV.printDebugTracesAtEnd();		ILV.printDebugTracesAtEnd();
}		}

[2,846 lines not shown]	#endif /* NDEBUG */

std::optional<MDNode *> RemainderLoopID =		std::optional<MDNode *> RemainderLoopID =
makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,		makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,
LLVMLoopVectorizeFollowupEpilogue});		LLVMLoopVectorizeFollowupEpilogue});
if (RemainderLoopID) {		if (RemainderLoopID) {
L->setLoopID(*RemainderLoopID);		L->setLoopID(*RemainderLoopID);
} else {		} else {
if (DisableRuntimeUnroll)		if (DisableRuntimeUnroll)
AddRuntimeUnrollDisableMetaData(L);		AddRuntimeUnrollDisableMetaData(L);
		nikic: Broken indent?

// Mark the loop as already vectorized to avoid vectorizing again.		// Mark the loop as already vectorized to avoid vectorizing again.
Hints.setAlreadyVectorized();		Hints.setAlreadyVectorized();
}		}

assert(!verifyFunction(*L->getHeader()->getParent(), &dbgs()));		assert(!verifyFunction(*L->getHeader()->getParent(), &dbgs()));
return true;		return true;
}		}
[123 lines not shown]
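
Most of the body of AddRuntimeUnrollDisableMetaData is hidden in this view. As a rough, standalone sketch of the visible logic (scan the existing loop ID operands for an unroll-disable entry, then attach llvm.loop.unroll.runtime.disable), the following models a loop ID as a plain list of strings. LoopProps and addRuntimeUnrollDisable are hypothetical names, and the check used for the hidden part (a prefix match on "llvm.loop.unroll.disable") is an assumption, not a quote from LoopVectorize.cpp.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a loop ID: only the property names attached to
// a loop. The real code operates on MDNode operands instead.
using LoopProps = std::vector<std::string>;

// Returns a copy of Props with "llvm.loop.unroll.runtime.disable" appended,
// unless Props already holds an entry that starts with
// "llvm.loop.unroll.disable" (assumed check, see the note above).
LoopProps addRuntimeUnrollDisable(const LoopProps &Props) {
  const std::string DisablePrefix = "llvm.loop.unroll.disable";
  bool HasUnrollDisable = false;
  // First look for existing loop unrolling disable metadata.
  for (const std::string &P : Props)
    if (P.compare(0, DisablePrefix.size(), DisablePrefix) == 0)
      HasUnrollDisable = true;

  LoopProps Result = Props;
  // Only add the runtime-unroll-disable entry if unrolling is not already
  // disabled outright.
  if (!HasUnrollDisable)
    Result.push_back("llvm.loop.unroll.runtime.disable");
  return Result;
}
```

For an input holding only "llvm.loop.isvectorized", the sketch yields that entry followed by "llvm.loop.unroll.runtime.disable", which is the operand order the updated CHECK lines in the tests expect for the vector loop.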

llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll

	[56 lines not shown]
	; CHECK-NEXT: [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[A]], i32 [[TMP0]]			; CHECK-NEXT: [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[A]], i32 [[TMP0]]
	; CHECK-NEXT: [[TMP1:%.*]] = shl i32 [[INDEX]], 2			; CHECK-NEXT: [[TMP1:%.*]] = shl i32 [[INDEX]], 2
	; CHECK-NEXT: [[NEXT_GEP4:%.*]] = getelementptr i8, ptr [[B]], i32 [[TMP1]]			; CHECK-NEXT: [[NEXT_GEP4:%.*]] = getelementptr i8, ptr [[B]], i32 [[TMP1]]
	; CHECK-NEXT: [[WIDE_VEC:%.*]] = load <8 x i32>, ptr [[NEXT_GEP]], align 4			; CHECK-NEXT: [[WIDE_VEC:%.*]] = load <8 x i32>, ptr [[NEXT_GEP]], align 4
	; CHECK-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>			; CHECK-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <8 x i32> [[WIDE_VEC]], <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
	; CHECK-NEXT: [[TMP2:%.*]] = add nsw <4 x i32> [[STRIDED_VEC]], [[BROADCAST_SPLAT]]			; CHECK-NEXT: [[TMP2:%.*]] = add nsw <4 x i32> [[STRIDED_VEC]], [[BROADCAST_SPLAT]]
	; CHECK-NEXT: store <4 x i32> [[TMP2]], ptr [[NEXT_GEP4]], align 4			; CHECK-NEXT: store <4 x i32> [[TMP2]], ptr [[NEXT_GEP4]], align 4
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
	; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i32 [[INDEX_NEXT]], 996			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i32 [[INDEX_NEXT]], 996
	; CHECK-NEXT: br i1 [[TMP3]], label [[FOR_BODY:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP4]], label [[FOR_BODY:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[A_ADDR_09:%.*]] = phi ptr [ [[ADD_PTR:%.*]], [[FOR_BODY]] ], [ [[IND_END]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[A_ADDR_09:%.*]] = phi ptr [ [[ADD_PTR:%.*]], [[FOR_BODY]] ], [ [[IND_END]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[I_08:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_BODY]] ], [ 996, [[VECTOR_BODY]] ]			; CHECK-NEXT: [[I_08:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_BODY]] ], [ 996, [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[B_ADDR_07:%.*]] = phi ptr [ [[INCDEC_PTR:%.*]], [[FOR_BODY]] ], [ [[IND_END2]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[B_ADDR_07:%.*]] = phi ptr [ [[INCDEC_PTR:%.*]], [[FOR_BODY]] ], [ [[IND_END2]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[A_ADDR_09]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[A_ADDR_09]], align 4
	; CHECK-NEXT: [[ADD_PTR]] = getelementptr inbounds i32, ptr [[A_ADDR_09]], i32 2			; CHECK-NEXT: [[ADD_PTR]] = getelementptr inbounds i32, ptr [[A_ADDR_09]], i32 2
	; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP4]], [[Y]]			; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP4]], [[Y]]
	; CHECK-NEXT: store i32 [[ADD]], ptr [[B_ADDR_07]], align 4			; CHECK-NEXT: store i32 [[ADD]], ptr [[B_ADDR_07]], align 4
	; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i32, ptr [[B_ADDR_07]], i32 1			; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i32, ptr [[B_ADDR_07]], i32 1
	; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_08]], 1			; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_08]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[INC]], 1000			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[INC]], 1000
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[END:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[END:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
	; CHECK: end:			; CHECK: end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.body			br label %for.body
	for.body:			for.body:
	%A.addr.09 = phi ptr [ %add.ptr, %for.body ], [ %A, %entry ]			%A.addr.09 = phi ptr [ %add.ptr, %for.body ], [ %A, %entry ]
	%i.08 = phi i32 [ %inc, %for.body ], [ 0, %entry ]			%i.08 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
	[891 lines not shown]

llvm/test/Transforms/LoopVectorize/ARM/tail-folding-loop-hint.ll

[68 lines not shown]	for.body:
%add = add nsw i32 %1, %0		%add = add nsw i32 %1, %0
%arrayidx4 = getelementptr inbounds i32, ptr %A, i64 %indvars.iv		%arrayidx4 = getelementptr inbounds i32, ptr %A, i64 %indvars.iv
store i32 %add, ptr %arrayidx4, align 4		store i32 %add, ptr %arrayidx4, align 4
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 430		%exitcond = icmp eq i64 %indvars.iv.next, 430
br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !6		br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !6
}		}

; CHECK: [[VEC_LOOP1]] = distinct !{[[VEC_LOOP1]], [[MD_IS_VEC:![0-9]+]]}		; CHECK: [[VEC_LOOP1]] = distinct !{[[VEC_LOOP1]], [[MD_IS_VEC:![0-9]+]], [[MD_RT_UNROLL_DIS:![0-9]+]]}
; CHECK-NEXT: [[MD_IS_VEC:![0-9]+]] = !{!"llvm.loop.isvectorized", i32 1}		; CHECK-NEXT: [[MD_IS_VEC:![0-9]+]] = !{!"llvm.loop.isvectorized", i32 1}
; CHECK-NEXT: [[SCALAR_LOOP1]] = distinct !{[[SCALAR_LOOP1]], [[MD_RT_UNROLL_DIS:![0-9]+]], [[MD_IS_VEC]]}
; CHECK-NEXT: [[MD_RT_UNROLL_DIS]] = !{!"llvm.loop.unroll.runtime.disable"}		; CHECK-NEXT: [[MD_RT_UNROLL_DIS]] = !{!"llvm.loop.unroll.runtime.disable"}
; CHECK-NEXT: [[VEC_LOOP2]] = distinct !{[[VEC_LOOP2]], [[MD_IS_VEC]]}		; CHECK-NEXT: [[SCALAR_LOOP1]] = distinct !{[[SCALAR_LOOP1]], [[MD_RT_UNROLL_DIS]], [[MD_IS_VEC]]}
		; CHECK-NEXT: [[VEC_LOOP2]] = distinct !{[[VEC_LOOP2]], [[MD_IS_VEC]], [[MD_RT_UNROLL_DIS]]}
; CHECK-NEXT: [[SCALAR_LOOP2]] = distinct !{[[SCALAR_LOOP2]], [[MD_RT_UNROLL_DIS]], [[MD_IS_VEC]]}		; CHECK-NEXT: [[SCALAR_LOOP2]] = distinct !{[[SCALAR_LOOP2]], [[MD_RT_UNROLL_DIS]], [[MD_IS_VEC]]}

!6 = distinct !{!6, !7, !8}		!6 = distinct !{!6, !7, !8}
!7 = !{!"llvm.loop.vectorize.predicate.enable", i1 true}		!7 = !{!"llvm.loop.vectorize.predicate.enable", i1 true}
!8 = !{!"llvm.loop.vectorize.enable", i1 true}		!8 = !{!"llvm.loop.vectorize.enable", i1 true}

llvm/test/Transforms/LoopVectorize/PowerPC/optimal-epilog-vectorization.ll

	[997 lines not shown]

	for.end.loopexit: ; preds = %for.body			for.end.loopexit: ; preds = %for.body
	br label %for.end			br label %for.end

	for.end: ; preds = %for.end.loopexit, %entry			for.end: ; preds = %for.end.loopexit, %entry
	ret i32 0			ret i32 0
	}			}

	; VF-TWO-CHECK-DAG: [[LOOPID_MV]] = distinct !{[[LOOPID_MV]], [[LOOPID_DISABLE_VECT:!.*]]}			; VF-TWO-CHECK-DAG: [[LOOPID_MV]] = distinct !{[[LOOPID_MV]], [[LOOPID_DISABLE_VECT:!.*]], [[LOOPID_DISABLE_UNROLL:!.*]]}
	; VF-TWO-CHECK-DAG: [[LOOPID_EV]] = distinct !{[[LOOPID_EV]], [[LOOPID_DISABLE_VECT]], [[LOOPID_DISABLE_UNROLL:!.*]]}			; VF-TWO-CHECK-DAG: [[LOOPID_EV]] = distinct !{[[LOOPID_EV]], [[LOOPID_DISABLE_VECT]], [[LOOPID_DISABLE_UNROLL]]}
	; VF-TWO-CHECK-DAG: [[LOOPID_DISABLE_VECT]] = [[DISABLE_VECT_STR:!{!"llvm.loop.isvectorized".*}.*]]			; VF-TWO-CHECK-DAG: [[LOOPID_DISABLE_VECT]] = [[DISABLE_VECT_STR:!{!"llvm.loop.isvectorized".*}.*]]
	; VF-TWO-CHECK-DAG: [[LOOPID_DISABLE_UNROLL]] = [[DISABLE_UNROLL_STR:!{!"llvm.loop.unroll.runtime.disable"}.*]]			; VF-TWO-CHECK-DAG: [[LOOPID_DISABLE_UNROLL]] = [[DISABLE_UNROLL_STR:!{!"llvm.loop.unroll.runtime.disable"}.*]]
	;			;
	; VF-FOUR-CHECK-DAG: [[LOOPID_MV_CM]] = distinct !{[[LOOPID_MV_CM]], [[LOOPID_DISABLE_VECT_CM:!.*]]}			; VF-FOUR-CHECK-DAG: [[LOOPID_MV_CM]] = distinct !{[[LOOPID_MV_CM]], [[LOOPID_DISABLE_VECT_CM:!.*]], [[LOOPID_DISABLE_UNROLL_CM:!.*]]}
	; VF-FOUR-CHECK-DAG: [[LOOPID_EV_CM]] = distinct !{[[LOOPID_EV_CM]], [[LOOPID_DISABLE_VECT_CM]], [[LOOPID_DISABLE_UNROLL_CM:!.*]]}			; VF-FOUR-CHECK-DAG: [[LOOPID_EV_CM]] = distinct !{[[LOOPID_EV_CM]], [[LOOPID_DISABLE_VECT_CM]], [[LOOPID_DISABLE_UNROLL_CM]]}
	; VF-FOUR-CHECK-DAG: [[LOOPID_DISABLE_VECT_CM]] = [[DISABLE_VECT_STR_CM:!{!"llvm.loop.isvectorized".*}.*]]			; VF-FOUR-CHECK-DAG: [[LOOPID_DISABLE_VECT_CM]] = [[DISABLE_VECT_STR_CM:!{!"llvm.loop.isvectorized".*}.*]]
	; VF-FOUR-CHECK-DAG: [[LOOPID_DISABLE_UNROLL_CM]] = [[DISABLE_UNROLL_STR_CM:!{!"llvm.loop.unroll.runtime.disable"}.*]]			; VF-FOUR-CHECK-DAG: [[LOOPID_DISABLE_UNROLL_CM]] = [[DISABLE_UNROLL_STR_CM:!{!"llvm.loop.unroll.runtime.disable"}.*]]
	;			;
	attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="true" "no-jump-tables"="false" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="ppc64le" "target-features"="+altivec,+bpermd,+crypto,+direct-move,+extdiv,+htm,+power8-vector,+vsx,-power9-vector,-spe" "unsafe-fp-math"="true" "use-soft-float"="false" }			attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="true" "no-jump-tables"="false" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="ppc64le" "target-features"="+altivec,+bpermd,+crypto,+direct-move,+extdiv,+htm,+power8-vector,+vsx,-power9-vector,-spe" "unsafe-fp-math"="true" "use-soft-float"="false" }

llvm/test/Transforms/LoopVectorize/X86/already-vectorized.ll

	[35 lines not shown]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK: br {{.*}} label %for.body{{.*}}, !llvm.loop [[scalar:![0-9]+]]			; CHECK: br {{.*}} label %for.body{{.*}}, !llvm.loop [[scalar:![0-9]+]]

	for.end: ; preds = %for.body			for.end: ; preds = %for.body
	ret i32 %add			ret i32 %add
	}			}

	; Now, we check for the Hint metadata			; Now, we check for the Hint metadata
	; CHECK: [[vect]] = distinct !{[[vect]], [[width:![0-9]+]]}			; CHECK: [[vect]] = distinct !{[[vect]], [[width:![0-9]+]], [[runtime_unroll:![0-9]+]]}
	; CHECK: [[width]] = !{!"llvm.loop.isvectorized", i32 1}			; CHECK: [[width]] = !{!"llvm.loop.isvectorized", i32 1}
	; CHECK: [[scalar]] = distinct !{[[scalar]], [[runtime_unroll:![0-9]+]], [[width]]}
	; CHECK: [[runtime_unroll]] = !{!"llvm.loop.unroll.runtime.disable"}			; CHECK: [[runtime_unroll]] = !{!"llvm.loop.unroll.runtime.disable"}
				; CHECK: [[scalar]] = distinct !{[[scalar]], [[runtime_unroll]], [[width]]}

llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll

	[19 lines not shown]
	; AUTO_VEC-NEXT: [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0			; AUTO_VEC-NEXT: [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
	; AUTO_VEC-NEXT: br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]			; AUTO_VEC-NEXT: br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
	; AUTO_VEC: for.body.preheader:			; AUTO_VEC: for.body.preheader:
	; AUTO_VEC-NEXT: [[ZEXT:%.*]] = zext i32 [[N]] to i64			; AUTO_VEC-NEXT: [[ZEXT:%.*]] = zext i32 [[N]] to i64
	; AUTO_VEC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 32			; AUTO_VEC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 32
	; AUTO_VEC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]			; AUTO_VEC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
	; AUTO_VEC: vector.ph:			; AUTO_VEC: vector.ph:
	; AUTO_VEC-NEXT: [[N_VEC:%.*]] = and i64 [[ZEXT]], 4294967264			; AUTO_VEC-NEXT: [[N_VEC:%.*]] = and i64 [[ZEXT]], 4294967264
	; AUTO_VEC-NEXT: [[CAST_VTC:%.*]] = sitofp i64 [[N_VEC]] to float			; AUTO_VEC-NEXT: [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
	; AUTO_VEC-NEXT: [[TMP0:%.*]] = fmul fast float [[CAST_VTC]], 5.000000e-01			; AUTO_VEC-NEXT: [[TMP0:%.*]] = fmul fast float [[DOTCAST]], 5.000000e-01
	; AUTO_VEC-NEXT: [[IND_END:%.*]] = fadd fast float [[TMP0]], 1.000000e+00			; AUTO_VEC-NEXT: [[IND_END:%.*]] = fadd fast float [[TMP0]], 1.000000e+00
	; AUTO_VEC-NEXT: [[TMP1:%.*]] = add nsw i64 [[ZEXT]], -32
	; AUTO_VEC-NEXT: [[TMP2:%.*]] = lshr i64 [[TMP1]], 5
	; AUTO_VEC-NEXT: [[TMP3:%.*]] = add nuw nsw i64 [[TMP2]], 1
	; AUTO_VEC-NEXT: [[XTRAITER:%.*]] = and i64 [[TMP3]], 3
	; AUTO_VEC-NEXT: [[TMP4:%.*]] = icmp ult i64 [[TMP1]], 96
	; AUTO_VEC-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK_UNR_LCSSA:%.*]], label [[VECTOR_PH_NEW:%.*]]
	; AUTO_VEC: vector.ph.new:
	; AUTO_VEC-NEXT: [[UNROLL_ITER:%.*]] = and i64 [[TMP3]], -4
	; AUTO_VEC-NEXT: br label [[VECTOR_BODY:%.*]]			; AUTO_VEC-NEXT: br label [[VECTOR_BODY:%.*]]
	; AUTO_VEC: vector.body:			; AUTO_VEC: vector.body:
	; AUTO_VEC-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_3:%.*]], [[VECTOR_BODY]] ]			; AUTO_VEC-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[VEC_IND:%.*]] = phi <8 x float> [ <float 1.000000e+00, float 1.500000e+00, float 2.000000e+00, float 2.500000e+00, float 3.000000e+00, float 3.500000e+00, float 4.000000e+00, float 4.500000e+00>, [[VECTOR_PH_NEW]] ], [ [[VEC_IND_NEXT_3:%.*]], [[VECTOR_BODY]] ]			; AUTO_VEC-NEXT: [[VEC_IND:%.*]] = phi <8 x float> [ <float 1.000000e+00, float 1.500000e+00, float 2.000000e+00, float 2.500000e+00, float 3.000000e+00, float 3.500000e+00, float 4.000000e+00, float 4.500000e+00>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[NITER:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[NITER_NEXT_3:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[STEP_ADD:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00>			; AUTO_VEC-NEXT: [[STEP_ADD:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00>
	; AUTO_VEC-NEXT: [[STEP_ADD2:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00>			; AUTO_VEC-NEXT: [[STEP_ADD2:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00>
	; AUTO_VEC-NEXT: [[STEP_ADD3:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01>			; AUTO_VEC-NEXT: [[STEP_ADD3:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01>
	; AUTO_VEC-NEXT: [[TMP5:%.*]] = getelementptr inbounds float, ptr [[A:%.*]], i64 [[INDEX]]			; AUTO_VEC-NEXT: [[TMP1:%.*]] = getelementptr inbounds float, ptr [[A:%.*]], i64 [[INDEX]]
	; AUTO_VEC-NEXT: store <8 x float> [[VEC_IND]], ptr [[TMP5]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[VEC_IND]], ptr [[TMP1]], align 4
	; AUTO_VEC-NEXT: [[TMP7:%.*]] = getelementptr inbounds float, ptr [[TMP5]], i64 8			; AUTO_VEC-NEXT: [[TMP2:%.*]] = getelementptr inbounds float, ptr [[TMP1]], i64 8
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD]], ptr [[TMP7]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD]], ptr [[TMP2]], align 4
	; AUTO_VEC-NEXT: [[TMP9:%.*]] = getelementptr inbounds float, ptr [[TMP5]], i64 16			; AUTO_VEC-NEXT: [[TMP3:%.*]] = getelementptr inbounds float, ptr [[TMP1]], i64 16
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD2]], ptr [[TMP9]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD2]], ptr [[TMP3]], align 4
	; AUTO_VEC-NEXT: [[TMP11:%.*]] = getelementptr inbounds float, ptr [[TMP5]], i64 24			; AUTO_VEC-NEXT: [[TMP4:%.*]] = getelementptr inbounds float, ptr [[TMP1]], i64 24
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD3]], ptr [[TMP11]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD3]], ptr [[TMP4]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 32			; AUTO_VEC-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01>			; AUTO_VEC-NEXT: [[VEC_IND_NEXT]] = fadd fast <8 x float> [[VEC_IND]], <float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD_1:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 2.000000e+01, float 2.000000e+01, float 2.000000e+01, float 2.000000e+01, float 2.000000e+01, float 2.000000e+01, float 2.000000e+01, float 2.000000e+01>			; AUTO_VEC-NEXT: [[TMP5:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AUTO_VEC-NEXT: [[STEP_ADD2_1:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 2.400000e+01, float 2.400000e+01, float 2.400000e+01, float 2.400000e+01, float 2.400000e+01, float 2.400000e+01, float 2.400000e+01, float 2.400000e+01>			; AUTO_VEC-NEXT: br i1 [[TMP5]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; AUTO_VEC-NEXT: [[STEP_ADD3_1:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 2.800000e+01, float 2.800000e+01, float 2.800000e+01, float 2.800000e+01, float 2.800000e+01, float 2.800000e+01, float 2.800000e+01, float 2.800000e+01>
	; AUTO_VEC-NEXT: [[TMP13:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDEX_NEXT]]
	; AUTO_VEC-NEXT: store <8 x float> [[VEC_IND_NEXT]], ptr [[TMP13]], align 4
	; AUTO_VEC-NEXT: [[TMP15:%.*]] = getelementptr inbounds float, ptr [[TMP13]], i64 8
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD_1]], ptr [[TMP15]], align 4
	; AUTO_VEC-NEXT: [[TMP17:%.*]] = getelementptr inbounds float, ptr [[TMP13]], i64 16
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD2_1]], ptr [[TMP17]], align 4
	; AUTO_VEC-NEXT: [[TMP19:%.*]] = getelementptr inbounds float, ptr [[TMP13]], i64 24
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD3_1]], ptr [[TMP19]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT_1:%.*]] = or i64 [[INDEX]], 64
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_1:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 3.200000e+01, float 3.200000e+01, float 3.200000e+01, float 3.200000e+01, float 3.200000e+01, float 3.200000e+01, float 3.200000e+01, float 3.200000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD_2:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 3.600000e+01, float 3.600000e+01, float 3.600000e+01, float 3.600000e+01, float 3.600000e+01, float 3.600000e+01, float 3.600000e+01, float 3.600000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD2_2:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 4.000000e+01, float 4.000000e+01, float 4.000000e+01, float 4.000000e+01, float 4.000000e+01, float 4.000000e+01, float 4.000000e+01, float 4.000000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD3_2:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 4.400000e+01, float 4.400000e+01, float 4.400000e+01, float 4.400000e+01, float 4.400000e+01, float 4.400000e+01, float 4.400000e+01, float 4.400000e+01>
	; AUTO_VEC-NEXT: [[TMP21:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDEX_NEXT_1]]
	; AUTO_VEC-NEXT: store <8 x float> [[VEC_IND_NEXT_1]], ptr [[TMP21]], align 4
	; AUTO_VEC-NEXT: [[TMP23:%.*]] = getelementptr inbounds float, ptr [[TMP21]], i64 8
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD_2]], ptr [[TMP23]], align 4
	; AUTO_VEC-NEXT: [[TMP25:%.*]] = getelementptr inbounds float, ptr [[TMP21]], i64 16
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD2_2]], ptr [[TMP25]], align 4
	; AUTO_VEC-NEXT: [[TMP27:%.*]] = getelementptr inbounds float, ptr [[TMP21]], i64 24
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD3_2]], ptr [[TMP27]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT_2:%.*]] = or i64 [[INDEX]], 96
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_2:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 4.800000e+01, float 4.800000e+01, float 4.800000e+01, float 4.800000e+01, float 4.800000e+01, float 4.800000e+01, float 4.800000e+01, float 4.800000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD_3:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 5.200000e+01, float 5.200000e+01, float 5.200000e+01, float 5.200000e+01, float 5.200000e+01, float 5.200000e+01, float 5.200000e+01, float 5.200000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD2_3:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 5.600000e+01, float 5.600000e+01, float 5.600000e+01, float 5.600000e+01, float 5.600000e+01, float 5.600000e+01, float 5.600000e+01, float 5.600000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD3_3:%.*]] = fadd fast <8 x float> [[VEC_IND]], <float 6.000000e+01, float 6.000000e+01, float 6.000000e+01, float 6.000000e+01, float 6.000000e+01, float 6.000000e+01, float 6.000000e+01, float 6.000000e+01>
	; AUTO_VEC-NEXT: [[TMP29:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDEX_NEXT_2]]
	; AUTO_VEC-NEXT: store <8 x float> [[VEC_IND_NEXT_2]], ptr [[TMP29]], align 4
	; AUTO_VEC-NEXT: [[TMP31:%.*]] = getelementptr inbounds float, ptr [[TMP29]], i64 8
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD_3]], ptr [[TMP31]], align 4
	; AUTO_VEC-NEXT: [[TMP33:%.*]] = getelementptr inbounds float, ptr [[TMP29]], i64 16
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD2_3]], ptr [[TMP33]], align 4
	; AUTO_VEC-NEXT: [[TMP35:%.*]] = getelementptr inbounds float, ptr [[TMP29]], i64 24
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD3_3]], ptr [[TMP35]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT_3]] = add nuw i64 [[INDEX]], 128
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_3]] = fadd fast <8 x float> [[VEC_IND]], <float 6.400000e+01, float 6.400000e+01, float 6.400000e+01, float 6.400000e+01, float 6.400000e+01, float 6.400000e+01, float 6.400000e+01, float 6.400000e+01>
	; AUTO_VEC-NEXT: [[NITER_NEXT_3]] = add i64 [[NITER]], 4
	; AUTO_VEC-NEXT: [[NITER_NCMP_3:%.*]] = icmp eq i64 [[NITER_NEXT_3]], [[UNROLL_ITER]]
	; AUTO_VEC-NEXT: br i1 [[NITER_NCMP_3]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; AUTO_VEC: middle.block.unr-lcssa:
	; AUTO_VEC-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_3]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[VEC_IND_UNR:%.*]] = phi <8 x float> [ <float 1.000000e+00, float 1.500000e+00, float 2.000000e+00, float 2.500000e+00, float 3.000000e+00, float 3.500000e+00, float 4.000000e+00, float 4.500000e+00>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT_3]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; AUTO_VEC-NEXT: br i1 [[LCMP_MOD_NOT]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL:%.*]]
	; AUTO_VEC: vector.body.epil:
	; AUTO_VEC-NEXT: [[INDEX_EPIL:%.*]] = phi i64 [ [[INDEX_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ], [ [[INDEX_UNR]], [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[VEC_IND_EPIL:%.*]] = phi <8 x float> [ [[VEC_IND_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ], [ [[VEC_IND_UNR]], [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[VECTOR_BODY_EPIL]] ], [ 0, [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[STEP_ADD_EPIL:%.*]] = fadd fast <8 x float> [[VEC_IND_EPIL]], <float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00, float 4.000000e+00>
	; AUTO_VEC-NEXT: [[STEP_ADD2_EPIL:%.*]] = fadd fast <8 x float> [[VEC_IND_EPIL]], <float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00, float 8.000000e+00>
	; AUTO_VEC-NEXT: [[STEP_ADD3_EPIL:%.*]] = fadd fast <8 x float> [[VEC_IND_EPIL]], <float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01, float 1.200000e+01>
	; AUTO_VEC-NEXT: [[TMP37:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDEX_EPIL]]
	; AUTO_VEC-NEXT: store <8 x float> [[VEC_IND_EPIL]], ptr [[TMP37]], align 4
	; AUTO_VEC-NEXT: [[TMP39:%.*]] = getelementptr inbounds float, ptr [[TMP37]], i64 8
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD_EPIL]], ptr [[TMP39]], align 4
	; AUTO_VEC-NEXT: [[TMP41:%.*]] = getelementptr inbounds float, ptr [[TMP37]], i64 16
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD2_EPIL]], ptr [[TMP41]], align 4
	; AUTO_VEC-NEXT: [[TMP43:%.*]] = getelementptr inbounds float, ptr [[TMP37]], i64 24
	; AUTO_VEC-NEXT: store <8 x float> [[STEP_ADD3_EPIL]], ptr [[TMP43]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT_EPIL]] = add nuw i64 [[INDEX_EPIL]], 32
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_EPIL]] = fadd fast <8 x float> [[VEC_IND_EPIL]], <float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01, float 1.600000e+01>
	; AUTO_VEC-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1
	; AUTO_VEC-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]
	; AUTO_VEC-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], !llvm.loop [[LOOP2:![0-9]+]]
	; AUTO_VEC: middle.block:			; AUTO_VEC: middle.block:
	; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[ZEXT]]			; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[ZEXT]]
	; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY]]			; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY]]
	; AUTO_VEC: for.body:			; AUTO_VEC: for.body:
	; AUTO_VEC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]			; AUTO_VEC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
	; AUTO_VEC-NEXT: [[X_06:%.*]] = phi float [ [[CONV1:%.*]], [[FOR_BODY]] ], [ 1.000000e+00, [[FOR_BODY_PREHEADER]] ], [ [[IND_END]], [[MIDDLE_BLOCK]] ]			; AUTO_VEC-NEXT: [[X_06:%.*]] = phi float [ [[CONV1:%.*]], [[FOR_BODY]] ], [ 1.000000e+00, [[FOR_BODY_PREHEADER]] ], [ [[IND_END]], [[MIDDLE_BLOCK]] ]
	; AUTO_VEC-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV]]			; AUTO_VEC-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV]]
	; AUTO_VEC-NEXT: store float [[X_06]], ptr [[ARRAYIDX]], align 4			; AUTO_VEC-NEXT: store float [[X_06]], ptr [[ARRAYIDX]], align 4
	; AUTO_VEC-NEXT: [[CONV1]] = fadd fast float [[X_06]], 5.000000e-01			; AUTO_VEC-NEXT: [[CONV1]] = fadd fast float [[X_06]], 5.000000e-01
	; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AUTO_VEC-NEXT: [[TMP45:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[ZEXT]]			; AUTO_VEC-NEXT: [[TMP6:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[ZEXT]]
	; AUTO_VEC-NEXT: br i1 [[TMP45]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]			; AUTO_VEC-NEXT: br i1 [[TMP6]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; AUTO_VEC: for.end:			; AUTO_VEC: for.end:
	; AUTO_VEC-NEXT: ret void			; AUTO_VEC-NEXT: ret void
	;			;
	entry:			entry:
	%cmp4 = icmp sgt i32 %N, 0			%cmp4 = icmp sgt i32 %N, 0
	br i1 %cmp4, label %for.body.preheader, label %for.end			br i1 %cmp4, label %for.body.preheader, label %for.end

	for.body.preheader: ; preds = %entry			for.body.preheader: ; preds = %entry
	[89 lines not shown]
	; AUTO_VEC-NEXT: [[X_06_EPIL:%.*]] = phi float [ [[CONV1_EPIL:%.*]], [[FOR_BODY_EPIL]] ], [ [[X_06_UNR]], [[FOR_END_LOOPEXIT_UNR_LCSSA]] ]			; AUTO_VEC-NEXT: [[X_06_EPIL:%.*]] = phi float [ [[CONV1_EPIL:%.*]], [[FOR_BODY_EPIL]] ], [ [[X_06_UNR]], [[FOR_END_LOOPEXIT_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[FOR_BODY_EPIL]] ], [ 0, [[FOR_END_LOOPEXIT_UNR_LCSSA]] ]			; AUTO_VEC-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[FOR_BODY_EPIL]] ], [ 0, [[FOR_END_LOOPEXIT_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[ARRAYIDX_EPIL:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV_EPIL]]			; AUTO_VEC-NEXT: [[ARRAYIDX_EPIL:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV_EPIL]]
	; AUTO_VEC-NEXT: store float [[X_06_EPIL]], ptr [[ARRAYIDX_EPIL]], align 4			; AUTO_VEC-NEXT: store float [[X_06_EPIL]], ptr [[ARRAYIDX_EPIL]], align 4
	; AUTO_VEC-NEXT: [[CONV1_EPIL]] = fadd float [[X_06_EPIL]], 5.000000e-01			; AUTO_VEC-NEXT: [[CONV1_EPIL]] = fadd float [[X_06_EPIL]], 5.000000e-01
	; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT_EPIL]] = add nuw nsw i64 [[INDVARS_IV_EPIL]], 1			; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT_EPIL]] = add nuw nsw i64 [[INDVARS_IV_EPIL]], 1
	; AUTO_VEC-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1			; AUTO_VEC-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1
	; AUTO_VEC-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]			; AUTO_VEC-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]
	; AUTO_VEC-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[FOR_END]], label [[FOR_BODY_EPIL]], !llvm.loop [[LOOP6:![0-9]+]]			; AUTO_VEC-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[FOR_END]], label [[FOR_BODY_EPIL]], !llvm.loop [[LOOP4:![0-9]+]]
	; AUTO_VEC: for.end:			; AUTO_VEC: for.end:
	; AUTO_VEC-NEXT: ret void			; AUTO_VEC-NEXT: ret void
	;			;
	entry:			entry:
	%cmp4 = icmp sgt i32 %N, 0			%cmp4 = icmp sgt i32 %N, 0
	br i1 %cmp4, label %for.body.preheader, label %for.end			br i1 %cmp4, label %for.body.preheader, label %for.end

	for.body.preheader: ; preds = %entry			for.body.preheader: ; preds = %entry
	[20 lines not shown]
	define double @external_use_with_fast_math(ptr %a, i64 %n) {			define double @external_use_with_fast_math(ptr %a, i64 %n) {
	; AUTO_VEC-LABEL: @external_use_with_fast_math(			; AUTO_VEC-LABEL: @external_use_with_fast_math(
	; AUTO_VEC-NEXT: entry:			; AUTO_VEC-NEXT: entry:
	; AUTO_VEC-NEXT: [[SMAX:%.*]] = tail call i64 @llvm.smax.i64(i64 [[N:%.*]], i64 1)			; AUTO_VEC-NEXT: [[SMAX:%.*]] = tail call i64 @llvm.smax.i64(i64 [[N:%.*]], i64 1)
	; AUTO_VEC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[SMAX]], 16			; AUTO_VEC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[SMAX]], 16
	; AUTO_VEC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]			; AUTO_VEC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
	; AUTO_VEC: vector.ph:			; AUTO_VEC: vector.ph:
	; AUTO_VEC-NEXT: [[N_VEC:%.*]] = and i64 [[SMAX]], 9223372036854775792			; AUTO_VEC-NEXT: [[N_VEC:%.*]] = and i64 [[SMAX]], 9223372036854775792
	; AUTO_VEC-NEXT: [[CAST_VTC:%.*]] = sitofp i64 [[N_VEC]] to double			; AUTO_VEC-NEXT: [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to double
	; AUTO_VEC-NEXT: [[TMP0:%.*]] = fmul fast double [[CAST_VTC]], 3.000000e+00			; AUTO_VEC-NEXT: [[TMP0:%.*]] = fmul fast double [[DOTCAST]], 3.000000e+00
	; AUTO_VEC-NEXT: [[TMP1:%.*]] = add nsw i64 [[SMAX]], -16
	; AUTO_VEC-NEXT: [[TMP2:%.*]] = lshr i64 [[TMP1]], 4
	; AUTO_VEC-NEXT: [[TMP3:%.*]] = add nuw nsw i64 [[TMP2]], 1
	; AUTO_VEC-NEXT: [[XTRAITER:%.*]] = and i64 [[TMP3]], 3
	; AUTO_VEC-NEXT: [[TMP4:%.*]] = icmp ult i64 [[TMP1]], 48
	; AUTO_VEC-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK_UNR_LCSSA:%.*]], label [[VECTOR_PH_NEW:%.*]]
	; AUTO_VEC: vector.ph.new:
	; AUTO_VEC-NEXT: [[UNROLL_ITER:%.*]] = and i64 [[TMP3]], -4
	; AUTO_VEC-NEXT: br label [[VECTOR_BODY:%.*]]			; AUTO_VEC-NEXT: br label [[VECTOR_BODY:%.*]]
	; AUTO_VEC: vector.body:			; AUTO_VEC: vector.body:
	; AUTO_VEC-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_3:%.*]], [[VECTOR_BODY]] ]			; AUTO_VEC-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[VEC_IND:%.*]] = phi <4 x double> [ <double 0.000000e+00, double 3.000000e+00, double 6.000000e+00, double 9.000000e+00>, [[VECTOR_PH_NEW]] ], [ [[VEC_IND_NEXT_3:%.*]], [[VECTOR_BODY]] ]			; AUTO_VEC-NEXT: [[VEC_IND:%.*]] = phi <4 x double> [ <double 0.000000e+00, double 3.000000e+00, double 6.000000e+00, double 9.000000e+00>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[NITER:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[NITER_NEXT_3:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[STEP_ADD:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.200000e+01, double 1.200000e+01, double 1.200000e+01, double 1.200000e+01>			; AUTO_VEC-NEXT: [[STEP_ADD:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.200000e+01, double 1.200000e+01, double 1.200000e+01, double 1.200000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD2:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 2.400000e+01, double 2.400000e+01, double 2.400000e+01, double 2.400000e+01>			; AUTO_VEC-NEXT: [[STEP_ADD2:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 2.400000e+01, double 2.400000e+01, double 2.400000e+01, double 2.400000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD3:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 3.600000e+01, double 3.600000e+01, double 3.600000e+01, double 3.600000e+01>			; AUTO_VEC-NEXT: [[STEP_ADD3:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 3.600000e+01, double 3.600000e+01, double 3.600000e+01, double 3.600000e+01>
	; AUTO_VEC-NEXT: [[TMP5:%.*]] = getelementptr double, ptr [[A:%.*]], i64 [[INDEX]]			; AUTO_VEC-NEXT: [[TMP1:%.*]] = getelementptr double, ptr [[A:%.*]], i64 [[INDEX]]
	; AUTO_VEC-NEXT: store <4 x double> [[VEC_IND]], ptr [[TMP5]], align 8			; AUTO_VEC-NEXT: store <4 x double> [[VEC_IND]], ptr [[TMP1]], align 8
	; AUTO_VEC-NEXT: [[TMP7:%.*]] = getelementptr double, ptr [[TMP5]], i64 4			; AUTO_VEC-NEXT: [[TMP2:%.*]] = getelementptr double, ptr [[TMP1]], i64 4
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD]], ptr [[TMP7]], align 8			; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD]], ptr [[TMP2]], align 8
	; AUTO_VEC-NEXT: [[TMP9:%.*]] = getelementptr double, ptr [[TMP5]], i64 8			; AUTO_VEC-NEXT: [[TMP3:%.*]] = getelementptr double, ptr [[TMP1]], i64 8
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD2]], ptr [[TMP9]], align 8			; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD2]], ptr [[TMP3]], align 8
	; AUTO_VEC-NEXT: [[TMP11:%.*]] = getelementptr double, ptr [[TMP5]], i64 12			; AUTO_VEC-NEXT: [[TMP4:%.*]] = getelementptr double, ptr [[TMP1]], i64 12
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD3]], ptr [[TMP11]], align 8			; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD3]], ptr [[TMP4]], align 8
	; AUTO_VEC-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 16			; AUTO_VEC-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 4.800000e+01, double 4.800000e+01, double 4.800000e+01, double 4.800000e+01>			; AUTO_VEC-NEXT: [[VEC_IND_NEXT]] = fadd fast <4 x double> [[VEC_IND]], <double 4.800000e+01, double 4.800000e+01, double 4.800000e+01, double 4.800000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD_1:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 6.000000e+01, double 6.000000e+01, double 6.000000e+01, double 6.000000e+01>			; AUTO_VEC-NEXT: [[TMP5:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AUTO_VEC-NEXT: [[STEP_ADD2_1:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 7.200000e+01, double 7.200000e+01, double 7.200000e+01, double 7.200000e+01>			; AUTO_VEC-NEXT: br i1 [[TMP5]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
	; AUTO_VEC-NEXT: [[STEP_ADD3_1:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 8.400000e+01, double 8.400000e+01, double 8.400000e+01, double 8.400000e+01>
	; AUTO_VEC-NEXT: [[TMP13:%.*]] = getelementptr double, ptr [[A]], i64 [[INDEX_NEXT]]
	; AUTO_VEC-NEXT: store <4 x double> [[VEC_IND_NEXT]], ptr [[TMP13]], align 8
	; AUTO_VEC-NEXT: [[TMP15:%.*]] = getelementptr double, ptr [[TMP13]], i64 4
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD_1]], ptr [[TMP15]], align 8
	; AUTO_VEC-NEXT: [[TMP17:%.*]] = getelementptr double, ptr [[TMP13]], i64 8
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD2_1]], ptr [[TMP17]], align 8
	; AUTO_VEC-NEXT: [[TMP19:%.*]] = getelementptr double, ptr [[TMP13]], i64 12
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD3_1]], ptr [[TMP19]], align 8
	; AUTO_VEC-NEXT: [[INDEX_NEXT_1:%.*]] = or i64 [[INDEX]], 32
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_1:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 9.600000e+01, double 9.600000e+01, double 9.600000e+01, double 9.600000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD_2:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.080000e+02, double 1.080000e+02, double 1.080000e+02, double 1.080000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD2_2:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.200000e+02, double 1.200000e+02, double 1.200000e+02, double 1.200000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD3_2:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.320000e+02, double 1.320000e+02, double 1.320000e+02, double 1.320000e+02>
	; AUTO_VEC-NEXT: [[TMP21:%.*]] = getelementptr double, ptr [[A]], i64 [[INDEX_NEXT_1]]
	; AUTO_VEC-NEXT: store <4 x double> [[VEC_IND_NEXT_1]], ptr [[TMP21]], align 8
	; AUTO_VEC-NEXT: [[TMP23:%.*]] = getelementptr double, ptr [[TMP21]], i64 4
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD_2]], ptr [[TMP23]], align 8
	; AUTO_VEC-NEXT: [[TMP25:%.*]] = getelementptr double, ptr [[TMP21]], i64 8
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD2_2]], ptr [[TMP25]], align 8
	; AUTO_VEC-NEXT: [[TMP27:%.*]] = getelementptr double, ptr [[TMP21]], i64 12
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD3_2]], ptr [[TMP27]], align 8
	; AUTO_VEC-NEXT: [[INDEX_NEXT_2:%.*]] = or i64 [[INDEX]], 48
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_2:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.440000e+02, double 1.440000e+02, double 1.440000e+02, double 1.440000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD_3:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.560000e+02, double 1.560000e+02, double 1.560000e+02, double 1.560000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD2_3:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.680000e+02, double 1.680000e+02, double 1.680000e+02, double 1.680000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD3_3:%.*]] = fadd fast <4 x double> [[VEC_IND]], <double 1.800000e+02, double 1.800000e+02, double 1.800000e+02, double 1.800000e+02>
	; AUTO_VEC-NEXT: [[TMP29:%.*]] = getelementptr double, ptr [[A]], i64 [[INDEX_NEXT_2]]
	; AUTO_VEC-NEXT: store <4 x double> [[VEC_IND_NEXT_2]], ptr [[TMP29]], align 8
	; AUTO_VEC-NEXT: [[TMP31:%.*]] = getelementptr double, ptr [[TMP29]], i64 4
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD_3]], ptr [[TMP31]], align 8
	; AUTO_VEC-NEXT: [[TMP33:%.*]] = getelementptr double, ptr [[TMP29]], i64 8
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD2_3]], ptr [[TMP33]], align 8
	; AUTO_VEC-NEXT: [[TMP35:%.*]] = getelementptr double, ptr [[TMP29]], i64 12
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD3_3]], ptr [[TMP35]], align 8
	; AUTO_VEC-NEXT: [[INDEX_NEXT_3]] = add nuw i64 [[INDEX]], 64
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_3]] = fadd fast <4 x double> [[VEC_IND]], <double 1.920000e+02, double 1.920000e+02, double 1.920000e+02, double 1.920000e+02>
	; AUTO_VEC-NEXT: [[NITER_NEXT_3]] = add i64 [[NITER]], 4
	; AUTO_VEC-NEXT: [[NITER_NCMP_3:%.*]] = icmp eq i64 [[NITER_NEXT_3]], [[UNROLL_ITER]]
	; AUTO_VEC-NEXT: br i1 [[NITER_NCMP_3]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
	; AUTO_VEC: middle.block.unr-lcssa:
	; AUTO_VEC-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_3]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[VEC_IND_UNR:%.*]] = phi <4 x double> [ <double 0.000000e+00, double 3.000000e+00, double 6.000000e+00, double 9.000000e+00>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT_3]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; AUTO_VEC-NEXT: br i1 [[LCMP_MOD_NOT]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL:%.*]]
	; AUTO_VEC: vector.body.epil:
	; AUTO_VEC-NEXT: [[INDEX_EPIL:%.*]] = phi i64 [ [[INDEX_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ], [ [[INDEX_UNR]], [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[VEC_IND_EPIL:%.*]] = phi <4 x double> [ [[VEC_IND_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ], [ [[VEC_IND_UNR]], [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[VECTOR_BODY_EPIL]] ], [ 0, [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[STEP_ADD_EPIL:%.*]] = fadd fast <4 x double> [[VEC_IND_EPIL]], <double 1.200000e+01, double 1.200000e+01, double 1.200000e+01, double 1.200000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD2_EPIL:%.*]] = fadd fast <4 x double> [[VEC_IND_EPIL]], <double 2.400000e+01, double 2.400000e+01, double 2.400000e+01, double 2.400000e+01>
	; AUTO_VEC-NEXT: [[STEP_ADD3_EPIL:%.*]] = fadd fast <4 x double> [[VEC_IND_EPIL]], <double 3.600000e+01, double 3.600000e+01, double 3.600000e+01, double 3.600000e+01>
	; AUTO_VEC-NEXT: [[TMP37:%.*]] = getelementptr double, ptr [[A]], i64 [[INDEX_EPIL]]
	; AUTO_VEC-NEXT: store <4 x double> [[VEC_IND_EPIL]], ptr [[TMP37]], align 8
	; AUTO_VEC-NEXT: [[TMP39:%.*]] = getelementptr double, ptr [[TMP37]], i64 4
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD_EPIL]], ptr [[TMP39]], align 8
	; AUTO_VEC-NEXT: [[TMP41:%.*]] = getelementptr double, ptr [[TMP37]], i64 8
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD2_EPIL]], ptr [[TMP41]], align 8
	; AUTO_VEC-NEXT: [[TMP43:%.*]] = getelementptr double, ptr [[TMP37]], i64 12
	; AUTO_VEC-NEXT: store <4 x double> [[STEP_ADD3_EPIL]], ptr [[TMP43]], align 8
	; AUTO_VEC-NEXT: [[INDEX_NEXT_EPIL]] = add nuw i64 [[INDEX_EPIL]], 16
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_EPIL]] = fadd fast <4 x double> [[VEC_IND_EPIL]], <double 4.800000e+01, double 4.800000e+01, double 4.800000e+01, double 4.800000e+01>
	; AUTO_VEC-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1
	; AUTO_VEC-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]
	; AUTO_VEC-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], !llvm.loop [[LOOP8:![0-9]+]]
	; AUTO_VEC: middle.block:			; AUTO_VEC: middle.block:
	; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX]], [[N_VEC]]			; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX]], [[N_VEC]]
	; AUTO_VEC-NEXT: [[TMP45:%.*]] = add nsw i64 [[N_VEC]], -1			; AUTO_VEC-NEXT: [[CMO:%.*]] = add nsw i64 [[N_VEC]], -1
	; AUTO_VEC-NEXT: [[CAST_CMO:%.*]] = sitofp i64 [[TMP45]] to double			; AUTO_VEC-NEXT: [[DOTCAST6:%.*]] = sitofp i64 [[CMO]] to double
	; AUTO_VEC-NEXT: [[TMP46:%.*]] = fmul fast double [[CAST_CMO]], 3.000000e+00			; AUTO_VEC-NEXT: [[TMP6:%.*]] = fmul fast double [[DOTCAST6]], 3.000000e+00
	; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[FOR_BODY]]			; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[FOR_BODY]]
	; AUTO_VEC: for.body:			; AUTO_VEC: for.body:
	; AUTO_VEC-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]			; AUTO_VEC-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
	; AUTO_VEC-NEXT: [[J:%.*]] = phi double [ [[J_NEXT:%.*]], [[FOR_BODY]] ], [ 0.000000e+00, [[ENTRY]] ], [ [[TMP0]], [[MIDDLE_BLOCK]] ]			; AUTO_VEC-NEXT: [[J:%.*]] = phi double [ [[J_NEXT:%.*]], [[FOR_BODY]] ], [ 0.000000e+00, [[ENTRY]] ], [ [[TMP0]], [[MIDDLE_BLOCK]] ]
	; AUTO_VEC-NEXT: [[T0:%.*]] = getelementptr double, ptr [[A]], i64 [[I]]			; AUTO_VEC-NEXT: [[T0:%.*]] = getelementptr double, ptr [[A]], i64 [[I]]
	; AUTO_VEC-NEXT: store double [[J]], ptr [[T0]], align 8			; AUTO_VEC-NEXT: store double [[J]], ptr [[T0]], align 8
	; AUTO_VEC-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; AUTO_VEC-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; AUTO_VEC-NEXT: [[J_NEXT]] = fadd fast double [[J]], 3.000000e+00			; AUTO_VEC-NEXT: [[J_NEXT]] = fadd fast double [[J]], 3.000000e+00
	; AUTO_VEC-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[I_NEXT]], [[SMAX]]			; AUTO_VEC-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[I_NEXT]], [[SMAX]]
	; AUTO_VEC-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]]			; AUTO_VEC-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
	; AUTO_VEC: for.end:			; AUTO_VEC: for.end:
	; AUTO_VEC-NEXT: [[J_LCSSA:%.*]] = phi double [ [[TMP46]], [[MIDDLE_BLOCK]] ], [ [[J]], [[FOR_BODY]] ]			; AUTO_VEC-NEXT: [[J_LCSSA:%.*]] = phi double [ [[TMP6]], [[MIDDLE_BLOCK]] ], [ [[J]], [[FOR_BODY]] ]
	; AUTO_VEC-NEXT: ret double [[J_LCSSA]]			; AUTO_VEC-NEXT: ret double [[J_LCSSA]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	%i = phi i64 [ 0, %entry ], [%i.next, %for.body]			%i = phi i64 [ 0, %entry ], [%i.next, %for.body]
	%j = phi double [ 0.0, %entry ], [ %j.next, %for.body ]			%j = phi double [ 0.0, %entry ], [ %j.next, %for.body ]
	[70 lines not shown]
	; AUTO_VEC-NEXT: [[J_EPIL:%.*]] = phi double [ [[J_NEXT_EPIL:%.*]], [[FOR_BODY_EPIL]] ], [ [[J_UNR]], [[FOR_END_UNR_LCSSA]] ]			; AUTO_VEC-NEXT: [[J_EPIL:%.*]] = phi double [ [[J_NEXT_EPIL:%.*]], [[FOR_BODY_EPIL]] ], [ [[J_UNR]], [[FOR_END_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[FOR_BODY_EPIL]] ], [ 0, [[FOR_END_UNR_LCSSA]] ]			; AUTO_VEC-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[FOR_BODY_EPIL]] ], [ 0, [[FOR_END_UNR_LCSSA]] ]
	; AUTO_VEC-NEXT: [[T0_EPIL:%.*]] = getelementptr double, ptr [[A]], i64 [[I_EPIL]]			; AUTO_VEC-NEXT: [[T0_EPIL:%.*]] = getelementptr double, ptr [[A]], i64 [[I_EPIL]]
	; AUTO_VEC-NEXT: store double [[J_EPIL]], ptr [[T0_EPIL]], align 8			; AUTO_VEC-NEXT: store double [[J_EPIL]], ptr [[T0_EPIL]], align 8
	; AUTO_VEC-NEXT: [[I_NEXT_EPIL]] = add nuw nsw i64 [[I_EPIL]], 1			; AUTO_VEC-NEXT: [[I_NEXT_EPIL]] = add nuw nsw i64 [[I_EPIL]], 1
	; AUTO_VEC-NEXT: [[J_NEXT_EPIL]] = fadd double [[J_EPIL]], 3.000000e+00			; AUTO_VEC-NEXT: [[J_NEXT_EPIL]] = fadd double [[J_EPIL]], 3.000000e+00
	; AUTO_VEC-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1			; AUTO_VEC-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1
	; AUTO_VEC-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]			; AUTO_VEC-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]
	; AUTO_VEC-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[FOR_END]], label [[FOR_BODY_EPIL]], !llvm.loop [[LOOP10:![0-9]+]]			; AUTO_VEC-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[FOR_END]], label [[FOR_BODY_EPIL]], !llvm.loop [[LOOP8:![0-9]+]]
	; AUTO_VEC: for.end:			; AUTO_VEC: for.end:
	; AUTO_VEC-NEXT: [[J_LCSSA:%.*]] = phi double [ [[J_LCSSA_PH]], [[FOR_END_UNR_LCSSA]] ], [ [[J_EPIL]], [[FOR_BODY_EPIL]] ]			; AUTO_VEC-NEXT: [[J_LCSSA:%.*]] = phi double [ [[J_LCSSA_PH]], [[FOR_END_UNR_LCSSA]] ], [ [[J_EPIL]], [[FOR_BODY_EPIL]] ]
	; AUTO_VEC-NEXT: ret double [[J_LCSSA]]			; AUTO_VEC-NEXT: ret double [[J_LCSSA]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	[25 lines not shown]
	; AUTO_VEC-NEXT: [[CMP_NOT11:%.*]] = icmp eq i32 [[N:%.*]], 0			; AUTO_VEC-NEXT: [[CMP_NOT11:%.*]] = icmp eq i32 [[N:%.*]], 0
	; AUTO_VEC-NEXT: br i1 [[CMP_NOT11]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]]			; AUTO_VEC-NEXT: br i1 [[CMP_NOT11]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]]
	; AUTO_VEC: for.body.preheader:			; AUTO_VEC: for.body.preheader:
	; AUTO_VEC-NEXT: [[TMP0:%.*]] = zext i32 [[N]] to i64			; AUTO_VEC-NEXT: [[TMP0:%.*]] = zext i32 [[N]] to i64
	; AUTO_VEC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 32			; AUTO_VEC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 32
	; AUTO_VEC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]			; AUTO_VEC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
	; AUTO_VEC: vector.ph:			; AUTO_VEC: vector.ph:
	; AUTO_VEC-NEXT: [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967264			; AUTO_VEC-NEXT: [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967264
	; AUTO_VEC-NEXT: [[CAST_VTC:%.*]] = sitofp i64 [[N_VEC]] to float			; AUTO_VEC-NEXT: [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
	; AUTO_VEC-NEXT: [[TMP1:%.*]] = fmul reassoc float [[CAST_VTC]], 4.200000e+01			; AUTO_VEC-NEXT: [[TMP1:%.*]] = fmul reassoc float [[DOTCAST]], 4.200000e+01
	; AUTO_VEC-NEXT: [[IND_END:%.*]] = fadd reassoc float [[TMP1]], 1.000000e+00			; AUTO_VEC-NEXT: [[IND_END:%.*]] = fadd reassoc float [[TMP1]], 1.000000e+00
	; AUTO_VEC-NEXT: [[TMP2:%.*]] = add nsw i64 [[TMP0]], -32
	; AUTO_VEC-NEXT: [[TMP3:%.*]] = lshr i64 [[TMP2]], 5
	; AUTO_VEC-NEXT: [[TMP4:%.*]] = add nuw nsw i64 [[TMP3]], 1
	; AUTO_VEC-NEXT: [[XTRAITER:%.*]] = and i64 [[TMP4]], 1
	; AUTO_VEC-NEXT: [[TMP5:%.*]] = icmp ult i64 [[TMP2]], 32
	; AUTO_VEC-NEXT: br i1 [[TMP5]], label [[MIDDLE_BLOCK_UNR_LCSSA:%.*]], label [[VECTOR_PH_NEW:%.*]]
	; AUTO_VEC: vector.ph.new:
	; AUTO_VEC-NEXT: [[UNROLL_ITER:%.*]] = and i64 [[TMP4]], -2
	; AUTO_VEC-NEXT: br label [[VECTOR_BODY:%.*]]			; AUTO_VEC-NEXT: br label [[VECTOR_BODY:%.*]]
	; AUTO_VEC: vector.body:			; AUTO_VEC: vector.body:
	; AUTO_VEC-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_1:%.*]], [[VECTOR_BODY]] ]			; AUTO_VEC-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[VEC_IND:%.*]] = phi <8 x float> [ <float 1.000000e+00, float 4.300000e+01, float 8.500000e+01, float 1.270000e+02, float 1.690000e+02, float 2.110000e+02, float 2.530000e+02, float 2.950000e+02>, [[VECTOR_PH_NEW]] ], [ [[VEC_IND_NEXT_1:%.*]], [[VECTOR_BODY]] ]			; AUTO_VEC-NEXT: [[VEC_IND:%.*]] = phi <8 x float> [ <float 1.000000e+00, float 4.300000e+01, float 8.500000e+01, float 1.270000e+02, float 1.690000e+02, float 2.110000e+02, float 2.530000e+02, float 2.950000e+02>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[NITER:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[NITER_NEXT_1:%.*]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[STEP_ADD:%.*]] = fadd reassoc <8 x float> [[VEC_IND]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>			; AUTO_VEC-NEXT: [[STEP_ADD:%.*]] = fadd reassoc <8 x float> [[VEC_IND]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD2:%.*]] = fadd reassoc <8 x float> [[STEP_ADD]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>			; AUTO_VEC-NEXT: [[STEP_ADD2:%.*]] = fadd reassoc <8 x float> [[STEP_ADD]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD3:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>			; AUTO_VEC-NEXT: [[STEP_ADD3:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[TMP6:%.*]] = getelementptr inbounds float, ptr [[P:%.*]], i64 [[INDEX]]			; AUTO_VEC-NEXT: [[TMP2:%.*]] = getelementptr inbounds float, ptr [[P:%.*]], i64 [[INDEX]]
	; AUTO_VEC-NEXT: [[WIDE_LOAD:%.*]] = load <8 x float>, ptr [[TMP6]], align 4			; AUTO_VEC-NEXT: [[WIDE_LOAD:%.*]] = load <8 x float>, ptr [[TMP2]], align 4
	; AUTO_VEC-NEXT: [[TMP8:%.*]] = getelementptr inbounds float, ptr [[TMP6]], i64 8			; AUTO_VEC-NEXT: [[TMP3:%.*]] = getelementptr inbounds float, ptr [[TMP2]], i64 8
	; AUTO_VEC-NEXT: [[WIDE_LOAD5:%.*]] = load <8 x float>, ptr [[TMP8]], align 4			; AUTO_VEC-NEXT: [[WIDE_LOAD5:%.*]] = load <8 x float>, ptr [[TMP3]], align 4
	; AUTO_VEC-NEXT: [[TMP10:%.*]] = getelementptr inbounds float, ptr [[TMP6]], i64 16			; AUTO_VEC-NEXT: [[TMP4:%.*]] = getelementptr inbounds float, ptr [[TMP2]], i64 16
	; AUTO_VEC-NEXT: [[WIDE_LOAD6:%.*]] = load <8 x float>, ptr [[TMP10]], align 4			; AUTO_VEC-NEXT: [[WIDE_LOAD6:%.*]] = load <8 x float>, ptr [[TMP4]], align 4
	; AUTO_VEC-NEXT: [[TMP12:%.*]] = getelementptr inbounds float, ptr [[TMP6]], i64 24			; AUTO_VEC-NEXT: [[TMP5:%.*]] = getelementptr inbounds float, ptr [[TMP2]], i64 24
	; AUTO_VEC-NEXT: [[WIDE_LOAD7:%.*]] = load <8 x float>, ptr [[TMP12]], align 4			; AUTO_VEC-NEXT: [[WIDE_LOAD7:%.*]] = load <8 x float>, ptr [[TMP5]], align 4
	; AUTO_VEC-NEXT: [[TMP14:%.*]] = fadd reassoc <8 x float> [[VEC_IND]], [[WIDE_LOAD]]			; AUTO_VEC-NEXT: [[TMP6:%.*]] = fadd reassoc <8 x float> [[VEC_IND]], [[WIDE_LOAD]]
	; AUTO_VEC-NEXT: [[TMP15:%.*]] = fadd reassoc <8 x float> [[STEP_ADD]], [[WIDE_LOAD5]]			; AUTO_VEC-NEXT: [[TMP7:%.*]] = fadd reassoc <8 x float> [[STEP_ADD]], [[WIDE_LOAD5]]
	; AUTO_VEC-NEXT: [[TMP16:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2]], [[WIDE_LOAD6]]			; AUTO_VEC-NEXT: [[TMP8:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2]], [[WIDE_LOAD6]]
	; AUTO_VEC-NEXT: [[TMP17:%.*]] = fadd reassoc <8 x float> [[STEP_ADD3]], [[WIDE_LOAD7]]			; AUTO_VEC-NEXT: [[TMP9:%.*]] = fadd reassoc <8 x float> [[STEP_ADD3]], [[WIDE_LOAD7]]
	; AUTO_VEC-NEXT: store <8 x float> [[TMP14]], ptr [[TMP6]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[TMP6]], ptr [[TMP2]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP15]], ptr [[TMP8]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[TMP7]], ptr [[TMP3]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP16]], ptr [[TMP10]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[TMP8]], ptr [[TMP4]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP17]], ptr [[TMP12]], align 4			; AUTO_VEC-NEXT: store <8 x float> [[TMP9]], ptr [[TMP5]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 32			; AUTO_VEC-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT:%.*]] = fadd reassoc <8 x float> [[STEP_ADD3]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>			; AUTO_VEC-NEXT: [[VEC_IND_NEXT]] = fadd reassoc <8 x float> [[STEP_ADD3]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD_1:%.*]] = fadd reassoc <8 x float> [[VEC_IND_NEXT]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>			; AUTO_VEC-NEXT: [[TMP10:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AUTO_VEC-NEXT: [[STEP_ADD2_1:%.*]] = fadd reassoc <8 x float> [[STEP_ADD_1]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>			; AUTO_VEC-NEXT: br i1 [[TMP10]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]]
	; AUTO_VEC-NEXT: [[STEP_ADD3_1:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2_1]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[TMP22:%.*]] = getelementptr inbounds float, ptr [[P]], i64 [[INDEX_NEXT]]
	; AUTO_VEC-NEXT: [[WIDE_LOAD_1:%.*]] = load <8 x float>, ptr [[TMP22]], align 4
	; AUTO_VEC-NEXT: [[TMP24:%.*]] = getelementptr inbounds float, ptr [[TMP22]], i64 8
	; AUTO_VEC-NEXT: [[WIDE_LOAD5_1:%.*]] = load <8 x float>, ptr [[TMP24]], align 4
	; AUTO_VEC-NEXT: [[TMP26:%.*]] = getelementptr inbounds float, ptr [[TMP22]], i64 16
	; AUTO_VEC-NEXT: [[WIDE_LOAD6_1:%.*]] = load <8 x float>, ptr [[TMP26]], align 4
	; AUTO_VEC-NEXT: [[TMP28:%.*]] = getelementptr inbounds float, ptr [[TMP22]], i64 24
	; AUTO_VEC-NEXT: [[WIDE_LOAD7_1:%.*]] = load <8 x float>, ptr [[TMP28]], align 4
	; AUTO_VEC-NEXT: [[TMP30:%.*]] = fadd reassoc <8 x float> [[VEC_IND_NEXT]], [[WIDE_LOAD_1]]
	; AUTO_VEC-NEXT: [[TMP31:%.*]] = fadd reassoc <8 x float> [[STEP_ADD_1]], [[WIDE_LOAD5_1]]
	; AUTO_VEC-NEXT: [[TMP32:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2_1]], [[WIDE_LOAD6_1]]
	; AUTO_VEC-NEXT: [[TMP33:%.*]] = fadd reassoc <8 x float> [[STEP_ADD3_1]], [[WIDE_LOAD7_1]]
	; AUTO_VEC-NEXT: store <8 x float> [[TMP30]], ptr [[TMP22]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP31]], ptr [[TMP24]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP32]], ptr [[TMP26]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP33]], ptr [[TMP28]], align 4
	; AUTO_VEC-NEXT: [[INDEX_NEXT_1]] = add nuw i64 [[INDEX]], 64
	; AUTO_VEC-NEXT: [[VEC_IND_NEXT_1]] = fadd reassoc <8 x float> [[STEP_ADD3_1]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[NITER_NEXT_1]] = add i64 [[NITER]], 2
	; AUTO_VEC-NEXT: [[NITER_NCMP_1:%.*]] = icmp eq i64 [[NITER_NEXT_1]], [[UNROLL_ITER]]
	; AUTO_VEC-NEXT: br i1 [[NITER_NCMP_1]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]]
	; AUTO_VEC: middle.block.unr-lcssa:
	; AUTO_VEC-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_1]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[VEC_IND_UNR:%.*]] = phi <8 x float> [ <float 1.000000e+00, float 4.300000e+01, float 8.500000e+01, float 1.270000e+02, float 1.690000e+02, float 2.110000e+02, float 2.530000e+02, float 2.950000e+02>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT_1]], [[VECTOR_BODY]] ]
	; AUTO_VEC-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; AUTO_VEC-NEXT: br i1 [[LCMP_MOD_NOT]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL:%.*]]
	; AUTO_VEC: vector.body.epil:
	; AUTO_VEC-NEXT: [[STEP_ADD_EPIL:%.*]] = fadd reassoc <8 x float> [[VEC_IND_UNR]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD2_EPIL:%.*]] = fadd reassoc <8 x float> [[STEP_ADD_EPIL]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[STEP_ADD3_EPIL:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2_EPIL]], <float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02, float 3.360000e+02>
	; AUTO_VEC-NEXT: [[TMP38:%.*]] = getelementptr inbounds float, ptr [[P]], i64 [[INDEX_UNR]]
	; AUTO_VEC-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <8 x float>, ptr [[TMP38]], align 4
	; AUTO_VEC-NEXT: [[TMP40:%.*]] = getelementptr inbounds float, ptr [[TMP38]], i64 8
	; AUTO_VEC-NEXT: [[WIDE_LOAD5_EPIL:%.*]] = load <8 x float>, ptr [[TMP40]], align 4
	; AUTO_VEC-NEXT: [[TMP42:%.*]] = getelementptr inbounds float, ptr [[TMP38]], i64 16
	; AUTO_VEC-NEXT: [[WIDE_LOAD6_EPIL:%.*]] = load <8 x float>, ptr [[TMP42]], align 4
	; AUTO_VEC-NEXT: [[TMP44:%.*]] = getelementptr inbounds float, ptr [[TMP38]], i64 24
	; AUTO_VEC-NEXT: [[WIDE_LOAD7_EPIL:%.*]] = load <8 x float>, ptr [[TMP44]], align 4
	; AUTO_VEC-NEXT: [[TMP46:%.*]] = fadd reassoc <8 x float> [[VEC_IND_UNR]], [[WIDE_LOAD_EPIL]]
	; AUTO_VEC-NEXT: [[TMP47:%.*]] = fadd reassoc <8 x float> [[STEP_ADD_EPIL]], [[WIDE_LOAD5_EPIL]]
	; AUTO_VEC-NEXT: [[TMP48:%.*]] = fadd reassoc <8 x float> [[STEP_ADD2_EPIL]], [[WIDE_LOAD6_EPIL]]
	; AUTO_VEC-NEXT: [[TMP49:%.*]] = fadd reassoc <8 x float> [[STEP_ADD3_EPIL]], [[WIDE_LOAD7_EPIL]]
	; AUTO_VEC-NEXT: store <8 x float> [[TMP46]], ptr [[TMP38]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP47]], ptr [[TMP40]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP48]], ptr [[TMP42]], align 4
	; AUTO_VEC-NEXT: store <8 x float> [[TMP49]], ptr [[TMP44]], align 4
	; AUTO_VEC-NEXT: br label [[MIDDLE_BLOCK]]
	; AUTO_VEC: middle.block:			; AUTO_VEC: middle.block:
	; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[TMP0]]			; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[TMP0]]
	; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]]			; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]]
	; AUTO_VEC: for.cond.cleanup:			; AUTO_VEC: for.cond.cleanup:
	; AUTO_VEC-NEXT: ret void			; AUTO_VEC-NEXT: ret void
	; AUTO_VEC: for.body:			; AUTO_VEC: for.body:
	; AUTO_VEC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]			; AUTO_VEC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
	; AUTO_VEC-NEXT: [[X_012:%.*]] = phi float [ [[ADD3:%.*]], [[FOR_BODY]] ], [ 1.000000e+00, [[FOR_BODY_PREHEADER]] ], [ [[IND_END]], [[MIDDLE_BLOCK]] ]			; AUTO_VEC-NEXT: [[X_012:%.*]] = phi float [ [[ADD3:%.*]], [[FOR_BODY]] ], [ 1.000000e+00, [[FOR_BODY_PREHEADER]] ], [ [[IND_END]], [[MIDDLE_BLOCK]] ]
	; AUTO_VEC-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, ptr [[P]], i64 [[INDVARS_IV]]			; AUTO_VEC-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, ptr [[P]], i64 [[INDVARS_IV]]
	; AUTO_VEC-NEXT: [[TMP54:%.*]] = load float, ptr [[ARRAYIDX]], align 4			; AUTO_VEC-NEXT: [[TMP11:%.*]] = load float, ptr [[ARRAYIDX]], align 4
	; AUTO_VEC-NEXT: [[ADD:%.*]] = fadd reassoc float [[X_012]], [[TMP54]]			; AUTO_VEC-NEXT: [[ADD:%.*]] = fadd reassoc float [[X_012]], [[TMP11]]
	; AUTO_VEC-NEXT: store float [[ADD]], ptr [[ARRAYIDX]], align 4			; AUTO_VEC-NEXT: store float [[ADD]], ptr [[ARRAYIDX]], align 4
	; AUTO_VEC-NEXT: [[ADD3]] = fadd reassoc float [[X_012]], 4.200000e+01			; AUTO_VEC-NEXT: [[ADD3]] = fadd reassoc float [[X_012]], 4.200000e+01
	; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AUTO_VEC-NEXT: [[CMP_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[TMP0]]			; AUTO_VEC-NEXT: [[CMP_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[TMP0]]
	; AUTO_VEC-NEXT: br i1 [[CMP_NOT]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]			; AUTO_VEC-NEXT: br i1 [[CMP_NOT]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP10:![0-9]+]]
	;			;
	entry:			entry:
	%cmp.not11 = icmp eq i32 %N, 0			%cmp.not11 = icmp eq i32 %N, 0
	br i1 %cmp.not11, label %for.cond.cleanup, label %for.body.preheader			br i1 %cmp.not11, label %for.cond.cleanup, label %for.body.preheader

	for.body.preheader:			for.body.preheader:
	%0 = zext i32 %N to i64			%0 = zext i32 %N to i64
	br label %for.body			br label %for.body

llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll

	; AVX512-NEXT: [[TMP14:%.*]] = load float, ptr [[ARRAYIDX5]], align 4			; AVX512-NEXT: [[TMP14:%.*]] = load float, ptr [[ARRAYIDX5]], align 4
	; AVX512-NEXT: [[ADD:%.*]] = fadd float [[TMP14]], 5.000000e-01			; AVX512-NEXT: [[ADD:%.*]] = fadd float [[TMP14]], 5.000000e-01
	; AVX512-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[OUT]], i64 [[INDVARS_IV]]			; AVX512-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[OUT]], i64 [[INDVARS_IV]]
	; AVX512-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4			; AVX512-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4
	; AVX512-NEXT: br label [[FOR_INC]]			; AVX512-NEXT: br label [[FOR_INC]]
	; AVX512: for.inc:			; AVX512: for.inc:
	; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX512-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX512-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 4096			; AVX512-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 4096
	; AVX512-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; AVX512-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; AVX512: for.end:			; AVX512: for.end:
	; AVX512-NEXT: ret void			; AVX512-NEXT: ret void
	;			;
	; FVW2-LABEL: @foo1(			; FVW2-LABEL: @foo1(
	; FVW2-NEXT: entry:			; FVW2-NEXT: entry:
	; FVW2-NEXT: br label [[VECTOR_BODY:%.*]]			; FVW2-NEXT: br label [[VECTOR_BODY:%.*]]
	; FVW2: vector.body:			; FVW2: vector.body:
	; FVW2-NEXT: [[INDEX1:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; FVW2-NEXT: [[INDEX1:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; FVW2-NEXT: [[TMP14:%.*]] = load float, ptr [[ARRAYIDX5]], align 4			; FVW2-NEXT: [[TMP14:%.*]] = load float, ptr [[ARRAYIDX5]], align 4
	; FVW2-NEXT: [[ADD:%.*]] = fadd float [[TMP14]], 5.000000e-01			; FVW2-NEXT: [[ADD:%.*]] = fadd float [[TMP14]], 5.000000e-01
	; FVW2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[OUT]], i64 [[INDVARS_IV]]			; FVW2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[OUT]], i64 [[INDVARS_IV]]
	; FVW2-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4			; FVW2-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4
	; FVW2-NEXT: br label [[FOR_INC]]			; FVW2-NEXT: br label [[FOR_INC]]
	; FVW2: for.inc:			; FVW2: for.inc:
	; FVW2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; FVW2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; FVW2-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 4096			; FVW2-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 4096
	; FVW2-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; FVW2-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; FVW2: for.end:			; FVW2: for.end:
	; FVW2-NEXT: ret void			; FVW2-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.inc ]			%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.inc ]

llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll

	; CHECK-NEXT: [[BROADCAST_SPLATINSERT11:%.*]] = insertelement <8 x ptr> poison, ptr [[A]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT11:%.*]] = insertelement <8 x ptr> poison, ptr [[A]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT12:%.*]] = shufflevector <8 x ptr> [[BROADCAST_SPLATINSERT11]], <8 x ptr> poison, <8 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT12:%.*]] = shufflevector <8 x ptr> [[BROADCAST_SPLATINSERT11]], <8 x ptr> poison, <8 x i32> zeroinitializer
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT13:%.*]] = insertelement <8 x i32> poison, i32 [[NTRUNC]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT13:%.*]] = insertelement <8 x i32> poison, i32 [[NTRUNC]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT14:%.*]] = shufflevector <8 x i32> [[BROADCAST_SPLATINSERT13]], <8 x i32> poison, <8 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT14:%.*]] = shufflevector <8 x i32> [[BROADCAST_SPLATINSERT13]], <8 x i32> poison, <8 x i32> zeroinitializer
	; CHECK-NEXT: br label [[VEC_EPILOG_VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VEC_EPILOG_VECTOR_BODY:%.*]]
	; CHECK: vec.epilog.vector.body:			; CHECK: vec.epilog.vector.body:
	; CHECK-NEXT: [[INDEX9:%.*]] = phi i64 [ [[VEC_EPILOG_RESUME_VAL]], [[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT17:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX9:%.*]] = phi i64 [ [[VEC_EPILOG_RESUME_VAL]], [[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT17:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX9]]			; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX9]]
	; CHECK-NEXT: store <8 x i32> [[BROADCAST_SPLAT14]], ptr [[TMP5]], align 4, !alias.scope !7, !noalias !10			; CHECK-NEXT: store <8 x i32> [[BROADCAST_SPLAT14]], ptr [[TMP5]], align 4, !alias.scope !8, !noalias !11
	; CHECK-NEXT: [[INDEX_NEXT17]] = add nuw i64 [[INDEX9]], 8			; CHECK-NEXT: [[INDEX_NEXT17]] = add nuw i64 [[INDEX9]], 8
	; CHECK-NEXT: [[TMP6:%.*]] = icmp eq i64 [[INDEX_NEXT17]], [[N_VEC7]]			; CHECK-NEXT: [[TMP6:%.*]] = icmp eq i64 [[INDEX_NEXT17]], [[N_VEC7]]
	; CHECK-NEXT: br i1 [[TMP6]], label [[VEC_EPILOG_MIDDLE_BLOCK:%.*]], label [[VEC_EPILOG_VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP6]], label [[VEC_EPILOG_MIDDLE_BLOCK:%.*]], label [[VEC_EPILOG_VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]
	; CHECK: vec.epilog.middle.block:			; CHECK: vec.epilog.middle.block:
	; CHECK-NEXT: [[TMP7:%.*]] = icmp ne <8 x ptr> [[BROADCAST_SPLAT12]], zeroinitializer			; CHECK-NEXT: [[TMP7:%.*]] = icmp ne <8 x ptr> [[BROADCAST_SPLAT12]], zeroinitializer
	; CHECK-NEXT: [[WIDE_MASKED_GATHER15:%.*]] = call <8 x i32> @llvm.masked.gather.v8i32.v8p0(<8 x ptr> [[BROADCAST_SPLAT12]], i32 4, <8 x i1> [[TMP7]], <8 x i32> poison), !alias.scope !10			; CHECK-NEXT: [[WIDE_MASKED_GATHER15:%.*]] = call <8 x i32> @llvm.masked.gather.v8i32.v8p0(<8 x ptr> [[BROADCAST_SPLAT12]], i32 4, <8 x i1> [[TMP7]], <8 x i32> poison), !alias.scope !11
	; CHECK-NEXT: [[PREDPHI16:%.*]] = select <8 x i1> [[TMP7]], <8 x i32> [[WIDE_MASKED_GATHER15]], <8 x i32> <i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 1>			; CHECK-NEXT: [[PREDPHI16:%.*]] = select <8 x i1> [[TMP7]], <8 x i32> [[WIDE_MASKED_GATHER15]], <8 x i32> <i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 1>
	; CHECK-NEXT: [[TMP8:%.*]] = extractelement <8 x i32> [[PREDPHI16]], i64 7			; CHECK-NEXT: [[TMP8:%.*]] = extractelement <8 x i32> [[PREDPHI16]], i64 7
	; CHECK-NEXT: [[CMP_N8:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC7]]			; CHECK-NEXT: [[CMP_N8:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC7]]
	; CHECK-NEXT: br i1 [[CMP_N8]], label [[FOR_END]], label [[VEC_EPILOG_SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N8]], label [[FOR_END]], label [[VEC_EPILOG_SCALAR_PH]]
	; CHECK: vec.epilog.scalar.ph:			; CHECK: vec.epilog.scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC7]], [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ [[N_VEC]], [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC7]], [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ [[N_VEC]], [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK:%.*]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:

llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll

	; CHECK-NEXT: [[VEC_EPILOG_RESUME_VAL:%.*]] = phi i64 [ 0, [[VECTOR_MAIN_LOOP_ITER_CHECK]] ], [ [[N_VEC]], [[VEC_EPILOG_ITER_CHECK]] ]			; CHECK-NEXT: [[VEC_EPILOG_RESUME_VAL:%.*]] = phi i64 [ 0, [[VECTOR_MAIN_LOOP_ITER_CHECK]] ], [ [[N_VEC]], [[VEC_EPILOG_ITER_CHECK]] ]
	; CHECK-NEXT: [[N_VEC13:%.*]] = and i64 [[SMAX2]], 9223372036854775800			; CHECK-NEXT: [[N_VEC13:%.*]] = and i64 [[SMAX2]], 9223372036854775800
	; CHECK-NEXT: [[TMP11:%.*]] = insertelement <8 x i32> <i32 poison, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>, i32 [[BC_MERGE_RDX]], i64 0			; CHECK-NEXT: [[TMP11:%.*]] = insertelement <8 x i32> <i32 poison, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>, i32 [[BC_MERGE_RDX]], i64 0
	; CHECK-NEXT: br label [[VEC_EPILOG_VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VEC_EPILOG_VECTOR_BODY:%.*]]
	; CHECK: vec.epilog.vector.body:			; CHECK: vec.epilog.vector.body:
	; CHECK-NEXT: [[INDEX15:%.*]] = phi i64 [ [[VEC_EPILOG_RESUME_VAL]], [[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT18:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX15:%.*]] = phi i64 [ [[VEC_EPILOG_RESUME_VAL]], [[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT18:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI16:%.*]] = phi <8 x i32> [ [[TMP11]], [[VEC_EPILOG_PH]] ], [ [[TMP13:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI16:%.*]] = phi <8 x i32> [ [[TMP11]], [[VEC_EPILOG_PH]] ], [ [[TMP13:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX15]]			; CHECK-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX15]]
	; CHECK-NEXT: [[WIDE_LOAD17:%.*]] = load <8 x i32>, ptr [[TMP12]], align 8, !alias.scope !7			; CHECK-NEXT: [[WIDE_LOAD17:%.*]] = load <8 x i32>, ptr [[TMP12]], align 8, !alias.scope !8
	; CHECK-NEXT: [[TMP13]] = add <8 x i32> [[VEC_PHI16]], [[WIDE_LOAD17]]			; CHECK-NEXT: [[TMP13]] = add <8 x i32> [[VEC_PHI16]], [[WIDE_LOAD17]]
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !10, !noalias !7			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !11, !noalias !8
	; CHECK-NEXT: [[INDEX_NEXT18]] = add nuw i64 [[INDEX15]], 8			; CHECK-NEXT: [[INDEX_NEXT18]] = add nuw i64 [[INDEX15]], 8
	; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[INDEX_NEXT18]], [[N_VEC13]]			; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[INDEX_NEXT18]], [[N_VEC13]]
	; CHECK-NEXT: br i1 [[TMP14]], label [[VEC_EPILOG_MIDDLE_BLOCK:%.*]], label [[VEC_EPILOG_VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP14]], label [[VEC_EPILOG_MIDDLE_BLOCK:%.*]], label [[VEC_EPILOG_VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]
	; CHECK: vec.epilog.middle.block:			; CHECK: vec.epilog.middle.block:
	; CHECK-NEXT: [[TMP15:%.*]] = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> [[TMP13]])			; CHECK-NEXT: [[TMP15:%.*]] = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> [[TMP13]])
	; CHECK-NEXT: [[CMP_N14:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC13]]			; CHECK-NEXT: [[CMP_N14:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC13]]
	; CHECK-NEXT: br i1 [[CMP_N14]], label [[FOR_END]], label [[VEC_EPILOG_SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N14]], label [[FOR_END]], label [[VEC_EPILOG_SCALAR_PH]]
	; CHECK: vec.epilog.scalar.ph:			; CHECK: vec.epilog.scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC13]], [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ [[N_VEC]], [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC13]], [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ [[N_VEC]], [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX19:%.*]] = phi i32 [ [[TMP15]], [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ [[TMP10]], [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX19:%.*]] = phi i32 [ [[TMP15]], [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ [[TMP10]], [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]

llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll

	; AVX1-NEXT: [[TMP13:%.*]] = load i32, ptr [[ARRAYIDX3]], align 4			; AVX1-NEXT: [[TMP13:%.*]] = load i32, ptr [[ARRAYIDX3]], align 4
	; AVX1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP13]], [[TMP12]]			; AVX1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP13]], [[TMP12]]
	; AVX1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX7]], align 4			; AVX1-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX7]], align 4
	; AVX1-NEXT: br label [[FOR_INC]]			; AVX1-NEXT: br label [[FOR_INC]]
	; AVX1: for.inc:			; AVX1: for.inc:
	; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; AVX1: for.end:			; AVX1: for.end:
	; AVX1-NEXT: ret void			; AVX1-NEXT: ret void
	;			;
	; AVX2-LABEL: @foo1(			; AVX2-LABEL: @foo1(
	; AVX2-NEXT: entry:			; AVX2-NEXT: entry:
	; AVX2-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64			; AVX2-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64
	; AVX2-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64			; AVX2-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64
	; AVX2-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64			; AVX2-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64
	; AVX2-NEXT: [[TMP40:%.*]] = load i32, ptr [[ARRAYIDX3]], align 4			; AVX2-NEXT: [[TMP40:%.*]] = load i32, ptr [[ARRAYIDX3]], align 4
	; AVX2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP40]], [[TMP39]]			; AVX2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP40]], [[TMP39]]
	; AVX2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX7]], align 4			; AVX2-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX7]], align 4
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo1(			; AVX512-LABEL: @foo1(
	; AVX512-NEXT: iter.check:			; AVX512-NEXT: iter.check:
	; AVX512-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64			; AVX512-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64
	; AVX512-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64			; AVX512-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64
	; AVX512-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64			; AVX512-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64
	; AVX512-NEXT: [[TMP44:%.*]] = getelementptr i32, ptr [[TMP43]], i32 0			; AVX512-NEXT: [[TMP44:%.*]] = getelementptr i32, ptr [[TMP43]], i32 0
	; AVX512-NEXT: [[WIDE_MASKED_LOAD14:%.*]] = call <8 x i32> @llvm.masked.load.v8i32.p0(ptr [[TMP44]], i32 4, <8 x i1> [[TMP42]], <8 x i32> poison)			; AVX512-NEXT: [[WIDE_MASKED_LOAD14:%.*]] = call <8 x i32> @llvm.masked.load.v8i32.p0(ptr [[TMP44]], i32 4, <8 x i1> [[TMP42]], <8 x i32> poison)
	; AVX512-NEXT: [[TMP45:%.*]] = add nsw <8 x i32> [[WIDE_MASKED_LOAD14]], [[WIDE_LOAD13]]			; AVX512-NEXT: [[TMP45:%.*]] = add nsw <8 x i32> [[WIDE_MASKED_LOAD14]], [[WIDE_LOAD13]]
	; AVX512-NEXT: [[TMP46:%.*]] = getelementptr i32, ptr [[A]], i64 [[TMP39]]			; AVX512-NEXT: [[TMP46:%.*]] = getelementptr i32, ptr [[A]], i64 [[TMP39]]
	; AVX512-NEXT: [[TMP47:%.*]] = getelementptr i32, ptr [[TMP46]], i32 0			; AVX512-NEXT: [[TMP47:%.*]] = getelementptr i32, ptr [[TMP46]], i32 0
	; AVX512-NEXT: call void @llvm.masked.store.v8i32.p0(<8 x i32> [[TMP45]], ptr [[TMP47]], i32 4, <8 x i1> [[TMP42]])			; AVX512-NEXT: call void @llvm.masked.store.v8i32.p0(<8 x i32> [[TMP45]], ptr [[TMP47]], i32 4, <8 x i1> [[TMP42]])
	; AVX512-NEXT: [[INDEX_NEXT15]] = add nuw i64 [[INDEX12]], 8			; AVX512-NEXT: [[INDEX_NEXT15]] = add nuw i64 [[INDEX12]], 8
	; AVX512-NEXT: [[TMP48:%.*]] = icmp eq i64 [[INDEX_NEXT15]], 10000			; AVX512-NEXT: [[TMP48:%.*]] = icmp eq i64 [[INDEX_NEXT15]], 10000
	; AVX512-NEXT: br i1 [[TMP48]], label [[VEC_EPILOG_MIDDLE_BLOCK:%.*]], label [[VEC_EPILOG_VECTOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; AVX512-NEXT: br i1 [[TMP48]], label [[VEC_EPILOG_MIDDLE_BLOCK:%.*]], label [[VEC_EPILOG_VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; AVX512: vec.epilog.middle.block:			; AVX512: vec.epilog.middle.block:
	; AVX512-NEXT: [[CMP_N11:%.*]] = icmp eq i64 10000, 10000			; AVX512-NEXT: [[CMP_N11:%.*]] = icmp eq i64 10000, 10000
	; AVX512-NEXT: br i1 [[CMP_N11]], label [[FOR_END]], label [[VEC_EPILOG_SCALAR_PH]]			; AVX512-NEXT: br i1 [[CMP_N11]], label [[FOR_END]], label [[VEC_EPILOG_SCALAR_PH]]
	; AVX512: vec.epilog.scalar.ph:			; AVX512: vec.epilog.scalar.ph:
	; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 9984, [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK:%.*]] ]			; AVX512-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[VEC_EPILOG_MIDDLE_BLOCK]] ], [ 9984, [[VEC_EPILOG_ITER_CHECK]] ], [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[ITER_CHECK:%.*]] ]
	; AVX512-NEXT: br label [[FOR_BODY:%.*]]			; AVX512-NEXT: br label [[FOR_BODY:%.*]]
	; AVX512: for.body:			; AVX512: for.body:
	; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[VEC_EPILOG_SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX512-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[VEC_EPILOG_SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX1-NEXT: [[TMP7:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP6]], i32 0			; AVX1-NEXT: [[TMP7:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP6]], i32 0
	; AVX1-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <8 x i32> @llvm.masked.load.v8i32.p1(ptr addrspace(1) [[TMP7]], i32 4, <8 x i1> [[TMP5]], <8 x i32> poison)			; AVX1-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <8 x i32> @llvm.masked.load.v8i32.p1(ptr addrspace(1) [[TMP7]], i32 4, <8 x i1> [[TMP5]], <8 x i32> poison)
	; AVX1-NEXT: [[TMP8:%.*]] = add nsw <8 x i32> [[WIDE_MASKED_LOAD]], [[WIDE_LOAD]]			; AVX1-NEXT: [[TMP8:%.*]] = add nsw <8 x i32> [[WIDE_MASKED_LOAD]], [[WIDE_LOAD]]
	; AVX1-NEXT: [[TMP9:%.*]] = getelementptr i32, ptr addrspace(1) [[A]], i64 [[TMP2]]			; AVX1-NEXT: [[TMP9:%.*]] = getelementptr i32, ptr addrspace(1) [[A]], i64 [[TMP2]]
	; AVX1-NEXT: [[TMP10:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP9]], i32 0			; AVX1-NEXT: [[TMP10:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP9]], i32 0
	; AVX1-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP8]], ptr addrspace(1) [[TMP10]], i32 4, <8 x i1> [[TMP5]])			; AVX1-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP8]], ptr addrspace(1) [[TMP10]], i32 4, <8 x i1> [[TMP5]])
	; AVX1-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8			; AVX1-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
	; AVX1-NEXT: [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000			; AVX1-NEXT: [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000
	; AVX1-NEXT: br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]			; AVX1-NEXT: br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
	; AVX1: middle.block:			; AVX1: middle.block:
	; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 10000			; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 10000
	; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX1: scalar.ph:			; AVX1: scalar.ph:
	; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX1-NEXT: br label [[FOR_BODY:%.*]]			; AVX1-NEXT: br label [[FOR_BODY:%.*]]
	; AVX1: for.body:			; AVX1: for.body:
	; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX1-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: [[TMP12:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX]], align 4			; AVX1-NEXT: [[TMP12:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX]], align 4
	; AVX1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP12]], 100			; AVX1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP12]], 100
	; AVX1-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX1-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX1: if.then:			; AVX1: if.then:
	; AVX1-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[B]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[B]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: [[TMP13:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX3]], align 4			; AVX1-NEXT: [[TMP13:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX3]], align 4
	; AVX1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP13]], [[TMP12]]			; AVX1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP13]], [[TMP12]]
	; AVX1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[A]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[A]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: store i32 [[ADD]], ptr addrspace(1) [[ARRAYIDX7]], align 4			; AVX1-NEXT: store i32 [[ADD]], ptr addrspace(1) [[ARRAYIDX7]], align 4
	; AVX1-NEXT: br label [[FOR_INC]]			; AVX1-NEXT: br label [[FOR_INC]]
	; AVX1: for.inc:			; AVX1: for.inc:
	; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]			; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
	; AVX1: for.end:			; AVX1: for.end:
	; AVX1-NEXT: ret void			; AVX1-NEXT: ret void
	;			;
	; AVX2-LABEL: @foo1_addrspace1(			; AVX2-LABEL: @foo1_addrspace1(
	; AVX2-NEXT: entry:			; AVX2-NEXT: entry:
	; AVX2-NEXT: [[B3:%.*]] = ptrtoint ptr addrspace(1) [[B:%.*]] to i64			; AVX2-NEXT: [[B3:%.*]] = ptrtoint ptr addrspace(1) [[B:%.*]] to i64
	; AVX2-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr addrspace(1) [[TRIGGER:%.*]] to i64			; AVX2-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr addrspace(1) [[TRIGGER:%.*]] to i64
	; AVX2-NEXT: [[A1:%.*]] = ptrtoint ptr addrspace(1) [[A:%.*]] to i64			; AVX2-NEXT: [[A1:%.*]] = ptrtoint ptr addrspace(1) [[A:%.*]] to i64
	; AVX2-NEXT: [[TMP35:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP30]], i32 8			; AVX2-NEXT: [[TMP35:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP30]], i32 8
	; AVX2-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP27]], ptr addrspace(1) [[TMP35]], i32 4, <8 x i1> [[TMP15]])			; AVX2-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP27]], ptr addrspace(1) [[TMP35]], i32 4, <8 x i1> [[TMP15]])
	; AVX2-NEXT: [[TMP36:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP30]], i32 16			; AVX2-NEXT: [[TMP36:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP30]], i32 16
	; AVX2-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP28]], ptr addrspace(1) [[TMP36]], i32 4, <8 x i1> [[TMP16]])			; AVX2-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP28]], ptr addrspace(1) [[TMP36]], i32 4, <8 x i1> [[TMP16]])
	; AVX2-NEXT: [[TMP37:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP30]], i32 24			; AVX2-NEXT: [[TMP37:%.*]] = getelementptr i32, ptr addrspace(1) [[TMP30]], i32 24
	; AVX2-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP29]], ptr addrspace(1) [[TMP37]], i32 4, <8 x i1> [[TMP17]])			; AVX2-NEXT: call void @llvm.masked.store.v8i32.p1(<8 x i32> [[TMP29]], ptr addrspace(1) [[TMP37]], i32 4, <8 x i1> [[TMP17]])
	; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32			; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32
	; AVX2-NEXT: [[TMP38:%.*]] = icmp eq i64 [[INDEX_NEXT]], 9984			; AVX2-NEXT: [[TMP38:%.*]] = icmp eq i64 [[INDEX_NEXT]], 9984
	; AVX2-NEXT: br i1 [[TMP38]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]			; AVX2-NEXT: br i1 [[TMP38]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 9984			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 9984
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 9984, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 9984, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP39:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX]], align 4			; AVX2-NEXT: [[TMP39:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX]], align 4
	; AVX2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP39]], 100			; AVX2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP39]], 100
	; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[B]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[B]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP40:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX3]], align 4			; AVX2-NEXT: [[TMP40:%.*]] = load i32, ptr addrspace(1) [[ARRAYIDX3]], align 4
	; AVX2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP40]], [[TMP39]]			; AVX2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP40]], [[TMP39]]
	; AVX2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[A]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds i32, ptr addrspace(1) [[A]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store i32 [[ADD]], ptr addrspace(1) [[ARRAYIDX7]], align 4			; AVX2-NEXT: store i32 [[ADD]], ptr addrspace(1) [[ARRAYIDX7]], align 4
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]			; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo1_addrspace1(			; AVX512-LABEL: @foo1_addrspace1(
	; AVX512-NEXT: iter.check:			; AVX512-NEXT: iter.check:
	; AVX512-NEXT: [[B3:%.*]] = ptrtoint ptr addrspace(1) [[B:%.*]] to i64			; AVX512-NEXT: [[B3:%.*]] = ptrtoint ptr addrspace(1) [[B:%.*]] to i64
	; AVX512-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr addrspace(1) [[TRIGGER:%.*]] to i64			; AVX512-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr addrspace(1) [[TRIGGER:%.*]] to i64
	; AVX512-NEXT: [[A1:%.*]] = ptrtoint ptr addrspace(1) [[A:%.*]] to i64			; AVX512-NEXT: [[A1:%.*]] = ptrtoint ptr addrspace(1) [[A:%.*]] to i64
	; AVX1-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <8 x float> @llvm.masked.load.v8f32.p0(ptr [[TMP7]], i32 4, <8 x i1> [[TMP5]], <8 x float> poison)			; AVX1-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <8 x float> @llvm.masked.load.v8f32.p0(ptr [[TMP7]], i32 4, <8 x i1> [[TMP5]], <8 x float> poison)
	; AVX1-NEXT: [[TMP8:%.*]] = sitofp <8 x i32> [[WIDE_LOAD]] to <8 x float>			; AVX1-NEXT: [[TMP8:%.*]] = sitofp <8 x i32> [[WIDE_LOAD]] to <8 x float>
	; AVX1-NEXT: [[TMP9:%.*]] = fadd <8 x float> [[WIDE_MASKED_LOAD]], [[TMP8]]			; AVX1-NEXT: [[TMP9:%.*]] = fadd <8 x float> [[WIDE_MASKED_LOAD]], [[TMP8]]
	; AVX1-NEXT: [[TMP10:%.*]] = getelementptr float, ptr [[A]], i64 [[TMP2]]			; AVX1-NEXT: [[TMP10:%.*]] = getelementptr float, ptr [[A]], i64 [[TMP2]]
	; AVX1-NEXT: [[TMP11:%.*]] = getelementptr float, ptr [[TMP10]], i32 0			; AVX1-NEXT: [[TMP11:%.*]] = getelementptr float, ptr [[TMP10]], i32 0
	; AVX1-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP9]], ptr [[TMP11]], i32 4, <8 x i1> [[TMP5]])			; AVX1-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP9]], ptr [[TMP11]], i32 4, <8 x i1> [[TMP5]])
	; AVX1-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8			; AVX1-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
	; AVX1-NEXT: [[TMP12:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000			; AVX1-NEXT: [[TMP12:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000
	; AVX1-NEXT: br i1 [[TMP12]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]			; AVX1-NEXT: br i1 [[TMP12]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
	; AVX1: middle.block:			; AVX1: middle.block:
	; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 10000			; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 10000
	; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX1: scalar.ph:			; AVX1: scalar.ph:
	; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX1-NEXT: br label [[FOR_BODY:%.*]]			; AVX1-NEXT: br label [[FOR_BODY:%.*]]
	; AVX1: for.body:			; AVX1: for.body:
	; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX1-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: [[TMP13:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; AVX1-NEXT: [[TMP13:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; AVX1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP13]], 100			; AVX1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP13]], 100
	; AVX1-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX1-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX1: if.then:			; AVX1: if.then:
	; AVX1-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds float, ptr [[B]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds float, ptr [[B]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: [[TMP14:%.*]] = load float, ptr [[ARRAYIDX3]], align 4			; AVX1-NEXT: [[TMP14:%.*]] = load float, ptr [[ARRAYIDX3]], align 4
	; AVX1-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP13]] to float			; AVX1-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP13]] to float
	; AVX1-NEXT: [[ADD:%.*]] = fadd float [[TMP14]], [[CONV]]			; AVX1-NEXT: [[ADD:%.*]] = fadd float [[TMP14]], [[CONV]]
	; AVX1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4			; AVX1-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4
	; AVX1-NEXT: br label [[FOR_INC]]			; AVX1-NEXT: br label [[FOR_INC]]
	; AVX1: for.inc:			; AVX1: for.inc:
	; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]			; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
	; AVX1: for.end:			; AVX1: for.end:
	; AVX1-NEXT: ret void			; AVX1-NEXT: ret void
	;			;
	; AVX2-LABEL: @foo2(			; AVX2-LABEL: @foo2(
	; AVX2-NEXT: entry:			; AVX2-NEXT: entry:
	; AVX2-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64			; AVX2-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64
	; AVX2-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64			; AVX2-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64
	; AVX2-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64			; AVX2-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64
	[58 lines not shown]
	; AVX2-NEXT: [[TMP39:%.*]] = getelementptr float, ptr [[TMP34]], i32 8			; AVX2-NEXT: [[TMP39:%.*]] = getelementptr float, ptr [[TMP34]], i32 8
	; AVX2-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP31]], ptr [[TMP39]], i32 4, <8 x i1> [[TMP15]])			; AVX2-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP31]], ptr [[TMP39]], i32 4, <8 x i1> [[TMP15]])
	; AVX2-NEXT: [[TMP40:%.*]] = getelementptr float, ptr [[TMP34]], i32 16			; AVX2-NEXT: [[TMP40:%.*]] = getelementptr float, ptr [[TMP34]], i32 16
	; AVX2-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP32]], ptr [[TMP40]], i32 4, <8 x i1> [[TMP16]])			; AVX2-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP32]], ptr [[TMP40]], i32 4, <8 x i1> [[TMP16]])
	; AVX2-NEXT: [[TMP41:%.*]] = getelementptr float, ptr [[TMP34]], i32 24			; AVX2-NEXT: [[TMP41:%.*]] = getelementptr float, ptr [[TMP34]], i32 24
	; AVX2-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP33]], ptr [[TMP41]], i32 4, <8 x i1> [[TMP17]])			; AVX2-NEXT: call void @llvm.masked.store.v8f32.p0(<8 x float> [[TMP33]], ptr [[TMP41]], i32 4, <8 x i1> [[TMP17]])
	; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32			; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32
	; AVX2-NEXT: [[TMP42:%.*]] = icmp eq i64 [[INDEX_NEXT]], 9984			; AVX2-NEXT: [[TMP42:%.*]] = icmp eq i64 [[INDEX_NEXT]], 9984
	; AVX2-NEXT: br i1 [[TMP42]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]			; AVX2-NEXT: br i1 [[TMP42]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 9984			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 9984
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 9984, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 9984, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP43:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; AVX2-NEXT: [[TMP43:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; AVX2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP43]], 100			; AVX2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP43]], 100
	; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds float, ptr [[B]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds float, ptr [[B]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP44:%.*]] = load float, ptr [[ARRAYIDX3]], align 4			; AVX2-NEXT: [[TMP44:%.*]] = load float, ptr [[ARRAYIDX3]], align 4
	; AVX2-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP43]] to float			; AVX2-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP43]] to float
	; AVX2-NEXT: [[ADD:%.*]] = fadd float [[TMP44]], [[CONV]]			; AVX2-NEXT: [[ADD:%.*]] = fadd float [[TMP44]], [[CONV]]
	; AVX2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4			; AVX2-NEXT: store float [[ADD]], ptr [[ARRAYIDX7]], align 4
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]			; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo2(			; AVX512-LABEL: @foo2(
	; AVX512-NEXT: iter.check:			; AVX512-NEXT: iter.check:
	; AVX512-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64			; AVX512-NEXT: [[B3:%.*]] = ptrtoint ptr [[B:%.*]] to i64
	; AVX512-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64			; AVX512-NEXT: [[TRIGGER2:%.*]] = ptrtoint ptr [[TRIGGER:%.*]] to i64
	; AVX512-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64			; AVX512-NEXT: [[A1:%.*]] = ptrtoint ptr [[A:%.*]] to i64
	[182 lines not shown]
	; AVX-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4			; AVX-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4
	; AVX-NEXT: [[TMP2:%.*]] = add i64 [[INDEX]], 8			; AVX-NEXT: [[TMP2:%.*]] = add i64 [[INDEX]], 8
	; AVX-NEXT: [[TMP3:%.*]] = add i64 [[INDEX]], 12			; AVX-NEXT: [[TMP3:%.*]] = add i64 [[INDEX]], 12
	; AVX-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP0]]			; AVX-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP0]]
	; AVX-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP1]]			; AVX-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP1]]
	; AVX-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP2]]			; AVX-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP2]]
	; AVX-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP3]]			; AVX-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP3]]
	; AVX-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 0			; AVX-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 0
	; AVX-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP8]], align 4, !alias.scope !7			; AVX-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP8]], align 4, !alias.scope !8
	; AVX-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 4			; AVX-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 4
	; AVX-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x i32>, ptr [[TMP9]], align 4, !alias.scope !7			; AVX-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x i32>, ptr [[TMP9]], align 4, !alias.scope !8
	; AVX-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 8			; AVX-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 8
	; AVX-NEXT: [[WIDE_LOAD7:%.*]] = load <4 x i32>, ptr [[TMP10]], align 4, !alias.scope !7			; AVX-NEXT: [[WIDE_LOAD7:%.*]] = load <4 x i32>, ptr [[TMP10]], align 4, !alias.scope !8
	; AVX-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 12			; AVX-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 12
	; AVX-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x i32>, ptr [[TMP11]], align 4, !alias.scope !7			; AVX-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x i32>, ptr [[TMP11]], align 4, !alias.scope !8
	; AVX-NEXT: [[TMP12:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD]], <i32 100, i32 100, i32 100, i32 100>			; AVX-NEXT: [[TMP12:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD]], <i32 100, i32 100, i32 100, i32 100>
	; AVX-NEXT: [[TMP13:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD6]], <i32 100, i32 100, i32 100, i32 100>			; AVX-NEXT: [[TMP13:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD6]], <i32 100, i32 100, i32 100, i32 100>
	; AVX-NEXT: [[TMP14:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD7]], <i32 100, i32 100, i32 100, i32 100>			; AVX-NEXT: [[TMP14:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD7]], <i32 100, i32 100, i32 100, i32 100>
	; AVX-NEXT: [[TMP15:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD8]], <i32 100, i32 100, i32 100, i32 100>			; AVX-NEXT: [[TMP15:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD8]], <i32 100, i32 100, i32 100, i32 100>
	; AVX-NEXT: [[TMP16:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP0]]			; AVX-NEXT: [[TMP16:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP0]]
	; AVX-NEXT: [[TMP17:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP1]]			; AVX-NEXT: [[TMP17:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP1]]
	; AVX-NEXT: [[TMP18:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP2]]			; AVX-NEXT: [[TMP18:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP2]]
	; AVX-NEXT: [[TMP19:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP3]]			; AVX-NEXT: [[TMP19:%.*]] = getelementptr double, ptr [[B]], i64 [[TMP3]]
	; AVX-NEXT: [[TMP20:%.*]] = getelementptr double, ptr [[TMP16]], i32 0			; AVX-NEXT: [[TMP20:%.*]] = getelementptr double, ptr [[TMP16]], i32 0
	; AVX-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP20]], i32 8, <4 x i1> [[TMP12]], <4 x double> poison), !alias.scope !10			; AVX-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP20]], i32 8, <4 x i1> [[TMP12]], <4 x double> poison), !alias.scope !11
	; AVX-NEXT: [[TMP21:%.*]] = getelementptr double, ptr [[TMP16]], i32 4			; AVX-NEXT: [[TMP21:%.*]] = getelementptr double, ptr [[TMP16]], i32 4
	; AVX-NEXT: [[WIDE_MASKED_LOAD9:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP21]], i32 8, <4 x i1> [[TMP13]], <4 x double> poison), !alias.scope !10			; AVX-NEXT: [[WIDE_MASKED_LOAD9:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP21]], i32 8, <4 x i1> [[TMP13]], <4 x double> poison), !alias.scope !11
	; AVX-NEXT: [[TMP22:%.*]] = getelementptr double, ptr [[TMP16]], i32 8			; AVX-NEXT: [[TMP22:%.*]] = getelementptr double, ptr [[TMP16]], i32 8
	; AVX-NEXT: [[WIDE_MASKED_LOAD10:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP22]], i32 8, <4 x i1> [[TMP14]], <4 x double> poison), !alias.scope !10			; AVX-NEXT: [[WIDE_MASKED_LOAD10:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP22]], i32 8, <4 x i1> [[TMP14]], <4 x double> poison), !alias.scope !11
	; AVX-NEXT: [[TMP23:%.*]] = getelementptr double, ptr [[TMP16]], i32 12			; AVX-NEXT: [[TMP23:%.*]] = getelementptr double, ptr [[TMP16]], i32 12
	; AVX-NEXT: [[WIDE_MASKED_LOAD11:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP23]], i32 8, <4 x i1> [[TMP15]], <4 x double> poison), !alias.scope !10			; AVX-NEXT: [[WIDE_MASKED_LOAD11:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP23]], i32 8, <4 x i1> [[TMP15]], <4 x double> poison), !alias.scope !11
	; AVX-NEXT: [[TMP24:%.*]] = sitofp <4 x i32> [[WIDE_LOAD]] to <4 x double>			; AVX-NEXT: [[TMP24:%.*]] = sitofp <4 x i32> [[WIDE_LOAD]] to <4 x double>
	; AVX-NEXT: [[TMP25:%.*]] = sitofp <4 x i32> [[WIDE_LOAD6]] to <4 x double>			; AVX-NEXT: [[TMP25:%.*]] = sitofp <4 x i32> [[WIDE_LOAD6]] to <4 x double>
	; AVX-NEXT: [[TMP26:%.*]] = sitofp <4 x i32> [[WIDE_LOAD7]] to <4 x double>			; AVX-NEXT: [[TMP26:%.*]] = sitofp <4 x i32> [[WIDE_LOAD7]] to <4 x double>
	; AVX-NEXT: [[TMP27:%.*]] = sitofp <4 x i32> [[WIDE_LOAD8]] to <4 x double>			; AVX-NEXT: [[TMP27:%.*]] = sitofp <4 x i32> [[WIDE_LOAD8]] to <4 x double>
	; AVX-NEXT: [[TMP28:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD]], [[TMP24]]			; AVX-NEXT: [[TMP28:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD]], [[TMP24]]
	; AVX-NEXT: [[TMP29:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD9]], [[TMP25]]			; AVX-NEXT: [[TMP29:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD9]], [[TMP25]]
	; AVX-NEXT: [[TMP30:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD10]], [[TMP26]]			; AVX-NEXT: [[TMP30:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD10]], [[TMP26]]
	; AVX-NEXT: [[TMP31:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD11]], [[TMP27]]			; AVX-NEXT: [[TMP31:%.*]] = fadd <4 x double> [[WIDE_MASKED_LOAD11]], [[TMP27]]
	; AVX-NEXT: [[TMP32:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP0]]			; AVX-NEXT: [[TMP32:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP0]]
	; AVX-NEXT: [[TMP33:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP1]]			; AVX-NEXT: [[TMP33:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP1]]
	; AVX-NEXT: [[TMP34:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP2]]			; AVX-NEXT: [[TMP34:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP2]]
	; AVX-NEXT: [[TMP35:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP3]]			; AVX-NEXT: [[TMP35:%.*]] = getelementptr double, ptr [[A]], i64 [[TMP3]]
	; AVX-NEXT: [[TMP36:%.*]] = getelementptr double, ptr [[TMP32]], i32 0			; AVX-NEXT: [[TMP36:%.*]] = getelementptr double, ptr [[TMP32]], i32 0
	; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP28]], ptr [[TMP36]], i32 8, <4 x i1> [[TMP12]]), !alias.scope !12, !noalias !14			; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP28]], ptr [[TMP36]], i32 8, <4 x i1> [[TMP12]]), !alias.scope !13, !noalias !15
	; AVX-NEXT: [[TMP37:%.*]] = getelementptr double, ptr [[TMP32]], i32 4			; AVX-NEXT: [[TMP37:%.*]] = getelementptr double, ptr [[TMP32]], i32 4
	; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP29]], ptr [[TMP37]], i32 8, <4 x i1> [[TMP13]]), !alias.scope !12, !noalias !14			; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP29]], ptr [[TMP37]], i32 8, <4 x i1> [[TMP13]]), !alias.scope !13, !noalias !15
	; AVX-NEXT: [[TMP38:%.*]] = getelementptr double, ptr [[TMP32]], i32 8			; AVX-NEXT: [[TMP38:%.*]] = getelementptr double, ptr [[TMP32]], i32 8
	; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP30]], ptr [[TMP38]], i32 8, <4 x i1> [[TMP14]]), !alias.scope !12, !noalias !14			; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP30]], ptr [[TMP38]], i32 8, <4 x i1> [[TMP14]]), !alias.scope !13, !noalias !15
	; AVX-NEXT: [[TMP39:%.*]] = getelementptr double, ptr [[TMP32]], i32 12			; AVX-NEXT: [[TMP39:%.*]] = getelementptr double, ptr [[TMP32]], i32 12
	; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP31]], ptr [[TMP39]], i32 8, <4 x i1> [[TMP15]]), !alias.scope !12, !noalias !14			; AVX-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[TMP31]], ptr [[TMP39]], i32 8, <4 x i1> [[TMP15]]), !alias.scope !13, !noalias !15
	; AVX-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16			; AVX-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
	; AVX-NEXT: [[TMP40:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000			; AVX-NEXT: [[TMP40:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000
	; AVX-NEXT: br i1 [[TMP40]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP15:![0-9]+]]			; AVX-NEXT: br i1 [[TMP40]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP16:![0-9]+]]
	; AVX: middle.block:			; AVX: middle.block:
	; AVX-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 10000			; AVX-NEXT: [[CMP_N:%.*]] = icmp eq i64 10000, 10000
	; AVX-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX: scalar.ph:			; AVX: scalar.ph:
	; AVX-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; AVX-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 10000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; AVX-NEXT: br label [[FOR_BODY:%.*]]			; AVX-NEXT: br label [[FOR_BODY:%.*]]
	; AVX: for.body:			; AVX: for.body:
	; AVX-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX-NEXT: [[TMP41:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; AVX-NEXT: [[TMP41:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; AVX-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP41]], 100			; AVX-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP41]], 100
	; AVX-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX: if.then:			; AVX: if.then:
	; AVX-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, ptr [[B]], i64 [[INDVARS_IV]]			; AVX-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, ptr [[B]], i64 [[INDVARS_IV]]
	; AVX-NEXT: [[TMP42:%.*]] = load double, ptr [[ARRAYIDX3]], align 8			; AVX-NEXT: [[TMP42:%.*]] = load double, ptr [[ARRAYIDX3]], align 8
	; AVX-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP41]] to double			; AVX-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP41]] to double
	; AVX-NEXT: [[ADD:%.*]] = fadd double [[TMP42]], [[CONV]]			; AVX-NEXT: [[ADD:%.*]] = fadd double [[TMP42]], [[CONV]]
	; AVX-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds double, ptr [[A]], i64 [[INDVARS_IV]]			; AVX-NEXT: [[ARRAYIDX7:%.*]] = getelementptr inbounds double, ptr [[A]], i64 [[INDVARS_IV]]
	; AVX-NEXT: store double [[ADD]], ptr [[ARRAYIDX7]], align 8			; AVX-NEXT: store double [[ADD]], ptr [[ARRAYIDX7]], align 8
	; AVX-NEXT: br label [[FOR_INC]]			; AVX-NEXT: br label [[FOR_INC]]
	; AVX: for.inc:			; AVX: for.inc:
	; AVX-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000			; AVX-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 10000
	; AVX-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP16:![0-9]+]]			; AVX-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]]
	; AVX: for.end:			; AVX: for.end:
	; AVX-NEXT: ret void			; AVX-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo3(			; AVX512-LABEL: @foo3(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
	; AVX512-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]			; AVX512-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
	; AVX512: vector.memcheck:			; AVX512: vector.memcheck:
	; AVX512-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[A:%.*]], i64 80000			; AVX512-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[A:%.*]], i64 80000
	[311 lines not shown]
	; AVX2-NEXT: [[TMP2:%.*]] = add i64 [[OFFSET_IDX]], -8			; AVX2-NEXT: [[TMP2:%.*]] = add i64 [[OFFSET_IDX]], -8
	; AVX2-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -12			; AVX2-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -12
	; AVX2-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP0]]			; AVX2-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP0]]
	; AVX2-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP1]]			; AVX2-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP1]]
	; AVX2-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP2]]			; AVX2-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP2]]
	; AVX2-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP3]]			; AVX2-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[TMP3]]
	; AVX2-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 0			; AVX2-NEXT: [[TMP8:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 0
	; AVX2-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP8]], i32 -3			; AVX2-NEXT: [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP8]], i32 -3
	; AVX2-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP9]], align 4, !alias.scope !17			; AVX2-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP9]], align 4, !alias.scope !18
	; AVX2-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 -4			; AVX2-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 -4
	; AVX2-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, ptr [[TMP10]], i32 -3			; AVX2-NEXT: [[TMP11:%.*]] = getelementptr inbounds i32, ptr [[TMP10]], i32 -3
	; AVX2-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x i32>, ptr [[TMP11]], align 4, !alias.scope !17			; AVX2-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x i32>, ptr [[TMP11]], align 4, !alias.scope !18
	; AVX2-NEXT: [[REVERSE7:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD6]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE7:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD6]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 -8			; AVX2-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 -8
	; AVX2-NEXT: [[TMP13:%.*]] = getelementptr inbounds i32, ptr [[TMP12]], i32 -3			; AVX2-NEXT: [[TMP13:%.*]] = getelementptr inbounds i32, ptr [[TMP12]], i32 -3
	; AVX2-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x i32>, ptr [[TMP13]], align 4, !alias.scope !17			; AVX2-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x i32>, ptr [[TMP13]], align 4, !alias.scope !18
	; AVX2-NEXT: [[REVERSE9:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD8]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE9:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD8]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 -12			; AVX2-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 -12
	; AVX2-NEXT: [[TMP15:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i32 -3			; AVX2-NEXT: [[TMP15:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i32 -3
	; AVX2-NEXT: [[WIDE_LOAD10:%.*]] = load <4 x i32>, ptr [[TMP15]], align 4, !alias.scope !17			; AVX2-NEXT: [[WIDE_LOAD10:%.*]] = load <4 x i32>, ptr [[TMP15]], align 4, !alias.scope !18
	; AVX2-NEXT: [[REVERSE11:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD10]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE11:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD10]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP16:%.*]] = icmp sgt <4 x i32> [[REVERSE]], zeroinitializer			; AVX2-NEXT: [[TMP16:%.*]] = icmp sgt <4 x i32> [[REVERSE]], zeroinitializer
	; AVX2-NEXT: [[TMP17:%.*]] = icmp sgt <4 x i32> [[REVERSE7]], zeroinitializer			; AVX2-NEXT: [[TMP17:%.*]] = icmp sgt <4 x i32> [[REVERSE7]], zeroinitializer
	; AVX2-NEXT: [[TMP18:%.*]] = icmp sgt <4 x i32> [[REVERSE9]], zeroinitializer			; AVX2-NEXT: [[TMP18:%.*]] = icmp sgt <4 x i32> [[REVERSE9]], zeroinitializer
	; AVX2-NEXT: [[TMP19:%.*]] = icmp sgt <4 x i32> [[REVERSE11]], zeroinitializer			; AVX2-NEXT: [[TMP19:%.*]] = icmp sgt <4 x i32> [[REVERSE11]], zeroinitializer
	; AVX2-NEXT: [[TMP20:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP0]]			; AVX2-NEXT: [[TMP20:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP0]]
	; AVX2-NEXT: [[TMP21:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP1]]			; AVX2-NEXT: [[TMP21:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP1]]
	; AVX2-NEXT: [[TMP22:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP2]]			; AVX2-NEXT: [[TMP22:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP2]]
	; AVX2-NEXT: [[TMP23:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP3]]			; AVX2-NEXT: [[TMP23:%.*]] = getelementptr double, ptr [[IN]], i64 [[TMP3]]
	; AVX2-NEXT: [[TMP24:%.*]] = getelementptr double, ptr [[TMP20]], i32 0			; AVX2-NEXT: [[TMP24:%.*]] = getelementptr double, ptr [[TMP20]], i32 0
	; AVX2-NEXT: [[TMP25:%.*]] = getelementptr double, ptr [[TMP24]], i32 -3			; AVX2-NEXT: [[TMP25:%.*]] = getelementptr double, ptr [[TMP24]], i32 -3
	; AVX2-NEXT: [[REVERSE12:%.*]] = shufflevector <4 x i1> [[TMP16]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE12:%.*]] = shufflevector <4 x i1> [[TMP16]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP25]], i32 8, <4 x i1> [[REVERSE12]], <4 x double> poison), !alias.scope !20			; AVX2-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP25]], i32 8, <4 x i1> [[REVERSE12]], <4 x double> poison), !alias.scope !21
	; AVX2-NEXT: [[REVERSE13:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE13:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP26:%.*]] = getelementptr double, ptr [[TMP20]], i32 -4			; AVX2-NEXT: [[TMP26:%.*]] = getelementptr double, ptr [[TMP20]], i32 -4
	; AVX2-NEXT: [[TMP27:%.*]] = getelementptr double, ptr [[TMP26]], i32 -3			; AVX2-NEXT: [[TMP27:%.*]] = getelementptr double, ptr [[TMP26]], i32 -3
	; AVX2-NEXT: [[REVERSE14:%.*]] = shufflevector <4 x i1> [[TMP17]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE14:%.*]] = shufflevector <4 x i1> [[TMP17]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[WIDE_MASKED_LOAD15:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP27]], i32 8, <4 x i1> [[REVERSE14]], <4 x double> poison), !alias.scope !20			; AVX2-NEXT: [[WIDE_MASKED_LOAD15:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP27]], i32 8, <4 x i1> [[REVERSE14]], <4 x double> poison), !alias.scope !21
	; AVX2-NEXT: [[REVERSE16:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD15]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE16:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD15]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP28:%.*]] = getelementptr double, ptr [[TMP20]], i32 -8			; AVX2-NEXT: [[TMP28:%.*]] = getelementptr double, ptr [[TMP20]], i32 -8
	; AVX2-NEXT: [[TMP29:%.*]] = getelementptr double, ptr [[TMP28]], i32 -3			; AVX2-NEXT: [[TMP29:%.*]] = getelementptr double, ptr [[TMP28]], i32 -3
	; AVX2-NEXT: [[REVERSE17:%.*]] = shufflevector <4 x i1> [[TMP18]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE17:%.*]] = shufflevector <4 x i1> [[TMP18]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[WIDE_MASKED_LOAD18:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP29]], i32 8, <4 x i1> [[REVERSE17]], <4 x double> poison), !alias.scope !20			; AVX2-NEXT: [[WIDE_MASKED_LOAD18:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP29]], i32 8, <4 x i1> [[REVERSE17]], <4 x double> poison), !alias.scope !21
	; AVX2-NEXT: [[REVERSE19:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD18]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE19:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD18]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP30:%.*]] = getelementptr double, ptr [[TMP20]], i32 -12			; AVX2-NEXT: [[TMP30:%.*]] = getelementptr double, ptr [[TMP20]], i32 -12
	; AVX2-NEXT: [[TMP31:%.*]] = getelementptr double, ptr [[TMP30]], i32 -3			; AVX2-NEXT: [[TMP31:%.*]] = getelementptr double, ptr [[TMP30]], i32 -3
	; AVX2-NEXT: [[REVERSE20:%.*]] = shufflevector <4 x i1> [[TMP19]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE20:%.*]] = shufflevector <4 x i1> [[TMP19]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[WIDE_MASKED_LOAD21:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP31]], i32 8, <4 x i1> [[REVERSE20]], <4 x double> poison), !alias.scope !20			; AVX2-NEXT: [[WIDE_MASKED_LOAD21:%.*]] = call <4 x double> @llvm.masked.load.v4f64.p0(ptr [[TMP31]], i32 8, <4 x i1> [[REVERSE20]], <4 x double> poison), !alias.scope !21
	; AVX2-NEXT: [[REVERSE22:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD21]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE22:%.*]] = shufflevector <4 x double> [[WIDE_MASKED_LOAD21]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP32:%.*]] = fadd <4 x double> [[REVERSE13]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP32:%.*]] = fadd <4 x double> [[REVERSE13]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP33:%.*]] = fadd <4 x double> [[REVERSE16]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP33:%.*]] = fadd <4 x double> [[REVERSE16]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP34:%.*]] = fadd <4 x double> [[REVERSE19]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP34:%.*]] = fadd <4 x double> [[REVERSE19]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP35:%.*]] = fadd <4 x double> [[REVERSE22]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>			; AVX2-NEXT: [[TMP35:%.*]] = fadd <4 x double> [[REVERSE22]], <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>
	; AVX2-NEXT: [[TMP36:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP0]]			; AVX2-NEXT: [[TMP36:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP0]]
	; AVX2-NEXT: [[TMP37:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP1]]			; AVX2-NEXT: [[TMP37:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP1]]
	; AVX2-NEXT: [[TMP38:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP2]]			; AVX2-NEXT: [[TMP38:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP2]]
	; AVX2-NEXT: [[TMP39:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP3]]			; AVX2-NEXT: [[TMP39:%.*]] = getelementptr double, ptr [[OUT]], i64 [[TMP3]]
	; AVX2-NEXT: [[REVERSE23:%.*]] = shufflevector <4 x double> [[TMP32]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE23:%.*]] = shufflevector <4 x double> [[TMP32]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP40:%.*]] = getelementptr double, ptr [[TMP36]], i32 0			; AVX2-NEXT: [[TMP40:%.*]] = getelementptr double, ptr [[TMP36]], i32 0
	; AVX2-NEXT: [[TMP41:%.*]] = getelementptr double, ptr [[TMP40]], i32 -3			; AVX2-NEXT: [[TMP41:%.*]] = getelementptr double, ptr [[TMP40]], i32 -3
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE23]], ptr [[TMP41]], i32 8, <4 x i1> [[REVERSE12]]), !alias.scope !22, !noalias !24			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE23]], ptr [[TMP41]], i32 8, <4 x i1> [[REVERSE12]]), !alias.scope !23, !noalias !25
	; AVX2-NEXT: [[REVERSE25:%.*]] = shufflevector <4 x double> [[TMP33]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE25:%.*]] = shufflevector <4 x double> [[TMP33]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP42:%.*]] = getelementptr double, ptr [[TMP36]], i32 -4			; AVX2-NEXT: [[TMP42:%.*]] = getelementptr double, ptr [[TMP36]], i32 -4
	; AVX2-NEXT: [[TMP43:%.*]] = getelementptr double, ptr [[TMP42]], i32 -3			; AVX2-NEXT: [[TMP43:%.*]] = getelementptr double, ptr [[TMP42]], i32 -3
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE25]], ptr [[TMP43]], i32 8, <4 x i1> [[REVERSE14]]), !alias.scope !22, !noalias !24			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE25]], ptr [[TMP43]], i32 8, <4 x i1> [[REVERSE14]]), !alias.scope !23, !noalias !25
	; AVX2-NEXT: [[REVERSE27:%.*]] = shufflevector <4 x double> [[TMP34]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE27:%.*]] = shufflevector <4 x double> [[TMP34]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP44:%.*]] = getelementptr double, ptr [[TMP36]], i32 -8			; AVX2-NEXT: [[TMP44:%.*]] = getelementptr double, ptr [[TMP36]], i32 -8
	; AVX2-NEXT: [[TMP45:%.*]] = getelementptr double, ptr [[TMP44]], i32 -3			; AVX2-NEXT: [[TMP45:%.*]] = getelementptr double, ptr [[TMP44]], i32 -3
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE27]], ptr [[TMP45]], i32 8, <4 x i1> [[REVERSE17]]), !alias.scope !22, !noalias !24			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE27]], ptr [[TMP45]], i32 8, <4 x i1> [[REVERSE17]]), !alias.scope !23, !noalias !25
	; AVX2-NEXT: [[REVERSE29:%.*]] = shufflevector <4 x double> [[TMP35]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>			; AVX2-NEXT: [[REVERSE29:%.*]] = shufflevector <4 x double> [[TMP35]], <4 x double> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
	; AVX2-NEXT: [[TMP46:%.*]] = getelementptr double, ptr [[TMP36]], i32 -12			; AVX2-NEXT: [[TMP46:%.*]] = getelementptr double, ptr [[TMP36]], i32 -12
	; AVX2-NEXT: [[TMP47:%.*]] = getelementptr double, ptr [[TMP46]], i32 -3			; AVX2-NEXT: [[TMP47:%.*]] = getelementptr double, ptr [[TMP46]], i32 -3
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE29]], ptr [[TMP47]], i32 8, <4 x i1> [[REVERSE20]]), !alias.scope !22, !noalias !24			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> [[REVERSE29]], ptr [[TMP47]], i32 8, <4 x i1> [[REVERSE20]]), !alias.scope !23, !noalias !25
	; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16			; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
	; AVX2-NEXT: [[TMP48:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096			; AVX2-NEXT: [[TMP48:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
	; AVX2-NEXT: br i1 [[TMP48]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]			; AVX2-NEXT: br i1 [[TMP48]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 4096, 4096			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 4096, 4096
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 4095, [[ENTRY:%.*]] ], [ 4095, [[VECTOR_MEMCHECK]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 4095, [[ENTRY:%.*]] ], [ 4095, [[VECTOR_MEMCHECK]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TRIGGER]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP49:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; AVX2-NEXT: [[TMP49:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; AVX2-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP49]], 0			; AVX2-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[TMP49]], 0
	; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; AVX2-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, ptr [[IN]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds double, ptr [[IN]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: [[TMP50:%.*]] = load double, ptr [[ARRAYIDX3]], align 8			; AVX2-NEXT: [[TMP50:%.*]] = load double, ptr [[ARRAYIDX3]], align 8
	; AVX2-NEXT: [[ADD:%.*]] = fadd double [[TMP50]], 5.000000e-01			; AVX2-NEXT: [[ADD:%.*]] = fadd double [[TMP50]], 5.000000e-01
	; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, ptr [[OUT]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, ptr [[OUT]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store double [[ADD]], ptr [[ARRAYIDX5]], align 8			; AVX2-NEXT: store double [[ADD]], ptr [[ARRAYIDX5]], align 8
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
	; AVX2-NEXT: [[CMP:%.*]] = icmp eq i64 [[INDVARS_IV]], 0			; AVX2-NEXT: [[CMP:%.*]] = icmp eq i64 [[INDVARS_IV]], 0
	; AVX2-NEXT: br i1 [[CMP]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]			; AVX2-NEXT: br i1 [[CMP]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP27:![0-9]+]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo6(			; AVX512-LABEL: @foo6(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
	; AVX512-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]			; AVX512-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
	; AVX512: vector.memcheck:			; AVX512: vector.memcheck:
	; AVX512-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[OUT:%.*]], i64 32768			; AVX512-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[OUT:%.*]], i64 32768
	[228 lines not shown]
	; AVX1-NEXT: [[TMP49:%.*]] = getelementptr double, ptr [[TMP36]], i32 4			; AVX1-NEXT: [[TMP49:%.*]] = getelementptr double, ptr [[TMP36]], i32 4
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP49]], i32 8, <4 x i1> [[TMP45]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP49]], i32 8, <4 x i1> [[TMP45]])
	; AVX1-NEXT: [[TMP50:%.*]] = getelementptr double, ptr [[TMP36]], i32 8			; AVX1-NEXT: [[TMP50:%.*]] = getelementptr double, ptr [[TMP36]], i32 8
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP50]], i32 8, <4 x i1> [[TMP46]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP50]], i32 8, <4 x i1> [[TMP46]])
	; AVX1-NEXT: [[TMP51:%.*]] = getelementptr double, ptr [[TMP36]], i32 12			; AVX1-NEXT: [[TMP51:%.*]] = getelementptr double, ptr [[TMP36]], i32 12
	; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP51]], i32 8, <4 x i1> [[TMP47]])			; AVX1-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP51]], i32 8, <4 x i1> [[TMP47]])
	; AVX1-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16			; AVX1-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
	; AVX1-NEXT: [[TMP52:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX1-NEXT: [[TMP52:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX1-NEXT: br i1 [[TMP52]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]]			; AVX1-NEXT: br i1 [[TMP52]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]
	; AVX1: middle.block:			; AVX1: middle.block:
	; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX1-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX1-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX1: scalar.ph:			; AVX1: scalar.ph:
	; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX1-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX1-NEXT: br label [[FOR_BODY:%.*]]			; AVX1-NEXT: br label [[FOR_BODY:%.*]]
	; AVX1: for.body:			; AVX1: for.body:
	; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX1-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	[9 lines not shown]
	; AVX1-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX1-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX1: if.then:			; AVX1: if.then:
	; AVX1-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, ptr [[OUT]], i64 [[INDVARS_IV]]			; AVX1-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, ptr [[OUT]], i64 [[INDVARS_IV]]
	; AVX1-NEXT: store double 5.000000e-01, ptr [[ARRAYIDX5]], align 8			; AVX1-NEXT: store double 5.000000e-01, ptr [[ARRAYIDX5]], align 8
	; AVX1-NEXT: br label [[FOR_INC]]			; AVX1-NEXT: br label [[FOR_INC]]
	; AVX1: for.inc:			; AVX1: for.inc:
	; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX1-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX1-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]			; AVX1-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP19:![0-9]+]]
	; AVX1: for.end.loopexit:			; AVX1: for.end.loopexit:
	; AVX1-NEXT: br label [[FOR_END]]			; AVX1-NEXT: br label [[FOR_END]]
	; AVX1: for.end:			; AVX1: for.end:
	; AVX1-NEXT: ret void			; AVX1-NEXT: ret void
	;			;
	; AVX2-LABEL: @foo7(			; AVX2-LABEL: @foo7(
	; AVX2-NEXT: entry:			; AVX2-NEXT: entry:
	; AVX2-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0			; AVX2-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0
	[69 lines not shown]
	; AVX2-NEXT: [[TMP49:%.*]] = getelementptr double, ptr [[TMP36]], i32 4			; AVX2-NEXT: [[TMP49:%.*]] = getelementptr double, ptr [[TMP36]], i32 4
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP49]], i32 8, <4 x i1> [[TMP45]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP49]], i32 8, <4 x i1> [[TMP45]])
	; AVX2-NEXT: [[TMP50:%.*]] = getelementptr double, ptr [[TMP36]], i32 8			; AVX2-NEXT: [[TMP50:%.*]] = getelementptr double, ptr [[TMP36]], i32 8
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP50]], i32 8, <4 x i1> [[TMP46]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP50]], i32 8, <4 x i1> [[TMP46]])
	; AVX2-NEXT: [[TMP51:%.*]] = getelementptr double, ptr [[TMP36]], i32 12			; AVX2-NEXT: [[TMP51:%.*]] = getelementptr double, ptr [[TMP36]], i32 12
	; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP51]], i32 8, <4 x i1> [[TMP47]])			; AVX2-NEXT: call void @llvm.masked.store.v4f64.p0(<4 x double> <double 5.000000e-01, double 5.000000e-01, double 5.000000e-01, double 5.000000e-01>, ptr [[TMP51]], i32 8, <4 x i1> [[TMP47]])
	; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16			; AVX2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
	; AVX2-NEXT: [[TMP52:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; AVX2-NEXT: [[TMP52:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; AVX2-NEXT: br i1 [[TMP52]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP27:![0-9]+]]			; AVX2-NEXT: br i1 [[TMP52]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP28:![0-9]+]]
	; AVX2: middle.block:			; AVX2: middle.block:
	; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]			; AVX2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
	; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; AVX2-NEXT: br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; AVX2: scalar.ph:			; AVX2: scalar.ph:
	; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]			; AVX2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[FOR_BODY_PREHEADER]] ]
	; AVX2-NEXT: br label [[FOR_BODY:%.*]]			; AVX2-NEXT: br label [[FOR_BODY:%.*]]
	; AVX2: for.body:			; AVX2: for.body:
	; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]			; AVX2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
	[9 lines not shown]
	; AVX2-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]			; AVX2-NEXT: br i1 [[CMP3]], label [[FOR_INC]], label [[IF_THEN:%.*]]
	; AVX2: if.then:			; AVX2: if.then:
	; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, ptr [[OUT]], i64 [[INDVARS_IV]]			; AVX2-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds double, ptr [[OUT]], i64 [[INDVARS_IV]]
	; AVX2-NEXT: store double 5.000000e-01, ptr [[ARRAYIDX5]], align 8			; AVX2-NEXT: store double 5.000000e-01, ptr [[ARRAYIDX5]], align 8
	; AVX2-NEXT: br label [[FOR_INC]]			; AVX2-NEXT: br label [[FOR_INC]]
	; AVX2: for.inc:			; AVX2: for.inc:
	; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; AVX2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; AVX2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP28:![0-9]+]]			; AVX2-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP29:![0-9]+]]
	; AVX2: for.end.loopexit:			; AVX2: for.end.loopexit:
	; AVX2-NEXT: br label [[FOR_END]]			; AVX2-NEXT: br label [[FOR_END]]
	; AVX2: for.end:			; AVX2: for.end:
	; AVX2-NEXT: ret void			; AVX2-NEXT: ret void
	;			;
	; AVX512-LABEL: @foo7(			; AVX512-LABEL: @foo7(
	; AVX512-NEXT: entry:			; AVX512-NEXT: entry:
	; AVX512-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0			; AVX512-NEXT: [[CMP5:%.*]] = icmp eq i32 [[SIZE:%.*]], 0
	[527 lines not shown]

llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll

	[1,205 lines not shown]
	; O1VEC2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; O1VEC2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; O1VEC2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDVARS_IV]]			; O1VEC2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDVARS_IV]]
	; O1VEC2-NEXT: [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; O1VEC2-NEXT: [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; O1VEC2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], [[N]]			; O1VEC2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], [[N]]
	; O1VEC2-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; O1VEC2-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; O1VEC2-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX2]], align 4			; O1VEC2-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX2]], align 4
	; O1VEC2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; O1VEC2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; O1VEC2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 64			; O1VEC2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 64
	; O1VEC2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; O1VEC2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; O1VEC2: for.end:			; O1VEC2: for.end:
	; O1VEC2-NEXT: [[TMP8:%.*]] = load i32, ptr [[A]], align 4			; O1VEC2-NEXT: [[TMP8:%.*]] = load i32, ptr [[A]], align 4
	; O1VEC2-NEXT: ret i32 [[TMP8]]			; O1VEC2-NEXT: ret i32 [[TMP8]]
	;			;
	; OzVEC2-LABEL: @nopragma(			; OzVEC2-LABEL: @nopragma(
	; OzVEC2-NEXT: entry:			; OzVEC2-NEXT: entry:
	; OzVEC2-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]			; OzVEC2-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
	; OzVEC2: vector.ph:			; OzVEC2: vector.ph:
	[23 lines not shown]
	; OzVEC2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; OzVEC2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; OzVEC2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDVARS_IV]]			; OzVEC2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDVARS_IV]]
	; OzVEC2-NEXT: [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; OzVEC2-NEXT: [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; OzVEC2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], [[N]]			; OzVEC2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], [[N]]
	; OzVEC2-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; OzVEC2-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; OzVEC2-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX2]], align 4			; OzVEC2-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX2]], align 4
	; OzVEC2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; OzVEC2-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; OzVEC2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 64			; OzVEC2-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 64
	; OzVEC2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; OzVEC2-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; OzVEC2: for.end:			; OzVEC2: for.end:
	; OzVEC2-NEXT: [[TMP8:%.*]] = load i32, ptr [[A]], align 4			; OzVEC2-NEXT: [[TMP8:%.*]] = load i32, ptr [[A]], align 4
	; OzVEC2-NEXT: ret i32 [[TMP8]]			; OzVEC2-NEXT: ret i32 [[TMP8]]
	;			;
	; O3DIS-LABEL: @nopragma(			; O3DIS-LABEL: @nopragma(
	; O3DIS-NEXT: entry:			; O3DIS-NEXT: entry:
	; O3DIS-NEXT: br label [[FOR_BODY:%.*]]			; O3DIS-NEXT: br label [[FOR_BODY:%.*]]
	; O3DIS: for.body:			; O3DIS: for.body:
	[257 lines not shown]

llvm/test/Transforms/LoopVectorize/X86/tail_loop_folding.ll

	[43 lines not shown]
	; CHECK-NEXT: [[TMP10:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; CHECK-NEXT: [[TMP10:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[TMP11:%.*]] = load i32, ptr [[ARRAYIDX2]], align 4			; CHECK-NEXT: [[TMP11:%.*]] = load i32, ptr [[ARRAYIDX2]], align 4
	; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP11]], [[TMP10]]			; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP11]], [[TMP10]]
	; CHECK-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX4]], align 4			; CHECK-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX4]], align 4
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 430			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 430
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.cond.cleanup:			for.cond.cleanup:
	ret void			ret void

	for.body:			for.body:
	[181 lines not shown]

llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll

	[25 lines not shown]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 4096, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 4096, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[LOAD:%.*]] = load i32, ptr [[ADDR]], align 4			; CHECK-NEXT: [[LOAD:%.*]] = load i32, ptr [[ADDR]], align 4
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV]], 4096			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV]], 4096
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; CHECK: loopexit:			; CHECK: loopexit:
	; CHECK-NEXT: [[LOAD_LCSSA:%.*]] = phi i32 [ [[LOAD]], [[FOR_BODY]] ], [ [[TMP0]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[LOAD_LCSSA:%.*]] = phi i32 [ [[LOAD]], [[FOR_BODY]] ], [ [[TMP0]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret i32 [[LOAD_LCSSA]]			; CHECK-NEXT: ret i32 [[LOAD_LCSSA]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	[565 lines not shown]

llvm/test/Transforms/LoopVectorize/followup.ll

	[30 lines not shown]

	; CHECK-LABEL: @followup(			; CHECK-LABEL: @followup(

	; CHECK-LABEL: vector.body:			; CHECK-LABEL: vector.body:
	; CHECK: br i1 %{{[0-9]*}}, label %middle.block, label %vector.body, !llvm.loop ![[LOOP_VECTOR:[0-9]+]]			; CHECK: br i1 %{{[0-9]*}}, label %middle.block, label %vector.body, !llvm.loop ![[LOOP_VECTOR:[0-9]+]]
	; CHECK-LABEL: for.body:			; CHECK-LABEL: for.body:
	; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body, !llvm.loop ![[LOOP_EPILOGUE:[0-9]+]]			; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body, !llvm.loop ![[LOOP_EPILOGUE:[0-9]+]]

	; CHECK: ![[LOOP_VECTOR]] = distinct !{![[LOOP_VECTOR]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_VECTORIZED:[0-9]+]]}			; CHECK: ![[LOOP_VECTOR]] = distinct !{![[LOOP_VECTOR]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_VECTORIZED:[0-9]+]], ![[RT_UNROLL_DIS:[0-9]+]]}
	; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}			; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
	; CHECK: ![[FOLLOWUP_VECTORIZED:[0-9]+]] = !{!"FollowupVectorized"}			; CHECK: ![[FOLLOWUP_VECTORIZED:[0-9]+]] = !{!"FollowupVectorized"}
				; CHECK: ![[RT_UNROLL_DIS]] = !{!"llvm.loop.unroll.runtime.disable"}
	; CHECK: ![[LOOP_EPILOGUE]] = distinct !{![[LOOP_EPILOGUE]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_EPILOGUE:[0-9]+]]}			; CHECK: ![[LOOP_EPILOGUE]] = distinct !{![[LOOP_EPILOGUE]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_EPILOGUE:[0-9]+]]}
	; CHECK: ![[FOLLOWUP_EPILOGUE]] = !{!"FollowupEpilogue"}			; CHECK: ![[FOLLOWUP_EPILOGUE]] = !{!"FollowupEpilogue"}
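
For reference, the updated CHECK lines in followup.ll match loop metadata of roughly the following shape. This is an illustrative fragment only: the node numbers !0 through !5 are arbitrary placeholders (the test matches them with [0-9]+ patterns), and the strings are the ones that appear in the CHECK lines above.

```llvm
; Illustrative only: node numbers are placeholders, not taken from real output.
; Loop ID of the vectorized loop, now carrying a fourth operand.
!0 = distinct !{!0, !1, !2, !3}
!1 = !{!"FollowupAll"}
!2 = !{!"FollowupVectorized"}
; New operand: asks later passes not to runtime-unroll the vector loop.
!3 = !{!"llvm.loop.unroll.runtime.disable"}
; Loop ID of the scalar epilogue loop; its CHECK line is unchanged by this patch.
!4 = distinct !{!4, !1, !5}
!5 = !{!"FollowupEpilogue"}
```

The extra operand on the vector loop's ID is also why the numbered [[LOOPn]] captures in the other test files each shift up by one.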

llvm/test/Transforms/LoopVectorize/if-pred-non-void.ll

	[159 lines not shown]
	; CHECK-NEXT: [[YSR_0:%.*]] = phi i32 [ [[RSR]], [[IF_THEN]] ], [ [[PSR]], [[FOR_BODY]] ]			; CHECK-NEXT: [[YSR_0:%.*]] = phi i32 [ [[RSR]], [[IF_THEN]] ], [ [[PSR]], [[FOR_BODY]] ]
	; CHECK-NEXT: [[YUR_0:%.*]] = phi i32 [ [[RUR]], [[IF_THEN]] ], [ [[PUR]], [[FOR_BODY]] ]			; CHECK-NEXT: [[YUR_0:%.*]] = phi i32 [ [[RUR]], [[IF_THEN]] ], [ [[PUR]], [[FOR_BODY]] ]
	; CHECK-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4			; CHECK-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4
	; CHECK-NEXT: store i32 [[YUD_0]], ptr [[IUD]], align 4			; CHECK-NEXT: store i32 [[YUD_0]], ptr [[IUD]], align 4
	; CHECK-NEXT: store i32 [[YSR_0]], ptr [[ISR]], align 4			; CHECK-NEXT: store i32 [[YSR_0]], ptr [[ISR]], align 4
	; CHECK-NEXT: store i32 [[YUR_0]], ptr [[IUR]], align 4			; CHECK-NEXT: store i32 [[YUR_0]], ptr [[IUR]], align 4
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP19:![0-9]+]]
	;			;
	; UNROLL-NO-VF-LABEL: @test(			; UNROLL-NO-VF-LABEL: @test(
	; UNROLL-NO-VF-NEXT: entry:			; UNROLL-NO-VF-NEXT: entry:
	; UNROLL-NO-VF-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]			; UNROLL-NO-VF-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
	; UNROLL-NO-VF: vector.memcheck:			; UNROLL-NO-VF: vector.memcheck:
	; UNROLL-NO-VF-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[ASD:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[ASD:%.*]], i64 512
	; UNROLL-NO-VF-NEXT: [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[AUD:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[AUD:%.*]], i64 512
	; UNROLL-NO-VF-NEXT: [[UGLYGEP2:%.*]] = getelementptr i8, ptr [[ASR:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP2:%.*]] = getelementptr i8, ptr [[ASR:%.*]], i64 512
	[135 lines not shown]
	; UNROLL-NO-VF-NEXT: [[YSR_0:%.*]] = phi i32 [ [[RSR]], [[IF_THEN]] ], [ [[PSR]], [[FOR_BODY]] ]			; UNROLL-NO-VF-NEXT: [[YSR_0:%.*]] = phi i32 [ [[RSR]], [[IF_THEN]] ], [ [[PSR]], [[FOR_BODY]] ]
	; UNROLL-NO-VF-NEXT: [[YUR_0:%.*]] = phi i32 [ [[RUR]], [[IF_THEN]] ], [ [[PUR]], [[FOR_BODY]] ]			; UNROLL-NO-VF-NEXT: [[YUR_0:%.*]] = phi i32 [ [[RUR]], [[IF_THEN]] ], [ [[PUR]], [[FOR_BODY]] ]
	; UNROLL-NO-VF-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4			; UNROLL-NO-VF-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4
	; UNROLL-NO-VF-NEXT: store i32 [[YUD_0]], ptr [[IUD]], align 4			; UNROLL-NO-VF-NEXT: store i32 [[YUD_0]], ptr [[IUD]], align 4
	; UNROLL-NO-VF-NEXT: store i32 [[YSR_0]], ptr [[ISR]], align 4			; UNROLL-NO-VF-NEXT: store i32 [[YSR_0]], ptr [[ISR]], align 4
	; UNROLL-NO-VF-NEXT: store i32 [[YUR_0]], ptr [[IUR]], align 4			; UNROLL-NO-VF-NEXT: store i32 [[YUR_0]], ptr [[IUR]], align 4
	; UNROLL-NO-VF-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; UNROLL-NO-VF-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; UNROLL-NO-VF-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128			; UNROLL-NO-VF-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128
	; UNROLL-NO-VF-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP19:![0-9]+]]
	;			;
	ptr nocapture %asr, ptr nocapture %aur) {			ptr nocapture %asr, ptr nocapture %aur) {
	entry:			entry:
	br label %for.body			br label %for.body

	for.cond.cleanup: ; preds = %if.end			for.cond.cleanup: ; preds = %if.end
	ret void			ret void

	; CHECK-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]			; CHECK-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]
	; CHECK-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]			; CHECK-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
	; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH:%.*]], label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH:%.*]], label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE4:%.*]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE4:%.*]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]
	; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0			; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP2]], align 4, !alias.scope !19, !noalias !22			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP2]], align 4, !alias.scope !20, !noalias !23
	; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]			; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TMP3]], i32 0			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TMP3]], i32 0
	; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <2 x i32>, ptr [[TMP4]], align 4, !alias.scope !22			; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <2 x i32>, ptr [[TMP4]], align 4, !alias.scope !23
	; CHECK-NEXT: [[TMP5:%.*]] = add nsw <2 x i32> [[WIDE_LOAD]], <i32 23, i32 23>			; CHECK-NEXT: [[TMP5:%.*]] = add nsw <2 x i32> [[WIDE_LOAD]], <i32 23, i32 23>
	; CHECK-NEXT: [[TMP6:%.*]] = icmp slt <2 x i32> [[WIDE_LOAD]], <i32 100, i32 100>			; CHECK-NEXT: [[TMP6:%.*]] = icmp slt <2 x i32> [[WIDE_LOAD]], <i32 100, i32 100>
	; CHECK-NEXT: [[TMP7:%.*]] = extractelement <2 x i1> [[TMP6]], i32 0			; CHECK-NEXT: [[TMP7:%.*]] = extractelement <2 x i1> [[TMP6]], i32 0
	; CHECK-NEXT: br i1 [[TMP7]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]			; CHECK-NEXT: br i1 [[TMP7]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]
	; CHECK: pred.sdiv.if:			; CHECK: pred.sdiv.if:
	; CHECK-NEXT: [[TMP8:%.*]] = extractelement <2 x i32> [[TMP5]], i32 0			; CHECK-NEXT: [[TMP8:%.*]] = extractelement <2 x i32> [[TMP5]], i32 0
	; CHECK-NEXT: [[TMP9:%.*]] = extractelement <2 x i32> [[WIDE_LOAD]], i32 0			; CHECK-NEXT: [[TMP9:%.*]] = extractelement <2 x i32> [[WIDE_LOAD]], i32 0
	; CHECK-NEXT: [[TMP10:%.*]] = sdiv i32 [[TMP8]], [[TMP9]]			; CHECK-NEXT: [[TMP10:%.*]] = sdiv i32 [[TMP8]], [[TMP9]]
	; CHECK-NEXT: [[TMP22:%.*]] = insertelement <2 x i32> [[TMP15]], i32 [[TMP21]], i32 1			; CHECK-NEXT: [[TMP22:%.*]] = insertelement <2 x i32> [[TMP15]], i32 [[TMP21]], i32 1
	; CHECK-NEXT: br label [[PRED_SDIV_CONTINUE4]]			; CHECK-NEXT: br label [[PRED_SDIV_CONTINUE4]]
	; CHECK: pred.sdiv.continue4:			; CHECK: pred.sdiv.continue4:
	; CHECK-NEXT: [[TMP23:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP19]], [[PRED_SDIV_IF3]] ]			; CHECK-NEXT: [[TMP23:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP19]], [[PRED_SDIV_IF3]] ]
	; CHECK-NEXT: [[TMP24:%.*]] = phi <2 x i32> [ [[TMP15]], [[PRED_SDIV_CONTINUE]] ], [ [[TMP22]], [[PRED_SDIV_IF3]] ]			; CHECK-NEXT: [[TMP24:%.*]] = phi <2 x i32> [ [[TMP15]], [[PRED_SDIV_CONTINUE]] ], [ [[TMP22]], [[PRED_SDIV_IF3]] ]
	; CHECK-NEXT: [[TMP25:%.*]] = xor <2 x i1> [[TMP6]], <i1 true, i1 true>			; CHECK-NEXT: [[TMP25:%.*]] = xor <2 x i1> [[TMP6]], <i1 true, i1 true>
	; CHECK-NEXT: [[PREDPHI:%.*]] = select <2 x i1> [[TMP25]], <2 x i32> [[TMP5]], <2 x i32> [[TMP24]]			; CHECK-NEXT: [[PREDPHI:%.*]] = select <2 x i1> [[TMP25]], <2 x i32> [[TMP5]], <2 x i32> [[TMP24]]
	; CHECK-NEXT: [[TMP26:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0			; CHECK-NEXT: [[TMP26:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0
	; CHECK-NEXT: store <2 x i32> [[PREDPHI]], ptr [[TMP26]], align 4, !alias.scope !19, !noalias !22			; CHECK-NEXT: store <2 x i32> [[PREDPHI]], ptr [[TMP26]], align 4, !alias.scope !20, !noalias !23
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP27:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128			; CHECK-NEXT: [[TMP27:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
	; CHECK-NEXT: br i1 [[TMP27]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP24:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP27]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.cond.cleanup:			; CHECK: for.cond.cleanup:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]			; CHECK-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]
	; CHECK-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]			; CHECK-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]
	; CHECK-NEXT: br label [[IF_END]]			; CHECK-NEXT: br label [[IF_END]]
	; CHECK: if.end:			; CHECK: if.end:
	; CHECK-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[FOR_BODY]] ]			; CHECK-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[FOR_BODY]] ]
	; CHECK-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4			; CHECK-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]
	;			;
	; UNROLL-NO-VF-LABEL: @test_scalar2scalar(			; UNROLL-NO-VF-LABEL: @test_scalar2scalar(
	; UNROLL-NO-VF-NEXT: entry:			; UNROLL-NO-VF-NEXT: entry:
	; UNROLL-NO-VF-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]			; UNROLL-NO-VF-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
	; UNROLL-NO-VF: vector.memcheck:			; UNROLL-NO-VF: vector.memcheck:
	; UNROLL-NO-VF-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[ASD:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[ASD:%.*]], i64 512
	; UNROLL-NO-VF-NEXT: [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[BSD:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[BSD:%.*]], i64 512
	; UNROLL-NO-VF-NEXT: [[BOUND0:%.*]] = icmp ult ptr [[ASD]], [[UGLYGEP1]]			; UNROLL-NO-VF-NEXT: [[BOUND0:%.*]] = icmp ult ptr [[ASD]], [[UGLYGEP1]]
	; UNROLL-NO-VF-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]			; UNROLL-NO-VF-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]
	; UNROLL-NO-VF-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]			; UNROLL-NO-VF-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
	; UNROLL-NO-VF-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]			; UNROLL-NO-VF-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
	; UNROLL-NO-VF: vector.ph:			; UNROLL-NO-VF: vector.ph:
	; UNROLL-NO-VF-NEXT: br label [[VECTOR_BODY:%.*]]			; UNROLL-NO-VF-NEXT: br label [[VECTOR_BODY:%.*]]
	; UNROLL-NO-VF: vector.body:			; UNROLL-NO-VF: vector.body:
	; UNROLL-NO-VF-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE3:%.*]] ]			; UNROLL-NO-VF-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE3:%.*]] ]
	; UNROLL-NO-VF-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0			; UNROLL-NO-VF-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; UNROLL-NO-VF-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 1			; UNROLL-NO-VF-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 1
	; UNROLL-NO-VF-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]			; UNROLL-NO-VF-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]
	; UNROLL-NO-VF-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP1]]			; UNROLL-NO-VF-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP1]]
	; UNROLL-NO-VF-NEXT: [[TMP4:%.*]] = load i32, ptr [[TMP2]], align 4, !alias.scope !19, !noalias !22			; UNROLL-NO-VF-NEXT: [[TMP4:%.*]] = load i32, ptr [[TMP2]], align 4, !alias.scope !20, !noalias !23
	; UNROLL-NO-VF-NEXT: [[TMP5:%.*]] = load i32, ptr [[TMP3]], align 4, !alias.scope !19, !noalias !22			; UNROLL-NO-VF-NEXT: [[TMP5:%.*]] = load i32, ptr [[TMP3]], align 4, !alias.scope !20, !noalias !23
	; UNROLL-NO-VF-NEXT: [[TMP6:%.*]] = add nsw i32 [[TMP4]], 23			; UNROLL-NO-VF-NEXT: [[TMP6:%.*]] = add nsw i32 [[TMP4]], 23
	; UNROLL-NO-VF-NEXT: [[TMP7:%.*]] = add nsw i32 [[TMP5]], 23			; UNROLL-NO-VF-NEXT: [[TMP7:%.*]] = add nsw i32 [[TMP5]], 23
	; UNROLL-NO-VF-NEXT: [[TMP8:%.*]] = icmp slt i32 [[TMP4]], 100			; UNROLL-NO-VF-NEXT: [[TMP8:%.*]] = icmp slt i32 [[TMP4]], 100
	; UNROLL-NO-VF-NEXT: [[TMP9:%.*]] = icmp slt i32 [[TMP5]], 100			; UNROLL-NO-VF-NEXT: [[TMP9:%.*]] = icmp slt i32 [[TMP5]], 100
	; UNROLL-NO-VF-NEXT: br i1 [[TMP8]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP8]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]
	; UNROLL-NO-VF: pred.sdiv.if:			; UNROLL-NO-VF: pred.sdiv.if:
	; UNROLL-NO-VF-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]			; UNROLL-NO-VF-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]
	; UNROLL-NO-VF-NEXT: [[TMP11:%.*]] = load i32, ptr [[TMP10]], align 4, !alias.scope !22			; UNROLL-NO-VF-NEXT: [[TMP11:%.*]] = load i32, ptr [[TMP10]], align 4, !alias.scope !23
	; UNROLL-NO-VF-NEXT: [[TMP12:%.*]] = sdiv i32 [[TMP6]], [[TMP4]]			; UNROLL-NO-VF-NEXT: [[TMP12:%.*]] = sdiv i32 [[TMP6]], [[TMP4]]
	; UNROLL-NO-VF-NEXT: [[TMP13:%.*]] = sdiv i32 [[TMP11]], [[TMP12]]			; UNROLL-NO-VF-NEXT: [[TMP13:%.*]] = sdiv i32 [[TMP11]], [[TMP12]]
	; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE]]			; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE]]
	; UNROLL-NO-VF: pred.sdiv.continue:			; UNROLL-NO-VF: pred.sdiv.continue:
	; UNROLL-NO-VF-NEXT: [[TMP14:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP12]], [[PRED_SDIV_IF]] ]			; UNROLL-NO-VF-NEXT: [[TMP14:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP12]], [[PRED_SDIV_IF]] ]
	; UNROLL-NO-VF-NEXT: [[TMP15:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP13]], [[PRED_SDIV_IF]] ]			; UNROLL-NO-VF-NEXT: [[TMP15:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP13]], [[PRED_SDIV_IF]] ]
	; UNROLL-NO-VF-NEXT: br i1 [[TMP9]], label [[PRED_SDIV_IF2:%.*]], label [[PRED_SDIV_CONTINUE3]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP9]], label [[PRED_SDIV_IF2:%.*]], label [[PRED_SDIV_CONTINUE3]]
	; UNROLL-NO-VF: pred.sdiv.if2:			; UNROLL-NO-VF: pred.sdiv.if2:
	; UNROLL-NO-VF-NEXT: [[TMP16:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP1]]			; UNROLL-NO-VF-NEXT: [[TMP16:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP1]]
	; UNROLL-NO-VF-NEXT: [[TMP17:%.*]] = load i32, ptr [[TMP16]], align 4, !alias.scope !22			; UNROLL-NO-VF-NEXT: [[TMP17:%.*]] = load i32, ptr [[TMP16]], align 4, !alias.scope !23
	; UNROLL-NO-VF-NEXT: [[TMP18:%.*]] = sdiv i32 [[TMP7]], [[TMP5]]			; UNROLL-NO-VF-NEXT: [[TMP18:%.*]] = sdiv i32 [[TMP7]], [[TMP5]]
	; UNROLL-NO-VF-NEXT: [[TMP19:%.*]] = sdiv i32 [[TMP17]], [[TMP18]]			; UNROLL-NO-VF-NEXT: [[TMP19:%.*]] = sdiv i32 [[TMP17]], [[TMP18]]
	; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE3]]			; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE3]]
	; UNROLL-NO-VF: pred.sdiv.continue3:			; UNROLL-NO-VF: pred.sdiv.continue3:
	; UNROLL-NO-VF-NEXT: [[TMP20:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP18]], [[PRED_SDIV_IF2]] ]			; UNROLL-NO-VF-NEXT: [[TMP20:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP18]], [[PRED_SDIV_IF2]] ]
	; UNROLL-NO-VF-NEXT: [[TMP21:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP19]], [[PRED_SDIV_IF2]] ]			; UNROLL-NO-VF-NEXT: [[TMP21:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP19]], [[PRED_SDIV_IF2]] ]
	; UNROLL-NO-VF-NEXT: [[TMP22:%.*]] = xor i1 [[TMP8]], true			; UNROLL-NO-VF-NEXT: [[TMP22:%.*]] = xor i1 [[TMP8]], true
	; UNROLL-NO-VF-NEXT: [[TMP23:%.*]] = xor i1 [[TMP9]], true			; UNROLL-NO-VF-NEXT: [[TMP23:%.*]] = xor i1 [[TMP9]], true
	; UNROLL-NO-VF-NEXT: [[PREDPHI:%.*]] = select i1 [[TMP22]], i32 [[TMP6]], i32 [[TMP15]]			; UNROLL-NO-VF-NEXT: [[PREDPHI:%.*]] = select i1 [[TMP22]], i32 [[TMP6]], i32 [[TMP15]]
	; UNROLL-NO-VF-NEXT: [[PREDPHI4:%.*]] = select i1 [[TMP23]], i32 [[TMP7]], i32 [[TMP21]]			; UNROLL-NO-VF-NEXT: [[PREDPHI4:%.*]] = select i1 [[TMP23]], i32 [[TMP7]], i32 [[TMP21]]
	; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI]], ptr [[TMP2]], align 4, !alias.scope !19, !noalias !22			; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI]], ptr [[TMP2]], align 4, !alias.scope !20, !noalias !23
	; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI4]], ptr [[TMP3]], align 4, !alias.scope !19, !noalias !22			; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI4]], ptr [[TMP3]], align 4, !alias.scope !20, !noalias !23
	; UNROLL-NO-VF-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; UNROLL-NO-VF-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; UNROLL-NO-VF-NEXT: [[TMP24:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128			; UNROLL-NO-VF-NEXT: [[TMP24:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
	; UNROLL-NO-VF-NEXT: br i1 [[TMP24]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP24:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP24]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]
	; UNROLL-NO-VF: middle.block:			; UNROLL-NO-VF: middle.block:
	; UNROLL-NO-VF-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128			; UNROLL-NO-VF-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128
	; UNROLL-NO-VF-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]			; UNROLL-NO-VF-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
	; UNROLL-NO-VF: scalar.ph:			; UNROLL-NO-VF: scalar.ph:
	; UNROLL-NO-VF-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; UNROLL-NO-VF-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; UNROLL-NO-VF-NEXT: br label [[FOR_BODY:%.*]]			; UNROLL-NO-VF-NEXT: br label [[FOR_BODY:%.*]]
	; UNROLL-NO-VF: for.cond.cleanup:			; UNROLL-NO-VF: for.cond.cleanup:
	; UNROLL-NO-VF-NEXT: ret void			; UNROLL-NO-VF-NEXT: ret void
	; UNROLL-NO-VF-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]			; UNROLL-NO-VF-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]
	; UNROLL-NO-VF-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]			; UNROLL-NO-VF-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]
	; UNROLL-NO-VF-NEXT: br label [[IF_END]]			; UNROLL-NO-VF-NEXT: br label [[IF_END]]
	; UNROLL-NO-VF: if.end:			; UNROLL-NO-VF: if.end:
	; UNROLL-NO-VF-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[FOR_BODY]] ]			; UNROLL-NO-VF-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[FOR_BODY]] ]
	; UNROLL-NO-VF-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4			; UNROLL-NO-VF-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4
	; UNROLL-NO-VF-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; UNROLL-NO-VF-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; UNROLL-NO-VF-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128			; UNROLL-NO-VF-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128
	; UNROLL-NO-VF-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.cond.cleanup: ; preds = %if.end			for.cond.cleanup: ; preds = %if.end
	ret void			ret void


	; CHECK-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]			; CHECK-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]
	; CHECK-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]			; CHECK-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
	; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH:%.*]], label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH:%.*]], label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE4:%.*]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE4:%.*]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]
	; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0			; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP2]], align 4, !alias.scope !28, !noalias !31			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP2]], align 4, !alias.scope !29, !noalias !32
	; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]			; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TMP3]], i32 0			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TMP3]], i32 0
	; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <2 x i32>, ptr [[TMP4]], align 4, !alias.scope !31			; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <2 x i32>, ptr [[TMP4]], align 4, !alias.scope !32
	; CHECK-NEXT: [[TMP5:%.*]] = add nsw <2 x i32> [[WIDE_LOAD]], <i32 23, i32 23>			; CHECK-NEXT: [[TMP5:%.*]] = add nsw <2 x i32> [[WIDE_LOAD]], <i32 23, i32 23>
	; CHECK-NEXT: [[TMP6:%.*]] = icmp slt <2 x i32> [[WIDE_LOAD]], <i32 100, i32 100>			; CHECK-NEXT: [[TMP6:%.*]] = icmp slt <2 x i32> [[WIDE_LOAD]], <i32 100, i32 100>
	; CHECK-NEXT: [[TMP7:%.*]] = icmp sge <2 x i32> [[WIDE_LOAD]], <i32 200, i32 200>			; CHECK-NEXT: [[TMP7:%.*]] = icmp sge <2 x i32> [[WIDE_LOAD]], <i32 200, i32 200>
	; CHECK-NEXT: [[TMP8:%.*]] = xor <2 x i1> [[TMP6]], <i1 true, i1 true>, !dbg [[DBG33:![0-9]+]]			; CHECK-NEXT: [[TMP8:%.*]] = xor <2 x i1> [[TMP6]], <i1 true, i1 true>, !dbg [[DBG34:![0-9]+]]
	; CHECK-NEXT: [[TMP9:%.*]] = select <2 x i1> [[TMP8]], <2 x i1> [[TMP7]], <2 x i1> zeroinitializer, !dbg [[DBG34:![0-9]+]]			; CHECK-NEXT: [[TMP9:%.*]] = select <2 x i1> [[TMP8]], <2 x i1> [[TMP7]], <2 x i1> zeroinitializer, !dbg [[DBG35:![0-9]+]]
	; CHECK-NEXT: [[TMP10:%.*]] = or <2 x i1> [[TMP9]], [[TMP6]]			; CHECK-NEXT: [[TMP10:%.*]] = or <2 x i1> [[TMP9]], [[TMP6]]
	; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i1> [[TMP10]], i32 0			; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i1> [[TMP10]], i32 0
	; CHECK-NEXT: br i1 [[TMP11]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]			; CHECK-NEXT: br i1 [[TMP11]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]
	; CHECK: pred.sdiv.if:			; CHECK: pred.sdiv.if:
	; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i32> [[TMP5]], i32 0			; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i32> [[TMP5]], i32 0
	; CHECK-NEXT: [[TMP13:%.*]] = extractelement <2 x i32> [[WIDE_LOAD]], i32 0			; CHECK-NEXT: [[TMP13:%.*]] = extractelement <2 x i32> [[WIDE_LOAD]], i32 0
	; CHECK-NEXT: [[TMP14:%.*]] = sdiv i32 [[TMP12]], [[TMP13]]			; CHECK-NEXT: [[TMP14:%.*]] = sdiv i32 [[TMP12]], [[TMP13]]
	; CHECK-NEXT: [[TMP15:%.*]] = extractelement <2 x i32> [[WIDE_LOAD2]], i32 0			; CHECK-NEXT: [[TMP15:%.*]] = extractelement <2 x i32> [[WIDE_LOAD2]], i32 0
	; CHECK-NEXT: [[TMP23:%.*]] = sdiv i32 [[TMP21]], [[TMP22]]			; CHECK-NEXT: [[TMP23:%.*]] = sdiv i32 [[TMP21]], [[TMP22]]
	; CHECK-NEXT: [[TMP24:%.*]] = extractelement <2 x i32> [[WIDE_LOAD2]], i32 1			; CHECK-NEXT: [[TMP24:%.*]] = extractelement <2 x i32> [[WIDE_LOAD2]], i32 1
	; CHECK-NEXT: [[TMP25:%.*]] = sdiv i32 [[TMP24]], [[TMP23]]			; CHECK-NEXT: [[TMP25:%.*]] = sdiv i32 [[TMP24]], [[TMP23]]
	; CHECK-NEXT: [[TMP26:%.*]] = insertelement <2 x i32> [[TMP19]], i32 [[TMP25]], i32 1			; CHECK-NEXT: [[TMP26:%.*]] = insertelement <2 x i32> [[TMP19]], i32 [[TMP25]], i32 1
	; CHECK-NEXT: br label [[PRED_SDIV_CONTINUE4]]			; CHECK-NEXT: br label [[PRED_SDIV_CONTINUE4]]
	; CHECK: pred.sdiv.continue4:			; CHECK: pred.sdiv.continue4:
	; CHECK-NEXT: [[TMP27:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP23]], [[PRED_SDIV_IF3]] ]			; CHECK-NEXT: [[TMP27:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP23]], [[PRED_SDIV_IF3]] ]
	; CHECK-NEXT: [[TMP28:%.*]] = phi <2 x i32> [ [[TMP19]], [[PRED_SDIV_CONTINUE]] ], [ [[TMP26]], [[PRED_SDIV_IF3]] ]			; CHECK-NEXT: [[TMP28:%.*]] = phi <2 x i32> [ [[TMP19]], [[PRED_SDIV_CONTINUE]] ], [ [[TMP26]], [[PRED_SDIV_IF3]] ]
	; CHECK-NEXT: [[TMP29:%.*]] = xor <2 x i1> [[TMP7]], <i1 true, i1 true>, !dbg [[DBG34]]			; CHECK-NEXT: [[TMP29:%.*]] = xor <2 x i1> [[TMP7]], <i1 true, i1 true>, !dbg [[DBG35]]
	; CHECK-NEXT: [[TMP30:%.*]] = select <2 x i1> [[TMP8]], <2 x i1> [[TMP29]], <2 x i1> zeroinitializer, !dbg [[DBG34]]			; CHECK-NEXT: [[TMP30:%.*]] = select <2 x i1> [[TMP8]], <2 x i1> [[TMP29]], <2 x i1> zeroinitializer, !dbg [[DBG35]]
	; CHECK-NEXT: [[PREDPHI:%.*]] = select <2 x i1> [[TMP30]], <2 x i32> [[TMP5]], <2 x i32> [[TMP28]]			; CHECK-NEXT: [[PREDPHI:%.*]] = select <2 x i1> [[TMP30]], <2 x i32> [[TMP5]], <2 x i32> [[TMP28]]
	; CHECK-NEXT: [[TMP31:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0			; CHECK-NEXT: [[TMP31:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0
	; CHECK-NEXT: store <2 x i32> [[PREDPHI]], ptr [[TMP31]], align 4, !alias.scope !28, !noalias !31			; CHECK-NEXT: store <2 x i32> [[PREDPHI]], ptr [[TMP31]], align 4, !alias.scope !29, !noalias !32
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP32:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128			; CHECK-NEXT: [[TMP32:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
	; CHECK-NEXT: br i1 [[TMP32]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP35:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP32]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP36:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.cond.cleanup:			; CHECK: for.cond.cleanup:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[IF_END:%.*]] ]			; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[IF_END:%.*]] ]
	; CHECK-NEXT: [[ISD:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ISD:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[LSD:%.*]] = load i32, ptr [[ISD]], align 4			; CHECK-NEXT: [[LSD:%.*]] = load i32, ptr [[ISD]], align 4
	; CHECK-NEXT: [[ISD_B:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ISD_B:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[LSD_B:%.*]] = load i32, ptr [[ISD_B]], align 4			; CHECK-NEXT: [[LSD_B:%.*]] = load i32, ptr [[ISD_B]], align 4
	; CHECK-NEXT: [[PSD:%.*]] = add nsw i32 [[LSD]], 23			; CHECK-NEXT: [[PSD:%.*]] = add nsw i32 [[LSD]], 23
	; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i32 [[LSD]], 100			; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i32 [[LSD]], 100
	; CHECK-NEXT: [[CMP2:%.*]] = icmp sge i32 [[LSD]], 200			; CHECK-NEXT: [[CMP2:%.*]] = icmp sge i32 [[LSD]], 200
	; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[CMP1]], [[CMP2]], !dbg [[DBG33]]			; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[CMP1]], [[CMP2]], !dbg [[DBG34]]
	; CHECK-NEXT: br i1 [[OR_COND]], label [[IF_THEN:%.*]], label [[IF_END]], !dbg [[DBG33]]			; CHECK-NEXT: br i1 [[OR_COND]], label [[IF_THEN:%.*]], label [[IF_END]], !dbg [[DBG34]]
	; CHECK: if.then:			; CHECK: if.then:
	; CHECK-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]			; CHECK-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]
	; CHECK-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]			; CHECK-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]
	; CHECK-NEXT: br label [[IF_END]]			; CHECK-NEXT: br label [[IF_END]]
	; CHECK: if.end:			; CHECK: if.end:
	; CHECK-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[FOR_BODY]] ]			; CHECK-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[FOR_BODY]] ]
	; CHECK-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4			; CHECK-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP36:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP37:![0-9]+]]
	;			;
	; UNROLL-NO-VF-LABEL: @pr30172(			; UNROLL-NO-VF-LABEL: @pr30172(
	; UNROLL-NO-VF-NEXT: entry:			; UNROLL-NO-VF-NEXT: entry:
	; UNROLL-NO-VF-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]			; UNROLL-NO-VF-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
	; UNROLL-NO-VF: vector.memcheck:			; UNROLL-NO-VF: vector.memcheck:
	; UNROLL-NO-VF-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[ASD:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[ASD:%.*]], i64 512
	; UNROLL-NO-VF-NEXT: [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[BSD:%.*]], i64 512			; UNROLL-NO-VF-NEXT: [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[BSD:%.*]], i64 512
	; UNROLL-NO-VF-NEXT: [[BOUND0:%.*]] = icmp ult ptr [[ASD]], [[UGLYGEP1]]			; UNROLL-NO-VF-NEXT: [[BOUND0:%.*]] = icmp ult ptr [[ASD]], [[UGLYGEP1]]
	; UNROLL-NO-VF-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]			; UNROLL-NO-VF-NEXT: [[BOUND1:%.*]] = icmp ult ptr [[BSD]], [[UGLYGEP]]
	; UNROLL-NO-VF-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]			; UNROLL-NO-VF-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
	; UNROLL-NO-VF-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]			; UNROLL-NO-VF-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
	; UNROLL-NO-VF: vector.ph:			; UNROLL-NO-VF: vector.ph:
	; UNROLL-NO-VF-NEXT: br label [[VECTOR_BODY:%.*]]			; UNROLL-NO-VF-NEXT: br label [[VECTOR_BODY:%.*]]
	; UNROLL-NO-VF: vector.body:			; UNROLL-NO-VF: vector.body:
	; UNROLL-NO-VF-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE3:%.*]] ]			; UNROLL-NO-VF-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_SDIV_CONTINUE3:%.*]] ]
	; UNROLL-NO-VF-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0			; UNROLL-NO-VF-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; UNROLL-NO-VF-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 1			; UNROLL-NO-VF-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 1
	; UNROLL-NO-VF-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]			; UNROLL-NO-VF-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP0]]
	; UNROLL-NO-VF-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP1]]			; UNROLL-NO-VF-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[TMP1]]
	; UNROLL-NO-VF-NEXT: [[TMP4:%.*]] = load i32, ptr [[TMP2]], align 4, !alias.scope !28, !noalias !31			; UNROLL-NO-VF-NEXT: [[TMP4:%.*]] = load i32, ptr [[TMP2]], align 4, !alias.scope !29, !noalias !32
	; UNROLL-NO-VF-NEXT: [[TMP5:%.*]] = load i32, ptr [[TMP3]], align 4, !alias.scope !28, !noalias !31			; UNROLL-NO-VF-NEXT: [[TMP5:%.*]] = load i32, ptr [[TMP3]], align 4, !alias.scope !29, !noalias !32
	; UNROLL-NO-VF-NEXT: [[TMP6:%.*]] = add nsw i32 [[TMP4]], 23			; UNROLL-NO-VF-NEXT: [[TMP6:%.*]] = add nsw i32 [[TMP4]], 23
	; UNROLL-NO-VF-NEXT: [[TMP7:%.*]] = add nsw i32 [[TMP5]], 23			; UNROLL-NO-VF-NEXT: [[TMP7:%.*]] = add nsw i32 [[TMP5]], 23
	; UNROLL-NO-VF-NEXT: [[TMP8:%.*]] = icmp slt i32 [[TMP4]], 100			; UNROLL-NO-VF-NEXT: [[TMP8:%.*]] = icmp slt i32 [[TMP4]], 100
	; UNROLL-NO-VF-NEXT: [[TMP9:%.*]] = icmp slt i32 [[TMP5]], 100			; UNROLL-NO-VF-NEXT: [[TMP9:%.*]] = icmp slt i32 [[TMP5]], 100
	; UNROLL-NO-VF-NEXT: [[TMP10:%.*]] = icmp sge i32 [[TMP4]], 200			; UNROLL-NO-VF-NEXT: [[TMP10:%.*]] = icmp sge i32 [[TMP4]], 200
	; UNROLL-NO-VF-NEXT: [[TMP11:%.*]] = icmp sge i32 [[TMP5]], 200			; UNROLL-NO-VF-NEXT: [[TMP11:%.*]] = icmp sge i32 [[TMP5]], 200
	; UNROLL-NO-VF-NEXT: [[TMP12:%.*]] = xor i1 [[TMP8]], true, !dbg [[DBG33:![0-9]+]]			; UNROLL-NO-VF-NEXT: [[TMP12:%.*]] = xor i1 [[TMP8]], true, !dbg [[DBG34:![0-9]+]]
	; UNROLL-NO-VF-NEXT: [[TMP13:%.*]] = xor i1 [[TMP9]], true, !dbg [[DBG33]]			; UNROLL-NO-VF-NEXT: [[TMP13:%.*]] = xor i1 [[TMP9]], true, !dbg [[DBG34]]
	; UNROLL-NO-VF-NEXT: [[TMP14:%.*]] = select i1 [[TMP12]], i1 [[TMP10]], i1 false, !dbg [[DBG34:![0-9]+]]			; UNROLL-NO-VF-NEXT: [[TMP14:%.*]] = select i1 [[TMP12]], i1 [[TMP10]], i1 false, !dbg [[DBG35:![0-9]+]]
	; UNROLL-NO-VF-NEXT: [[TMP15:%.*]] = select i1 [[TMP13]], i1 [[TMP11]], i1 false, !dbg [[DBG34]]			; UNROLL-NO-VF-NEXT: [[TMP15:%.*]] = select i1 [[TMP13]], i1 [[TMP11]], i1 false, !dbg [[DBG35]]
	; UNROLL-NO-VF-NEXT: [[TMP16:%.*]] = or i1 [[TMP14]], [[TMP8]]			; UNROLL-NO-VF-NEXT: [[TMP16:%.*]] = or i1 [[TMP14]], [[TMP8]]
	; UNROLL-NO-VF-NEXT: [[TMP17:%.*]] = or i1 [[TMP15]], [[TMP9]]			; UNROLL-NO-VF-NEXT: [[TMP17:%.*]] = or i1 [[TMP15]], [[TMP9]]
	; UNROLL-NO-VF-NEXT: br i1 [[TMP16]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP16]], label [[PRED_SDIV_IF:%.*]], label [[PRED_SDIV_CONTINUE:%.*]]
	; UNROLL-NO-VF: pred.sdiv.if:			; UNROLL-NO-VF: pred.sdiv.if:
	; UNROLL-NO-VF-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]			; UNROLL-NO-VF-NEXT: [[TMP18:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP0]]
	; UNROLL-NO-VF-NEXT: [[TMP19:%.*]] = load i32, ptr [[TMP18]], align 4, !alias.scope !31			; UNROLL-NO-VF-NEXT: [[TMP19:%.*]] = load i32, ptr [[TMP18]], align 4, !alias.scope !32
	; UNROLL-NO-VF-NEXT: [[TMP20:%.*]] = sdiv i32 [[TMP6]], [[TMP4]]			; UNROLL-NO-VF-NEXT: [[TMP20:%.*]] = sdiv i32 [[TMP6]], [[TMP4]]
	; UNROLL-NO-VF-NEXT: [[TMP21:%.*]] = sdiv i32 [[TMP19]], [[TMP20]]			; UNROLL-NO-VF-NEXT: [[TMP21:%.*]] = sdiv i32 [[TMP19]], [[TMP20]]
	; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE]]			; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE]]
	; UNROLL-NO-VF: pred.sdiv.continue:			; UNROLL-NO-VF: pred.sdiv.continue:
	; UNROLL-NO-VF-NEXT: [[TMP22:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP20]], [[PRED_SDIV_IF]] ]			; UNROLL-NO-VF-NEXT: [[TMP22:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP20]], [[PRED_SDIV_IF]] ]
	; UNROLL-NO-VF-NEXT: [[TMP23:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP21]], [[PRED_SDIV_IF]] ]			; UNROLL-NO-VF-NEXT: [[TMP23:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP21]], [[PRED_SDIV_IF]] ]
	; UNROLL-NO-VF-NEXT: br i1 [[TMP17]], label [[PRED_SDIV_IF2:%.*]], label [[PRED_SDIV_CONTINUE3]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP17]], label [[PRED_SDIV_IF2:%.*]], label [[PRED_SDIV_CONTINUE3]]
	; UNROLL-NO-VF: pred.sdiv.if2:			; UNROLL-NO-VF: pred.sdiv.if2:
	; UNROLL-NO-VF-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP1]]			; UNROLL-NO-VF-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[TMP1]]
	; UNROLL-NO-VF-NEXT: [[TMP25:%.*]] = load i32, ptr [[TMP24]], align 4, !alias.scope !31			; UNROLL-NO-VF-NEXT: [[TMP25:%.*]] = load i32, ptr [[TMP24]], align 4, !alias.scope !32
	; UNROLL-NO-VF-NEXT: [[TMP26:%.*]] = sdiv i32 [[TMP7]], [[TMP5]]			; UNROLL-NO-VF-NEXT: [[TMP26:%.*]] = sdiv i32 [[TMP7]], [[TMP5]]
	; UNROLL-NO-VF-NEXT: [[TMP27:%.*]] = sdiv i32 [[TMP25]], [[TMP26]]			; UNROLL-NO-VF-NEXT: [[TMP27:%.*]] = sdiv i32 [[TMP25]], [[TMP26]]
	; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE3]]			; UNROLL-NO-VF-NEXT: br label [[PRED_SDIV_CONTINUE3]]
	; UNROLL-NO-VF: pred.sdiv.continue3:			; UNROLL-NO-VF: pred.sdiv.continue3:
	; UNROLL-NO-VF-NEXT: [[TMP28:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP26]], [[PRED_SDIV_IF2]] ]			; UNROLL-NO-VF-NEXT: [[TMP28:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP26]], [[PRED_SDIV_IF2]] ]
	; UNROLL-NO-VF-NEXT: [[TMP29:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP27]], [[PRED_SDIV_IF2]] ]			; UNROLL-NO-VF-NEXT: [[TMP29:%.*]] = phi i32 [ poison, [[PRED_SDIV_CONTINUE]] ], [ [[TMP27]], [[PRED_SDIV_IF2]] ]
	; UNROLL-NO-VF-NEXT: [[TMP30:%.*]] = xor i1 [[TMP10]], true, !dbg [[DBG34]]			; UNROLL-NO-VF-NEXT: [[TMP30:%.*]] = xor i1 [[TMP10]], true, !dbg [[DBG35]]
	; UNROLL-NO-VF-NEXT: [[TMP31:%.*]] = xor i1 [[TMP11]], true, !dbg [[DBG34]]			; UNROLL-NO-VF-NEXT: [[TMP31:%.*]] = xor i1 [[TMP11]], true, !dbg [[DBG35]]
	; UNROLL-NO-VF-NEXT: [[TMP32:%.*]] = select i1 [[TMP12]], i1 [[TMP30]], i1 false, !dbg [[DBG34]]			; UNROLL-NO-VF-NEXT: [[TMP32:%.*]] = select i1 [[TMP12]], i1 [[TMP30]], i1 false, !dbg [[DBG35]]
	; UNROLL-NO-VF-NEXT: [[TMP33:%.*]] = select i1 [[TMP13]], i1 [[TMP31]], i1 false, !dbg [[DBG34]]			; UNROLL-NO-VF-NEXT: [[TMP33:%.*]] = select i1 [[TMP13]], i1 [[TMP31]], i1 false, !dbg [[DBG35]]
	; UNROLL-NO-VF-NEXT: [[PREDPHI:%.*]] = select i1 [[TMP32]], i32 [[TMP6]], i32 [[TMP23]]			; UNROLL-NO-VF-NEXT: [[PREDPHI:%.*]] = select i1 [[TMP32]], i32 [[TMP6]], i32 [[TMP23]]
	; UNROLL-NO-VF-NEXT: [[PREDPHI4:%.*]] = select i1 [[TMP33]], i32 [[TMP7]], i32 [[TMP29]]			; UNROLL-NO-VF-NEXT: [[PREDPHI4:%.*]] = select i1 [[TMP33]], i32 [[TMP7]], i32 [[TMP29]]
	; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI]], ptr [[TMP2]], align 4, !alias.scope !28, !noalias !31			; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI]], ptr [[TMP2]], align 4, !alias.scope !29, !noalias !32
	; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI4]], ptr [[TMP3]], align 4, !alias.scope !28, !noalias !31			; UNROLL-NO-VF-NEXT: store i32 [[PREDPHI4]], ptr [[TMP3]], align 4, !alias.scope !29, !noalias !32
	; UNROLL-NO-VF-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; UNROLL-NO-VF-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; UNROLL-NO-VF-NEXT: [[TMP34:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128			; UNROLL-NO-VF-NEXT: [[TMP34:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
	; UNROLL-NO-VF-NEXT: br i1 [[TMP34]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP35:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP34]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP36:![0-9]+]]
	; UNROLL-NO-VF: middle.block:			; UNROLL-NO-VF: middle.block:
	; UNROLL-NO-VF-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128			; UNROLL-NO-VF-NEXT: [[CMP_N:%.*]] = icmp eq i64 128, 128
	; UNROLL-NO-VF-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]			; UNROLL-NO-VF-NEXT: br i1 [[CMP_N]], label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
	; UNROLL-NO-VF: scalar.ph:			; UNROLL-NO-VF: scalar.ph:
	; UNROLL-NO-VF-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; UNROLL-NO-VF-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 128, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; UNROLL-NO-VF-NEXT: br label [[FOR_BODY:%.*]]			; UNROLL-NO-VF-NEXT: br label [[FOR_BODY:%.*]]
	; UNROLL-NO-VF: for.cond.cleanup:			; UNROLL-NO-VF: for.cond.cleanup:
	; UNROLL-NO-VF-NEXT: ret void			; UNROLL-NO-VF-NEXT: ret void
	; UNROLL-NO-VF: for.body:			; UNROLL-NO-VF: for.body:
	; UNROLL-NO-VF-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[IF_END:%.*]] ]			; UNROLL-NO-VF-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[IF_END:%.*]] ]
	; UNROLL-NO-VF-NEXT: [[ISD:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[INDVARS_IV]]			; UNROLL-NO-VF-NEXT: [[ISD:%.*]] = getelementptr inbounds i32, ptr [[ASD]], i64 [[INDVARS_IV]]
	; UNROLL-NO-VF-NEXT: [[LSD:%.*]] = load i32, ptr [[ISD]], align 4			; UNROLL-NO-VF-NEXT: [[LSD:%.*]] = load i32, ptr [[ISD]], align 4
	; UNROLL-NO-VF-NEXT: [[ISD_B:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[INDVARS_IV]]			; UNROLL-NO-VF-NEXT: [[ISD_B:%.*]] = getelementptr inbounds i32, ptr [[BSD]], i64 [[INDVARS_IV]]
	; UNROLL-NO-VF-NEXT: [[LSD_B:%.*]] = load i32, ptr [[ISD_B]], align 4			; UNROLL-NO-VF-NEXT: [[LSD_B:%.*]] = load i32, ptr [[ISD_B]], align 4
	; UNROLL-NO-VF-NEXT: [[PSD:%.*]] = add nsw i32 [[LSD]], 23			; UNROLL-NO-VF-NEXT: [[PSD:%.*]] = add nsw i32 [[LSD]], 23
	; UNROLL-NO-VF-NEXT: [[CMP1:%.*]] = icmp slt i32 [[LSD]], 100			; UNROLL-NO-VF-NEXT: [[CMP1:%.*]] = icmp slt i32 [[LSD]], 100
	; UNROLL-NO-VF-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[CHECKBB:%.*]], !dbg [[DBG33]]			; UNROLL-NO-VF-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[CHECKBB:%.*]], !dbg [[DBG34]]
	; UNROLL-NO-VF: checkbb:			; UNROLL-NO-VF: checkbb:
	; UNROLL-NO-VF-NEXT: [[CMP2:%.*]] = icmp sge i32 [[LSD]], 200			; UNROLL-NO-VF-NEXT: [[CMP2:%.*]] = icmp sge i32 [[LSD]], 200
	; UNROLL-NO-VF-NEXT: br i1 [[CMP2]], label [[IF_THEN]], label [[IF_END]], !dbg [[DBG34]]			; UNROLL-NO-VF-NEXT: br i1 [[CMP2]], label [[IF_THEN]], label [[IF_END]], !dbg [[DBG35]]
	; UNROLL-NO-VF: if.then:			; UNROLL-NO-VF: if.then:
	; UNROLL-NO-VF-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]			; UNROLL-NO-VF-NEXT: [[SD1:%.*]] = sdiv i32 [[PSD]], [[LSD]]
	; UNROLL-NO-VF-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]			; UNROLL-NO-VF-NEXT: [[RSD:%.*]] = sdiv i32 [[LSD_B]], [[SD1]]
	; UNROLL-NO-VF-NEXT: br label [[IF_END]]			; UNROLL-NO-VF-NEXT: br label [[IF_END]]
	; UNROLL-NO-VF: if.end:			; UNROLL-NO-VF: if.end:
	; UNROLL-NO-VF-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[CHECKBB]] ]			; UNROLL-NO-VF-NEXT: [[YSD_0:%.*]] = phi i32 [ [[RSD]], [[IF_THEN]] ], [ [[PSD]], [[CHECKBB]] ]
	; UNROLL-NO-VF-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4			; UNROLL-NO-VF-NEXT: store i32 [[YSD_0]], ptr [[ISD]], align 4
	; UNROLL-NO-VF-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; UNROLL-NO-VF-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; UNROLL-NO-VF-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128			; UNROLL-NO-VF-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 128
	; UNROLL-NO-VF-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP36:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP37:![0-9]+]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.cond.cleanup: ; preds = %if.end			for.cond.cleanup: ; preds = %if.end
	ret void			ret void

	for.body: ; preds = %if.end, %entry			for.body: ; preds = %if.end, %entry
	[64 lines not shown]
	; CHECK-NEXT: br label [[PRED_UDIV_CONTINUE2]]			; CHECK-NEXT: br label [[PRED_UDIV_CONTINUE2]]
	; CHECK: pred.udiv.continue2:			; CHECK: pred.udiv.continue2:
	; CHECK-NEXT: [[TMP16:%.*]] = phi <2 x i32> [ [[TMP9]], [[PRED_UDIV_CONTINUE]] ], [ [[TMP15]], [[PRED_UDIV_IF1]] ]			; CHECK-NEXT: [[TMP16:%.*]] = phi <2 x i32> [ [[TMP9]], [[PRED_UDIV_CONTINUE]] ], [ [[TMP15]], [[PRED_UDIV_IF1]] ]
	; CHECK-NEXT: [[TMP17:%.*]] = xor <2 x i1> [[BROADCAST_SPLAT]], <i1 true, i1 true>			; CHECK-NEXT: [[TMP17:%.*]] = xor <2 x i1> [[BROADCAST_SPLAT]], <i1 true, i1 true>
	; CHECK-NEXT: [[PREDPHI:%.*]] = select <2 x i1> [[BROADCAST_SPLAT]], <2 x i32> [[TMP16]], <2 x i32> [[WIDE_LOAD]]			; CHECK-NEXT: [[PREDPHI:%.*]] = select <2 x i1> [[BROADCAST_SPLAT]], <2 x i32> [[TMP16]], <2 x i32> [[WIDE_LOAD]]
	; CHECK-NEXT: [[TMP18]] = add <2 x i32> [[VEC_PHI]], [[PREDPHI]]			; CHECK-NEXT: [[TMP18]] = add <2 x i32> [[VEC_PHI]], [[PREDPHI]]
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP19]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP37:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP19]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP38:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[TMP20:%.*]] = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> [[TMP18]])			; CHECK-NEXT: [[TMP20:%.*]] = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> [[TMP18]])
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[TMP20]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[TMP20]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[I_NEXT:%.*]], [[FOR_INC:%.*]] ]			; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[I_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; CHECK-NEXT: [[R:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[T6:%.*]], [[FOR_INC]] ]			; CHECK-NEXT: [[R:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[T6:%.*]], [[FOR_INC]] ]
	; CHECK-NEXT: [[T0:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[I]]			; CHECK-NEXT: [[T0:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[I]]
	; CHECK-NEXT: [[T2:%.*]] = load i32, ptr [[T0]], align 4			; CHECK-NEXT: [[T2:%.*]] = load i32, ptr [[T0]], align 4
	; CHECK-NEXT: br i1 [[C]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; CHECK-NEXT: br i1 [[C]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; CHECK: if.then:			; CHECK: if.then:
	; CHECK-NEXT: [[T3:%.*]] = add nsw i32 [[T2]], [[X]]			; CHECK-NEXT: [[T3:%.*]] = add nsw i32 [[T2]], [[X]]
	; CHECK-NEXT: [[T4:%.*]] = udiv i32 [[T2]], [[T3]]			; CHECK-NEXT: [[T4:%.*]] = udiv i32 [[T2]], [[T3]]
	; CHECK-NEXT: br label [[FOR_INC]]			; CHECK-NEXT: br label [[FOR_INC]]
	; CHECK: for.inc:			; CHECK: for.inc:
	; CHECK-NEXT: [[T5:%.*]] = phi i32 [ [[T2]], [[FOR_BODY]] ], [ [[T4]], [[IF_THEN]] ]			; CHECK-NEXT: [[T5:%.*]] = phi i32 [ [[T2]], [[FOR_BODY]] ], [ [[T4]], [[IF_THEN]] ]
	; CHECK-NEXT: [[T6]] = add i32 [[R]], [[T5]]			; CHECK-NEXT: [[T6]] = add i32 [[R]], [[T5]]
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP38:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP39:![0-9]+]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: [[T7:%.*]] = phi i32 [ [[T6]], [[FOR_INC]] ], [ [[TMP20]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[T7:%.*]] = phi i32 [ [[T6]], [[FOR_INC]] ], [ [[TMP20]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret i32 [[T7]]			; CHECK-NEXT: ret i32 [[T7]]
	;			;
	; UNROLL-NO-VF-LABEL: @predicated_udiv_scalarized_operand(			; UNROLL-NO-VF-LABEL: @predicated_udiv_scalarized_operand(
	; UNROLL-NO-VF-NEXT: entry:			; UNROLL-NO-VF-NEXT: entry:
	; UNROLL-NO-VF-NEXT: [[SMAX:%.*]] = call i64 @llvm.smax.i64(i64 [[N:%.*]], i64 1)			; UNROLL-NO-VF-NEXT: [[SMAX:%.*]] = call i64 @llvm.smax.i64(i64 [[N:%.*]], i64 1)
	; UNROLL-NO-VF-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[SMAX]], 2			; UNROLL-NO-VF-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[SMAX]], 2
	[29 lines not shown]
	; UNROLL-NO-VF-NEXT: [[TMP12:%.*]] = xor i1 [[C]], true			; UNROLL-NO-VF-NEXT: [[TMP12:%.*]] = xor i1 [[C]], true
	; UNROLL-NO-VF-NEXT: [[TMP13:%.*]] = xor i1 [[C]], true			; UNROLL-NO-VF-NEXT: [[TMP13:%.*]] = xor i1 [[C]], true
	; UNROLL-NO-VF-NEXT: [[PREDPHI:%.*]] = select i1 [[C]], i32 [[TMP8]], i32 [[TMP4]]			; UNROLL-NO-VF-NEXT: [[PREDPHI:%.*]] = select i1 [[C]], i32 [[TMP8]], i32 [[TMP4]]
	; UNROLL-NO-VF-NEXT: [[PREDPHI4:%.*]] = select i1 [[C]], i32 [[TMP11]], i32 [[TMP5]]			; UNROLL-NO-VF-NEXT: [[PREDPHI4:%.*]] = select i1 [[C]], i32 [[TMP11]], i32 [[TMP5]]
	; UNROLL-NO-VF-NEXT: [[TMP14]] = add i32 [[VEC_PHI]], [[PREDPHI]]			; UNROLL-NO-VF-NEXT: [[TMP14]] = add i32 [[VEC_PHI]], [[PREDPHI]]
	; UNROLL-NO-VF-NEXT: [[TMP15]] = add i32 [[VEC_PHI1]], [[PREDPHI4]]			; UNROLL-NO-VF-NEXT: [[TMP15]] = add i32 [[VEC_PHI1]], [[PREDPHI4]]
	; UNROLL-NO-VF-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; UNROLL-NO-VF-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; UNROLL-NO-VF-NEXT: [[TMP16:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; UNROLL-NO-VF-NEXT: [[TMP16:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; UNROLL-NO-VF-NEXT: br i1 [[TMP16]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP37:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[TMP16]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP38:![0-9]+]]
	; UNROLL-NO-VF: middle.block:			; UNROLL-NO-VF: middle.block:
	; UNROLL-NO-VF-NEXT: [[BIN_RDX:%.*]] = add i32 [[TMP15]], [[TMP14]]			; UNROLL-NO-VF-NEXT: [[BIN_RDX:%.*]] = add i32 [[TMP15]], [[TMP14]]
	; UNROLL-NO-VF-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX]], [[N_VEC]]			; UNROLL-NO-VF-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX]], [[N_VEC]]
	; UNROLL-NO-VF-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; UNROLL-NO-VF-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; UNROLL-NO-VF: scalar.ph:			; UNROLL-NO-VF: scalar.ph:
	; UNROLL-NO-VF-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; UNROLL-NO-VF-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; UNROLL-NO-VF-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[BIN_RDX]], [[MIDDLE_BLOCK]] ]			; UNROLL-NO-VF-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[BIN_RDX]], [[MIDDLE_BLOCK]] ]
	; UNROLL-NO-VF-NEXT: br label [[FOR_BODY:%.*]]			; UNROLL-NO-VF-NEXT: br label [[FOR_BODY:%.*]]
	; UNROLL-NO-VF: for.body:			; UNROLL-NO-VF: for.body:
	; UNROLL-NO-VF-NEXT: [[I:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[I_NEXT:%.*]], [[FOR_INC:%.*]] ]			; UNROLL-NO-VF-NEXT: [[I:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[I_NEXT:%.*]], [[FOR_INC:%.*]] ]
	; UNROLL-NO-VF-NEXT: [[R:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[T6:%.*]], [[FOR_INC]] ]			; UNROLL-NO-VF-NEXT: [[R:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[T6:%.*]], [[FOR_INC]] ]
	; UNROLL-NO-VF-NEXT: [[T0:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[I]]			; UNROLL-NO-VF-NEXT: [[T0:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[I]]
	; UNROLL-NO-VF-NEXT: [[T2:%.*]] = load i32, ptr [[T0]], align 4			; UNROLL-NO-VF-NEXT: [[T2:%.*]] = load i32, ptr [[T0]], align 4
	; UNROLL-NO-VF-NEXT: br i1 [[C]], label [[IF_THEN:%.*]], label [[FOR_INC]]			; UNROLL-NO-VF-NEXT: br i1 [[C]], label [[IF_THEN:%.*]], label [[FOR_INC]]
	; UNROLL-NO-VF: if.then:			; UNROLL-NO-VF: if.then:
	; UNROLL-NO-VF-NEXT: [[T3:%.*]] = add nsw i32 [[T2]], [[X]]			; UNROLL-NO-VF-NEXT: [[T3:%.*]] = add nsw i32 [[T2]], [[X]]
	; UNROLL-NO-VF-NEXT: [[T4:%.*]] = udiv i32 [[T2]], [[T3]]			; UNROLL-NO-VF-NEXT: [[T4:%.*]] = udiv i32 [[T2]], [[T3]]
	; UNROLL-NO-VF-NEXT: br label [[FOR_INC]]			; UNROLL-NO-VF-NEXT: br label [[FOR_INC]]
	; UNROLL-NO-VF: for.inc:			; UNROLL-NO-VF: for.inc:
	; UNROLL-NO-VF-NEXT: [[T5:%.*]] = phi i32 [ [[T2]], [[FOR_BODY]] ], [ [[T4]], [[IF_THEN]] ]			; UNROLL-NO-VF-NEXT: [[T5:%.*]] = phi i32 [ [[T2]], [[FOR_BODY]] ], [ [[T4]], [[IF_THEN]] ]
	; UNROLL-NO-VF-NEXT: [[T6]] = add i32 [[R]], [[T5]]			; UNROLL-NO-VF-NEXT: [[T6]] = add i32 [[R]], [[T5]]
	; UNROLL-NO-VF-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; UNROLL-NO-VF-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; UNROLL-NO-VF-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; UNROLL-NO-VF-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; UNROLL-NO-VF-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP38:![0-9]+]]			; UNROLL-NO-VF-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP39:![0-9]+]]
	; UNROLL-NO-VF: for.end:			; UNROLL-NO-VF: for.end:
	; UNROLL-NO-VF-NEXT: [[T7:%.*]] = phi i32 [ [[T6]], [[FOR_INC]] ], [ [[BIN_RDX]], [[MIDDLE_BLOCK]] ]			; UNROLL-NO-VF-NEXT: [[T7:%.*]] = phi i32 [ [[T6]], [[FOR_INC]] ], [ [[BIN_RDX]], [[MIDDLE_BLOCK]] ]
	; UNROLL-NO-VF-NEXT: ret i32 [[T7]]			; UNROLL-NO-VF-NEXT: ret i32 [[T7]]
	;			;
	entry:			entry:
	br label %for.body			br label %for.body


	[40 lines not shown]

llvm/test/Transforms/LoopVectorize/induction.ll

	[44 lines not shown]
	; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; CHECK-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]			; CHECK-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4			; CHECK-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4
	; CHECK-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1			; CHECK-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32			; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[N]]			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[N]]
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP3:![0-9]+]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	; IND-LABEL: @multi_int_induction(			; IND-LABEL: @multi_int_induction(
	; IND-NEXT: for.body.lr.ph:			; IND-NEXT: for.body.lr.ph:
	; IND-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1			; IND-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1
	; IND-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64			; IND-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
	; IND-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1			; IND-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1
	[24 lines not shown]
	; IND-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; IND-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; IND-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]			; IND-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]
	; IND-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; IND-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; IND-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4			; IND-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4
	; IND-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1			; IND-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1
	; IND-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1			; IND-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
	; IND-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32			; IND-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
	; IND-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]			; IND-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]
	; IND-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; IND-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; IND: for.end:			; IND: for.end:
	; IND-NEXT: ret void			; IND-NEXT: ret void
	;			;
	; UNROLL-LABEL: @multi_int_induction(			; UNROLL-LABEL: @multi_int_induction(
	; UNROLL-NEXT: for.body.lr.ph:			; UNROLL-NEXT: for.body.lr.ph:
	; UNROLL-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1			; UNROLL-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1
	; UNROLL-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64			; UNROLL-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
	; UNROLL-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1			; UNROLL-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1
	[27 lines not shown]
	; UNROLL-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; UNROLL-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; UNROLL-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]			; UNROLL-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]
	; UNROLL-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; UNROLL-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; UNROLL-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4			; UNROLL-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4
	; UNROLL-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1			; UNROLL-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1
	; UNROLL-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1			; UNROLL-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
	; UNROLL-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32			; UNROLL-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
	; UNROLL-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]			; UNROLL-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]
	; UNROLL-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; UNROLL-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; UNROLL: for.end:			; UNROLL: for.end:
	; UNROLL-NEXT: ret void			; UNROLL-NEXT: ret void
	;			;
	; UNROLL-NO-IC-LABEL: @multi_int_induction(			; UNROLL-NO-IC-LABEL: @multi_int_induction(
	; UNROLL-NO-IC-NEXT: for.body.lr.ph:			; UNROLL-NO-IC-NEXT: for.body.lr.ph:
	; UNROLL-NO-IC-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1			; UNROLL-NO-IC-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1
	; UNROLL-NO-IC-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64			; UNROLL-NO-IC-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
	; UNROLL-NO-IC-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1			; UNROLL-NO-IC-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1
	[32 lines not shown]
	; UNROLL-NO-IC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; UNROLL-NO-IC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; UNROLL-NO-IC-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]			; UNROLL-NO-IC-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]
	; UNROLL-NO-IC-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; UNROLL-NO-IC-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; UNROLL-NO-IC-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4			; UNROLL-NO-IC-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4
	; UNROLL-NO-IC-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1			; UNROLL-NO-IC-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1
	; UNROLL-NO-IC-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1			; UNROLL-NO-IC-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
	; UNROLL-NO-IC-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32			; UNROLL-NO-IC-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
	; UNROLL-NO-IC-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[N]]			; UNROLL-NO-IC-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[N]]
	; UNROLL-NO-IC-NEXT: br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP2:![0-9]+]]			; UNROLL-NO-IC-NEXT: br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP3:![0-9]+]]
	; UNROLL-NO-IC: for.end:			; UNROLL-NO-IC: for.end:
	; UNROLL-NO-IC-NEXT: ret void			; UNROLL-NO-IC-NEXT: ret void
	;			;
	; INTERLEAVE-LABEL: @multi_int_induction(			; INTERLEAVE-LABEL: @multi_int_induction(
	; INTERLEAVE-NEXT: for.body.lr.ph:			; INTERLEAVE-NEXT: for.body.lr.ph:
	; INTERLEAVE-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1			; INTERLEAVE-NEXT: [[TMP0:%.*]] = add i32 [[N:%.*]], -1
	; INTERLEAVE-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64			; INTERLEAVE-NEXT: [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
	; INTERLEAVE-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1			; INTERLEAVE-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1
	[27 lines not shown]
	; INTERLEAVE-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]			; INTERLEAVE-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
	; INTERLEAVE-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]			; INTERLEAVE-NEXT: [[COUNT_09:%.*]] = phi i32 [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ], [ [[INC:%.*]], [[FOR_BODY]] ]
	; INTERLEAVE-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]			; INTERLEAVE-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDVARS_IV]]
	; INTERLEAVE-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4			; INTERLEAVE-NEXT: store i32 [[COUNT_09]], ptr [[ARRAYIDX2]], align 4
	; INTERLEAVE-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1			; INTERLEAVE-NEXT: [[INC]] = add nsw i32 [[COUNT_09]], 1
	; INTERLEAVE-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1			; INTERLEAVE-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
	; INTERLEAVE-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32			; INTERLEAVE-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
	; INTERLEAVE-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]			; INTERLEAVE-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]]
	; INTERLEAVE-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]			; INTERLEAVE-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; INTERLEAVE: for.end:			; INTERLEAVE: for.end:
	; INTERLEAVE-NEXT: ret void			; INTERLEAVE-NEXT: ret void
	;			;
	for.body.lr.ph:			for.body.lr.ph:
	br label %for.body			br label %for.body

	for.body:			for.body:
	%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]			%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
	[6,428 lines not shown]

llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll

	[49 lines not shown]
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 512			; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 512
	; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: br i1 poison, label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 poison, label [[FOR_BODY]], label [[FOR_END]], !llvm.loop [[LOOP3:![0-9]+]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body: ; preds = %for.body, %entry			for.body: ; preds = %for.body, %entry
	%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]			%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
	[1,546 lines not shown]

llvm/test/Transforms/LoopVectorize/invariant-store-vectorization-2.ll

	[68 lines not shown]
	; CHECK-NEXT: br label [[LATCH]]			; CHECK-NEXT: br label [[LATCH]]
	; CHECK: cond_store_k:			; CHECK: cond_store_k:
	; CHECK-NEXT: br label [[LATCH]]			; CHECK-NEXT: br label [[LATCH]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: [[STOREVAL:%.*]] = phi i32 [ [[NTRUNC]], [[COND_STORE]] ], [ [[K]], [[COND_STORE_K]] ]			; CHECK-NEXT: [[STOREVAL:%.*]] = phi i32 [ [[NTRUNC]], [[COND_STORE]] ], [ [[K]], [[COND_STORE_K]] ]
	; CHECK-NEXT: store i32 [[STOREVAL]], ptr [[A]], align 4			; CHECK-NEXT: store i32 [[STOREVAL]], ptr [[A]], align 4
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP7:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP8:![0-9]+]]
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: br label [[FOR_END]]			; CHECK-NEXT: br label [[FOR_END]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%ntrunc = trunc i64 %n to i32			%ntrunc = trunc i64 %n to i32
	br label %for.body			br label %for.body
	[55 lines not shown]
	; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i1> undef, i1 [[CMP]], i64 3			; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i1> undef, i1 [[CMP]], i64 3
	; CHECK-NEXT: [[BROADCAST_SPLAT6:%.*]] = insertelement <4 x i32> poison, i32 [[K]], i64 3			; CHECK-NEXT: [[BROADCAST_SPLAT6:%.*]] = insertelement <4 x i32> poison, i32 [[K]], i64 3
	; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> [[TMP1]], <4 x i32> [[BROADCAST_SPLAT]], <4 x i32> [[BROADCAST_SPLAT6]]			; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> [[TMP1]], <4 x i32> [[BROADCAST_SPLAT]], <4 x i32> [[BROADCAST_SPLAT6]]
	; CHECK-NEXT: [[TMP2:%.*]] = extractelement <4 x i32> [[PREDPHI]], i64 3			; CHECK-NEXT: [[TMP2:%.*]] = extractelement <4 x i32> [[PREDPHI]], i64 3
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]
	; CHECK-NEXT: store <4 x i32> [[BROADCAST_SPLAT]], ptr [[TMP3]], align 4, !alias.scope !8, !noalias !11			; CHECK-NEXT: store <4 x i32> [[BROADCAST_SPLAT]], ptr [[TMP3]], align 4, !alias.scope !9, !noalias !12
	; CHECK-NEXT: store i32 [[TMP2]], ptr [[A]], align 4, !alias.scope !11			; CHECK-NEXT: store i32 [[TMP2]], ptr [[A]], align 4, !alias.scope !12
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[LATCH:%.*]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[LATCH:%.*]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]			; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[I1]], align 4			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[I1]], align 4
	; CHECK-NEXT: br i1 [[CMP]], label [[COND_STORE:%.*]], label [[COND_STORE_K:%.*]]			; CHECK-NEXT: br i1 [[CMP]], label [[COND_STORE:%.*]], label [[COND_STORE_K:%.*]]
	; CHECK: cond_store:			; CHECK: cond_store:
	; CHECK-NEXT: br label [[LATCH]]			; CHECK-NEXT: br label [[LATCH]]
	; CHECK: cond_store_k:			; CHECK: cond_store_k:
	; CHECK-NEXT: br label [[LATCH]]			; CHECK-NEXT: br label [[LATCH]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: [[STOREVAL:%.*]] = phi i32 [ [[NTRUNC]], [[COND_STORE]] ], [ [[K]], [[COND_STORE_K]] ]			; CHECK-NEXT: [[STOREVAL:%.*]] = phi i32 [ [[NTRUNC]], [[COND_STORE]] ], [ [[K]], [[COND_STORE_K]] ]
	; CHECK-NEXT: store i32 [[STOREVAL]], ptr [[A]], align 4			; CHECK-NEXT: store i32 [[STOREVAL]], ptr [[A]], align 4
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP14:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP15:![0-9]+]]
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: br label [[FOR_END]]			; CHECK-NEXT: br label [[FOR_END]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%ntrunc = trunc i64 %n to i32			%ntrunc = trunc i64 %n to i32
	%cmp = icmp eq i32 %ntrunc, %k			%cmp = icmp eq i32 %ntrunc, %k
	[43 lines not shown]
	; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[SMAX2]], 9223372036854775804			; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[SMAX2]], 9223372036854775804
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 8, !alias.scope !15			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 8, !alias.scope !16
	; CHECK-NEXT: [[TMP2:%.*]] = extractelement <4 x i32> [[WIDE_LOAD]], i64 3			; CHECK-NEXT: [[TMP2:%.*]] = extractelement <4 x i32> [[WIDE_LOAD]], i64 3
	; CHECK-NEXT: store i32 [[TMP2]], ptr [[A]], align 4, !alias.scope !18, !noalias !15			; CHECK-NEXT: store i32 [[TMP2]], ptr [[A]], align 4, !alias.scope !19, !noalias !16
	; CHECK-NEXT: [[TMP3]] = add <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]]			; CHECK-NEXT: [[TMP3]] = add <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]]
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP21:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi <4 x i32> [ [[TMP3]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi <4 x i32> [ [[TMP3]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[DOTLCSSA]])			; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[DOTLCSSA]])
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP5]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP5]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I0:%.*]] = phi i32 [ [[I3:%.*]], [[FOR_BODY]] ], [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I0:%.*]] = phi i32 [ [[I3:%.*]], [[FOR_BODY]] ], [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]			; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]
	; CHECK-NEXT: [[I2:%.*]] = load i32, ptr [[I1]], align 8			; CHECK-NEXT: [[I2:%.*]] = load i32, ptr [[I1]], align 8
	; CHECK-NEXT: store i32 [[I2]], ptr [[A]], align 4			; CHECK-NEXT: store i32 [[I2]], ptr [[A]], align 4
	; CHECK-NEXT: [[I3]] = add i32 [[I0]], [[I2]]			; CHECK-NEXT: [[I3]] = add i32 [[I0]], [[I2]]
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP21:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP22:![0-9]+]]
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: [[I3_LCSSA:%.*]] = phi i32 [ [[I3]], [[FOR_BODY]] ]			; CHECK-NEXT: [[I3_LCSSA:%.*]] = phi i32 [ [[I3]], [[FOR_BODY]] ]
	; CHECK-NEXT: br label [[FOR_END]]			; CHECK-NEXT: br label [[FOR_END]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: [[RDX_LCSSA:%.*]] = phi i32 [ [[TMP5]], [[MIDDLE_BLOCK]] ], [ [[I3_LCSSA]], [[FOR_END_LOOPEXIT]] ]			; CHECK-NEXT: [[RDX_LCSSA:%.*]] = phi i32 [ [[TMP5]], [[MIDDLE_BLOCK]] ], [ [[I3_LCSSA]], [[FOR_END_LOOPEXIT]] ]
	; CHECK-NEXT: ret i32 [[RDX_LCSSA]]			; CHECK-NEXT: ret i32 [[RDX_LCSSA]]
	;			;
	entry:			entry:
	[19 lines not shown]

llvm/test/Transforms/LoopVectorize/invariant-store-vectorization.ll

	[57 lines not shown]
	; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I0:%.*]] = phi i32 [ [[I3:%.*]], [[FOR_BODY]] ], [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I0:%.*]] = phi i32 [ [[I3:%.*]], [[FOR_BODY]] ], [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]			; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]
	; CHECK-NEXT: [[I2:%.*]] = load i32, ptr [[I1]], align 8			; CHECK-NEXT: [[I2:%.*]] = load i32, ptr [[I1]], align 8
	; CHECK-NEXT: [[I3]] = add i32 [[I0]], [[I2]]			; CHECK-NEXT: [[I3]] = add i32 [[I0]], [[I2]]
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP7:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP8:![0-9]+]]
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: [[I3_LCSSA:%.*]] = phi i32 [ [[I3]], [[FOR_BODY]] ]			; CHECK-NEXT: [[I3_LCSSA:%.*]] = phi i32 [ [[I3]], [[FOR_BODY]] ]
	; CHECK-NEXT: br label [[FOR_END]]			; CHECK-NEXT: br label [[FOR_END]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: [[I4:%.*]] = phi i32 [ [[TMP4]], [[MIDDLE_BLOCK]] ], [ [[I3_LCSSA]], [[FOR_END_LOOPEXIT]] ]			; CHECK-NEXT: [[I4:%.*]] = phi i32 [ [[TMP4]], [[MIDDLE_BLOCK]] ], [ [[I3_LCSSA]], [[FOR_END_LOOPEXIT]] ]
	; CHECK-NEXT: ret i32 [[I4]]			; CHECK-NEXT: ret i32 [[I4]]
	;			;
	entry:			entry:
	[35 lines not shown]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[SMAX2]], 9223372036854775804			; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[SMAX2]], 9223372036854775804
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[NTRUNC]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[NTRUNC]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !8, !noalias !11			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !9, !noalias !12
	; CHECK-NEXT: store <4 x i32> [[BROADCAST_SPLAT]], ptr [[TMP1]], align 4, !alias.scope !11			; CHECK-NEXT: store <4 x i32> [[BROADCAST_SPLAT]], ptr [[TMP1]], align 4, !alias.scope !12
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]			; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[I1]], align 4			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[I1]], align 4
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP14:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP15:![0-9]+]]
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: br label [[FOR_END]]			; CHECK-NEXT: br label [[FOR_END]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%ntrunc = trunc i64 %n to i32			%ntrunc = trunc i64 %n to i32
	br label %for.body			br label %for.body
	[43 lines not shown]
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[K:%.*]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[K:%.*]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT3:%.*]] = insertelement <4 x i32> poison, i32 [[NTRUNC]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT3:%.*]] = insertelement <4 x i32> poison, i32 [[NTRUNC]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT4:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT3]], <4 x i32> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT4:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT3]], <4 x i32> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE10:%.*]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE10:%.*]] ]
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[INDEX]]
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 8, !alias.scope !15, !noalias !18			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 8, !alias.scope !16, !noalias !19
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq <4 x i32> [[WIDE_LOAD]], [[BROADCAST_SPLAT]]			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq <4 x i32> [[WIDE_LOAD]], [[BROADCAST_SPLAT]]
	; CHECK-NEXT: store <4 x i32> [[BROADCAST_SPLAT4]], ptr [[TMP1]], align 4, !alias.scope !15, !noalias !18			; CHECK-NEXT: store <4 x i32> [[BROADCAST_SPLAT4]], ptr [[TMP1]], align 4, !alias.scope !16, !noalias !19
	; CHECK-NEXT: [[TMP3:%.*]] = extractelement <4 x i1> [[TMP2]], i64 0			; CHECK-NEXT: [[TMP3:%.*]] = extractelement <4 x i1> [[TMP2]], i64 0
	; CHECK-NEXT: br i1 [[TMP3]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]			; CHECK-NEXT: br i1 [[TMP3]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
	; CHECK: pred.store.if:			; CHECK: pred.store.if:
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !18			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !19
	; CHECK-NEXT: br label [[PRED_STORE_CONTINUE]]			; CHECK-NEXT: br label [[PRED_STORE_CONTINUE]]
	; CHECK: pred.store.continue:			; CHECK: pred.store.continue:
	; CHECK-NEXT: [[TMP4:%.*]] = extractelement <4 x i1> [[TMP2]], i64 1			; CHECK-NEXT: [[TMP4:%.*]] = extractelement <4 x i1> [[TMP2]], i64 1
	; CHECK-NEXT: br i1 [[TMP4]], label [[PRED_STORE_IF5:%.*]], label [[PRED_STORE_CONTINUE6:%.*]]			; CHECK-NEXT: br i1 [[TMP4]], label [[PRED_STORE_IF5:%.*]], label [[PRED_STORE_CONTINUE6:%.*]]
	; CHECK: pred.store.if5:			; CHECK: pred.store.if5:
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !18			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !19
	; CHECK-NEXT: br label [[PRED_STORE_CONTINUE6]]			; CHECK-NEXT: br label [[PRED_STORE_CONTINUE6]]
	; CHECK: pred.store.continue6:			; CHECK: pred.store.continue6:
	; CHECK-NEXT: [[TMP5:%.*]] = extractelement <4 x i1> [[TMP2]], i64 2			; CHECK-NEXT: [[TMP5:%.*]] = extractelement <4 x i1> [[TMP2]], i64 2
	; CHECK-NEXT: br i1 [[TMP5]], label [[PRED_STORE_IF7:%.*]], label [[PRED_STORE_CONTINUE8:%.*]]			; CHECK-NEXT: br i1 [[TMP5]], label [[PRED_STORE_IF7:%.*]], label [[PRED_STORE_CONTINUE8:%.*]]
	; CHECK: pred.store.if7:			; CHECK: pred.store.if7:
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !18			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !19
	; CHECK-NEXT: br label [[PRED_STORE_CONTINUE8]]			; CHECK-NEXT: br label [[PRED_STORE_CONTINUE8]]
	; CHECK: pred.store.continue8:			; CHECK: pred.store.continue8:
	; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x i1> [[TMP2]], i64 3			; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x i1> [[TMP2]], i64 3
	; CHECK-NEXT: br i1 [[TMP6]], label [[PRED_STORE_IF9:%.*]], label [[PRED_STORE_CONTINUE10]]			; CHECK-NEXT: br i1 [[TMP6]], label [[PRED_STORE_IF9:%.*]], label [[PRED_STORE_CONTINUE10]]
	; CHECK: pred.store.if9:			; CHECK: pred.store.if9:
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !18			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4, !alias.scope !19
	; CHECK-NEXT: br label [[PRED_STORE_CONTINUE10]]			; CHECK-NEXT: br label [[PRED_STORE_CONTINUE10]]
	; CHECK: pred.store.continue10:			; CHECK: pred.store.continue10:
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP21:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[SMAX2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[LATCH:%.*]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]			; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[LATCH:%.*]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
	; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]			; CHECK-NEXT: [[I1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[I]]
	; CHECK-NEXT: [[I2:%.*]] = load i32, ptr [[I1]], align 8			; CHECK-NEXT: [[I2:%.*]] = load i32, ptr [[I1]], align 8
	; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[I2]], [[K]]			; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[I2]], [[K]]
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[I1]], align 4			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[I1]], align 4
	; CHECK-NEXT: br i1 [[CMP]], label [[COND_STORE:%.*]], label [[LATCH]]			; CHECK-NEXT: br i1 [[CMP]], label [[COND_STORE:%.*]], label [[LATCH]]
	; CHECK: cond_store:			; CHECK: cond_store:
	; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4			; CHECK-NEXT: store i32 [[NTRUNC]], ptr [[A]], align 4
	; CHECK-NEXT: br label [[LATCH]]			; CHECK-NEXT: br label [[LATCH]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1			; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i64 [[I]], 1
	; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]			; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[I_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP21:![0-9]+]]			; CHECK-NEXT: br i1 [[COND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP22:![0-9]+]]
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: br label [[FOR_END]]			; CHECK-NEXT: br label [[FOR_END]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%ntrunc = trunc i64 %n to i32			%ntrunc = trunc i64 %n to i32
	br label %for.body			br label %for.body
	[131 lines not shown]
	; CHECK-NEXT: [[IND_END:%.*]] = add nuw nsw i64 [[N_VEC]], [[TMP2]]			; CHECK-NEXT: [[IND_END:%.*]] = add nuw nsw i64 [[N_VEC]], [[TMP2]]
	; CHECK-NEXT: [[TMP13:%.*]] = insertelement <4 x i32> <i32 poison, i32 0, i32 0, i32 0>, i32 [[ARRAYIDX5_PROMOTED]], i64 0			; CHECK-NEXT: [[TMP13:%.*]] = insertelement <4 x i32> <i32 poison, i32 0, i32 0, i32 0>, i32 [[ARRAYIDX5_PROMOTED]], i64 0
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ [[TMP13]], [[VECTOR_PH]] ], [ [[TMP16:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ [[TMP13]], [[VECTOR_PH]] ], [ [[TMP16:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[OFFSET_IDX:%.*]] = add i64 [[INDEX]], [[TMP2]]			; CHECK-NEXT: [[OFFSET_IDX:%.*]] = add i64 [[INDEX]], [[TMP2]]
	; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, ptr [[VAR2]], i64 [[OFFSET_IDX]]			; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, ptr [[VAR2]], i64 [[OFFSET_IDX]]
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP14]], align 4, !alias.scope !22			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP14]], align 4, !alias.scope !23
	; CHECK-NEXT: [[TMP15:%.*]] = add <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]]			; CHECK-NEXT: [[TMP15:%.*]] = add <4 x i32> [[VEC_PHI]], [[WIDE_LOAD]]
	; CHECK-NEXT: [[TMP16]] = add <4 x i32> [[TMP15]], <i32 1, i32 1, i32 1, i32 1>			; CHECK-NEXT: [[TMP16]] = add <4 x i32> [[TMP15]], <i32 1, i32 1, i32 1, i32 1>
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP17:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP17:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi <4 x i32> [ [[TMP16]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[DOTLCSSA:%.*]] = phi <4 x i32> [ [[TMP16]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP18:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[DOTLCSSA]])			; CHECK-NEXT: [[TMP18:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[DOTLCSSA]])
	; CHECK-NEXT: store i32 [[TMP18]], ptr [[ARRAYIDX5]], align 4			; CHECK-NEXT: store i32 [[TMP18]], ptr [[ARRAYIDX5]], align 4
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP6]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP6]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_INC8_LOOPEXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_INC8_LOOPEXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ [[TMP2]], [[FOR_BODY3_LR_PH]] ], [ [[TMP2]], [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ [[TMP2]], [[FOR_BODY3_LR_PH]] ], [ [[TMP2]], [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP18]], [[MIDDLE_BLOCK]] ], [ [[ARRAYIDX5_PROMOTED]], [[FOR_BODY3_LR_PH]] ], [ [[ARRAYIDX5_PROMOTED]], [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP18]], [[MIDDLE_BLOCK]] ], [ [[ARRAYIDX5_PROMOTED]], [[FOR_BODY3_LR_PH]] ], [ [[ARRAYIDX5_PROMOTED]], [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[FOR_BODY3:%.*]]			; CHECK-NEXT: br label [[FOR_BODY3:%.*]]
	; CHECK: for.body3:			; CHECK: for.body3:
	; CHECK-NEXT: [[TMP19:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[TMP21:%.*]], [[FOR_BODY3]] ]			; CHECK-NEXT: [[TMP19:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[TMP21:%.*]], [[FOR_BODY3]] ]
	; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY3]] ]			; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY3]] ]
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[VAR2]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[VAR2]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[TMP20:%.*]] = load i32, ptr [[ARRAYIDX]], align 4			; CHECK-NEXT: [[TMP20:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
	; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP19]], [[TMP20]]			; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP19]], [[TMP20]]
	; CHECK-NEXT: [[TMP21]] = add nsw i32 [[ADD]], 1			; CHECK-NEXT: [[TMP21]] = add nsw i32 [[ADD]], 1
	; CHECK-NEXT: store i32 [[TMP21]], ptr [[ARRAYIDX5]], align 4			; CHECK-NEXT: store i32 [[TMP21]], ptr [[ARRAYIDX5]], align 4
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32			; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[ITR]]			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[ITR]]
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_INC8_LOOPEXIT_LOOPEXIT:%.*]], label [[FOR_BODY3]], !llvm.loop [[LOOP26:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_INC8_LOOPEXIT_LOOPEXIT:%.*]], label [[FOR_BODY3]], !llvm.loop [[LOOP27:![0-9]+]]
	; CHECK: for.inc8.loopexit.loopexit:			; CHECK: for.inc8.loopexit.loopexit:
	; CHECK-NEXT: br label [[FOR_INC8_LOOPEXIT]]			; CHECK-NEXT: br label [[FOR_INC8_LOOPEXIT]]
	; CHECK: for.inc8.loopexit:			; CHECK: for.inc8.loopexit:
	; CHECK-NEXT: br label [[FOR_INC8]]			; CHECK-NEXT: br label [[FOR_INC8]]
	; CHECK: for.inc8:			; CHECK: for.inc8:
	; CHECK-NEXT: [[J_1_LCSSA]] = phi i32 [ [[J_022]], [[FOR_COND1_PREHEADER]] ], [ [[ITR]], [[FOR_INC8_LOOPEXIT]] ]			; CHECK-NEXT: [[J_1_LCSSA]] = phi i32 [ [[J_022]], [[FOR_COND1_PREHEADER]] ], [ [[ITR]], [[FOR_INC8_LOOPEXIT]] ]
	; CHECK-NEXT: [[INDVARS_IV_NEXT24]] = add nuw nsw i64 [[INDVARS_IV23]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT24]] = add nuw nsw i64 [[INDVARS_IV23]], 1
	; CHECK-NEXT: [[LFTR_WIDEIV25:%.*]] = trunc i64 [[INDVARS_IV_NEXT24]] to i32			; CHECK-NEXT: [[LFTR_WIDEIV25:%.*]] = trunc i64 [[INDVARS_IV_NEXT24]] to i32

llvm/test/Transforms/LoopVectorize/memdep-fold-tail.ll

	; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i32> [[TMP7]], i32 1			; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i32> [[TMP7]], i32 1
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds [18 x i8], ptr @a, i32 0, i32 [[TMP12]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds [18 x i8], ptr @a, i32 0, i32 [[TMP12]]
	; CHECK-NEXT: store i8 7, ptr [[TMP13]], align 8			; CHECK-NEXT: store i8 7, ptr [[TMP13]], align 8
	; CHECK-NEXT: br label [[PRED_STORE_CONTINUE6]]			; CHECK-NEXT: br label [[PRED_STORE_CONTINUE6]]
	; CHECK: pred.store.continue6:			; CHECK: pred.store.continue6:
	; CHECK-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 2
	; CHECK-NEXT: [[VEC_IND_NEXT]] = add <2 x i32> [[VEC_IND]], <i32 2, i32 2>			; CHECK-NEXT: [[VEC_IND_NEXT]] = add <2 x i32> [[VEC_IND]], <i32 2, i32 2>
	; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], 16			; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], 16
	; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0			; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ 16, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ 16, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[J:%.*]] = phi i32 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[J_NEXT:%.*]], [[FOR_BODY]] ]			; CHECK-NEXT: [[J:%.*]] = phi i32 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[J_NEXT:%.*]], [[FOR_BODY]] ]
	; CHECK-NEXT: [[AJ:%.*]] = getelementptr inbounds [18 x i8], ptr @a, i32 0, i32 [[J]]			; CHECK-NEXT: [[AJ:%.*]] = getelementptr inbounds [18 x i8], ptr @a, i32 0, i32 [[J]]
	; CHECK-NEXT: store i8 69, ptr [[AJ]], align 8			; CHECK-NEXT: store i8 69, ptr [[AJ]], align 8
	; CHECK-NEXT: [[JP3:%.*]] = add nuw nsw i32 3, [[J]]			; CHECK-NEXT: [[JP3:%.*]] = add nuw nsw i32 3, [[J]]
	; CHECK-NEXT: [[AJP3:%.*]] = getelementptr inbounds [18 x i8], ptr @a, i32 0, i32 [[JP3]]			; CHECK-NEXT: [[AJP3:%.*]] = getelementptr inbounds [18 x i8], ptr @a, i32 0, i32 [[JP3]]
	; CHECK-NEXT: store i8 7, ptr [[AJP3]], align 8			; CHECK-NEXT: store i8 7, ptr [[AJP3]], align 8
	; CHECK-NEXT: [[J_NEXT]] = add nuw nsw i32 [[J]], 1			; CHECK-NEXT: [[J_NEXT]] = add nuw nsw i32 [[J]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[J_NEXT]], 15			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[J_NEXT]], 15
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !2			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.body			br label %for.body

	for.body:			for.body:
	%j = phi i32 [ 0, %entry ], [ %j.next, %for.body ]			%j = phi i32 [ 0, %entry ], [ %j.next, %for.body ]

llvm/test/Transforms/LoopVectorize/optsize.ll

	; CHECK-NEXT: [[TMP6:%.*]] = extractelement <2 x i32> [[TMP0]], i32 1			; CHECK-NEXT: [[TMP6:%.*]] = extractelement <2 x i32> [[TMP0]], i32 1
	; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i16, ptr [[B]], i32 [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i16, ptr [[B]], i32 [[TMP6]]
	; CHECK-NEXT: store i16 42, ptr [[TMP7]], align 4			; CHECK-NEXT: store i16 42, ptr [[TMP7]], align 4
	; CHECK-NEXT: br label [[PRED_STORE_CONTINUE2]]			; CHECK-NEXT: br label [[PRED_STORE_CONTINUE2]]
	; CHECK: pred.store.continue2:			; CHECK: pred.store.continue2:
	; CHECK-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 2
	; CHECK-NEXT: [[VEC_IND_NEXT]] = add <2 x i32> [[VEC_IND]], <i32 2, i32 2>			; CHECK-NEXT: [[VEC_IND_NEXT]] = add <2 x i32> [[VEC_IND]], <i32 2, i32 2>
	; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1026			; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1026
	; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !21			; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 true, label [[FOR_END:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	; PGSO-LABEL: @stride1(			; PGSO-LABEL: @stride1(
	; PGSO-NEXT: entry:			; PGSO-NEXT: entry:

llvm/test/Transforms/LoopVectorize/pointer-select-runtime-checks.ll

	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8			; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8
	; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]			; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]
	; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2			; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1
	; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]			; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP7:![0-9]+]]			; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP8:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%ptr.sel = select i1 %c, ptr %src.1, ptr %src.2			%ptr.sel = select i1 %c, ptr %src.1, ptr %src.2
	br label %loop			br label %loop

	loop:			loop:
	[...]
	; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8			; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8
	; CHECK-NEXT: [[TMP12:%.*]] = load i8, ptr [[TMP10]], align 8			; CHECK-NEXT: [[TMP12:%.*]] = load i8, ptr [[TMP10]], align 8
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]
	; CHECK-NEXT: [[TMP14:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]			; CHECK-NEXT: [[TMP14:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]
	; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2			; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2
	; CHECK-NEXT: store i8 [[TMP12]], ptr [[TMP14]], align 2			; CHECK-NEXT: store i8 [[TMP12]], ptr [[TMP14]], align 2
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
	; CHECK-NEXT: [[TMP15:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP15:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP15]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP8:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP15]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[GEP_SRC_1:%.*]] = getelementptr i8, ptr [[SRC_1]], i8 [[IV]]			; CHECK-NEXT: [[GEP_SRC_1:%.*]] = getelementptr i8, ptr [[SRC_1]], i8 [[IV]]
	; CHECK-NEXT: [[GEP_SRC_2:%.*]] = getelementptr i8, ptr [[SRC_2]], i8 [[IV]]			; CHECK-NEXT: [[GEP_SRC_2:%.*]] = getelementptr i8, ptr [[SRC_2]], i8 [[IV]]
	; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[GEP_SRC_1]], ptr [[GEP_SRC_2]]			; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[GEP_SRC_1]], ptr [[GEP_SRC_2]]
	; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8			; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8
	; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]			; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]
	; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2			; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1
	; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]			; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP9:![0-9]+]]			; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP10:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
	[...]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i8			; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i8
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i8 [[OFFSET_IDX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i8 [[OFFSET_IDX]], 0
	; CHECK-NEXT: [[INDUCTION6:%.*]] = add i8 [[OFFSET_IDX]], 1			; CHECK-NEXT: [[INDUCTION6:%.*]] = add i8 [[OFFSET_IDX]], 1
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ult i8 [[INDUCTION]], [[X:%.*]]			; CHECK-NEXT: [[TMP6:%.*]] = icmp ult i8 [[INDUCTION]], [[X:%.*]]
	; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i8 [[INDUCTION6]], [[X]]			; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i8 [[INDUCTION6]], [[X]]
	; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[TMP7]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[TMP7]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[TMP10:%.*]] = load i8, ptr [[TMP8]], align 8, !alias.scope !10			; CHECK-NEXT: [[TMP10:%.*]] = load i8, ptr [[TMP8]], align 8, !alias.scope !11
	; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8, !alias.scope !10			; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8, !alias.scope !11
	; CHECK-NEXT: [[TMP12:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]			; CHECK-NEXT: [[TMP12:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]
	; CHECK-NEXT: store i8 [[TMP10]], ptr [[TMP12]], align 2, !alias.scope !13, !noalias !15			; CHECK-NEXT: store i8 [[TMP10]], ptr [[TMP12]], align 2, !alias.scope !14, !noalias !16
	; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2, !alias.scope !13, !noalias !15			; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2, !alias.scope !14, !noalias !16
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
	; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[C:%.*]] = icmp ult i8 [[IV]], [[X]]			; CHECK-NEXT: [[C:%.*]] = icmp ult i8 [[IV]], [[X]]
	; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8			; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8
	; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]			; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]
	; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2			; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1
	; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]			; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP18:![0-9]+]]			; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP19:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
	[...]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i8			; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i8
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i8 [[OFFSET_IDX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i8 [[OFFSET_IDX]], 0
	; CHECK-NEXT: [[INDUCTION6:%.*]] = add i8 [[OFFSET_IDX]], 1			; CHECK-NEXT: [[INDUCTION6:%.*]] = add i8 [[OFFSET_IDX]], 1
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ult i8 [[INDUCTION]], [[X:%.*]]			; CHECK-NEXT: [[TMP6:%.*]] = icmp ult i8 [[INDUCTION]], [[X:%.*]]
	; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i8 [[INDUCTION6]], [[X]]			; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i8 [[INDUCTION6]], [[X]]
	; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[TMP7]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[TMP7]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[TMP10:%.*]] = load i8, ptr [[TMP8]], align 8, !alias.scope !19			; CHECK-NEXT: [[TMP10:%.*]] = load i8, ptr [[TMP8]], align 8, !alias.scope !20
	; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8, !alias.scope !19			; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8, !alias.scope !20
	; CHECK-NEXT: [[TMP12:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]			; CHECK-NEXT: [[TMP12:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]
	; CHECK-NEXT: store i8 [[TMP10]], ptr [[TMP12]], align 2, !alias.scope !22, !noalias !24			; CHECK-NEXT: store i8 [[TMP10]], ptr [[TMP12]], align 2, !alias.scope !23, !noalias !25
	; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2, !alias.scope !22, !noalias !24			; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2, !alias.scope !23, !noalias !25
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
	; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP26:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP27:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[C:%.*]] = icmp ult i8 [[IV]], [[X]]			; CHECK-NEXT: [[C:%.*]] = icmp ult i8 [[IV]], [[X]]
	; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8			; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8
	; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]			; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]
	; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2			; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1
	; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]			; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP27:![0-9]+]]			; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP28:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
	[...]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i8			; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i8
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i8 [[OFFSET_IDX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i8 [[OFFSET_IDX]], 0
	; CHECK-NEXT: [[INDUCTION6:%.*]] = add i8 [[OFFSET_IDX]], 1			; CHECK-NEXT: [[INDUCTION6:%.*]] = add i8 [[OFFSET_IDX]], 1
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ult i8 [[INDUCTION]], [[X:%.*]]			; CHECK-NEXT: [[TMP6:%.*]] = icmp ult i8 [[INDUCTION]], [[X:%.*]]
	; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i8 [[INDUCTION6]], [[X]]			; CHECK-NEXT: [[TMP7:%.*]] = icmp ult i8 [[INDUCTION6]], [[X]]
	; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[TMP7]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[TMP9:%.*]] = select i1 [[TMP7]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[TMP10:%.*]] = load i8, ptr [[TMP8]], align 8, !alias.scope !28			; CHECK-NEXT: [[TMP10:%.*]] = load i8, ptr [[TMP8]], align 8, !alias.scope !29
	; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8, !alias.scope !28			; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP9]], align 8, !alias.scope !29
	; CHECK-NEXT: [[TMP12:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]			; CHECK-NEXT: [[TMP12:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION]]
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[DST]], i8 [[INDUCTION6]]
	; CHECK-NEXT: store i8 [[TMP10]], ptr [[TMP12]], align 2, !alias.scope !31, !noalias !33			; CHECK-NEXT: store i8 [[TMP10]], ptr [[TMP12]], align 2, !alias.scope !32, !noalias !34
	; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2, !alias.scope !31, !noalias !33			; CHECK-NEXT: store i8 [[TMP11]], ptr [[TMP13]], align 2, !alias.scope !32, !noalias !34
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
	; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP35:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP36:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i32 [[TMP2]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8 [ [[IND_END]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[C:%.*]] = icmp ult i8 [[IV]], [[X]]			; CHECK-NEXT: [[C:%.*]] = icmp ult i8 [[IV]], [[X]]
	; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[SRC_1]], ptr [[SRC_2]]			; CHECK-NEXT: [[PTR_SEL:%.*]] = select i1 [[C]], ptr [[SRC_1]], ptr [[SRC_2]]
	; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8			; CHECK-NEXT: [[L_1:%.*]] = load i8, ptr [[PTR_SEL]], align 8
	; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]			; CHECK-NEXT: [[GEP_DST:%.*]] = getelementptr i8, ptr [[DST]], i8 [[IV]]
	; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2			; CHECK-NEXT: store i8 [[L_1]], ptr [[GEP_DST]], align 2
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i8 [[IV]], 1
	; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]			; CHECK-NEXT: [[EC:%.*]] = icmp eq i8 [[IV_NEXT]], [[N]]
	; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP36:![0-9]+]]			; CHECK-NEXT: br i1 [[EC]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP37:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]

llvm/test/Transforms/LoopVectorize/reduction-with-invariant-store.ll

	; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[OFFSET_IDX]], 0			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[OFFSET_IDX]], 0
	; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[OFFSET_IDX]], 2			; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[OFFSET_IDX]], 2
	; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[OFFSET_IDX]], 4			; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[OFFSET_IDX]], 4
	; CHECK-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], 6			; CHECK-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], 6
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[SRC:%.*]], i64 [[TMP0]]			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[SRC:%.*]], i64 [[TMP0]]
	; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP1]]			; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP1]]
	; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP2]]			; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP2]]
	; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP3]]			; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP3]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[TMP4]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[TMP4]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[TMP5]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP9:%.*]] = load i32, i32* [[TMP5]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[TMP6]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP10:%.*]] = load i32, i32* [[TMP6]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP11:%.*]] = load i32, i32* [[TMP7]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP11:%.*]] = load i32, i32* [[TMP7]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP12:%.*]] = insertelement <4 x i32> poison, i32 [[TMP8]], i32 0			; CHECK-NEXT: [[TMP12:%.*]] = insertelement <4 x i32> poison, i32 [[TMP8]], i32 0
	; CHECK-NEXT: [[TMP13:%.*]] = insertelement <4 x i32> [[TMP12]], i32 [[TMP9]], i32 1			; CHECK-NEXT: [[TMP13:%.*]] = insertelement <4 x i32> [[TMP12]], i32 [[TMP9]], i32 1
	; CHECK-NEXT: [[TMP14:%.*]] = insertelement <4 x i32> [[TMP13]], i32 [[TMP10]], i32 2			; CHECK-NEXT: [[TMP14:%.*]] = insertelement <4 x i32> [[TMP13]], i32 [[TMP10]], i32 2
	; CHECK-NEXT: [[TMP15:%.*]] = insertelement <4 x i32> [[TMP14]], i32 [[TMP11]], i32 3			; CHECK-NEXT: [[TMP15:%.*]] = insertelement <4 x i32> [[TMP14]], i32 [[TMP11]], i32 3
	; CHECK-NEXT: [[TMP16:%.*]] = add <4 x i32> [[TMP15]], [[VEC_PHI]]			; CHECK-NEXT: [[TMP16:%.*]] = add <4 x i32> [[TMP15]], [[VEC_PHI]]
	; CHECK-NEXT: [[TMP17:%.*]] = or <4 x i64> [[VEC_IND]], <i64 1, i64 1, i64 1, i64 1>			; CHECK-NEXT: [[TMP17:%.*]] = or <4 x i64> [[VEC_IND]], <i64 1, i64 1, i64 1, i64 1>
	; CHECK-NEXT: [[TMP18:%.*]] = extractelement <4 x i64> [[TMP17]], i32 0			; CHECK-NEXT: [[TMP18:%.*]] = extractelement <4 x i64> [[TMP17]], i32 0
	; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP18]]			; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP18]]
	; CHECK-NEXT: [[TMP20:%.*]] = extractelement <4 x i64> [[TMP17]], i32 1			; CHECK-NEXT: [[TMP20:%.*]] = extractelement <4 x i64> [[TMP17]], i32 1
	; CHECK-NEXT: [[TMP21:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP20]]			; CHECK-NEXT: [[TMP21:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP20]]
	; CHECK-NEXT: [[TMP22:%.*]] = extractelement <4 x i64> [[TMP17]], i32 2			; CHECK-NEXT: [[TMP22:%.*]] = extractelement <4 x i64> [[TMP17]], i32 2
	; CHECK-NEXT: [[TMP23:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP22]]			; CHECK-NEXT: [[TMP23:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP22]]
	; CHECK-NEXT: [[TMP24:%.*]] = extractelement <4 x i64> [[TMP17]], i32 3			; CHECK-NEXT: [[TMP24:%.*]] = extractelement <4 x i64> [[TMP17]], i32 3
	; CHECK-NEXT: [[TMP25:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP24]]			; CHECK-NEXT: [[TMP25:%.*]] = getelementptr inbounds i32, i32* [[SRC]], i64 [[TMP24]]
	; CHECK-NEXT: [[TMP26:%.*]] = load i32, i32* [[TMP19]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP26:%.*]] = load i32, i32* [[TMP19]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP27:%.*]] = load i32, i32* [[TMP21]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP27:%.*]] = load i32, i32* [[TMP21]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP28:%.*]] = load i32, i32* [[TMP23]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP28:%.*]] = load i32, i32* [[TMP23]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP29:%.*]] = load i32, i32* [[TMP25]], align 4, !alias.scope !11			; CHECK-NEXT: [[TMP29:%.*]] = load i32, i32* [[TMP25]], align 4, !alias.scope !12
	; CHECK-NEXT: [[TMP30:%.*]] = insertelement <4 x i32> poison, i32 [[TMP26]], i32 0			; CHECK-NEXT: [[TMP30:%.*]] = insertelement <4 x i32> poison, i32 [[TMP26]], i32 0
	; CHECK-NEXT: [[TMP31:%.*]] = insertelement <4 x i32> [[TMP30]], i32 [[TMP27]], i32 1			; CHECK-NEXT: [[TMP31:%.*]] = insertelement <4 x i32> [[TMP30]], i32 [[TMP27]], i32 1
	; CHECK-NEXT: [[TMP32:%.*]] = insertelement <4 x i32> [[TMP31]], i32 [[TMP28]], i32 2			; CHECK-NEXT: [[TMP32:%.*]] = insertelement <4 x i32> [[TMP31]], i32 [[TMP28]], i32 2
	; CHECK-NEXT: [[TMP33:%.*]] = insertelement <4 x i32> [[TMP32]], i32 [[TMP29]], i32 3			; CHECK-NEXT: [[TMP33:%.*]] = insertelement <4 x i32> [[TMP32]], i32 [[TMP29]], i32 3
	; CHECK-NEXT: [[TMP34]] = add <4 x i32> [[TMP33]], [[TMP16]]			; CHECK-NEXT: [[TMP34]] = add <4 x i32> [[TMP33]], [[TMP16]]
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i64> [[VEC_IND]], <i64 8, i64 8, i64 8, i64 8>			; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i64> [[VEC_IND]], <i64 8, i64 8, i64 8, i64 8>
	; CHECK-NEXT: [[TMP35:%.*]] = icmp eq i64 [[INDEX_NEXT]], 500			; CHECK-NEXT: [[TMP35:%.*]] = icmp eq i64 [[INDEX_NEXT]], 500

llvm/test/Transforms/LoopVectorize/runtime-check.ll

; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, float* [[B]], i64 [[INDVARS_IV]], !dbg [[DBG9]]		; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, float* [[B]], i64 [[INDVARS_IV]], !dbg [[DBG9]]
; CHECK-NEXT: [[TMP10:%.*]] = load float, float* [[ARRAYIDX]], align 4, !dbg [[DBG9]]		; CHECK-NEXT: [[TMP10:%.*]] = load float, float* [[ARRAYIDX]], align 4, !dbg [[DBG9]]
; CHECK-NEXT: [[MUL:%.*]] = fmul float [[TMP10]], 3.000000e+00, !dbg [[DBG9]]		; CHECK-NEXT: [[MUL:%.*]] = fmul float [[TMP10]], 3.000000e+00, !dbg [[DBG9]]
; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[INDVARS_IV]], !dbg [[DBG9]]		; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[INDVARS_IV]], !dbg [[DBG9]]
; CHECK-NEXT: store float [[MUL]], float* [[ARRAYIDX2]], align 4, !dbg [[DBG9]]		; CHECK-NEXT: store float [[MUL]], float* [[ARRAYIDX2]], align 4, !dbg [[DBG9]]
; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1, !dbg [[DBG9]]		; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1, !dbg [[DBG9]]
; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32, !dbg [[DBG9]]		; CHECK-NEXT: [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32, !dbg [[DBG9]]
; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]], !dbg [[DBG9]]		; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[LFTR_WIDEIV]], [[N]], !dbg [[DBG9]]
; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !dbg [[DBG9]], !llvm.loop [[LOOP12:![0-9]+]]		; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_END_LOOPEXIT]], label [[FOR_BODY]], !dbg [[DBG9]], !llvm.loop [[LOOP13:![0-9]+]]
; CHECK: for.end.loopexit:		; CHECK: for.end.loopexit:
; CHECK-NEXT: br label [[FOR_END]], !dbg [[DBG13:![0-9]+]]		; CHECK-NEXT: br label [[FOR_END]], !dbg [[DBG14:![0-9]+]]
; CHECK: for.end:		; CHECK: for.end:
; CHECK-NEXT: ret i32 undef, !dbg [[DBG13]]		; CHECK-NEXT: ret i32 undef, !dbg [[DBG14]]
;		;
; FORCED_OPTSIZE-LABEL: @foo(		; FORCED_OPTSIZE-LABEL: @foo(
; FORCED_OPTSIZE-NEXT: entry:		; FORCED_OPTSIZE-NEXT: entry:
; FORCED_OPTSIZE-NEXT: [[CMP6:%.*]] = icmp sgt i32 [[N:%.*]], 0, !dbg [[DBG4:![0-9]+]]		; FORCED_OPTSIZE-NEXT: [[CMP6:%.*]] = icmp sgt i32 [[N:%.*]], 0, !dbg [[DBG4:![0-9]+]]
; FORCED_OPTSIZE-NEXT: br i1 [[CMP6]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]], !dbg [[DBG4]]		; FORCED_OPTSIZE-NEXT: br i1 [[CMP6]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]], !dbg [[DBG4]]
; FORCED_OPTSIZE: for.body.preheader:		; FORCED_OPTSIZE: for.body.preheader:
; FORCED_OPTSIZE-NEXT: br label [[FOR_BODY:%.*]], !dbg [[DBG9:![0-9]+]]		; FORCED_OPTSIZE-NEXT: br label [[FOR_BODY:%.*]], !dbg [[DBG9:![0-9]+]]
; FORCED_OPTSIZE: for.body:		; FORCED_OPTSIZE: for.body:
; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x float> poison, float [[B:%.*]], i64 0		; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x float> poison, float [[B:%.*]], i64 0
; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x float> [[BROADCAST_SPLATINSERT]], <4 x float> poison, <4 x i32> zeroinitializer		; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x float> [[BROADCAST_SPLATINSERT]], <4 x float> poison, <4 x i32> zeroinitializer
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]		; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
; CHECK: vector.body:		; CHECK: vector.body:
; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]		; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[INDEX]], [[OFFSET]]		; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[INDEX]], [[OFFSET]]
; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[TMP2]]		; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[TMP2]]
; CHECK-NEXT: [[TMP4:%.*]] = bitcast float* [[TMP3]] to <4 x float>*		; CHECK-NEXT: [[TMP4:%.*]] = bitcast float* [[TMP3]] to <4 x float>*
; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x float>, <4 x float>* [[TMP4]], align 4, !alias.scope !14, !noalias !17		; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x float>, <4 x float>* [[TMP4]], align 4, !alias.scope !15, !noalias !18
; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[INDEX]], [[OFFSET2]]		; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[INDEX]], [[OFFSET2]]
; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[TMP5]]		; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[TMP5]]
; CHECK-NEXT: [[TMP7:%.*]] = bitcast float* [[TMP6]] to <4 x float>*		; CHECK-NEXT: [[TMP7:%.*]] = bitcast float* [[TMP6]] to <4 x float>*
; CHECK-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x float>, <4 x float>* [[TMP7]], align 4, !alias.scope !17		; CHECK-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x float>, <4 x float>* [[TMP7]], align 4, !alias.scope !18
; CHECK-NEXT: [[TMP8:%.*]] = fmul fast <4 x float> [[BROADCAST_SPLAT]], [[WIDE_LOAD8]]		; CHECK-NEXT: [[TMP8:%.*]] = fmul fast <4 x float> [[BROADCAST_SPLAT]], [[WIDE_LOAD8]]
; CHECK-NEXT: [[TMP9:%.*]] = fadd fast <4 x float> [[WIDE_LOAD]], [[TMP8]]		; CHECK-NEXT: [[TMP9:%.*]] = fadd fast <4 x float> [[WIDE_LOAD]], [[TMP8]]
; CHECK-NEXT: [[TMP10:%.*]] = bitcast float* [[TMP3]] to <4 x float>*		; CHECK-NEXT: [[TMP10:%.*]] = bitcast float* [[TMP3]] to <4 x float>*
; CHECK-NEXT: store <4 x float> [[TMP9]], <4 x float>* [[TMP10]], align 4, !alias.scope !14, !noalias !17		; CHECK-NEXT: store <4 x float> [[TMP9]], <4 x float>* [[TMP10]], align 4, !alias.scope !15, !noalias !18
; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4		; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
; CHECK-NEXT: [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]		; CHECK-NEXT: [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
; CHECK-NEXT: br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP19:![0-9]+]]		; CHECK-NEXT: br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]]
; CHECK: middle.block:		; CHECK: middle.block:
; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[N]]		; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[N]]
; CHECK-NEXT: br i1 [[CMP_N]], label [[LOOPEXIT:%.*]], label [[SCALAR_PH]]		; CHECK-NEXT: br i1 [[CMP_N]], label [[LOOPEXIT:%.*]], label [[SCALAR_PH]]
; CHECK: scalar.ph:		; CHECK: scalar.ph:
; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]		; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ], [ 0, [[VECTOR_MEMCHECK]] ]
; CHECK-NEXT: br label [[FOR_BODY:%.*]]		; CHECK-NEXT: br label [[FOR_BODY:%.*]]
; CHECK: for.body:		; CHECK: for.body:
; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[FOR_BODY]] ]		; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[FOR_BODY]] ]
; CHECK-NEXT: [[IND_SUM:%.*]] = add i64 [[IV]], [[OFFSET]]		; CHECK-NEXT: [[IND_SUM:%.*]] = add i64 [[IV]], [[OFFSET]]
; CHECK-NEXT: [[ARR_IDX:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[IND_SUM]]		; CHECK-NEXT: [[ARR_IDX:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[IND_SUM]]
; CHECK-NEXT: [[L1:%.*]] = load float, float* [[ARR_IDX]], align 4		; CHECK-NEXT: [[L1:%.*]] = load float, float* [[ARR_IDX]], align 4
; CHECK-NEXT: [[IND_SUM2:%.*]] = add i64 [[IV]], [[OFFSET2]]		; CHECK-NEXT: [[IND_SUM2:%.*]] = add i64 [[IV]], [[OFFSET2]]
; CHECK-NEXT: [[ARR_IDX2:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[IND_SUM2]]		; CHECK-NEXT: [[ARR_IDX2:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[IND_SUM2]]
; CHECK-NEXT: [[L2:%.*]] = load float, float* [[ARR_IDX2]], align 4		; CHECK-NEXT: [[L2:%.*]] = load float, float* [[ARR_IDX2]], align 4
; CHECK-NEXT: [[M:%.*]] = fmul fast float [[L2]], [[B]]		; CHECK-NEXT: [[M:%.*]] = fmul fast float [[L2]], [[B]]
; CHECK-NEXT: [[AD:%.*]] = fadd fast float [[L1]], [[M]]		; CHECK-NEXT: [[AD:%.*]] = fadd fast float [[L1]], [[M]]
; CHECK-NEXT: store float [[AD]], float* [[ARR_IDX]], align 4		; CHECK-NEXT: store float [[AD]], float* [[ARR_IDX]], align 4
; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1		; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV_NEXT]], [[N]]		; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV_NEXT]], [[N]]
; CHECK-NEXT: br i1 [[EXITCOND]], label [[LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]]		; CHECK-NEXT: br i1 [[EXITCOND]], label [[LOOPEXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP21:![0-9]+]]
; CHECK: loopexit:		; CHECK: loopexit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
; FORCED_OPTSIZE-LABEL: @test_runtime_check(		; FORCED_OPTSIZE-LABEL: @test_runtime_check(
; FORCED_OPTSIZE-NEXT: entry:		; FORCED_OPTSIZE-NEXT: entry:
; FORCED_OPTSIZE-NEXT: br label [[FOR_BODY:%.*]]		; FORCED_OPTSIZE-NEXT: br label [[FOR_BODY:%.*]]
; FORCED_OPTSIZE: for.body:		; FORCED_OPTSIZE: for.body:
; FORCED_OPTSIZE-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[FOR_BODY]] ]		; FORCED_OPTSIZE-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[FOR_BODY]] ]
▲ Show 20 Lines • Show All 121 Lines • ▼ Show 20 Lines	for.body:
%iv.next = add nuw nsw i64 %iv, 1		%iv.next = add nuw nsw i64 %iv, 1
%exitcond = icmp eq i64 %iv.next, %n		%exitcond = icmp eq i64 %iv.next, %n
br i1 %exitcond, label %loopexit, label %for.body		br i1 %exitcond, label %loopexit, label %for.body

loopexit:		loopexit:
ret void		ret void
}		}

; CHECK: !9 = !DILocation(line: 101, column: 1, scope: !{{.*}})

define dso_local void @forced_optsize(i64* noalias nocapture readonly %x_p, i64* noalias nocapture readonly %y_p, i64* noalias nocapture %z_p) minsize optsize {		define dso_local void @forced_optsize(i64* noalias nocapture readonly %x_p, i64* noalias nocapture readonly %y_p, i64* noalias nocapture %z_p) minsize optsize {
		; CHECK-LABEL: @forced_optsize(
		; CHECK-NEXT: entry:
		; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
		; CHECK: vector.ph:
		; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
		; CHECK: vector.body:
		; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
		; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i64, i64* [[X_P:%.*]], i64 [[INDEX]]
		; CHECK-NEXT: [[TMP1:%.*]] = bitcast i64* [[TMP0]] to <2 x i64>*
		; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x i64>, <2 x i64>* [[TMP1]], align 8
		; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i64, i64* [[Y_P:%.*]], i64 [[INDEX]]
		; CHECK-NEXT: [[TMP3:%.*]] = bitcast i64* [[TMP2]] to <2 x i64>*
		; CHECK-NEXT: [[WIDE_LOAD1:%.*]] = load <2 x i64>, <2 x i64>* [[TMP3]], align 8
		; CHECK-NEXT: [[TMP4:%.*]] = add nsw <2 x i64> [[WIDE_LOAD1]], [[WIDE_LOAD]]
		; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i64, i64* [[Z_P:%.*]], i64 [[INDEX]]
		; CHECK-NEXT: [[TMP6:%.*]] = bitcast i64* [[TMP5]] to <2 x i64>*
		; CHECK-NEXT: store <2 x i64> [[TMP4]], <2 x i64>* [[TMP6]], align 8
		; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
		; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
		; CHECK-NEXT: br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP22:![0-9]+]]
		; CHECK: middle.block:
		; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
		; CHECK: scalar.ph:
		; CHECK-NEXT: br label [[FOR_BODY:%.*]]
		; CHECK: for.cond.cleanup:
		; CHECK-NEXT: ret void
		; CHECK: for.body:
		; CHECK-NEXT: br i1 poison, label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP23:![0-9]+]]
		;
; FORCED_OPTSIZE-LABEL: @forced_optsize(		; FORCED_OPTSIZE-LABEL: @forced_optsize(
; FORCED_OPTSIZE-NEXT: entry:		; FORCED_OPTSIZE-NEXT: entry:
; FORCED_OPTSIZE-NEXT: br label [[FOR_BODY:%.*]]		; FORCED_OPTSIZE-NEXT: br label [[FOR_BODY:%.*]]
; FORCED_OPTSIZE: for.cond.cleanup:		; FORCED_OPTSIZE: for.cond.cleanup:
; FORCED_OPTSIZE-NEXT: ret void		; FORCED_OPTSIZE-NEXT: ret void
; FORCED_OPTSIZE: for.body:		; FORCED_OPTSIZE: for.body:
; FORCED_OPTSIZE-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]		; FORCED_OPTSIZE-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
; FORCED_OPTSIZE-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i64, i64* [[X_P:%.*]], i64 [[INDVARS_IV]]		; FORCED_OPTSIZE-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i64, i64* [[X_P:%.*]], i64 [[INDVARS_IV]]
Show All 22 Lines	for.body:
%add = add nsw i64 %1, %0		%add = add nsw i64 %1, %0
%arrayidx4 = getelementptr inbounds i64, i64* %z_p, i64 %indvars.iv		%arrayidx4 = getelementptr inbounds i64, i64* %z_p, i64 %indvars.iv
store i64 %add, i64* %arrayidx4, align 8		store i64 %add, i64* %arrayidx4, align 8
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 128		%exitcond = icmp eq i64 %indvars.iv.next, 128
br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !12		br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !12
}		}

		; CHECK: !9 = !DILocation(line: 101, column: 1, scope: !{{.*}})

!llvm.module.flags = !{!0, !1}		!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!9}		!llvm.dbg.cu = !{!9}
!0 = !{i32 2, !"Dwarf Version", i32 4}		!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}		!1 = !{i32 2, !"Debug Info Version", i32 3}

!2 = !{}		!2 = !{}
!3 = !DISubroutineType(types: !2)		!3 = !DISubroutineType(types: !2)
!4 = !DIFile(filename: "test.cpp", directory: "/tmp")		!4 = !DIFile(filename: "test.cpp", directory: "/tmp")

llvm/test/Transforms/LoopVectorize/vectorize-once.ll


	_ZSt10accumulateIPiiET0_T_S2_S1_.exit: ; preds = %for.body.i, %entry			_ZSt10accumulateIPiiET0_T_S2_S1_.exit: ; preds = %for.body.i, %entry
	%__init.addr.0.lcssa.i = phi i32 [ 0, %entry ], [ %add.i, %for.body.i ]			%__init.addr.0.lcssa.i = phi i32 [ 0, %entry ], [ %add.i, %for.body.i ]
	ret i32 %__init.addr.0.lcssa.i			ret i32 %__init.addr.0.lcssa.i
	}			}

	attributes #0 = { nounwind readonly ssp uwtable "fp-contract-model"="standard" "frame-pointer"="non-leaf" "realign-stack" "relocation-model"="pic" "ssp-buffers-size"="8" }			attributes #0 = { nounwind readonly ssp uwtable "fp-contract-model"="standard" "frame-pointer"="non-leaf" "realign-stack" "relocation-model"="pic" "ssp-buffers-size"="8" }

	; CHECK: [[VEC_LOOP1]] = distinct !{[[VEC_LOOP1]], [[MD_IS_VEC:![0-9]+]]}			; CHECK: [[VEC_LOOP1]] = distinct !{[[VEC_LOOP1]], [[MD_IS_VEC:![0-9]+]], [[MD_RT_UNROLL_DIS:![0-9]+]]}
	; CHECK-NEXT: [[MD_IS_VEC:![0-9]+]] = !{!"llvm.loop.isvectorized", i32 1}			; CHECK-NEXT: [[MD_IS_VEC:![0-9]+]] = !{!"llvm.loop.isvectorized", i32 1}
	; CHECK-NEXT: [[SCALAR_LOOP1]] = distinct !{[[SCALAR_LOOP1]], [[MD_RT_UNROLL_DIS:![0-9]+]], [[MD_IS_VEC]]}
	; CHECK-NEXT: [[MD_RT_UNROLL_DIS]] = !{!"llvm.loop.unroll.runtime.disable"}			; CHECK-NEXT: [[MD_RT_UNROLL_DIS]] = !{!"llvm.loop.unroll.runtime.disable"}
				; CHECK-NEXT: [[SCALAR_LOOP1]] = distinct !{[[SCALAR_LOOP1]], [[MD_RT_UNROLL_DIS:![0-9]+]], [[MD_IS_VEC]]}
	; CHECK-NEXT: [[SCALAR_LOOP2]] = distinct !{[[SCALAR_LOOP2]], [[VEC_WIDTH_1:![0-9]+]]}			; CHECK-NEXT: [[SCALAR_LOOP2]] = distinct !{[[SCALAR_LOOP2]], [[VEC_WIDTH_1:![0-9]+]]}
	; CHECK-NEXT: [[VEC_WIDTH_1]] = !{!"llvm.loop.vectorize.width", i32 1}			; CHECK-NEXT: [[VEC_WIDTH_1]] = !{!"llvm.loop.vectorize.width", i32 1}

	!0 = !{!0, !1}			!0 = !{!0, !1}
	!1 = !{!"llvm.loop.vectorize.width", i32 1}			!1 = !{!"llvm.loop.vectorize.width", i32 1}

llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll

	; CHECK-NEXT: [[CMP_I:%.*]] = fcmp olt double [[TMP15]], 0.000000e+00			; CHECK-NEXT: [[CMP_I:%.*]] = fcmp olt double [[TMP15]], 0.000000e+00
	; CHECK-NEXT: [[CMP1_I:%.*]] = fcmp ogt double [[TMP15]], 6.000000e+00			; CHECK-NEXT: [[CMP1_I:%.*]] = fcmp ogt double [[TMP15]], 6.000000e+00
	; CHECK-NEXT: [[DOTV_I:%.*]] = select i1 [[CMP1_I]], double 6.000000e+00, double [[TMP15]]			; CHECK-NEXT: [[DOTV_I:%.*]] = select i1 [[CMP1_I]], double 6.000000e+00, double [[TMP15]]
	; CHECK-NEXT: [[RETVAL_0_I:%.*]] = select i1 [[CMP_I]], double 0.000000e+00, double [[DOTV_I]]			; CHECK-NEXT: [[RETVAL_0_I:%.*]] = select i1 [[CMP_I]], double 0.000000e+00, double [[DOTV_I]]
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[IDXPROM]]			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[IDXPROM]]
	; CHECK-NEXT: store double [[RETVAL_0_I]], ptr [[ARRAYIDX2]], align 8			; CHECK-NEXT: store double [[RETVAL_0_I]], ptr [[ARRAYIDX2]], align 8
	; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_04]], 1			; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_04]], 1
	; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[I_04]], 19999			; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[I_04]], 19999
	; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_COND_CLEANUP]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_COND_CLEANUP]], !llvm.loop [[LOOP3:![0-9]+]]
	;			;
	entry:			entry:
	%X.addr = alloca ptr, align 8			%X.addr = alloca ptr, align 8
	%Y.addr = alloca ptr, align 8			%Y.addr = alloca ptr, align 8
	%i = alloca i32, align 4			%i = alloca i32, align 4
	store ptr %X, ptr %X.addr, align 8			store ptr %X, ptr %X.addr, align 8
	store ptr %Y, ptr %Y.addr, align 8			store ptr %Y, ptr %Y.addr, align 8
	call void @llvm.lifetime.start.p0(i64 4, ptr %i) #2			call void @llvm.lifetime.start.p0(i64 4, ptr %i) #2
	; CHECK-NEXT: br i1 [[CONFLICT_RDX]], label [[LOOP_BODY:%.*]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[CONFLICT_RDX]], label [[LOOP_BODY:%.*]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x float> poison, float [[X:%.*]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x float> poison, float [[X:%.*]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x float> [[BROADCAST_SPLATINSERT]], <4 x float> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x float> [[BROADCAST_SPLATINSERT]], <4 x float> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[INDEX]]
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP0]], align 4, !alias.scope !3			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP0]], align 4, !alias.scope !4
	; CHECK-NEXT: [[TMP1:%.*]] = icmp eq <4 x i32> [[WIDE_LOAD]], <i32 20, i32 20, i32 20, i32 20>			; CHECK-NEXT: [[TMP1:%.*]] = icmp eq <4 x i32> [[WIDE_LOAD]], <i32 20, i32 20, i32 20, i32 20>
	; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[INDEX]]
	; CHECK-NEXT: [[WIDE_LOAD7:%.*]] = load <4 x float>, ptr [[TMP2]], align 4, !alias.scope !6			; CHECK-NEXT: [[WIDE_LOAD7:%.*]] = load <4 x float>, ptr [[TMP2]], align 4, !alias.scope !7
	; CHECK-NEXT: [[TMP3:%.*]] = fmul <4 x float> [[WIDE_LOAD7]], [[BROADCAST_SPLAT]]			; CHECK-NEXT: [[TMP3:%.*]] = fmul <4 x float> [[WIDE_LOAD7]], [[BROADCAST_SPLAT]]
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr float, ptr [[B]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr float, ptr [[B]], i64 [[INDEX]]
	; CHECK-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x float>, ptr [[TMP4]], align 4, !alias.scope !8, !noalias !10			; CHECK-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x float>, ptr [[TMP4]], align 4, !alias.scope !9, !noalias !11
	; CHECK-NEXT: [[TMP5:%.*]] = select <4 x i1> [[TMP1]], <4 x float> <float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00>, <4 x float> [[WIDE_LOAD8]]			; CHECK-NEXT: [[TMP5:%.*]] = select <4 x i1> [[TMP1]], <4 x float> <float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00>, <4 x float> [[WIDE_LOAD8]]
	; CHECK-NEXT: [[PREDPHI:%.*]] = fadd <4 x float> [[TMP3]], [[TMP5]]			; CHECK-NEXT: [[PREDPHI:%.*]] = fadd <4 x float> [[TMP3]], [[TMP5]]
	; CHECK-NEXT: store <4 x float> [[PREDPHI]], ptr [[TMP4]], align 4, !alias.scope !8, !noalias !10			; CHECK-NEXT: store <4 x float> [[PREDPHI]], ptr [[TMP4]], align 4, !alias.scope !9, !noalias !11
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP6:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000			; CHECK-NEXT: [[TMP6:%.*]] = icmp eq i64 [[INDEX_NEXT]], 10000
	; CHECK-NEXT: br i1 [[TMP6]], label [[EXIT:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP6]], label [[EXIT:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]
	; CHECK: loop.body:			; CHECK: loop.body:
	; CHECK-NEXT: [[IV1:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[IV1:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[C_GEP:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[IV1]]			; CHECK-NEXT: [[C_GEP:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[IV1]]
	; CHECK-NEXT: [[C_LV:%.*]] = load i32, ptr [[C_GEP]], align 4			; CHECK-NEXT: [[C_LV:%.*]] = load i32, ptr [[C_GEP]], align 4
	; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[C_LV]], 20			; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[C_LV]], 20
	; CHECK-NEXT: [[A_GEP_0:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[IV1]]			; CHECK-NEXT: [[A_GEP_0:%.*]] = getelementptr inbounds float, ptr [[A]], i64 [[IV1]]
	; CHECK-NEXT: [[A_LV_0:%.*]] = load float, ptr [[A_GEP_0]], align 4			; CHECK-NEXT: [[A_LV_0:%.*]] = load float, ptr [[A_GEP_0]], align 4
	; CHECK-NEXT: [[MUL2_I81_I:%.*]] = fmul float [[A_LV_0]], [[X]]			; CHECK-NEXT: [[MUL2_I81_I:%.*]] = fmul float [[A_LV_0]], [[X]]
	; CHECK-NEXT: [[B_GEP_0:%.*]] = getelementptr inbounds float, ptr [[B]], i64 [[IV1]]			; CHECK-NEXT: [[B_GEP_0:%.*]] = getelementptr inbounds float, ptr [[B]], i64 [[IV1]]
	; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[ELSE:%.*]]			; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[ELSE:%.*]]
	; CHECK: else:			; CHECK: else:
	; CHECK-NEXT: [[B_LV:%.*]] = load float, ptr [[B_GEP_0]], align 4			; CHECK-NEXT: [[B_LV:%.*]] = load float, ptr [[B_GEP_0]], align 4
	; CHECK-NEXT: [[ADD:%.*]] = fadd float [[MUL2_I81_I]], [[B_LV]]			; CHECK-NEXT: [[ADD:%.*]] = fadd float [[MUL2_I81_I]], [[B_LV]]
	; CHECK-NEXT: br label [[LOOP_LATCH]]			; CHECK-NEXT: br label [[LOOP_LATCH]]
	; CHECK: loop.latch:			; CHECK: loop.latch:
	; CHECK-NEXT: [[ADD_SINK:%.*]] = phi float [ [[ADD]], [[ELSE]] ], [ [[MUL2_I81_I]], [[LOOP_BODY]] ]			; CHECK-NEXT: [[ADD_SINK:%.*]] = phi float [ [[ADD]], [[ELSE]] ], [ [[MUL2_I81_I]], [[LOOP_BODY]] ]
	; CHECK-NEXT: store float [[ADD_SINK]], ptr [[B_GEP_0]], align 4			; CHECK-NEXT: store float [[ADD_SINK]], ptr [[B_GEP_0]], align 4
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV1]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV1]], 1
	; CHECK-NEXT: [[CMP_0:%.*]] = icmp ult i64 [[IV1]], 9999			; CHECK-NEXT: [[CMP_0:%.*]] = icmp ult i64 [[IV1]], 9999
	; CHECK-NEXT: br i1 [[CMP_0]], label [[LOOP_BODY]], label [[EXIT]], !llvm.loop [[LOOP12:![0-9]+]]			; CHECK-NEXT: br i1 [[CMP_0]], label [[LOOP_BODY]], label [[EXIT]], !llvm.loop [[LOOP13:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %loop.header			br label %loop.header

	loop.header:			loop.header:
	%iv = phi i64 [ %iv.next, %loop.latch ], [ 0, %entry ]			%iv = phi i64 [ %iv.next, %loop.latch ], [ 0, %entry ]

llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll

	; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[N:%.*]], 0			; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[N:%.*]], 0
	; CHECK-NEXT: br i1 [[CMP1]], label [[FOR_BODY_PREHEADER:%.*]], label [[EXIT:%.*]]			; CHECK-NEXT: br i1 [[CMP1]], label [[FOR_BODY_PREHEADER:%.*]], label [[EXIT:%.*]]
	; CHECK: for.body.preheader:			; CHECK: for.body.preheader:
	; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[N]] to i64			; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[N]] to i64
	; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 4			; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 4
	; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER7:%.*]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER7:%.*]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 4294967292			; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 4294967292
	; CHECK-NEXT: [[TMP0:%.*]] = add nsw i64 [[WIDE_TRIP_COUNT]], -4
	; CHECK-NEXT: [[TMP1:%.*]] = lshr i64 [[TMP0]], 2
	; CHECK-NEXT: [[TMP2:%.*]] = add nuw nsw i64 [[TMP1]], 1
	; CHECK-NEXT: [[XTRAITER:%.*]] = and i64 [[TMP2]], 7
	; CHECK-NEXT: [[TMP3:%.*]] = icmp ult i64 [[TMP0]], 28
	; CHECK-NEXT: br i1 [[TMP3]], label [[MIDDLE_BLOCK_UNR_LCSSA:%.*]], label [[VECTOR_PH_NEW:%.*]]
	; CHECK: vector.ph.new:
	; CHECK-NEXT: [[UNROLL_ITER:%.*]] = and i64 [[TMP2]], -8
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_7:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[NITER:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[NITER_NEXT_7:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX]]
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX]]			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x double>, ptr [[TMP0]], align 16
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x double>, ptr [[TMP4]], align 16			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds double, ptr [[TMP0]], i64 2
	; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds double, ptr [[TMP4]], i64 2			; CHECK-NEXT: [[WIDE_LOAD4:%.*]] = load <2 x double>, ptr [[TMP1]], align 16
	; CHECK-NEXT: [[WIDE_LOAD4:%.*]] = load <2 x double>, ptr [[TMP5]], align 16			; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX]]
	; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX]]			; CHECK-NEXT: [[WIDE_LOAD5:%.*]] = load <2 x double>, ptr [[TMP2]], align 16
	; CHECK-NEXT: [[WIDE_LOAD5:%.*]] = load <2 x double>, ptr [[TMP6]], align 16			; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds double, ptr [[TMP2]], i64 2
				; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <2 x double>, ptr [[TMP3]], align 16
				; CHECK-NEXT: [[TMP4:%.*]] = fadd <2 x double> [[WIDE_LOAD]], [[WIDE_LOAD5]]
				; CHECK-NEXT: [[TMP5:%.*]] = fadd <2 x double> [[WIDE_LOAD4]], [[WIDE_LOAD6]]
				; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX]]
				; CHECK-NEXT: store <2 x double> [[TMP4]], ptr [[TMP6]], align 16
	; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds double, ptr [[TMP6]], i64 2			; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds double, ptr [[TMP6]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <2 x double>, ptr [[TMP7]], align 16			; CHECK-NEXT: store <2 x double> [[TMP5]], ptr [[TMP7]], align 16
	; CHECK-NEXT: [[TMP8:%.*]] = fadd <2 x double> [[WIDE_LOAD]], [[WIDE_LOAD5]]			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP9:%.*]] = fadd <2 x double> [[WIDE_LOAD4]], [[WIDE_LOAD6]]			; CHECK-NEXT: [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX]]			; CHECK-NEXT: br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK-NEXT: store <2 x double> [[TMP8]], ptr [[TMP10]], align 16
	; CHECK-NEXT: [[TMP11:%.*]] = getelementptr inbounds double, ptr [[TMP10]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP9]], ptr [[TMP11]], align 16
	; CHECK-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP12:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT]]
	; CHECK-NEXT: [[WIDE_LOAD_1:%.*]] = load <2 x double>, ptr [[TMP12]], align 16
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, ptr [[TMP12]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_1:%.*]] = load <2 x double>, ptr [[TMP13]], align 16
	; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT]]
	; CHECK-NEXT: [[WIDE_LOAD5_1:%.*]] = load <2 x double>, ptr [[TMP14]], align 16
	; CHECK-NEXT: [[TMP15:%.*]] = getelementptr inbounds double, ptr [[TMP14]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_1:%.*]] = load <2 x double>, ptr [[TMP15]], align 16
	; CHECK-NEXT: [[TMP16:%.*]] = fadd <2 x double> [[WIDE_LOAD_1]], [[WIDE_LOAD5_1]]
	; CHECK-NEXT: [[TMP17:%.*]] = fadd <2 x double> [[WIDE_LOAD4_1]], [[WIDE_LOAD6_1]]
	; CHECK-NEXT: [[TMP18:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT]]
	; CHECK-NEXT: store <2 x double> [[TMP16]], ptr [[TMP18]], align 16
	; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds double, ptr [[TMP18]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP17]], ptr [[TMP19]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_1:%.*]] = or i64 [[INDEX]], 8
	; CHECK-NEXT: [[TMP20:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT_1]]
	; CHECK-NEXT: [[WIDE_LOAD_2:%.*]] = load <2 x double>, ptr [[TMP20]], align 16
	; CHECK-NEXT: [[TMP21:%.*]] = getelementptr inbounds double, ptr [[TMP20]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_2:%.*]] = load <2 x double>, ptr [[TMP21]], align 16
	; CHECK-NEXT: [[TMP22:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT_1]]
	; CHECK-NEXT: [[WIDE_LOAD5_2:%.*]] = load <2 x double>, ptr [[TMP22]], align 16
	; CHECK-NEXT: [[TMP23:%.*]] = getelementptr inbounds double, ptr [[TMP22]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_2:%.*]] = load <2 x double>, ptr [[TMP23]], align 16
	; CHECK-NEXT: [[TMP24:%.*]] = fadd <2 x double> [[WIDE_LOAD_2]], [[WIDE_LOAD5_2]]
	; CHECK-NEXT: [[TMP25:%.*]] = fadd <2 x double> [[WIDE_LOAD4_2]], [[WIDE_LOAD6_2]]
	; CHECK-NEXT: [[TMP26:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT_1]]
	; CHECK-NEXT: store <2 x double> [[TMP24]], ptr [[TMP26]], align 16
	; CHECK-NEXT: [[TMP27:%.*]] = getelementptr inbounds double, ptr [[TMP26]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP25]], ptr [[TMP27]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_2:%.*]] = or i64 [[INDEX]], 12
	; CHECK-NEXT: [[TMP28:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT_2]]
	; CHECK-NEXT: [[WIDE_LOAD_3:%.*]] = load <2 x double>, ptr [[TMP28]], align 16
	; CHECK-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, ptr [[TMP28]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_3:%.*]] = load <2 x double>, ptr [[TMP29]], align 16
	; CHECK-NEXT: [[TMP30:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT_2]]
	; CHECK-NEXT: [[WIDE_LOAD5_3:%.*]] = load <2 x double>, ptr [[TMP30]], align 16
	; CHECK-NEXT: [[TMP31:%.*]] = getelementptr inbounds double, ptr [[TMP30]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_3:%.*]] = load <2 x double>, ptr [[TMP31]], align 16
	; CHECK-NEXT: [[TMP32:%.*]] = fadd <2 x double> [[WIDE_LOAD_3]], [[WIDE_LOAD5_3]]
	; CHECK-NEXT: [[TMP33:%.*]] = fadd <2 x double> [[WIDE_LOAD4_3]], [[WIDE_LOAD6_3]]
	; CHECK-NEXT: [[TMP34:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT_2]]
	; CHECK-NEXT: store <2 x double> [[TMP32]], ptr [[TMP34]], align 16
	; CHECK-NEXT: [[TMP35:%.*]] = getelementptr inbounds double, ptr [[TMP34]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP33]], ptr [[TMP35]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_3:%.*]] = or i64 [[INDEX]], 16
	; CHECK-NEXT: [[TMP36:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT_3]]
	; CHECK-NEXT: [[WIDE_LOAD_4:%.*]] = load <2 x double>, ptr [[TMP36]], align 16
	; CHECK-NEXT: [[TMP37:%.*]] = getelementptr inbounds double, ptr [[TMP36]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_4:%.*]] = load <2 x double>, ptr [[TMP37]], align 16
	; CHECK-NEXT: [[TMP38:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT_3]]
	; CHECK-NEXT: [[WIDE_LOAD5_4:%.*]] = load <2 x double>, ptr [[TMP38]], align 16
	; CHECK-NEXT: [[TMP39:%.*]] = getelementptr inbounds double, ptr [[TMP38]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_4:%.*]] = load <2 x double>, ptr [[TMP39]], align 16
	; CHECK-NEXT: [[TMP40:%.*]] = fadd <2 x double> [[WIDE_LOAD_4]], [[WIDE_LOAD5_4]]
	; CHECK-NEXT: [[TMP41:%.*]] = fadd <2 x double> [[WIDE_LOAD4_4]], [[WIDE_LOAD6_4]]
	; CHECK-NEXT: [[TMP42:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT_3]]
	; CHECK-NEXT: store <2 x double> [[TMP40]], ptr [[TMP42]], align 16
	; CHECK-NEXT: [[TMP43:%.*]] = getelementptr inbounds double, ptr [[TMP42]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP41]], ptr [[TMP43]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_4:%.*]] = or i64 [[INDEX]], 20
	; CHECK-NEXT: [[TMP44:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT_4]]
	; CHECK-NEXT: [[WIDE_LOAD_5:%.*]] = load <2 x double>, ptr [[TMP44]], align 16
	; CHECK-NEXT: [[TMP45:%.*]] = getelementptr inbounds double, ptr [[TMP44]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_5:%.*]] = load <2 x double>, ptr [[TMP45]], align 16
	; CHECK-NEXT: [[TMP46:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT_4]]
	; CHECK-NEXT: [[WIDE_LOAD5_5:%.*]] = load <2 x double>, ptr [[TMP46]], align 16
	; CHECK-NEXT: [[TMP47:%.*]] = getelementptr inbounds double, ptr [[TMP46]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_5:%.*]] = load <2 x double>, ptr [[TMP47]], align 16
	; CHECK-NEXT: [[TMP48:%.*]] = fadd <2 x double> [[WIDE_LOAD_5]], [[WIDE_LOAD5_5]]
	; CHECK-NEXT: [[TMP49:%.*]] = fadd <2 x double> [[WIDE_LOAD4_5]], [[WIDE_LOAD6_5]]
	; CHECK-NEXT: [[TMP50:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT_4]]
	; CHECK-NEXT: store <2 x double> [[TMP48]], ptr [[TMP50]], align 16
	; CHECK-NEXT: [[TMP51:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP49]], ptr [[TMP51]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_5:%.*]] = or i64 [[INDEX]], 24
	; CHECK-NEXT: [[TMP52:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT_5]]
	; CHECK-NEXT: [[WIDE_LOAD_6:%.*]] = load <2 x double>, ptr [[TMP52]], align 16
	; CHECK-NEXT: [[TMP53:%.*]] = getelementptr inbounds double, ptr [[TMP52]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_6:%.*]] = load <2 x double>, ptr [[TMP53]], align 16
	; CHECK-NEXT: [[TMP54:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT_5]]
	; CHECK-NEXT: [[WIDE_LOAD5_6:%.*]] = load <2 x double>, ptr [[TMP54]], align 16
	; CHECK-NEXT: [[TMP55:%.*]] = getelementptr inbounds double, ptr [[TMP54]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_6:%.*]] = load <2 x double>, ptr [[TMP55]], align 16
	; CHECK-NEXT: [[TMP56:%.*]] = fadd <2 x double> [[WIDE_LOAD_6]], [[WIDE_LOAD5_6]]
	; CHECK-NEXT: [[TMP57:%.*]] = fadd <2 x double> [[WIDE_LOAD4_6]], [[WIDE_LOAD6_6]]
	; CHECK-NEXT: [[TMP58:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT_5]]
	; CHECK-NEXT: store <2 x double> [[TMP56]], ptr [[TMP58]], align 16
	; CHECK-NEXT: [[TMP59:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP57]], ptr [[TMP59]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_6:%.*]] = or i64 [[INDEX]], 28
	; CHECK-NEXT: [[TMP60:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_NEXT_6]]
	; CHECK-NEXT: [[WIDE_LOAD_7:%.*]] = load <2 x double>, ptr [[TMP60]], align 16
	; CHECK-NEXT: [[TMP61:%.*]] = getelementptr inbounds double, ptr [[TMP60]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_7:%.*]] = load <2 x double>, ptr [[TMP61]], align 16
	; CHECK-NEXT: [[TMP62:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_NEXT_6]]
	; CHECK-NEXT: [[WIDE_LOAD5_7:%.*]] = load <2 x double>, ptr [[TMP62]], align 16
	; CHECK-NEXT: [[TMP63:%.*]] = getelementptr inbounds double, ptr [[TMP62]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_7:%.*]] = load <2 x double>, ptr [[TMP63]], align 16
	; CHECK-NEXT: [[TMP64:%.*]] = fadd <2 x double> [[WIDE_LOAD_7]], [[WIDE_LOAD5_7]]
	; CHECK-NEXT: [[TMP65:%.*]] = fadd <2 x double> [[WIDE_LOAD4_7]], [[WIDE_LOAD6_7]]
	; CHECK-NEXT: [[TMP66:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_NEXT_6]]
	; CHECK-NEXT: store <2 x double> [[TMP64]], ptr [[TMP66]], align 16
	; CHECK-NEXT: [[TMP67:%.*]] = getelementptr inbounds double, ptr [[TMP66]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP65]], ptr [[TMP67]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_7]] = add nuw i64 [[INDEX]], 32
	; CHECK-NEXT: [[NITER_NEXT_7]] = add i64 [[NITER]], 8
	; CHECK-NEXT: [[NITER_NCMP_7:%.*]] = icmp eq i64 [[NITER_NEXT_7]], [[UNROLL_ITER]]
	; CHECK-NEXT: br i1 [[NITER_NCMP_7]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK: middle.block.unr-lcssa:
	; CHECK-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_7]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; CHECK-NEXT: br i1 [[LCMP_MOD_NOT]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL:%.*]]
	; CHECK: vector.body.epil:
	; CHECK-NEXT: [[INDEX_EPIL:%.*]] = phi i64 [ [[INDEX_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ], [ [[INDEX_UNR]], [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; CHECK-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[EPIL_ITER_NEXT:%.*]], [[VECTOR_BODY_EPIL]] ], [ 0, [[MIDDLE_BLOCK_UNR_LCSSA]] ]
	; CHECK-NEXT: [[TMP68:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDEX_EPIL]]
	; CHECK-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <2 x double>, ptr [[TMP68]], align 16
	; CHECK-NEXT: [[TMP69:%.*]] = getelementptr inbounds double, ptr [[TMP68]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD4_EPIL:%.*]] = load <2 x double>, ptr [[TMP69]], align 16
	; CHECK-NEXT: [[TMP70:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDEX_EPIL]]
	; CHECK-NEXT: [[WIDE_LOAD5_EPIL:%.*]] = load <2 x double>, ptr [[TMP70]], align 16
	; CHECK-NEXT: [[TMP71:%.*]] = getelementptr inbounds double, ptr [[TMP70]], i64 2
	; CHECK-NEXT: [[WIDE_LOAD6_EPIL:%.*]] = load <2 x double>, ptr [[TMP71]], align 16
	; CHECK-NEXT: [[TMP72:%.*]] = fadd <2 x double> [[WIDE_LOAD_EPIL]], [[WIDE_LOAD5_EPIL]]
	; CHECK-NEXT: [[TMP73:%.*]] = fadd <2 x double> [[WIDE_LOAD4_EPIL]], [[WIDE_LOAD6_EPIL]]
	; CHECK-NEXT: [[TMP74:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDEX_EPIL]]
	; CHECK-NEXT: store <2 x double> [[TMP72]], ptr [[TMP74]], align 16
	; CHECK-NEXT: [[TMP75:%.*]] = getelementptr inbounds double, ptr [[TMP74]], i64 2
	; CHECK-NEXT: store <2 x double> [[TMP73]], ptr [[TMP75]], align 16
	; CHECK-NEXT: [[INDEX_NEXT_EPIL]] = add nuw i64 [[INDEX_EPIL]], 4
	; CHECK-NEXT: [[EPIL_ITER_NEXT]] = add i64 [[EPIL_ITER]], 1
	; CHECK-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_NEXT]], [[XTRAITER]]
	; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], !llvm.loop [[LOOP2:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT]], label [[FOR_BODY_PREHEADER7]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT]], label [[FOR_BODY_PREHEADER7]]
	; CHECK: for.body.preheader7:			; CHECK: for.body.preheader7:
	; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER7]] ]			; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER7]] ]
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [58 x double], ptr @b, i64 0, i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[TMP76:%.*]] = load double, ptr [[ARRAYIDX]], align 8			; CHECK-NEXT: [[TMP9:%.*]] = load double, ptr [[ARRAYIDX]], align 8
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds [58 x double], ptr @c, i64 0, i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[TMP77:%.*]] = load double, ptr [[ARRAYIDX2]], align 8			; CHECK-NEXT: [[TMP10:%.*]] = load double, ptr [[ARRAYIDX2]], align 8
	; CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP76]], [[TMP77]]			; CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP9]], [[TMP10]]
	; CHECK-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds [58 x double], ptr @a, i64 0, i64 [[INDVARS_IV]]
	; CHECK-NEXT: store double [[ADD]], ptr [[ARRAYIDX4]], align 8			; CHECK-NEXT: store double [[ADD]], ptr [[ARRAYIDX4]], align 8
	; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.cond			br label %for.cond

	for.cond:			for.cond:
	%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]			%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]

llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll

	; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x double> poison, double [[A:%.*]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x double> poison, double [[A:%.*]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT]], <4 x double> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT]], <4 x double> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT9:%.*]] = insertelement <4 x double> poison, double [[A]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT9:%.*]] = insertelement <4 x double> poison, double [[A]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT10:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT9]], <4 x double> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT10:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT9]], <4 x double> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT11:%.*]] = insertelement <4 x double> poison, double [[A]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT11:%.*]] = insertelement <4 x double> poison, double [[A]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT12:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT11]], <4 x double> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT12:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT11]], <4 x double> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: [[BROADCAST_SPLATINSERT13:%.*]] = insertelement <4 x double> poison, double [[A]], i64 0			; CHECK-NEXT: [[BROADCAST_SPLATINSERT13:%.*]] = insertelement <4 x double> poison, double [[A]], i64 0
	; CHECK-NEXT: [[BROADCAST_SPLAT14:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT13]], <4 x double> poison, <4 x i32> zeroinitializer			; CHECK-NEXT: [[BROADCAST_SPLAT14:%.*]] = shufflevector <4 x double> [[BROADCAST_SPLATINSERT13]], <4 x double> poison, <4 x i32> zeroinitializer
	; CHECK-NEXT: [[TMP1:%.*]] = add nsw i64 [[WIDE_TRIP_COUNT]], -16			; CHECK-NEXT: [[TMP1:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]
	; CHECK-NEXT: [[TMP2:%.*]] = lshr i64 [[TMP1]], 4			; CHECK-NEXT: [[TMP2:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT10]]
	; CHECK-NEXT: [[TMP3:%.*]] = add nuw nsw i64 [[TMP2]], 1			; CHECK-NEXT: [[TMP3:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT12]]
	; CHECK-NEXT: [[XTRAITER:%.*]] = and i64 [[TMP3]], 1			; CHECK-NEXT: [[TMP4:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT14]]
	; CHECK-NEXT: [[TMP4:%.*]] = icmp ult i64 [[TMP1]], 16
	; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK_UNR_LCSSA:%.*]], label [[VECTOR_PH_NEW:%.*]]
	; CHECK: vector.ph.new:
	; CHECK-NEXT: [[UNROLL_ITER:%.*]] = and i64 [[TMP3]], -2
	; CHECK-NEXT: [[TMP5:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]
	; CHECK-NEXT: [[TMP6:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT10]]
	; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT12]]
	; CHECK-NEXT: [[TMP8:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT14]]
	; CHECK-NEXT: [[TMP9:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]
	; CHECK-NEXT: [[TMP10:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT10]]
	; CHECK-NEXT: [[TMP11:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT12]]
	; CHECK-NEXT: [[TMP12:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT14]]
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[NITER:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[NITER_NEXT_1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDEX]]
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDEX]]			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, ptr [[TMP5]], align 8, !tbaa [[TBAA3:![0-9]+]]
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, ptr [[TMP13]], align 8, !tbaa [[TBAA3:![0-9]+]]			; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds double, ptr [[TMP5]], i64 4
	; CHECK-NEXT: [[TMP15:%.*]] = getelementptr inbounds double, ptr [[TMP13]], i64 4			; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x double>, ptr [[TMP6]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x double>, ptr [[TMP15]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds double, ptr [[TMP5]], i64 8
	; CHECK-NEXT: [[TMP17:%.*]] = getelementptr inbounds double, ptr [[TMP13]], i64 8			; CHECK-NEXT: [[WIDE_LOAD7:%.*]] = load <4 x double>, ptr [[TMP7]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[WIDE_LOAD7:%.*]] = load <4 x double>, ptr [[TMP17]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, ptr [[TMP5]], i64 12
	; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds double, ptr [[TMP13]], i64 12			; CHECK-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x double>, ptr [[TMP8]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[WIDE_LOAD8:%.*]] = load <4 x double>, ptr [[TMP19]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP9:%.*]] = fmul fast <4 x double> [[WIDE_LOAD]], [[TMP1]]
	; CHECK-NEXT: [[TMP21:%.*]] = fmul fast <4 x double> [[WIDE_LOAD]], [[TMP5]]			; CHECK-NEXT: [[TMP10:%.*]] = fmul fast <4 x double> [[WIDE_LOAD6]], [[TMP2]]
	; CHECK-NEXT: [[TMP22:%.*]] = fmul fast <4 x double> [[WIDE_LOAD6]], [[TMP6]]			; CHECK-NEXT: [[TMP11:%.*]] = fmul fast <4 x double> [[WIDE_LOAD7]], [[TMP3]]
	; CHECK-NEXT: [[TMP23:%.*]] = fmul fast <4 x double> [[WIDE_LOAD7]], [[TMP7]]			; CHECK-NEXT: [[TMP12:%.*]] = fmul fast <4 x double> [[WIDE_LOAD8]], [[TMP4]]
	; CHECK-NEXT: [[TMP24:%.*]] = fmul fast <4 x double> [[WIDE_LOAD8]], [[TMP8]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDEX]]
	; CHECK-NEXT: [[TMP25:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDEX]]			; CHECK-NEXT: store <4 x double> [[TMP9]], ptr [[TMP13]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: store <4 x double> [[TMP21]], ptr [[TMP25]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds double, ptr [[TMP13]], i64 4
	; CHECK-NEXT: [[TMP27:%.*]] = getelementptr inbounds double, ptr [[TMP25]], i64 4			; CHECK-NEXT: store <4 x double> [[TMP10]], ptr [[TMP14]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: store <4 x double> [[TMP22]], ptr [[TMP27]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP15:%.*]] = getelementptr inbounds double, ptr [[TMP13]], i64 8
	; CHECK-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, ptr [[TMP25]], i64 8			; CHECK-NEXT: store <4 x double> [[TMP11]], ptr [[TMP15]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: store <4 x double> [[TMP23]], ptr [[TMP29]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP16:%.*]] = getelementptr inbounds double, ptr [[TMP13]], i64 12
	; CHECK-NEXT: [[TMP31:%.*]] = getelementptr inbounds double, ptr [[TMP25]], i64 12			; CHECK-NEXT: store <4 x double> [[TMP12]], ptr [[TMP16]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: store <4 x double> [[TMP24]], ptr [[TMP31]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
	; CHECK-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 16			; CHECK-NEXT: [[TMP17:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: [[TMP33:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDEX_NEXT]]			; CHECK-NEXT: br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
	; CHECK-NEXT: [[WIDE_LOAD_1:%.*]] = load <4 x double>, ptr [[TMP33]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP35:%.*]] = getelementptr inbounds double, ptr [[TMP33]], i64 4
	; CHECK-NEXT: [[WIDE_LOAD6_1:%.*]] = load <4 x double>, ptr [[TMP35]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP37:%.*]] = getelementptr inbounds double, ptr [[TMP33]], i64 8
	; CHECK-NEXT: [[WIDE_LOAD7_1:%.*]] = load <4 x double>, ptr [[TMP37]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP39:%.*]] = getelementptr inbounds double, ptr [[TMP33]], i64 12
	; CHECK-NEXT: [[WIDE_LOAD8_1:%.*]] = load <4 x double>, ptr [[TMP39]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP41:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_1]], [[TMP9]]
	; CHECK-NEXT: [[TMP42:%.*]] = fmul fast <4 x double> [[WIDE_LOAD6_1]], [[TMP10]]
	; CHECK-NEXT: [[TMP43:%.*]] = fmul fast <4 x double> [[WIDE_LOAD7_1]], [[TMP11]]
	; CHECK-NEXT: [[TMP44:%.*]] = fmul fast <4 x double> [[WIDE_LOAD8_1]], [[TMP12]]
	; CHECK-NEXT: [[TMP45:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDEX_NEXT]]
	; CHECK-NEXT: store <4 x double> [[TMP41]], ptr [[TMP45]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP47:%.*]] = getelementptr inbounds double, ptr [[TMP45]], i64 4
	; CHECK-NEXT: store <4 x double> [[TMP42]], ptr [[TMP47]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP49:%.*]] = getelementptr inbounds double, ptr [[TMP45]], i64 8
	; CHECK-NEXT: store <4 x double> [[TMP43]], ptr [[TMP49]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP51:%.*]] = getelementptr inbounds double, ptr [[TMP45]], i64 12
	; CHECK-NEXT: store <4 x double> [[TMP44]], ptr [[TMP51]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDEX_NEXT_1]] = add nuw i64 [[INDEX]], 32
	; CHECK-NEXT: [[NITER_NEXT_1]] = add i64 [[NITER]], 2
	; CHECK-NEXT: [[NITER_NCMP_1:%.*]] = icmp eq i64 [[NITER_NEXT_1]], [[UNROLL_ITER]]
	; CHECK-NEXT: br i1 [[NITER_NCMP_1]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
	; CHECK: middle.block.unr-lcssa:
	; CHECK-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_1]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; CHECK-NEXT: br i1 [[LCMP_MOD_NOT]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL:%.*]]
	; CHECK: vector.body.epil:
	; CHECK-NEXT: [[TMP53:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDEX_UNR]]
	; CHECK-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <4 x double>, ptr [[TMP53]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP55:%.*]] = getelementptr inbounds double, ptr [[TMP53]], i64 4
	; CHECK-NEXT: [[WIDE_LOAD6_EPIL:%.*]] = load <4 x double>, ptr [[TMP55]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP57:%.*]] = getelementptr inbounds double, ptr [[TMP53]], i64 8
	; CHECK-NEXT: [[WIDE_LOAD7_EPIL:%.*]] = load <4 x double>, ptr [[TMP57]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP59:%.*]] = getelementptr inbounds double, ptr [[TMP53]], i64 12
	; CHECK-NEXT: [[WIDE_LOAD8_EPIL:%.*]] = load <4 x double>, ptr [[TMP59]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP61:%.*]] = fdiv fast <4 x double> [[WIDE_LOAD_EPIL]], [[BROADCAST_SPLAT]]
	; CHECK-NEXT: [[TMP62:%.*]] = fdiv fast <4 x double> [[WIDE_LOAD6_EPIL]], [[BROADCAST_SPLAT10]]
	; CHECK-NEXT: [[TMP63:%.*]] = fdiv fast <4 x double> [[WIDE_LOAD7_EPIL]], [[BROADCAST_SPLAT12]]
	; CHECK-NEXT: [[TMP64:%.*]] = fdiv fast <4 x double> [[WIDE_LOAD8_EPIL]], [[BROADCAST_SPLAT14]]
	; CHECK-NEXT: [[TMP65:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDEX_UNR]]
	; CHECK-NEXT: store <4 x double> [[TMP61]], ptr [[TMP65]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP67:%.*]] = getelementptr inbounds double, ptr [[TMP65]], i64 4
	; CHECK-NEXT: store <4 x double> [[TMP62]], ptr [[TMP67]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP69:%.*]] = getelementptr inbounds double, ptr [[TMP65]], i64 8
	; CHECK-NEXT: store <4 x double> [[TMP63]], ptr [[TMP69]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP71:%.*]] = getelementptr inbounds double, ptr [[TMP65]], i64 12
	; CHECK-NEXT: store <4 x double> [[TMP64]], ptr [[TMP71]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: br label [[MIDDLE_BLOCK]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY_PREHEADER15]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY_PREHEADER15]]
	; CHECK: for.body.preheader15:			; CHECK: for.body.preheader15:
	; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: [[TMP73:%.*]] = xor i64 [[INDVARS_IV_PH]], -1			; CHECK-NEXT: [[TMP18:%.*]] = xor i64 [[INDVARS_IV_PH]], -1
	; CHECK-NEXT: [[TMP74:%.*]] = add nsw i64 [[TMP73]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[TMP19:%.*]] = add nsw i64 [[TMP18]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: [[XTRAITER16:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 7			; CHECK-NEXT: [[XTRAITER:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 7
	; CHECK-NEXT: [[LCMP_MOD17_NOT:%.*]] = icmp eq i64 [[XTRAITER16]], 0			; CHECK-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; CHECK-NEXT: br i1 [[LCMP_MOD17_NOT]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL_PREHEADER:%.*]]			; CHECK-NEXT: br i1 [[LCMP_MOD_NOT]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL_PREHEADER:%.*]]
	; CHECK: for.body.prol.preheader:			; CHECK: for.body.prol.preheader:
	; CHECK-NEXT: [[TMP75:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP20:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: br label [[FOR_BODY_PROL:%.*]]			; CHECK-NEXT: br label [[FOR_BODY_PROL:%.*]]
	; CHECK: for.body.prol:			; CHECK: for.body.prol:
	; CHECK-NEXT: [[INDVARS_IV_PROL:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_PROL:%.*]], [[FOR_BODY_PROL]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PROL_PREHEADER]] ]			; CHECK-NEXT: [[INDVARS_IV_PROL:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_PROL:%.*]], [[FOR_BODY_PROL]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PROL_PREHEADER]] ]
	; CHECK-NEXT: [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_NEXT:%.*]], [[FOR_BODY_PROL]] ], [ 0, [[FOR_BODY_PROL_PREHEADER]] ]			; CHECK-NEXT: [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_NEXT:%.*]], [[FOR_BODY_PROL]] ], [ 0, [[FOR_BODY_PROL_PREHEADER]] ]
	; CHECK-NEXT: [[ARRAYIDX_PROL:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_PROL]]			; CHECK-NEXT: [[ARRAYIDX_PROL:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_PROL]]
	; CHECK-NEXT: [[T0_PROL:%.*]] = load double, ptr [[ARRAYIDX_PROL]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_PROL:%.*]] = load double, ptr [[ARRAYIDX_PROL]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP76:%.*]] = fmul fast double [[T0_PROL]], [[TMP75]]			; CHECK-NEXT: [[TMP21:%.*]] = fmul fast double [[T0_PROL]], [[TMP20]]
	; CHECK-NEXT: [[ARRAYIDX2_PROL:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_PROL]]			; CHECK-NEXT: [[ARRAYIDX2_PROL:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_PROL]]
	; CHECK-NEXT: store double [[TMP76]], ptr [[ARRAYIDX2_PROL]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP21]], ptr [[ARRAYIDX2_PROL]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_PROL]] = add nuw nsw i64 [[INDVARS_IV_PROL]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT_PROL]] = add nuw nsw i64 [[INDVARS_IV_PROL]], 1
	; CHECK-NEXT: [[PROL_ITER_NEXT]] = add i64 [[PROL_ITER]], 1			; CHECK-NEXT: [[PROL_ITER_NEXT]] = add i64 [[PROL_ITER]], 1
	; CHECK-NEXT: [[PROL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[PROL_ITER_NEXT]], [[XTRAITER16]]			; CHECK-NEXT: [[PROL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[PROL_ITER_NEXT]], [[XTRAITER]]
	; CHECK-NEXT: br i1 [[PROL_ITER_CMP_NOT]], label [[FOR_BODY_PROL_LOOPEXIT]], label [[FOR_BODY_PROL]], !llvm.loop [[LOOP9:![0-9]+]]			; CHECK-NEXT: br i1 [[PROL_ITER_CMP_NOT]], label [[FOR_BODY_PROL_LOOPEXIT]], label [[FOR_BODY_PROL]], !llvm.loop [[LOOP10:![0-9]+]]
	; CHECK: for.body.prol.loopexit:			; CHECK: for.body.prol.loopexit:
	; CHECK-NEXT: [[INDVARS_IV_UNR:%.*]] = phi i64 [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER15]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL]] ]			; CHECK-NEXT: [[INDVARS_IV_UNR:%.*]] = phi i64 [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER15]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL]] ]
	; CHECK-NEXT: [[TMP77:%.*]] = icmp ult i64 [[TMP74]], 7			; CHECK-NEXT: [[TMP22:%.*]] = icmp ult i64 [[TMP19]], 7
	; CHECK-NEXT: br i1 [[TMP77]], label [[FOR_END]], label [[FOR_BODY_PREHEADER15_NEW:%.*]]			; CHECK-NEXT: br i1 [[TMP22]], label [[FOR_END]], label [[FOR_BODY_PREHEADER15_NEW:%.*]]
	; CHECK: for.body.preheader15.new:			; CHECK: for.body.preheader15.new:
	; CHECK-NEXT: [[TMP78:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP23:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP79:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP24:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP80:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP25:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP81:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP26:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP82:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP27:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP83:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP28:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP84:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP29:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP85:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP30:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_UNR]], [[FOR_BODY_PREHEADER15_NEW]] ], [ [[INDVARS_IV_NEXT_7:%.*]], [[FOR_BODY]] ]			; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_UNR]], [[FOR_BODY_PREHEADER15_NEW]] ], [ [[INDVARS_IV_NEXT_7:%.*]], [[FOR_BODY]] ]
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[T0:%.*]] = load double, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0:%.*]] = load double, ptr [[ARRAYIDX]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP86:%.*]] = fmul fast double [[T0]], [[TMP78]]			; CHECK-NEXT: [[TMP31:%.*]] = fmul fast double [[T0]], [[TMP23]]
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: store double [[TMP86]], ptr [[ARRAYIDX2]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP31]], ptr [[ARRAYIDX2]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT]]			; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT]]
	; CHECK-NEXT: [[T0_1:%.*]] = load double, ptr [[ARRAYIDX_1]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_1:%.*]] = load double, ptr [[ARRAYIDX_1]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP87:%.*]] = fmul fast double [[T0_1]], [[TMP79]]			; CHECK-NEXT: [[TMP32:%.*]] = fmul fast double [[T0_1]], [[TMP24]]
	; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT]]			; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT]]
	; CHECK-NEXT: store double [[TMP87]], ptr [[ARRAYIDX2_1]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP32]], ptr [[ARRAYIDX2_1]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_1:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 2			; CHECK-NEXT: [[INDVARS_IV_NEXT_1:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 2
	; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_1]]			; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_1]]
	; CHECK-NEXT: [[T0_2:%.*]] = load double, ptr [[ARRAYIDX_2]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_2:%.*]] = load double, ptr [[ARRAYIDX_2]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP88:%.*]] = fmul fast double [[T0_2]], [[TMP80]]			; CHECK-NEXT: [[TMP33:%.*]] = fmul fast double [[T0_2]], [[TMP25]]
	; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_1]]			; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_1]]
	; CHECK-NEXT: store double [[TMP88]], ptr [[ARRAYIDX2_2]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP33]], ptr [[ARRAYIDX2_2]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_2:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 3			; CHECK-NEXT: [[INDVARS_IV_NEXT_2:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 3
	; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_2]]			; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_2]]
	; CHECK-NEXT: [[T0_3:%.*]] = load double, ptr [[ARRAYIDX_3]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_3:%.*]] = load double, ptr [[ARRAYIDX_3]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP89:%.*]] = fmul fast double [[T0_3]], [[TMP81]]			; CHECK-NEXT: [[TMP34:%.*]] = fmul fast double [[T0_3]], [[TMP26]]
	; CHECK-NEXT: [[ARRAYIDX2_3:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_2]]			; CHECK-NEXT: [[ARRAYIDX2_3:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_2]]
	; CHECK-NEXT: store double [[TMP89]], ptr [[ARRAYIDX2_3]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP34]], ptr [[ARRAYIDX2_3]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_3:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 4			; CHECK-NEXT: [[INDVARS_IV_NEXT_3:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 4
	; CHECK-NEXT: [[ARRAYIDX_4:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_3]]			; CHECK-NEXT: [[ARRAYIDX_4:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_3]]
	; CHECK-NEXT: [[T0_4:%.*]] = load double, ptr [[ARRAYIDX_4]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_4:%.*]] = load double, ptr [[ARRAYIDX_4]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP90:%.*]] = fmul fast double [[T0_4]], [[TMP82]]			; CHECK-NEXT: [[TMP35:%.*]] = fmul fast double [[T0_4]], [[TMP27]]
	; CHECK-NEXT: [[ARRAYIDX2_4:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_3]]			; CHECK-NEXT: [[ARRAYIDX2_4:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_3]]
	; CHECK-NEXT: store double [[TMP90]], ptr [[ARRAYIDX2_4]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP35]], ptr [[ARRAYIDX2_4]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_4:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 5			; CHECK-NEXT: [[INDVARS_IV_NEXT_4:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 5
	; CHECK-NEXT: [[ARRAYIDX_5:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_4]]			; CHECK-NEXT: [[ARRAYIDX_5:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_4]]
	; CHECK-NEXT: [[T0_5:%.*]] = load double, ptr [[ARRAYIDX_5]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_5:%.*]] = load double, ptr [[ARRAYIDX_5]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP91:%.*]] = fmul fast double [[T0_5]], [[TMP83]]			; CHECK-NEXT: [[TMP36:%.*]] = fmul fast double [[T0_5]], [[TMP28]]
	; CHECK-NEXT: [[ARRAYIDX2_5:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_4]]			; CHECK-NEXT: [[ARRAYIDX2_5:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_4]]
	; CHECK-NEXT: store double [[TMP91]], ptr [[ARRAYIDX2_5]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP36]], ptr [[ARRAYIDX2_5]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_5:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 6			; CHECK-NEXT: [[INDVARS_IV_NEXT_5:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 6
	; CHECK-NEXT: [[ARRAYIDX_6:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_5]]			; CHECK-NEXT: [[ARRAYIDX_6:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_5]]
	; CHECK-NEXT: [[T0_6:%.*]] = load double, ptr [[ARRAYIDX_6]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_6:%.*]] = load double, ptr [[ARRAYIDX_6]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP92:%.*]] = fmul fast double [[T0_6]], [[TMP84]]			; CHECK-NEXT: [[TMP37:%.*]] = fmul fast double [[T0_6]], [[TMP29]]
	; CHECK-NEXT: [[ARRAYIDX2_6:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_5]]			; CHECK-NEXT: [[ARRAYIDX2_6:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_5]]
	; CHECK-NEXT: store double [[TMP92]], ptr [[ARRAYIDX2_6]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP37]], ptr [[ARRAYIDX2_6]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_6:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 7			; CHECK-NEXT: [[INDVARS_IV_NEXT_6:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 7
	; CHECK-NEXT: [[ARRAYIDX_7:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_6]]			; CHECK-NEXT: [[ARRAYIDX_7:%.*]] = getelementptr inbounds double, ptr [[Y]], i64 [[INDVARS_IV_NEXT_6]]
	; CHECK-NEXT: [[T0_7:%.*]] = load double, ptr [[ARRAYIDX_7]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: [[T0_7:%.*]] = load double, ptr [[ARRAYIDX_7]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[TMP93:%.*]] = fmul fast double [[T0_7]], [[TMP85]]			; CHECK-NEXT: [[TMP38:%.*]] = fmul fast double [[T0_7]], [[TMP30]]
	; CHECK-NEXT: [[ARRAYIDX2_7:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_6]]			; CHECK-NEXT: [[ARRAYIDX2_7:%.*]] = getelementptr inbounds double, ptr [[X]], i64 [[INDVARS_IV_NEXT_6]]
	; CHECK-NEXT: store double [[TMP93]], ptr [[ARRAYIDX2_7]], align 8, !tbaa [[TBAA3]]			; CHECK-NEXT: store double [[TMP38]], ptr [[ARRAYIDX2_7]], align 8, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_7]] = add nuw nsw i64 [[INDVARS_IV]], 8			; CHECK-NEXT: [[INDVARS_IV_NEXT_7]] = add nuw nsw i64 [[INDVARS_IV]], 8
	; CHECK-NEXT: [[EXITCOND_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_7]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[EXITCOND_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_7]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT_7]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT_7]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%div = fdiv fast double 1.0, %a			%div = fdiv fast double 1.0, %a
	br label %for.cond			br label %for.cond

	for.cond:			for.cond:

Diff 486789

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll

llvm/test/Transforms/LoopVectorize/ARM/tail-folding-loop-hint.ll

llvm/test/Transforms/LoopVectorize/PowerPC/optimal-epilog-vectorization.ll

llvm/test/Transforms/LoopVectorize/X86/already-vectorized.ll

llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll

llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll

llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll

llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll

llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll

llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll

llvm/test/Transforms/LoopVectorize/X86/tail_loop_folding.ll

llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll

llvm/test/Transforms/LoopVectorize/followup.ll

llvm/test/Transforms/LoopVectorize/if-pred-non-void.ll

llvm/test/Transforms/LoopVectorize/induction.ll

llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll

llvm/test/Transforms/LoopVectorize/invariant-store-vectorization-2.ll

llvm/test/Transforms/LoopVectorize/invariant-store-vectorization.ll

llvm/test/Transforms/LoopVectorize/memdep-fold-tail.ll

llvm/test/Transforms/LoopVectorize/optsize.ll

llvm/test/Transforms/LoopVectorize/pointer-select-runtime-checks.ll

llvm/test/Transforms/LoopVectorize/reduction-with-invariant-store.ll

llvm/test/Transforms/LoopVectorize/runtime-check.ll

llvm/test/Transforms/LoopVectorize/vectorize-once.ll

llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll

llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll

llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll
