This is an archive of the discontinued LLVM Phabricator instance.

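Paths

llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/trunk/test/DebugInfo/AArch64/inlined-argument.ll
llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll
llvm/trunk/test/Transforms/LoopIdiom/memset-debugify-remarks.ll
llvm/trunk/test/Transforms/LoopSimplify/dbg-loc.ll
llvm/trunk/test/Transforms/LoopSimplify/do-preheader-dbg.ll
llvm/trunk/test/Transforms/LoopSimplify/for-preheader-dbg.ll
llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll
llvm/trunk/test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll
llvm/trunk/test/Transforms/LoopVectorize/X86/vectorization-remarks-profitable.ll
llvm/trunk/test/Transforms/LoopVectorize/debugloc.ll
llvm/trunk/test/Transforms/LoopVectorize/fix-reduction-dbg.ll
llvm/trunk/test/Transforms/LoopVectorize/unsafe-dep-remark.ll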

Differential D60831

[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
ClosedPublic

Authored by Orlando on Apr 17 2019, 10:13 AM.

Details

Reviewers

samsonov
vsk
aprantl
probinson
anemet
hfinkel
jmorse

Summary

Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
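For readers following along, here is a minimal C++ sketch of the middle-block change described above. It uses toy `Instr` and `Block` structs and a hypothetical `retagMiddleBlock` helper, all of which are assumptions made for illustration; it is not the code from the patch and does not use LLVM classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for IR objects; these are NOT LLVM classes.
struct Instr {
  std::string Text;
  unsigned Line; // source line taken from the debug location
};

struct Block {
  std::string Name;
  std::vector<Instr> Instrs; // the last entry is the terminator
};

// Give every instruction in Middle the line of the scalar loop's latch
// terminator, so a debugger reports a single line for the whole block.
void retagMiddleBlock(Block &Middle, const Block &ScalarLatch) {
  assert(!ScalarLatch.Instrs.empty() && "latch needs a terminator");
  const unsigned LatchLine = ScalarLatch.Instrs.back().Line;
  for (Instr &I : Middle.Instrs)
    I.Line = LatchLine;
}
```

With this policy, stepping out of the vectorized loop always lands on the latch line once, rather than visiting several body lines that look like an extra iteration.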

I have set up a separate review D61933 for a fix which is required for this patch.

Diff Detail

Repository: rL LLVM

Event Timeline

Orlando created this revision. Apr 17 2019, 10:13 AM

Herald added subscribers: llvm-commits, zzheng, hiraditya. Apr 17 2019, 10:13 AM

gbedwell added a subscriber: gbedwell. Apr 17 2019, 10:18 AM

Orlando edited the summary of this revision. Apr 17 2019, 10:19 AM

How will this appear to profiling tools using PC sampling and using debug info to map the PC samples back to line numbers in the code?

In D60831#1470520, @hfinkel wrote:

How will this appear to profiling tools using PC sampling and using debug info to map the PC samples back to line numbers in the code?

I brought this up in some offline discussions which all concluded that the impact on profilers would be small and the trade off for better debugging is worth it. The impact on profilers seems like it would be small because the middle block is only visited once after running through the vectorized loop.

I am glad you asked because this concern gives rise to an argument for giving the middle block instructions line 0 instead, and I am interested in hearing others' opinions.

In D60831#1470585, @Orlando wrote:

I brought this up in some offline discussions which all concluded that the impact on profilers would be small and the trade off for better debugging is worth it. The impact on profilers seems like it would be small because the middle block is only visited once after running through the vectorized loop.

I am glad you asked because this concern gives rise to an argument for giving the middle block instructions line 0 instead, and I am interested in hearing others' opinions.

Interesting. The middle block just has the check for whether or not we need to run the remainder loop, right? I can definitely see this as kind of latch-like.

@jmellorcrummey , do you have an opinion on this?

Sorry, I missed a couple of failing tests. I've fixed them and updated the diff.

In D60831#1470677, @hfinkel wrote:

Interesting. The middle block just has the check for whether or not we need to run the remainder loop, right? I can definitely see this as kind of latch-like.

@jmellorcrummey , do you have an opinion on this?

The middle block does the check for whether or not we need to run the remainder loop but also, for the bug [0], executes some operations to convert the result of the vectorized loop into a scalar value.

middle.block:                                     ; preds = %vector.body.epil, %middle.block.unr-lcssa
  %.lcssa20 = phi <4 x i32> [ %.lcssa20.ph, %middle.block.unr-lcssa ], [ %33, %vector.body.epil ]
  %.lcssa = phi <4 x i32> [ %.lcssa.ph, %middle.block.unr-lcssa ], [ %34, %vector.body.epil ]
  %bin.rdx = add <4 x i32> %.lcssa, %.lcssa20
  %rdx.shuf = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
  %bin.rdx16 = add <4 x i32> %bin.rdx, %rdx.shuf
  %rdx.shuf17 = shufflevector <4 x i32> %bin.rdx16, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx18 = add <4 x i32> %bin.rdx16, %rdx.shuf17
  %35 = extractelement <4 x i32> %bin.rdx18, i32 0
  %cmp.n = icmp eq i64 %n.vec, %wide.trip.count
  br i1 %cmp.n, label %return, label %for.body.preheader19

[0] https://bugs.llvm.org/show_bug.cgi?id=39024

aprantl added a reviewer: anemet. Apr 18 2019, 10:01 AM

aprantl added inline comments.

llvm/test/Transforms/LoopVectorize/unsafe-dep-remark.ll
Line 14 (on Diff #195698): @anemet Does that change look reasonable?

In D60831#1470677, @hfinkel wrote:

Interesting. The middle block just has the check for whether or not we need to run the remainder loop, right? I can definitely see this as kind of latch-like.

@jmellorcrummey , do you have an opinion on this?

Unsurprisingly, I have an opinion :-).

As the lead of a project building profiling tools, I am strongly against having any instructions map to line 0. Having address ranges without mappings completely distorts our view of the code and we have to add patches to cope with this case. All machine instructions should have a natural mapping if you put a bit of thought into it.

For instance, I sent comments back to IBM complaining about the lack of line mappings on machine instructions related to code generated for OpenMP offloading onto GPUs. Rather than having code intervals map to line 0, I advocated that they map boilerplate instructions back to the directive for which the code is being generated.

I strongly believe that instructions in what was described as the "middle block" above should map back to a plausible line. In the case of transformations such as common subexpression elimination that involve a one-to-many mapping, the mapping should be one of the lines where an operator appeared.

In D60831#1472296, @jmellorcrummey wrote:

As the lead of a project building profiling tools, I am strongly against having any instructions map to line 0.

This is probably not what you meant, but for completeness I feel like I should point out that there are many legitimate situations where LLVM generates a line 0 location. The most prominent example is instruction merging: Since both LLVM IR and DWARF currently require each PC address to map to exactly one source location, LLVM will insert a line 0 location when it merges two instructions with distinct source locations. I can't speak for profiling, but at least on the debugger side, the consensus is that potentially misleading information is worse than no information, because if there is no way to distinguish "always correct" from "maybe correct" information, the user can't trust any information.
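To make the merging rule in the comment above concrete, here is a deliberately simplified C++ sketch. The `Loc` struct and `mergeToLineZero` function are hypothetical names invented for this illustration; the real LLVM logic also reasons about scopes and inlining, which this toy model leaves out.

```cpp
#include <cassert>
#include <string>

// Toy source location; a stand-in for illustration, not llvm::DILocation.
struct Loc {
  std::string File;
  unsigned Line; // 0 means "no specific source line"
};

bool sameLoc(const Loc &A, const Loc &B) {
  return A.File == B.File && A.Line == B.Line;
}

// Simplified merge policy: identical locations are kept as they are, and
// anything else collapses to line 0 instead of picking one side.
Loc mergeToLineZero(const Loc &A, const Loc &B) {
  Loc Merged;
  if (sameLoc(A, B)) {
    Merged = A;
  } else {
    // Keep the file only when both sides agree on it (a toy-model choice).
    Merged.File = (A.File == B.File) ? A.File : std::string();
    Merged.Line = 0;
  }
  // The result is either line 0 or exactly the shared input location.
  assert(Merged.Line == 0 || sameLoc(Merged, A));
  return Merged;
}
```

The point of the policy is that a merged instruction never claims a line it might not have come from on the current execution path.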

In D60831#1472303, @aprantl wrote:

This is probably not what you meant, but for completeness I feel like I should point out that there are many legitimate situations where LLVM generates a line 0 location. The most prominent example is instruction merging: Since both LLVM IR and DWARF currently require each PC address to map to exactly one source location, LLVM will insert a line 0 location when it merges two instructions with distinct source locations. I can't speak for profiling, but at least on the debugger side, the consensus is that potentially misleading information is worse than no information, because if there is no way to distinguish "always correct" from "maybe correct" information, the user can't trust any information.

FWIW, the motivating case for the introduction of getMergedLocation, which generally handles the insertion of line 0 locations in this case was to improve the performance of profile-guided optimized code when using a sampling profiler tool (see http://llvm.org/devmtg/2017-03//assets/slides/delivering_sample_based_pgo_for_playstation_r_4.pdf ).

In D60831#1472310, @gbedwell wrote:

In D60831#1472303, @aprantl wrote:

This is probably not what you meant, but for completeness I feel like I should point out that there are many legitimate situations where LLVM generates a line 0 location. The most prominent example is instruction merging: Since both LLVM IR and DWARF currently require each PC address to map to exactly one source location, LLVM will insert a line 0 location when it merges two instructions with distinct source locations. I can't speak for profiling, but at least on the debugger side, the consensus is that potentially misleading information is worse than no information, because if there is no way to distinguish "always correct" from "maybe correct" information, the user can't trust any information.

When merging happens, I don't see why mapping to 0 is better than attributing it to one of a set of locations that a fused operation came from. I see that as representative of reality. I would prefer a rule stating that if two operations are merged, always pick the lexicographically least (file, line number) pair. In the case of mappings for merged instructions, I see either mapping as "correct".

FWIW, the motivating case for the introduction of getMergedLocation, which generally handles the insertion of line 0 locations in this case was to improve the performance of profile-guided optimized code when using a sampling profiler tool (see http://llvm.org/devmtg/2017-03//assets/slides/delivering_sample_based_pgo_for_playstation_r_4.pdf ).

I looked at the slides. The PGO is nice work. The slide about getMergedLocation describes the rationale for mapping to line 0 rather than another line mapping if the line mappings don't agree. I don't see it wrong to simply choose one.
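The alternative rule proposed in this comment, always picking the lexicographically least (file, line) pair, can be sketched as follows. `FileLine` and `mergeToLeast` are hypothetical names used only for this illustration; this is not an existing LLVM API.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Toy (file, line) pair; an assumption for illustration only.
struct FileLine {
  std::string File;
  unsigned Line;
};

// Deterministically pick the lexicographically least (file, line) pair:
// compare file names first, then line numbers.
FileLine mergeToLeast(const FileLine &A, const FileLine &B) {
  assert(!A.File.empty() && !B.File.empty() && "toy model expects named files");
  return std::tie(A.File, A.Line) <= std::tie(B.File, B.Line) ? A : B;
}
```

Unlike the line 0 policy, this always yields a real source line, at the cost of attributing the merged instruction to a location that may not match the path actually executed.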

In D60831#1472494, @jmellorcrummey wrote:

When merging happens, I don't see why mapping to 0 is better than attributing it to one of a set of locations that a fused operation came from. I see that as representative of reality.

I would prefer a rule stating that if two operations are merged, always pick the lexicographically least (file, line number) pair. In the case of mappings for merged instructions, I see either mapping as "correct".

I don't see things this way. Arbitrarily picking a location can result in a false execution history being presented to the user. Line 0 works a lot better imo, because it a) doesn't actively mislead and b) preserves inline scope.

FWIW, the motivating case for the introduction of getMergedLocation, which generally handles the insertion of line 0 locations in this case was to improve the performance of profile-guided optimized code when using a sampling profiler tool (see http://llvm.org/devmtg/2017-03//assets/slides/delivering_sample_based_pgo_for_playstation_r_4.pdf ).

I looked at the slides. The PGO is nice work. The slide about getMergedLocation describes the rationale for mapping to line 0 rather than another line mapping if the line mappings don't agree. I don't see it wrong to simply choose one.

In D60831#1472518, @vsk wrote:
In D60831#1472494, @jmellorcrummey wrote:

When merging happens, I don't see why mapping to 0 is better than attributing it to one of a set of locations that a fused operation came from. I see that as representative of reality.

I would prefer a rule stating that if two operations are merged, always pick the lexicographically least (file, line number) pair. In the case of mappings for merged instructions, I see either mapping as "correct".

I don't see things this way. Arbitrarily picking a location can result in a false execution history being presented to the user. Line 0 works a lot better imo, because it a) doesn't actively mislead and b) preserves inline scope.
FWIW, the motivating case for the introduction of getMergedLocation, which generally handles the insertion of line 0 locations in this case was to improve the performance of profile-guided optimized code when using a sampling profiler tool (see http://llvm.org/devmtg/2017-03//assets/slides/delivering_sample_based_pgo_for_playstation_r_4.pdf ).

I looked at the slides. The PGO is nice work. The slide about getMergedLocation describes the rationale for mapping to line 0 rather than another line mapping if the line mappings don't agree. I don't see it wrong to simply choose one.

A good example of why arbitrarily picking one location during merging is problematic is when the two locations come from different inlined instances of different functions (or perhaps even worse: two inlined instances of the same function). I would assume that even in profiling a wrong backtrace would invalidate or render untrustworthy large parts of any analysis being done on this data.

anemet added inline comments. Apr 18 2019, 6:14 PM

llvm/test/Transforms/LoopVectorize/unsafe-dep-remark.ll
Line 14 (on Diff #195698): Yes. Even though it looks like the version on the left was able to pinpoint the offending memory operation, that is actually not the case. We are using the debug location of the loop in the remark, so that was only a coincidence.

A good example of why arbitrarily picking one location during merging is problematic is when the two locations come from different inlined instances of different functions (or perhaps even worse: two inlined instances of the same function). I would assume that even in profiling a wrong backtrace would invalidate or render untrustworthy large parts of any analysis being done on this data.

I think we fundamentally disagree on what is good for profiling. If for instance two "load" instructions from different inlined contexts merge, I would prefer that they be charged to one of the locations where the instruction came from and the other gets the benefit of that operation for free. (That's what common subexpression elimination is for!) Saying that one got merged into the other or vice versa is an acceptable view. I don't see this as misleading, untrustworthy, or invalidating anything.

If the instructions come from two different files, I guess that you won't even associate it with one of the files. So, I have instructions in the binary that won't be covered by line map entries at all. Having LITERALLY NO INFORMATION where they came from without tracing instruction generation through the compiler is something that I fundamentally oppose.

In D60831#1472604, @jmellorcrummey wrote:

A good example of why arbitrarily picking one location during merging is problematic is when the two locations come from different inlined instances of different functions (or perhaps even worse: two inlined instances of the same function). I would assume that even in profiling a wrong backtrace would invalidate or render untrustworthy large parts of any analysis being done on this data.

I think we fundamentally disagree on what is good for profiling.

That is possible.

If for instance two "load" instructions from different inlined contexts merge, I would prefer that they be charged to one of the locations where the instruction came from and the other gets the benefit of that operation for free. (That's what common subexpression elimination is for!) Saying that one got merged into the other or vice versa is an acceptable view. I don't see this as misleading, untrustworthy, or invalidating anything.

If the instructions come from two different files, I guess that you won't even associate it with one of the files. So, I have instructions in the binary that won't be covered by line map entries at all. Having LITERALLY NO INFORMATION where they came from without tracing instruction generation through the compiler is something that I fundamentally oppose.

Precisely to that point I was hoping to provide a few compelling counterexamples to demonstrate why potentially wrong information is actually worse than no information.

But I guess what this really boils down to is that all debug information in LLVM IR is (at the moment) "must" information that is supposed to be either 100% reliable or omitted. It sounds like for the kinds of analysis that you are doing, you would also benefit from a second category of "may" information that may or may not be valid. That's a legitimate ask, but if we wanted to include this in LLVM IR, we would need to qualify it as not reliable, so it doesn't, for example, leak into debug info that software developers rely on.

What I would find more interesting would be extending LLVM IR to support a one-to-many mapping from PC address to source locations. This way we would also be up front about the fact that the source location is one out of a set, but we then could use additional contextual information (such as the current backtrace and DWARF call site information) to potentially disambiguate them before use.

Precisely to that point I was hoping to provide a few compelling counterexamples to demonstrate why potentially wrong information is actually worse than no information.

But I guess what this really boils down to is that all debug information in LLVM IR is (at the moment) "must" information that is supposed to be either 100% reliable or omitted. It sounds like for the kinds of analysis that you are doing, you would also benefit from a second category of "may" information that may or may not be valid. That's a legitimate ask, but if we wanted to include this in LLVM IR, we would need to qualify it as not reliable, so it doesn't, for example, leak into debug info that software developers rely on.

This will be my last comment on the topic and then everyone can get back to work :-). I agree with the LLVM IR requirement that one not lie to application developers; lies are worse than nothing. However, I don't feel that you have adequately explained why attributing to any of the available (file, line) mappings for an instruction that has been merged is incorrect and not 100% reliable. Just because we can't include all contributing mappings for a merged instruction doesn't make any one unreliable; I see attributing to any one of a set of mappings as 100% correct, even though it is only partial information.

In D60831#1472621, @jmellorcrummey wrote:

This will be my last comment on the topic and then everyone can get back to work :-). I agree with the LLVM IR requirement that one not lie to application developers; lies are worse than nothing. However, I don't feel that you have adequately explained why attributing to any of the available (file, line) mappings for an instruction that has been merged is incorrect and not 100% reliable. Just because we can't include all contributing mappings for a merged instruction doesn't make any one unreliable; I see attributing to any one of a set of mappings as 100% correct, even though it is only partial information.

Sorry for jumping in after you said you would stop, but I have been away.
Personally I find this example compelling: When there's an if/then/else construct, and some instruction is hoisted above the if, you could assign it to (for example) the source location of the 'then' block. Now in your training run, the 'then' block can be given 100% of executions (because that's what the hoisted instruction says) even if the 'else' block was chosen 100% of the time. I find the resulting profile is "incorrect and not 100% reliable" and when used for PGO will produce sub-optimal code. It is hard to imagine any other valid conclusion for this use-case.

As Adrian mentioned, being able to attribute multiple locations would be preferable to attributing line 0, and I hope we can make that happen in the future.
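The if/then/else example in this comment can be played through with a toy sampling model. The `Profile` alias and `sampleRuns` function below are assumptions made for illustration; no real profiler or PGO pipeline works at this level of simplification.

```cpp
#include <cassert>
#include <map>

// Toy sampling profile: source line -> number of samples.
using Profile = std::map<unsigned, unsigned>;

// Simulate Runs executions of an if/then/else where the else branch is
// always taken. One instruction was hoisted above the `if`; HoistedLine is
// whatever line the compiler attributed it to (0 means "no line").
Profile sampleRuns(unsigned Runs, unsigned ElseLine, unsigned HoistedLine) {
  assert(ElseLine != 0 && "the else block has a real source line");
  Profile P;
  for (unsigned I = 0; I < Runs; ++I) {
    ++P[HoistedLine]; // the hoisted instruction executes on every run
    ++P[ElseLine];    // only the else block runs after the branch
  }
  return P;
}
```

If the hoisted instruction is attributed to the 'then' line (say 10) while the 'else' block lives on line 12, `sampleRuns(100, 12, 10)` records 100 samples on line 10 even though the 'then' block never ran. Passing 0 as `HoistedLine` keeps line 10 out of the profile entirely.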

In D60831#1476952, @probinson wrote:

Sorry for jumping in after you said you would stop, but I have been away.
Personally I find this example compelling: When there's an if/then/else construct, and some instruction is hoisted above the if, you could assign it to (for example) the source location of the 'then' block. Now in your training run, the 'then' block can be given 100% of executions (because that's what the hoisted instruction says) even if the 'else' block was chosen 100% of the time. I find the resulting profile is "incorrect and not 100% reliable" and when used for PGO will produce sub-optimal code. It is hard to imagine any other valid conclusion for this use-case.

As Adrian mentioned, being able to attribute multiple locations would be preferable to attributing line 0, and I hope we can make that happen in the future.

Even with multiple locations it might be weird when those locations are used for profiling (or code coverage) and not just as debug information for seeing which source line the code originates from. If the hoisted instruction has locations from all of the 'if', 'elseif' and 'else' blocks it might appear as if we have executed all paths, even if the 'else' block was chosen 100% of the time. So I guess that the samples with multiple locations should be discarded when doing profiling (perhaps depending on the goal of the profiling).

Ping.

A summary of the discussion so far:

The middle block is quite "latch like" in that it dictates control flow between the vectorized and scalar loops, but it can contain additional
instructions. For example, it may also handle converting the vectorized results into a scalar value.

The patch maps all middle block instructions to the line of the loop latch branch.

An initial question of "How much will this affect profiling tools that rely on this debug info?" was not fully addressed because
discussions on an alternative solution I mentioned eclipsed the original question.

In D60831#1482390, @Orlando wrote:

Ping.

A summary of the discussion so far:

The middle block is quite "latch like" in that it dictates control flow between the vectorized and scalar loops, but it can contain additional
instructions. For example, it may also handle converting the vectorized results into a scalar value.

The patch maps all middle block instructions to the line of the loop latch branch.

An initial question of "How much will this affect profiling tools that rely on this debug info?" was not fully addressed because
discussions on an alternative solution I mentioned eclipsed the original question.

My take-away from the discussion was this: It is desirable to map the instructions to something in the loop (e.g., not line 0), unless doing so will provide confusing information to the mapping that PGO uses to optimize the relevant branches. Am I correct in saying that this latter issue is of minimal concern in this case?

My take-away from the discussion was this: It is desirable to map the instructions to something in the loop (e.g., not line 0), unless doing so will provide confusing information to the mapping that PGO uses to optimize the relevant branches. Am I correct in saying that this latter issue is of minimal concern in this case?

As far as I see it you are correct. There should be negligible to no impact on PGO.

In D60831#1483817, @Orlando wrote:

My take-away from the discussion was this: It is desirable to map the instructions to something in the loop (e.g., not line 0), unless doing so will provide confusing information to the mapping that PGO uses to optimize the relevant branches. Am I correct in saying that this latter issue is of minimal concern in this case?

As far as I see it you are correct. There should be negligible to no impact on PGO.

Sounds good. LGTM.

This revision is now accepted and ready to land. Apr 30 2019, 9:54 AM

Closed by commit rL360162: [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through… (authored by orlandoch). May 7 2019, 8:35 AM

This revision was automatically updated to reflect the committed changes.

Hi,

This change broke most of the buildbots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32015/steps/check-llvm%20asan/logs/stdio
lib/CodeGen/LexicalScopes.cpp:176: llvm::LexicalScope *llvm::LexicalScopes::getOrCreateRegularScope(const llvm::DILocalScope *): Assertion `cast<DISubprogram>(Scope)->describes(&MF->getFunction())' failed.

I am going to revert it.

Thanks!

reverted in r360190.

This update addresses the problems with the original patch that caused the buildbot failures
(sorry about that!).

The original patch fixed SplitBlockPredecessors(..) so that when splitting a loop header
the new preheader was assigned the DebugLoc of the start of the loop. The DebugLoc
was determined by calling Loop::getStartLoc(), which relies on loop metadata (MD_Loop).

Currently, when inlining a loop, the loop metadata is cloned without modification. This means
that the loop metadata "start" and "end" locations refer to the locations in the cloned function.
That is, the fact that the loop has been inlined is ignored in the loop metadata.

This update copies and then modifies the loop metadata during inlining such that the "start" and
"end" DILocations include "inlinedAt" metadata.
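The idea behind this update can be sketched with a toy location type. `Loc`, `withInlinedAt`, and `enclosingFunction` are hypothetical names for this illustration and are not the InlineFunction.cpp implementation; the sketch also assumes the callee location is not already part of an inline chain.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy location: line, owning function, and the call site it was inlined at.
// A stand-in for illustration, not llvm::DILocation.
struct Loc {
  unsigned Line;
  std::string Function;
  std::shared_ptr<const Loc> InlinedAt; // null when not inlined
};

// Copy a loop "start" or "end" location from the callee and record the call
// site, instead of reusing the callee's location unchanged.
Loc withInlinedAt(const Loc &CalleeLoc, const Loc &CallSite) {
  assert(!CalleeLoc.InlinedAt && "toy model: callee location is not inlined");
  Loc Copy = CalleeLoc;
  Copy.InlinedAt = std::make_shared<const Loc>(CallSite);
  return Copy;
}

// Walk the inlinedAt chain to find the function the code now lives in.
const std::string &enclosingFunction(const Loc &L) {
  const Loc *Cur = &L;
  while (Cur->InlinedAt)
    Cur = Cur->InlinedAt.get();
  return Cur->Function;
}
```

Without the `InlinedAt` link, a location copied from the callee would still resolve to the callee as its enclosing function. That is the kind of mismatch that the LexicalScopes assertion quoted earlier in this thread rejects.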

There have been a few very minor changes to the original patch, so the files of interest are:
"InlineFunction.cpp", "inlined-loop-metadata.ll", "inlined-argument.ll", "bcmp-debugify-remarks.ll"

A new regression test "inlined-loop-metadata.ll" covers the loop metadata problem.
"bcmp-debugify-remarks.ll" and "inlined-argument.ll" needed to be updated
because they rely on the old incorrect metadata.

Thank you,
Orlando

Herald added subscribers: eraman, javed.absar. May 14 2019, 8:17 AM

Orlando reopened this revision. May 14 2019, 8:23 AM

This revision is now accepted and ready to land. May 14 2019, 8:23 AM

Orlando requested review of this revision. May 14 2019, 8:27 AM

It sounds like inlining a function with a loop has a bug with respect to how the loop metadata is handled, and your patch merely tripped over that. Would it be reasonable to fix the inlining-loop-metadata bug separately first? And then the original patch is likely to Just Work?
Fixing one bug at a time is more in line with project practices.

Orlando mentioned this in D61933: [DebugInfo] Update loop metadata for inlined loops. May 15 2019, 1:37 AM

Orlando edited the summary of this revision. May 15 2019, 1:43 AM

Diffusion mentioned this in rL361132: [DebugInfo] Update loop metadata for inlined loops. May 20 2019, 2:38 AM

Orlando mentioned this in rG6e8f1a80cd98: [DebugInfo] Update loop metadata for inlined loops. May 20 2019, 2:41 AM

Diffusion mentioned this in rL361149: Resubmit "[DebugInfo] Update loop metadata for inlined loops". May 20 2019, 6:00 AM

Orlando mentioned this in rGed67bf8d2f31: Resubmit "[DebugInfo] Update loop metadata for inlined loops". May 20 2019, 6:04 AM

The inlined loop metadata part has been separated into D61933 (rL361149) which this patch is now based on.

This patch "just works" with D61933 and a few test changes: "bcmp-debugify-remarks.ll" and "inlined-argument.ll" needed to be updated because they rely on the old incorrect metadata.

Ping.

The major change to this previously accepted patch is the modification of "bcmp-debugify-remarks.ll" and "inlined-argument.ll".
Minor changes include spelling corrections in comments and removing some superfluous debug data from the new tests.

sidorovd mentioned this in rG921aab4c18ff: [DebugInfo] Update loop metadata for inlined loops. May 30 2019, 10:43 AM

sidorovd mentioned this in rGb7c9c9a7cbea: Resubmit "[DebugInfo] Update loop metadata for inlined loops".

Looking at the delta between the approved and ~final version, it looks good to me, seeing how the bug this tripped over was fixed in D61933 and nothing in the code has changed.

(I'll leave it a day before clicking the green button, in case people have further opinions).

LGTM -- this is the already-accepted patch with a trivial change; the real difference is that a bug got fixed elsewhere.

This revision is now accepted and ready to land. Jun 4 2019, 2:44 AM

Closed by commit rL363046: [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through… (authored by orlandoch). Jun 11 2019, 3:34 AM

This revision was automatically updated to reflect the committed changes.

This patch broke our internal CI. You can reproduce the issue by downloading https://gist.github.com/adrian-prantl/ba88912878db855ec96534e6510246e6 (this is AArch64 bitcode for a file in the Swift stdlib) and running

clang "-cc1" "-triple" "arm64-apple-ios7.0.0" "-emit-obj" "-disable-llvm-passes" "-target-abi" "darwinpcs" "-O2" "-x" "ir"  "-o" "-" test.ll

!dbg attachment points at wrong subprogram for function
!631 = distinct !DISubprogram(name: "__hidden#25349_", scope: !9, file: !9, line: 318, type: !10, scopeLine: 319, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5)
%"__ir_hidden#3555_"* (%"__ir_hidden#3418_"*, %"__ir_hidden#3549_"*)* @"\01__hidden#24886_"
  br label %19, !dbg !653
!653 = !DILocation(line: 258, column: 5, scope: !643)
!643 = distinct !DISubprogram(name: "__hidden#25389_", scope: !9, file: !9, line: 247, type: !10, scopeLine: 247, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !5)
!643 = distinct !DISubprogram(name: "__hidden#25389_", scope: !9, file: !9, line: 247, type: !10, scopeLine: 247, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !5)
fatal error: error in backend: Broken module found, compilation aborted!

Would you mind reverting the patch? Please let me know if you need any help investigating this. My assumption is that the patch somewhere doesn't take the inlinedAt: debug info into account.

Diffusion mentioned this in rL363132: Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step…. Jun 12 2019, 1:32 AM

Orlando mentioned this in rGa94715639619: Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step…. Jun 12 2019, 1:32 AM

No problem at all - reverted with rGa94715639619. I'll take a look at this next week, thank you for the info.

Thanks for the quick response! Please let me know if I can help debugging this.

Hi Adrian.

This seems to be a case of having old, incorrect, metadata baked into the test.
The 'startLoc' (!680) and 'endLoc' (!681) in loop metadata for loop !679 are both
missing an 'inlinedAt' node.

!680 = !DILocation(line: 258, column: 5, scope: !659)
!681 = !DILocation(line: 259, column: 30, scope: !659)

Regenerating this test with patch rL361149 should fix this.

Note: clang reports no errors when the following DILocations are modified by
hand. !660 is a DILocation with scope !647 which refers to a location within
the caller.

!680 = !DILocation(line: 258, column: 5, scope: !659, inlinedAt: !660)
!681 = !DILocation(line: 259, column: 30, scope: !659, inlinedAt: !660)
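
To illustrate why the missing 'inlinedAt' matters, here is a minimal, self-contained C++ sketch. It is a toy model, not LLVM's Verifier: the `Subprogram` and `Location` structs and both function names are hypothetical, and lexical blocks are left out so that every scope is a subprogram. It assumes the check compares the subprogram reached through the outermost 'inlinedAt' location with the subprogram of the enclosing function.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for debug-info metadata; these are NOT the LLVM classes.
struct Subprogram {
  std::string Name;
};

struct Location {
  unsigned Line;
  const Subprogram *Scope;   // subprogram the source line belongs to
  const Location *InlinedAt; // call site it was inlined at, or nullptr
};

// Follow the inlinedAt chain to the outermost call site and return the
// subprogram that location is scoped in.
const Subprogram *outermostSubprogram(const Location &Loc) {
  const Location *L = &Loc;
  while (L->InlinedAt)
    L = L->InlinedAt;
  assert(L->Scope && "every location needs a scope");
  return L->Scope;
}

// In this model, a location is valid inside Fn only if its outermost
// scope is Fn itself.
bool locationBelongsTo(const Location &Loc, const Subprogram &Fn) {
  return outermostSubprogram(Loc) == &Fn;
}
```

In this model, a callee location built without `InlinedAt` fails `locationBelongsTo` for the caller, while the same line with `InlinedAt` pointing at a caller-scoped call site passes, which mirrors the hand-modified !680 and !681 above.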

This seems to be a case of having old, incorrect, metadata baked into the test.
The 'startLoc' (!680) and 'endLoc' (!681) in loop metadata for loop !679 are both
missing an 'inlinedAt' node.

Thank you very much for the analysis!

There is one thing I don't understand:
It looks like !680 is only used in the !llvm.loop !679 metadata. How can that trigger a "!dbg attachment points at wrong subprogram for function" verifier failure?

https://github.com/llvm/llvm-project/blob/5d00c3060e11b1b8725c0af110f011c4d110d39a/llvm/lib/IR/Verifier.cpp#L2337

IIUC, that check is only performed for !dbg attachments and should skip over !llvm.loop attachments. Do loop attachments leak into the !dbg attachments? If yes, we should probably extend the verifier to also check loop attachments in that loop, so this situation can be detected and rejected at import time.

In D60831#1546469, @aprantl wrote:

There is one thing I don't understand:
It looks like !680 is only used in the !llvm.loop !679 metadata. How can that trigger a "!dbg attachment points at wrong subprogram for function" verifier failure?

Empty loop preheaders are sometimes removed and later regenerated. This patch
uses loop metadata to identify the start of the loop when regenerating the
preheader. The DILocation of the start node (node index 1) of the loop
metadata is given to the new preheader's branch.
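
The selection described above can be sketched in a few lines of standalone C++. This is a simplified model rather than the patch itself: `DebugLoc`, `Loop`, `Block`, and `preheaderBranchLoc` are stand-in names invented for the example, and the loop start location is assumed to have already been read from the loop metadata.

```cpp
// Toy model only; none of these types are the LLVM classes of the same name.
struct DebugLoc {
  unsigned Line;
  unsigned Col;
};

struct Loop {
  DebugLoc StartLoc; // assumed to come from the loop metadata start node
};

struct Block {
  DebugLoc FirstInstLoc; // location of the block's first real instruction
  const Loop *HeaderOf;  // loop this block is the header of, or nullptr
};

// Choose the location for the branch that ends a newly split-off
// predecessor block of BB.
DebugLoc preheaderBranchLoc(const Block &BB) {
  // Splitting the predecessors of a loop header creates a preheader, so
  // use the loop start location instead of a line inside the loop body.
  if (BB.HeaderOf)
    return BB.HeaderOf->StartLoc;
  // Otherwise fall back to the first instruction of the block.
  return BB.FirstInstLoc;
}
```

Because the branch inherits whatever location the loop metadata carries, a start location that lacks 'inlinedAt' ends up on a real instruction, which is where a check on !dbg attachments can then see it.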

I put up a Verifier patch at https://reviews.llvm.org/D63499. Once that patch has landed, this patch should be safe to re-apply.

Thanks, I've now resubmitted this patch (1251cac62af5).

Thanks for your help!

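Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp	8 lines
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp	20 lines
llvm/trunk/test/DebugInfo/AArch64/inlined-argument.ll	151 lines
llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll	52 lines
llvm/trunk/test/Transforms/LoopIdiom/memset-debugify-remarks.ll	16 lines
llvm/trunk/test/Transforms/LoopSimplify/dbg-loc.ll	2 lines
llvm/trunk/test/Transforms/LoopSimplify/do-preheader-dbg.ll	122 lines
llvm/trunk/test/Transforms/LoopSimplify/for-preheader-dbg.ll	102 lines
llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll	16 lines
llvm/trunk/test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll	12 lines
llvm/trunk/test/Transforms/LoopVectorize/X86/vectorization-remarks-profitable.ll	4 lines
llvm/trunk/test/Transforms/LoopVectorize/debugloc.ll	9 lines
llvm/trunk/test/Transforms/LoopVectorize/fix-reduction-dbg.ll	87 lines
llvm/trunk/test/Transforms/LoopVectorize/unsafe-dep-remark.ll	2 lines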

Diff 204005

llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp

@@ BasicBlock *llvm::SplitBlockPredecessors(BasicBlock *BB,
   }

   // Create new basic block, insert right before the original block.
   BasicBlock *NewBB = BasicBlock::Create(
       BB->getContext(), BB->getName() + Suffix, BB->getParent(), BB);

   // The new block unconditionally branches to the old block.
   BranchInst *BI = BranchInst::Create(BB, NewBB);
-  BI->setDebugLoc(BB->getFirstNonPHIOrDbg()->getDebugLoc());
+  // Splitting the predecessors of a loop header creates a preheader block.
+  if (LI && LI->isLoopHeader(BB))
+    // Using the loop start line number prevents debuggers stepping into the
+    // loop body for this instruction.
+    BI->setDebugLoc(LI->getLoopFor(BB)->getStartLoc());
+  else
+    BI->setDebugLoc(BB->getFirstNonPHIOrDbg()->getDebugLoc());

   // Move the edges from Preds to point to NewBB instead of BB.
   for (unsigned i = 0, e = Preds.size(); i != e; ++i) {
     // This is slightly more strict than necessary; the minimum requirement
     // is that there be no more than one indirectbr branching to BB. And
     // all BlockAddress uses would need to be updated.
     assert(!isa<IndirectBrInst>(Preds[i]->getTerminator()) &&
            "Cannot split an edge from an IndirectBrInst");

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ BasicBlock *InnerLoopVectorizer::createVectorizedLoopSkeleton() {
   // If (N - N%VF) == N, then we don't need to run the remainder.
   // If tail is to be folded, we know we don't need to run the remainder.
   Value *CmpN = Builder.getTrue();
   if (!Cost->foldTailByMasking()) {
     CmpN =
         CmpInst::Create(Instruction::ICmp, CmpInst::ICMP_EQ, Count,
                         CountRoundDown, "cmp.n", MiddleBlock->getTerminator());

-    // Provide correct stepping behaviour by using the same DebugLoc as the
-    // scalar loop latch branch cmp if it exists.
-    if (CmpInst *ScalarLatchCmp =
-            dyn_cast_or_null<CmpInst>(ScalarLatchBr->getCondition()))
-      cast<Instruction>(CmpN)->setDebugLoc(ScalarLatchCmp->getDebugLoc());
+    // Here we use the same DebugLoc as the scalar loop latch branch instead
+    // of the corresponding compare because they may have ended up with
+    // different line numbers and we want to avoid awkward line stepping while
+    // debugging. Eg. if the compare has got a line number inside the loop.
+    cast<Instruction>(CmpN)->setDebugLoc(ScalarLatchBr->getDebugLoc());
   }

   BranchInst *BrInst = BranchInst::Create(ExitBlock, ScalarPH, CmpN);
   BrInst->setDebugLoc(ScalarLatchBr->getDebugLoc());
   ReplaceInstWithInst(MiddleBlock->getTerminator(), BrInst);

   // Get ready to start creating new instructions into the vectorized body.
   Builder.SetInsertPoint(&*VecBody->getFirstInsertionPt());

@@ for (unsigned Part = 0; Part < UF; ++Part) {
       RdxParts[Part] = Builder.CreateTrunc(RdxParts[Part], RdxVecTy);
       VectorLoopValueMap.resetVectorValue(LoopExitInst, Part, RdxParts[Part]);
     }
   }

   // Reduce all of the unrolled parts into a single vector.
   Value *ReducedPartRdx = VectorLoopValueMap.getVectorValue(LoopExitInst, 0);
   unsigned Op = RecurrenceDescriptor::getRecurrenceBinOp(RK);
-  setDebugLocFromInst(Builder, ReducedPartRdx);
+  // The middle block terminator has already been assigned a DebugLoc here (the
+  // OrigLoop's single latch terminator). We want the whole middle block to
+  // appear to execute on this line because: (a) it is all compiler generated,
+  // (b) these instructions are always executed after evaluating the latch
+  // conditional branch, and (c) other passes may add new predecessors which
+  // terminate on this line. This is the easiest way to ensure we don't
+  // accidentally cause an extra step back into the loop while debugging.
+  setDebugLocFromInst(Builder, LoopMiddleBlock->getTerminator());
   for (unsigned Part = 1; Part < UF; ++Part) {
     Value *RdxPart = VectorLoopValueMap.getVectorValue(LoopExitInst, Part);
     if (Op != Instruction::ICmp && Op != Instruction::FCmp)
       // Floating point operations had to be 'fast' to enable the reduction.
       ReducedPartRdx = addFastMathFlag(
           Builder.CreateBinOp((Instruction::BinaryOps)Op, RdxPart,
                               ReducedPartRdx, "bin.rdx"),
           RdxDesc.getFastMathFlags());

llvm/trunk/test/DebugInfo/AArch64/inlined-argument.ll

	Show All 32 Lines
	; }			; }
	; int g(t_t t, unsigned long long r) {			; int g(t_t t, unsigned long long r) {
	; struct q *q;			; struct q *q;
	; q = find(t, r);			; q = find(t, r);
	; if (!q)			; if (!q)
	; if (__builtin_expect(enable, 0)) { }			; if (__builtin_expect(enable, 0)) { }
	; }			; }

				; ModuleID = 'inlined-arg.c'
	source_filename = "test.i"			source_filename = "inlined-arg.c"
	target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
	target triple = "arm64-apple-ios5.0.0"			target triple = "arm64-apple-ios5.0.0"

	%struct.t = type { %struct.q* }			%struct.t = type { %struct.q* }
	%struct.q = type { %struct.q*, i64 }			%struct.q = type { %struct.q*, i64 }

	@tt = local_unnamed_addr global %struct.t* null, align 8, !dbg !0			@tt = common local_unnamed_addr global %struct.t* null, align 8, !dbg !0

	; Function Attrs: noredzone nounwind readonly ssp			; Function Attrs: norecurse nounwind readonly ssp uwtable
	define i32 @g(%struct.t* nocapture readonly %t, i64 %r) local_unnamed_addr #0 !dbg !20 {			define i32 @g(%struct.t* nocapture readonly %t, i64 %r) local_unnamed_addr !dbg !21 {
	entry:			entry:
	tail call void @llvm.dbg.value(metadata %struct.t* %t, metadata !26, metadata !DIExpression()), !dbg !29			call void @llvm.dbg.value(metadata %struct.t* %t, metadata !27, metadata !DIExpression()), !dbg !30
	tail call void @llvm.dbg.value(metadata i64 %r, metadata !27, metadata !DIExpression()), !dbg !30			call void @llvm.dbg.value(metadata i64 %r, metadata !28, metadata !DIExpression()), !dbg !31
	tail call void @llvm.dbg.value(metadata %struct.t* %t, metadata !31, metadata !DIExpression()), !dbg !39			call void @llvm.dbg.value(metadata %struct.t* %t, metadata !32, metadata !DIExpression()), !dbg !39
	tail call void @llvm.dbg.value(metadata i64 %r, metadata !37, metadata !DIExpression()), !dbg !41			call void @llvm.dbg.value(metadata i64 %r, metadata !37, metadata !DIExpression()), !dbg !41
	%s.i5 = bitcast %struct.t* %t to %struct.q**			%s.i = getelementptr inbounds %struct.t, %struct.t* %t, i64 0, i32 0, !dbg !42
	tail call void @llvm.dbg.value(metadata %struct.q** %s.i5, metadata !38, metadata !DIExpression(DW_OP_deref)), !dbg !42			%q.05.i = load %struct.q*, %struct.q** %s.i, align 8, !dbg !43, !tbaa !44
	%q.06.i = load %struct.q*, %struct.q** %s.i5, align 8			call void @llvm.dbg.value(metadata %struct.q* %q.05.i, metadata !38, metadata !DIExpression()), !dbg !48
	tail call void @llvm.dbg.value(metadata %struct.q* %q.06.i, metadata !38, metadata !DIExpression()), !dbg !42			%tobool6.i = icmp eq %struct.q* %q.05.i, null, !dbg !49
	%tobool7.i = icmp eq %struct.q* %q.06.i, null, !dbg !43			br i1 %tobool6.i, label %find.exit, label %while.body.i, !dbg !49
	br i1 %tobool7.i, label %find.exit, label %while.body.i.preheader, !dbg !43
				while.body.i: ; preds = %entry, %if.end.i
	while.body.i.preheader: ; preds = %entry			%q.07.i = phi %struct.q* [ %q.0.i, %if.end.i ], [ %q.05.i, %entry ]
	br label %while.body.i, !dbg !44			%resource1.i = getelementptr inbounds %struct.q, %struct.q* %q.07.i, i64 0, i32 1, !dbg !50
				%0 = load i64, i64* %resource1.i, align 8, !dbg !50, !tbaa !53
	while.body.i: ; preds = %while.body.i.preheader, %if.end.i			%cmp.i = icmp eq i64 %0, %r, !dbg !56
	%q.08.i = phi %struct.q* [ %q.0.i, %if.end.i ], [ %q.06.i, %while.body.i.preheader ]			br i1 %cmp.i, label %find.exit, label %if.end.i, !dbg !57
	%resource1.i = getelementptr inbounds %struct.q, %struct.q* %q.08.i, i64 0, i32 1, !dbg !44
	%0 = load i64, i64* %resource1.i, align 8, !dbg !44
	%cmp.i = icmp eq i64 %0, %r, !dbg !47
	br i1 %cmp.i, label %find.exit, label %if.end.i, !dbg !48

	if.end.i: ; preds = %while.body.i			if.end.i: ; preds = %while.body.i
	%next.i6 = bitcast %struct.q* %q.08.i to %struct.q**			%next.i = getelementptr inbounds %struct.q, %struct.q* %q.07.i, i64 0, i32 0, !dbg !58
	tail call void @llvm.dbg.value(metadata %struct.q** %next.i6, metadata !38, metadata !DIExpression(DW_OP_deref)), !dbg !42			%q.0.i = load %struct.q*, %struct.q** %next.i, align 8, !dbg !43, !tbaa !44
	%q.0.i = load %struct.q*, %struct.q** %next.i6, align 8			call void @llvm.dbg.value(metadata %struct.q* %q.0.i, metadata !38, metadata !DIExpression()), !dbg !48
	tail call void @llvm.dbg.value(metadata %struct.q* %q.0.i, metadata !38, metadata !DIExpression()), !dbg !42			%tobool.i = icmp eq %struct.q* %q.0.i, null, !dbg !49
	%tobool.i = icmp eq %struct.q* %q.0.i, null, !dbg !43			br i1 %tobool.i, label %find.exit, label %while.body.i, !dbg !49, !llvm.loop !59
	br i1 %tobool.i, label %find.exit, label %while.body.i, !dbg !43, !llvm.loop !49

	find.exit: ; preds = %while.body.i, %if.end.i, %entry			find.exit: ; preds = %while.body.i, %if.end.i, %entry
	ret i32 undef, !dbg !52			call void @llvm.dbg.value(metadata %struct.q* undef, metadata !29, metadata !DIExpression()), !dbg !61
				ret i32 undef, !dbg !62
	}			}

	; Function Attrs: nounwind readnone speculatable			; Function Attrs: nounwind readnone speculatable
	declare void @llvm.dbg.value(metadata, metadata, metadata) #1			declare void @llvm.dbg.value(metadata, metadata, metadata)

	attributes #0 = { noredzone nounwind readonly ssp }
	attributes #1 = { nounwind readnone speculatable }

	!llvm.dbg.cu = !{!2}			!llvm.dbg.cu = !{!2}
	!llvm.module.flags = !{!16, !17, !18}			!llvm.module.flags = !{!16, !17, !18, !19}
	!llvm.ident = !{!19}			!llvm.ident = !{!20}

	!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())			!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
	!1 = distinct !DIGlobalVariable(name: "tt", scope: !2, file: !3, line: 8, type: !6, isLocal: false, isDefinition: true)			!1 = distinct !DIGlobalVariable(name: "tt", scope: !2, file: !3, line: 8, type: !6, isLocal: false, isDefinition: true)
	!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 6.0.0 (trunk 317516) (llvm/trunk 317518)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5)			!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 9.0.0 (https://github.com/llvm/llvm-project.git cd3671d5dabc8848619d872f994770167a44ac5a)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, nameTableKind: GNU)
	!3 = !DIFile(filename: "test.i", directory: "/")			!3 = !DIFile(filename: "inlined-arg.c", directory: "")
	!4 = !{}			!4 = !{}
	!5 = !{!0}			!5 = !{!0}
	!6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64)			!6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64)
	!7 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t", file: !3, line: 3, size: 64, elements: !8)			!7 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t", file: !3, line: 3, size: 64, elements: !8)
	!8 = !{!9}			!8 = !{!9}
	!9 = !DIDerivedType(tag: DW_TAG_member, name: "s", scope: !7, file: !3, line: 7, baseType: !10, size: 64)			!9 = !DIDerivedType(tag: DW_TAG_member, name: "s", scope: !7, file: !3, line: 7, baseType: !10, size: 64)
	!10 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)			!10 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
	!11 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "q", file: !3, line: 4, size: 128, elements: !12)			!11 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "q", file: !3, line: 4, size: 128, elements: !12)
	!12 = !{!13, !14}			!12 = !{!13, !14}
	!13 = !DIDerivedType(tag: DW_TAG_member, name: "next", scope: !11, file: !3, line: 5, baseType: !10, size: 64)			!13 = !DIDerivedType(tag: DW_TAG_member, name: "next", scope: !11, file: !3, line: 5, baseType: !10, size: 64)
	!14 = !DIDerivedType(tag: DW_TAG_member, name: "resource", scope: !11, file: !3, line: 6, baseType: !15, size: 64, offset: 64)			!14 = !DIDerivedType(tag: DW_TAG_member, name: "resource", scope: !11, file: !3, line: 6, baseType: !15, size: 64, offset: 64)
	!15 = !DIBasicType(name: "long long unsigned int", size: 64, encoding: DW_ATE_unsigned)			!15 = !DIBasicType(name: "long long unsigned int", size: 64, encoding: DW_ATE_unsigned)
	!16 = !{i32 2, !"Dwarf Version", i32 2}			!16 = !{i32 2, !"Dwarf Version", i32 2}
	!17 = !{i32 2, !"Debug Info Version", i32 3}			!17 = !{i32 2, !"Debug Info Version", i32 3}
	!18 = !{i32 1, !"wchar_size", i32 4}			!18 = !{i32 1, !"wchar_size", i32 4}
	!19 = !{!"clang version 6.0.0 (trunk 317516) (llvm/trunk 317518)"}			!19 = !{i32 7, !"PIC Level", i32 2}
	!20 = distinct !DISubprogram(name: "g", scope: !3, file: !3, line: 18, type: !21, isLocal: false, isDefinition: true, scopeLine: 18, flags: DIFlagPrototyped, isOptimized: true, unit: !2, retainedNodes: !25)			!20 = !{!"clang version 9.0.0 (https://github.com/llvm/llvm-project.git cd3671d5dabc8848619d872f994770167a44ac5a)"}
	!21 = !DISubroutineType(types: !22)			!21 = distinct !DISubprogram(name: "g", scope: !3, file: !3, line: 19, type: !22, scopeLine: 19, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !26)
	!22 = !{!23, !24, !15}			!22 = !DISubroutineType(types: !23)
	!23 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)			!23 = !{!24, !25, !15}
	!24 = !DIDerivedType(tag: DW_TAG_typedef, name: "t_t", file: !3, line: 1, baseType: !6)			!24 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
	!25 = !{!26, !27, !28}			!25 = !DIDerivedType(tag: DW_TAG_typedef, name: "t_t", file: !3, line: 1, baseType: !6)
	!26 = !DILocalVariable(name: "t", arg: 1, scope: !20, file: !3, line: 18, type: !24)			!26 = !{!27, !28, !29}
	!27 = !DILocalVariable(name: "r", arg: 2, scope: !20, file: !3, line: 18, type: !15)			!27 = !DILocalVariable(name: "t", arg: 1, scope: !21, file: !3, line: 19, type: !25)
	!28 = !DILocalVariable(name: "q", scope: !20, file: !3, line: 19, type: !10)			!28 = !DILocalVariable(name: "r", arg: 2, scope: !21, file: !3, line: 19, type: !15)
	!29 = !DILocation(line: 18, column: 11, scope: !20)			!29 = !DILocalVariable(name: "q", scope: !21, file: !3, line: 20, type: !10)
	!30 = !DILocation(line: 18, column: 33, scope: !20)			!30 = !DILocation(line: 19, column: 11, scope: !21)
	!31 = !DILocalVariable(name: "t", arg: 1, scope: !32, file: !3, line: 9, type: !24)			!31 = !DILocation(line: 19, column: 33, scope: !21)
	!32 = distinct !DISubprogram(name: "find", scope: !3, file: !3, line: 9, type: !33, isLocal: true, isDefinition: true, scopeLine: 9, flags: DIFlagPrototyped, isOptimized: true, unit: !2, retainedNodes: !36)			!32 = !DILocalVariable(name: "t", arg: 1, scope: !33, file: !3, line: 10, type: !25)
	!33 = !DISubroutineType(types: !34)			!33 = distinct !DISubprogram(name: "find", scope: !3, file: !3, line: 10, type: !34, scopeLine: 10, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !36)
	!34 = !{!35, !24, !15}			!34 = !DISubroutineType(types: !35)
	!35 = !DIBasicType(name: "long unsigned int", size: 64, encoding: DW_ATE_unsigned)			!35 = !{!10, !25, !15}
	!36 = !{!31, !37, !38}			!36 = !{!32, !37, !38}
	!37 = !DILocalVariable(name: "resource", arg: 2, scope: !32, file: !3, line: 9, type: !15)			!37 = !DILocalVariable(name: "resource", arg: 2, scope: !33, file: !3, line: 10, type: !15)
	!38 = !DILocalVariable(name: "q", scope: !32, file: !3, line: 10, type: !10)			!38 = !DILocalVariable(name: "q", scope: !33, file: !3, line: 11, type: !10)
	!39 = !DILocation(line: 9, column: 31, scope: !32, inlinedAt: !40)			!39 = !DILocation(line: 10, column: 27, scope: !33, inlinedAt: !40)
	!40 = distinct !DILocation(line: 20, column: 7, scope: !20)			!40 = distinct !DILocation(line: 21, column: 7, scope: !21)
	!41 = !DILocation(line: 9, column: 53, scope: !32, inlinedAt: !40)			!41 = !DILocation(line: 10, column: 49, scope: !33, inlinedAt: !40)
	!42 = !DILocation(line: 10, column: 13, scope: !32, inlinedAt: !40)			!42 = !DILocation(line: 12, column: 10, scope: !33, inlinedAt: !40)
	!43 = !DILocation(line: 12, column: 3, scope: !32, inlinedAt: !40)			!43 = !DILocation(line: 0, scope: !33, inlinedAt: !40)
	!44 = !DILocation(line: 13, column: 12, scope: !45, inlinedAt: !40)			!44 = !{!45, !45, i64 0}
	!45 = distinct !DILexicalBlock(scope: !46, file: !3, line: 13, column: 9)			!45 = !{!"any pointer", !46, i64 0}
	!46 = distinct !DILexicalBlock(scope: !32, file: !3, line: 12, column: 13)			!46 = !{!"omnipotent char", !47, i64 0}
	!47 = !DILocation(line: 13, column: 21, scope: !45, inlinedAt: !40)			!47 = !{!"Simple C/C++ TBAA"}
	!48 = !DILocation(line: 13, column: 9, scope: !46, inlinedAt: !40)			!48 = !DILocation(line: 11, column: 13, scope: !33, inlinedAt: !40)
	!49 = distinct !{!49, !50, !51}			!49 = !DILocation(line: 13, column: 3, scope: !33, inlinedAt: !40)
	!50 = !DILocation(line: 12, column: 3, scope: !32)			!50 = !DILocation(line: 14, column: 12, scope: !51, inlinedAt: !40)
	!51 = !DILocation(line: 16, column: 3, scope: !32)			!51 = distinct !DILexicalBlock(scope: !52, file: !3, line: 14, column: 9)
	!52 = !DILocation(line: 24, column: 1, scope: !20)			!52 = distinct !DILexicalBlock(scope: !33, file: !3, line: 13, column: 13)
				!53 = !{!54, !55, i64 8}
				!54 = !{!"q", !45, i64 0, !55, i64 8}
				!55 = !{!"long long", !46, i64 0}
				!56 = !DILocation(line: 14, column: 21, scope: !51, inlinedAt: !40)
				!57 = !DILocation(line: 14, column: 9, scope: !52, inlinedAt: !40)
				!58 = !DILocation(line: 16, column: 12, scope: !52, inlinedAt: !40)
				!59 = distinct !{!59, !49, !60}
				!60 = !DILocation(line: 17, column: 3, scope: !33, inlinedAt: !40)
				!61 = !DILocation(line: 20, column: 13, scope: !21)
				!62 = !DILocation(line: 24, column: 1, scope: !21)

llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll

	Show All 34 Lines
	; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !25			; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !25
	; CHECK: for.cond:			; CHECK: for.cond:
	; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], [[COUNT]], !dbg !26			; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], [[COUNT]], !dbg !26
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !13, metadata !DIExpression()), !dbg !26			; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !13, metadata !DIExpression()), !dbg !26
	; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT:%.*]], !dbg !27			; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT:%.*]], !dbg !27
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ], !dbg !28			; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ], !dbg !28
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[I_015]], metadata !14, metadata !DIExpression()), !dbg !28			; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[I_015]], metadata !14, metadata !DIExpression()), !dbg !28
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_015]], !dbg !25			; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_015]], !dbg !29
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ARRAYIDX]], metadata !15, metadata !DIExpression()), !dbg !25			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ARRAYIDX]], metadata !15, metadata !DIExpression()), !dbg !29
	; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]], !dbg !29			; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]], !dbg !30
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[V0]], metadata !16, metadata !DIExpression()), !dbg !29			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[V0]], metadata !16, metadata !DIExpression()), !dbg !30
	; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_015]], !dbg !30			; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_015]], !dbg !31
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ARRAYIDX1]], metadata !17, metadata !DIExpression()), !dbg !30			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ARRAYIDX1]], metadata !17, metadata !DIExpression()), !dbg !31
	; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]], !dbg !31			; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]], !dbg !32
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[V1]], metadata !18, metadata !DIExpression()), !dbg !31			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[V1]], metadata !18, metadata !DIExpression()), !dbg !32
	; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]], !dbg !32			; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]], !dbg !33
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP3]], metadata !19, metadata !DIExpression()), !dbg !32			; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP3]], metadata !19, metadata !DIExpression()), !dbg !33
	; CHECK-NEXT: [[INC]] = add nuw i64 [[I_015]], 1, !dbg !33			; CHECK-NEXT: [[INC]] = add nuw i64 [[I_015]], 1, !dbg !34
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[INC]], metadata !20, metadata !DIExpression()), !dbg !33			; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[INC]], metadata !20, metadata !DIExpression()), !dbg !34
	; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT]], !dbg !34			; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT]], !dbg !25
	; CHECK: cleanup.loopexit:			; CHECK: cleanup.loopexit:
	; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ]			; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ]
	; CHECK-NEXT: br label [[CLEANUP]], !dbg !35			; CHECK-NEXT: br label [[CLEANUP]], !dbg !35
	; CHECK: cleanup:			; CHECK: cleanup:
	; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ], !dbg !36			; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ], !dbg !36
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[RES]], metadata !21, metadata !DIExpression()), !dbg !36			; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[RES]], metadata !21, metadata !DIExpression()), !dbg !36
	; CHECK-NEXT: ret i1 [[RES]], !dbg !35			; CHECK-NEXT: ret i1 [[RES]], !dbg !35
	;			;
	Show All 31 Lines
	; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !62			; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !62
	; CHECK: for.cond.cleanup.loopexit:			; CHECK: for.cond.cleanup.loopexit:
	; CHECK-NEXT: br label [[FOR_COND_CLEANUP]], !dbg !63			; CHECK-NEXT: br label [[FOR_COND_CLEANUP]], !dbg !63
	; CHECK: for.cond.cleanup:			; CHECK: for.cond.cleanup:
	; CHECK-NEXT: ret void, !dbg !63			; CHECK-NEXT: ret void, !dbg !63
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[I_012:%.]] = phi i64 [ [[INC:%.]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ], !dbg !64			; CHECK-NEXT: [[I_012:%.]] = phi i64 [ [[INC:%.]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ], !dbg !64
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[I_012]], metadata !40, metadata !DIExpression()), !dbg !64			; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[I_012]], metadata !40, metadata !DIExpression()), !dbg !64
	; CHECK-NEXT: [[ARRAYIDX:%.]] = getelementptr inbounds i8, i8** [[PTR0:%.*]], i64 [[I_012]], !dbg !62			; CHECK-NEXT: [[ARRAYIDX:%.]] = getelementptr inbounds i8, i8** [[PTR0:%.*]], i64 [[I_012]], !dbg !65
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8** [[ARRAYIDX]], metadata !41, metadata !DIExpression()), !dbg !62			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8** [[ARRAYIDX]], metadata !41, metadata !DIExpression()), !dbg !65
	; CHECK-NEXT: [[T0:%.*]] = load i8*, i8** [[ARRAYIDX]], !dbg !65			; CHECK-NEXT: [[T0:%.*]] = load i8*, i8** [[ARRAYIDX]], !dbg !66
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[T0]], metadata !42, metadata !DIExpression()), !dbg !65			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[T0]], metadata !42, metadata !DIExpression()), !dbg !66
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i64, i64* [[COUNT:%.*]], i64 [[I_012]], !dbg !66			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i64, i64* [[COUNT:%.*]], i64 [[I_012]], !dbg !67
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i64* [[ARRAYIDX2]], metadata !43, metadata !DIExpression()), !dbg !66			; CHECK-NEXT: call void @llvm.dbg.value(metadata i64* [[ARRAYIDX2]], metadata !43, metadata !DIExpression()), !dbg !67
	; CHECK-NEXT: [[T1:%.*]] = load i64, i64* [[ARRAYIDX2]], !dbg !67			; CHECK-NEXT: [[T1:%.*]] = load i64, i64* [[ARRAYIDX2]], !dbg !68
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[T1]], metadata !44, metadata !DIExpression()), !dbg !67			; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[T1]], metadata !44, metadata !DIExpression()), !dbg !68
	; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1]], !dbg !68			; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1]], !dbg !69
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ADD_PTR]], metadata !45, metadata !DIExpression()), !dbg !68			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ADD_PTR]], metadata !45, metadata !DIExpression()), !dbg !69
	; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1]], 0, !dbg !69			; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1]], 0, !dbg !70
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP5_I_I]], metadata !46, metadata !DIExpression()), !dbg !69			; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP5_I_I]], metadata !46, metadata !DIExpression()), !dbg !70
	; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label [[FOR_BODY_I_I_PREHEADER:%.*]], !dbg !70			; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label [[FOR_BODY_I_I_PREHEADER:%.*]], !dbg !62
	; CHECK: for.body.i.i.preheader:			; CHECK: for.body.i.i.preheader:
	; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i8*, i8** [[PTR1:%.*]], i64 [[I_012]], !dbg !71			; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i8*, i8** [[PTR1:%.*]], i64 [[I_012]], !dbg !71
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8** [[ARRAYIDX3]], metadata !47, metadata !DIExpression()), !dbg !71			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8** [[ARRAYIDX3]], metadata !47, metadata !DIExpression()), !dbg !71
	; CHECK-NEXT: [[T2:%.*]] = load i8*, i8** [[ARRAYIDX3]], !dbg !72			; CHECK-NEXT: [[T2:%.*]] = load i8*, i8** [[ARRAYIDX3]], !dbg !72
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[T2]], metadata !48, metadata !DIExpression()), !dbg !72			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[T2]], metadata !48, metadata !DIExpression()), !dbg !72
	; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]], !dbg !73			; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]], !dbg !73
	; CHECK: for.body.i.i:			; CHECK: for.body.i.i:
	; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[T2]], [[FOR_BODY_I_I_PREHEADER]] ], !dbg !74			; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[T2]], [[FOR_BODY_I_I_PREHEADER]] ], !dbg !74
	Show All 75 Lines

llvm/trunk/test/Transforms/LoopIdiom/memset-debugify-remarks.ll

	Show All 18 Lines
	; CHECK-NEXT: [[PTR1:%.*]] = ptrtoint i8* [[PTR:%.*]] to i64			; CHECK-NEXT: [[PTR1:%.*]] = ptrtoint i8* [[PTR:%.*]] to i64
	; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8* [[PTR]], [[END:%.*]], !dbg !15			; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8* [[PTR]], [[END:%.*]], !dbg !15
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP3]], metadata !9, metadata !DIExpression()), !dbg !15			; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP3]], metadata !9, metadata !DIExpression()), !dbg !15
	; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_END:%.*]], label [[FOR_BODY_PREHEADER:%.*]], !dbg !16			; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_END:%.*]], label [[FOR_BODY_PREHEADER:%.*]], !dbg !16
	; CHECK: for.body.preheader:			; CHECK: for.body.preheader:
	; CHECK-NEXT: [[TMP0:%.*]] = sub i64 0, [[PTR1]], !dbg !17			; CHECK-NEXT: [[TMP0:%.*]] = sub i64 0, [[PTR1]], !dbg !17
	; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[END]], i64 [[TMP0]], !dbg !17			; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[END]], i64 [[TMP0]], !dbg !17
	; CHECK-NEXT: [[SCEVGEP2:%.*]] = ptrtoint i8* [[SCEVGEP]] to i64			; CHECK-NEXT: [[SCEVGEP2:%.*]] = ptrtoint i8* [[SCEVGEP]] to i64
	; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 1 [[PTR]], i8 [[VALUE:%.*]], i64 [[SCEVGEP2]], i1 false), !dbg !17			; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 1 [[PTR]], i8 [[VALUE:%.*]], i64 [[SCEVGEP2]], i1 false), !dbg !18
	; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !17			; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !17
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[PTR_ADDR_04:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[FOR_BODY]] ], [ [[PTR]], [[FOR_BODY_PREHEADER]] ], !dbg !18			; CHECK-NEXT: [[PTR_ADDR_04:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[FOR_BODY]] ], [ [[PTR]], [[FOR_BODY_PREHEADER]] ], !dbg !19
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[PTR_ADDR_04]], metadata !11, metadata !DIExpression()), !dbg !18			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[PTR_ADDR_04]], metadata !11, metadata !DIExpression()), !dbg !19
	; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR_ADDR_04]], i64 1, !dbg !19			; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR_ADDR_04]], i64 1, !dbg !20
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[INCDEC_PTR]], metadata !13, metadata !DIExpression()), !dbg !19			; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[INCDEC_PTR]], metadata !13, metadata !DIExpression()), !dbg !20
	; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8* [[INCDEC_PTR]], [[END]], !dbg !20			; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8* [[INCDEC_PTR]], [[END]], !dbg !21
	; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !14, metadata !DIExpression()), !dbg !20			; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !14, metadata !DIExpression()), !dbg !21
	; CHECK-NEXT: br i1 [[CMP]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]], !dbg !21			; CHECK-NEXT: br i1 [[CMP]], label [[FOR_END_LOOPEXIT:%.*]], label [[FOR_BODY]], !dbg !17
	; CHECK: for.end.loopexit:			; CHECK: for.end.loopexit:
	; CHECK-NEXT: br label [[FOR_END]], !dbg !22			; CHECK-NEXT: br label [[FOR_END]], !dbg !22
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void, !dbg !22			; CHECK-NEXT: ret void, !dbg !22
	;			;
	entry:			entry:
	%cmp3 = icmp eq i8* %ptr, %end			%cmp3 = icmp eq i8* %ptr, %end
	br i1 %cmp3, label %for.end, label %for.body			br i1 %cmp3, label %for.end, label %for.body
	Show All 11 Lines

llvm/trunk/test/Transforms/LoopSimplify/dbg-loc.ll

Show First 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	eh.resume: ; preds = %catch
%1 = landingpad { i8*, i32 }		%1 = landingpad { i8*, i32 }
cleanup catch i8* bitcast ({ i8, i8, i8* }* @catchtypeinfo to i8*), !dbg !13		cleanup catch i8* bitcast ({ i8, i8, i8* }* @catchtypeinfo to i8*), !dbg !13
resume { i8*, i32 } undef, !dbg !13		resume { i8*, i32 } undef, !dbg !13
}		}

; Function Attrs: nounwind readnone		; Function Attrs: nounwind readnone
declare void @llvm.dbg.value(metadata, metadata, metadata)		declare void @llvm.dbg.value(metadata, metadata, metadata)

; CHECK-DAG: [[PREHEADER_LOC]] = !DILocation(line: 73, column: 27, scope: !{{[0-9]+}})		; CHECK-DAG: [[PREHEADER_LOC]] = !DILocation(line: 73, column: 13, scope: !{{[0-9]+}})
; CHECK-DAG: [[LOOPEXIT_LOC]] = !DILocation(line: 75, column: 9, scope: !{{[0-9]+}})		; CHECK-DAG: [[LOOPEXIT_LOC]] = !DILocation(line: 75, column: 9, scope: !{{[0-9]+}})
; CHECK-DAG: [[LPAD_PREHEADER_LOC]] = !DILocation(line: 85, column: 1, scope: !{{[0-9]+}})		; CHECK-DAG: [[LPAD_PREHEADER_LOC]] = !DILocation(line: 85, column: 1, scope: !{{[0-9]+}})

!llvm.module.flags = !{!0, !1, !2}		!llvm.module.flags = !{!0, !1, !2}
!llvm.dbg.cu = !{!14}		!llvm.dbg.cu = !{!14}
!0 = !{i32 2, !"Dwarf Version", i32 4}		!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}		!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = !{i32 1, !"PIC Level", i32 2}		!2 = !{i32 1, !"PIC Level", i32 2}
Show All 19 Lines

llvm/trunk/test/Transforms/LoopSimplify/do-preheader-dbg.ll

Property	Old Value	New Value
svn:executable	null	* \ No newline at end of property

				; Confirm that the line number for the do.body.preheader block
				; branch is the start of the loop.

				; RUN: opt -simplifycfg -loop-simplify -keep-loops="false" -S <%s | FileCheck %s

				; CHECK: do.body.preheader:
				; CHECK-NEXT: phi
				; CHECK-NEXT: phi
				; CHECK-NEXT: br label %do.body, !dbg ![[DL:[0-9]+]]
				; CHECK: ![[DL]] = !DILocation(line: 4,

				; This IR can be generated by running:
				; clang src.cpp -O2 -g -S -emit-llvm -mllvm -opt-bisect-limit=62 -o -
				;
				; Where src.cpp contains:
				; int foo(char *Bytes, int Count)
				; {
				; int Total = 0;
				; do
				; Total += Bytes[--Count];
				; while (Count);
				; return Total;
				; }

				define dso_local i32 @"foo"(i8* nocapture readonly %Bytes, i32 %Count) local_unnamed_addr !dbg !8 {
				entry:
				%0 = sext i32 %Count to i64, !dbg !10
				%min.iters.check = icmp ult i32 %Count, 8, !dbg !10
				br i1 %min.iters.check, label %do.body.preheader, label %vector.ph, !dbg !10

				vector.ph: ; preds = %entry
				%n.vec = and i64 %0, -8, !dbg !10
				%ind.end = sub nsw i64 %0, %n.vec, !dbg !10
				br label %vector.body, !dbg !10

				vector.body: ; preds = %vector.body, %vector.ph
				%index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
				%vec.phi = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %11, %vector.body ]
				%vec.phi5 = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %12, %vector.body ]
				%1 = xor i64 %index, -1, !dbg !11
				%2 = add i64 %1, %0, !dbg !11
				%3 = getelementptr inbounds i8, i8* %Bytes, i64 %2, !dbg !11
				%4 = getelementptr inbounds i8, i8* %3, i64 -3, !dbg !11
				%5 = bitcast i8* %4 to <4 x i8>*, !dbg !11
				%wide.load = load <4 x i8>, <4 x i8>* %5, align 1, !dbg !11, !tbaa !12
				%reverse = shufflevector <4 x i8> %wide.load, <4 x i8> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>, !dbg !11
				%6 = getelementptr inbounds i8, i8* %3, i64 -4, !dbg !11
				%7 = getelementptr inbounds i8, i8* %6, i64 -3, !dbg !11
				%8 = bitcast i8* %7 to <4 x i8>*, !dbg !11
				%wide.load6 = load <4 x i8>, <4 x i8>* %8, align 1, !dbg !11, !tbaa !12
				%reverse7 = shufflevector <4 x i8> %wide.load6, <4 x i8> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>, !dbg !11
				%9 = sext <4 x i8> %reverse to <4 x i32>, !dbg !11
				%10 = sext <4 x i8> %reverse7 to <4 x i32>, !dbg !11
				%11 = add nsw <4 x i32> %vec.phi, %9, !dbg !11
				%12 = add nsw <4 x i32> %vec.phi5, %10, !dbg !11
				%index.next = add i64 %index, 8
				%13 = icmp eq i64 %index.next, %n.vec
				br i1 %13, label %middle.block, label %vector.body, !llvm.loop !15

				middle.block: ; preds = %vector.body
				%.lcssa12 = phi <4 x i32> [ %11, %vector.body ], !dbg !11
				%.lcssa = phi <4 x i32> [ %12, %vector.body ], !dbg !11
				%bin.rdx = add <4 x i32> %.lcssa, %.lcssa12, !dbg !11
				%rdx.shuf = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>, !dbg !11
				%bin.rdx8 = add <4 x i32> %bin.rdx, %rdx.shuf, !dbg !11
				%rdx.shuf9 = shufflevector <4 x i32> %bin.rdx8, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>, !dbg !11
				%bin.rdx10 = add <4 x i32> %bin.rdx8, %rdx.shuf9, !dbg !11
				%14 = extractelement <4 x i32> %bin.rdx10, i32 0, !dbg !11
				%cmp.n = icmp eq i64 %n.vec, %0
				br i1 %cmp.n, label %do.end, label %do.body.preheader, !dbg !10

				do.body.preheader: ; preds = %middle.block, %entry
				%indvars.iv.ph = phi i64 [ %0, %entry ], [ %ind.end, %middle.block ]
				%Total.0.ph = phi i32 [ 0, %entry ], [ %14, %middle.block ]
				br label %do.body, !dbg !11

				do.body: ; preds = %do.body.preheader, %do.body
				%indvars.iv = phi i64 [ %indvars.iv.next, %do.body ], [ %indvars.iv.ph, %do.body.preheader ]
				%Total.0 = phi i32 [ %add, %do.body ], [ %Total.0.ph, %do.body.preheader ], !dbg !18
				%indvars.iv.next = add nsw i64 %indvars.iv, -1, !dbg !11
				%arrayidx = getelementptr inbounds i8, i8* %Bytes, i64 %indvars.iv.next, !dbg !11
				%15 = load i8, i8* %arrayidx, align 1, !dbg !11, !tbaa !12
				%conv = sext i8 %15 to i32, !dbg !11
				%add = add nsw i32 %Total.0, %conv, !dbg !11
				%16 = icmp eq i64 %indvars.iv.next, 0
				br i1 %16, label %do.end.loopexit, label %do.body, !dbg !11, !llvm.loop !19

				do.end.loopexit: ; preds = %do.body
				%add.lcssa11 = phi i32 [ %add, %do.body ], !dbg !11
				br label %do.end, !dbg !21

				do.end: ; preds = %do.end.loopexit, %middle.block
				%add.lcssa = phi i32 [ %14, %middle.block ], [ %add.lcssa11, %do.end.loopexit ], !dbg !11
				ret i32 %add.lcssa, !dbg !21
				}

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!3, !4, !5, !6}
				!llvm.ident = !{!7}

				!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !2, nameTableKind: None)
				!1 = !DIFile(filename: "src2.cpp", directory: "")
				!2 = !{}
				!3 = !{i32 2, !"CodeView", i32 1}
				!4 = !{i32 2, !"Debug Info Version", i32 3}
				!5 = !{i32 1, !"wchar_size", i32 2}
				!6 = !{i32 7, !"PIC Level", i32 2}
				!7 = !{!""}
				!8 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !9, scopeLine: 2, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2)
				!9 = !DISubroutineType(types: !2)
				!10 = !DILocation(line: 4, scope: !8)
				!11 = !DILocation(line: 5, scope: !8)
				!12 = !{!13, !13, i64 0}
				!13 = !{!"omnipotent char", !14, i64 0}
				!14 = !{!"Simple C++ TBAA"}
				!15 = distinct !{!15, !10, !16, !17}
				!16 = !DILocation(line: 6, scope: !8)
				!17 = !{!"llvm.loop.isvectorized", i32 1}
				!18 = !DILocation(line: 0, scope: !8)
				!19 = distinct !{!19, !10, !16, !20, !17}
				!20 = !{!"llvm.loop.unroll.runtime.disable"}
				!21 = !DILocation(line: 7, scope: !8)

llvm/trunk/test/Transforms/LoopSimplify/for-preheader-dbg.ll

Property	Old Value	New Value
svn:executable	null	* \ No newline at end of property

				; Confirm that the line number for the for.body.preheader block
				; branch is the start of the loop.

				; RUN: opt -simplifycfg -loop-simplify -S <%s | FileCheck %s
				;
				; CHECK: for.body.preheader:
				; CHECK-NEXT: br label %for.body, !dbg ![[DL:[0-9]+]]
				; CHECK: ![[DL]] = !DILocation(line: 8,

				; This IR can be generated by running:
				; clang src.cpp -O0 -g -S -emit-llvm -Xclang -disable-O0-optnone -o - | \
				; opt -O2 -S -opt-bisect-limit=27 -o -
				;
				; Where src.cpp contains:
				; int foo(int count, int *bar)
				; {
				; if (count + 1 > 256)
				; return 0;
				;
				; int ret = count;
				; int tmp;
				; for (int j = 0; j < count; j++) {
				; tmp = bar[j];
				; ret += tmp;
				; }
				;
				; return ret;
				; }

				define dso_local i32 @"foo"(i32 %count, i32* nocapture readonly %bar) local_unnamed_addr !dbg !8 {
				entry:
				%cmp = icmp sgt i32 %count, 255, !dbg !16
				br i1 %cmp, label %return, label %for.cond.preheader, !dbg !16

				for.cond.preheader: ; preds = %entry
				%cmp16 = icmp slt i32 0, %count, !dbg !19
				br i1 %cmp16, label %for.body.lr.ph, label %return.loopexit, !dbg !19

				for.body.lr.ph: ; preds = %for.cond.preheader
				br label %for.body, !dbg !19

				for.body: ; preds = %for.body.lr.ph, %for.body
				%j.08 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
				%ret.07 = phi i32 [ %count, %for.body.lr.ph ], [ %add2, %for.body ]
				%0 = zext i32 %j.08 to i64, !dbg !22
				%arrayidx = getelementptr inbounds i32, i32* %bar, i64 %0, !dbg !22
				%1 = load i32, i32* %arrayidx, align 4, !dbg !22
				%add2 = add nsw i32 %1, %ret.07, !dbg !27
				%inc = add nuw nsw i32 %j.08, 1, !dbg !28
				%cmp1 = icmp slt i32 %inc, %count, !dbg !19
				br i1 %cmp1, label %for.body, label %for.cond.return.loopexit_crit_edge, !dbg !19, !llvm.loop !29

				for.cond.return.loopexit_crit_edge: ; preds = %for.body
				%split = phi i32 [ %add2, %for.body ]
				br label %return.loopexit, !dbg !19

				return.loopexit: ; preds = %for.cond.return.loopexit_crit_edge, %for.cond.preheader
				%ret.0.lcssa = phi i32 [ %split, %for.cond.return.loopexit_crit_edge ], [ %count, %for.cond.preheader ], !dbg !31
				br label %return, !dbg !32

				return: ; preds = %return.loopexit, %entry
				%retval.0 = phi i32 [ 0, %entry ], [ %ret.0.lcssa, %return.loopexit ], !dbg !31
				ret i32 %retval.0, !dbg !32
				}

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!3, !4, !5, !6}
				!llvm.ident = !{!7}

				!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None)
				!1 = !DIFile(filename: "src.cpp", directory: "")
				!2 = !{}
				!3 = !{i32 2, !"CodeView", i32 1}
				!4 = !{i32 2, !"Debug Info Version", i32 3}
				!5 = !{i32 1, !"wchar_size", i32 2}
				!6 = !{i32 7, !"PIC Level", i32 2}
				!7 = !{!""}
				!8 = distinct !DISubprogram(name: "foo", linkageName: "?foo@@YAHHPEAH@Z", scope: !1, file: !1, line: 1, type: !9, scopeLine: 2, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
				!9 = !DISubroutineType(types: !10)
				!10 = !{!11, !11, !12}
				!11 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!12 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
				!13 = !DILocalVariable(name: "bar", arg: 2, scope: !8, file: !1, line: 1, type: !12)
				!14 = !DILocation(line: 1, scope: !8)
				!15 = !DILocalVariable(name: "count", arg: 1, scope: !8, file: !1, line: 1, type: !11)
				!16 = !DILocation(line: 3, scope: !8)
				!17 = !DILocalVariable(name: "j", scope: !18, file: !1, line: 8, type: !11)
				!18 = distinct !DILexicalBlock(scope: !8, file: !1, line: 8)
				!19 = !DILocation(line: 8, scope: !18)
				!20 = !DILocalVariable(name: "ret", scope: !8, file: !1, line: 6, type: !11)
				!21 = !DILocation(line: 6, scope: !8)
				!22 = !DILocation(line: 9, scope: !23)
				!23 = distinct !DILexicalBlock(scope: !24, file: !1, line: 8)
				!24 = distinct !DILexicalBlock(scope: !18, file: !1, line: 8)
				!25 = !DILocalVariable(name: "tmp", scope: !8, file: !1, line: 7, type: !11)
				!26 = !DILocation(line: 7, scope: !8)
				!27 = !DILocation(line: 10, scope: !23)
				!28 = !DILocation(line: 8, scope: !24)
				!29 = distinct !{!29, !19, !30}
				!30 = !DILocation(line: 11, scope: !18)
				!31 = !DILocation(line: 0, scope: !8)
				!32 = !DILocation(line: 14, scope: !8)

llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll

	; RUN: opt < %s -S -loop-unroll -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=true | FileCheck %s -check-prefix=EPILOG			; RUN: opt < %s -S -loop-unroll -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=true | FileCheck %s -check-prefix=EPILOG
	; RUN: opt < %s -S -loop-unroll -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=false | FileCheck %s -check-prefix=PROLOG			; RUN: opt < %s -S -loop-unroll -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=false | FileCheck %s -check-prefix=PROLOG

	; RUN: opt < %s -S -passes='require<opt-remark-emit>,unroll' -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=true | FileCheck %s -check-prefix=EPILOG			; RUN: opt < %s -S -passes='require<opt-remark-emit>,unroll' -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=true | FileCheck %s -check-prefix=EPILOG
	; RUN: opt < %s -S -passes='require<opt-remark-emit>,unroll' -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=false | FileCheck %s -check-prefix=PROLOG			; RUN: opt < %s -S -passes='require<opt-remark-emit>,unroll' -unroll-runtime -unroll-count=2 -unroll-runtime-epilog=false | FileCheck %s -check-prefix=PROLOG

	; This tests that setting the unroll count works			; This tests that setting the unroll count works


	; EPILOG: for.body.preheader:			; EPILOG: for.body.preheader:
	; EPILOG: br i1 %1, label %for.end.loopexit.unr-lcssa, label %for.body.preheader.new, !dbg [[PH_LOC:![0-9]+]]			; EPILOG: br i1 %1, label %for.end.loopexit.unr-lcssa, label %for.body.preheader.new, !dbg [[PH_LOC:![0-9]+]]
	; EPILOG: for.body:			; EPILOG: for.body:
	; EPILOG: br i1 %niter.ncmp.1, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !dbg [[BODY_LOC:![0-9]+]]			; EPILOG: br i1 %niter.ncmp.1, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !dbg [[PH_LOC]]
	; EPILOG-NOT: br i1 %niter.ncmp.2, label %for.end.loopexit{{.*}}, label %for.body			; EPILOG-NOT: br i1 %niter.ncmp.2, label %for.end.loopexit{{.*}}, label %for.body
	; EPILOG: for.body.epil.preheader:			; EPILOG: for.body.epil.preheader:
	; EPILOG: br label %for.body.epil, !dbg [[BODY_LOC]]			; EPILOG: br label %for.body.epil, !dbg [[PH_LOC]]
	; EPILOG: for.body.epil:			; EPILOG: for.body.epil:
	; EPILOG: br label %for.end.loopexit.epilog-lcssa, !dbg [[BODY_LOC]]			; EPILOG: br label %for.end.loopexit.epilog-lcssa, !dbg [[PH_LOC]]
	; EPILOG: for.end.loopexit:			; EPILOG: for.end.loopexit:
	; EPILOG: br label %for.end, !dbg [[EXIT_LOC:![0-9]+]]			; EPILOG: br label %for.end, !dbg [[EXIT_LOC:![0-9]+]]

	; EPILOG-DAG: [[PH_LOC]] = !DILocation(line: 101, column: 1, scope: !{{.*}})			; EPILOG-DAG: [[PH_LOC]] = !DILocation(line: 102, column: 1, scope: !{{.*}})
	; EPILOG-DAG: [[BODY_LOC]] = !DILocation(line: 102, column: 1, scope: !{{.*}})
	; EPILOG-DAG: [[EXIT_LOC]] = !DILocation(line: 103, column: 1, scope: !{{.*}})			; EPILOG-DAG: [[EXIT_LOC]] = !DILocation(line: 103, column: 1, scope: !{{.*}})

	; PROLOG: for.body.preheader:			; PROLOG: for.body.preheader:
	; PROLOG: br {{.*}} label %for.body.prol.preheader, label %for.body.prol.loopexit, !dbg [[PH_LOC:![0-9]+]]			; PROLOG: br {{.*}} label %for.body.prol.preheader, label %for.body.prol.loopexit, !dbg [[PH_LOC:![0-9]+]]
	; PROLOG: for.body.prol:			; PROLOG: for.body.prol:
	; PROLOG: br label %for.body.prol.loopexit, !dbg [[BODY_LOC:![0-9]+]]			; PROLOG: br label %for.body.prol.loopexit, !dbg [[PH_LOC:![0-9]+]]
	; PROLOG: for.body.prol.loopexit:			; PROLOG: for.body.prol.loopexit:
	; PROLOG: br {{.*}} label %for.end.loopexit, label %for.body.preheader.new, !dbg [[PH_LOC]]			; PROLOG: br {{.*}} label %for.end.loopexit, label %for.body.preheader.new, !dbg [[PH_LOC]]
	; PROLOG: for.body:			; PROLOG: for.body:
	; PROLOG: br i1 %exitcond.1, label %for.end.loopexit.unr-lcssa, label %for.body, !dbg [[BODY_LOC]]			; PROLOG: br i1 %exitcond.1, label %for.end.loopexit.unr-lcssa, label %for.body, !dbg [[PH_LOC]]
	; PROLOG-NOT: br i1 %exitcond.4, label %for.end.loopexit{{.*}}, label %for.body			; PROLOG-NOT: br i1 %exitcond.4, label %for.end.loopexit{{.*}}, label %for.body

	; PROLOG-DAG: [[PH_LOC]] = !DILocation(line: 101, column: 1, scope: !{{.*}})			; PROLOG-DAG: [[PH_LOC]] = !DILocation(line: 102, column: 1, scope: !{{.*}})
	; PROLOG-DAG: [[BODY_LOC]] = !DILocation(line: 102, column: 1, scope: !{{.*}})

	define i32 @test(i32* nocapture %a, i32 %n) nounwind uwtable readonly !dbg !6 {			define i32 @test(i32* nocapture %a, i32 %n) nounwind uwtable readonly !dbg !6 {
	entry:			entry:
	%cmp1 = icmp eq i32 %n, 0, !dbg !7			%cmp1 = icmp eq i32 %n, 0, !dbg !7
	br i1 %cmp1, label %for.end, label %for.body, !dbg !7			br i1 %cmp1, label %for.end, label %for.body, !dbg !7

	for.body: ; preds = %for.body, %entry			for.body: ; preds = %for.body, %entry
	%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]			%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
	Show All 32 Lines

llvm/trunk/test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll

	Show All 10 Lines
	; #pragma clang loop vectorize(enable) interleave(enable)			; #pragma clang loop vectorize(enable) interleave(enable)
	; for (int i = 0; i < Length; i++) {			; for (int i = 0; i < Length; i++) {
	; A[i] = i;			; A[i] = i;
	; if (A[i] > Length)			; if (A[i] > Length)
	; break;			; break;
	; }			; }
	; }			; }
	; File, line, and column should match those specified in the metadata			; File, line, and column should match those specified in the metadata
	; CHECK: remark: source.cpp:4:5: loop not vectorized: could not determine number of loop iterations			; CHECK: remark: source.cpp:5:9: loop not vectorized: could not determine number of loop iterations
	; CHECK: remark: source.cpp:4:5: loop not vectorized			; CHECK: remark: source.cpp:5:9: loop not vectorized

	; void test_disabled(int *A, int Length) {			; void test_disabled(int *A, int Length) {
	; #pragma clang loop vectorize(disable) interleave(disable)			; #pragma clang loop vectorize(disable) interleave(disable)
	; for (int i = 0; i < Length; i++)			; for (int i = 0; i < Length; i++)
	; A[i] = i;			; A[i] = i;
	; }			; }
	; CHECK: remark: source.cpp:13:5: loop not vectorized: vectorization and interleaving are explicitly disabled, or the loop has already been vectorized			; CHECK: remark: source.cpp:12:8: loop not vectorized: vectorization and interleaving are explicitly disabled, or the loop has already been vectorized

	; void test_array_bounds(int A, int B, int Length) {			; void test_array_bounds(int A, int B, int Length) {
	; #pragma clang loop vectorize(enable)			; #pragma clang loop vectorize(enable)
	; for (int i = 0; i < Length; i++)			; for (int i = 0; i < Length; i++)
	; A[i] = A[B[i]];			; A[i] = A[B[i]];
	; }			; }
	; CHECK: remark: source.cpp:19:5: loop not vectorized: cannot identify array bounds			; CHECK: remark: source.cpp:19:5: loop not vectorized: cannot identify array bounds
	; CHECK: remark: source.cpp:19:5: loop not vectorized			; CHECK: remark: source.cpp:19:5: loop not vectorized
	Show All 10 Lines
	; return k;			; return k;
	; }			; }
	; CHECK: remark: source.cpp:29:7: loop not vectorized: control flow cannot be substituted for a select			; CHECK: remark: source.cpp:29:7: loop not vectorized: control flow cannot be substituted for a select
	; CHECK: remark: source.cpp:27:3: loop not vectorized			; CHECK: remark: source.cpp:27:3: loop not vectorized

	; YAML: --- !Analysis			; YAML: --- !Analysis
	; YAML-NEXT: Pass: loop-vectorize			; YAML-NEXT: Pass: loop-vectorize
	; YAML-NEXT: Name: CantComputeNumberOfIterations			; YAML-NEXT: Name: CantComputeNumberOfIterations
	; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 4, Column: 5 }			; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 5, Column: 9 }
	; YAML-NEXT: Function: _Z4testPii			; YAML-NEXT: Function: _Z4testPii
	; YAML-NEXT: Args:			; YAML-NEXT: Args:
	; YAML-NEXT: - String: 'loop not vectorized: '			; YAML-NEXT: - String: 'loop not vectorized: '
	; YAML-NEXT: - String: could not determine number of loop iterations			; YAML-NEXT: - String: could not determine number of loop iterations
	; YAML-NEXT: ...			; YAML-NEXT: ...
	; YAML-NEXT: --- !Missed			; YAML-NEXT: --- !Missed
	; YAML-NEXT: Pass: loop-vectorize			; YAML-NEXT: Pass: loop-vectorize
	; YAML-NEXT: Name: MissedDetails			; YAML-NEXT: Name: MissedDetails
	; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 4, Column: 5 }			; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 5, Column: 9 }
	; YAML-NEXT: Function: _Z4testPii			; YAML-NEXT: Function: _Z4testPii
	; YAML-NEXT: Args:			; YAML-NEXT: Args:
	; YAML-NEXT: - String: loop not vectorized			; YAML-NEXT: - String: loop not vectorized
	; YAML-NEXT: ...			; YAML-NEXT: ...
	; YAML-NEXT: --- !Analysis			; YAML-NEXT: --- !Analysis
	; YAML-NEXT: Pass: loop-vectorize			; YAML-NEXT: Pass: loop-vectorize
	; YAML-NEXT: Name: AllDisabled			; YAML-NEXT: Name: AllDisabled
	; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 13, Column: 5 }			; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 12, Column: 8 }
	; YAML-NEXT: Function: _Z13test_disabledPii			; YAML-NEXT: Function: _Z13test_disabledPii
	; YAML-NEXT: Args:			; YAML-NEXT: Args:
	; YAML-NEXT: - String: 'loop not vectorized: vectorization and interleaving are explicitly disabled, or the loop has already been vectorized			; YAML-NEXT: - String: 'loop not vectorized: vectorization and interleaving are explicitly disabled, or the loop has already been vectorized
	; YAML-NEXT: ...			; YAML-NEXT: ...
	; YAML-NEXT: --- !Analysis			; YAML-NEXT: --- !Analysis
	; YAML-NEXT: Pass: ''			; YAML-NEXT: Pass: ''
	; YAML-NEXT: Name: CantIdentifyArrayBounds			; YAML-NEXT: Name: CantIdentifyArrayBounds
	; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }			; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
	Show All 234 Lines

llvm/trunk/test/Transforms/LoopVectorize/X86/vectorization-remarks-profitable.ll

	; RUN: opt < %s -loop-vectorize -pass-remarks-missed='loop-vectorize' -mtriple=x86_64-unknown-linux -S 2>&1 | FileCheck %s			; RUN: opt < %s -loop-vectorize -pass-remarks-missed='loop-vectorize' -mtriple=x86_64-unknown-linux -S 2>&1 | FileCheck %s

	; Verify analysis remarks are generated when interleaving is not beneficial.			; Verify analysis remarks are generated when interleaving is not beneficial.
	; CHECK: remark: vectorization-remarks-profitable.c:5:17: the cost-model indicates that vectorization is not beneficial			; CHECK: remark: vectorization-remarks-profitable.c:5:17: the cost-model indicates that vectorization is not beneficial
	; CHECK: remark: vectorization-remarks-profitable.c:5:17: the cost-model indicates that interleaving is not beneficial and is explicitly disabled or interleave count is set to 1			; CHECK: remark: vectorization-remarks-profitable.c:5:17: the cost-model indicates that interleaving is not beneficial and is explicitly disabled or interleave count is set to 1
	; CHECK: remark: vectorization-remarks-profitable.c:12:17: the cost-model indicates that vectorization is not beneficial			; CHECK: remark: vectorization-remarks-profitable.c:11:3: the cost-model indicates that vectorization is not beneficial
	; CHECK: remark: vectorization-remarks-profitable.c:12:17: the cost-model indicates that interleaving is not beneficial			; CHECK: remark: vectorization-remarks-profitable.c:11:3: the cost-model indicates that interleaving is not beneficial

	; First loop.			; First loop.
	; #pragma clang loop interleave(disable) unroll(disable)			; #pragma clang loop interleave(disable) unroll(disable)
	; for(int i = 0; i < n; i++) {			; for(int i = 0; i < n; i++) {
	; out[i] = *in[i];			; out[i] = *in[i];
	; }			; }

	; Second loop.			; Second loop.
	Show All 97 Lines

llvm/trunk/test/Transforms/LoopVectorize/debugloc.ll

	; RUN: opt -S < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 | FileCheck %s			; RUN: opt -S < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 | FileCheck %s

	target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"			target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"

	; Make sure we are preserving debug info in the vectorized code.			; Make sure we are preserving debug info in the vectorized code.

	; CHECK: for.body.lr.ph			; CHECK: for.body.lr.ph
	; CHECK: min.iters.check = icmp ult i64 {{.*}}, 2, !dbg !{{[0-9]+}}			; CHECK: min.iters.check = icmp ult i64 {{.*}}, 2, !dbg !{{[0-9]+}}
	; CHECK: vector.body			; CHECK: vector.body
	; CHECK: index {{.*}}, !dbg ![[LOC:[0-9]+]]			; CHECK: index {{.*}}, !dbg ![[LOC:[0-9]+]]
	; CHECK: getelementptr inbounds i32, i32* %a, {{.*}}, !dbg ![[LOC]]			; CHECK: getelementptr inbounds i32, i32* %a, {{.*}}, !dbg ![[LOC]]
	; CHECK: load <2 x i32>, <2 x i32>* {{.*}}, !dbg ![[LOC]]			; CHECK: load <2 x i32>, <2 x i32>* {{.*}}, !dbg ![[LOC]]
	; CHECK: add <2 x i32> {{.*}}, !dbg ![[LOC]]			; CHECK: add <2 x i32> {{.*}}, !dbg ![[LOC]]
	; CHECK: add i64 %index, 2, !dbg ![[LOC]]			; CHECK: add i64 %index, 2, !dbg ![[LOC]]
	; CHECK: icmp eq i64 %index.next, %n.vec, !dbg ![[LOC]]			; CHECK: icmp eq i64 %index.next, %n.vec, !dbg ![[LOC]]
	; CHECK: middle.block			; CHECK: middle.block
	; CHECK: add <2 x i32> %{{.*}}, %rdx.shuf, !dbg ![[LOC]]			; CHECK: add <2 x i32> %{{.*}}, %rdx.shuf, !dbg ![[BR_LOC:[0-9]+]]
	; CHECK: extractelement <2 x i32> %bin.rdx, i32 0, !dbg ![[LOC]]			; CHECK: extractelement <2 x i32> %bin.rdx, i32 0, !dbg ![[BR_LOC]]
				; CHECK: for.body
				; CHECK: br i1{{.*}}, label %for.body,{{.*}}, !dbg ![[BR_LOC]],
				; CHECK: ![[BR_LOC]] = !DILocation(line: 5,

	define i32 @f(i32* nocapture %a, i32 %size) #0 !dbg !4 {			define i32 @f(i32* nocapture %a, i32 %size) #0 !dbg !4 {
	entry:			entry:
	call void @llvm.dbg.value(metadata i32* %a, metadata !13, metadata !DIExpression()), !dbg !19			call void @llvm.dbg.value(metadata i32* %a, metadata !13, metadata !DIExpression()), !dbg !19
	call void @llvm.dbg.value(metadata i32 %size, metadata !14, metadata !DIExpression()), !dbg !19			call void @llvm.dbg.value(metadata i32 %size, metadata !14, metadata !DIExpression()), !dbg !19
	call void @llvm.dbg.value(metadata i32 0, metadata !15, metadata !DIExpression()), !dbg !20			call void @llvm.dbg.value(metadata i32 0, metadata !15, metadata !DIExpression()), !dbg !20
	call void @llvm.dbg.value(metadata i32 0, metadata !16, metadata !DIExpression()), !dbg !21			call void @llvm.dbg.value(metadata i32 0, metadata !16, metadata !DIExpression()), !dbg !21
	%cmp4 = icmp eq i32 %size, 0, !dbg !21			%cmp4 = icmp eq i32 %size, 0, !dbg !21
	br i1 %cmp4, label %for.end, label %for.body.lr.ph, !dbg !21			br i1 %cmp4, label %for.end, label %for.body.lr.ph, !dbg !21

	for.body.lr.ph: ; preds = %entry			for.body.lr.ph: ; preds = %entry
	br label %for.body, !dbg !21			br label %for.body, !dbg !21

	for.body: ; preds = %for.body.lr.ph, %for.body			for.body: ; preds = %for.body.lr.ph, %for.body
	%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]			%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
	%sum.05 = phi i32 [ 0, %for.body.lr.ph ], [ %add, %for.body ]			%sum.05 = phi i32 [ 0, %for.body.lr.ph ], [ %add, %for.body ]
	%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv, !dbg !22			%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv, !dbg !22
	%0 = load i32, i32* %arrayidx, align 4, !dbg !22			%0 = load i32, i32* %arrayidx, align 4, !dbg !22
	%add = add i32 %0, %sum.05, !dbg !22			%add = add i32 %0, %sum.05, !dbg !22
	%indvars.iv.next = add i64 %indvars.iv, 1, !dbg !22			%indvars.iv.next = add i64 %indvars.iv, 1, !dbg !22
	call void @llvm.dbg.value(metadata !{null}, metadata !16, metadata !DIExpression()), !dbg !22			call void @llvm.dbg.value(metadata !{null}, metadata !16, metadata !DIExpression()), !dbg !22
	%lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !22			%lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !22
	%exitcond = icmp ne i32 %lftr.wideiv, %size, !dbg !22			%exitcond = icmp ne i32 %lftr.wideiv, %size, !dbg !21
	br i1 %exitcond, label %for.body, label %for.cond.for.end_crit_edge, !dbg !21			br i1 %exitcond, label %for.body, label %for.cond.for.end_crit_edge, !dbg !21

	for.cond.for.end_crit_edge: ; preds = %for.body			for.cond.for.end_crit_edge: ; preds = %for.body
	%add.lcssa = phi i32 [ %add, %for.body ]			%add.lcssa = phi i32 [ %add, %for.body ]
	call void @llvm.dbg.value(metadata i32 %add.lcssa, metadata !15, metadata !DIExpression()), !dbg !22			call void @llvm.dbg.value(metadata i32 %add.lcssa, metadata !15, metadata !DIExpression()), !dbg !22
	br label %for.end, !dbg !21			br label %for.end, !dbg !21

	for.end: ; preds = %entry, %for.cond.for.end_crit_edge			for.end: ; preds = %entry, %for.cond.for.end_crit_edge

llvm/trunk/test/Transforms/LoopVectorize/fix-reduction-dbg.ll

Property	Old Value	New Value
svn:executable	null	* \ No newline at end of property

				; Confirm that the line numbers for the middle.block operations are all the
				; same as the start of the loop.

				; RUN: opt -S -loop-vectorize -force-vector-width=4 -force-vector-interleave=4 <%s | FileCheck %s
				;
				; CHECK: middle.block:
				; CHECK-NEXT: %{{.*}}= add <4 x i32>{{.*}}, !dbg ![[DL:[0-9]+]]
				; CHECK-NEXT: %{{.*}}= add <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= add <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= shufflevector <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= add <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= shufflevector <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= add <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= extractelement <4 x i32>{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: %{{.*}}= icmp eq i64{{.*}}, !dbg ![[DL]]
				; CHECK-NEXT: br i1 %{{.*}}, !dbg ![[DL]]
				; CHECK: ![[DL]] = !DILocation(line: 5,

				; This IR can be generated by running:
				; clang -gmlt -S src.cpp -emit-llvm -mllvm -opt-bisect-limit=56 -O2 -o -
				;
				; Where src.cpp contains:
				; int foo(int count, int *bar)
				; {
				; int ret = count;
				; int tmp;
				; for (int j = 0; j < count; j++) {
				; tmp = bar[j];
				; ret += tmp;
				; }
				;
				; return ret;
				; }

				define dso_local i32 @"foo"(i32 %count, i32* nocapture readonly %bar) local_unnamed_addr !dbg !8 {
				entry:
				%cmp8 = icmp sgt i32 %count, 0, !dbg !10
				br i1 %cmp8, label %for.body.preheader, label %for.cond.cleanup, !dbg !10

				for.body.preheader: ; preds = %entry
				%wide.trip.count = zext i32 %count to i64
				br label %for.body, !dbg !11

				for.cond.cleanup.loopexit: ; preds = %for.body
				%add.lcssa = phi i32 [ %add, %for.body ], !dbg !12
				br label %for.cond.cleanup, !dbg !13

				for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
				%ret.0.lcssa = phi i32 [ %count, %entry ], [ %add.lcssa, %for.cond.cleanup.loopexit ], !dbg !14
				ret i32 %ret.0.lcssa, !dbg !13

				for.body: ; preds = %for.body, %for.body.preheader
				%indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
				%ret.09 = phi i32 [ %count, %for.body.preheader ], [ %add, %for.body ]
				%arrayidx = getelementptr inbounds i32, i32* %bar, i64 %indvars.iv, !dbg !11
				%0 = load i32, i32* %arrayidx, align 4, !dbg !11, !tbaa !15
				%add = add nsw i32 %0, %ret.09, !dbg !12
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !10
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count, !dbg !10
				br i1 %exitcond, label %for.cond.cleanup.loopexit, label %for.body, !dbg !10, !llvm.loop !19
				}

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!3, !4, !5, !6}
				!llvm.ident = !{!7}

				!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !2, nameTableKind: None)
				!1 = !DIFile(filename: "src.cpp", directory: "")
				!2 = !{}
				!3 = !{i32 2, !"CodeView", i32 1}
				!4 = !{i32 2, !"Debug Info Version", i32 3}
				!5 = !{i32 1, !"wchar_size", i32 2}
				!6 = !{i32 7, !"PIC Level", i32 2}
				!7 = !{!""}
				!8 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !9, scopeLine: 2, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2)
				!9 = !DISubroutineType(types: !2)
				!10 = !DILocation(line: 5, scope: !8)
				!11 = !DILocation(line: 6, scope: !8)
				!12 = !DILocation(line: 7, scope: !8)
				!13 = !DILocation(line: 10, scope: !8)
				!14 = !DILocation(line: 0, scope: !8)
				!15 = !{!16, !16, i64 0}
				!16 = !{!"int", !17, i64 0}
				!17 = !{!"omnipotent char", !18, i64 0}
				!18 = !{!"Simple C++ TBAA"}
				!19 = distinct !{!19, !10, !20}
				!20 = !DILocation(line: 8, scope: !8)

llvm/trunk/test/Transforms/LoopVectorize/unsafe-dep-remark.ll

	; RUN: opt -loop-vectorize -force-vector-width=2 -pass-remarks-analysis=loop-vectorize < %s 2>&1 | FileCheck %s			; RUN: opt -loop-vectorize -force-vector-width=2 -pass-remarks-analysis=loop-vectorize < %s 2>&1 | FileCheck %s

	; ModuleID = '/tmp/kk.c'			; ModuleID = '/tmp/kk.c'
	source_filename = "/tmp/kk.c"			source_filename = "/tmp/kk.c"
	target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"

	; 1 void success (char *A, char *B, char *C, char *D, char *E, int N) {			; 1 void success (char *A, char *B, char *C, char *D, char *E, int N) {
	; 2 for(int i = 0; i < N; i++) {			; 2 for(int i = 0; i < N; i++) {
	; 3 A[i + 1] = A[i] + B[i];			; 3 A[i + 1] = A[i] + B[i];
	; 4 C[i] = D[i] * E[i];			; 4 C[i] = D[i] * E[i];
	; 5 }			; 5 }
	; 6 }			; 6 }

	; CHECK: remark: /tmp/kk.c:3:16: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop			; CHECK: remark: /tmp/kk.c:2:3: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop

	define void @success(i8* nocapture %A, i8* nocapture readonly %B, i8* nocapture %C, i8* nocapture readonly %D, i8* nocapture readonly %E, i32 %N) !dbg !6 {			define void @success(i8* nocapture %A, i8* nocapture readonly %B, i8* nocapture %C, i8* nocapture readonly %D, i8* nocapture readonly %E, i32 %N) !dbg !6 {
	entry:			entry:
	%cmp28 = icmp sgt i32 %N, 0, !dbg !8			%cmp28 = icmp sgt i32 %N, 0, !dbg !8
	br i1 %cmp28, label %for.body, label %for.cond.cleanup, !dbg !9			br i1 %cmp28, label %for.body, label %for.cond.cleanup, !dbg !9

	for.body: ; preds = %entry, %for.body			for.body: ; preds = %entry, %for.body
	%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]			%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]