This is an archive of the discontinued LLVM Phabricator instance.


Differential D77523

Add CanonicalizeFreezeInLoops pass
ClosedPublic

Authored by aqjune on Apr 5 2020, 11:18 PM.


Details

Reviewers

spatel
efriedma
lebedev.ri
fhahn
jdoerfert

Commits

rGd9a4a244138c: Add CanonicalizeFreezeInLoops pass

Summary

If an induction variable is frozen and then used, SCEV yields an imprecise result
because it does not model frozen values.

For this reason, a performance regression appeared after
https://reviews.llvm.org/D76483 was merged: SCEV
yielded an imprecise result, which prevented LSR from optimizing a loop.

The suggested solution is to add a pass that canonicalizes frozen variables
inside a loop. Specifically, it pushes freezes out of the loop by freezing
the initial value and the step value instead, and by dropping nsw/nuw flags from the instructions used by the freeze.
This solution was also mentioned at https://reviews.llvm.org/D70623 .
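
To make the rewrite concrete, here is a minimal C++ sketch of a toy model. It is not LLVM code and not taken from the patch. It assumes that poison is modeled as an empty `std::optional`, that `freeze` of poison always picks 0 (the real `freeze` may pick any value), and that only the nuw flag is involved. The names `Val`, `freezeVal`, `addNUW`, `loopBefore`, and `loopAfter` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model (an assumption for illustration): a value is a concrete
// 32-bit integer, or poison when the optional is empty.
using Val = std::optional<uint32_t>;

// freeze: poison becomes an arbitrary fixed value (0 is chosen here);
// any non-poison value passes through unchanged.
uint32_t freezeVal(Val V) { return V.value_or(0u); }

// add nuw: the result is poison if an operand is poison or if the
// unsigned addition wraps.
Val addNUW(Val A, Val B) {
  if (!A || !B)
    return std::nullopt;
  uint32_t Sum = *A + *B;
  if (Sum < *A) // unsigned wrap-around
    return std::nullopt;
  return Sum;
}

// Before: the freeze sits inside the loop and is applied to the
// induction variable on every iteration.
std::vector<uint32_t> loopBefore(Val Init, Val Step, unsigned Trips) {
  std::vector<uint32_t> Seen;
  Val IV = Init;
  for (unsigned I = 0; I < Trips; ++I) {
    Seen.push_back(freezeVal(IV)); // %iv.fr = freeze %iv
    IV = addNUW(IV, Step);         // %iv.next = add nuw %iv, %step
  }
  return Seen;
}

// After: the initial and step values are frozen once, outside the loop,
// and the nuw flag is dropped so the add can no longer produce poison.
std::vector<uint32_t> loopAfter(Val Init, Val Step, unsigned Trips) {
  std::vector<uint32_t> Seen;
  uint32_t IV = freezeVal(Init);           // placed in the preheader
  const uint32_t StepFr = freezeVal(Step); // placed in the preheader
  for (unsigned I = 0; I < Trips; ++I) {
    Seen.push_back(IV); // no freeze needed: IV is never poison here
    IV = IV + StepFr;   // plain add, wraps instead of yielding poison
  }
  return Seen;
}
```

When nothing is poison and nothing overflows, both functions observe the same sequence. When the add would overflow, the two toy sequences can differ, which is acceptable in this model because a frozen poison value is allowed to be any value.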

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aqjune created this revision.Apr 5 2020, 11:18 PM

Herald added a project: Restricted Project.Apr 5 2020, 11:18 PM

Herald added subscribers: llvm-commits, javed.absar, hiraditya, mgorny.

aqjune added a child revision: D77524: [TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR.Apr 5 2020, 11:34 PM

clang-format

Harbormaster failed remote builds in B51882: Diff 255226!Apr 5 2020, 11:57 PM

Harbormaster failed remote builds in B51885: Diff 255229!Apr 6 2020, 12:29 AM

Minor updates (capitalize pass name)

nikic added a subscriber: nikic.Apr 6 2020, 1:18 AM

Harbormaster failed remote builds in B51892: Diff 255245!Apr 6 2020, 1:36 AM

efriedma added inline comments.Apr 6 2020, 11:41 AM

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
117	What guarantees that SteppingInst is an add or sub?
125	If StepBB is inside the loop, it can't be the preheader.

Remove L->getLoopPreheader() != StepBB check, add a test for this (with minor updates too)

aqjune marked an inline comment as done.Apr 6 2020, 12:13 PM

aqjune added inline comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
117	It is `L->isAuxiliaryInductionVariable`.

Harbormaster failed remote builds in B52003: Diff 255434!Apr 6 2020, 1:04 PM

efriedma added inline comments.Apr 6 2020, 2:07 PM

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
100	OpIdx is unused?
126	It looks like isAuxiliaryInductionVariable checks that the step is loop-invariant?

I might have missed this but what was the reason we don't do this as part of LICM?

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
72	Is there a good reason not to make this a struct with named members so the values and use sites have meaning?
75	`auto *BB` if these are pointers.
100	You shadow PN even though you capture everything. Choose a different name, return the PHI, or use the captured version of PN?
115	Please initialize PN with null for the future, similar SteppingInst.
117	I have a bad feeling about this but if people think it's ok I'm fine with an assert.
130	Shouldn't we teach SCEV about freeze instead?
143	Style: Single DEBUG? If you really want multiple dbgs() make it `LLVM_DEBUG({ ... });`
172	Do we handle the case where both are `GuaranteedNotToBeUndefOrPoison` somewhere else? In that case we don't need to drop the poison generating flags, right?

Address comments

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
72	Tuple was used because its types seemed to represent what they stand for.
126	A semantically loop invariant instruction may reside in a loop. For example, if it is division of two loop-invariant variables, it is also loop invariant but cannot be hoisted.
130	I think we have two kinds of analyses conceptually: must-be analysis (any program execution should satisfy the analysis result) and may-be analysis (there exists an execution that satisfies the result). SCEV analysis seems to satisfy both. In case of freeze, if SCEV result of freeze(val) is defined as SCEV of val, must-be analysis becomes problematic; freeze(val) can be any value if val was poison. So, teaching SCEV about freeze may not fully recover precision in general.
172	It can be poison if the summation overflowed. But it seems SteppingInst can be non-poison in certain cases (e.g. when it is used by a division, where a poison operand is UB), so I added a GuaranteedNotToBeUndefOrPoison check on SteppingInst.

In D77523#1965758, @jdoerfert wrote:

I might have missed this but what was the reason we don't do this as part of LICM?

The motivation was that freezes inserted by DivRemPairs should be moved out of the loop so that LoopStrengthReduce (in TargetPassConfig::addIRPasses) can optimize successfully.
Currently there is no LICM run between the two. There is SimplifyCFGPass, but it does not depend on loop analyses and it fires regardless of whether LTO is enabled (whereas the regression happened only when LTO was enabled), so it did not seem to be a good candidate for hosting this.
In terms of reusability, I believe this pass can also be inserted after passes such as LoopUnswitch that insert freezes.

jdoerfert added inline comments.Apr 6 2020, 10:42 PM

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
72	Sure, a freeze inst, a phi node, a binary operator and a value. You have to see `Value *StepValue = std::get<3>(Candidate);` to guess what the value is. If you want to access the phi (as an Instruction or auto) you have to go back to the vector definition to determine the index. These are (IMHO) good reason against a tuple. Are there any benefits?
172	I see.
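
The readability argument against the tuple can be seen in a small standalone sketch. The placeholder types and the member names below (`FI`, `PHI`, `SteppingInst`, `StepValue`) are assumptions for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Placeholder types standing in for LLVM's FreezeInst, PHINode,
// BinaryOperator, and Value (assumptions for this sketch only).
struct FreezeInst { std::string Name; };
struct PHINode { std::string Name; };
struct BinaryOperator { std::string Name; };
struct Value { std::string Name; };

// Tuple form: the reader must remember which index means what.
using CandidateTuple =
    std::tuple<FreezeInst *, PHINode *, BinaryOperator *, Value *>;

// Struct form: every use site names the member it reads.
struct Candidate {
  FreezeInst *FI;
  PHINode *PHI;
  BinaryOperator *SteppingInst;
  Value *StepValue;
};

std::string stepNameFromTuple(const CandidateTuple &C) {
  return std::get<3>(C)->Name; // index 3 happens to be the step value
}

std::string stepNameFromStruct(const Candidate &C) {
  return C.StepValue->Name; // the member name documents itself
}
```

Both accessors return the same data; the difference is only in how much the reader has to look up at the use site.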

Harbormaster failed remote builds in B52110: Diff 255587!Apr 6 2020, 10:53 PM

Move to a named struct from tuple

Minor update to a comment

aqjune marked 2 inline comments as done.Apr 6 2020, 11:10 PM

aqjune added inline comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
72	Had no special preference for it, so moved to a struct with names.

Harbormaster failed remote builds in B52116: Diff 255597!Apr 6 2020, 11:57 PM

Harbormaster failed remote builds in B52114: Diff 255595!

fhahn added inline comments.Apr 7 2020, 5:11 AM

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
11	'use induction variables' is a bit imprecise IMO; there are no induction variables in IR. You could say 'freeze instructions that use either an induction PHI or the corresponding 'step' instruction (= the incoming value from the loop latch of the induction PHI)'
63	nit: A struct.
86	An alternative approach would be to look at the phis of the loop header and use InductionDescriptor::isInductionPHI to identify induction PHIs and their connected 'step' instruction. Then look at their users to see if they contain a freeze instruction. Candidates would be {PHI, IVDescriptor, [list of freeze users to eliminate]}. I think this potentially could simplify the code a bit. The alternative has a slightly different compile-time cost: with the alternative approach, you always have to pay the cost to find the induction PHIs (shouldn't be too costly, as the information is available in SE already), with the current approach you always have to iterate over all instructions in the loop. But I think that's unlikely to really matter in the grand scheme of things. The current approach potentially leads to duplicated induction checks. The alternative approach would also handle the case where we have multiple freeze instructions using the same induction PHIs/steps without having multiple candidates, meaning each induction PHI/step would be frozen exactly once, not multiple times as with the current approach.
93	There's a test case missing for that scenario. Also, I think removing FI here will invalidate your iterator over the BB, probably causing a crash. You can work around that by using make_early_inc_range(*BB) and iterating over that. But I am not sure we actually need to do this here, as freeze without users should be cleaned up by the existing DCE, right? Not sure if we need to handle it here.
151	I think it would be good to be a bit more descriptive in the debug message, e.g. it could say something like 'Pushing freeze ' << %FI << ' through ' << SteppingInst << ' and ' << PN << ' outside of loop'.
155	Is this overly conservative? The transform ensures that both operands won't be poison, right? Would it be enough to check if the instruction may result in poison with 2 non-poison operands?
164	I think currently cases where we have multiple candidates for the same PHI are not handled correctly at the moment here, as after handling the first candidate the step operand will not match the original StepValue. Multiple candidates would be added for a loop like the one below I think.
loop:
  %i = phi i32 [%init, %entry], [%i.next, %loop]
  %i.fr = freeze i32 %i
  call void @call(i32 %i.fr)
  %i.next = add nsw nuw i32 %i, %step
  %i.next.fr = freeze i32 %i.next
  %cond = icmp eq i32 %i.next, %n
  call void @call(i32 %i.next.fr)
  br i1 %cond, label %loop, label %exit

Don't wait for me on this, ping me if needed.

Oh, I was busy today and didn't have enough time to update this patch. Maybe I can update this tomorrow.

I found that dealing with multiple phis had problems, so I had to change the
algorithm.
Rather than having a candidate for each (auxiliary) induction PHI, it now
just maintains three sets:

InstsToDropFlags: instructions whose flags (such as nsw/nuw) should be dropped
FIsToRemove: freeze instructions to remove
UsesToInsertFr: uses (Value, User) which should be frozen, so that each becomes (freeze(Value), User). The frozen values are placed at the loop preheader.

What CanonicalizeFreezeInLoopsImpl::run does is:

(1) Gets the instructions reachable from auxiliary induction PHIs (up to MaxDepth steps).
(2) For each freeze among the reachable instructions, checks whether it can be
pushed out of the loop: for the instructions on the paths from the PHI to the freeze,
their flags must be droppable and the relevant users must be freezable. If not, the
freeze cannot be pushed out of the loop.
(3) For all 'pushable' freezes, pushes them out of the loop.
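
Step (1) and the freeze lookup at the start of step (2) can be sketched as a bounded breadth-first walk over users. This is a toy model under stated assumptions, not the code in the patch: instructions are indices into a vector, and `Kind`, `Inst`, and `freezesReachableFrom` are hypothetical names. The pushability check of step (2) is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Toy instruction kinds (an assumption; real LLVM IR has many more).
enum class Kind { Phi, Add, Freeze, Other };

struct Inst {
  Kind K;
  std::vector<std::size_t> Users; // indices of instructions using this one
};

// Walk the users of an induction PHI breadth-first, at most MaxDepth
// edges away from it, and return the freeze instructions that were found.
std::vector<std::size_t> freezesReachableFrom(const std::vector<Inst> &Insts,
                                              std::size_t Phi,
                                              unsigned MaxDepth) {
  assert(Phi < Insts.size() && "PHI index out of range");
  std::vector<bool> Visited(Insts.size(), false);
  std::vector<std::size_t> Freezes;
  std::queue<std::pair<std::size_t, unsigned>> Work; // (instruction, depth)
  Work.emplace(Phi, 0u);
  Visited[Phi] = true;
  while (!Work.empty()) {
    std::size_t Cur = Work.front().first;
    unsigned Depth = Work.front().second;
    Work.pop();
    if (Insts[Cur].K == Kind::Freeze)
      Freezes.push_back(Cur);
    if (Depth == MaxDepth)
      continue; // do not look past the depth limit
    for (std::size_t U : Insts[Cur].Users) {
      if (!Visited[U]) {
        Visited[U] = true;
        Work.emplace(U, Depth + 1);
      }
    }
  }
  return Freezes;
}
```

A small depth limit keeps the walk cheap, at the cost of missing freezes that sit further away from the PHI.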

aqjune marked 3 inline comments as done and 4 inline comments as done.Apr 15 2020, 3:06 AM

aqjune added inline comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
164	Now this case will be correctly handled. Have a test add_multiuses2.

Harbormaster failed remote builds in B53321: Diff 257644!Apr 15 2020, 3:47 AM

Marked reflected/old comments as done

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
11	replaced it with 'induction PHI' in all places.
86	Now the code iterates over phis first, as you suggested
117	The assertion has been removed

aqjune edited the summary of this revision.Apr 15 2020, 7:22 PM

aqjune edited the summary of this revision.

I can report that in our testing on SPEC 2017, this pass fixes the regression to mcf introduced with D76483.

This patch gives an uplift of 1.5% on mcf_r; other benchmarks in SPEC 2017 intrate are between +0.4% and -0.2%.

We also looked into what seemed to be a regression of at most 1.5% on xalancbmk on some different hardware but this seems to be within normal fluctuations of this particular benchmark, with no changes apparent in the relevant (performance critical) areas of the code.

From a performance point of view this is good to go, but I'd like to leave an LGTM to others.

Rebase

Ping

Harbormaster failed remote builds in B54678: Diff 260093!Apr 25 2020, 7:56 AM

I'm asking myself if it wouldn't be better to start with the freeze instructions and work our way up through the operands. At the end of the day we do a lot of work just to figure out there is no freeze or the instruction chains we build do not end in one. Am I missing something?

Also, could this drop flags of instructions not on the path between a phi and a freeze? So do we keep the nsw on the sub and mul in the following example:

%iv = phi [0, ...] [%step, ...]
%step = add nsw %iv, 1
%f = freeze %step
%not_step1 = sub nsw %iv, 1
%not_step2 = mul nsw %step, 2

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
78	Nit: I would call this `canHandleInst` or something similar. We can deal with various instructions I guess, some require us to drop flags but others not, e.g., freeze or bitcast.
91	What happens if the step value looks like this:
%iv = phi [0, ...] [%step, ...]
%step = add %iv, %iv
Just checking that we don't have a problem in this case.
123	Style: `for (const auto &U : I->users())` ? (w/ or w/o const)
138	Is the find stuff necessary or could you just check the return value of the `insert` call?
141	I think we often avoid such recursion in favor of worklists but if this doesn't show on the profile it's OK I guess.
151	`true` is an unsigned for the callee, maybe `/* CurDepth */ 1` instead?
166	Do we really need a queue or would LIFO (=smallvector) also work?
282	Nit: I guess we could make all these class members, right? Unclear if that is better though. Nit: DenseMap instead of std::map? ^ Both should be considered but can be kept this way too.
303	Nit: StepI is checked twice for null. Style: I'd use the braces around the inner conditional instead
313	`const auto &Pair` I think `const` and `*` or `&` should be combined with `auto` whenever possible.

Hi,

In D77523#2012741, @jdoerfert wrote:

I'm asking myself if it wouldn't be better to start with the freeze instructions and work our way up through the operands. At the end of the day we do a lot of work just to figure out there is no freeze or the instruction chains we build do not end in one. Am I missing something?

It's true, but we should iterate over instructions inside the loop to find freeze instructions.
Indexing freezes that are uses of induction variables will be helpful for exiting early; right now, DivRemPair is the place where it introduces freezes, but it does not do relevant conversion in x86-64, so the code will have almost zero freeze instructions.
Would it make sense if IVUsers maintains the info (existence of freeze)? I'm not familiar with the class, but may have a look.

Also, could this drop flags of instructions not on the path between a phi and a freeze? So do we keep the nsw on the sub and mul in the following example:
%iv = phi [0, ...] [%step, ...]
%step = add nsw %iv, 1
%f = freeze %step
%not_step1 = sub nsw %iv, 1
%not_step2 = mul nsw %step, 2

The nsw flags of sub and mul will be preserved. I'll add this as a test.

In D77523#2014624, @aqjune wrote:

Hi,

In D77523#2012741, @jdoerfert wrote:

I'm asking myself if it wouldn't be better to start with the freeze instructions and work our way up through the operands. At the end of the day we do a lot of work just to figure out there is no freeze or the instruction chains we build do not end in one. Am I missing something?

It's true, but we should iterate over instructions inside the loop to find freeze instructions.
Indexing freezes that are uses of induction variables will be helpful for exiting early; right now, DivRemPair is the place where it introduces freezes, but it does not do relevant conversion in x86-64, so the code will have almost zero freeze instructions.
Would it make sense if IVUsers maintains the info (existence of freeze)? I'm not familiar with the class, but may have a look.

It might but that means maintaining state, which is hard and error prone.

I was thinking more along the lines of this:

for (BasicBlock *BB : L->blocks())
  for (Instruction &I : *BB)
    if (auto *Fr = dyn_cast<FreezeInst>(&I))
      handleFreeze(L, Fr);

void handleFreeze(Loop *L, FreezeInst *Fr) {
  SmallPtrSet<Instruction *, 8> Visited;
  SmallVector<Value *, 8> Worklist = {Fr};
  while (!Worklist.empty()) {
    auto *Inst = dyn_cast<Instruction>(Worklist.pop_back_val());
    if (!Inst || !L->contains(Inst) || !Visited.insert(Inst).second)
      continue;
    if (isSpecialPHI(Inst)) {
      // found special PHI that ends in a freeze, do record it
    }
    for (Value *Op : Inst->operands())
      Worklist.push_back(Op);
  }
}

The main difference is that we are performing a single traversal of the loop + work per freeze instruction that is bounded by the def-use chain ending in the freeze and contained in the loop.
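
A self-contained toy version of this freeze-first traversal might look like the following. It is a sketch, not LLVM code: instructions are indices into a vector, the names `Node`, `InLoop`, `IsInductionPhi`, and `phisFeedingFreeze` are hypothetical, and the `IsInductionPhi` flag stands in for the `isSpecialPHI` check.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct Node {
  bool InLoop;         // whether the instruction lives inside the loop
  bool IsInductionPhi; // stands in for the isSpecialPHI check
  std::vector<std::size_t> Operands; // indices of the values it uses
};

// Starting from one freeze, walk operands backwards and collect the
// induction PHIs whose def-use chain ends in that freeze.
std::set<std::size_t> phisFeedingFreeze(const std::vector<Node> &Nodes,
                                        std::size_t Freeze) {
  assert(Freeze < Nodes.size() && "freeze index out of range");
  std::set<std::size_t> Visited, Phis;
  std::vector<std::size_t> Worklist{Freeze}; // used as a LIFO stack
  while (!Worklist.empty()) {
    std::size_t Cur = Worklist.back();
    Worklist.pop_back();
    // Skip values outside the loop and anything already seen.
    if (!Nodes[Cur].InLoop || !Visited.insert(Cur).second)
      continue;
    if (Nodes[Cur].IsInductionPhi)
      Phis.insert(Cur);
    for (std::size_t Op : Nodes[Cur].Operands)
      Worklist.push_back(Op);
  }
  return Phis;
}
```

The visited set bounds the work per freeze by the size of the in-loop def-use chain that feeds it, which is the property the comment above relies on.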

Also, could this drop flags of instructions not on the path between a phi and a freeze? So do we keep the nsw on the sub and mul in the following example:
%iv = phi [0, ...] [%step, ...]
%step = add nsw %iv, 1
%f = freeze %step
%not_step1 = sub nsw %iv, 1
%not_step2 = mul nsw %step, 2
The nsw flags of sub and mul will be preserved. I'll add this as a test.

Thx!

In D77523#2014839, @jdoerfert wrote:
In D77523#2014624, @aqjune wrote:

Hi,

In D77523#2012741, @jdoerfert wrote:

I'm asking myself if it wouldn't be better to start with the freeze instructions and work our way up through the operands. At the end of the day we do a lot of work just to figure out there is no freeze or the instruction chains we build do not end in one. Am I missing something?

It's true, but we should iterate over instructions inside the loop to find freeze instructions.
Indexing freezes that are uses of induction variables will be helpful for exiting early; right now, DivRemPair is the place where it introduces freezes, but it does not do relevant conversion in x86-64, so the code will have almost zero freeze instructions.
Would it make sense if IVUsers maintains the info (existence of freeze)? I'm not familiar with the class, but may have a look.

It might but that means maintaining state, which is hard and error prone.

I was thinking more along the lines of this:
for (BasicBlock *BB : L->blocks())
  for (Instruction &I : *BB)
    if (auto *Fr = dyn_cast<FreezeInst>(&I))
      handleFreeze(L, Fr);

void handleFreeze(Loop *L, FreezeInst *Fr) {
  SmallPtrSet<Instruction *, 8> Visited;
  SmallVector<Value *, 8> Worklist = {Fr};
  while (!Worklist.empty()) {
    auto *Inst = dyn_cast<Instruction>(Worklist.pop_back_val());
    if (!Inst || !L->contains(Inst) || !Visited.insert(Inst).second)
      continue;
    if (isSpecialPHI(Inst)) {
      // found special PHI that ends in a freeze, do record it
    }
    for (Value *Op : Inst->operands())
      Worklist.push_back(Op);
  }
}
The main difference is that we are performing a single traversal of the loop + work per freeze instruction that is bounded by the def-use chain ending in the freeze and contained in the loop.

I guess it depends on what kinds of freeze instructions we are looking for exactly.

IIRC the earlier version of the patch only considered freeze instructions where the operand is either an induction PHI or the step instruction. If we only look for those cases, checking only the users of the PHI and the step instruction potentially avoids having to iterate over large loop bodies (also, parent loops contain all blocks of their child loops, so I think iterating over all instructions in a loop would mean revisiting all instructions in child loops).

If we are looking for any freeze that is reachable from an induction PHI, then the benefit of starting at a PHI will probably be much less.

But I thought the problem that this pass is supposed to solve is quite narrow: if there is a freeze in a cycle involving an induction PHI, push the freeze out of the loop. To detect those cases, it should be enough to (recursively) check the operands of the step instruction. That should only require a few steps at most, as the instructions that can be part of such a cycle are quite limited.

FWIW I think ideally the patch should be kept as narrow as possible to handle the motivating case initially and additional cases as follow-ups as required. That makes the reviews easier, reduces risk, lowers the threshold for enabling it and allows to better analyse the impact of extensions on its own. For example, there might not be much (any?) benefit of pushing certain freezes out of a loop.

For easier review, I agree with @fhahn .
I'll split this patch so first it only deals with a simple case (consider the case when freeze uses either phi or step instruction only). This will make the patch smaller, and also compilation time concern can be addressed as well.
I'll check the performance impact, and if it needs more general version to fully address the slowdown, it will be the following one.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
91	It will be safe (won't be optimized). The test step_inst at onephi.ll checks it.
151	Giving true wasn't intended, I removed the value.
282	Nit: I guess we could make all these class members, right? Unclear if that is better though.
Previously there was one candidate object per phi, so making it as a class with named fields was making sense (instead of having tuple). Here, I think leaving it as variables is also okay, maybe.
Nit: DenseMap instead of std::map?
Just found that LLVM coding convention also suggests using ADTs, so fixed
313	Oops sorry, fixed

Address comments

Harbormaster failed remote builds in B55567: Diff 261678!May 2 2020, 8:08 PM

Added a test (func_from_mcf_r.ll) that is a real-world code which should be optimized.

I agree. @fhahn will you finish the review?

llvm/test/Transforms/CanonicalizeFreezeInLoops/func_from_mcf_r.ll
86	I think we can and should remove the metadata and attributes.

Leave the basic algorithm only.

I see that by this patch alone the slowdown in mcf_r isn't resolved.
I'll bring a follow-up patch.
While doing this, I'll investigate whether it is possible to apply more specific
pattern rather than just looking through instructions within N steps.

Remove an attribute from func_from_mcf_r.ll

aqjune edited the summary of this revision.May 3 2020, 11:39 AM

Harbormaster failed remote builds in B55580: Diff 261712!May 3 2020, 12:13 PM

Harbormaster failed remote builds in B55581: Diff 261713!

I investigated how this patch affects mcf_r , and the result was as follows:

(1) For the previous experimental result, I made a silly mistake :(
Actually, it brought speedup; I ran mcf_r with testsuite with LTO enabled, and applying this patch showed 4.7% speedup (91.86 sec -> 87.55 sec), which is much more than I expected.
To check whether it affects other benchmarks, I'm running SPEC 2017 (without using testsuite).

(2) For mcf_r, I investigated how many steps do we need to see from PHI to push all freezes out of the loop, and the maximum distance was 2.
From this fact, we may not need very general algorithm at this point. Simple syntactic rules can be added if mcf_r shows slowdown again. If more general pattern is needed, we can revisit the general algorithm.

I can invest only a few hours per day in patches, so updates might be a bit slow. Anyway, I'll attach the SPEC 2017 result once the run is done.

I applied this patch on e124e83, and ran SPEC2017rate. Here is the result (unit: sec.):

Benchmark	e124e83	+D77523	speedup
500.perlbench_r	844.24	830.61	1.64%
502.gcc_r	488.62	487.73	0.18%
505.mcf_r	673.73	678.61	-0.72%
520.omnetpp_r	684.97	684.75	0.03%
523.xalancbmk_r	753.45	750.64	0.37%
525.x264_r	598.98	597.80	0.20%
531.deepsjeng_r	481.23	481.94	-0.15%
541.leela_r	774.59	775.01	-0.05%
557.xz_r	694.84	696.73	-0.27%
508.namd_r	621.90	622.51	-0.10%
511.povray_r	967.31	966.13	0.12%
519.lbm_r	671.68	671.86	-0.03%
538.imagick_r	1122.32	1134.04	-1.03%
544.nab_r	943.58	943.61	0.00%

A few tests that raised errors because they required Fortran are excluded.
(1) I chose e124e83 because it was the commit where xalancbmk_r showed the small slowdown. With this patch only, xalancbmk_r is okay.
(2) mcf_r shows slowdown with this patch, but on my machine the speedup of the (full) patch was visible only on SPEC compiled with LLVM testsuite.
Interestingly again, with this patch only, its speedup (of the testsuite's mcf_r) is larger than expected (compared with e124e83, it is 3.4%)
(3) 538.imagick_r has 1% slowdown, which seems to be within error - these numbers are medians after 3 runs, and the baseline of 538.imagick_r fluctuates between around 1122 and 1133 (+D77523: 1133 ~ 1138).
(4) 500.perlbench_r has 1.6% speedup, and the speedup is consistent (it did not fluctuate).
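
The speedup column appears to be consistent with the baseline time divided by the patched time, minus one. That formula is an inference from the numbers in the table, not something stated in the comment, and the function name below is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Relative speedup in percent, computed as baseline / patched - 1.
// The formula is inferred from the table (an assumption): for example,
// 844.24 / 830.61 - 1 is about 1.64%.
double speedupPercent(double BaselineSec, double PatchedSec) {
  assert(PatchedSec > 0.0 && "run time must be positive");
  return (BaselineSec / PatchedSec - 1.0) * 100.0;
}
```

A positive value means the patched build ran faster than the baseline; a negative value means it ran slower.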

I have minor comments, but otherwise I think this is nicely restricted as a start. Why isn't it added to the pass pipeline?
@fhahn @efriedma Any objections?

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
46	Not needed anymore?
164	Can you also add:
loop:
  %i = phi i32 [%init, %entry], [%i.next, %loop]
  %i.fr1 = freeze i32 %i
  call void @call(i32 %i.fr1)
  %i.fr2 = freeze i32 %i
  call void @call(i32 %i.fr2)
  %i.next = add nsw nuw i32 %i, %step
  %cond = icmp eq i32 %i.next, %n
  br i1 %cond, label %loop, label %exit
and the double use of the frozen step value as well?
180	Nit: No braces.
188	`if (!ProcessedPHIs.insert(Info.PHI))` (or `insert(..).second`)
190	Please add a message to asserts, also in other places.
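
The `insert(..).second` suggestion for line 188 can be illustrated with a standard container. This sketch uses `std::set` and string names as stand-ins (an assumption; the patch works with LLVM set types and PHI pointers), and `uniquePHIs` is a hypothetical helper.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the PHI names in first-seen order, skipping repeats. The
// membership test and the insertion are a single call: insert() returns
// a pair whose .second is false when the element was already present.
std::vector<std::string>
uniquePHIs(const std::vector<std::string> &CandidatePHIs) {
  std::set<std::string> ProcessedPHIs;
  std::vector<std::string> Order;
  for (const std::string &PHI : CandidatePHIs) {
    if (!ProcessedPHIs.insert(PHI).second)
      continue; // already handled via an earlier candidate
    Order.push_back(PHI);
  }
  return Order;
}
```

Compared with a separate find followed by an insert, this performs one lookup per candidate and leaves no window in which the two calls can disagree.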

I don't have any concerns about the general approach. A couple drive-by comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
230	getLoopAnalysisUsage?
248	Newline

Address comments

Addition to the pipeline is addressed at the separate patch (https://reviews.llvm.org/D77524)
to make revert easier when either this patch or registration of the pass causes
a problem.

aqjune marked 6 inline comments as done.May 7 2020, 1:03 PM

aqjune added inline comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
230	Seems like using getLoopAnalysisUsage affects the code generation of LSR even if the pass is doing nothing, causing tests such as CodeGen/X86/lsr-loop-exit-cond.ll fail.

efriedma added inline comments.May 7 2020, 1:24 PM

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
230	What, specifically, in getLoopAnalysisUsage is causing that? If you're intentionally not using getLoopAnalysisUsage, please carefully document what you're doing here and in LSR.

Harbormaster failed remote builds in B56094: Diff 262735!May 7 2020, 1:35 PM

Explain why AnalysisUsage is manually updated

aqjune marked an inline comment as not done.May 7 2020, 2:04 PM

aqjune added inline comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
230	It is because it calls AU.addRequiredID(LCSSAID) , and LSR does not require LCSSA. I left a comment about this.

Harbormaster failed remote builds in B56101: Diff 262755!May 7 2020, 2:43 PM

aqjune mentioned this in D77524: [TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR.May 7 2020, 7:38 PM

ping @fhahn

Does it look good to other reviewers as well?

For me this makes sense. No objections.

fhahn added inline comments.May 15 2020, 10:30 AM

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
40	Not used?
47	not used?
65	nit: private not needed here.
71	public not needed here.
85	nit: I think it would be slightly more straight forward to have this function return a new FrozenIndPHIInfo.
85	On second look, this seems very similar to InductionDescriptor::isInductionPHI. Can we use that instead? isInductionPHI should also be able to subsume the call to isAuxiliaryInductionVariable
110	nit: placed in?
116	nit: needs message

Use InductionDescriptor::isInductionPHI , address comments

aqjune marked 9 inline comments as done.May 15 2020, 12:30 PM

aqjune added inline comments.

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
85	Yes, InductionDescriptor::isInductionPHI works as well. Thank you for the information.

aqjune edited the summary of this revision.May 15 2020, 12:30 PM

aqjune marked 3 inline comments as done.

Harbormaster failed remote builds in B56904: Diff 264311!May 15 2020, 1:37 PM

LGTM with some optional nits and a few comments about the tests. Please wait a day or so before committing in case there are any additional comments.

llvm/lib/Passes/PassBuilder.cpp
182	nit: unused?
llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
67	nit: It would be good to add comments to the members here, especially how they are related.
74	nit: It would be good to clarify what 'can handle' means here in function comment. Potentially move the comment about dropping nsw/nuw flags into it.
96	nit: you can use cast instead of dyn_cast. cast asserts that the value can be cast and you can remove UserI from the assert below.
127	nit: consider initializing PHI & StepInst via constructor.
146	nit: This does exactly the same as for the users of Info.StepInst, right (modulo the debug message). Might be good to use something like concat_range or a lambda for the common body.
llvm/test/Transforms/CanonicalizeFreezeInLoops/func_from_mcf_r.ll
4	same about AArch64 triple as for llvm/test/Transforms/CanonicalizeFreezeInLoops/onephi.ll. Also, it would be good to reduce the test to the important bits. For example, the types <{ i64, %struct.arc, %struct.g }>, <{ i64, %struct.arc, %struct.g } and the global aliases are quite verbose and do not really impact the freeze hoisting.
llvm/test/Transforms/CanonicalizeFreezeInLoops/onephi.ll
5	does this require an AArch64 triple? If so, you have to add something like `REQUIRES: aarch64-registered-target` I think, otherwise it might fail if LLVM is built without the AArch64 backend.

This revision is now accepted and ready to land.May 17 2020, 10:36 AM

Remove unnecessary parts from tests, address comments

Thank you! I'll wait 2 days and land this on Wednesday.

llvm/lib/Passes/PassBuilder.cpp
182	This seems to be needed by PassRegistry.def .
llvm/test/Transforms/CanonicalizeFreezeInLoops/onephi.ll
5	aarch64 wasn't needed. removed

Add REQUIRES to func_from_mcf_r.ll

Harbormaster failed remote builds in B57015: Diff 264533!May 17 2020, 10:52 PM

Harbormaster failed remote builds in B57012: Diff 264531!May 17 2020, 11:24 PM

Closed by commit rGd9a4a244138c: Add CanonicalizeFreezeInLoops pass (authored by aqjune).May 20 2020, 5:41 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

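Path	Size
llvm/include/llvm/InitializePasses.h	1 line
llvm/include/llvm/Transforms/Utils.h	7 lines
llvm/include/llvm/Transforms/Utils/CanonicalizeFreezeInLoops.h	32 lines
llvm/lib/Passes/PassBuilder.cpp	1 line
llvm/lib/Passes/PassRegistry.def	1 line
llvm/lib/Transforms/Utils/CMakeLists.txt	1 line
llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp	247 lines
llvm/lib/Transforms/Utils/Utils.cpp	1 line
llvm/test/Transforms/CanonicalizeFreezeInLoops/func_from_mcf_r.ll	71 lines
llvm/test/Transforms/CanonicalizeFreezeInLoops/nonsteps-preserve-flags.ll	34 lines
llvm/test/Transforms/CanonicalizeFreezeInLoops/onephi.ll	547 lines
llvm/test/Transforms/CanonicalizeFreezeInLoops/phis.ll	114 lines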

Diff 265394

llvm/include/llvm/InitializePasses.h

 void initializeBlockFrequencyInfoWrapperPassPass(PassRegistry&);
 void initializeBoundsCheckingLegacyPassPass(PassRegistry&);
 void initializeBranchFolderPassPass(PassRegistry&);
 void initializeBranchProbabilityInfoWrapperPassPass(PassRegistry&);
 void initializeBranchRelaxationPass(PassRegistry&);
 void initializeBreakCriticalEdgesPass(PassRegistry&);
 void initializeBreakFalseDepsPass(PassRegistry&);
 void initializeCanonicalizeAliasesLegacyPassPass(PassRegistry &);
+void initializeCanonicalizeFreezeInLoopsPass(PassRegistry &);
 void initializeCFGOnlyPrinterLegacyPassPass(PassRegistry&);
 void initializeCFGOnlyViewerLegacyPassPass(PassRegistry&);
 void initializeCFGPrinterLegacyPassPass(PassRegistry&);
 void initializeCFGSimplifyPassPass(PassRegistry&);
 void initializeCFGuardPass(PassRegistry&);
 void initializeCFGuardLongjmpPass(PassRegistry&);
 void initializeCFGViewerLegacyPassPass(PassRegistry&);
 void initializeCFIInstrInserterPass(PassRegistry&);

llvm/include/llvm/Transforms/Utils.h

	FunctionPass *createFixIrreduciblePass();

	//===----------------------------------------------------------------------===//
	//
	// AssumeSimplify - remove redundant assumes and merge assumes in the same
	// BasicBlock when possible.
	//
	FunctionPass *createAssumeSimplifyPass();

+	//===----------------------------------------------------------------------===//
+	//
+	// CanonicalizeFreezeInLoops - Canonicalize freeze instructions in loops so they
+	// don't block SCEV.
+	//
+	Pass *createCanonicalizeFreezeInLoopsPass();
	} // namespace llvm

	#endif

llvm/include/llvm/Transforms/Utils/CanonicalizeFreezeInLoops.h

This file was added.

				//==- CanonicalizeFreezeInLoop.h - Canonicalize freezes in a loop-- C++ --==//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file canonicalizes freeze instructions in a loop.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TRANSFORMS_UTILS_CANONICALIZE_FREEZES_IN_LOOPS_H
				#define LLVM_TRANSFORMS_UTILS_CANONICALIZE_FREEZES_IN_LOOPS_H

				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/IR/PassManager.h"
				#include "llvm/Transforms/Scalar/LoopPassManager.h"

				namespace llvm {

				/// A pass that canonicalizes freeze instructions in a loop.
				class CanonicalizeFreezeInLoopsPass
				: public PassInfoMixin<CanonicalizeFreezeInLoopsPass> {
				public:
				PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM,
				LoopStandardAnalysisResults &AR, LPMUpdater &U);
				};

				} // end namespace llvm

				#endif // LLVM_TRANSFORMS_UTILS_CANONICALIZE_FREEZES_IN_LOOPS_H

llvm/lib/Passes/PassBuilder.cpp

	#include "llvm/Transforms/Scalar/SpeculateAroundPHIs.h"
	#include "llvm/Transforms/Scalar/SpeculativeExecution.h"
	#include "llvm/Transforms/Scalar/TailRecursionElimination.h"
	#include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
	#include "llvm/Transforms/Utils/AddDiscriminators.h"
	#include "llvm/Transforms/Utils/AssumeBundleBuilder.h"
	#include "llvm/Transforms/Utils/BreakCriticalEdges.h"
	#include "llvm/Transforms/Utils/CanonicalizeAliases.h"
+	#include "llvm/Transforms/Utils/CanonicalizeFreezeInLoops.h"
		fhahn (Not Done): nit: unused?
		aqjune (Author, Done): This seems to be needed by PassRegistry.def.
	#include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
	#include "llvm/Transforms/Utils/InjectTLIMappings.h"
	#include "llvm/Transforms/Utils/LCSSA.h"
	#include "llvm/Transforms/Utils/LibCallsShrinkWrap.h"
	#include "llvm/Transforms/Utils/LoopSimplify.h"
	#include "llvm/Transforms/Utils/LowerInvoke.h"
	#include "llvm/Transforms/Utils/Mem2Reg.h"
	#include "llvm/Transforms/Utils/NameAnonGlobals.h"

llvm/lib/Passes/PassRegistry.def

	LOOP_ANALYSIS("ddg", DDGAnalysis())
	LOOP_ANALYSIS("ivusers", IVUsersAnalysis())
	LOOP_ANALYSIS("pass-instrumentation", PassInstrumentationAnalysis(PIC))
	#undef LOOP_ANALYSIS

	#ifndef LOOP_PASS
	#define LOOP_PASS(NAME, CREATE_PASS)
	#endif
+	LOOP_PASS("canon-freeze", CanonicalizeFreezeInLoopsPass())
	LOOP_PASS("invalidate<all>", InvalidateAllAnalysesPass())
	LOOP_PASS("licm", LICMPass())
	LOOP_PASS("loop-idiom", LoopIdiomRecognizePass())
	LOOP_PASS("loop-instsimplify", LoopInstSimplifyPass())
	LOOP_PASS("rotate", LoopRotatePass())
	LOOP_PASS("no-op-loop", NoOpLoopPass())
	LOOP_PASS("print", PrintLoopPass(dbgs()))
	LOOP_PASS("loop-deletion", LoopDeletionPass())

llvm/lib/Transforms/Utils/CMakeLists.txt

	add_llvm_component_library(LLVMTransformUtils
	AddDiscriminators.cpp
	AMDGPUEmitPrintf.cpp
	ASanStackFrameLayout.cpp
	AssumeBundleBuilder.cpp
	BasicBlockUtils.cpp
	BreakCriticalEdges.cpp
	BuildLibCalls.cpp
	BypassSlowDivision.cpp
	CallPromotionUtils.cpp
	CallGraphUpdater.cpp
	CanonicalizeAliases.cpp
+	CanonicalizeFreezeInLoops.cpp
	CloneFunction.cpp
	CloneModule.cpp
	CodeExtractor.cpp
	CodeMoverUtils.cpp
	CtorUtils.cpp
	Debugify.cpp
	DemoteRegToStack.cpp
	EntryExitInstrumenter.cpp

llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp

This file was added.

				//==- CanonicalizeFreezeInLoops - Canonicalize freezes in a loop-- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass canonicalizes freeze instructions in a loop by pushing them out to
				// the preheader.
				//
				fhahn (Done): 'use induction variables' is a bit imprecise IMO; there are no induction variables in IR. You could say 'freeze instructions that use either an induction PHI or the corresponding 'step' instruction (= the incoming value from the loop latch of the induction PHI)'.
				aqjune (Author, Done): Replaced it with 'induction PHI' in all places.
				// loop:
				// i = phi init, i.next
				// i.next = add nsw i, 1
				// i.next.fr = freeze i.next // push this out of this loop
				// use(i.next.fr)
				// br i1 (i.next <= N), loop, exit
				// =>
				// init.fr = freeze init
				// loop:
				// i = phi init.fr, i.next
				// i.next = add i, 1 // nsw is dropped here
				// use(i.next)
				// br i1 (i.next <= N), loop, exit
				//
				// Removing freezes from these chains help scalar evolution successfully analyze
				// expressions.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Transforms/Utils/CanonicalizeFreezeInLoops.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/Analysis/IVUsers.h"
				#include "llvm/Analysis/LoopAnalysisManager.h"
				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/LoopPass.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/InitializePasses.h"
				fhahn (Done): Not used?
				#include "llvm/Pass.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Transforms/Utils.h"

				using namespace llvm;

				jdoerfert (Done): Not needed anymore?
				#define DEBUG_TYPE "canon-freeze"
				fhahn (Done): not used?

				namespace {

				class CanonicalizeFreezeInLoops : public LoopPass {
				public:
				static char ID;

				CanonicalizeFreezeInLoops();

				private:
				bool runOnLoop(Loop *L, LPPassManager &LPM) override;
				void getAnalysisUsage(AnalysisUsage &AU) const override;
				};

				class CanonicalizeFreezeInLoopsImpl {
				Loop *L;
				fhahn (Done): nit: A struct.
				ScalarEvolution &SE;
				DominatorTree &DT;
				fhahn (Done): nit: private not needed here.

				struct FrozenIndPHIInfo {
				fhahn (Done): nit: It would be good to add comments to the members here, especially how they are related.
				// A freeze instruction that uses an induction phi
				FreezeInst *FI = nullptr;
				// The induction phi, step instruction, the operand idx of StepInst which is
				// a step value
				fhahn (Done): public not needed here.
				PHINode *PHI;
				jdoerfert (Done): Is there a good reason not to make this a struct with named members so the values and use sites have meaning?
				aqjune (Author, Done): Tuple was used because its types seemed to represent what they stand for.
				jdoerfert (Done): Sure, a freeze inst, a phi node, a binary operator and a value. You have to see `Value *StepValue = std::get<3>(Candidate);` to guess what the value is. If you want to access the phi (as an Instruction or auto) you have to go back to the vector definition to determine the index. These are (IMHO) good reasons against a tuple. Are there any benefits?
				aqjune (Author, Done): Had no special preference for it, so moved to a struct with names.
				BinaryOperator *StepInst;
				unsigned StepValIdx = 0;
				fhahn (Done): nit: It would be good to clarify what 'can handle' means here in the function comment. Potentially move the comment about dropping nsw/nuw flags into it.

				jdoerfert (Done): `auto *BB` if these are pointers.
				FrozenIndPHIInfo(PHINode *PHI, BinaryOperator *StepInst)
				: PHI(PHI), StepInst(StepInst) {}
				};
				jdoerfert (Done): Nit: I would call this `canHandleInst` or something similar. We can deal with various instructions I guess, some require us to drop flags but others not, e.g., freeze or bitcast.

				// Can freeze instruction be pushed into operands of I?
				// In order to do this, I should not create a poison after I's flags are
				// stripped.
				bool canHandleInst(const Instruction *I) {
				auto Opc = I->getOpcode();
				// If add/sub/mul, drop nsw/nuw flags.
				fhahn (Done): nit: I think it would be slightly more straightforward to have this function return a new FrozenIndPHIInfo.
				fhahn (Done): On second look, this seems very similar to InductionDescriptor::isInductionPHI. Can we use that instead? isInductionPHI should also be able to subsume the call to isAuxiliaryInductionVariable.
				aqjune (Author, Done): Yes, InductionDescriptor::isInductionPHI works as well. Thank you for the information.
				return Opc == Instruction::Add || Opc == Instruction::Sub ||
				fhahn (Done): An alternative approach would be to look at the phis of the loop header and use InductionDescriptor::isInductionPHI to identify induction PHIs and their connected 'step' instruction. Then look at their users to see if they contain a freeze instruction. Candidates would be {PHI, IVDescriptor, [list of freeze users to eliminate]}. I think this potentially could simplify the code a bit.
				The alternative has a slightly different compile-time cost: with the alternative approach, you always have to pay the cost to find the induction PHIs (shouldn't be too costly, as the information is available in SE already); with the current approach you always have to iterate over all instructions in the loop. But I think that's unlikely to really matter in the grand scheme of things. The current approach potentially leads to duplicated induction checks.
				The alternative approach would also handle the case where we have multiple freeze instructions using the same induction PHIs/steps without having multiple candidates, meaning each induction PHI/step would be frozen exactly once, not multiple times as with the current approach.
				aqjune (Author, Done): Now the code iterates over phis first, as you suggested.
				Opc == Instruction::Mul;
				}

				void InsertFreezeAndForgetFromSCEV(Use &U);

				jdoerfert (Done): What happens if the step value looks like this:
				    %iv = phi [0, ...] [%step, ...]
				    %step = add %iv, %iv
				Just checking that we don't have a problem in this case.
				aqjune (Author, Done): It will be safe (won't be optimized). The test step_inst in onephi.ll checks it.
				public:
				CanonicalizeFreezeInLoopsImpl(Loop *L, ScalarEvolution &SE, DominatorTree &DT)
				fhahn (Done): There's a test case missing for that scenario. Also, I think removing FI here will invalidate your iterator over the BB, probably causing a crash. You can work around that by using make_early_inc_range(BB) and iterating over that. But I am not sure we actually need to do this here, as a freeze without users should be cleaned up by the existing DCE, right? Not sure if we need to handle it here.
				: L(L), SE(SE), DT(DT) {}
				bool run();
				};
				fhahn (Done): nit: you can use cast instead of dyn_cast. cast asserts that the value can be cast and you can [drop] UserI from the assert below.

				} // anonymous namespace

				// Given U = (value, user), replace value with freeze(value), and let
				efriedma (Done): OpIdx is unused?
				jdoerfert (Done): You shadow PN even though you capture everything. Choose a different name, return the PHI, or use the captured version of PN?
				// SCEV forget user. The inserted freeze is placed in the preheader.
				void CanonicalizeFreezeInLoopsImpl::InsertFreezeAndForgetFromSCEV(Use &U) {
				auto *PH = L->getLoopPreheader();

				auto *UserI = cast<Instruction>(U.getUser());
				auto *ValueToFr = U.get();
				assert(L->contains(UserI->getParent()) &&
				"Should not process an instruction that isn't inside the loop");
				if (isGuaranteedNotToBeUndefOrPoison(ValueToFr, UserI, &DT))
				return;
				fhahn (Done): nit: placed in?

				LLVM_DEBUG(dbgs() << "canonfr: inserting freeze:\n");
				LLVM_DEBUG(dbgs() << "\tUser: " << *U.getUser() << "\n");
				LLVM_DEBUG(dbgs() << "\tOperand: " << *U.get() << "\n");

				jdoerfert (Done): Please initialize PN with null for the future, and similarly SteppingInst.
				U.set(new FreezeInst(ValueToFr, ValueToFr->getName() + ".frozen",
				fhahn (Done): nit: needs message
				PH->getTerminator()));
				efriedma (Done): What guarantees that SteppingInst is an add or sub?
				aqjune (Author, Done): It is `L->isAuxiliaryInductionVariable`.
				jdoerfert (Done): I have a bad feeling about this but if people think it's ok I'm fine with an assert.
				aqjune (Author, Done): The assertion has been removed.

				SE.forgetValue(UserI);
				}

				bool CanonicalizeFreezeInLoopsImpl::run() {
				// The loop should be in LoopSimplify form.
				jdoerfert (Done): Style: `for (const auto &U : I->users())`? (w/ or w/o const)
				if (!L->isLoopSimplifyForm())
				return false;
				efriedma (Done): If StepBB is inside the loop, it can't be the preheader.

				efriedma (Done): It looks like isAuxiliaryInductionVariable checks that the step is loop-invariant?
				aqjune (Author, Done): A semantically loop-invariant instruction may reside in a loop. For example, if it is a division of two loop-invariant variables, it is also loop invariant but cannot be hoisted.
				SmallVector<FrozenIndPHIInfo, 4> Candidates;
				fhahn (Done): nit: consider initializing PHI & StepInst via constructor.

				for (auto &PHI : L->getHeader()->phis()) {
				InductionDescriptor ID;
				jdoerfert (Done): Shouldn't we teach SCEV about freeze instead?
				aqjune (Author, Done): I think we have two kinds of analyses conceptually: must-be analysis (any program execution should satisfy the analysis result) and may-be analysis (there exists an execution that satisfies the result). SCEV analysis seems to satisfy both. In the case of freeze, if the SCEV result of freeze(val) is defined as the SCEV of val, must-be analysis becomes problematic; freeze(val) can be any value if val was poison. So, teaching SCEV about freeze may not fully recover precision in general.
				if (!InductionDescriptor::isInductionPHI(&PHI, L, &SE, ID))
				continue;

				LLVM_DEBUG(dbgs() << "canonfr: PHI: " << PHI << "\n");
				FrozenIndPHIInfo Info(&PHI, ID.getInductionBinOp());
				if (!Info.StepInst || !canHandleInst(Info.StepInst)) {
				// The stepping instruction has unknown form.
				// Ignore this PHI.
				jdoerfert (Done): Is the find stuff necessary or could you just check the return value of the `insert` call?
				continue;
				}

				jdoerfert (Done): I think we often avoid such recursion in favor of worklists but if this doesn't show on the profile it's OK I guess.
				Info.StepValIdx = Info.StepInst->getOperand(0) == &PHI;
				Value *StepV = Info.StepInst->getOperand(Info.StepValIdx);
				jdoerfert (Done): Style: Single DEBUG? If you really want multiple dbgs() make it `LLVM_DEBUG({ ... });`
				if (auto *StepI = dyn_cast<Instruction>(StepV)) {
				if (L->contains(StepI->getParent())) {
				// The step value is inside the loop. Freezing step value will introduce
				fhahn (Done): nit: This does exactly the same as for the users of Info.StepInst, right (modulo the debug message)? Might be good to use something like concat_range or a lambda for the common body.
				// another freeze into the loop, so skip this PHI.
				continue;
				}
				}

				fhahn (Done): I think it would be good to be a bit more descriptive in the debug message, e.g. it could say something like 'Pushing freeze ' << %FI << ' through ' << SteppingInst << ' and ' << PN << ' outside of loop'.
				jdoerfert (Done): `true` is an unsigned for the callee, maybe `/* CurDepth */ 1` instead?
				aqjune (Author, Done): Giving true wasn't intended, I removed the value.
				auto Visit = [&](User *U) {
				if (auto *FI = dyn_cast<FreezeInst>(U)) {
				LLVM_DEBUG(dbgs() << "canonfr: found: " << *FI << "\n");
				Info.FI = FI;
				fhahn (Done): Is this overly conservative? The transform ensures that both operands won't be poison, right? Would it be enough to check if the instruction may result in poison with 2 non-poison operands?
				Candidates.push_back(Info);
				}
				};
				for_each(PHI.users(), Visit);
				for_each(Info.StepInst->users(), Visit);
				}

				if (Candidates.empty())
				return false;
				fhahn (Done): I think cases where we have multiple candidates for the same PHI are not handled correctly here at the moment, as after handling the first candidate the step operand will not match the original StepValue. Multiple candidates would be added for a loop like the one below, I think.
				    loop:
				      %i = phi i32 [%init, %entry], [%i.next, %loop]
				      %i.fr = freeze i32 %i
				      call void @call(i32 %i.fr)
				      %i.next = add nsw nuw i32 %i, %step
				      %i.next.fr = freeze i32 %i.next
				      %cond = icmp eq i32 %i.next, %n
				      call void @call(i32 %i.next.fr)
				      br i1 %cond, label %loop, label %exit
				aqjune (Author, Done): Now this case will be correctly handled. There is a test, add_multiuses2.
				jdoerfert (Done): Can you also add:
				    loop:
				      %i = phi i32 [%init, %entry], [%i.next, %loop]
				      %i.fr1 = freeze i32 %i
				      call void @call(i32 %i.fr1)
				      %i.fr2 = freeze i32 %i
				      call void @call(i32 %i.fr2)
				      %i.next = add nsw nuw i32 %i, %step
				      %cond = icmp eq i32 %i.next, %n
				      br i1 %cond, label %loop, label %exit
				and the double use of the frozen step value as well?

				SmallSet<PHINode *, 8> ProcessedPHIs;
				jdoerfert (Done): Do we really need a queue or would LIFO (=smallvector) also work?
				for (const auto &Info : Candidates) {
				PHINode *PHI = Info.PHI;
				if (!ProcessedPHIs.insert(Info.PHI).second)
				continue;

				BinaryOperator *StepI = Info.StepInst;
				jdoerfert (Done): Do we handle the case where both are `GuaranteedNotToBeUndefOrPoison` somewhere else? In that case we don't need to drop the poison generating flags, right?
				aqjune (Author, Done): It can be poison if summation overflowed. But it seems SteppingInst can be non-poison in certain cases (e.g. it was used by division; it is UB if poison), so added a GuaranteedNotToBeUndefOrPoison check on SteppingInst.
				jdoerfert (Done): I see.
				assert(StepI && "Step instruction should have been found");

				// Drop flags from the step instruction.
				if (!isGuaranteedNotToBeUndefOrPoison(StepI, StepI, &DT)) {
				LLVM_DEBUG(dbgs() << "canonfr: drop flags: " << *StepI << "\n");
				StepI->dropPoisonGeneratingFlags();
				SE.forgetValue(StepI);
				}
				jdoerfert (Done): Nit: No braces.

				InsertFreezeAndForgetFromSCEV(StepI->getOperandUse(Info.StepValIdx));

				unsigned OperandIdx =
				PHI->getOperandNumForIncomingValue(PHI->getIncomingValue(0) == StepI);
				InsertFreezeAndForgetFromSCEV(PHI->getOperandUse(OperandIdx));
				}

				jdoerfert (Done): `if (!ProcessedPHIs.insert(Info.PHI))` (or `insert(..).second`)
				// Finally, remove the old freeze instructions.
				for (const auto &Item : Candidates) {
				jdoerfert (Done): Please add a message to asserts, also in other places.
				auto *FI = Item.FI;
				LLVM_DEBUG(dbgs() << "canonfr: removing " << *FI << "\n");
				SE.forgetValue(FI);
				FI->replaceAllUsesWith(FI->getOperand(0));
				FI->eraseFromParent();
				}

				return true;
				}

				CanonicalizeFreezeInLoops::CanonicalizeFreezeInLoops() : LoopPass(ID) {
				initializeCanonicalizeFreezeInLoopsPass(*PassRegistry::getPassRegistry());
				}

				void CanonicalizeFreezeInLoops::getAnalysisUsage(AnalysisUsage &AU) const {
				AU.addPreservedID(LoopSimplifyID);
				AU.addRequired<LoopInfoWrapperPass>();
				AU.addPreserved<LoopInfoWrapperPass>();
				AU.addRequiredID(LoopSimplifyID);
				AU.addRequired<ScalarEvolutionWrapperPass>();
				AU.addPreserved<ScalarEvolutionWrapperPass>();
				AU.addRequired<DominatorTreeWrapperPass>();
				AU.addPreserved<DominatorTreeWrapperPass>();
				}

				bool CanonicalizeFreezeInLoops::runOnLoop(Loop *L, LPPassManager &) {
				if (skipLoop(L))
				return false;

				auto &SE = getAnalysis<ScalarEvolutionWrapperPass>().getSE();
				auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
				return CanonicalizeFreezeInLoopsImpl(L, SE, DT).run();
				}

				PreservedAnalyses
				CanonicalizeFreezeInLoopsPass::run(Loop &L, LoopAnalysisManager &AM,
				LoopStandardAnalysisResults &AR,
				LPMUpdater &U) {
				if (!CanonicalizeFreezeInLoopsImpl(&L, AR.SE, AR.DT).run())
				return PreservedAnalyses::all();
				efriedma (Not Done): getLoopAnalysisUsage?
				aqjune (Author, Not Done): Seems like using getLoopAnalysisUsage affects the code generation of LSR even if the pass is doing nothing, causing tests such as CodeGen/X86/lsr-loop-exit-cond.ll to fail.
				efriedma (Not Done): What, specifically, in getLoopAnalysisUsage is causing that? If you're intentionally not using getLoopAnalysisUsage, please carefully document what you're doing here and in LSR.
				aqjune (Author, Not Done): It is because it calls AU.addRequiredID(LCSSAID), and LSR does not require LCSSA. I left a comment about this.

				return getLoopPassPreservedAnalyses();
				}

				INITIALIZE_PASS_BEGIN(CanonicalizeFreezeInLoops, "canon-freeze",
				"Canonicalize Freeze Instructions in Loops", false, false)
				INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(LoopSimplify)
				INITIALIZE_PASS_END(CanonicalizeFreezeInLoops, "canon-freeze",
				"Canonicalize Freeze Instructions in Loops", false, false)

				Pass *llvm::createCanonicalizeFreezeInLoopsPass() {
				return new CanonicalizeFreezeInLoops();
				}

				char CanonicalizeFreezeInLoops::ID = 0;
				jdoerfert (Done): Nit: StepI is checked twice for null. Style: I'd use the braces around the inner conditional instead.
				jdoerfert (Done): Nit: I guess we could make all these class members, right? Unclear if that is better though.
				Nit: DenseMap instead of std::map?
				^ Both should be considered but can be kept this way too.
				aqjune (Author, Done):
				> Nit: I guess we could make all these class members, right? Unclear if that is better though.
				Previously there was one candidate object per phi, so making it a class with named fields made sense (instead of having a tuple). Here, I think leaving it as variables is also okay, maybe.
				> Nit: DenseMap instead of std::map?
				Just found that the LLVM coding convention also suggests using ADTs, so fixed.
				jdoerfert (Done): `const auto &Pair`. I think `const` and `*` or `&` should be combined with `auto` whenever possible.
				aqjune (Author, Done): Oops sorry, fixed.
				efriedma (Done): Newline

llvm/lib/Transforms/Utils/Utils.cpp

	Show All 21 Lines

	/// initializeTransformUtils - Initialize all passes in the TransformUtils
	/// library.
	void llvm::initializeTransformUtils(PassRegistry &Registry) {
	initializeAddDiscriminatorsLegacyPassPass(Registry);
	initializeAssumeSimplifyPassLegacyPassPass(Registry);
	initializeBreakCriticalEdgesPass(Registry);
	initializeCanonicalizeAliasesLegacyPassPass(Registry);
				initializeCanonicalizeFreezeInLoopsPass(Registry);
	initializeInstNamerPass(Registry);
	initializeLCSSAWrapperPassPass(Registry);
	initializeLibCallsShrinkWrapLegacyPassPass(Registry);
	initializeLoopSimplifyPass(Registry);
	initializeLowerInvokeLegacyPassPass(Registry);
	initializeLowerSwitchPass(Registry);
	initializeNameAnonGlobalLegacyPassPass(Registry);
	initializePromoteLegacyPassPass(Registry);
	Show All 27 Lines

llvm/test/Transforms/CanonicalizeFreezeInLoops/func_from_mcf_r.ll

This file was added.

				; RUN: opt < %s -canon-freeze -S | FileCheck %s
				; REQUIRES: aarch64-registered-target
				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64-unknown-linux-gnu"
				fhahn: Same comment about the AArch64 triple as for llvm/test/Transforms/CanonicalizeFreezeInLoops/onephi.ll. Also, it would be good to reduce the test to the important bits. For example, the types <{ i64, %struct.arc, %struct.g }>, <{ i64, %struct.arc, %struct.g } and the global aliases are quite verbose and do not really impact the freeze hoisting.

				%struct.arc = type { i32 }
				%struct.g = type { i64, %struct.arc, i64, i64, i64 }

				@m = global i64 0
				@h = global %struct.arc* null
				@j = global %struct.g zeroinitializer

				define dso_local i32 @main() {
				bb:
				%tmp = load i64, i64* getelementptr inbounds (%struct.g, %struct.g* @j, i32 0, i32 0), align 8
				%tmp1 = icmp sgt i64 %tmp, 0
				br i1 %tmp1, label %bb2, label %bb35

				bb2: ; preds = %bb
				%tmp3 = load i64, i64* @m, align 8
				%tmp4 = load %struct.arc*, %struct.arc** @h, align 8
				; CHECK: %tmp3.frozen = freeze i64 %tmp3
				br label %bb5

				bb5: ; preds = %bb28, %bb2
				%tmp6 = phi %struct.arc* [ %tmp4, %bb2 ], [ %tmp31, %bb28 ]
				%tmp7 = phi i64 [ %tmp3, %bb2 ], [ %tmp12, %bb28 ]
				; CHECK: %tmp7 = phi i64 [ %tmp3.frozen, %bb2 ], [ %tmp12, %bb28 ]
				%tmp8 = phi i64 [ 0, %bb2 ], [ %tmp11, %bb28 ]
				%tmp9 = trunc i64 %tmp7 to i32
				%tmp10 = getelementptr inbounds %struct.arc, %struct.arc* %tmp6, i64 0, i32 0
				store i32 %tmp9, i32* %tmp10, align 4
				%tmp11 = add nuw nsw i64 %tmp8, 1
				%tmp12 = add nsw i64 %tmp7, 1
				; CHECK: %tmp12 = add i64 %tmp7, 1
				store i64 %tmp12, i64* @m, align 8
				%tmp13 = load i64, i64* inttoptr (i64 16 to i64*), align 16
				%tmp14 = freeze i64 %tmp12
				; CHECK-NOT: %tmp14 = freeze i64 %tmp12
				%tmp15 = freeze i64 %tmp13
				%tmp16 = sdiv i64 %tmp14, %tmp15
				%tmp17 = mul i64 %tmp16, %tmp15
				%tmp18 = sub i64 %tmp14, %tmp17
				%tmp19 = load i64, i64* inttoptr (i64 24 to i64*), align 8
				%tmp20 = icmp sgt i64 %tmp18, %tmp19
				%tmp21 = load i64, i64* inttoptr (i64 32 to i64*), align 32
				br i1 %tmp20, label %bb22, label %bb28

				bb22: ; preds = %bb5
				%tmp23 = mul nsw i64 %tmp21, %tmp19
				%tmp24 = sub nsw i64 %tmp18, %tmp19
				%tmp25 = add nsw i64 %tmp21, -1
				%tmp26 = mul nsw i64 %tmp25, %tmp24
				%tmp27 = add nsw i64 %tmp26, %tmp23
				br label %bb28

				bb28: ; preds = %bb22, %bb5
				%tmp29 = phi i64 [ %tmp27, %bb22 ], [ %tmp21, %bb5 ]
				%tmp30 = add nsw i64 %tmp29, %tmp16
				%tmp31 = getelementptr inbounds %struct.arc, %struct.arc* getelementptr inbounds (%struct.g, %struct.g* @j, i32 0, i32 1), i64 %tmp30
				store %struct.arc* %tmp31, %struct.arc** @h, align 8
				%tmp32 = load i64, i64* getelementptr inbounds (%struct.g, %struct.g* @j, i32 0, i32 0), align 8
				%tmp33 = icmp slt i64 %tmp11, %tmp32
				br i1 %tmp33, label %bb5, label %bb34

				bb34: ; preds = %bb28
				br label %bb35

				bb35: ; preds = %bb34, %bb
				ret i32 0
				}
				jdoerfert: I think we can and should remove the metadata and attributes.

llvm/test/Transforms/CanonicalizeFreezeInLoops/nonsteps-preserve-flags.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -canon-freeze -S | FileCheck %s
				declare void @call(i32)

				define void @add(i32 %init, i32 %n) {
				; CHECK-LABEL: @add(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: [[NONSTEP:%.*]] = mul nsw i32 [[I]], 2
				; CHECK-NEXT: call void @call(i32 [[NONSTEP]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry], [%i.next, %loop ]
				%i.next = add nsw i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				%nonstep = mul nsw i32 %i, 2
				call void @call(i32 %nonstep)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}
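
The checks in nonsteps-preserve-flags.ll show the step instruction losing `nsw` while the unrelated `mul nsw` keeps it. The following is a minimal sketch of why that matters; it is not part of the patch. It is a toy model in plain C++ in which `std::nullopt` stands for poison, and all names (`Val`, `addNsw`, `addWrap`, `freeze`, `runOriginal`, `runCanonical`) are hypothetical. `freeze` is assumed to pick 0 for poison, which is only one of the values a real freeze may choose.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Toy value: std::nullopt models poison (an assumption of this sketch).
using Val = std::optional<int32_t>;

// Models `add nsw`: poison on signed overflow or on a poison operand.
Val addNsw(Val a, Val b) {
  if (!a || !b) return std::nullopt;
  int64_t wide = static_cast<int64_t>(*a) + static_cast<int64_t>(*b);
  if (wide > std::numeric_limits<int32_t>::max() ||
      wide < std::numeric_limits<int32_t>::min())
    return std::nullopt;
  return static_cast<int32_t>(wide);
}

// Models a plain `add`: wraps around, poison only on a poison operand.
Val addWrap(Val a, Val b) {
  if (!a || !b) return std::nullopt;
  uint32_t sum = static_cast<uint32_t>(*a) + static_cast<uint32_t>(*b);
  return static_cast<int32_t>(sum);
}

// Models `freeze`: poison becomes some fixed value (0 here, by assumption).
Val freeze(Val v, int32_t pick = 0) { return v ? v : Val(pick); }

// Original loop shape: the step result is frozen inside the loop,
// and the frozen value is what gets passed to @call.
std::vector<Val> runOriginal(Val init, int iters) {
  assert(iters >= 0);
  std::vector<Val> uses;
  Val i = init;
  for (int k = 0; k < iters; ++k) {
    Val next = addNsw(i, 1);
    uses.push_back(freeze(next));
    i = next;
  }
  return uses;
}

// Canonicalized shape: freeze only the initial value. `keepFlags`
// selects whether the step still carries nsw.
std::vector<Val> runCanonical(Val init, int iters, bool keepFlags) {
  assert(iters >= 0);
  std::vector<Val> uses;
  Val i = freeze(init);
  for (int k = 0; k < iters; ++k) {
    Val next = keepFlags ? addNsw(i, 1) : addWrap(i, 1);
    uses.push_back(next);
    i = next;
  }
  return uses;
}
```

In this model, starting from the maximum `int32_t`, the canonicalized loop with `keepFlags` set passes poison to the use, whereas the original loop never does because of its in-loop freeze. With the flag dropped, the use receives a defined (wrapped) value, which is one of the values the original freeze could have produced.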

llvm/test/Transforms/CanonicalizeFreezeInLoops/onephi.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -canon-freeze -S | FileCheck %s
				; A set of tests that have one phi node
				declare void @call(i32)
				declare i32 @get_step()
				fhahn: Does this require an AArch64 triple? If so, you have to add something like `REQUIRES: aarch64-registered-target`, I think; otherwise it might fail if LLVM is built without the AArch64 backend.
				aqjune (author): AArch64 wasn't needed; removed.

				define void @add(i32 %init, i32 %n) {
				; CHECK-LABEL: @add(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry], [%i.next, %loop ]
				%i.next = add i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @add_comm(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_comm(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 1, [[I]]
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add i32 1, %i
				%i.next.fr = freeze i32 %i.next
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @add_multiuses(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_multiuses(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @add_multiuses2(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_multiuses2(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%i.next.fr2 = freeze i32 %i.next
				call void @call(i32 %i.next.fr2)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @add_flags(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_flags(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @add_ind(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_ind(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: [[I_FR_NEXT:%.*]] = add nuw nsw i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_FR_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_FR_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%i.fr = freeze i32 %i
				%i.fr.next = add nuw nsw i32 %i.fr, 1
				call void @call(i32 %i.fr.next)
				%cond = icmp eq i32 %i.fr.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				define void @add_ind_frozen(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_ind_frozen(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_FR:%.*]] = freeze i32 [[I]]
				; CHECK-NEXT: [[I_NEXT_FR]] = add nuw nsw i32 [[I_FR]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT_FR]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT_FR]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [%init, %entry], [%i.next.fr, %loop]
				%i.fr = freeze i32 %i
				%i.next.fr = add nuw nsw i32 %i.fr, 1
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @add_flags_not_compared(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_flags_not_compared(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				define void @add_flags_not_compared_stepinst(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_flags_not_compared_stepinst(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT:%.*]] = add nuw nsw i32 [[I]], 1
				; CHECK-NEXT: [[I_NEXT_FR]] = freeze i32 [[I_NEXT]]
				; CHECK-NEXT: call void @call(i32 [[I_NEXT_FR]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next.fr, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				; If pushing freeze through icmp is needed, this should be enabled.
				; There is no correctness issue in pushing freeze into icmp here, just it's
				; being conservative right now.
				define void @add_flags_stepinst_frozen(i32 %init, i32 %n) {
				; CHECK-LABEL: @add_flags_stepinst_frozen(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: [[COND_FR:%.*]] = freeze i1 [[COND]]
				; CHECK-NEXT: br i1 [[COND_FR]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				call void @call(i32 %i.next)
				%cond = icmp eq i32 %i.next, %n
				%cond.fr = freeze i1 %cond
				br i1 %cond.fr, label %loop, label %exit

				exit:
				ret void
				}

				define void @sub(i32 %init, i32 %n) {
				; CHECK-LABEL: @sub(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = sub i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [%init, %entry], [%i.next, %loop]
				%i.next = sub nuw nsw i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @init_const(i32 %n) {
				; CHECK-LABEL: @init_const(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @step_init_arg(i32 %init, i32 %n, i32 %step) {
				; CHECK-LABEL: @step_init_arg(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[STEP_FROZEN:%.*]] = freeze i32 [[STEP:%.*]]
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], [[STEP_FROZEN]]
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [%init, %entry], [%i.next, %loop]
				%i.next = add nuw nsw i32 %i, %step
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @step_init_arg_multiuses(i32 %init, i32 %n, i32 %step) {
				; CHECK-LABEL: @step_init_arg_multiuses(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[STEP_FROZEN:%.*]] = freeze i32 [[STEP:%.*]]
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], [[STEP_FROZEN]]
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.next = add nsw nuw i32 %i, %step
				%i.next.fr1 = freeze i32 %i.next
				call void @call(i32 %i.next.fr1)
				%i.next.fr2 = freeze i32 %i.next
				call void @call(i32 %i.next.fr2)
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @step_init_arg_multiuses2(i32 %init, i32 %n, i32 %step) {
				; CHECK-LABEL: @step_init_arg_multiuses2(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[STEP_FROZEN:%.*]] = freeze i32 [[STEP:%.*]]
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: call void @call(i32 [[I]])
				; CHECK-NEXT: call void @call(i32 [[I]])
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], [[STEP_FROZEN]]
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %init, %entry ], [ %i.next, %loop ]
				%i.fr1 = freeze i32 %i
				call void @call(i32 %i.fr1)
				%i.fr2 = freeze i32 %i
				call void @call(i32 %i.fr2)
				%i.next = add nsw nuw i32 %i, %step
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				define void @step_init_inst(i32 %n) {
				; CHECK-LABEL: @step_init_inst(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[STEP:%.*]] = call i32 @get_step()
				; CHECK-NEXT: [[INIT:%.*]] = call i32 @get_step()
				; CHECK-NEXT: [[STEP_FROZEN:%.*]] = freeze i32 [[STEP]]
				; CHECK-NEXT: [[INIT_FROZEN:%.*]] = freeze i32 [[INIT]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], [[STEP_FROZEN]]
				; CHECK-NEXT: call void @call(i32 [[I_NEXT]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				%step = call i32 @get_step()
				%init = call i32 @get_step()
				br label %loop

				loop:
				%i = phi i32 [%init, %entry], [%i.next, %loop]
				%i.next = add nuw nsw i32 %i, %step
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				define void @step_inst(i32 %init, i32 %n) {
				; CHECK-LABEL: @step_inst(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i32 [[I]], [[I]]
				; CHECK-NEXT: [[I_NEXT_FR:%.*]] = freeze i32 [[I_NEXT]]
				; CHECK-NEXT: call void @call(i32 [[I_NEXT_FR]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT_FR]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [%init, %entry], [%i.next, %loop]
				%i.next = add nuw nsw i32 %i, %i
				%i.next.fr = freeze i32 %i.next
				call void @call(i32 %i.next.fr)
				%cond = icmp eq i32 %i.next.fr, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				define void @gep(i8* %init, i8* %end) {
				; CHECK-LABEL: @gep(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i8* [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = getelementptr inbounds i8, i8* [[I]], i64 1
				; CHECK-NEXT: [[I_NEXT_FR:%.*]] = freeze i8* [[I_NEXT]]
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i8* [[I_NEXT_FR]], [[END:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i8* [ %init, %entry], [%i.next, %loop ]
				%i.next = getelementptr inbounds i8, i8* %i, i64 1
				%i.next.fr = freeze i8* %i.next
				%cond = icmp eq i8* %i.next.fr, %end
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

llvm/test/Transforms/CanonicalizeFreezeInLoops/phis.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -canon-freeze -S | FileCheck %s
				; A set of tests that have several phi nodes
				declare void @call(i32)
				declare i32 @call2()

				define void @onephi_used(i32 %n, i32 %i.init, i32 %j.init) {
				; CHECK-LABEL: @onephi_used(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[I_INIT_FROZEN:%.*]] = freeze i32 [[I_INIT:%.*]]
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[I_INIT_FROZEN]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[J:%.*]] = phi i32 [ [[J_INIT:%.*]], [[ENTRY]] ], [ [[J_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add i32 [[I]], 1
				; CHECK-NEXT: [[J_NEXT]] = add nuw nsw i32 [[J]], -2
				; CHECK-NEXT: call void @call(i32 [[I]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %i.init, %entry ], [ %i.next, %loop ]
				%j = phi i32 [ %j.init, %entry ], [ %j.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%j.next = add nuw nsw i32 %j, -2
				%i.fr = freeze i32 %i
				call void @call(i32 %i.fr)
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				define void @twophis_used(i32 %n, i32 %i.init, i32 %j.init) {
				; CHECK-LABEL: @twophis_used(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[I_INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[J:%.*]] = phi i32 [ [[J_INIT:%.*]], [[ENTRY]] ], [ [[J_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i32 [[I]], 1
				; CHECK-NEXT: [[J_NEXT]] = add nuw nsw i32 [[J]], -2
				; CHECK-NEXT: [[IJ:%.*]] = add i32 [[I]], [[J]]
				; CHECK-NEXT: [[IJ_FR:%.*]] = freeze i32 [[IJ]]
				; CHECK-NEXT: call void @call(i32 [[IJ_FR]])
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %i.init, %entry ], [ %i.next, %loop ]
				%j = phi i32 [ %j.init, %entry ], [ %j.next, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%j.next = add nuw nsw i32 %j, -2
				%ij = add i32 %i, %j
				%ij.fr = freeze i32 %ij
				call void @call(i32 %ij.fr)
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

				; Negative test
				define void @nonindphi_used(i32 %n, i32 %i.init, i32 %j.init, i32 %k.init) {
				; CHECK-LABEL: @nonindphi_used(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br label [[LOOP:%.*]]
				; CHECK: loop:
				; CHECK-NEXT: [[I:%.*]] = phi i32 [ [[I_INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[J:%.*]] = phi i32 [ [[J_INIT:%.*]], [[ENTRY]] ], [ [[J_NEXT:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[K:%.*]] = phi i32 [ [[K_INIT:%.*]], [[ENTRY]] ], [ [[ANY:%.*]], [[LOOP]] ]
				; CHECK-NEXT: [[I_NEXT]] = add nuw nsw i32 [[I]], 1
				; CHECK-NEXT: [[J_NEXT]] = add nuw nsw i32 [[J]], -2
				; CHECK-NEXT: [[IJ:%.*]] = add i32 [[I]], [[J]]
				; CHECK-NEXT: [[IJK:%.*]] = add i32 [[IJ]], [[K]]
				; CHECK-NEXT: [[IJK_FR:%.*]] = freeze i32 [[IJK]]
				; CHECK-NEXT: call void @call(i32 [[IJK_FR]])
				; CHECK-NEXT: [[ANY]] = call i32 @call2()
				; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[I_NEXT]], [[N:%.*]]
				; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
				; CHECK: exit:
				; CHECK-NEXT: ret void
				;
				entry:
				br label %loop

				loop:
				%i = phi i32 [ %i.init, %entry ], [ %i.next, %loop ]
				%j = phi i32 [ %j.init, %entry ], [ %j.next, %loop ]
				%k = phi i32 [ %k.init, %entry ], [ %any, %loop ]
				%i.next = add nuw nsw i32 %i, 1
				%j.next = add nuw nsw i32 %j, -2
				%ij = add i32 %i, %j
				%ijk = add i32 %ij, %k
				%ijk.fr = freeze i32 %ijk
				call void @call(i32 %ijk.fr)
				%any = call i32 @call2()
				%cond = icmp eq i32 %i.next, %n
				br i1 %cond, label %loop, label %exit

				exit:
				ret void
				}

Diff 265394