This is an archive of the discontinued LLVM Phabricator instance.

[LoopFlatten] Move it to a LoopPassManager
ClosedPublic

Authored by SjoerdMeijer on Sep 20 2021, 3:26 AM.

Details

Summary

In D109958 it was noticed that we could optimise the pipeline and avoid rerunning LoopSimplify/LCSSA for LoopFlatten by moving it to a LoopPassManager. I think the reason why it currently is added to a FunctionPassManager, is because under the legacy-PM a loop pass manager couldn't handle a loop pass deleting/modifying a loop.

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Sep 20 2021, 3:26 AM
SjoerdMeijer requested review of this revision.Sep 20 2021, 3:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 20 2021, 3:26 AM

Fixed amdgpu test that I had missed.

the test changes seem to indicate that this is on top of D109958. can we land this one first?

llvm/test/CodeGen/AMDGPU/opt-pipeline.ll
163 ↗(On Diff #373565)

this is a legacy PM test, it shouldn't be getting updated?

My understanding was that the LoopFlattening was where it was in the pipeline because that gave better results than putting it elsewhere. Any ideas why?

My understanding was that the LoopFlattening was where it was in the pipeline because that gave better results than putting it elsewhere. Any ideas why?

I haven't moved LoopFlattening around in the past, and can't recall we have experimented with that much. Perhaps someone else did.... So I can't answer why, but I am also not sure much thought has gone into it in the pass. When it was function pass, my guess was this was a reasonable place where it was doing its job. My intention is to make it LoopNestPass, and have only moved it a bit to add it to the LPM2. I thought important was that Invar-simplify and simplify-cfg runs before LoopFlatten, which is still the case. I did check the benchmark that we care about, and actually saw a small improvement. No idea why, but it was confirmation enough that this move seems reasonable.

This removes the test changes, I had indeed incorrectly put this patch on top of D109958.

aeubanks accepted this revision.Sep 20 2021, 9:55 AM

lgtm if others are okay with the position in the pipeline

This revision is now accepted and ready to land.Sep 20 2021, 9:55 AM
asbirlea requested changes to this revision.Sep 20 2021, 4:49 PM

In D109958 I missed the fact that LoopFlatten does not preserve any analyses. This is, I assume, the reason it was not added alongside other Loop passes before and probably also why I'm seeing a crash when adding it as part of LPM2.
The pass should not misbehave on its own in a LoopPassManager because it's not using the Updater to readd loops for reprocessing, otherwise it would be incorrect alone too.
So either LoopFlatten needs to be remain in it's own LPM or taught to preserve analyses (LoopInfo, DominatorTree, etc)

This revision now requires changes to proceed.Sep 20 2021, 4:49 PM

In D109958 I missed the fact that LoopFlatten does not preserve any analyses. This is, I assume, the reason it was not added alongside other Loop passes before and probably also why I'm seeing a crash when adding it as part of LPM2.
The pass should not misbehave on its own in a LoopPassManager because it's not using the Updater to readd loops for reprocessing, otherwise it would be incorrect alone too.
So either LoopFlatten needs to be remain in it's own LPM or taught to preserve analyses (LoopInfo, DominatorTree, etc)

I think there are 2 separate things going on here:

  • A recent patch is causing a problem (as reported in D109958), which I am investigating and fixing. I suspect this is the same crash that you're seeing.
  • In addition, we have the question of preserved analysis. LoopFlatten removes an inner-loop, so it sounds right to me that LoopFlatten doesn't preserve LoopInfo, DomTree, or readd loops, etc. I was assuming that when a pass doesn't preserve an analysis, the pass manager reruns it when a subsequent pass requires it. But please advise if LoopFlatten is fine where it is here in this patch, or if it needs moving to its own LPM.

Also having talked to @dmgreen , I am now more convinced that having LoopFlatten separate makes sense (because it doesn't preserve much info as it works on pairs of outer/inner loops and removes the inner if it can). So I think we can abandon this patch. I can look into preserving LCSSA though, perhaps that can be kept up to date avoiding rerunning the pass.

FWIW, there are other loop passes that drastically change the loop structure (e.g. SimpleLoopUnswitch), and preserve all analyses, including LoopInfo.

nikic added a comment.Oct 7 2021, 8:12 AM

In D109958 I missed the fact that LoopFlatten does not preserve any analyses. This is, I assume, the reason it was not added alongside other Loop passes before and probably also why I'm seeing a crash when adding it as part of LPM2.
The pass should not misbehave on its own in a LoopPassManager because it's not using the Updater to readd loops for reprocessing, otherwise it would be incorrect alone too.
So either LoopFlatten needs to be remain in it's own LPM or taught to preserve analyses (LoopInfo, DominatorTree, etc)

Even if LoopFlatten is in its own LPM, doesn't it still break the LPM contract by not preserving analyses? Wouldn't the function to loop pass adaptor still claim the standard analyses are preserved, even though they actually aren't?

I don't understand why this is not already triggering verification failures.

nikic added a comment.Oct 7 2021, 8:26 AM

In D109958 I missed the fact that LoopFlatten does not preserve any analyses. This is, I assume, the reason it was not added alongside other Loop passes before and probably also why I'm seeing a crash when adding it as part of LPM2.
The pass should not misbehave on its own in a LoopPassManager because it's not using the Updater to readd loops for reprocessing, otherwise it would be incorrect alone too.
So either LoopFlatten needs to be remain in it's own LPM or taught to preserve analyses (LoopInfo, DominatorTree, etc)

Even if LoopFlatten is in its own LPM, doesn't it still break the LPM contract by not preserving analyses? Wouldn't the function to loop pass adaptor still claim the standard analyses are preserved, even though they actually aren't?

I don't understand why this is not already triggering verification failures.

Ah, I think the reason is that it actually does preserve the analyses, or at least some subset of them: https://github.com/llvm/llvm-project/blob/d550930afcbb84740681c219ab13efd133143f88/llvm/lib/Transforms/Scalar/LoopFlatten.cpp#L645 https://github.com/llvm/llvm-project/blob/d550930afcbb84740681c219ab13efd133143f88/llvm/lib/Transforms/Scalar/LoopFlatten.cpp#L663-L665

Possibly it already preserves everything properly and just needs to report that it does? Plus report the deleted loop to the pass manager?

New pass manager loop pass invalidation isn't exactly my area of expertise. My main issue with moving the pass was that performance was worse. I don't now remember why, but I have a vague memory of widening IV's in the flattening pass, if they have already been partially widened in indvarsimplify.

It may be important to note that LoopFlatten isn't a Loop pass. It's a LoopNest Pass and has been since D102904.

nikic added a comment.Oct 11 2021, 1:28 AM

New pass manager loop pass invalidation isn't exactly my area of expertise. My main issue with moving the pass was that performance was worse. I don't now remember why, but I have a vague memory of widening IV's in the flattening pass, if they have already been partially widened in indvarsimplify.

Would it be possible to reevaluate whether that issue still exists? If it is important for performance to have LoopFlatten in a standalone LPM, we should have a PhaseOrdering test that demonstrates that (which would have failed with this patch and made the issue obvious).

bmahjour added a project: Restricted Project.Dec 1 2021, 8:22 AM

I would like to pick this up, have reread the comments, but it's still a bit unclear to me what needs doing at this point.

First, returning to @nikic comment:

Possibly it already preserves everything properly and just needs to report that it does? Plus report the deleted loop to the pass manager?

Yeah, I think so, just wanted to confirm this: it uses forgetLoop from ScalarEvolution and markLoopAsDeleted from the LPMUpdater to invalidate the old loops, and updates the DomTree with deleteEdge.

So does this mean that this patch is fine as it is? Modulo "the report the deleted loop to the pass manage" perhaps, will need to look into that.
Can you please advise @nikic , @asbirlea ?

nikic accepted this revision.Dec 30 2021, 2:08 AM

Yes, I believe this should be fine as-is, the analysis preservation issues were addressed in D111328 and D111350.

Great, many thanks for your help with this!

This revision was not accepted when it landed; it landed in state Needs Revision.Dec 30 2021, 4:32 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.