This is an archive of the discontinued LLVM Phabricator instance.

Differential D132199

[LoopPassManager] Ensure to construct loop nests with the outermost loop
ClosedPublic

Authored by congzhe on Aug 18 2022, 9:02 PM.

Details

Reviewers

Whitney
bmahjour
Meinersbur
uabelho
asbirlea
aeubanks

Group Reviewers

Restricted Project

Commits

rG6782d71680ea: [LoopPassManager] Ensure to construct loop nests with the outermost loop

Summary

This patch resolves the bug reported and discussed in https://reviews.llvm.org/D124926#3718761 and https://reviews.llvm.org/D124926#3719876.

The bug: running loop interchange twice with the new pass manager, using -passes="loop(loop-interchange,loop-interchange)" on the IR attached in https://reviews.llvm.org/D124926#3718761, hangs forever and consumes more and more memory. That IR is added as a new lit test file in this patch.

The underlying reason, as described in https://reviews.llvm.org/D124926#3719876, is that loop interchange is a loopnest pass under the new pass manager, but the loop pass manager does not construct the loop nest correctly between the first and the second loop interchange pass. The loop in the IR is a triply nested loop, but after the first interchange pass completes, the loop nest that gets constructed is a doubly nested loop, which is incorrect and causes the hang.

The loop nest is constructed incorrectly because the outermost loop changes during the first interchange: the original outermost loop is no longer the outermost loop. For this bug (https://reviews.llvm.org/D124926#3718761), the original outermost loop L has become the middle loop. The loop nest is nevertheless still constructed from L, which is why it comes out as a doubly nested loop.

With this patch, before running each pass the loop pass manager always lets L point to the current outermost loop, because loop nests should be constructed from the outermost loop and it is only valid to run a loopnest pass when L is the outermost loop. Please refer to lines 89 to 94 in LoopPassManager.cpp in this patch.
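
To make the failure mode concrete, here is a minimal C++ sketch of the idea in the summary. It uses a toy `ToyLoop` struct rather than `llvm::Loop`; `findOutermost`, `nestDepthFrom` and `linkNest` are hypothetical helper names introduced only for illustration, and the nest is assumed to be perfect with one child per loop. It is not the code in this patch.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for llvm::Loop (assumption: a perfect nest, one child per loop).
struct ToyLoop {
  ToyLoop *Parent = nullptr; // enclosing loop, or nullptr if outermost
  ToyLoop *Child = nullptr;  // directly nested loop, or nullptr if innermost
};

// Walk the parent links until we reach the loop that has no parent.
inline ToyLoop *findOutermost(ToyLoop *L) {
  while (ToyLoop *P = L->Parent)
    L = P;
  return L;
}

// Depth of the nest rooted at Root, counting Root itself.
inline std::size_t nestDepthFrom(const ToyLoop *Root) {
  std::size_t Depth = 0;
  for (const ToyLoop *L = Root; L; L = L->Child)
    ++Depth;
  return Depth;
}

// Re-link three loops into the nest Outer(Middle(Inner)).
inline void linkNest(ToyLoop &Outer, ToyLoop &Middle, ToyLoop &Inner) {
  Outer.Parent = nullptr;
  Outer.Child = &Middle;
  Middle.Parent = &Outer;
  Middle.Child = &Inner;
  Inner.Parent = &Middle;
  Inner.Child = nullptr;
}
```

With three loops linked as A(B(C)) and `L` pointing to A, re-linking them as B(A(C)) leaves `nestDepthFrom(L)` at 2, while `nestDepthFrom(findOutermost(L))` is 3. That mirrors the doubly versus triply nested loop nest described in the summary.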

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

congzhe created this revision. Aug 18 2022, 9:02 PM

Herald added a project: Restricted Project. Aug 18 2022, 9:02 PM

Herald added a subscriber: hiraditya.

congzhe requested review of this revision. Aug 18 2022, 9:02 PM

Herald added a subscriber: llvm-commits.

congzhe edited the summary of this revision. Aug 18 2022, 9:06 PM

congzhe added reviewers: Whitney, bmahjour, Meinersbur, uabelho, Restricted Project.

congzhe added a project: Restricted Project.

congzhe mentioned this in D124926: [LoopInterchange] New cost model for loop interchange.

congzhe edited the summary of this revision. Aug 18 2022, 9:10 PM

Harbormaster completed remote builds in B182143: Diff 453868. Aug 18 2022, 9:51 PM

I have no idea if this is the right fix, but I've verified that it solves the problem that we saw.
Thanks!

bmahjour added inline comments. Aug 22 2022, 9:52 AM

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
71	can we copy the address of L at the start of the function, so we don't have to change the reference to a pointer in the function's interface?

LGTM other than the comment from Bardia.

aeubanks added a subscriber: aeubanks. Aug 22 2022, 10:24 AM

aeubanks added inline comments.

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
93–94	this feels like a hack and might hide future issues around accidentally swapping loops around. augmenting `LPMUpdater` to have a method to set the loopnest and having loop-interchange use that feels like a more proper and explicit solution. thoughts?

aeubanks added reviewers: asbirlea, aeubanks. Aug 22 2022, 10:24 AM

In D132199#3740143, @Whitney wrote:

LGTM other than the comment from Bardia.

Thanks for the review :)

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
71	Thanks Bardia for your comment! I've updated the patch according to your comment.
93–94	Thanks for commenting! Following your comments I've updated the patch such that if "LoopNestAnalysis" is not preserved by a pass (which essentially results in `IsLoopNestPtrValid=false`), we re-construct the loop nest based on the current outermost loop. IMHO this logic is the same as your proposal that if a pass invalidates the loopnest, we'll set the loopnest again.

congzhe updated this revision to Diff 454630. Aug 22 2022, 3:21 PM

Harbormaster completed remote builds in B182697: Diff 454630. Aug 22 2022, 4:20 PM

aeubanks added inline comments. Aug 24 2022, 1:46 PM

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
93–94	First, the existing code seems to be overly complicated; https://reviews.llvm.org/D132581 cleans up the loop nest analysis code. But the bigger thing is that this still seems too magical. The ideology behind the loop pass manager (and the CGSCC pass manager) is that passes should explicitly tell the pass manager what happens to loops. Changing `L` throughout the execution isn't really intended; we should be bailing out and letting the function->loop adaptor visit whatever we need to visit next via `LPMUpdater::Worklist`. IIUC, loop-interchange can arbitrarily change the nesting of a loop nest. The question is, when that happens, what should the loop pass manager do? I'd imagine that if loop-interchange makes a change, we'd want to bail out after it's done, then completely revisit all the loops in the loop nest inside-out to optimize them in their new nesting. But I'd like to hear your thoughts on this (and whether I understand loop-interchange correctly). I had something like https://reviews.llvm.org/D132599 in mind (currently it only revisits the new outermost loop, but something along those lines API-wise), but it still seems to hang on your repro. That seems like an issue with loop-interchange not being idempotent, i.e. if you keep running it on a loop, it keeps generating IR, which seems bad.

congzhe added inline comments. Aug 25 2022, 3:17 PM

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
93–94	Thanks for your input! I applied D132581 and D132599 and did some quick debugging. The reason that opt keeps generating IR is that we add the current outermost loop to the `worklist` via LPMUpdater at the end of loop interchange, so the `worklist` is never empty and we keep popping loops from it and running optimizations on them forever. The behavior occurs for both the new and the legacy pass manager. Regarding loop interchange behavior: under the new pass manager it is a loopnest pass, so from the loop pass manager's perspective we are not concerned about the inner or middle loops at all. We won't visit all the loops in the loop nest inside-out; we only visit the outermost loops and construct loop nests based on them. It is my understanding that after the loop interchange pass makes a change (which can be not one but multiple interchanges within one run of the pass), we would need to construct the loop nest again based on the current outermost loop before running the next loop interchange pass. I'd appreciate it if you could let me know whether this sounds clear to you.
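
The worklist behavior described in this comment can be illustrated with a small C++ sketch. This is a toy model under stated assumptions, not the LLVM worklist: `drainWorklist` is a hypothetical name, items are plain integers standing in for loops, and a `MaxRuns` cap is added only so that the sketch terminates.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Pops items from the worklist and runs Pass on each one. An item is pushed
// back whenever the pass reports a change. Returns the number of pass
// invocations, stopping early once MaxRuns is reached.
inline unsigned drainWorklist(std::deque<int> Worklist,
                              const std::function<bool(int)> &Pass,
                              unsigned MaxRuns) {
  unsigned Runs = 0;
  while (!Worklist.empty() && Runs < MaxRuns) {
    int Item = Worklist.back();
    Worklist.pop_back();
    ++Runs;
    if (Pass(Item))
      Worklist.push_back(Item); // revisit: the worklist does not shrink
  }
  return Runs;
}
```

A pass that reports a change on every run keeps the worklist non-empty until the cap is hit, whereas a pass that eventually stops reporting changes lets the worklist drain.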

aeubanks added inline comments. Aug 29 2022, 10:48 AM

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
93–94	D132599 only adds the loop nest back to the worklist if it made a change. It sounds like if you keep running loop-interchange on a loop nest, it can keep optimizing it forever, rather than eventually stopping. That seems bad. In terms of visiting loops, I understand that this is a loop nest pass and only runs on top level loops. But my question is, given that there are other loop passes that aren't loop nest passes in the same LPM, if we have `loop-pass-A,loop-interchange,loop-pass-B` and loop-interchange makes a change (say on `L1(L2)` where L1 is the outer loop, L2 is the inner loop -> `L2(L1)`), how should the LPM deal with the new loop nest? Should it revisit everything starting from the inner-most loop (e.g. the LPM started with `L1(L2)`, visited L2 with loop-pass-A and loop-pass-B, then ran loop-pass-A on L1 and loop-interchange on the whole loop nest, interchanging the loops, now should it restart all the loop passes on `L2(L1)` starting with L1, then L2, because the structure changed)?

congzhe updated this revision to Diff 457479. Sep 1 2022, 7:50 PM

congzhe added inline comments.

llvm/lib/Transforms/Scalar/LoopPassManager.cpp
93–94	Thanks for this. The reason loop interchange keeps optimizing the loopnest is that it can interchange loops either to achieve better locality or to expose more vectorization opportunities. Following your terminology above, for a loopnest L1(L2), loop interchange may interchange it to achieve better locality, resulting in L2(L1). If we then run another loop interchange pass on it, it may interchange the loopnest again, resulting in L1(L2), this time because it might expose more vectorization opportunities. This can go on and on.

In terms of visiting loops, I think what the pass manager currently does is as follows, which seems appropriate. With your example above, the LPM started with L1(L2), visited L2 with loop-pass-A and loop-pass-B, then ran loop-pass-A on L1 and loop-interchange on the whole loop nest, interchanging the loops, which results in L2(L1). Then it would run loop-pass-B on L1. That looks appropriate because L1 is the loop we want to optimize with loop-pass-B at this point, even if it became the inner loop after the interchange. Nevertheless, we can always "restart all the loop passes on L2(L1) starting with L1, then L2, because the structure changed". The potential drawback is increased compile time, as well as the infinite-run problem described in the first paragraph.

I do get your point that we may not want to change `L` in this function `runWithLoopNestPasses()`. I've provided another solution which does not change `L` but keeps a pointer to the outermost loop. It is based on your patch D132581. Whenever we need to run loopnest passes, we construct the loopnest based on the outermost loop. The interface with the `LPMUpdater` (`U.isLoopNestChanged()` and `U.markLoopNestChanged()`) may not be necessary though. It suffices to only add the while loop `while (auto *PL = OuterMostLoop->getParentLoop())` after `PassPA = runSinglePass(LN, Pass, AM, AR, U, PI)`. I'd appreciate it if you could let me know your comments on the most recent version. Thanks a lot.

Will rebase on D132581 once it is landed.

Harbormaster completed remote builds in B184747: Diff 457479. Sep 1 2022, 8:36 PM

Rebased on D132581.

Subtleties of this update are open for discussion, e.g., whether it is necessary to interact with the LPMUpdater. Comments are appreciated :)

Harbormaster completed remote builds in B184877: Diff 457680. Sep 2 2022, 1:30 PM

ping @aeubanks :)
I'm wondering if you have comments on my most recent change?

Sorry, I'd meant to reply but didn't find time, I'll try to be more prompt about responding to this

This is definitely better than the previous version.

IMO it would still be best if continually rerunning a pass on some IR would reach a fixed point. For example, when loop unswitching happens, we revisit the current loop (which means restarting the entire loop pipeline), and if we end up creating a new loop, we also add that new loop to the worklist. There has been care taken to ensure that unswitching cannot run forever. If interchange did this, we'd be able to just revisit the entire loop nest again. For the interchange example, you say that loop-interchange may continually swap between L1(L2) and L2(L1), one of them must be better than the other, why can't we converge on the better nesting? I don't quite understand the "locality vs vectorization" argument, there still must be one nesting that's ultimately optimal in the end. Then we can restart the pipeline on the loop nest, or just the loops that got swapped around.

But if that doesn't make sense, then something along these lines seems ok. I think it might be worth revisiting exactly how loop nest passes work, but I haven't thought too hard about it. But then again, this seems like a loop-interchange-specific issue, not a loop nest issue.

also you'll have to rebase again since my patch was reverted
also seems like we've dropped the test?

Thank you very much for the comment! I thought about it more and it looks like `IsLoopNestPtrValid &= PassPA->getChecker<LoopNestAnalysis>().preserved()` already serves as the interaction of a loopnest pass with the LPMUpdater. Although it does not explicitly interact with the LPMUpdater itself using something like `LPMUpdater::isLoopNestChanged()`, the fact that PA does not preserve LoopNestAnalysis already indicates that the loop nest has changed. Therefore, I've now added the re-construction of the loop nest under `if (!IsLoopNestPtrValid)`. I'd appreciate it if you could take a look.

Regarding reaching a fixed point for loop interchange: I fully agree with you that it would be ideal if we could reach a fixed point instead of running forever. And yes there should be one nesting that's ultimately optimal in the end. It's just with the current cost analysis in loop interchange, sometimes it has not yet been made clear which nesting (L1(L2) versus L2(L1), or locality versus vectorization) would be better. That is something that I think we can improve in the longer term.

I'm wondering if what I described above makes sense to you?
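
As a rough illustration of the fixed-point question discussed here, the following C++ sketch contrasts two toy decision rules. Everything in it is hypothetical: the `Order` enum, the two predicates, and the cost values 10 and 7 are made up for the example and do not reflect the actual loop interchange cost analysis.

```cpp
#include <cassert>

enum class Order { L1OuterL2Inner, L2OuterL1Inner };

inline Order flip(Order O) {
  return O == Order::L1OuterL2Inner ? Order::L2OuterL1Inner
                                    : Order::L1OuterL2Inner;
}

// Hypothetical rule with two independent criteria: "locality" prefers L2 as
// the outer loop, "vectorization" prefers L1 as the outer loop. One of the
// two always asks for an interchange, so the rule never settles.
inline bool twoCriteriaWantsInterchange(Order O) {
  bool LocalityPrefersOther = (O == Order::L1OuterL2Inner);
  bool VectorizationPrefersOther = (O == Order::L2OuterL1Inner);
  return LocalityPrefersOther || VectorizationPrefersOther;
}

// Hypothetical rule with a single cost per ordering (lower is better).
inline int cost(Order O) { return O == Order::L1OuterL2Inner ? 10 : 7; }
inline bool singleCostWantsInterchange(Order O) {
  return cost(flip(O)) < cost(O);
}

// Re-run the decision up to MaxRuns times; returns how many interchanges
// were performed and leaves the final ordering in O.
template <typename Pred>
unsigned countInterchanges(Order &O, Pred WantsInterchange, unsigned MaxRuns) {
  unsigned N = 0;
  for (unsigned I = 0; I < MaxRuns && WantsInterchange(O); ++I) {
    O = flip(O);
    ++N;
  }
  return N;
}
```

Under these made-up rules, the two-criteria predicate interchanges on every run, while the single-cost predicate interchanges once and then stops. That is the sense in which a single, consistent cost comparison reaches a fixed point.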

Harbormaster completed remote builds in B186476: Diff 459900. Sep 13 2022, 4:28 PM

I think I'm ok with this sort of patch being a temporary workaround
could you add a FIXME pointing to the discussion here?

Using PreservedAnalyses to tell the loop pass manager that the loop nest structure has changed is still IMO an abuse of it. The whole point of PreservedAnalyses is that we determine whether or not some cached analysis for a given IR unit should be invalidated. The loop nest analysis shouldn't be modeled as a loop analysis since it's not at the same IR level (that's why https://reviews.llvm.org/D132581 is wrong, caused issues, and was reverted). This is essentially adding an extra bit to the pass's return value to tell the loop pass manager to change how it behaves, which is not what PreservedAnalyses is intended to be used for. But that's exactly why LPMUpdater exists, for this sort of thing. So I'd still like to go back to the LPMUpdater version.

I also think there may be issues with loop analyses not being invalidated in regards to loop-interchange. If an inner loop has a cached analysis, right now I believe interchanging does not invalidate the (original) inner loop's analyses, but we may end up running a loop nest pass on it with its old analyses once it's the outer loop, even though the loop structure has changed. But that can be a separate patch.

Thanks for your clarification regarding PreservedAnalyses and LPMUpdater, I see your point. I've updated the patch again trying to incorporate LPMUpdater for this purpose. The reason that I dropped the use of LPMUpdater in my last version is that, since D132581 is reverted, it becomes a bit less straightforward to directly use LPMUpdater instead of "IsLoopNestPtrValid" and I thought it might suffice just to use "IsLoopNestPtrValid" as the indicator whether the loop nest has been changed.

Now I added the use of LPMUpdater back to the patch, and I added a FIXME that says we should not rely on PreservedAnalyses in the long run and we should use LPMUpdater only (I guess once D132581 is re-landed we could remove "IsLoopNestPtrValid"?). I'd appreciate it if you could take a second look :)
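
For readers following the thread, here is a minimal C++ sketch of the "explicit flag on the updater" approach discussed above. It is a toy model and not the landed code: `ToyLoop`, `ToyUpdater`, `ToyNestPass` and `runNestPasses` are hypothetical names, and resetting the flag after the outermost loop is recomputed is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy stand-in for llvm::Loop: only the parent link is modeled.
struct ToyLoop {
  ToyLoop *Parent = nullptr;
};

// Toy counterpart of the updater: passes set the flag, the manager reads it.
class ToyUpdater {
  bool LoopNestChanged = false;

public:
  bool isLoopNestChanged() const { return LoopNestChanged; }
  void markLoopNestChanged(bool Changed) { LoopNestChanged = Changed; }
};

using ToyNestPass = std::function<void(ToyLoop *NestRoot, ToyUpdater &U)>;

// Runs each nest pass in order and returns the nest root each pass was given.
// If the previous pass marked the nest as changed, the root is recomputed by
// walking up the parent links before the next pass runs.
inline std::vector<ToyLoop *>
runNestPasses(ToyLoop &L, const std::vector<ToyNestPass> &Passes) {
  ToyUpdater U;
  ToyLoop *Outermost = &L;
  std::vector<ToyLoop *> Roots;
  for (const ToyNestPass &Pass : Passes) {
    if (U.isLoopNestChanged()) {
      while (ToyLoop *PL = Outermost->Parent)
        Outermost = PL;
      U.markLoopNestChanged(false); // assumption: reset once handled
    }
    Roots.push_back(Outermost);
    Pass(Outermost, U);
  }
  return Roots;
}
```

The design point is that the pass tells the manager about the structural change through the updater, instead of encoding it in the set of preserved analyses that it returns.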

Harbormaster completed remote builds in B187173: Diff 460807. Sep 16 2022, 10:50 AM

thanks, lgtm with one nit
sorry for the long back and forth

I do think that the loop nest infrastructure in general needs some work and doesn't fit in with the rest of the new pass manager infrastructure very well. there should have been an RFC (unless I missed it)

llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
365	`markLoopNestChanged`

This revision is now accepted and ready to land. Sep 16 2022, 2:22 PM

btw perhaps another way of preventing infinite interchanging is to add some metadata to the loop, like unswitching does https://github.com/llvm/llvm-project/blob/7fb96fb5d33ee55fa5b65497c6074086f43babd2/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp#L3145

although perhaps you do want interchanging to potentially change its mind if we ever simplify a loop, assuming that loop passes will only ever simplify so we converge at some point

Thank you very much! I've done a minor update to address the minor comment. I'll land it shortly.

Regarding preventing infinite interchanging: thanks for your suggestion, this is definitely helpful, and using metadata could be a way to resolve the infinite interchange. Another direction is to improve the cost analysis. I'll think more about it.

Harbormaster completed remote builds in B188003: Diff 461943. Sep 21 2022, 10:37 AM

This revision was landed with ongoing or failed builds. Sep 21 2022, 9:00 PM

Closed by commit rG6782d71680ea: [LoopPassManager] Ensure to construct loop nests with the outermost loop (authored by congzhe).

This revision was automatically updated to reflect the committed changes.

congzhe added a commit: rG6782d71680ea: [LoopPassManager] Ensure to construct loop nests with the outermost loop.

Revision Contents

Path	Size
llvm/include/llvm/Transforms/Scalar/LoopPassManager.h	17 lines
llvm/lib/Transforms/Scalar/LoopInterchange.cpp	2 lines
llvm/lib/Transforms/Scalar/LoopPassManager.cpp	17 lines
llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll	53 lines

Diff 462077

llvm/include/llvm/Transforms/Scalar/LoopPassManager.h

[350 lines not shown; hunk context: #endif]
   void revisitCurrentLoop() {
     // Tell the currently in-flight pipeline to stop running.
     SkipCurrentLoop = true;

     // And insert ourselves back into the worklist.
     Worklist.insert(CurrentL);
   }

+  bool isLoopNestChanged() const {
+    return LoopNestChanged;
+  }
+
+  /// Loopnest passes should use this method to indicate if the
+  /// loopnest has been modified.
+  void markLoopNestChanged(bool Changed) {
+    LoopNestChanged = Changed;
+  }
+
 private:
   friend class llvm::FunctionToLoopPassAdaptor;

   /// The \c FunctionToLoopPassAdaptor's worklist of loops to process.
   SmallPriorityWorklist<Loop *, 4> &Worklist;

   /// The analysis manager for use in the current loop nest.
   LoopAnalysisManager &LAM;

   Loop *CurrentL;
   bool SkipCurrentLoop;
   const bool LoopNestMode;
+  bool LoopNestChanged;

 #ifdef LLVM_ENABLE_ABI_BREAKING_CHECKS
   // In debug builds we also track the parent loop to implement asserts even in
   // the face of loop deletion.
   Loop *ParentL;
 #endif

   LPMUpdater(SmallPriorityWorklist<Loop *, 4> &Worklist,
-             LoopAnalysisManager &LAM, bool LoopNestMode = false)
-      : Worklist(Worklist), LAM(LAM), LoopNestMode(LoopNestMode) {}
+             LoopAnalysisManager &LAM, bool LoopNestMode = false,
+             bool LoopNestChanged = false)
+      : Worklist(Worklist), LAM(LAM), LoopNestMode(LoopNestMode),
+        LoopNestChanged(LoopNestChanged) {}
 };

 template <typename IRUnitT, typename PassT>
 Optional<PreservedAnalyses> LoopPassManager::runSinglePass(
     IRUnitT &IR, PassT &Pass, LoopAnalysisManager &AM,
     LoopStandardAnalysisResults &AR, LPMUpdater &U, PassInstrumentation &PI) {
   // Get the loop in case of Loop pass and outermost loop in case of LoopNest
   // pass which is to be passed to BeforePass and AfterPass call backs.
[153 lines not shown]

llvm/lib/Transforms/Scalar/LoopInterchange.cpp

[38 lines not shown]
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Transforms/Scalar.h"
+#include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/LoopUtils.h"
 #include <cassert>
 #include <utility>
 #include <vector>

 using namespace llvm;

[1,706 lines not shown; hunk context: PreservedAnalyses LoopInterchangePass::run(LoopNest &LN,]
   Function &F = *LN.getParent();

   DependenceInfo DI(&F, &AR.AA, &AR.SE, &AR.LI);
   std::unique_ptr<CacheCost> CC =
       CacheCost::getCacheCost(LN.getOutermostLoop(), AR, DI);
   OptimizationRemarkEmitter ORE(&F);
   if (!LoopInterchange(&AR.SE, &AR.LI, &DI, &AR.DT, CC, &ORE).run(LN))
     return PreservedAnalyses::all();
+  U.markLoopNestChanged(true);
   return getLoopPassPreservedAnalyses();
 }

llvm/lib/Transforms/Scalar/LoopPassManager.cpp

Show First 20 Lines • Show All 62 Lines • ▼ Show 20 Lines	void PassManager<Loop, LoopAnalysisManager, LoopStandardAnalysisResults &,
}		}
}		}

// Run both loop passes and loop-nest passes on top-level loop \p L.		// Run both loop passes and loop-nest passes on top-level loop \p L.
PreservedAnalyses		PreservedAnalyses
LoopPassManager::runWithLoopNestPasses(Loop &L, LoopAnalysisManager &AM,		LoopPassManager::runWithLoopNestPasses(Loop &L, LoopAnalysisManager &AM,
LoopStandardAnalysisResults &AR,		LoopStandardAnalysisResults &AR,
LPMUpdater &U) {		LPMUpdater &U) {
assert(L.isOutermost() &&		assert(L.isOutermost() &&
		bmahjourUnsubmitted Not Done Reply Inline Actions can we copy the address of L at the start of the function, so we don't have to change the reference to a pointer in the function's interface? bmahjour: can we copy the address of L at the start of the function, so we don't have to change the…
		congzheAuthorUnsubmitted Done Reply Inline Actions Thanks Bardia for your comment! I've updated the patch according to your comment. congzhe: Thanks Bardia for your comment! I've updated the patch according to your comment.
"Loop-nest passes should only run on top-level loops.");		"Loop-nest passes should only run on top-level loops.");
PreservedAnalyses PA = PreservedAnalyses::all();		PreservedAnalyses PA = PreservedAnalyses::all();

// Request PassInstrumentation from analysis manager, will use it to run		// Request PassInstrumentation from analysis manager, will use it to run
// instrumenting callbacks for the passes later.		// instrumenting callbacks for the passes later.
PassInstrumentation PI = AM.getResult<PassInstrumentationAnalysis>(L, AR);		PassInstrumentation PI = AM.getResult<PassInstrumentationAnalysis>(L, AR);

unsigned LoopPassIndex = 0, LoopNestPassIndex = 0;		unsigned LoopPassIndex = 0, LoopNestPassIndex = 0;

// `LoopNestPtr` points to the `LoopNest` object for the current top-level		// `LoopNestPtr` points to the `LoopNest` object for the current top-level
// loop and `IsLoopNestPtrValid` indicates whether the pointer is still valid.		// loop and `IsLoopNestPtrValid` indicates whether the pointer is still valid.
// The `LoopNest` object will have to be re-constructed if the pointer is		// The `LoopNest` object will have to be re-constructed if the pointer is
// invalid when encountering a loop-nest pass.		// invalid when encountering a loop-nest pass.
std::unique_ptr<LoopNest> LoopNestPtr;		std::unique_ptr<LoopNest> LoopNestPtr;
bool IsLoopNestPtrValid = false;		bool IsLoopNestPtrValid = false;
		Loop *OuterMostLoop = &L;

for (size_t I = 0, E = IsLoopNestPass.size(); I != E; ++I) {		for (size_t I = 0, E = IsLoopNestPass.size(); I != E; ++I) {
Optional<PreservedAnalyses> PassPA;		Optional<PreservedAnalyses> PassPA;
if (!IsLoopNestPass[I]) {		if (!IsLoopNestPass[I]) {
// The `I`-th pass is a loop pass.		// The `I`-th pass is a loop pass.
auto &Pass = LoopPasses[LoopPassIndex++];		auto &Pass = LoopPasses[LoopPassIndex++];
PassPA = runSinglePass(L, Pass, AM, AR, U, PI);		PassPA = runSinglePass(L, Pass, AM, AR, U, PI);
		aeubanksUnsubmitted Not Done Reply Inline Actions this feels like a hack and might hide future issues around accidentally swapping loops around augmenting `LPMUpdater` to have a method to set the loopnest and having loop-interchange use that feels like a more proper and explicit solution thoughts? aeubanks: this feels like a hack and might hide future issues around accidentally swapping loops around…
		congzheAuthorUnsubmitted Done Reply Inline Actions Thanks for commenting! Following your comments I've updated the patch such that if "LoopNestAnalysis" is not preserved by a pass (which essentially results in `IsLoopNestPtrValid=false`), we re-construct the loop nest based on the current outermost loop. IMHO this logic is the same as your proposal that if a pass invalidates the loopnest, we'll set the loopnest again. congzhe: Thanks for commenting! Following your comments I've updated the patch such that if…
		aeubanksUnsubmitted Not Done Reply Inline Actions first, the existing code seems to be overly complicated, https://reviews.llvm.org/D132581 cleans up the loop nest analysis code but the bigger thing is that this still seems too magical. the ideology behind the loop pass manager (and the CGSCC pass manager) is that passes should explicitly tell the pass manager what happens to loops. changing `L` throughout the execution isn't really intended, we should be bailing out and letting the function->loop adaptor visit whatever we need to visit next via `LPMUpdater::Worklist` IIUC, loop-interchange can arbitrarily change the nesting of a loop nest. the question is, when that happens, what should the loop pass manager do? I'd imagine that if loop-interchange makes a change, we'd want to bail out after it's done, then completely revisit all the loops in the loop nest inside-out to optimize them in their new nesting. but I'd like to hear your thoughts on this (and if I understand loop-interchange correctly). I had something like https://reviews.llvm.org/D132599 in mind (currently it only revisits the new outermost loop, but something along those lines API-wise), but it seems to still hang on your repro. but that seems like an issue with loop-interchange not being idempotent, i.e. if you keep running it on a loop, it keeps generating IR which seems bad aeubanks: first, the existing code seems to be overly complicated, https://reviews.llvm.org/D132581…
		congzheAuthorUnsubmitted Done Reply Inline Actions Thanks for your input! I applied D132581 and D132599 and did some quick debugging. The reason that opt keeps generating IR is that we add the current outermost loop to the `worklist` via LPMUpdater at the end of loop interchange, so the `worklist` is never empty and we keep poping loops from it and run optimizations on it forever. The behavior occurs for both the new and legacy pass manager. Regarding loop interchange behavior: under the new pass manager it is a loopnest pass, so we are not concerned about the inner or middle loops at all from loop pass manager's perspective. We won't visit all the loops in the loop nest inside-out but we only visit the outermost loops and construct loop nests based on outermost loops. It is my understanding that after the loop interchange pass makes a change (which can be not one but multiple interchanges running one iteration of the pass), we would need to construct the loop nest again based on the current outermost loop, before running the next loop interchange pass. I'd appreciate it if you could let me know if it sounds clear to you. congzhe: Thanks for your input! I applied D132581 and D132599 and did some quick debugging. The reason…
aeubanks: D132599 only adds the loop nest back to the worklist if it made a change. It sounds like if you keep running loop-interchange on a loop nest, it can keep optimizing it forever, rather than eventually stopping. That seems bad.

In terms of visiting loops, I understand that this is a loop nest pass and only runs on top-level loops. But my question is: given that there are other loop passes that aren't loop nest passes in the same LPM, if we have `loop-pass-A,loop-interchange,loop-pass-B` and loop-interchange makes a change (say on `L1(L2)`, where L1 is the outer loop and L2 is the inner loop, producing `L2(L1)`), how should the LPM deal with the new loop nest? Should it revisit everything starting from the innermost loop? For example, the LPM started with `L1(L2)`, visited L2 with loop-pass-A and loop-pass-B, then ran loop-pass-A on L1 and loop-interchange on the whole loop nest, interchanging the loops. Should it now restart all the loop passes on `L2(L1)`, starting with L1, then L2, because the structure changed?
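The visiting order in this example can be written out as a small model. It is a sketch under stated assumptions: `ToyPass` and `schedule` are hypothetical names, the model handles only a two-deep nest, and it reproduces just the order stated in the comment (inner loop first with loop passes only, then the top-level loop with every pass), not the real pass manager.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one pass in a loop pass manager pipeline.
struct ToyPass {
  std::string Name;
  bool IsLoopNestPass; // true: runs only on top-level loops, on the whole nest
};

// Build the visit order for a two-deep nest Outer(Inner): the inner loop is
// visited first with loop passes only, then the outer loop with every pass.
std::vector<std::string> schedule(const std::vector<ToyPass> &Pipeline,
                                  const std::string &Outer,
                                  const std::string &Inner) {
  assert(!Outer.empty() && !Inner.empty() && "loop names must be non-empty");
  std::vector<std::string> Order;
  for (const ToyPass &P : Pipeline)
    if (!P.IsLoopNestPass)
      Order.push_back(P.Name + " on " + Inner);
  for (const ToyPass &P : Pipeline)
    Order.push_back(P.Name + " on " +
                    (P.IsLoopNestPass ? Outer + "(" + Inner + ")" : Outer));
  return Order;
}
```

For the pipeline `loop-pass-A,loop-interchange,loop-pass-B` on `L1(L2)`, the model yields five steps, with loop-interchange running once on the whole nest between the two loop passes on L1. The open question in the comment is what should happen to the steps after the interchange once the nesting has changed.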
congzhe (Author): Thanks for this. The reason why loop interchange keeps optimizing the loopnest is that it can interchange loops either to achieve better locality or to expose more vectorization opportunities. Following your terminology above, it may happen that for a loopnest L1(L2), loop interchange will interchange it to achieve better locality, resulting in L2(L1). If we then run another loop interchange pass on it, it may interchange the loopnest again, resulting in L1(L2), this time because doing so might expose more vectorization opportunities. This behavior can go on and on.

In terms of visiting loops, I think what the pass manager currently does is as follows, and it seems appropriate. With your example above, the LPM started with L1(L2), visited L2 with loop-pass-A and loop-pass-B, then ran loop-pass-A on L1 and loop-interchange on the whole loop nest, interchanging the loops, which results in L2(L1). Then it would run loop-pass-B on L1. That looks appropriate because L1 is the loop that we want to optimize with loop-pass-B at this point, even though it became the inner loop after interchange.

Nevertheless, we can always "restart all the loop passes on L2(L1) starting with L1, then L2, because the structure changed". The potential drawbacks are increased compile time and the infinite-run problem described in the first paragraph.

I do get your point that we may not want to change `L` in the function `runWithLoopNestPasses()`. I've provided another solution that does not change `L` but keeps a pointer to the outermost loop. It is based on your patch D132581. Whenever we need to run loopnest passes, we construct the loopnest based on the outermost loop. The interface with LPMUpdater (`U.isLoopNestChanged()` and `U.markLoopNestChanged()`) may not be necessary, though. It suffices to add only the while loop `while (auto PL = OuterMostLoop->getParentLoop())` after `PassPA = runSinglePass(LN, Pass, AM, AR, U, PI)`.

I'd appreciate it if you could let me know your comments on the most recent version. Thanks a lot.
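The outermost-loop walk mentioned in this comment can be illustrated on a toy loop tree. `ToyLoop`, `findOutermost`, `nestSizeFrom`, and `nest` below are hypothetical stand-ins, not LLVM's `Loop` or `LoopNest` API; the sketch only shows why building the nest from a loop that has become the middle loop yields two loops instead of three.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a loop in a perfectly nested loop nest.
// This is NOT llvm::Loop; it only models parent/child links.
struct ToyLoop {
  ToyLoop *Parent = nullptr; // enclosing loop, or nullptr if outermost
  ToyLoop *Child = nullptr;  // single nested loop, or nullptr if innermost
};

// Walk parent links until reaching the loop that has no parent.
ToyLoop *findOutermost(ToyLoop *L) {
  assert(L && "loop must not be null");
  while (ToyLoop *ParentLoop = L->Parent)
    L = ParentLoop;
  return L;
}

// Number of loops in the nest rooted at Root (Root plus all descendants).
std::size_t nestSizeFrom(const ToyLoop *Root) {
  std::size_t N = 0;
  for (const ToyLoop *L = Root; L; L = L->Child)
    ++N;
  return N;
}

// Helper that places Inner directly inside Outer.
void nest(ToyLoop &Outer, ToyLoop &Inner) {
  Outer.Child = &Inner;
  Inner.Parent = &Outer;
}
```

If three toy loops are nested as A(L(C)) and `L` is the loop the pass manager still holds, `nestSizeFrom(&L)` counts two loops, while `nestSizeFrom(findOutermost(&L))` counts all three.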
     } else {
       // The `I`-th pass is a loop-nest pass.
       auto &Pass = LoopNestPasses[LoopNestPassIndex++];

       // If the loop-nest object calculated before is no longer valid,
       // re-calculate it here before running the loop-nest pass.
-      if (!IsLoopNestPtrValid) {
-        LoopNestPtr = LoopNest::getLoopNest(L, AR.SE);
+      //
+      // FIXME: PreservedAnalysis should not be abused to tell if the
+      // status of loopnest has been changed. We should use and only
+      // use LPMUpdater for this purpose.
+      if (!IsLoopNestPtrValid || U.isLoopNestChanged()) {
+        while (auto *ParentLoop = OuterMostLoop->getParentLoop())
+          OuterMostLoop = ParentLoop;
+        LoopNestPtr = LoopNest::getLoopNest(*OuterMostLoop, AR.SE);
         IsLoopNestPtrValid = true;
+        U.markLoopNestChanged(false);
       }

       PassPA = runSinglePass(*LoopNestPtr, Pass, AM, AR, U, PI);
     }

     // `PassPA` is `None` means that the before-pass callbacks in
     // `PassInstrumentation` return false. The pass does not run in this case,
     // so we can skip the following procedure.
     if (!PassPA)
       continue;

     // If the loop was deleted, abort the run and return to the outer walk.
     if (U.skipCurrentLoop()) {
       PA.intersect(std::move(*PassPA));
       break;
     }

     // Update the analysis manager as each pass runs and potentially
     // invalidates analyses.
-    AM.invalidate(L, *PassPA);
+    AM.invalidate(IsLoopNestPass[I] ? *OuterMostLoop : L, *PassPA);

     // Finally, we intersect the final preserved analyses to compute the
     // aggregate preserved set for this pass manager.
     PA.intersect(std::move(*PassPA));

     // Check if the current pass preserved the loop-nest object or not.
     IsLoopNestPtrValid &= PassPA->getChecker<LoopNestAnalysis>().preserved();

     // After running the loop pass, the parent loop might change and we need to
     // notify the updater, otherwise U.ParentL might gets outdated and triggers
     // assertion failures in addSiblingLoops and addChildLoops.
-    U.setParentLoop(L.getParentLoop());
+    U.setParentLoop((IsLoopNestPass[I] ? *OuterMostLoop : L).getParentLoop());
   }
   return PA;
 }

 // Run all loop passes on loop \p L. Loop-nest passes don't run either because
 // \p L is not a top-level one or simply because there are no loop-nest passes
 // in the pass manager at all.
 PreservedAnalyses

llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll

This file was added.

				; REQUIRES: asserts
				; RUN: opt < %s -passes="loop(loop-interchange,loop-interchange)" -cache-line-size=8 -verify-dom-info -verify-loop-info \
				; RUN: -debug-only=loop-interchange 2>&1 | FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				@g_75 = external global i32, align 1
				@g_78 = external global [6 x ptr], align 1

				; Loop interchange as a loopnest pass should always construct the loop nest from
				; the outermost loop. This test case runs loop interchange twice. In the loop pass
				; manager, it might occur that after the first loop interchange transformation
				; the original outermost loop becomes a inner loop hence the loop nest constructed
				; afterwards for the second loop interchange pass turns out to be a loop list of size
				; 2 and is not valid. This causes functional issues.
				;
				; Make sure we always construct the valid and correct loop nest at the beginning
				; of execution of a loopnest pass.

				; CHECK: Processing LoopList of size = 3
				; CHECK: Processing LoopList of size = 3
				define void @loopnest_01() {
				entry:
				br label %for.cond5.preheader.i.i.i

				for.cond5.preheader.i.i.i: ; preds = %for.end16.i.i.i, %entry
				%storemerge11.i.i.i = phi i32 [ 4, %entry ], [ %sub18.i.i.i, %for.end16.i.i.i ]
				br label %for.cond8.preheader.i.i.i

				for.cond8.preheader.i.i.i: ; preds = %for.inc14.i.i.i, %for.cond5.preheader.i.i.i
				%l_105.18.i.i.i = phi i16 [ 0, %for.cond5.preheader.i.i.i ], [ %add15.i.i.i, %for.inc14.i.i.i ]
				br label %for.body10.i.i.i

				for.body10.i.i.i: ; preds = %for.body10.i.i.i, %for.cond8.preheader.i.i.i
				%storemerge56.i.i.i = phi i16 [ 5, %for.cond8.preheader.i.i.i ], [ %sub.i.i.i, %for.body10.i.i.i ]
				%arrayidx.i.i.i = getelementptr [6 x ptr], ptr @g_78, i16 0, i16 %storemerge56.i.i.i
				store ptr @g_75, ptr %arrayidx.i.i.i, align 1
				%sub.i.i.i = add nsw i16 %storemerge56.i.i.i, -1
				br i1 true, label %for.inc14.i.i.i, label %for.body10.i.i.i

				for.inc14.i.i.i: ; preds = %for.body10.i.i.i
				%add15.i.i.i = add nuw nsw i16 %l_105.18.i.i.i, 1
				%exitcond.not.i.i.i = icmp eq i16 %add15.i.i.i, 6
				br i1 %exitcond.not.i.i.i, label %for.end16.i.i.i, label %for.cond8.preheader.i.i.i

				for.end16.i.i.i: ; preds = %for.inc14.i.i.i
				%sub18.i.i.i = add nsw i32 %storemerge11.i.i.i, -1
				%cmp.i10.not.i.i = icmp eq i32 %storemerge11.i.i.i, 0
				br i1 %cmp.i10.not.i.i, label %func_4.exit.i, label %for.cond5.preheader.i.i.i

				func_4.exit.i: ; preds = %for.end16.i.i.i
				unreachable
				}

Diff 462077
