This is an archive of the discontinued LLVM Phabricator instance.

Differential D104516

[mlir] Add a ThreadPool to MLIRContext and refactor MLIR threading usage
ClosedPublic

Authored by rriddle on Jun 18 2021, 3:30 AM.

Details

Reviewers

mehdi_amini
lattner
stellaraccident
aartbik
nicolasvasilache

Commits

rG6569cf2a44bf: [mlir] Add a ThreadPool to MLIRContext and refactor MLIR threading usage

Summary

This revision refactors the usage of multithreaded utilities in MLIR to use a common
thread pool within the MLIR context, in addition to a new utility that makes writing
multi-threaded code in MLIR less error prone. Using a unified thread pool brings about
several advantages:

Better thread usage and more control

We currently use the static llvm threading utilities, which do not allow multiple
levels of asynchronous scheduling (even if there are open threads). This is due to
how the current TaskGroup structure works, which only allows one truly multithreaded
instance at a time. By having our own ThreadPool we gain more control and flexibility
over our job/thread scheduling, and in a followup can enable threading more parts of
the compiler.

The static nature of TaskGroup causes issues in certain configurations

Due to the static nature of TaskGroup, there have been quite a few problems related to
destruction that have caused several downstream projects to disable threading. See
D104207 for discussion on some related fallout. By having a ThreadPool scoped to
the context, we don't have to worry about destruction and can ensure that any
additional MLIR thread usage ends when the context is destroyed.
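
The lifetime argument in the summary can be illustrated with a small standalone sketch. This is not the LLVM `ThreadPool` or `MLIRContext` implementation; `MiniPool` and `Context` are hypothetical names used only to show how a pool owned by a context object guarantees that its worker threads are joined when the context goes away.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical minimal pool: workers drain a task queue until destruction.
class MiniPool {
public:
  explicit MiniPool(unsigned numThreads) {
    assert(numThreads > 0 && "pool needs at least one worker");
    for (unsigned i = 0; i < numThreads; ++i)
      workers.emplace_back([this] { run(); });
  }

  // Lets the workers finish the queued tasks, then joins every worker.
  ~MiniPool() {
    {
      std::lock_guard<std::mutex> lock(mutex);
      stopping = true;
    }
    cv.notify_all();
    for (std::thread &t : workers)
      t.join();
  }

  void async(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex);
      tasks.push(std::move(task));
    }
    cv.notify_one();
  }

  // True if the calling thread is one of this pool's workers.
  bool isWorkerThread() const {
    for (const std::thread &t : workers)
      if (t.get_id() == std::this_thread::get_id())
        return true;
    return false;
  }

private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex);
        cv.wait(lock, [this] { return stopping || !tasks.empty(); });
        if (tasks.empty())
          return; // Only reachable once stopping is set.
        task = std::move(tasks.front());
        tasks.pop();
      }
      task();
    }
  }

  std::mutex mutex;
  std::condition_variable cv;
  std::queue<std::function<void()>> tasks;
  std::vector<std::thread> workers;
  bool stopping = false;
};

// Hypothetical context that owns its pool, so no thread outlives the context.
class Context {
public:
  explicit Context(unsigned numThreads) : pool(new MiniPool(numThreads)) {}
  MiniPool &getThreadPool() { return *pool; }

private:
  std::unique_ptr<MiniPool> pool;
};
```

Because `Context` owns the pool, destroying the context joins every worker, so no static destruction order is involved.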

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rriddle created this revision. Jun 18 2021, 3:30 AM

Herald added a reviewer: aartbik. Jun 18 2021, 3:31 AM

Herald added subscribers: dcaballe, cota, teijeong and 19 others.

rriddle requested review of this revision. Jun 18 2021, 3:31 AM

Herald added a reviewer: nicolasvasilache. Jun 18 2021, 3:31 AM

Herald added projects: Restricted Project, Restricted Project.

Herald added subscribers: llvm-commits, stephenneuendorffer, nicolasvasilache.

rriddle mentioned this in D104207: [Verifier] Parallelize verification and dom checking. NFC. Jun 18 2021, 3:33 AM

bondhugula added a subscriber: bondhugula. Jun 18 2021, 3:46 AM

bondhugula added inline comments.

mlir/include/mlir/IR/ThreadingUtilities.h
30 ↗	(On Diff #352961)	synchronously or sequentially?

bondhugula added inline comments. Jun 18 2021, 4:00 AM

mlir/lib/Pass/Pass.cpp
590–615	std::vector<std::atomic<bool>> activePMs(asyncExecutors.size(), std::atomic<bool>{false});

Harbormaster completed remote builds in B109895: Diff 352961. Jun 18 2021, 5:17 PM

Nice, thanks for tackling this River!

llvm/lib/Support/ThreadPool.cpp
77	Yuck but ok. This is obviously your way of trying to force me to rewrite this stuff ;-)
mlir/include/mlir/IR/ThreadingUtilities.h
1 ↗	(On Diff #352961)	I'd recommend calling this just "Threading.h". This also points out a layering issue: MLIRContext is the right place for us to scope this (so I'm ok with the patch), but this really has nothing to do with the IR! One solution would be to split MLIRContext, give it a subclass like "MLIRNonIRContext" (better names welcome) that contain the thread pool, diagnostics utility and other stuff that would be generic, and put it in mlir/Support. MLIRContext would derive from it, and would remain the currency type, but generic utilities like this would take "MLIRNonIRContext". Anyway, discussion for another day, just saying that general parallelism stuff would be better in Support.
48 ↗	(On Diff #352961)	Please merge this into the previous "if", so the compiler doesn't have to undupe the two std::for_each calls.
85 ↗	(On Diff #352961)	Could you also add a parallelForEachN while you're here? circt uses it in a couple places.
mlir/lib/Pass/Pass.cpp
590–615	Right, atomic needs to be explicitly initialized to work correctly
mlir/test/Dialect/Affine/SuperVectorize/compose_maps.mlir
1	Why do these need to disable threading?

lattner accepted this revision. Jun 19 2021, 8:58 AM

This revision is now accepted and ready to land. Jun 19 2021, 8:58 AM

rriddle marked an inline comment as done. Jun 19 2021, 6:14 PM

rriddle added inline comments.

llvm/lib/Support/ThreadPool.cpp
77	Yes, please! I don't have any time to devote to it for the next month or so, but I would really love a ThreadPool that supported the things that we need. This function is a gross band-aid over the current implementation that allows for us to use it as-is for now.
mlir/lib/Pass/Pass.cpp
590–615	Thanks Uday! My 4am brain couldn't remember how to initialize it.
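
A side note on the `activePMs` initialization discussed in these inline comments: `std::atomic` is neither copyable nor movable, so the `std::vector` fill constructor that copies a prototype element cannot be instantiated for it in standard C++. One portable alternative is to size the vector first and then store an explicit value into each element. The sketch below is illustrative only; `makeFlags` is a hypothetical helper and not code from this revision.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build `n` atomic flags that all start out false.
std::vector<std::atomic<bool>> makeFlags(std::size_t n) {
  // Sizing the vector only default-constructs the elements; no copies needed.
  std::vector<std::atomic<bool>> flags(n);
  // Store an explicit value so the result does not depend on how default
  // construction initializes the atomics.
  for (std::atomic<bool> &flag : flags)
    flag.store(false, std::memory_order_relaxed);
  return flags;
}
```

Returning the vector by value is fine here, since moving a `std::vector` does not move its individual elements.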

did you consider shoving the existing threadpool stuff into a ManagedStatic? That is how we typically handle problems like this, they are destroyed on llvm_shutdown instead of at global deinit.

Sweet!

mlir/include/mlir/IR/ThreadingUtilities.h
38 ↗	(On Diff #352961)

rriddle marked 8 inline comments as done. Jun 21 2021, 6:28 PM

rriddle added inline comments.

mlir/test/Dialect/Affine/SuperVectorize/compose_maps.mlir
1	Several existing tests depended on the previous threading behavior to work, mostly getting lucky that the `errs()` output was laid out in the way they expect. Switched to no longer disable threading, but adding split file markers to separate tests.

update

In D104516#2830695, @lattner wrote:

did you consider shoving the existing threadpool stuff into a ManagedStatic? That is how we typically handle problems like this, they are destroyed on llvm_shutdown instead of at global deinit.

It was mentioned in the other revision I believe, but ManagedStatic is what the LLVM threading utilities currently use. Moving to the context removed all of the environment related shutdowns that I was seeing.

mlir/include/mlir/IR/ThreadingUtilities.h
1 ↗	(On Diff #352961)	I think the major sticking point is what to do about `Location`, which is used pervasively by Diagnostics. Most of the handlers interact with the location types, so those I suppose could stay in IR, but Diagnostic::getLocation would have to hold/return something like `RawLocation`? Which would have to store something opaque. I think it's doable, but the layering would have to be carefully done such that interacting with Diagnostics doesn't become ugly.

Harbormaster completed remote builds in B110316: Diff 353522. Jun 21 2021, 7:08 PM

It was mentioned in the other revision I believe, but ManagedStatic is what the LLVM threading utilities currently use. Moving to the context removed all of the environment related shutdowns that I was seeing.

No, it isn't. This is the problem:

Executor *Executor::getDefaultExecutor() {
// ...
  static ManagedStatic<ThreadPoolExecutor, ThreadPoolExecutor::Creator,
                       ThreadPoolExecutor::Deleter>
      ManagedExec;
  static std::unique_ptr<ThreadPoolExecutor> Exec(&(*ManagedExec));
  return Exec.get();
}

Note that "Exec" is not a managed static. The "..." elided is some justification for this design that seems really dubious to me (but I admit I haven't mind melded with it!)

mlir/test/Dialect/Affine/SuperVectorize/compose_maps.mlir
1	Ok for this patch, but gross. Plz file a bug about these. It is a bad antipattern to emit to `errs()`, these tests should be emitting errors so they are pinned to a location correctly (similar to the dependence analysis tests)

I mentioned this in the other phab thread: while I'm totally ok with this patch as a way to unblock progress, it really isn't the right thing. It doesn't make sense for MLIR to have its own threadpool tied to its context's. CPU resources aren't library specific, they are global.

In D104516#2832247, @lattner wrote:
It was mentioned in the other revision I believe, but ManagedStatic is what the LLVM threading utilities currently use. Moving to the context removed all of the environment related shutdowns that I was seeing.

No, it isn't. This is the problem:
Executor *Executor::getDefaultExecutor() {
// ...
  static ManagedStatic<ThreadPoolExecutor, ThreadPoolExecutor::Creator,
                       ThreadPoolExecutor::Deleter>
      ManagedExec;
  static std::unique_ptr<ThreadPoolExecutor> Exec(&(*ManagedExec));
  return Exec.get();
}
Note that "Exec" is not a managed static. The "..." elided is some justification for this design that seems really dubious to me (but I admit I haven't mind melded with it!)

Right right, I'm only vaguely recalling watching the original review for this (D70447). Before that we effectively couldn't use the Parallel methods on Windows.

I mentioned this in the other phab thread: while I'm totally ok with this patch as a way to unblock progress, it really isn't the right thing. It doesn't make sense for MLIR to have its own threadpool tied to its context's. CPU resources aren't library specific, they are global.

I'm not completely against a static/global thread pool, but I would claim that a static/global thread pool is orthogonal to CPU resources being library specific. A lot of users today with the current LLVM global thread pool don't share threads with things non-LLVM (whether for good reasons or bad).
We've encountered a lot of "interesting" interactions surrounding threading and certain environments, and I want to say a lot of those are due to the way the current code is structured (which can be a bit obscure at times). The weird threading environment interactions have been extremely taxing and difficult to debug, and often end with "I have no idea what's happening, guess I'll just disable multi-threading". I would strongly prefer we stay non-static until a proper solution can prove itself to be reliable enough. I don't know exactly what you've had in mind for revamping the threading libraries, but I have a wish list if you are interested.

Ok, I'm more interested in having a parallel verifier than waiting for perfection here. We can all agree the existing threadpool stuff is "suboptimal" and needs to be replaced. This seems like a fine step in the immediate term.

This revision was landed with ongoing or failed builds. Jun 22 2021, 6:33 PM

Closed by commit rG6569cf2a44bf: [mlir] Add a ThreadPool to MLIRContext and refactor MLIR threading usage (authored by rriddle).

This revision was automatically updated to reflect the committed changes.

rriddle added a commit: rG6569cf2a44bf: [mlir] Add a ThreadPool to MLIRContext and refactor MLIR threading usage.

rriddle mentioned this in rG0246dd30046a: [mlir] Fix slicing-utils.mlir test after D104516. Jun 22 2021, 7:59 PM

rriddle mentioned this in rG84bd07aff901: [mlir] Fix GCC5 build after D104516. Jun 22 2021, 8:20 PM

The MLIR bot, https://lab.llvm.org/buildbot/#/builders/88/builds/14468, has been red for over 30 hours since this patch landed. Please provide a fix or pull the related patches to bring the bot back to green.

In D104516#2838885, @lei wrote:

The MLIR bot, https://lab.llvm.org/buildbot/#/builders/88/builds/14468, has been red for over 30 hours since this patch landed. Please provide a fix or pull the related patches to bring the bot back to green.

Are those related to this patch? Those all look like orc failures.

In D104516#2838887, @rriddle wrote:

In D104516#2838885, @lei wrote:

The MLIR bot, https://lab.llvm.org/buildbot/#/builders/88/builds/14468, has been red for over 30 hours since this patch landed. Please provide a fix or pull the related patches to bring the bot back to green.

Are those related to this patch? Those all look like orc failures.

Do you have a way to repro the issue?

In D104516#2838887, @rriddle wrote:

In D104516#2838885, @lei wrote:

The MLIR bot, https://lab.llvm.org/buildbot/#/builders/88/builds/14468, has been red for over 30 hours since this patch landed. Please provide a fix or pull the related patches to bring the bot back to green.

Are those related to this patch? Those all look like orc failures.

I think pulling in additional code has caused a relocation overflow in the runtime linker. We might have to switch to the large code model for ORC on PPC.

In D104516#2838941, @nemanjai wrote:

In D104516#2838887, @rriddle wrote:

In D104516#2838885, @lei wrote:

The MLIR bot, https://lab.llvm.org/buildbot/#/builders/88/builds/14468, has been red for over 30 hours since this patch landed. Please provide a fix or pull the related patches to bring the bot back to green.

Are those related to this patch? Those all look like orc failures.

I think pulling in additional code has caused a relocation overflow in the runtime linker. We might have to switch to the large code model for ORC on PPC.

Thanks for looking! That definitely looks like what is happening. Do you know how to adjust that? (I have very little experience/knowledge of ORC)

We can disable the JIT tests on PPC in the meantime to get the bot back green?

In D104516#2838982, @mehdi_amini wrote:

We can disable the JIT tests on PPC in the meantime to get the bot back green?

I think that is a good idea. I'll look into how to set the code model for ORC. If you or someone else knows how to disable these (presumably not just XFAIL for each individual test), I would appreciate it.

I tried to disable them in 652f4b5140e2, hopefully I got the logic right! The bot is backlogged so unfortunately it'll take time to figure.

By the way have you considered enabling CCACHE on the bot? This dramatically helped this bot: https://lab.llvm.org/buildbot/#/builders/61

In D104516#2838979, @rriddle wrote:

In D104516#2838941, @nemanjai wrote:

In D104516#2838887, @rriddle wrote:

In D104516#2838885, @lei wrote:

The MLIR bot, https://lab.llvm.org/buildbot/#/builders/88/builds/14468, has been red for over 30 hours since this patch landed. Please provide a fix or pull the related patches to bring the bot back to green.

Are those related to this patch? Those all look like orc failures.

I think pulling in additional code has caused a relocation overflow in the runtime linker. We might have to switch to the large code model for ORC on PPC.

Thanks for looking! That definitely looks like what is happening. Do you know how to adjust that? (I have very little experience/knowledge of ORC)

@lhames might be able to point you at the right spot!

In D104516#2839035, @mehdi_amini wrote:

I tried to disable them in 652f4b5140e2, hopefully I got the logic right! The bot is backlogged so unfortunately it'll take time to figure.

By the way have you considered enabling CCACHE on the bot? This dramatically helped this bot: https://lab.llvm.org/buildbot/#/builders/61

We might consider using ccache on the bot but I am not sure that would be all that useful. None of the builds on this bot seem to take any longer than 10 minutes. I am not really sure what has caused the backlog on it now - maybe some issue with master or the worker itself? This builder certainly does build every single commit so that may be something we need to change in the future. In any case, we'll get the queue cleaned up to unclog it.

@mehdi_amini The PPC mlir bot still shows 6 overflow failures after your patch. Can you please help to disable them for PPC?

Failed Tests (6):
  MLIR-Unit :: ExecutionEngine/./MLIRExecutionEngineTests/MLIRExecutionEngine.AddInteger
  MLIR-Unit :: ExecutionEngine/./MLIRExecutionEngineTests/MLIRExecutionEngine.SubtractFloat
  MLIR-Unit :: ExecutionEngine/./MLIRExecutionEngineTests/NativeMemRefJit.BasicMemref
  MLIR-Unit :: ExecutionEngine/./MLIRExecutionEngineTests/NativeMemRefJit.JITCallback
  MLIR-Unit :: ExecutionEngine/./MLIRExecutionEngineTests/NativeMemRefJit.RankOneMemref
  MLIR-Unit :: ExecutionEngine/./MLIRExecutionEngineTests/NativeMemRefJit.ZeroRankMemref

In D104516#2839035, @mehdi_amini wrote:

I tried to disable them in 652f4b5140e2, hopefully I got the logic right! The bot is backlogged so unfortunately it'll take time to figure.

By the way have you considered enabling CCACHE on the bot? This dramatically helped this bot: https://lab.llvm.org/buildbot/#/builders/61

Actually never mind 🙂 I think I've isolated the issue on the PPC bot.

In D104516#2845668, @lei wrote:

Actually never mind 🙂 I think I've isolated the issue on the PPC bot.

Ah, it seems I only managed to disable the lit tests but not the unit tests. Do you have a patch for this, or another more direct fix?

I noticed that if I export LD_LIBRARY_PATH=/usr/lib64 then all tests pass. I am thinking of reverting your previous patch to re-enable those lit tests and adding this export for the mlir bot for now to get the bot going again, while continuing to investigate what the difference is between /lib64 and /usr/lib64 on our PPC machine.

LGTM!

@lhames might be able to point you at the right spot!

Late to the party here, sorry!

Relocation R_PPC64_REL32 overflow -- This is a PC-relative relocation overflow. Something is being allocated out of range of the fixup address. If you're not using the large code model already then changing to the large code model for your JIT'd code may fix this, but I believe there are some known limitations of the RuntimeDyld PPC backend that mean that out-of-range issues can show up even in the large code model.

I recently landed changes to JITLink's ELF support that make it easier to write new JITLink backends -- I think the ideal fix here would be to implement a PPC64 ELF backend for JITLink and eliminate those limitations altogether.
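
To make the overflow condition concrete: a 32-bit PC-relative fixup stores the signed difference between the target (plus addend) and the fixup location, so the link fails when that difference does not fit in 32 bits. The sketch below is a generic range check under that assumption; `fitsInRel32` is a hypothetical helper and is not taken from RuntimeDyld or JITLink.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if a 32-bit PC-relative fixup located at
// `fixupAddr` can encode a reference to `targetAddr + addend`.
bool fitsInRel32(std::uint64_t fixupAddr, std::uint64_t targetAddr,
                 std::int64_t addend = 0) {
  // Value to encode: target + addend - fixup (computed with wrapping
  // unsigned arithmetic, then reinterpreted as a signed distance).
  std::int64_t delta = static_cast<std::int64_t>(
      targetAddr + static_cast<std::uint64_t>(addend) - fixupAddr);
  return delta >= std::numeric_limits<std::int32_t>::min() &&
         delta <= std::numeric_limits<std::int32_t>::max();
}
```

Under this check, JIT'd code and the symbols it references must land within roughly 2 GiB of each other for such a fixup to succeed.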

In D104516#2851594, @lhames wrote:

@lhames might be able to point you at the right spot!

Late to the party here, sorry!

Relocation R_PPC64_REL32 overflow -- This is a PC-relative relocation overflow. Something is being allocated out of range of the fixup address. If you're not using the large code model already then changing to the large code model for your JIT'd code may fix this, but I believe there are some known limitations of the RuntimeDyld PPC backend that mean that out-of-range issues can show up even in the large code model.

I recently landed changes to JITLink's ELF support that make it easier to write new JITLink backends -- I think the ideal fix here would be to implement a PPC64 ELF backend for JITLink and eliminate those limitations altogether.

This has come up in other contexts as well and it certainly seems like a worthwhile project for us to undertake as soon as resources are available. Would you be able to provide a link to some documentation to guide a developer that only has experience with the back end and no JIT experience?

So not an issue with the current implementation, but a note for any future improvements: the changes in https://reviews.llvm.org/D70447 that made parallel work on Windows were, I believe, necessary (I did not catch the original review, but I've recently run into some issues). Not only that, I believe something has recently (in the last few months) impacted the existing functionality such that lld now regularly exhibits the issues described in https://reviews.llvm.org/D70447, so I would caution using that until they've been resolved.

Herald added subscribers: wrengr, Chia-hungDuan. Sep 2 2021, 11:23 AM

In D104516#2861574, @nemanjai wrote:

In D104516#2851594, @lhames wrote:

I recently landed changes to JITLink's ELF support that make it easier to write new JITLink backends -- I think the ideal fix here would be to implement a PPC64 ELF backend for JITLink and eliminate those limitations altogether.

This has come up in other contexts as well and it certainly seems like a worthwhile project for us to undertake as soon as resources are available. Would you be able to provide a link to some documentation to guide a developer that only has experience with the back end and no JIT experience?

@nemanjai Two of the best resources would be https://llvm.org/docs/JITLink.html, and the recently posted review for a minimal ELF/AArch64 backend: https://reviews.llvm.org/D108986.

Revision Contents

Path	Size
llvm/include/llvm/Support/ThreadPool.h	3 lines
llvm/lib/Support/ThreadPool.cpp	8 lines
mlir/include/mlir/IR/MLIRContext.h	10 lines
mlir/include/mlir/IR/Threading.h	153 lines
mlir/lib/IR/MLIRContext.cpp	10 lines
mlir/lib/IR/Verifier.cpp	36 lines
mlir/lib/Pass/Pass.cpp	74 lines
mlir/lib/Transforms/Inliner.cpp	64 lines
mlir/test/Dialect/Affine/SuperVectorize/compose_maps.mlir	32 lines
mlir/test/Dialect/Affine/slicing-utils.mlir	14 lines
mlir/test/IR/diagnostic-handler-filter.mlir	4 lines
mlir/test/Pass/pass-timing.mlir	30 lines
mlir/test/Pass/pipeline-parsing.mlir	4 lines

Diff 353838

llvm/include/llvm/Support/ThreadPool.h

[64 lines not shown; context: public:]
   }

   /// Blocking wait for all the threads to complete and the queue to be empty.
   /// It is an error to try to add new tasks while blocking on this call.
   void wait();

   unsigned getThreadCount() const { return ThreadCount; }

+  /// Returns true if the current thread is a worker thread of this thread pool.
+  bool isWorkerThread() const;
+
 private:
   bool workCompletedUnlocked() { return !ActiveThreads && Tasks.empty(); }

   /// Asynchronous submission of a task to the pool. The returned future can be
   /// used to wait for the task to finish and is non-blocking on destruction.
   std::shared_future<void> asyncImpl(TaskTy F);

   /// Threads in flight
[25 lines not shown]

llvm/lib/Support/ThreadPool.cpp

[66 lines not shown]
 }

 void ThreadPool::wait() {
   // Wait for all threads to complete and the queue to be empty
   std::unique_lock<std::mutex> LockGuard(QueueLock);
   CompletionCondition.wait(LockGuard, [&] { return workCompletedUnlocked(); });
 }

+bool ThreadPool::isWorkerThread() const {
+  std::thread::id CurrentThreadId = std::this_thread::get_id();
+  for (const std::thread &Thread : Threads)
+    if (CurrentThreadId == Thread.get_id())
+      return true;
+  return false;
+}
+
 std::shared_future<void> ThreadPool::asyncImpl(TaskTy Task) {
   /// Wrap the Task in a packaged_task to return a future object.
   PackagedTaskTy PackagedTask(std::move(Task));
   auto Future = PackagedTask.get_future();
   {
     // Lock the queue and push the new task
     std::unique_lock<std::mutex> LockGuard(QueueLock);

[53 lines not shown]

mlir/include/mlir/IR/MLIRContext.h

[9 lines not shown]
 #define MLIR_IR_MLIRCONTEXT_H

 #include "mlir/Support/LLVM.h"
 #include "mlir/Support/TypeID.h"
 #include <functional>
 #include <memory>
 #include <vector>

+namespace llvm {
+class ThreadPool;
+} // end namespace llvm
+
 namespace mlir {
 class AbstractOperation;
 class DebugActionManager;
 class DiagnosticEngine;
 class Dialect;
 class DialectRegistry;
 class InFlightDiagnostic;
 class Location;
[83 lines not shown; context: public:]
   bool isMultithreadingEnabled();

   /// Set the flag specifying if multi-threading is disabled by the context.
   void disableMultithreading(bool disable = true);
   void enableMultithreading(bool enable = true) {
     disableMultithreading(!enable);
   }

+  /// Return the thread pool owned by this context. This method requires that
+  /// multithreading be enabled within the context, and should generally not be
+  /// used directly. Users should instead prefer the threading utilities within
+  /// Threading.h.
+  llvm::ThreadPool &getThreadPool();
+
   /// Return true if we should attach the operation to diagnostics emitted via
   /// Operation::emit.
   bool shouldPrintOpOnDiagnostic();

   /// Set the flag specifying if we should attach the operation to diagnostics
   /// emitted via Operation::emit.
   void printOpOnDiagnostic(bool enable);

[78 lines not shown]

mlir/include/mlir/IR/Threading.h

This file was added.

//===- Threading.h - MLIR Threading Utilities -------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file defines various utilities for multithreaded processing within MLIR.
// These utilities automatically handle many of the necessary threading
// conditions, such as properly ordering diagnostics, observing if threading is
// disabled, etc. These utilities should be used over other threading utilities
// whenever feasible.
//
//===----------------------------------------------------------------------===//

#ifndef MLIR_IR_THREADING_H
#define MLIR_IR_THREADING_H

#include "mlir/IR/Diagnostics.h"
#include "llvm/ADT/Sequence.h"
#include "llvm/Support/ThreadPool.h"
#include <atomic>

namespace mlir {

/// Invoke the given function on the elements between [begin, end)
/// asynchronously. If the given function returns a failure when processing any
/// of the elements, execution is stopped and a failure is returned from this
/// function. This means that in the case of failure, not all elements of the
/// range will be processed. Diagnostics emitted during processing are ordered
/// relative to the element's position within [begin, end). If the provided
/// context does not have multi-threading enabled, this function always
/// processes elements sequentially.
template <typename IteratorT, typename FuncT>
LogicalResult failableParallelForEach(MLIRContext *context, IteratorT begin,
                                      IteratorT end, FuncT &&func) {
  unsigned numElements = static_cast<unsigned>(std::distance(begin, end));
  if (numElements == 0)
    return success();

  // If multithreading is disabled or there is a small number of elements,
  // process the elements directly on this thread.
  // FIXME: ThreadPool should allow work stealing to avoid deadlocks when
  // scheduling work within a worker thread.
  if (!context->isMultithreadingEnabled() || numElements <= 1 ||
      context->getThreadPool().isWorkerThread()) {
    for (; begin != end; ++begin)
      if (failed(func(*begin)))
        return failure();
    return success();
  }

  // Build a wrapper processing function that properly initializes a parallel
  // diagnostic handler.
  ParallelDiagnosticHandler handler(context);
  std::atomic<unsigned> curIndex(0);
  std::atomic<bool> processingFailed(false);
  auto processFn = [&] {
    while (!processingFailed) {
      unsigned index = curIndex++;
      if (index >= numElements)
        break;
      handler.setOrderIDForThread(index);
      if (failed(func(*std::next(begin, index))))
        processingFailed = true;
      handler.eraseOrderIDForThread();
    }
  };

  // Otherwise, process the elements in parallel.
  llvm::ThreadPool &threadPool = context->getThreadPool();
  size_t numActions = std::min(numElements, threadPool.getThreadCount());
  SmallVector<std::shared_future<void>> threadFutures;
  threadFutures.reserve(numActions - 1);
  for (unsigned i = 1; i < numActions; ++i)
    threadFutures.emplace_back(threadPool.async(processFn));
  processFn();

  // Wait for all of the threads to finish.
  for (std::shared_future<void> &future : threadFutures)
    future.wait();
  return failure(processingFailed);
}

				/// Invoke the given function on the elements in the provided range
				/// asynchronously. If the given function returns a failure when processing any
				/// of the elements, execution is stopped and a failure is returned from this
				/// function. This means that in the case of failure, not all elements of the
				/// range will be processed. Diagnostics emitted during processing are ordered
				/// relative to the element's position within the range. If the provided context
				/// does not have multi-threading enabled, this function always processes
				/// elements sequentially.
				template <typename RangeT, typename FuncT>
				LogicalResult failableParallelForEach(MLIRContext *context, RangeT &&range,
				FuncT &&func) {
				return failableParallelForEach(context, std::begin(range), std::end(range),
				std::forward<FuncT>(func));
				}

				/// Invoke the given function on the elements between [begin, end)
				/// asynchronously. If the given function returns a failure when processing any
				/// of the elements, execution is stopped and a failure is returned from this
				/// function. This means that in the case of failure, not all elements of the
				/// range will be processed. Diagnostics emitted during processing are ordered
				/// relative to the element's position within [begin, end). If the provided
				/// context does not have multi-threading enabled, this function always
				/// processes elements sequentially.
				template <typename FuncT>
				LogicalResult failableParallelForEachN(MLIRContext *context, size_t begin,
				size_t end, FuncT &&func) {
				return failableParallelForEach(context, llvm::seq(begin, end),
				std::forward<FuncT>(func));
				}

				/// Invoke the given function on the elements between [begin, end)
				/// asynchronously. Diagnostics emitted during processing are ordered relative
				/// to the element's position within [begin, end). If the provided context does
				/// not have multi-threading enabled, this function always processes elements
				/// sequentially.
				template <typename IteratorT, typename FuncT>
				void parallelForEach(MLIRContext *context, IteratorT begin, IteratorT end,
				FuncT &&func) {
				(void)failableParallelForEach(context, begin, end, [&](auto &&value) {
				return func(std::forward<decltype(value)>(value)), success();
				});
				}

				/// Invoke the given function on the elements in the provided range
				/// asynchronously. Diagnostics emitted during processing are ordered relative
				/// to the element's position within the range. If the provided context does not
				/// have multi-threading enabled, this function always processes elements
				/// sequentially.
				template <typename RangeT, typename FuncT>
				void parallelForEach(MLIRContext *context, RangeT &&range, FuncT &&func) {
				parallelForEach(context, std::begin(range), std::end(range),
				std::forward<FuncT>(func));
				}

				/// Invoke the given function on the elements between [begin, end)
				/// asynchronously. Diagnostics emitted during processing are ordered relative
				/// to the element's position within [begin, end). If the provided context does
				/// not have multi-threading enabled, this function always processes elements
				/// sequentially.
				template <typename FuncT>
				void parallelForEachN(MLIRContext *context, size_t begin, size_t end,
				FuncT &&func) {
				parallelForEach(context, llvm::seq(begin, end), std::forward<FuncT>(func));
				}

				} // end namespace mlir

				#endif // MLIR_IR_THREADING_H
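
The tail of `failableParallelForEach` shown above launches `numActions - 1` pool tasks, runs `processFn` on the calling thread, and then waits on the futures. The following is a minimal standalone sketch of that scheduling shape, assuming a shared atomic index for handing out elements (the start of the real function is not visible in this listing). The name `failableParallelForEachSketch`, the `maxThreads` parameter, and the use of `std::thread` in place of `llvm::ThreadPool` are illustrative assumptions, not the MLIR API.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the MLIR utility: runs `func` on the elements of
// `items` using up to `maxThreads` threads. `func` returns true on success.
// Returns true only if no call reported a failure.
template <typename T, typename FuncT>
bool failableParallelForEachSketch(std::vector<T> &items, FuncT func,
                                   unsigned maxThreads = 4) {
  const std::size_t numElements = items.size();
  if (numElements == 0)
    return true;

  std::atomic<bool> processingFailed(false);
  std::atomic<std::size_t> curIndex(0);

  // Each worker repeatedly claims the next unprocessed index and stops once
  // the range is exhausted or any worker has reported a failure.
  auto processFn = [&] {
    while (!processingFailed) {
      std::size_t index = curIndex++;
      if (index >= numElements)
        break;
      if (!func(items[index]))
        processingFailed = true;
    }
  };

  // Never start more workers than there are elements.
  const std::size_t numActions =
      std::min<std::size_t>(numElements, std::max(1u, maxThreads));
  assert(numActions >= 1 && "expected at least one worker");

  // Launch numActions - 1 helpers; the calling thread acts as the last worker.
  std::vector<std::thread> helpers;
  helpers.reserve(numActions - 1);
  for (std::size_t i = 1; i < numActions; ++i)
    helpers.emplace_back(processFn);
  processFn();

  // Wait for all of the helpers to finish.
  for (std::thread &helper : helpers)
    helper.join();
  return !processingFailed;
}
```

Because a failure only stops workers from claiming further indices, elements already in flight still complete. This matches the documented behaviour that not all elements are processed when a failure occurs.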

mlir/lib/IR/MLIRContext.cpp

[28 lines not shown]
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringSet.h"		#include "llvm/ADT/StringSet.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/Support/Allocator.h"		#include "llvm/Support/Allocator.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/RWMutex.h"		#include "llvm/Support/RWMutex.h"
		#include "llvm/Support/ThreadPool.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <memory>		#include <memory>

#define DEBUG_TYPE "mlircontext"		#define DEBUG_TYPE "mlircontext"

using namespace mlir;		using namespace mlir;
using namespace mlir::detail;		using namespace mlir::detail;

[210 lines not shown]	#endif

/// If the current stack trace should be attached when emitting diagnostics.		/// If the current stack trace should be attached when emitting diagnostics.
bool printStackTraceOnDiagnostic = false;		bool printStackTraceOnDiagnostic = false;

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Other		// Other
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

		/// The thread pool to use when processing MLIR tasks in parallel.
		llvm::ThreadPool threadPool;

/// This is a list of dialects that are created referring to this context.		/// This is a list of dialects that are created referring to this context.
/// The MLIRContext owns the objects.		/// The MLIRContext owns the objects.
DenseMap<StringRef, std::unique_ptr<Dialect>> loadedDialects;		DenseMap<StringRef, std::unique_ptr<Dialect>> loadedDialects;
DialectRegistry dialectsRegistry;		DialectRegistry dialectsRegistry;

/// This is a mapping from operation name to AbstractOperation for registered		/// This is a mapping from operation name to AbstractOperation for registered
/// operations.		/// operations.
llvm::StringMap<AbstractOperation> registeredOperations;		llvm::StringMap<AbstractOperation> registeredOperations;
[295 lines not shown]	void MLIRContext::disableMultithreading(bool disable) {
impl->threadingIsEnabled = !disable;		impl->threadingIsEnabled = !disable;

// Update the threading mode for each of the uniquers.		// Update the threading mode for each of the uniquers.
impl->affineUniquer.disableMultithreading(disable);		impl->affineUniquer.disableMultithreading(disable);
impl->attributeUniquer.disableMultithreading(disable);		impl->attributeUniquer.disableMultithreading(disable);
impl->typeUniquer.disableMultithreading(disable);		impl->typeUniquer.disableMultithreading(disable);
}		}

		llvm::ThreadPool &MLIRContext::getThreadPool() {
		assert(isMultithreadingEnabled() &&
		"expected multi-threading to be enabled within the context");
		return impl->threadPool;
		}

void MLIRContext::enterMultiThreadedExecution() {		void MLIRContext::enterMultiThreadedExecution() {
#ifndef NDEBUG		#ifndef NDEBUG
++impl->multiThreadedExecutionContext;		++impl->multiThreadedExecutionContext;
#endif		#endif
}		}
void MLIRContext::exitMultiThreadedExecution() {		void MLIRContext::exitMultiThreadedExecution() {
#ifndef NDEBUG		#ifndef NDEBUG
--impl->multiThreadedExecutionContext;		--impl->multiThreadedExecutionContext;
[483 lines not shown]

mlir/lib/IR/Verifier.cpp

	[24 lines not shown]
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "mlir/IR/Verifier.h"			#include "mlir/IR/Verifier.h"
	#include "mlir/IR/Attributes.h"			#include "mlir/IR/Attributes.h"
	#include "mlir/IR/Dialect.h"			#include "mlir/IR/Dialect.h"
	#include "mlir/IR/Dominance.h"			#include "mlir/IR/Dominance.h"
	#include "mlir/IR/Operation.h"			#include "mlir/IR/Operation.h"
	#include "mlir/IR/RegionKindInterface.h"			#include "mlir/IR/RegionKindInterface.h"
				#include "mlir/IR/Threading.h"
	#include "llvm/ADT/StringMap.h"			#include "llvm/ADT/StringMap.h"
	#include "llvm/Support/FormatVariadic.h"			#include "llvm/Support/FormatVariadic.h"
	#include "llvm/Support/Parallel.h"			#include "llvm/Support/Parallel.h"
	#include "llvm/Support/PrettyStackTrace.h"			#include "llvm/Support/PrettyStackTrace.h"
	#include "llvm/Support/Regex.h"			#include "llvm/Support/Regex.h"
	#include <atomic>			#include <atomic>

	using namespace mlir;			using namespace mlir;

	namespace {			namespace {
	/// This class encapsulates all the state used to verify an operation region.			/// This class encapsulates all the state used to verify an operation region.
	class OperationVerifier {			class OperationVerifier {
	public:			public:
	explicit OperationVerifier(MLIRContext *context)
	// TODO: Re-enable parallelism once deadlocks found in D104207 are
	// resolved.
	: parallelismEnabled(false) {}

	/// Verify the given operation.			/// Verify the given operation.
	LogicalResult verifyOpAndDominance(Operation &op);			LogicalResult verifyOpAndDominance(Operation &op);

	private:			private:
	LogicalResult			LogicalResult
	verifyBlock(Block &block,			verifyBlock(Block &block,
	SmallVectorImpl<Operation *> &opsWithIsolatedRegions);			SmallVectorImpl<Operation *> &opsWithIsolatedRegions);
	/// Verify the properties and dominance relationships of this operation,			/// Verify the properties and dominance relationships of this operation,
	/// stopping region recursion at any "isolated from above operations". Any			/// stopping region recursion at any "isolated from above operations". Any
	/// such ops are returned in the opsWithIsolatedRegions vector.			/// such ops are returned in the opsWithIsolatedRegions vector.
	LogicalResult			LogicalResult
	verifyOperation(Operation &op,			verifyOperation(Operation &op,
	SmallVectorImpl<Operation *> &opsWithIsolatedRegions);			SmallVectorImpl<Operation *> &opsWithIsolatedRegions);

	/// Verify the dominance property of regions contained within the given			/// Verify the dominance property of regions contained within the given
	/// Operation.			/// Operation.
	LogicalResult verifyDominanceOfContainedRegions(Operation &op,			LogicalResult verifyDominanceOfContainedRegions(Operation &op,
	DominanceInfo &domInfo);			DominanceInfo &domInfo);

	/// This is true if parallelism is enabled on the MLIRContext.
	const bool parallelismEnabled;
	};			};
	} // end anonymous namespace			} // end anonymous namespace

	LogicalResult OperationVerifier::verifyOpAndDominance(Operation &op) {			LogicalResult OperationVerifier::verifyOpAndDominance(Operation &op) {
	SmallVector<Operation *> opsWithIsolatedRegions;			SmallVector<Operation *> opsWithIsolatedRegions;

	// Verify the operation first, collecting any IsolatedFromAbove operations.			// Verify the operation first, collecting any IsolatedFromAbove operations.
	if (failed(verifyOperation(op, opsWithIsolatedRegions)))			if (failed(verifyOperation(op, opsWithIsolatedRegions)))
	return failure();			return failure();

	// Since everything looks structurally ok to this point, we do a dominance			// Since everything looks structurally ok to this point, we do a dominance
	// check for any nested regions. We do this as a second pass since malformed			// check for any nested regions. We do this as a second pass since malformed
	// CFG's can cause dominator analysis construction to crash and we want the			// CFG's can cause dominator analysis construction to crash and we want the
	// verifier to be resilient to malformed code.			// verifier to be resilient to malformed code.
	if (op.getNumRegions() != 0) {			if (op.getNumRegions() != 0) {
	DominanceInfo domInfo;			DominanceInfo domInfo;
	if (failed(verifyDominanceOfContainedRegions(op, domInfo)))			if (failed(verifyDominanceOfContainedRegions(op, domInfo)))
	return failure();			return failure();
	}			}

	// Check the dominance properties and invariants of any operations in the			// Check the dominance properties and invariants of any operations in the
	// regions contained by the 'opsWithIsolatedRegions' operations.			// regions contained by the 'opsWithIsolatedRegions' operations.
	if (!parallelismEnabled || opsWithIsolatedRegions.size() <= 1) {			return failableParallelForEach(
	// If parallelism is disabled or if there is only 0/1 operation to do, use			op.getContext(), opsWithIsolatedRegions,
	// a simple non-parallel loop.			[&](Operation *op) { return verifyOpAndDominance(*op); });
	for (Operation *op : opsWithIsolatedRegions) {
	if (failed(verifyOpAndDominance(*op)))
	return failure();
	}
	} else {
	// Otherwise, verify the operations and their bodies in parallel.
	ParallelDiagnosticHandler handler(op.getContext());
	std::atomic<bool> passFailed(false);
	llvm::parallelForEachN(0, opsWithIsolatedRegions.size(), [&](size_t opIdx) {
	handler.setOrderIDForThread(opIdx);
	if (failed(verifyOpAndDominance(*opsWithIsolatedRegions[opIdx])))
	passFailed = true;
	handler.eraseOrderIDForThread();
	});
	if (passFailed)
	return failure();
	}

	return success();
	}			}
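
The removed branch above tags each worker with an order ID through `ParallelDiagnosticHandler`, and the new utilities promise that diagnostics are ordered by element position. A simplified sketch of that idea follows: buffer each message with the index of the element being processed, then replay the buffer sorted by index. `OrderedDiagnosticsSketch`, `emit`, and `flush` are hypothetical names for illustration only. This is not the MLIR `ParallelDiagnosticHandler`, which tracks the order ID per thread rather than per call.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of an order-preserving diagnostic collector.
class OrderedDiagnosticsSketch {
public:
  // Record a message produced while processing the element at `orderID`.
  // Safe to call from several threads at once.
  void emit(std::size_t orderID, std::string message) {
    std::lock_guard<std::mutex> lock(mutex);
    diagnostics.emplace_back(orderID, std::move(message));
  }

  // Return the buffered messages sorted by order ID and clear the buffer.
  // stable_sort keeps messages that share an ID in their emission order.
  std::vector<std::string> flush() {
    std::lock_guard<std::mutex> lock(mutex);
    std::stable_sort(diagnostics.begin(), diagnostics.end(),
                     [](const Entry &lhs, const Entry &rhs) {
                       return lhs.first < rhs.first;
                     });
    std::vector<std::string> ordered;
    ordered.reserve(diagnostics.size());
    for (Entry &entry : diagnostics)
      ordered.push_back(std::move(entry.second));
    diagnostics.clear();
    return ordered;
  }

private:
  using Entry = std::pair<std::size_t, std::string>;
  std::mutex mutex;
  std::vector<Entry> diagnostics;
};
```

With this shape the output no longer depends on which thread finished first, which is the property the `-split-input-file` test changes in this revision otherwise have to work around for raw `errs()` output.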

	/// Returns true if this block may be valid without terminator. That is if:			/// Returns true if this block may be valid without terminator. That is if:
	/// - it does not have a parent region.			/// - it does not have a parent region.
	/// - Or the parent region have a single block and:			/// - Or the parent region have a single block and:
	/// - This region does not have a parent op.			/// - This region does not have a parent op.
	/// - Or the parent op is unregistered.			/// - Or the parent op is unregistered.
	/// - Or the parent op has the NoTerminator trait.			/// - Or the parent op has the NoTerminator trait.
	[249 lines not shown]
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Entrypoint			// Entrypoint
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	/// Perform (potentially expensive) checks of invariants, used to detect			/// Perform (potentially expensive) checks of invariants, used to detect
	/// compiler bugs. On error, this reports the error through the MLIRContext and			/// compiler bugs. On error, this reports the error through the MLIRContext and
	/// returns failure.			/// returns failure.
	LogicalResult mlir::verify(Operation *op) {			LogicalResult mlir::verify(Operation *op) {
	return OperationVerifier(op->getContext()).verifyOpAndDominance(*op);			return OperationVerifier().verifyOpAndDominance(*op);
	}			}

mlir/lib/Pass/Pass.cpp

  //===- Pass.cpp - Pass infrastructure implementation ----------------------===//
  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //===----------------------------------------------------------------------===//
  // This file implements common pass infrastructure.
  //===----------------------------------------------------------------------===//

  #include "mlir/Pass/Pass.h"
  #include "PassDetail.h"
  #include "mlir/IR/Diagnostics.h"
  #include "mlir/IR/Dialect.h"
+ #include "mlir/IR/Threading.h"
  #include "mlir/IR/Verifier.h"
  #include "mlir/Support/FileUtilities.h"
  #include "llvm/ADT/STLExtras.h"
  #include "llvm/ADT/ScopeExit.h"
  #include "llvm/ADT/SetVector.h"
  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/CrashRecoveryContext.h"
  #include "llvm/Support/Mutex.h"

[550 lines not shown]

  for (auto &block : region) {
    // Add this operation iff the name matches any of the pass managers.
    if (findPassManagerFor(mgrs, op.getName().getIdentifier(),
                           getContext()))
      opAMPairs.emplace_back(&op, am.nest(&op));
  }

- // A parallel diagnostic handler that provides deterministic diagnostic
- // ordering.
- ParallelDiagnosticHandler diagHandler(&getContext());
-
- // An index for the current operation/analysis manager pair.
- std::atomic<unsigned> opIt(0);
-
  // Get the current thread for this adaptor.
  PassInstrumentation::PipelineParentInfo parentInfo = {llvm::get_threadid(),
                                                        this};
  auto *instrumentor = am.getPassInstrumentor();

  // An atomic failure variable for the async executors.
- std::atomic<bool> passFailed(false);
- llvm::parallelForEach(
-     asyncExecutors.begin(),
-     std::next(asyncExecutors.begin(),
-               std::min(asyncExecutors.size(), opAMPairs.size())),
-     [&](MutableArrayRef<OpPassManager> pms) {
-       for (auto e = opAMPairs.size(); !passFailed && opIt < e;) {
-         // Get the next available operation index.
-         unsigned nextID = opIt++;
-         if (nextID >= e)
-           break;
-
-         // Set the order id for this thread in the diagnostic handler.
-         diagHandler.setOrderIDForThread(nextID);
-
-         // Get the pass manager for this operation and execute it.
-         auto &it = opAMPairs[nextID];
-         auto *pm = findPassManagerFor(
-             pms, it.first->getName().getIdentifier(), getContext());
-         assert(pm && "expected valid pass manager for operation");
-
-         unsigned initGeneration = pm->impl->initializationGeneration;
-         LogicalResult pipelineResult =
-             runPipeline(pm->getPasses(), it.first, it.second, verifyPasses,
-                         initGeneration, instrumentor, &parentInfo);
-
-         // Drop this thread from being tracked by the diagnostic handler.
-         // After this task has finished, the thread may be used outside of
-         // this pass manager context meaning that we don't want to track
-         // diagnostics from it anymore.
-         diagHandler.eraseOrderIDForThread();
-
-         // Handle a failed pipeline result.
-         if (failed(pipelineResult)) {
-           passFailed = true;
-           break;
-         }
-       }
-     });
+ std::vector<std::atomic<bool>> activePMs(asyncExecutors.size());
+ std::fill(activePMs.begin(), activePMs.end(), false);
+ auto processFn = [&](auto &opPMPair) {
+   // Find a pass manager for this operation.
+   auto it = llvm::find_if(activePMs, [](std::atomic<bool> &isActive) {
+     bool expectedInactive = false;
+     return isActive.compare_exchange_strong(expectedInactive, true);
+   });
+   unsigned pmIndex = it - activePMs.begin();
+
+   // Get the pass manager for this operation and execute it.
+   auto *pm = findPassManagerFor(asyncExecutors[pmIndex],
+                                 opPMPair.first->getName().getIdentifier(),
+                                 getContext());
+   assert(pm && "expected valid pass manager for operation");
+
+   unsigned initGeneration = pm->impl->initializationGeneration;
+   LogicalResult pipelineResult =
+       runPipeline(pm->getPasses(), opPMPair.first, opPMPair.second,
+                   verifyPasses, initGeneration, instrumentor, &parentInfo);
+
+   // Reset the active bit for this pass manager.
+   activePMs[pmIndex].store(false);
+   return pipelineResult;
+ };

bondhugula [Done]:

  std::atomic<bool> passFailed(false);
- std::vector<std::atomic<bool>> activePMs(asyncExecutors.size());
- for (std::atomic<bool> &isActive : activePMs)
-   isActive = false;
- parallelForEach(&getContext(), opAMPairs, [&](auto &opPMPair) {
+ std::vector<std::atomic<bool>> activePMs(asyncExecutors.size(),
+                                          std::atomic<bool>{false});
+ parallelForEach(&getContext(), opAMPairs, [&](auto &opPMPair) {

lattner [Done]: Right, atomic needs to be explicitly initialized to work correctly

rriddle (author) [Done]: Thanks Uday! My 4am brain couldn't remember how to initialize it.
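
The `activePMs` pattern discussed in this thread can be sketched on its own: a fixed set of slots, each guarded by an atomic flag, where a worker claims the first free slot with `compare_exchange_strong` and releases it when done. The sketch sizes the vector first and then stores to each flag, because `std::atomic` is neither copyable nor movable and the `std::vector(count, value)` constructor needs to copy its value argument. `SlotPoolSketch`, `acquire`, and `release` are hypothetical names, not part of MLIR.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the activePMs pattern: one atomic "in use"
// flag per slot (for example, one flag per pass manager).
class SlotPoolSketch {
public:
  // Size the vector first, then explicitly store false into every flag.
  explicit SlotPoolSketch(std::size_t numSlots) : active(numSlots) {
    for (std::atomic<bool> &isActive : active)
      isActive.store(false);
  }

  // Claim the first free slot and return its index, or size() if every slot
  // is currently in use.
  std::size_t acquire() {
    for (std::size_t i = 0; i < active.size(); ++i) {
      bool expectedInactive = false;
      // Succeeds only if the flag was false, atomically flipping it to true.
      if (active[i].compare_exchange_strong(expectedInactive, true))
        return i;
    }
    return active.size();
  }

  // Mark a previously claimed slot as free again.
  void release(std::size_t index) { active[index].store(false); }

  std::size_t size() const { return active.size(); }

private:
  std::vector<std::atomic<bool>> active;
};
```

The code in the diff does not check for the "all slots busy" case. It appears to rely on there being at least as many slots as concurrently running tasks, which this sketch makes explicit through the `size()` return value.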

  // Signal a failure if any of the executors failed.
- if (passFailed)
+ if (failed(failableParallelForEach(&getContext(), opAMPairs, processFn)))
    signalPassFailure();
  }

  //===----------------------------------------------------------------------===//
  // PassManager
  //===----------------------------------------------------------------------===//

  PassManager::PassManager(MLIRContext *ctx, Nesting nesting,

[224 lines not shown]

mlir/lib/Transforms/Inliner.cpp

[9 lines not shown]
// the Strongly Connect Components(SCCs) of the CallGraph. This enables a more		// the Strongly Connect Components(SCCs) of the CallGraph. This enables a more
// incremental propagation of inlining decisions from the leafs to the roots of		// incremental propagation of inlining decisions from the leafs to the roots of
// the callgraph.		// the callgraph.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "PassDetail.h"		#include "PassDetail.h"
#include "mlir/Analysis/CallGraph.h"		#include "mlir/Analysis/CallGraph.h"
		#include "mlir/IR/Threading.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"		#include "mlir/Interfaces/SideEffectInterfaces.h"
#include "mlir/Pass/PassManager.h"		#include "mlir/Pass/PassManager.h"
#include "mlir/Transforms/InliningUtils.h"		#include "mlir/Transforms/InliningUtils.h"
#include "mlir/Transforms/Passes.h"		#include "mlir/Transforms/Passes.h"
#include "llvm/ADT/SCCIterator.h"		#include "llvm/ADT/SCCIterator.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/Parallel.h"		#include "llvm/Support/Parallel.h"

[631 lines not shown]	for (auto *node : currentSCC) {
if (!region->getParentOp()->hasTrait<OpTrait::IsIsolatedFromAbove>())		if (!region->getParentOp()->hasTrait<OpTrait::IsIsolatedFromAbove>())
continue;		continue;
nodesToVisit.push_back(node);		nodesToVisit.push_back(node);
}		}
if (nodesToVisit.empty())		if (nodesToVisit.empty())
return success();		return success();

// Optimize each of the nodes within the SCC in parallel.		// Optimize each of the nodes within the SCC in parallel.
// NOTE: This is simple now, because we don't enable optimizing nodes within
// children. When we remove this restriction, this logic will need to be
// reworked.
if (context->isMultithreadingEnabled() && nodesToVisit.size() > 1) {
if (failed(optimizeSCCAsync(nodesToVisit, context)))		if (failed(optimizeSCCAsync(nodesToVisit, context)))
return failure();		return failure();

// Otherwise, we are optimizing within a single thread.
} else {
for (CallGraphNode *node : nodesToVisit) {
if (failed(optimizeCallable(node, opPipelines[0])))
return failure();
}
}

// Recompute the uses held by each of the nodes.		// Recompute the uses held by each of the nodes.
for (CallGraphNode *node : nodesToVisit)		for (CallGraphNode *node : nodesToVisit)
useList.recomputeUses(node, cg);		useList.recomputeUses(node, cg);
return success();		return success();
}		}

LogicalResult		LogicalResult
InlinerPass::optimizeSCCAsync(MutableArrayRef<CallGraphNode *> nodesToVisit,		InlinerPass::optimizeSCCAsync(MutableArrayRef<CallGraphNode *> nodesToVisit,
MLIRContext *context) {		MLIRContext *ctx) {
// Ensure that there are enough pipeline maps for the optimizer to run in		// Ensure that there are enough pipeline maps for the optimizer to run in
// parallel. Note: The number of pass managers here needs to remain constant		// parallel. Note: The number of pass managers here needs to remain constant
// to prevent issues with pass instrumentations that rely on having the same		// to prevent issues with pass instrumentations that rely on having the same
// pass manager for the main thread.		// pass manager for the main thread.
size_t numThreads = llvm::hardware_concurrency().compute_thread_count();		size_t numThreads = llvm::hardware_concurrency().compute_thread_count();
if (opPipelines.size() < numThreads) {		if (opPipelines.size() < numThreads) {
// Reserve before resizing so that we can use a reference to the first		// Reserve before resizing so that we can use a reference to the first
// element.		// element.
opPipelines.reserve(numThreads);		opPipelines.reserve(numThreads);
opPipelines.resize(numThreads, opPipelines.front());		opPipelines.resize(numThreads, opPipelines.front());
}		}

// Ensure an analysis manager has been constructed for each of the nodes.		// Ensure an analysis manager has been constructed for each of the nodes.
// This prevents thread races when running the nested pipelines.		// This prevents thread races when running the nested pipelines.
for (CallGraphNode *node : nodesToVisit)		for (CallGraphNode *node : nodesToVisit)
getAnalysisManager().nest(node->getCallableRegion()->getParentOp());		getAnalysisManager().nest(node->getCallableRegion()->getParentOp());

// An index for the current node to optimize.		// An atomic failure variable for the async executors.
std::atomic<unsigned> nodeIt(0);		std::vector<std::atomic<bool>> activePMs(opPipelines.size());
		std::fill(activePMs.begin(), activePMs.end(), false);
// Optimize the nodes of the SCC in parallel.		return failableParallelForEach(ctx, nodesToVisit, [&](CallGraphNode *node) {
ParallelDiagnosticHandler optimizerHandler(context);		// Find a pass manager for this operation.
std::atomic<bool> passFailed(false);		auto it = llvm::find_if(activePMs, [](std::atomic<bool> &isActive) {
llvm::parallelForEach(		bool expectedInactive = false;
opPipelines.begin(), std::next(opPipelines.begin(), numThreads),		return isActive.compare_exchange_strong(expectedInactive, true);
[&](llvm::StringMap<OpPassManager> &pipelines) {		});
for (auto e = nodesToVisit.size(); !passFailed && nodeIt < e;) {		unsigned pmIndex = it - activePMs.begin();
// Get the next available operation index.
unsigned nextID = nodeIt++;
if (nextID >= e)
break;

// Set the order for this thread so that diagnostics will be		// Optimize this callable node.
// properly ordered, and reset after optimization has finished.		LogicalResult result = optimizeCallable(node, opPipelines[pmIndex]);
optimizerHandler.setOrderIDForThread(nextID);
LogicalResult pipelineResult =
optimizeCallable(nodesToVisit[nextID], pipelines);
optimizerHandler.eraseOrderIDForThread();

if (failed(pipelineResult)) {		// Reset the active bit for this pass manager.
passFailed = true;		activePMs[pmIndex].store(false);
break;		return result;
}
}
});		});
return failure(passFailed);
}		}

LogicalResult		LogicalResult
InlinerPass::optimizeCallable(CallGraphNode *node,		InlinerPass::optimizeCallable(CallGraphNode *node,
llvm::StringMap<OpPassManager> &pipelines) {		llvm::StringMap<OpPassManager> &pipelines) {
Operation *callable = node->getCallableRegion()->getParentOp();		Operation *callable = node->getCallableRegion()->getParentOp();
StringRef opName = callable->getName().getStringRef();		StringRef opName = callable->getName().getStringRef();
auto pipelineIt = pipelines.find(opName);		auto pipelineIt = pipelines.find(opName);
[62 lines not shown]

mlir/test/Dialect/Affine/SuperVectorize/compose_maps.mlir

	// RUN: mlir-opt -allow-unregistered-dialect %s -affine-super-vectorizer-test -compose-maps 2>&1 | FileCheck %s			// RUN: mlir-opt -allow-unregistered-dialect %s -affine-super-vectorizer-test -compose-maps -split-input-file 2>&1 | FileCheck %s
				lattner [Done]: Why do these need to disable threading?
				rriddle (author) [Done]: Several existing tests depended on the previous threading behavior to work, mostly getting lucky that the `errs()` output was laid out in the way they expect. Switched to no longer disable threading, but adding split file markers to separate tests.
				lattner [Not Done]: Ok for this patch, but gross. Plz file a bug about these. It is a bad antipattern to emit to `errs()`, these tests should be emitting errors so they are pinned to a location correctly (similar to the dependence analysis tests)

	// For all these cases, the test traverses the `test_affine_map` ops and			// For all these cases, the test traverses the `test_affine_map` ops and
	// composes them in order one-by-one.			// composes them in order one-by-one.
	// For instance, the pseudo-sequence:			// For instance, the pseudo-sequence:
	// "test_affine_map"() { affine_map = f } : () -> ()			// "test_affine_map"() { affine_map = f } : () -> ()
	// "test_affine_map"() { affine_map = g } : () -> ()			// "test_affine_map"() { affine_map = g } : () -> ()
	// "test_affine_map"() { affine_map = h } : () -> ()			// "test_affine_map"() { affine_map = h } : () -> ()
	// will produce the sequence of compositions: f, g(f), h(g(f)) and print the			// will produce the sequence of compositions: f, g(f), h(g(f)) and print the
	// AffineMap h(g(f)), which is what FileCheck checks against.			// AffineMap h(g(f)), which is what FileCheck checks against.

	func @simple1() {			func @simple1() {
	// CHECK: Composed map: (d0) -> (d0)			// CHECK: Composed map: (d0) -> (d0)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 1)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple2() {			func @simple2() {
	// CHECK: Composed map: (d0)[s0, s1] -> (d0 - s0 + s1)			// CHECK: Composed map: (d0)[s0, s1] -> (d0 - s0 + s1)
	"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 + s0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 + s0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 - s0 + 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 - s0 + 1)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple3a() {			func @simple3a() {
	// CHECK: Composed map: (d0, d1)[s0, s1, s2, s3] -> ((d0 ceildiv s2) * s0, (d1 ceildiv s3) * s1)			// CHECK: Composed map: (d0, d1)[s0, s1, s2, s3] -> ((d0 ceildiv s2) * s0, (d1 ceildiv s3) * s1)
	"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 ceildiv s0, d1 ceildiv s1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 ceildiv s0, d1 ceildiv s1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 * s0, d1 * s1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 * s0, d1 * s1)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple3b() {			func @simple3b() {
	// CHECK: Composed map: (d0, d1)[s0, s1] -> (d0 mod s0, d1 mod s1)			// CHECK: Composed map: (d0, d1)[s0, s1] -> (d0 mod s0, d1 mod s1)
	"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 mod s0, d1 mod s1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 mod s0, d1 mod s1)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple3c() {			func @simple3c() {
	// CHECK: Composed map: (d0, d1)[s0, s1, s2, s3, s4, s5] -> ((d0 ceildiv s4) * s4 + d0 mod s2, (d1 ceildiv s5) * s5 + d1 mod s3)			// CHECK: Composed map: (d0, d1)[s0, s1, s2, s3, s4, s5] -> ((d0 ceildiv s4) * s4 + d0 mod s2, (d1 ceildiv s5) * s5 + d1 mod s3)
	"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> ((d0 ceildiv s0) * s0, (d1 ceildiv s1) * s1, d0, d1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> ((d0 ceildiv s0) * s0, (d1 ceildiv s1) * s1, d0, d1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 + d2 mod s2, d1 + d3 mod s3)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 + d2 mod s2, d1 + d3 mod s3)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple4() {			func @simple4() {
	// CHECK: Composed map: (d0, d1)[s0, s1] -> (d1 * s1, d0 ceildiv s0)			// CHECK: Composed map: (d0, d1)[s0, s1] -> (d1 * s1, d0 ceildiv s0)
	"test_affine_map"() { affine_map = affine_map<(d0, d1) -> (d1, d0)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1) -> (d1, d0)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 * s1, d1 ceildiv s0)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 * s1, d1 ceildiv s0)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple5a() {			func @simple5a() {
	// CHECK: Composed map: (d0) -> (d0 * 3 + 18)			// CHECK: Composed map: (d0) -> (d0 * 3 + 18)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 24)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 24)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 ceildiv 8)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 ceildiv 8)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple5b() {			func @simple5b() {
	// CHECK: Composed map: (d0) -> ((d0 + 6) ceildiv 2)			// CHECK: Composed map: (d0) -> ((d0 + 6) ceildiv 2)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 4)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 4)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 ceildiv 8)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 ceildiv 8)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple5c() {			func @simple5c() {
	// CHECK: Composed map: (d0) -> (d0 * 8 + 48)			// CHECK: Composed map: (d0) -> (d0 * 8 + 48)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 24)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 24)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 floordiv 3)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 floordiv 3)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple5d() {			func @simple5d() {
	// CHECK: Composed map: (d0) -> ((d0 * 4) floordiv 3 + 8)			// CHECK: Composed map: (d0) -> ((d0 * 4) floordiv 3 + 8)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 4)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 4)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 floordiv 3)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 floordiv 3)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple5e() {			func @simple5e() {
	// CHECK: Composed map: (d0) -> ((d0 + 6) ceildiv 8)			// CHECK: Composed map: (d0) -> ((d0 + 6) ceildiv 8)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 + 7)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 ceildiv 8)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 ceildiv 8)> } : () -> ()
	return			return
	}			}

				// -----

	func @simple5f() {			func @simple5f() {
	// CHECK: Composed map: (d0) -> ((d0 * 4 - 4) floordiv 3)			// CHECK: Composed map: (d0) -> ((d0 * 4 - 4) floordiv 3)
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 - 1)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 4)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 * 4)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 floordiv 3)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0) -> (d0 floordiv 3)> } : () -> ()
	return			return
	}			}

				// -----

	func @perm_and_proj() {			func @perm_and_proj() {
	// CHECK: Composed map: (d0, d1, d2, d3) -> (d1, d3, d0)			// CHECK: Composed map: (d0, d1, d2, d3) -> (d1, d3, d0)
	"test_affine_map"() { affine_map = affine_map<(d0, d1, d2, d3) -> (d3, d1, d2, d0)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1, d2, d3) -> (d3, d1, d2, d0)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1, d2, d3) -> (d1, d0, d3)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1, d2, d3) -> (d1, d0, d3)> } : () -> ()
	return			return
	}			}

				// -----

	func @symbols1() {			func @symbols1() {
	// CHECK: Composed map: (d0)[s0] -> (d0 + s0 + 1, d0 - s0 - 1)			// CHECK: Composed map: (d0)[s0] -> (d0 + s0 + 1, d0 - s0 - 1)
	"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 + s0, d0 - s0)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 + s0, d0 - s0)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1) -> (d0 + 1, d1 - 1)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1) -> (d0 + 1, d1 - 1)> } : () -> ()
	return			return
	}			}

				// -----

	func @drop() {			func @drop() {
	// CHECK: Composed map: (d0, d1, d2)[s0, s1] -> (d0 * 2 + d1 + d2 + s1)			// CHECK: Composed map: (d0, d1, d2)[s0, s1] -> (d0 * 2 + d1 + d2 + s1)
	"test_affine_map"() { affine_map = affine_map<(d0, d1, d2)[s0, s1] -> (d0 + s1, d1 + s0, d0 + d1 + d2)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1, d2)[s0, s1] -> (d0 + s1, d1 + s0, d0 + d1 + d2)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1, d2) -> (d0 + d2)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1, d2) -> (d0 + d2)> } : () -> ()
	return			return
	}			}

				// -----

	func @multi_symbols() {			func @multi_symbols() {
	// CHECK: Composed map: (d0)[s0, s1, s2] -> (d0 + s1 + s2 + 1, d0 - s0 - s2 - 1)			// CHECK: Composed map: (d0)[s0, s1, s2] -> (d0 + s1 + s2 + 1, d0 - s0 - s2 - 1)
	"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 + s0, d0 - s0)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0)[s0] -> (d0 + s0, d0 - s0)> } : () -> ()
	"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 + 1 + s1, d1 - 1 - s0)> } : () -> ()			"test_affine_map"() { affine_map = affine_map<(d0, d1)[s0, s1] -> (d0 + 1 + s1, d1 - 1 - s0)> } : () -> ()
	return			return
	}			}
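
The `simple5a` through `simple5f` CHECK lines above can be sanity-checked numerically. The following is a minimal C++ sketch, not part of this revision, that applies each listed map in sequence and compares the result with the composed expression from the CHECK line. The names `floorDiv`, `ceilDiv`, and `composedMapsAgree` are hypothetical helpers, and the sketch assumes `floordiv` rounds toward negative infinity and `ceildiv` toward positive infinity, with positive divisors only.

```cpp
#include <cassert>
#include <cstdint>

// Floor division, rounding toward negative infinity (assumes b > 0).
static int64_t floorDiv(int64_t a, int64_t b) {
  int64_t q = a / b;
  return (a % b != 0 && a < 0) ? q - 1 : q;
}

// Ceiling division, rounding toward positive infinity (assumes b > 0).
static int64_t ceilDiv(int64_t a, int64_t b) {
  int64_t q = a / b;
  return (a % b != 0 && a > 0) ? q + 1 : q;
}

// Returns true if, for every d0 in [-64, 64], applying the listed maps one
// after another gives the same value as the composed map in the CHECK line.
bool composedMapsAgree() {
  for (int64_t d0 = -64; d0 <= 64; ++d0) {
    // (d0 - 1) followed by (d0 + 7): shared prefix of simple5a..simple5e.
    int64_t shifted = (d0 - 1) + 7;
    // simple5a: (d0) -> (d0 * 3 + 18)
    if (ceilDiv(shifted * 24, 8) != d0 * 3 + 18) return false;
    // simple5b: (d0) -> ((d0 + 6) ceildiv 2)
    if (ceilDiv(shifted * 4, 8) != ceilDiv(d0 + 6, 2)) return false;
    // simple5c: (d0) -> (d0 * 8 + 48)
    if (floorDiv(shifted * 24, 3) != d0 * 8 + 48) return false;
    // simple5d: (d0) -> ((d0 * 4) floordiv 3 + 8)
    if (floorDiv(shifted * 4, 3) != floorDiv(d0 * 4, 3) + 8) return false;
    // simple5e: (d0) -> ((d0 + 6) ceildiv 8)
    if (ceilDiv(shifted, 8) != ceilDiv(d0 + 6, 8)) return false;
    // simple5f: (d0) -> ((d0 * 4 - 4) floordiv 3)
    if (floorDiv((d0 - 1) * 4, 3) != floorDiv(d0 * 4 - 4, 3)) return false;
  }
  return true;
}
```

A check over a finite range is only a spot check, not a proof, but it covers negative inputs, where the rounding direction of the two division operators matters.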

mlir/test/Dialect/Affine/slicing-utils.mlir

// RUN: mlir-opt -allow-unregistered-dialect %s -affine-super-vectorizer-test -forward-slicing=true 2>&1 | FileCheck %s --check-prefix=FWD		// RUN: mlir-opt -allow-unregistered-dialect %s -split-input-file -affine-super-vectorizer-test -forward-slicing=true 2>&1 | FileCheck %s --check-prefix=FWD
// RUN: mlir-opt -allow-unregistered-dialect %s -affine-super-vectorizer-test -backward-slicing=true 2>&1 | FileCheck %s --check-prefix=BWD		// RUN: mlir-opt -allow-unregistered-dialect %s -split-input-file -affine-super-vectorizer-test -backward-slicing=true 2>&1 | FileCheck %s --check-prefix=BWD
// RUN: mlir-opt -allow-unregistered-dialect %s -affine-super-vectorizer-test -slicing=true 2>&1 | FileCheck %s --check-prefix=FWDBWD		// RUN: mlir-opt -allow-unregistered-dialect %s -split-input-file -affine-super-vectorizer-test -slicing=true 2>&1 | FileCheck %s --check-prefix=FWDBWD

/// 1 2 3 4		/// 1 2 3 4
/// |_______| |______|		/// |_______| |______|
/// | | |		/// | | |
/// | 5 6		/// | 5 6
/// |___|_____________|		/// |___|_____________|
/// | |		/// | |
/// 7 8		/// 7 8
[...]	func @slicing_test() {
// FWDBWD-DAG: %[[v7:.*]] = "slicing-test-op"(%[[v1]], %[[v5]]) : (i1, i5) -> i7		// FWDBWD-DAG: %[[v7:.*]] = "slicing-test-op"(%[[v1]], %[[v5]]) : (i1, i5) -> i7
// FWDBWD-NEXT: %[[v9:.*]] = "slicing-test-op"(%[[v7]], %[[v8]]) : (i7, i8) -> i9		// FWDBWD-NEXT: %[[v9:.*]] = "slicing-test-op"(%[[v7]], %[[v8]]) : (i7, i8) -> i9

%9 = "slicing-test-op" (%7, %8) : (i7, i8) -> i9		%9 = "slicing-test-op" (%7, %8) : (i7, i8) -> i9

return		return
}		}

		// -----

// FWD-LABEL: slicing_test_2		// FWD-LABEL: slicing_test_2
// BWD-LABEL: slicing_test_2		// BWD-LABEL: slicing_test_2
// FWDBWD-LABEL: slicing_test_2		// FWDBWD-LABEL: slicing_test_2
func @slicing_test_2() {		func @slicing_test_2() {
%c0 = constant 0 : index		%c0 = constant 0 : index
%c2 = constant 2 : index		%c2 = constant 2 : index
%c16 = constant 16 : index		%c16 = constant 16 : index
affine.for %i0 = %c0 to %c16 {		affine.for %i0 = %c0 to %c16 {
[...]	affine.for %i1 = affine_map<(i)[] -> (i)>(%i0) to 10 {
// affine.for only appears in the body of scf.for		// affine.for only appears in the body of scf.for
// BWD-NOT: affine.for {{.*}}		// BWD-NOT: affine.for {{.*}}
%c = "slicing-test-op"(%i0): (index) -> index		%c = "slicing-test-op"(%i0): (index) -> index
}		}
}		}
return		return
}		}

		// -----

// FWD-LABEL: slicing_test_3		// FWD-LABEL: slicing_test_3
// BWD-LABEL: slicing_test_3		// BWD-LABEL: slicing_test_3
// FWDBWD-LABEL: slicing_test_3		// FWDBWD-LABEL: slicing_test_3
func @slicing_test_3() {		func @slicing_test_3() {
%f = constant 1.0 : f32		%f = constant 1.0 : f32
%c = "slicing-test-op"(%f): (f32) -> index		%c = "slicing-test-op"(%f): (f32) -> index
// FWD: matched: {{.*}} (f32) -> index forward static slice:		// FWD: matched: {{.*}} (f32) -> index forward static slice:
// FWD: scf.for {{.*}}		// FWD: scf.for {{.*}}
// FWD: matched: {{.*}} (index, index) -> index forward static slice:		// FWD: matched: {{.*}} (index, index) -> index forward static slice:
scf.for %i2 = %c to %c step %c {		scf.for %i2 = %c to %c step %c {
%d = "slicing-test-op"(%c, %i2): (index, index) -> index		%d = "slicing-test-op"(%c, %i2): (index, index) -> index
}		}
return		return
}		}

		// -----

// FWD-LABEL: slicing_test_function_argument		// FWD-LABEL: slicing_test_function_argument
// BWD-LABEL: slicing_test_function_argument		// BWD-LABEL: slicing_test_function_argument
// FWDBWD-LABEL: slicing_test_function_argument		// FWDBWD-LABEL: slicing_test_function_argument
func @slicing_test_function_argument(%arg0: index) -> index {		func @slicing_test_function_argument(%arg0: index) -> index {
// BWD: matched: {{.*}} (index, index) -> index backward static slice:		// BWD: matched: {{.*}} (index, index) -> index backward static slice:
%0 = "slicing-test-op"(%arg0, %arg0): (index, index) -> index		%0 = "slicing-test-op"(%arg0, %arg0): (index, index) -> index
return %0 : index		return %0 : index
}		}

		// -----

// FWD-LABEL: slicing_test_multiple_return		// FWD-LABEL: slicing_test_multiple_return
// BWD-LABEL: slicing_test_multiple_return		// BWD-LABEL: slicing_test_multiple_return
// FWDBWD-LABEL: slicing_test_multiple_return		// FWDBWD-LABEL: slicing_test_multiple_return
func @slicing_test_multiple_return(%arg0: index) -> (index, index) {		func @slicing_test_multiple_return(%arg0: index) -> (index, index) {
// BWD: matched: {{.*}} (index, index) -> (index, index) backward static slice:		// BWD: matched: {{.*}} (index, index) -> (index, index) backward static slice:
// FWD: matched: %{{.*}}:2 = "slicing-test-op"(%arg0, %arg0) : (index, index) -> (index, index) forward static slice:		// FWD: matched: %{{.*}}:2 = "slicing-test-op"(%arg0, %arg0) : (index, index) -> (index, index) forward static slice:
// FWD: return %{{.}}#0, %{{.}}#1 : index, index		// FWD: return %{{.}}#0, %{{.}}#1 : index, index
%0:2 = "slicing-test-op"(%arg0, %arg0): (index, index) -> (index, index)		%0:2 = "slicing-test-op"(%arg0, %arg0): (index, index) -> (index, index)
[...]

mlir/test/IR/diagnostic-handler-filter.mlir

	// RUN: mlir-opt %s -test-diagnostic-filter='filters=mysource1' -o - 2>&1 | FileCheck %s			// RUN: mlir-opt %s -test-diagnostic-filter='filters=mysource1' -split-input-file -o - 2>&1 | FileCheck %s
	// This test verifies that diagnostic handler can emit the call stack successfully.			// This test verifies that diagnostic handler can emit the call stack successfully.

	// CHECK-LABEL: Test 'test1'			// CHECK-LABEL: Test 'test1'
	// CHECK-NEXT: mysource2:1:0: error: test diagnostic			// CHECK-NEXT: mysource2:1:0: error: test diagnostic
	// CHECK-NEXT: mysource3:2:0: note: called from			// CHECK-NEXT: mysource3:2:0: note: called from
	func private @test1() attributes {			func private @test1() attributes {
	test.loc = loc(callsite("foo"("mysource1":0:0) at callsite("mysource2":1:0 at "mysource3":2:0)))			test.loc = loc(callsite("foo"("mysource1":0:0) at callsite("mysource2":1:0 at "mysource3":2:0)))
	}			}

				// -----

	// CHECK-LABEL: Test 'test2'			// CHECK-LABEL: Test 'test2'
	// CHECK-NEXT: mysource1:0:0: error: test diagnostic			// CHECK-NEXT: mysource1:0:0: error: test diagnostic
	func private @test2() attributes {			func private @test2() attributes {
	test.loc = loc("mysource1":0:0)			test.loc = loc("mysource1":0:0)
	}			}
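
The updated RUN lines in this file and in slicing-utils.mlir pass `-split-input-file`, which makes `mlir-opt` treat each region between `// -----` marker lines as a separate input. The sketch below illustrates that splitting step in isolation. It is a simplified illustration and not MLIR's implementation; the function name `splitOnMarker` is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `input` into chunks at every line that, after
// trimming surrounding whitespace, is exactly the marker "// -----".
// The marker lines themselves are dropped from the output.
std::vector<std::string> splitOnMarker(const std::string &input) {
  const std::string whitespace = " \t\r";
  std::vector<std::string> chunks(1);
  std::istringstream stream(input);
  std::string line;
  while (std::getline(stream, line)) {
    std::string trimmed;
    size_t first = line.find_first_not_of(whitespace);
    if (first != std::string::npos) {
      size_t last = line.find_last_not_of(whitespace);
      trimmed = line.substr(first, last - first + 1);
    }
    if (trimmed == "// -----") {
      chunks.emplace_back(); // start a new, initially empty chunk
      continue;
    }
    chunks.back() += line;
    chunks.back() += '\n';
  }
  return chunks;
}
```

For the test above, such a split would yield two chunks, one containing `@test1` and one containing `@test2`, each of which is then processed independently.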

mlir/test/Pass/pass-timing.mlir

	// RUN: mlir-opt %s -mlir-disable-threading=true -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=list 2>&1 | FileCheck -check-prefix=LIST %s			// RUN: mlir-opt %s -mlir-disable-threading=true -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=list 2>&1 | FileCheck -check-prefix=LIST %s
	// RUN: mlir-opt %s -mlir-disable-threading=true -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck -check-prefix=PIPELINE %s			// RUN: mlir-opt %s -mlir-disable-threading=true -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck -check-prefix=PIPELINE %s
	// RUN: mlir-opt %s -mlir-disable-threading=false -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=list 2>&1 | FileCheck -check-prefix=MT_LIST %s			// RUN: mlir-opt %s -mlir-disable-threading=false -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=list 2>&1 | FileCheck -check-prefix=MT_LIST %s
	// RUN: mlir-opt %s -mlir-disable-threading=false -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck -check-prefix=MT_PIPELINE %s			// RUN: mlir-opt %s -mlir-disable-threading=false -verify-each=true -pass-pipeline='func(cse,canonicalize,cse)' -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck -check-prefix=MT_PIPELINE %s
	// RUN: mlir-opt %s -mlir-disable-threading=false -verify-each=false -test-pm-nested-pipeline -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck -check-prefix=NESTED_MT_PIPELINE %s			// RUN: mlir-opt %s -mlir-disable-threading=true -verify-each=false -test-pm-nested-pipeline -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck -check-prefix=NESTED_PIPELINE %s

	// LIST: Execution time report			// LIST: Execution time report
	// LIST: Total Execution Time:			// LIST: Total Execution Time:
	// LIST: Name			// LIST: Name
	// LIST-DAG: Canonicalizer			// LIST-DAG: Canonicalizer
	// LIST-DAG: CSE			// LIST-DAG: CSE
	// LIST-DAG: DominanceInfo			// LIST-DAG: DominanceInfo
	// LIST: Total			// LIST: Total
	[...]
	// MT_PIPELINE-NEXT: (A) DominanceInfo			// MT_PIPELINE-NEXT: (A) DominanceInfo
	// MT_PIPELINE-NEXT: Canonicalizer			// MT_PIPELINE-NEXT: Canonicalizer
	// MT_PIPELINE-NEXT: CSE			// MT_PIPELINE-NEXT: CSE
	// MT_PIPELINE-NEXT: (A) DominanceInfo			// MT_PIPELINE-NEXT: (A) DominanceInfo
	// MT_PIPELINE-NEXT: Output			// MT_PIPELINE-NEXT: Output
	// MT_PIPELINE-NEXT: Rest			// MT_PIPELINE-NEXT: Rest
	// MT_PIPELINE-NEXT: Total			// MT_PIPELINE-NEXT: Total

	// NESTED_MT_PIPELINE: Execution time report			// NESTED_PIPELINE: Execution time report
	// NESTED_MT_PIPELINE: Total Execution Time:			// NESTED_PIPELINE: Total Execution Time:
	// NESTED_MT_PIPELINE: Name			// NESTED_PIPELINE: Name
	// NESTED_MT_PIPELINE-NEXT: Parser			// NESTED_PIPELINE-NEXT: Parser
	// NESTED_MT_PIPELINE-NEXT: Pipeline Collection : ['func', 'module']			// NESTED_PIPELINE-NEXT: Pipeline Collection : ['func', 'module']
	// NESTED_MT_PIPELINE-NEXT: 'func' Pipeline			// NESTED_PIPELINE-NEXT: 'func' Pipeline
	// NESTED_MT_PIPELINE-NEXT: TestFunctionPass			// NESTED_PIPELINE-NEXT: TestFunctionPass
	// NESTED_MT_PIPELINE-NEXT: 'module' Pipeline			// NESTED_PIPELINE-NEXT: 'module' Pipeline
	// NESTED_MT_PIPELINE-NEXT: TestModulePass			// NESTED_PIPELINE-NEXT: TestModulePass
	// NESTED_MT_PIPELINE-NEXT: 'func' Pipeline			// NESTED_PIPELINE-NEXT: 'func' Pipeline
	// NESTED_MT_PIPELINE-NEXT: TestFunctionPass			// NESTED_PIPELINE-NEXT: TestFunctionPass
	// NESTED_MT_PIPELINE-NEXT: Output			// NESTED_PIPELINE-NEXT: Output
	// NESTED_MT_PIPELINE-NEXT: Rest			// NESTED_PIPELINE-NEXT: Rest
	// NESTED_MT_PIPELINE-NEXT: Total			// NESTED_PIPELINE-NEXT: Total

	func @foo() {			func @foo() {
	return			return
	}			}

	func @bar() {			func @bar() {
	return			return
	}			}
	[...]

mlir/test/Pass/pipeline-parsing.mlir

	// RUN: mlir-opt %s -pass-pipeline='module(test-module-pass,func(test-function-pass)),func(test-function-pass)' -pass-pipeline="func(cse,canonicalize)" -verify-each=false -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck %s			// RUN: mlir-opt %s -mlir-disable-threading -pass-pipeline='module(test-module-pass,func(test-function-pass)),func(test-function-pass)' -pass-pipeline="func(cse,canonicalize)" -verify-each=false -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck %s
	// RUN: mlir-opt %s -test-textual-pm-nested-pipeline -verify-each=false -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck %s --check-prefix=TEXTUAL_CHECK			// RUN: mlir-opt %s -mlir-disable-threading -test-textual-pm-nested-pipeline -verify-each=false -mlir-timing -mlir-timing-display=tree 2>&1 | FileCheck %s --check-prefix=TEXTUAL_CHECK
	// RUN: not mlir-opt %s -pass-pipeline='module(test-module-pass' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_1 %s			// RUN: not mlir-opt %s -pass-pipeline='module(test-module-pass' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_1 %s
	// RUN: not mlir-opt %s -pass-pipeline='module(test-module-pass))' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_2 %s			// RUN: not mlir-opt %s -pass-pipeline='module(test-module-pass))' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_2 %s
	// RUN: not mlir-opt %s -pass-pipeline='module()(' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_3 %s			// RUN: not mlir-opt %s -pass-pipeline='module()(' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_3 %s
	// RUN: not mlir-opt %s -pass-pipeline=',' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_4 %s			// RUN: not mlir-opt %s -pass-pipeline=',' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_4 %s
	// RUN: not mlir-opt %s -pass-pipeline='func(test-module-pass)' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_5 %s			// RUN: not mlir-opt %s -pass-pipeline='func(test-module-pass)' 2>&1 | FileCheck --check-prefix=CHECK_ERROR_5 %s

	// CHECK_ERROR_1: encountered unbalanced parentheses while parsing pipeline			// CHECK_ERROR_1: encountered unbalanced parentheses while parsing pipeline
	// CHECK_ERROR_2: encountered extra closing ')' creating unbalanced parentheses while parsing pipeline			// CHECK_ERROR_2: encountered extra closing ')' creating unbalanced parentheses while parsing pipeline
	[...]

Revision Contents

Diff 353838
