This is an archive of the discontinued LLVM Phabricator instance.


Differential D123917

[mlir] Make `Regions`s `cloneInto` multithread-readable
ClosedPublic

Authored by zero9178 on Apr 17 2022, 3:23 PM.


Details

Reviewers

rriddle
mehdi_amini
lattner
jpienaar
Mogball

Commits

rGa41aaf166fed: [mlir] Make `Regions`s `cloneInto` multithread-readable

Summary

Prior to this patch, cloneInto did a simple walk over the blocks and their contained operations, cloning and mapping them as it encountered them. As a finishing touch, it then remapped any successors and operands it had mapped during that process.

This is generally fine, but it sadly creates many uses of both operations and blocks from the source region in the cloned operations in the target region. Those uses lead to writes to the use-def lists of the source operations, so cloneInto was never thread-safe.

This patch reimplements cloneInto in three steps to avoid ever creating any extra uses on elements in the source region:

1. It first creates the mapping of all blocks and block arguments.
2. It then clones all operations to create the mapping of all operation results, but does not yet clone any regions or set the operands.
3. After all operation results have been mapped, it sets the operations' operands and clones their regions.
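
The steps above can be illustrated with a toy graph instead of real MLIR. The sketch below is only a model under stated assumptions: `Node`, `Graph`, and `cloneInPhases` are hypothetical names, and because the toy has no blocks or regions, steps 1 and 2 collapse into a single mapping phase. What it tries to show is that the use lists of the source nodes are never written while cloning.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy IR, NOT MLIR: each node records its users, so adding an operand
// writes to the operand's use list (the write that makes cloning racy).
struct Node {
  std::vector<Node *> operands;
  std::vector<Node *> users;

  void addOperand(Node *def) {
    operands.push_back(def);
    def->users.push_back(this);
  }
};

// Owns its nodes; pointers stay valid because nodes live behind unique_ptr.
struct Graph {
  std::vector<std::unique_ptr<Node>> nodes;

  Node *create() {
    nodes.push_back(std::make_unique<Node>());
    return nodes.back().get();
  }
};

// Clones a self-contained graph (every operand is defined inside `src`).
// `src` is only read: no clone ever takes a source node as an operand.
Graph cloneInPhases(const Graph &src) {
  Graph dest;
  std::unordered_map<const Node *, Node *> mapping;

  // Phase 1: create all clones without operands and record the mapping.
  for (const auto &node : src.nodes)
    mapping[node.get()] = dest.create();

  // Phase 2: every definition is mapped by now, so operands can be set
  // directly to the cloned definitions.
  for (const auto &node : src.nodes) {
    Node *copy = mapping.at(node.get());
    for (const Node *operand : node->operands) {
      auto it = mapping.find(operand);
      assert(it != mapping.end() && "operand defined outside the graph");
      copy->addOperand(it->second);
    }
  }
  return dest;
}
```

In this model, a naive single-pass clone would first copy the operand list as-is and remap afterwards, which would append the clone to the `users` list of a source node; the phased version avoids that intermediate state.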

That way it is now possible to call cloneInto from multiple threads if the Region or Operation is isolated-from-above. This allows creating copies of functions, or using mlir::inlineCall with the same source region, from multiple threads. In the general case, the method is thread-safe if cloning creates no new uses of Values from outside the cloned Operation/Region. This can be ensured by mapping any outside operands via the BlockAndValueMapping to Values owned by the caller thread.

While I was at it, I also reworked the clone method of Operation a little bit and added a proper options class to avoid having a cloneWithoutRegionsAndOperands method, and be more extensible in the future. cloneWithoutRegions is now also a simple wrapper that calls clone with the proper options set. That way all the operation cloning code is now contained solely within clone.
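
As a rough illustration of the options class described above, here is a stand-alone model that mirrors the declarations shown in the diff but is not tied to MLIR. The helper `optionsForCloneWithoutRegions` is a hypothetical name introduced only for this sketch; it reproduces the options that the diff shows cloneWithoutRegions passing to clone.

```cpp
#include <cassert>

// Stand-alone model of the chained-setter options class; not the MLIR type.
class CloneOptions {
public:
  // All flags false: nothing optional is cloned.
  CloneOptions() : cloneRegionsFlag(false), cloneOperandsFlag(false) {}
  CloneOptions(bool cloneRegions, bool cloneOperands)
      : cloneRegionsFlag(cloneRegions), cloneOperandsFlag(cloneOperands) {}

  // All flags true: every part of the operation is cloned.
  static CloneOptions all() {
    return CloneOptions().cloneRegions().cloneOperands();
  }

  // Setters return *this so that calls can be chained.
  CloneOptions &cloneRegions(bool enable = true) {
    cloneRegionsFlag = enable;
    return *this;
  }
  bool shouldCloneRegions() const { return cloneRegionsFlag; }

  CloneOptions &cloneOperands(bool enable = true) {
    cloneOperandsFlag = enable;
    return *this;
  }
  bool shouldCloneOperands() const { return cloneOperandsFlag; }

private:
  bool cloneRegionsFlag : 1;
  bool cloneOperandsFlag : 1;
};

// Hypothetical helper: everything except regions, as used by the wrapper.
inline CloneOptions optionsForCloneWithoutRegions() {
  return CloneOptions::all().cloneRegions(false);
}
```

Starting from `all()` and switching single flags off keeps call sites readable and leaves room for further flags without adding new `cloneWithoutX` methods.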

Regarding testing: I have no clue what an automated test for thread safety would look like, nor whether that is possible. I used TSAN on my own project to find uses of mlir::inlineCall making writes to the source callable, creating race conditions. After this patch, TSAN no longer reports any issues in my project.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

zero9178 created this revision. Apr 17 2022, 3:23 PM

Herald added a project: Restricted Project. Apr 17 2022, 3:23 PM

Herald added subscribers: sdasgup3, wenzhicui, wrengr and 17 others.

zero9178 requested review of this revision. Apr 17 2022, 3:23 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

Remove redundant includes

Harbormaster completed remote builds in B159996: Diff 423320. Apr 17 2022, 3:56 PM

rriddle requested changes to this revision. Apr 17 2022, 6:00 PM

rriddle added inline comments.

mlir/lib/IR/Operation.cpp
526	The style of the code base has changed at this point, please drop the duplicated method documentation from here.
mlir/lib/IR/Region.cpp
117	We shouldn't need to track every new operation, we can just iterate over the inserted blocks alongside the original blocks.
124–125	Can you pull this out into a separate variable?

This revision now requires changes to proceed. Apr 17 2022, 6:00 PM

Address review comments

(Can't wait for C++17 support btw)

Harbormaster completed remote builds in B160018: Diff 423347. Apr 18 2022, 2:05 AM

Mogball added inline comments. Apr 18 2022, 6:10 PM

mlir/include/mlir/IR/Operation.h
75	Can you expand the documentation here?
78	What does it mean for all flags to be false?
79	Provide a parameter constructor? `CloneOptions(bool cloneRegions, bool cloneOperands)`.
81	What does it mean for all flags to be true?
84–86
92–93	What is the use case of cloning operations with zero operands? E.g. what happens if I clone an op with its regions but without operands. All nested ops have zero operands?
92–93
125–137

Address review comments

zero9178 marked 9 inline comments as done. Apr 19 2022, 8:47 AM

zero9178 added inline comments.

mlir/include/mlir/IR/Operation.h
75	I tried to elaborate a bit further but I am not 100% sure what is missing.
92–93	Nested operations are currently unaffected by the cloning options: the options are not recursive and only affect the top-level operation, so nested operations are cloned in their entirety. One use case for cloning an operation with its regions but without its operands might be outlining it (although this could also be achieved via the mapper). Generally speaking, the point of not cloning operands is to avoid creating a use of those operands, as that is not thread-safe.

Harbormaster completed remote builds in B160245: Diff 423639. Apr 19 2022, 9:20 AM

Mogball added inline comments. Apr 19 2022, 11:41 AM

mlir/include/mlir/IR/Operation.h
75	You should clearly state that the "parts of an operation" include whether the regions are recursively cloned and whether the operands are cloned and that these options are passed to `clone` methods.
92–93	Makes sense. But what if the region is not isolated from above? Could cloning the region recursively add a use to a value defined above and create a race condition?

Address review comments regarding documentation

zero9178 added inline comments. Apr 19 2022, 12:05 PM

mlir/include/mlir/IR/Operation.h
92–93	If the region is not isolated from above, then a use of a value not defined by any of the operations being cloned may still lead to a race condition, yes. So only cloning an isolated-from-above region is generally read-only. One may still avoid a race condition in your case, however, if all the values defined outside are mapped by the BlockAndValueMapping to Values owned by the current thread.
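
The effect of the mapper described here can be modelled with a small lookup table. This is a sketch under assumptions, not MLIR's BlockAndValueMapping: `Value` and `ValueMapping` are toy types, and only the fall-through behaviour of `lookupOrDefault` (which the diff calls when remapping operands) is imitated.

```cpp
#include <cassert>
#include <unordered_map>

// Toy value type; stands in for an SSA value defined outside the region.
struct Value {
  int id;
};

// Toy mapper: returns the mapped value if one exists, else the original.
class ValueMapping {
public:
  void map(const Value *from, Value *to) { table[from] = to; }

  Value *lookupOrDefault(Value *from) const {
    auto it = table.find(from);
    // Falling through to `from` means the clone would use the original
    // value, i.e. its use list would be written to.
    return it == table.end() ? from : it->second;
  }

private:
  std::unordered_map<const Value *, Value *> table;
};
```

A thread that maps every outside value to a value it owns gets back only its own values from `lookupOrDefault`, so the shared definitions are not touched; any value left unmapped falls through to the shared original.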

Harbormaster completed remote builds in B160296: Diff 423697. Apr 19 2022, 12:20 PM

Have you measured the performance of the new clone to ensure parity? Inlining in TF can get kind of heavy...

mlir/include/mlir/IR/Operation.h
91	Why has this option been removed?
92–93	Right. Could you update the patch description to reflect that bit of a wrinkle? It'd also be great if the threadsafe-ness of cloning were briefly documented somewhere in the code so that this doesn't become MLIR street knowledge.
mlir/lib/IR/Region.cpp
142	Don't create a temporary vector.

In D123917#3460343, @Mogball wrote:

Have you measured the performance of the new clone to ensure parity? Inlining in TF can get kind of heavy...

I tried to create a meaningful benchmark by taking an MLIR file produced by my compiler prior to my inliner pass, which consists of 2812 lines of code. I then modified the main function to contain lots of calls to a large function.
A first run of ./pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading | wc -l led to an output of 454056 lines of MLIR.
I then used hyperfine to try and measure a difference:

[markus@dell-xps13 bin]$ hyperfine --prepare "sleep 20"  './pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading' './pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading'
Benchmark 1: ./pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading
  Time (mean ± σ):      4.230 s ±  0.053 s    [User: 4.141 s, System: 0.069 s]
  Range (min … max):    4.158 s …  4.308 s    10 runs
 
Benchmark 2: ./pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading
  Time (mean ± σ):      4.137 s ±  0.073 s    [User: 4.048 s, System: 0.068 s]
  Range (min … max):    4.009 s …  4.229 s    10 runs
 
Summary
  './pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading' ran
    1.02 ± 0.02 times faster than './pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading'

The benchmark was run on my Linux laptop running Fedora 35 with an i7-7500U. I used a prepare statement that sleeps 20 seconds before each execution to try to circumvent thermal throttling of my laptop. For the same reason, I also double-checked the results by running the benchmarks in the opposite order, which led to the same results.

pylir-opt is my project's version of mlir-opt, linked against MLIR with my patch applied. pylir-opt-old is the same but links against a version without my patch applied. Both were compiled in release mode and used LTO when compiling my project's code.

To summarize, this patch seemingly regresses performance a tiny bit, running on average 100 ms slower in the benchmark, with a margin of error of around 50 ms. Of note, however, is that I had to really increase the problem size to get a good measurement. An initial measurement with max-recursion-depth=100 yielded a runtime of around 1 second and an output file of roughly half the lines, where I failed to measure a consistent difference.
Another thing of note is that further profiling with Intel VTune did not reveal any obvious causes. Most CPU time was spent in mlir::detail::walk, and the self CPU time of Region::cloneInto was essentially nothing.

mlir/include/mlir/IR/Operation.h
91	This option was only very recently added in https://reviews.llvm.org/D122531. The bug it addressed essentially boiled down to the results of the operation being mapped before its regions were cloned. This patch happens to fix that bug as well, since the single `clone` implementation also resolves that ordering issue. Given how recent @wsmoses's patch is, and that the only code using the parameter has been removed, it essentially became dead code that I doubt any downstream users rely on yet. (I initially had it as part of CloneOptions until I realised it was redundant.)
mlir/lib/IR/Region.cpp
142	I am not sure how else it'd be possible to implement. `mlir::ValueRange` doesn't have a constructor that takes an iterator range of `Value`s, only one taking an ArrayRef of `Value`s, which this code calls. `llvm::map_range` is also just a range of iterators computing the elements and results lazily, that does not have any storage. Hence the `llvm::to_vector` to materialize it and construct the `mlir::ValueRange` with (which I think requires contiguous memory?).
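
The point about needing storage behind a lazily computed range can be sketched without LLVM. Everything below is a toy under assumptions: `IntView` only imitates the role of a non-owning contiguous view, and `materialize` is a hypothetical helper, not an LLVM or MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy non-owning view over contiguous ints (pointer plus length).
struct IntView {
  const int *data;
  std::size_t size;
};

// Computes fn(x) for every input and stores the results contiguously in
// `storage`, because a view needs real memory to point into. The returned
// view is valid only while `storage` is alive and unmodified.
template <typename Fn>
IntView materialize(const std::vector<int> &inputs, Fn fn,
                    std::vector<int> &storage) {
  assert(&inputs != &storage && "storage must not alias the inputs");
  storage.clear();
  storage.reserve(inputs.size());
  for (int value : inputs)
    storage.push_back(fn(value));
  return IntView{storage.data(), storage.size()};
}
```

A lazily mapped range computes each element on dereference and owns no buffer, so there is nothing for a pointer-plus-length view to refer to until the results are written somewhere.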

Document the conditions for when it is safe to call Operation's clone and Region's cloneInto methods from multiple threads.

zero9178 edited the summary of this revision. Apr 20 2022, 5:19 AM

Harbormaster completed remote builds in B160434: Diff 423873. Apr 20 2022, 5:54 AM

Hoist vector creation out of loop, allowing reuse of capacity of the vector. Performance is now up to parity if not a little better (although in the margin of error):

[markus@dell-xps13 bin]$ hyperfine --prepare 'sleep 20'  './pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading' './pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading'
Benchmark 1: ./pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading
  Time (mean ± σ):      4.073 s ±  0.037 s    [User: 3.983 s, System: 0.068 s]
  Range (min … max):    4.014 s …  4.148 s    10 runs
 
Benchmark 2: ./pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading
  Time (mean ± σ):      4.061 s ±  0.080 s    [User: 3.962 s, System: 0.078 s]
  Range (min … max):    3.963 s …  4.166 s    10 runs
 
Summary
  './pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading' ran
    1.00 ± 0.02 times faster than './pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading'
[markus@dell-xps13 bin]$ hyperfine --prepare 'sleep 20'  './pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading' './pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading'
Benchmark 1: ./pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading
  Time (mean ± σ):      4.025 s ±  0.077 s    [User: 3.927 s, System: 0.078 s]
  Range (min … max):    3.932 s …  4.156 s    10 runs
 
Benchmark 2: ./pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading
  Time (mean ± σ):      4.071 s ±  0.062 s    [User: 3.984 s, System: 0.065 s]
  Range (min … max):    3.979 s …  4.188 s    10 runs
 
Summary
  './pylir-opt benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading' ran
    1.01 ± 0.02 times faster than './pylir-opt-old benchmark.mlir --pylir-trial-inliner="min-callee-size-reduction=0 max-recursion-depth=200" --mlir-disable-threading'
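
The hoisting mentioned in this update (one vector declared outside the loop so its capacity is reused) can be sketched generically. The function below is a hypothetical illustration and not the code from Region.cpp; it only shows the allocation pattern.

```cpp
#include <cassert>
#include <vector>

// Sums the squares of each row, using one scratch buffer for all rows.
std::vector<long> sumSquaresPerRow(const std::vector<std::vector<int>> &rows) {
  std::vector<long> sums;
  sums.reserve(rows.size());

  std::vector<long> scratch; // hoisted out of the loop: allocated once
  for (const auto &row : rows) {
    scratch.clear(); // removes the elements but keeps the capacity
    for (int value : row)
      scratch.push_back(static_cast<long>(value) * value);

    long total = 0;
    for (long square : scratch)
      total += square;
    sums.push_back(total);
  }
  return sums;
}
```

Declaring `scratch` inside the loop would allocate and free a buffer on every iteration; with the hoisted version, reallocation only happens when a row is larger than any seen before.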

Harbormaster completed remote builds in B160448: Diff 423899. Apr 20 2022, 7:31 AM

Thanks for checking the performance!

Looks mostly good to me, but you'll want to wait for River

mlir/lib/IR/Region.cpp
142	You could change it so that when cloning operands, instead of zero, it sets them all to null, and then loop over them setting them here to avoid creating the vector.

mlir/include/mlir/IR/Operation.h
79	Missing punctuation at the end of this sentence.
96
99
129
mlir/lib/IR/Region.cpp
85–88	Should likely clarify here that it isn't thread safe in some cases (when adding references to values in other regions).
112	Prefer using (), unless there isn't an actual constructor here.
126	nit: Spell out auto here.
127–128	The separate variable doesn't add much, can we just drop it? We could also drop the braces afterwards as well.
136–137	Can you just inline these variables?

This revision is now accepted and ready to land. Apr 21 2022, 12:31 AM

Address review comments

This revision was landed with ongoing or failed builds. Apr 21 2022, 4:43 AM

Closed by commit rGa41aaf166fed: [mlir] Make `Regions`s `cloneInto` multithread-readable (authored by zero9178).

This revision was automatically updated to reflect the committed changes.

zero9178 added a commit: rGa41aaf166fed: [mlir] Make `Regions`s `cloneInto` multithread-readable.

Thanks to the both of you for the very thorough review! Very happy with the result :)

Harbormaster completed remote builds in B160622: Diff 424151. Apr 21 2022, 5:05 AM

Revision Contents

Path                               Size
mlir/include/mlir/IR/Operation.h   64 lines
mlir/include/mlir/IR/Region.h      5 lines
mlir/lib/IR/Operation.cpp          79 lines
mlir/lib/IR/Region.cpp             67 lines

Diff 424153

mlir/include/mlir/IR/Operation.h

  public:
    bool isRegistered() { return getName().isRegistered(); }

    /// Remove this operation from its parent block and delete it.
    void erase();

    /// Remove the operation from its parent block, but don't delete it.
    void remove();

+   /// Class encompassing various options related to cloning an operation. Users
+   /// of this class should pass it to Operation's 'clone' methods.
+   /// Current options include:
+   /// * Whether cloning should recursively traverse into the regions of the
+   ///   operation or not.
+   /// * Whether cloning should also clone the operands of the operation.
+   class CloneOptions {
+   public:
+     /// Default constructs an option with all flags set to false. That means all
+     /// parts of an operation that may optionally not be cloned, are not cloned.
+     CloneOptions();
+
+     /// Constructs an instance with the clone regions and clone operands flags
+     /// set accordingly.
+     CloneOptions(bool cloneRegions, bool cloneOperands);
+
+     /// Returns an instance with all flags set to true. This is the default
+     /// when using the clone method and clones all parts of the operation.
+     static CloneOptions all();
+
+     /// Configures whether cloning should traverse into any of the regions of
+     /// the operation. If set to true, the operation's regions are recursively
+     /// cloned. If set to false, cloned operations will have the same number of
+     /// regions, but they will be empty.
+     /// Cloning of nested operations in the operation's regions are currently
+     /// unaffected by other flags.
+     CloneOptions &cloneRegions(bool enable = true);
+
+     /// Returns whether regions of the operation should be cloned as well.
+     bool shouldCloneRegions() const { return cloneRegionsFlag; }
+
+     /// Configures whether operation' operands should be cloned. Otherwise the
+     /// resulting clones will simply have zero operands.
+     CloneOptions &cloneOperands(bool enable = true);
+
+     /// Returns whether operands should be cloned as well.
+     bool shouldCloneOperands() const { return cloneOperandsFlag; }
+
+   private:
+     /// Whether regions should be cloned.
+     bool cloneRegionsFlag : 1;
+     /// Whether operands should be cloned.
+     bool cloneOperandsFlag : 1;
+   };
+
    /// Create a deep copy of this operation, remapping any operands that use
    /// values outside of the operation using the map that is provided (leaving
    /// them alone if no entry is present). Replaces references to cloned
    /// sub-operations to the corresponding operation that is copied, and adds
    /// those mappings to the map.
-   Operation *clone(BlockAndValueMapping &mapper);
-   Operation *clone();
+   /// Optionally, one may configure what parts of the operation to clone using
+   /// the options parameter.
+   ///
+   /// Calling this method from multiple threads is generally safe if through the
+   /// process of cloning no new uses of 'Value's from outside the operation are
+   /// created. Cloning an isolated-from-above operation with no operands, such
+   /// as top level function operations, is therefore always safe. Using the
+   /// mapper, it is possible to avoid adding uses to outside operands by
+   /// remapping them to 'Value's owned by the caller thread.
+   Operation *clone(BlockAndValueMapping &mapper,
+                    CloneOptions options = CloneOptions::all());
+   Operation *clone(CloneOptions options = CloneOptions::all());

    /// Create a partial copy of this operation without traversing into attached
    /// regions. The new operation will have the same number of regions as the
    /// original one, but they will be left empty.
    /// Operands are remapped using `mapper` (if present), and `mapper` is updated
    /// to contain the results.
-   /// The `mapResults` argument specifies whether the results of the operation
-   /// should also be mapped.
-   Operation *cloneWithoutRegions(BlockAndValueMapping &mapper,
-                                  bool mapResults = true);
+   Operation *cloneWithoutRegions(BlockAndValueMapping &mapper);

    /// Create a partial copy of this operation without traversing into attached
    /// regions. The new operation will have the same number of regions as the
    /// original one, but they will be left empty.
    Operation *cloneWithoutRegions();

    /// Returns the operation block that contains this operation.
    Block *getBlock() { return block; }

Suggested edits left inline on this file (all marked Done):

Mogball:
public:
- /// Default constructs an option with all flags set to false.
+ /// Default constructs clone options with all flags set to false.
CloneOptions();

Mogball:
static CloneOptions all();
- /// Configures whether cloning should traverse into any of the regions of
- /// the operation. The resulting clone will have the same amount of regions
- /// but they will be empty instead.
+ /// Configures whether cloning should traverse into the regions of
+ /// the operation. If set to true, operations' regions are recursively clone. If set to false, cloned operations will have the same number of regions
+ /// but they will be empty.
CloneOptions &cloneRegions(bool enable = true);

Mogball:
bool shouldCloneRegions() const { return cloneRegionsFlag; }
- /// Configures whether operands should be cloned as well. Otherwise the
- /// resulting clone will simply have no operands.
+ /// Configures whether operations' operands should be cloned. Otherwise the
+ /// resulting clones will have zero operands.
CloneOptions &cloneOperands(bool enable = true);

rriddle:
/// Configures whether cloning should traverse into any of the regions of
- /// the operation. If set to true, operations' regions are recursively
+ /// the operation. If set to true, the operation's regions are recursively
/// cloned. If set to false, cloned operations will have the same number of

rriddle:
/// regions, but they will be empty.
- /// Cloning of nested operations in the operations' regions are currently
+ /// Cloning of nested operations in the operation's regions are currently
/// unaffected by other flags.

rriddle:
/// Calling this method from multiple threads is generally safe if through the
- /// process of cloning, no new uses of 'Value's from outside the operation are
+ /// process of cloning no new uses of 'Value's from outside the operation are
/// created. Cloning an isolated-from-above operation with no operands, such

Mogball:
/// those mappings to the map.
- /// Optionally, one may configure to not clone parts of the operation using
+ /// Optionally, one may configure what parts of the operation to clone using
/// the options parameter.
Operation *clone(BlockAndValueMapping &mapper,

mlir/include/mlir/IR/Region.h

  public:
    bool isAncestor(Region *other) {
      return this == other || isProperAncestor(other);
    }

    /// Clone the internal blocks from this region into dest. Any
    /// cloned blocks are appended to the back of dest. If the mapper
    /// contains entries for block arguments, these arguments are not included
    /// in the respective cloned block.
+   ///
+   /// Calling this method from multiple threads is generally safe if through the
+   /// process of cloning, no new uses of 'Value's from outside the region are
+   /// created. Using the mapper, it is possible to avoid adding uses to outside
+   /// operands by remapping them to 'Value's owned by the caller thread.
    void cloneInto(Region *dest, BlockAndValueMapping &mapper);
    /// Clone this region into 'dest' before the given position in 'dest'.
    void cloneInto(Region *dest, Region::iterator destPos,
                   BlockAndValueMapping &mapper);

    /// Takes body of another region (that region will have no body after this
    /// operation completes). The current body of this region is cleared.
    void takeBody(Region &other) {

mlir/lib/IR/Operation.cpp

  InFlightDiagnostic Operation::emitOpError(const Twine &message) {
    return emitError() << "'" << getName() << "' op " << message;
  }

  //===----------------------------------------------------------------------===//
  // Operation Cloning
  //===----------------------------------------------------------------------===//

+ Operation::CloneOptions::CloneOptions()
+     : cloneRegionsFlag(false), cloneOperandsFlag(false) {}
+
+ Operation::CloneOptions::CloneOptions(bool cloneRegions, bool cloneOperands)
+     : cloneRegionsFlag(cloneRegions), cloneOperandsFlag(cloneOperands) {}
+
+ Operation::CloneOptions Operation::CloneOptions::all() {
+   return CloneOptions().cloneRegions().cloneOperands();
+ }
+
+ Operation::CloneOptions &Operation::CloneOptions::cloneRegions(bool enable) {
+   cloneRegionsFlag = enable;
+   return *this;
+ }
+
+ Operation::CloneOptions &Operation::CloneOptions::cloneOperands(bool enable) {
+   cloneOperandsFlag = enable;
+   return *this;
+ }
+
  /// Create a deep copy of this operation but keep the operation regions empty.
  /// Operands are remapped using `mapper` (if present), and `mapper` is updated
  /// to contain the results. The `mapResults` flag specifies whether the results
  /// of the cloned operation should be added to the map.
- Operation *Operation::cloneWithoutRegions(BlockAndValueMapping &mapper,
-                                           bool mapResults) {
+ Operation *Operation::cloneWithoutRegions(BlockAndValueMapping &mapper) {
+   return clone(mapper, CloneOptions::all().cloneRegions(false));
+ }
+
+ Operation *Operation::cloneWithoutRegions() {
+   BlockAndValueMapping mapper;
+   return cloneWithoutRegions(mapper);
+ }
+
+ /// Create a deep copy of this operation, remapping any operands that use
+ /// values outside of the operation using the map that is provided (leaving
+ /// them alone if no entry is present). Replaces references to cloned
+ /// sub-operations to the corresponding operation that is copied, and adds
+ /// those mappings to the map.
+ Operation *Operation::clone(BlockAndValueMapping &mapper,
+                             CloneOptions options) {
    SmallVector<Value, 8> operands;
    SmallVector<Block *, 2> successors;

    // Remap the operands.
+   if (options.shouldCloneOperands()) {
      operands.reserve(getNumOperands());
      for (auto opValue : getOperands())
        operands.push_back(mapper.lookupOrDefault(opValue));
+   }

    // Remap the successors.
    successors.reserve(getNumSuccessors());
    for (Block *successor : getSuccessors())
      successors.push_back(mapper.lookupOrDefault(successor));

    // Create the new operation.
    auto *newOp = create(getLoc(), getName(), getResultTypes(), operands, attrs,
                         successors, getNumRegions());

-   // Remember the mapping of any results.
-   if (mapResults) {
-     for (unsigned i = 0, e = getNumResults(); i != e; ++i)
-       mapper.map(getResult(i), newOp->getResult(i));
-   }
-
-   return newOp;
- }
-
- Operation *Operation::cloneWithoutRegions() {
-   BlockAndValueMapping mapper;
-   return cloneWithoutRegions(mapper);
- }
-
- /// Create a deep copy of this operation, remapping any operands that use
	/// values outside of the operation using the map that is provided (leaving
	/// them alone if no entry is present). Replaces references to cloned
	/// sub-operations to the corresponding operation that is copied, and adds
	/// those mappings to the map.
	Operation *Operation::clone(BlockAndValueMapping &mapper) {
	auto newOp = cloneWithoutRegions(mapper, /*mapResults=*/false);

	// Clone the regions.			// Clone the regions.
				if (options.shouldCloneRegions()) {
	for (unsigned i = 0; i != numRegions; ++i)			for (unsigned i = 0; i != numRegions; ++i)
	getRegion(i).cloneInto(&newOp->getRegion(i), mapper);			getRegion(i).cloneInto(&newOp->getRegion(i), mapper);
				}

				// Remember the mapping of any results.
	for (unsigned i = 0, e = getNumResults(); i != e; ++i)			for (unsigned i = 0, e = getNumResults(); i != e; ++i)
	mapper.map(getResult(i), newOp->getResult(i));			mapper.map(getResult(i), newOp->getResult(i));

	return newOp;			return newOp;
	}			}

	Operation *Operation::clone() {			Operation *Operation::clone(CloneOptions options) {
	BlockAndValueMapping mapper;			BlockAndValueMapping mapper;
	return clone(mapper);			return clone(mapper, options);
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// OpState trait class.			// OpState trait class.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// The fallback for the parser is to try for a dialect operation parser.			// The fallback for the parser is to try for a dialect operation parser.
	// Otherwise, reject the custom assembly form.			// Otherwise, reject the custom assembly form.

mlir/lib/IR/Region.cpp

void Region::cloneInto(Region *dest, Region::iterator destPos,

BlockAndValueMapping &mapper) { BlockAndValueMapping &mapper) {

assert(dest && "expected valid region to clone into"); assert(dest && "expected valid region to clone into");

assert(this != dest && "cannot clone region into itself"); assert(this != dest && "cannot clone region into itself");

// If the list is empty there is nothing to clone. // If the list is empty there is nothing to clone.

if (empty()) if (empty())

return; return;

// The below clone implementation takes special care to be read only for the

// sake of multi threading. That essentially means not adding any uses to any

// of the blocks or operation results contained within this region as that

// would lead to a write in their use-def list. This is unavoidable for

[Inline comment] rriddle (Done): Should likely clarify here that it isn't thread safe in some cases (when adding references to values in other regions).

// 'Value's from outside the region however, in which case it is not read

// only. Using the BlockAndValueMapper it is possible to remap such 'Value's

// to ones owned by the calling thread however, making it read only once

// again.

// First clone all the blocks and block arguments and map them, but don't yet

// clone the operations, as they may otherwise add a use to a block that has

// not yet been mapped

for (Block &block : *this) { for (Block &block : *this) {

Block *newBlock = new Block(); Block *newBlock = new Block();

mapper.map(&block, newBlock); mapper.map(&block, newBlock);

// Clone the block arguments. The user might be deleting arguments to the // Clone the block arguments. The user might be deleting arguments to the

// block by specifying them in the mapper. If so, we don't add the // block by specifying them in the mapper. If so, we don't add the

// argument to the cloned block. // argument to the cloned block.

for (auto arg : block.getArguments()) for (auto arg : block.getArguments())

if (!mapper.contains(arg)) if (!mapper.contains(arg))

mapper.map(arg, newBlock->addArgument(arg.getType(), arg.getLoc())); mapper.map(arg, newBlock->addArgument(arg.getType(), arg.getLoc()));

// Clone and remap the operations within this block.

for (auto &op : block)

newBlock->push_back(op.clone(mapper));

dest->getBlocks().insert(destPos, newBlock); dest->getBlocks().insert(destPos, newBlock);

} }

// Now that each of the blocks have been cloned, go through and remap the auto newBlocksRange =

// operands of each of the operations. llvm::make_range(Region::iterator(mapper.lookup(&front())), destPos);

[Inline comment] rriddle (Not Done), suggested change:

auto newBlocksRange =

- llvm::make_range(Region::iterator{mapper.lookup(&front())}, destPos);

+ llvm::make_range(Region::iterator(mapper.lookup(&front())), destPos);

// Now follow up with creating the operations, but don't yet clone their

Prefer using (), unless there isn't an actual constructor here.

auto remapOperands = [&](Operation *op) {

for (auto &operand : op->getOpOperands()) // Now follow up with creating the operations, but don't yet clone their

if (auto mappedOp = mapper.lookupOrNull(operand.get())) // regions, nor set their operands. Setting the successors is safe as all have

operand.set(mappedOp); // already been mapped. We are essentially just creating the operation results

for (auto &succOp : op->getBlockOperands()) // to be able to map them.

[Inline comment] rriddle (Done): We shouldn't need to track every new operation; we can just iterate over the inserted blocks alongside the original blocks.

if (auto *mappedOp = mapper.lookupOrNull(succOp.get())) // Cloning the operands and region as well would lead to uses of operations

succOp.set(mappedOp); // not yet mapped.

}; auto cloneOptions =

Operation::CloneOptions::all().cloneRegions(false).cloneOperands(false);

for (auto zippedBlocks : llvm::zip(*this, newBlocksRange)) {

Block &sourceBlock = std::get<0>(zippedBlocks);

Block &clonedBlock = std::get<1>(zippedBlocks);

// Clone and remap the operations within this block.

[Inline comment] rriddle (Done): Can you pull this out into a separate variable?

for (Operation &op : sourceBlock)

[Inline comment] rriddle (Done): nit: Spell out auto here.

clonedBlock.push_back(op.clone(mapper, cloneOptions));

}

[Inline comment] rriddle (Done), suggested change:

for (auto &op : sourceBlock) {

- Operation *clone = op.clone(mapper, cloneOptions);

- clonedBlock.push_back(clone);

+ clonedBlock.push_back(op.clone(mapper, cloneOptions));

}

The separate variable doesn't add much, can we just drop it? We could also drop the braces afterwards as well.

// Finally now that all operation results have been mapped, set the operands

// and clone the regions.

SmallVector<Value> operands;

for (auto zippedBlocks : llvm::zip(*this, newBlocksRange)) {

for (auto ops :

llvm::zip(std::get<0>(zippedBlocks), std::get<1>(zippedBlocks))) {

Operation &source = std::get<0>(ops);

Operation &clone = std::get<1>(ops);

[Inline comment] rriddle (Done): Can you just inline these variables?

operands.resize(source.getNumOperands());

llvm::transform(

source.getOperands(), operands.begin(),

[&](Value operand) { return mapper.lookupOrDefault(operand); });

[Inline comment] Mogball (Done): Don't create a temporary vector.

[Inline comment] zero9178, author (Done): I am not sure how else it'd be possible to implement. mlir::ValueRange doesn't have a constructor that takes an iterator range of Values, only one taking an ArrayRef of Values, which this code calls. llvm::map_range is also just a range of iterators computing the elements and results lazily, that does not have any storage. Hence the llvm::to_vector to materialize it and construct the mlir::ValueRange with (which I think requires contiguous memory?).

[Inline comment] Mogball (Not Done): You could change it so that when cloning operands, instead of zero, it sets them all to null, and then loop over them setting them here to avoid creating the vector.

clone.setOperands(operands);

for (iterator it(mapper.lookup(&front())); it != destPos; ++it) for (auto regions : llvm::zip(source.getRegions(), clone.getRegions()))

it->walk(remapOperands); std::get<0>(regions).cloneInto(&std::get<1>(regions), mapper);

}

} }
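The difference between the old single-pass scheme and the new read-only scheme can be illustrated on a toy IR. This is a simplified sketch under loud assumptions: `ToyOp`, `ToyBlock`, `ToyMapping`, `cloneBlockEager` and `cloneBlockReadOnly` are hypothetical stand-ins, each op defines exactly one value, there are no nested regions or successors, and `useListWrites` merely counts mutations that in MLIR would be writes to a use-def list.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy op: defines one value and tracks who uses it.
struct ToyOp {
  std::vector<ToyOp *> operands; // ops whose result this op uses
  int numUses = 0;               // current number of uses of this op's result
  int useListWrites = 0;         // how often the "use list" was mutated

  void setOperands(const std::vector<ToyOp *> &newOperands) {
    for (ToyOp *o : operands) { --o->numUses; ++o->useListWrites; }
    operands = newOperands;
    for (ToyOp *o : operands) { ++o->numUses; ++o->useListWrites; }
  }
};

using ToyBlock = std::vector<std::unique_ptr<ToyOp>>;
using ToyMapping = std::unordered_map<const ToyOp *, ToyOp *>;

// Returns the mapped op, or the original one if no mapping exists.
ToyOp *lookupOrDefault(const ToyMapping &mapper, ToyOp *op) {
  auto it = mapper.find(op);
  return it == mapper.end() ? op : it->second;
}

std::vector<ToyOp *> remap(const ToyMapping &mapper,
                           const std::vector<ToyOp *> &operands) {
  std::vector<ToyOp *> mapped;
  for (ToyOp *operand : operands)
    mapped.push_back(lookupOrDefault(mapper, operand));
  return mapped;
}

// Old scheme: clone with operands immediately, fix them up afterwards.
// A forward reference temporarily adds a use to a *source* op.
void cloneBlockEager(const ToyBlock &source, ToyBlock &dest,
                     ToyMapping &mapper) {
  std::size_t firstNew = dest.size();
  for (const auto &op : source) {
    auto clone = std::make_unique<ToyOp>();
    clone->setOperands(remap(mapper, op->operands));
    mapper[op.get()] = clone.get();
    dest.push_back(std::move(clone));
  }
  for (std::size_t i = firstNew; i < dest.size(); ++i)
    dest[i]->setOperands(remap(mapper, dest[i]->operands));
}

// New scheme: first create operand-less clones and map all results, then
// set the operands, which by now all resolve to cloned ops.
void cloneBlockReadOnly(const ToyBlock &source, ToyBlock &dest,
                        ToyMapping &mapper) {
  std::size_t firstNew = dest.size();
  for (const auto &op : source) {
    dest.push_back(std::make_unique<ToyOp>());
    mapper[op.get()] = dest.back().get();
  }
  for (std::size_t i = 0; i < source.size(); ++i)
    dest[firstNew + i]->setOperands(remap(mapper, source[i]->operands));
}
```

In this model, cloning a block whose first op uses the result of a later op leaves the source counters untouched with `cloneBlockReadOnly`, whereas `cloneBlockEager` increments and then decrements the use count of the source op. Operands defined outside the cloned block would still gain uses in both variants unless the caller pre-populates the mapping, which matches the caveat in the code comment above.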

/// Returns 'block' if 'block' lies in this region, or otherwise finds the /// Returns 'block' if 'block' lies in this region, or otherwise finds the

/// ancestor of 'block' that lies in this region. Returns nullptr if the latter /// ancestor of 'block' that lies in this region. Returns nullptr if the latter

/// fails. /// fails.

Block *Region::findAncestorBlockInRegion(Block &block) { Block *Region::findAncestorBlockInRegion(Block &block) {

Block *currBlock = &block; Block *currBlock = &block;

while (currBlock->getParent() != this) { while (currBlock->getParent() != this) {


Revision Contents

Diff 424153

mlir/include/mlir/IR/Operation.h

mlir/include/mlir/IR/Region.h

mlir/lib/IR/Operation.cpp

mlir/lib/IR/Region.cpp
