This is an archive of the discontinued LLVM Phabricator instance.

Differential D108547

Introduced iterative bytecode execution.
Closed, Public

Authored by sfuniak on Aug 23 2021, 5:19 AM.

Details

Reviewers

rriddle
mehdi_amini
Mogball

Commits

rG3eb1647af036: Introduced iterative bytecode execution.

Summary

This is commit 2 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).

This commit implements the features needed for the execution of the new operations pdl_interp.get_accepting_ops and pdl_interp.choose_op:

The implementation of the generation and execution of the two ops.
The addition of a stack of bytecode positions within the ByteCodeExecutor. This is needed because in pdl_interp.choose_op, we iterate over the values returned by pdl_interp.get_accepting_ops until we reach finalize. When we reach finalize, we need to return to the position marked on the stack.
The functionality to extend the lifetime of values that cross the nondeterministic choice. The existing bytecode generator allocates the values to memory positions by representing the liveness of values as a collection of disjoint intervals over the matcher positions. This is akin to register allocation, and substantially reduces the footprint of the bytecode executor. However, with the iterative operation pdl_interp.choose_op, execution "returns" to an earlier position, so any value whose original liveness crosses the nondeterministic choice must have its lifetime extended until finalize.
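
The lifetime extension described in the last item can be sketched roughly as follows. This is a simplified illustration and not the code from the patch: `LiveInterval`, `extendAcrossLoop`, and the position numbering are hypothetical names, and each value is assumed to have a single live interval rather than a collection of disjoint ones.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified model: a value is live over the closed interval
// [start, end] of matcher positions.
struct LiveInterval {
  unsigned start;
  unsigned end;
};

// The iterated section occupies positions [loopBegin, loopEnd]. Execution can
// jump back from loopEnd to loopBegin, so a value defined before the section
// and still live inside it must keep its memory slot until the section ends.
inline void extendAcrossLoop(std::vector<LiveInterval> &values,
                             unsigned loopBegin, unsigned loopEnd) {
  assert(loopBegin <= loopEnd && "invalid loop range");
  for (LiveInterval &value : values) {
    bool definedBeforeLoop = value.start < loopBegin;
    bool liveInsideLoop = value.end >= loopBegin;
    if (definedBeforeLoop && liveInsideLoop)
      value.end = std::max(value.end, loopEnd);
  }
}
```

Values defined inside the iterated section are left alone in this sketch, since they are recomputed on every iteration.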

Testing: pdl-bytecode.mlir test

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sfuniak created this revision. Aug 23 2021, 5:19 AM

Herald added subscribers: wrengr, Chia-hungDuan, dcaballe and 16 others. Aug 23 2021, 5:19 AM

sfuniak requested review of this revision. Aug 23 2021, 5:19 AM

Herald added a project: Restricted Project. Aug 23 2021, 5:19 AM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

sfuniak added a parent revision: D108543: Defines new PDLInterp operations needed for multi-root matching in PDL. Aug 23 2021, 5:21 AM

Harbormaster completed remote builds in B120764: Diff 368071. Aug 23 2021, 5:32 AM

sfuniak added a child revision: D108549: Implementation of the root ordering algorithm. Aug 23 2021, 5:33 AM

Made PDL ByteCode work with the new pdl_interp.foreach region.

This update migrates the ByteCode generator and executor to support the new ForEach construct in the pdl_interp dialect. I have reverted the pdl_interp.finalize operation so that it no longer performs any looping; pdl_interp.continue now advances the current loop to its next iteration.

Herald added a subscriber: wenzhicui. Oct 10 2021, 8:45 PM

Harbormaster completed remote builds in B128028: Diff 378558. Oct 10 2021, 9:13 PM

Nice!

mlir/lib/Rewrite/ByteCode.cpp
303–304	Same comment about debug fields as in the header.
348–350
mlir/lib/Rewrite/ByteCode.h
34	nit: Drop llvm:: for ArrayRef (it's re-exported in `mlir` namespace)
186–187	This is only for LLVM_DEBUG? Can you move this to the bottom and wrap with NDEBUG? We should keep anything debug specific sectioned off from everything else.
187	nit: Drop std:: from size_t

rriddle added inline comments. Oct 12 2021, 12:23 AM

mlir/lib/Rewrite/ByteCode.cpp
450–453	That is... weird. Can you file an llvm bug and reference it here?
579–594
605
1486–1492	llvm::reverse(*block)?
1486–1492	This feels expensive, and I'm also not sure it is correct. If we move the insertion point to the operands, we may end up creating a side-effecting operation in the wrong place. For example, what happens in this situation:
	%cst = some.constant
	%alloc = some.alloc ...
	%load = some.load ...
	<Current Insertion Point>
	If the current insertion point is as shown above and we are about to create a `some.store %cst in %alloc` operation, wouldn't this incorrectly create it above the load (thus leading to a miscompile)?
1705–1707	Why not grab the users from all of the values?
1713–1716
1725–1726	Yeah, the opRangeMemory should hold the actual storage. `memory` should hold the address to the data.

This revision now requires changes to proceed. Oct 12 2021, 12:23 AM

sfuniak added inline comments. Oct 12 2021, 7:42 PM

mlir/lib/Rewrite/ByteCode.cpp
1486–1492	I can see how this change could be problematic. It was mostly introduced, because the current insertion point is at the root of the pattern, and with downward edge traversal, there is no guarantee that all the inputs are defined at this point. However, now that the pdl.rewrite accepts an optional argument forcing the root, this is less of an issue, because the user can manually specify the desired root that's known to work. Furthermore, in our use case, we rely on graph region where the order does not matter anyway. I will revert this change.

General question: what's the cost if I never use multi-root patterns?

In D108547#3060427, @Mogball wrote:

General question: what's the cost if I never use multi-root patterns?

If you do not use multi-root patterns, the increase in cost of PDL-to-PDLInterp lowering is negligible (the system will scan through the pattern to determine the candidate roots, of which there will be exactly one). There is no increase in the cost of the bytecode execution, because there will be no pdl_interp.foreach operations in the resulting PDLInterp IR.

sfuniak added inline comments. Oct 14 2021, 10:03 PM

mlir/lib/Rewrite/ByteCode.cpp
303–304	Actually, `blockToAddr` is always needed to resolve the forward references. It's moved here from `void Generator::generate(ModuleOp module)`. What you probably refer to is `addrToBlock` below. I could guard that one, but as I indicated in my earlier comment, it may be better to determine at runtime whether this map is to be populated, rather than compile time.
mlir/lib/Rewrite/ByteCode.h
186–187	I am a little apprehensive about wrapping this with NDEBUG in a public header. Doing so would break the code if the users of this header file are compiled with a different setting of the NDEBUG flag than ByteCode.cpp. From `llvm/lib/Support/Debug.cpp`: "Even though LLVM might be built with NDEBUG, define symbols that the code built without NDEBUG can depend on via the llvm/Support/Debug.h header." Furthermore, I would eventually like to have debugger-like functionality in `ByteCode`, so that you can execute the operations one by one. So in the long run, we may want to control at runtime whether or not to populate this map.

rriddle added inline comments. Oct 14 2021, 10:07 PM

mlir/lib/Rewrite/ByteCode.h
186–187	It's not a public header though? This can't be included outside of lib/Rewrite. I'd also like to avoid any kind of debugging facilities being baked into release builds; that type of functionality should be reserved for compilation setups that want it.

sfuniak added inline comments. Oct 14 2021, 10:15 PM

mlir/lib/Rewrite/ByteCode.h
186–187	I suppose you are right, it's not a public header. We access the bytecode only through `FrozenRewritePatternSet`. I will do as you suggested.

Addressed review feedback and made further improvements:

Fixed the insertion point on pdl_interp.create_operation.
Fixed the implementation of pdl_interp.get_users to match its semantics specified in stacked diff #1.
Eliminated the extra map storing the block for each instruction address. Instead, we store the line number of the instruction directly in the bytecode when debug messages are turned on.
pdl_interp.foreach no longer consumes the range being iterated over. Instead, we store the index into the range and update this index every time we call pdl_interp.continue.
PDLByteCodeMutableState::opRangeMemory now stores an owning reference to each range.
Improved unit tests coverage.
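
The index-based iteration from the list above can be sketched as follows. This is a hedged illustration only: `LoopState`, `current`, `advance`, and `sumRange` are hypothetical names, a range of `int` stands in for a range of operations, and the reset-to-zero step reflects the invariant stated in the header comment for `loopIndex` rather than the exact control flow of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the interpreter's loop bookkeeping. One index is
// kept per nesting level, and an inactive loop always has index 0.
struct LoopState {
  std::vector<unsigned> loopIndex;

  explicit LoopState(std::size_t maxLoopLevel) : loopIndex(maxLoopLevel, 0u) {}

  // Returns the current element of `range` at nesting `level`, or nullptr once
  // the range is exhausted. The range itself is never modified.
  const int *current(const std::vector<int> &range, unsigned level) {
    assert(level < loopIndex.size() && "loop nesting too deep");
    unsigned &index = loopIndex[level];
    if (index < range.size())
      return &range[index];
    index = 0; // Restore the invariant for the next activation of this loop.
    return nullptr;
  }

  // Models `continue`: advance the loop at `level` to its next element.
  void advance(unsigned level) {
    assert(level < loopIndex.size() && "loop nesting too deep");
    ++loopIndex[level];
  }
};

// Sums all elements by driving the loop the way an interpreter might.
inline int sumRange(LoopState &state, const std::vector<int> &range) {
  int sum = 0;
  while (const int *element = state.current(range, /*level=*/0)) {
    sum += *element;
    state.advance(/*level=*/0);
  }
  return sum;
}
```

Because the range is only read, the same range can be iterated again later without rebuilding it, which is the point of not consuming it.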

Harbormaster completed remote builds in B129700: Diff 380907. Oct 20 2021, 5:17 AM

I am sorry, I am not sure why my recent changes to the PDLInterp files are showing up here. It's as if the base commit was not set right. I did update stacked diff #1. Any ideas?

Attempt to fix a rebase problem.

Harbormaster completed remote builds in B129705: Diff 380912. Oct 20 2021, 5:49 AM

sfuniak added inline comments. Oct 22 2021, 5:10 AM

mlir/lib/Rewrite/ByteCode.cpp
450–453	I figured this out: `IntervalMap` has a user-defined destructor but not a user-defined copy constructor. The compiler-generated implicitly-declared copy constructor is wrong. We should just define a move constructor, which is sufficient for most practical purposes. I will submit a diff next week and reference it here.
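
The class of bug described here (a destructor that frees memory combined with a compiler-generated copy) and the move-constructor remedy can be illustrated with a small sketch. `OwningBuffer` is a hypothetical class written for this example; it is not `IntervalMap`, and the fix that eventually lands upstream may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical owner of a heap array, standing in for a container that frees
// memory in its destructor.
class OwningBuffer {
public:
  explicit OwningBuffer(std::size_t count)
      : data(new int[count]()), size(count) {}
  ~OwningBuffer() { delete[] data; }

  // Transfer ownership and leave the source empty, so the two destructors
  // never free the same pointer. Declaring a move constructor also makes the
  // implicit copy constructor deleted.
  OwningBuffer(OwningBuffer &&other) noexcept
      : data(std::exchange(other.data, nullptr)),
        size(std::exchange(other.size, 0)) {
    assert(other.data == nullptr && "source must be left empty");
  }

  std::size_t getSize() const { return size; }
  bool empty() const { return data == nullptr; }

private:
  int *data;
  std::size_t size;
};

static_assert(!std::is_copy_constructible<OwningBuffer>::value,
              "copying would double-free, so it must not compile");
static_assert(std::is_move_constructible<OwningBuffer>::value,
              "moving is the supported way to transfer the buffer");
```

Without the user-declared move constructor, the implicit copy constructor would copy the raw pointer member-wise and both objects would delete the same array.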

Updated comment with the IntervalMap fix.

Harbormaster completed remote builds in B132609: Diff 384962. Nov 4 2021, 10:04 PM

@rriddle: This is the last remaining diff. If you can re-review it whenever you are free, that would be greatly appreciated. Then we can land all 4 and iterate. Thank you!

Gentle ping, @rriddle @mehdi_amini

Herald added a subscriber: sdasgup3. Nov 14 2021, 9:01 PM

rriddle added inline comments. Nov 14 2021, 11:53 PM

mlir/lib/Rewrite/ByteCode.cpp
591	nit: Drop the trivial braces here.
692–697	The debug related functionality feels separable, can you split this out?
916	Could probably use llvm::SaveAndRestore here.
1080
1507–1508	nit: Please move comments to their own line.
1605	We don't really need the reference.
1737	Walking the user count is going to be O(N), what's the trade-off here vs. not reserving? Are we banking on a small number of users in practice?
1750	typo: acceessing
1751–1753	Why do we need all of this? Can we augment/add an overload of executeGetOperandsResults for the case we are interested in?
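
The reservation question raised for line 1737 comes down to a counting pass plus a single allocation versus a single pass with repeated reallocations. A minimal sketch of the two options, assuming a use list modeled as `std::forward_list<int>`; the names `UserList`, `collectWithReserve`, and `collectWithoutReserve` are hypothetical and not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// The use list is modeled as a singly linked list, so counting the users
// requires a full walk, just like collecting them does.
using UserList = std::forward_list<int>;

// Two passes: count, reserve once, then copy. At most one allocation.
inline std::vector<int> collectWithReserve(const UserList &users) {
  std::size_t count =
      static_cast<std::size_t>(std::distance(users.begin(), users.end()));
  std::vector<int> result;
  result.reserve(count);
  for (int user : users)
    result.push_back(user);
  assert(result.size() == count && "count and copy must agree");
  return result;
}

// One pass: the vector grows on demand, reallocating as it fills up.
inline std::vector<int> collectWithoutReserve(const UserList &users) {
  std::vector<int> result;
  for (int user : users)
    result.push_back(user);
  return result;
}
```

Both variants are linear in the number of users; they differ only in how many allocations they perform, which matters little when the list is short.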

sfuniak marked 20 inline comments as done. Nov 16 2021, 10:28 PM

sfuniak added inline comments.

mlir/lib/Rewrite/ByteCode.cpp
692–697	will do
916	I find `llvm::SaveAndRestore` a bit too magical and no less verbose. It certainly helps in cases when we set a variable to an arbitrary value, but in this case, we are increasing and decreasing level, which I think is cleaner as is. We do not have exceptions to guard against either. But you are the code owner, so the decision ultimately rests with you.
1737	Not reserving will require several allocs, one alloc every time we expand the storage. I figured walking the users upfront would be cheaper (though still O(N)). But on second thought, the number of users will often be small, so maybe `SmallVector` will do.
1751–1753	We need the comparison, because we need to check if the extracted operand(s) at the specified position match(es) the given value(s). I will create an overload of `executeGetOperandsResults` as you suggested that does this.
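
For comparison, the two styles discussed for line 916 look roughly like this. `ScopedRestore` is a hypothetical stand-in written for this sketch; it only mimics the idea behind `llvm::SaveAndRestore` and is not its actual implementation, and `visitManual` and `visitGuarded` are illustrative names.

```cpp
#include <cassert>

// Hypothetical RAII guard: it remembers the old value of a variable, assigns a
// new one, and writes the old value back when the scope ends.
template <typename T> class ScopedRestore {
public:
  ScopedRestore(T &target, T newValue) : ref(target), oldValue(target) {
    ref = newValue;
  }
  ~ScopedRestore() { ref = oldValue; }
  ScopedRestore(const ScopedRestore &) = delete;
  ScopedRestore &operator=(const ScopedRestore &) = delete;

private:
  T &ref;
  T oldValue;
};

// Manual variant: explicit increment and decrement around the nested work.
inline unsigned visitManual(unsigned &level) {
  ++level;
  unsigned seen = level; // The nested work observes the deeper level.
  --level;
  assert(seen == level + 1 && "level must be restored after the nested work");
  return seen;
}

// Guarded variant: the level is restored on every exit path.
inline unsigned visitGuarded(unsigned &level) {
  ScopedRestore<unsigned> guard(level, level + 1);
  return level; // Evaluated before the guard's destructor runs.
}
```

With a single exit path and no exceptions, both variants behave the same, which is the author's argument for keeping the manual form.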

Split out the debug functionality & further review feedback.

Harbormaster completed remote builds in B134663: Diff 387832. Nov 16 2021, 10:45 PM

Mogball added a reviewer: Mogball. Nov 17 2021, 9:17 PM

You mentioned that the cost (when running the bytecode) for pattern sets with no multi-root patterns should be negligible, but wouldn't pattern compilation slow down even for pattern sets without any multi-root patterns? (Because they all have to be scanned).

mlir/lib/Rewrite/ByteCode.cpp
1514	You could just `assert(kind == PDLValue::Kind::Operation)` given that the only other case in this switch is an abort.
1749	If a null `value` is read, you could just exit early (and set `range` to empty).
1755	Same here. If there are no values to read (either null pointer or empty range) you can just return.
1768	The implementation of `get_users` in the bytecode is a little bit complex, stemming from the four cases of whether an index is specified crossed with whether the operand is a value or a value range. Would this be any better with two different ops? One with an index and one without. It'll get rid of the sentinel value too.

There are two places where we do an extra traversal: 1) when verifying the PDL pattern (checking for connectivity), and 2) when forming the predicate tree (detecting the roots). Both are O(E), where E is the total number of operands across all the operations in the PDL pattern -- hardly an expensive operation. Just to be sure, I did a barebones test, where I took 100 copies of test/Conversion/PDLToPDLInterp/pdl-to-pdl-interp-matcher.mlir concatenated together, and ran mlir-opt -split-input-file -convert-pdl-to-pdl-interp > /dev/null on the resulting file. The runtime of my version and the baseline version of mlir-opt were the same.

mlir/lib/Rewrite/ByteCode.cpp
1514	That's true for now, but we are going to soon follow this with another diff where the iteration type is a ValueRange. I thought I'd anticipate the change.
1768	That's a good point, I will create a new op code and split this up into two execute functions.

Split GetUsers into GetUsersAll and GetUsersAt.

Harbormaster completed remote builds in B134869: Diff 388154. Nov 18 2021, 4:45 AM

In D108547#3139714, @sfuniak wrote:

There are two places where we do an extra traversal: 1) when verifying the PDL pattern (checking for connectivity), and 2) when forming the predicate tree (detecting the roots). Both are O(E), where E is the total number of operands across all the operations in the PDL pattern -- hardly an expensive operation. Just to be sure, I did a barebones test, where I took 100 copies of test/Conversion/PDLToPDLInterp/pdl-to-pdl-interp-matcher.mlir concatenated together, and ran mlir-opt -split-input-file -convert-pdl-to-pdl-interp > /dev/null on the resulting file. The runtime of my version and the baseline version of mlir-opt were the same.

Thanks for checking!

Just to be clear, this concern doesn't preclude any of the changes you've suggested. I'm just wondering whether explicitly marking a pattern as single-rooted or multi-rooted would be useful (should the performance have been a concern).

mlir/lib/Rewrite/ByteCode.cpp
1514	Gotcha
1749	Unrelated, but this makes me think we need an `AttrSizedSegments` interface that hides a lot of this work for us.
1777–1791	For this one here, I think you can extract the loop and move it below the if/else of `Value` vs `ValueRange`. These functions look much cleaner now, thanks!

Mogball added inline comments. Nov 18 2021, 9:58 AM

mlir/lib/Rewrite/ByteCode.cpp
888–889

Simplified GetUsers.

Harbormaster completed remote builds in B135025: Diff 388368. Nov 18 2021, 8:31 PM

Looking pretty good to me.

mlir/lib/Rewrite/ByteCode.cpp
1516

rriddle added inline comments. Nov 19 2021, 11:54 AM

mlir/lib/Rewrite/ByteCode.cpp
132	"or the first value in a range": Is this right?
1705–1707	Unresolved?
mlir/lib/Rewrite/ByteCode.h
31	I think we shouldn't define OpRange like this, given that it has a different ownership model than TypeRange/ValueRange.

sfuniak added inline comments. Nov 21 2021, 2:54 PM

mlir/lib/Rewrite/ByteCode.cpp
1705–1707	Grabbing only the first value in a range is consistent with the definition of pdl_interp.get_defining_op. I believe the present definition is correct. You will match the entirety of a value range, so we could take the users of any value. We could also take the intersection of users, but that's more costly and not really needed, because we follow up the users query with operand comparison.
mlir/lib/Rewrite/ByteCode.h
31	Okay... can I just rename it to OwningOpRange then? I still think that using an owning reference below in `std::vector<OpRange> opRangeMemory;` is the right thing to do (no matter what we choose to call it).

Fixed the semantics of get_users for value range and implemented pdl_interp.get_value.

Harbormaster completed remote builds in B135375: Diff 388846. Nov 22 2021, 3:34 AM

LGTM after resolving the comments in the parent and pulling in the changes to here. I'd really like to get this in tree and iterate from there.

mlir/lib/Rewrite/ByteCode.h
31	Yeah, OwningOpRange is fine. Only had problems with the name, just want to avoid a situation where we would forget that OpRange has different semantics.

This revision is now accepted and ready to land. Nov 24 2021, 2:58 PM

Implemented pdl_interp.extract.

Harbormaster completed remote builds in B136134: Diff 389898. Nov 25 2021, 9:28 PM

Minor cleanups.

Harbormaster completed remote builds in B136140: Diff 389904. Nov 25 2021, 10:01 PM

Closed by commit rG3eb1647af036: Introduced iterative bytecode execution. (authored by sfuniak, committed by bondhugula). Nov 26 2021, 4:43 AM

This revision was automatically updated to reflect the committed changes.

bondhugula added a commit: rG3eb1647af036: Introduced iterative bytecode execution.

Revision Contents

Path	Size
mlir/lib/Rewrite/ByteCode.h	16 lines
mlir/lib/Rewrite/ByteCode.cpp	409 lines
mlir/test/Rewrite/pdl-bytecode.mlir	204 lines

Diff 387832

mlir/lib/Rewrite/ByteCode.h

[22 lines elided]

namespace detail {		namespace detail {
class PDLByteCode;		class PDLByteCode;

/// Use generic bytecode types. ByteCodeField refers to the actual bytecode		/// Use generic bytecode types. ByteCodeField refers to the actual bytecode
/// entries. ByteCodeAddr refers to size of indices into the bytecode.		/// entries. ByteCodeAddr refers to size of indices into the bytecode.
using ByteCodeField = uint16_t;		using ByteCodeField = uint16_t;
using ByteCodeAddr = uint32_t;		using ByteCodeAddr = uint32_t;
		using OpRange = llvm::OwningArrayRef<Operation *>;
		rriddle (not done): I think we shouldn't define OpRange like this, given that it has a different ownership model than TypeRange/ValueRange.
		sfuniak (author, done): Okay... can I just rename it to OwningOpRange then? I still think that using an owning reference below in `std::vector<OpRange> opRangeMemory;` is the right thing to do (no matter what we choose to call it).
		rriddle (not done): Yeah, OwningOpRange is fine. Only had problems with the name, just want to avoid a situation where we would forget that OpRange has different semantics.

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// PDLByteCodePattern		// PDLByteCodePattern
		rriddle (done): nit: Drop llvm:: for ArrayRef (it's re-exported in `mlir` namespace)
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// All of the data pertaining to a specific pattern within the bytecode.		/// All of the data pertaining to a specific pattern within the bytecode.
class PDLByteCodePattern : public Pattern {		class PDLByteCodePattern : public Pattern {
public:		public:
static PDLByteCodePattern create(pdl_interp::RecordMatchOp matchOp,		static PDLByteCodePattern create(pdl_interp::RecordMatchOp matchOp,
ByteCodeAddr rewriterAddr);		ByteCodeAddr rewriterAddr);

[33 lines elided]	private:
/// Allow access to data fields.		/// Allow access to data fields.
friend class PDLByteCode;		friend class PDLByteCode;

/// The mutable block of memory used during the matching and rewriting phases		/// The mutable block of memory used during the matching and rewriting phases
/// of the bytecode.		/// of the bytecode.
std::vector<const void *> memory;		std::vector<const void *> memory;

/// A mutable block of memory used during the matching and rewriting phase of		/// A mutable block of memory used during the matching and rewriting phase of
		/// the bytecode to store ranges of operations. These are always stored by
		/// owning references, because at no point in the execution of the byte code
		/// we get an indexed range (view) of operations.
		std::vector<OpRange> opRangeMemory;

		/// A mutable block of memory used during the matching and rewriting phase of
/// the bytecode to store ranges of types.		/// the bytecode to store ranges of types.
std::vector<TypeRange> typeRangeMemory;		std::vector<TypeRange> typeRangeMemory;
/// A set of type ranges that have been allocated by the byte code interpreter		/// A set of type ranges that have been allocated by the byte code interpreter
/// to provide a guaranteed lifetime.		/// to provide a guaranteed lifetime.
std::vector<llvm::OwningArrayRef<Type>> allocatedTypeRangeMemory;		std::vector<llvm::OwningArrayRef<Type>> allocatedTypeRangeMemory;

/// A mutable block of memory used during the matching and rewriting phase of		/// A mutable block of memory used during the matching and rewriting phase of
/// the bytecode to store ranges of values.		/// the bytecode to store ranges of values.
std::vector<ValueRange> valueRangeMemory;		std::vector<ValueRange> valueRangeMemory;
/// A set of value ranges that have been allocated by the byte code		/// A set of value ranges that have been allocated by the byte code
/// interpreter to provide a guaranteed lifetime.		/// interpreter to provide a guaranteed lifetime.
std::vector<llvm::OwningArrayRef<Value>> allocatedValueRangeMemory;		std::vector<llvm::OwningArrayRef<Value>> allocatedValueRangeMemory;

		/// The current index of ranges being iterated over for each level of nesting.
		/// These are always maintained at 0 for the loops that are not active, so we
		/// do not need to have a separate initialization phase for each loop.
		std::vector<unsigned> loopIndex;

/// The up-to-date benefits of the patterns held by the bytecode. The order		/// The up-to-date benefits of the patterns held by the bytecode. The order
/// of this array corresponds 1-1 with the array of patterns in `PDLByteCode`.		/// of this array corresponds 1-1 with the array of patterns in `PDLByteCode`.
std::vector<PatternBenefit> currentPatternBenefits;		std::vector<PatternBenefit> currentPatternBenefits;
};		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// PDLByteCode		// PDLByteCode
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[62 lines elided]	void executeByteCode(const ByteCodeField *inst, PatternRewriter &rewriter,
SmallVectorImpl<MatchResult> *matches) const;		SmallVectorImpl<MatchResult> *matches) const;

/// A vector containing pointers to uniqued data. The storage is intentionally		/// A vector containing pointers to uniqued data. The storage is intentionally
/// opaque such that we can store a wide range of data types. The types of		/// opaque such that we can store a wide range of data types. The types of
/// data stored here include:		/// data stored here include:
/// * Attribute, Identifier, OperationName, Type		/// * Attribute, Identifier, OperationName, Type
std::vector<const void *> uniquedData;		std::vector<const void *> uniquedData;

/// A vector containing the generated bytecode for the matcher.		/// A vector containing the generated bytecode for the matcher.
SmallVector<ByteCodeField, 64> matcherByteCode;		SmallVector<ByteCodeField, 64> matcherByteCode;
		rriddle (done): nit: Drop std:: from size_t
		rriddle (done): This is only for LLVM_DEBUG? Can you move this to the bottom and wrap with NDEBUG? We should keep anything debug specific sectioned off from everything else.
		sfuniak (author, done): I am a little apprehensive about wrapping this with NDEBUG in a public header. Doing so would break the code if the users of this header file are compiled with a different setting of the NDEBUG flag than ByteCode.cpp. From `llvm/lib/Support/Debug.cpp`: "Even though LLVM might be built with NDEBUG, define symbols that the code built without NDEBUG can depend on via the llvm/Support/Debug.h header." Furthermore, I would eventually like to have debugger-like functionality in `ByteCode`, so that you can execute the operations one by one. So in the long run, we may want to control at runtime whether or not to populate this map.
		rriddle (not done): It's not a public header though? This can't be included outside of lib/Rewrite. I'd also like to avoid any kind of debugging facilities being baked into release builds; that type of functionality should be reserved for compilation setups that want it.
		sfuniak (author, done): I suppose you are right, it's not a public header. We access the bytecode only through `FrozenRewritePatternSet`. I will do as you suggested.

/// A vector containing the generated bytecode for all of the rewriters.		/// A vector containing the generated bytecode for all of the rewriters.
SmallVector<ByteCodeField, 64> rewriterByteCode;		SmallVector<ByteCodeField, 64> rewriterByteCode;

/// The set of patterns contained within the bytecode.		/// The set of patterns contained within the bytecode.
SmallVector<PDLByteCodePattern, 32> patterns;		SmallVector<PDLByteCodePattern, 32> patterns;

/// A set of user defined functions invoked via PDL.		/// A set of user defined functions invoked via PDL.
std::vector<PDLConstraintFunction> constraintFunctions;		std::vector<PDLConstraintFunction> constraintFunctions;
std::vector<PDLRewriteFunction> rewriteFunctions;		std::vector<PDLRewriteFunction> rewriteFunctions;

/// The maximum memory index used by a value.		/// The maximum memory index used by a value.
ByteCodeField maxValueMemoryIndex = 0;		ByteCodeField maxValueMemoryIndex = 0;

/// The maximum number of different types of ranges.		/// The maximum number of different types of ranges.
		ByteCodeField maxOpRangeCount = 0;
ByteCodeField maxTypeRangeCount = 0;		ByteCodeField maxTypeRangeCount = 0;
ByteCodeField maxValueRangeCount = 0;		ByteCodeField maxValueRangeCount = 0;

		/// The maximum number of nested loops.
		ByteCodeField maxLoopLevel = 0;
};		};

} // end namespace detail		} // end namespace detail
} // end namespace mlir		} // end namespace mlir

#endif // MLIR_REWRITE_BYTECODE_H_		#endif // MLIR_REWRITE_BYTECODE_H_

mlir/lib/Rewrite/ByteCode.cpp

[89 lines elided] enum OpCode : ByteCodeField {

/// Compare the operand count of an operation with a constant. /// Compare the operand count of an operation with a constant.

CheckOperandCount, CheckOperandCount,

/// Compare the name of an operation with a constant. /// Compare the name of an operation with a constant.

CheckOperationName, CheckOperationName,

/// Compare the result count of an operation with a constant. /// Compare the result count of an operation with a constant.

CheckResultCount, CheckResultCount,

/// Compare a range of types to a constant range of types. /// Compare a range of types to a constant range of types.

CheckTypes, CheckTypes,

/// Continue to the next iteration of a loop.

Continue,

/// Create an operation. /// Create an operation.

CreateOperation, CreateOperation,

/// Create a range of types. /// Create a range of types.

CreateTypes, CreateTypes,

/// Erase an operation. /// Erase an operation.

EraseOp, EraseOp,

/// Terminate a matcher or rewrite sequence. /// Terminate a matcher or rewrite sequence.

Finalize, Finalize,

/// Iterate over a range of values.

ForEach,

/// Get a specific attribute of an operation. /// Get a specific attribute of an operation.

GetAttribute, GetAttribute,

/// Get the type of an attribute. /// Get the type of an attribute.

GetAttributeType, GetAttributeType,

/// Get the defining operation of a value. /// Get the defining operation of a value.

GetDefiningOp, GetDefiningOp,

/// Get a specific operand of an operation. /// Get a specific operand of an operation.

GetOperand0, GetOperand0,

GetOperand1, GetOperand1,

GetOperand2, GetOperand2,

GetOperand3, GetOperand3,

GetOperandN, GetOperandN,

/// Get a specific operand group of an operation. /// Get a specific operand group of an operation.

GetOperands, GetOperands,

/// Get a specific result of an operation. /// Get a specific result of an operation.

GetResult0, GetResult0,

GetResult1, GetResult1,

GetResult2, GetResult2,

GetResult3, GetResult3,

GetResultN, GetResultN,

/// Get a specific result group of an operation. /// Get a specific result group of an operation.

GetResults, GetResults,

/// Gets all users of a value and store them in a range.

rriddle (not done): "or the first value in a range": Is this right?

GetUsers,

/// Get the type of a value. /// Get the type of a value.

GetValueType, GetValueType,

/// Get the types of a value range. /// Get the types of a value range.

GetValueRangeTypes, GetValueRangeTypes,

/// Check if a generic value is not null. /// Check if a generic value is not null.

IsNotNull, IsNotNull,

/// Record a successful pattern match. /// Record a successful pattern match.

RecordMatch, RecordMatch,

[17 lines elided]

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// ByteCode Generation // ByteCode Generation

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// Generator // Generator

namespace { namespace {

struct ByteCodeLiveRange;

struct ByteCodeWriter; struct ByteCodeWriter;

/// This class represents the main generator for the pattern bytecode. /// This class represents the main generator for the pattern bytecode.

class Generator { class Generator {

public: public:

Generator(MLIRContext *ctx, std::vector<const void *> &uniquedData, Generator(MLIRContext *ctx, std::vector<const void *> &uniquedData,

SmallVectorImpl<ByteCodeField> &matcherByteCode, SmallVectorImpl<ByteCodeField> &matcherByteCode,

SmallVectorImpl<ByteCodeField> &rewriterByteCode, SmallVectorImpl<ByteCodeField> &rewriterByteCode,

SmallVectorImpl<PDLByteCodePattern> &patterns, SmallVectorImpl<PDLByteCodePattern> &patterns,

ByteCodeField &maxValueMemoryIndex, ByteCodeField &maxValueMemoryIndex,

ByteCodeField &maxOpRangeMemoryIndex,

ByteCodeField &maxTypeRangeMemoryIndex, ByteCodeField &maxTypeRangeMemoryIndex,

ByteCodeField &maxValueRangeMemoryIndex, ByteCodeField &maxValueRangeMemoryIndex,

ByteCodeField &maxLoopLevel,

llvm::StringMap<PDLConstraintFunction> &constraintFns, llvm::StringMap<PDLConstraintFunction> &constraintFns,

llvm::StringMap<PDLRewriteFunction> &rewriteFns) llvm::StringMap<PDLRewriteFunction> &rewriteFns)

: ctx(ctx), uniquedData(uniquedData), matcherByteCode(matcherByteCode), : ctx(ctx), uniquedData(uniquedData), matcherByteCode(matcherByteCode),

rewriterByteCode(rewriterByteCode), patterns(patterns), rewriterByteCode(rewriterByteCode), patterns(patterns),

maxValueMemoryIndex(maxValueMemoryIndex), maxValueMemoryIndex(maxValueMemoryIndex),

maxOpRangeMemoryIndex(maxOpRangeMemoryIndex),

maxTypeRangeMemoryIndex(maxTypeRangeMemoryIndex), maxTypeRangeMemoryIndex(maxTypeRangeMemoryIndex),

maxValueRangeMemoryIndex(maxValueRangeMemoryIndex) { maxValueRangeMemoryIndex(maxValueRangeMemoryIndex),

maxLoopLevel(maxLoopLevel) {

for (auto it : llvm::enumerate(constraintFns)) for (auto it : llvm::enumerate(constraintFns))

constraintToMemIndex.try_emplace(it.value().first(), it.index()); constraintToMemIndex.try_emplace(it.value().first(), it.index());

for (auto it : llvm::enumerate(rewriteFns)) for (auto it : llvm::enumerate(rewriteFns))

externalRewriterToMemIndex.try_emplace(it.value().first(), it.index()); externalRewriterToMemIndex.try_emplace(it.value().first(), it.index());

} }

/// Generate the bytecode for the given PDL interpreter module. /// Generate the bytecode for the given PDL interpreter module.

void generate(ModuleOp module); void generate(ModuleOp module);

[... 28 lines hidden; enclosing context: public: ...]

} }

private: private:

/// Allocate memory indices for the results of operations within the matcher /// Allocate memory indices for the results of operations within the matcher

/// and rewriters. /// and rewriters.

void allocateMemoryIndices(FuncOp matcherFunc, ModuleOp rewriterModule); void allocateMemoryIndices(FuncOp matcherFunc, ModuleOp rewriterModule);

/// Generate the bytecode for the given operation. /// Generate the bytecode for the given operation.

void generate(Region *region, ByteCodeWriter &writer);

void generate(Operation *op, ByteCodeWriter &writer); void generate(Operation *op, ByteCodeWriter &writer);

void generate(pdl_interp::ApplyConstraintOp op, ByteCodeWriter &writer); void generate(pdl_interp::ApplyConstraintOp op, ByteCodeWriter &writer);

void generate(pdl_interp::ApplyRewriteOp op, ByteCodeWriter &writer); void generate(pdl_interp::ApplyRewriteOp op, ByteCodeWriter &writer);

void generate(pdl_interp::AreEqualOp op, ByteCodeWriter &writer); void generate(pdl_interp::AreEqualOp op, ByteCodeWriter &writer);

void generate(pdl_interp::BranchOp op, ByteCodeWriter &writer); void generate(pdl_interp::BranchOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CheckAttributeOp op, ByteCodeWriter &writer); void generate(pdl_interp::CheckAttributeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CheckOperandCountOp op, ByteCodeWriter &writer); void generate(pdl_interp::CheckOperandCountOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CheckOperationNameOp op, ByteCodeWriter &writer); void generate(pdl_interp::CheckOperationNameOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CheckResultCountOp op, ByteCodeWriter &writer); void generate(pdl_interp::CheckResultCountOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CheckTypeOp op, ByteCodeWriter &writer); void generate(pdl_interp::CheckTypeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CheckTypesOp op, ByteCodeWriter &writer); void generate(pdl_interp::CheckTypesOp op, ByteCodeWriter &writer);

void generate(pdl_interp::ContinueOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CreateAttributeOp op, ByteCodeWriter &writer); void generate(pdl_interp::CreateAttributeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CreateOperationOp op, ByteCodeWriter &writer); void generate(pdl_interp::CreateOperationOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CreateTypeOp op, ByteCodeWriter &writer); void generate(pdl_interp::CreateTypeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::CreateTypesOp op, ByteCodeWriter &writer); void generate(pdl_interp::CreateTypesOp op, ByteCodeWriter &writer);

void generate(pdl_interp::EraseOp op, ByteCodeWriter &writer); void generate(pdl_interp::EraseOp op, ByteCodeWriter &writer);

void generate(pdl_interp::FinalizeOp op, ByteCodeWriter &writer); void generate(pdl_interp::FinalizeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetAttributeOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetAttributeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetAttributeTypeOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetAttributeTypeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetDefiningOpOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetDefiningOpOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetOperandOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetOperandOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetOperandsOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetOperandsOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetResultOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetResultOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetResultsOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetResultsOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetUsersOp op, ByteCodeWriter &writer);

void generate(pdl_interp::GetValueTypeOp op, ByteCodeWriter &writer); void generate(pdl_interp::GetValueTypeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::InferredTypesOp op, ByteCodeWriter &writer); void generate(pdl_interp::InferredTypesOp op, ByteCodeWriter &writer);

void generate(pdl_interp::IsNotNullOp op, ByteCodeWriter &writer); void generate(pdl_interp::IsNotNullOp op, ByteCodeWriter &writer);

void generate(pdl_interp::ForEachOp op, ByteCodeWriter &writer);

void generate(pdl_interp::RecordMatchOp op, ByteCodeWriter &writer); void generate(pdl_interp::RecordMatchOp op, ByteCodeWriter &writer);

void generate(pdl_interp::ReplaceOp op, ByteCodeWriter &writer); void generate(pdl_interp::ReplaceOp op, ByteCodeWriter &writer);

void generate(pdl_interp::SwitchAttributeOp op, ByteCodeWriter &writer); void generate(pdl_interp::SwitchAttributeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::SwitchTypeOp op, ByteCodeWriter &writer); void generate(pdl_interp::SwitchTypeOp op, ByteCodeWriter &writer);

void generate(pdl_interp::SwitchTypesOp op, ByteCodeWriter &writer); void generate(pdl_interp::SwitchTypesOp op, ByteCodeWriter &writer);

void generate(pdl_interp::SwitchOperandCountOp op, ByteCodeWriter &writer); void generate(pdl_interp::SwitchOperandCountOp op, ByteCodeWriter &writer);

void generate(pdl_interp::SwitchOperationNameOp op, ByteCodeWriter &writer); void generate(pdl_interp::SwitchOperationNameOp op, ByteCodeWriter &writer);

void generate(pdl_interp::SwitchResultCountOp op, ByteCodeWriter &writer); void generate(pdl_interp::SwitchResultCountOp op, ByteCodeWriter &writer);

[... 15 lines hidden; enclosing context: private: ...]

/// Mapping from rewriter function name to the bytecode address of the /// Mapping from rewriter function name to the bytecode address of the

/// rewriter function in byte. /// rewriter function in byte.

llvm::StringMap<ByteCodeAddr> rewriterToAddr; llvm::StringMap<ByteCodeAddr> rewriterToAddr;

/// Mapping from a uniqued storage object to its memory index within /// Mapping from a uniqued storage object to its memory index within

/// `uniquedData`. /// `uniquedData`.

DenseMap<const void *, ByteCodeField> uniquedDataToMemIndex; DenseMap<const void *, ByteCodeField> uniquedDataToMemIndex;

/// The current level of the foreach loop.

ByteCodeField curLoopLevel = 0;

/// The current MLIR context. /// The current MLIR context.

MLIRContext *ctx; MLIRContext *ctx;

/// Mapping from block to its address.

DenseMap<Block *, ByteCodeAddr> blockToAddr;

rriddle (Done)

Same comment about debug fields as in the header.

sfuniak, Author (Done)

Actually, `blockToAddr` is always needed to resolve the forward references. It was moved here from `void Generator::generate(ModuleOp module)`. You are probably referring to `addrToBlock` below. I could guard that one, but as I indicated in my earlier comment, it may be better to decide at runtime, rather than at compile time, whether this map is populated.

/// Data of the ByteCode class to be populated. /// Data of the ByteCode class to be populated.

std::vector<const void *> &uniquedData; std::vector<const void *> &uniquedData;

SmallVectorImpl<ByteCodeField> &matcherByteCode; SmallVectorImpl<ByteCodeField> &matcherByteCode;

SmallVectorImpl<ByteCodeField> &rewriterByteCode; SmallVectorImpl<ByteCodeField> &rewriterByteCode;

SmallVectorImpl<PDLByteCodePattern> &patterns; SmallVectorImpl<PDLByteCodePattern> &patterns;

ByteCodeField &maxValueMemoryIndex; ByteCodeField &maxValueMemoryIndex;

ByteCodeField &maxOpRangeMemoryIndex;

ByteCodeField &maxTypeRangeMemoryIndex; ByteCodeField &maxTypeRangeMemoryIndex;

ByteCodeField &maxValueRangeMemoryIndex; ByteCodeField &maxValueRangeMemoryIndex;

ByteCodeField &maxLoopLevel;

}; };

/// This class provides utilities for writing a bytecode stream. /// This class provides utilities for writing a bytecode stream.

struct ByteCodeWriter { struct ByteCodeWriter {

ByteCodeWriter(SmallVectorImpl<ByteCodeField> &bytecode, Generator &generator) ByteCodeWriter(SmallVectorImpl<ByteCodeField> &bytecode, Generator &generator)

: bytecode(bytecode), generator(generator) {} : bytecode(bytecode), generator(generator) {}

/// Append a field to the bytecode. /// Append a field to the bytecode.

void append(ByteCodeField field) { bytecode.push_back(field); } void append(ByteCodeField field) { bytecode.push_back(field); }

void append(OpCode opCode) { bytecode.push_back(opCode); } void append(OpCode opCode) { bytecode.push_back(opCode); }

/// Append an address to the bytecode. /// Append an address to the bytecode.

void append(ByteCodeAddr field) { void append(ByteCodeAddr field) {

static_assert((sizeof(ByteCodeAddr) / sizeof(ByteCodeField)) == 2, static_assert((sizeof(ByteCodeAddr) / sizeof(ByteCodeField)) == 2,

"unexpected ByteCode address size"); "unexpected ByteCode address size");

ByteCodeField fieldParts[2]; ByteCodeField fieldParts[2];

std::memcpy(fieldParts, &field, sizeof(ByteCodeAddr)); std::memcpy(fieldParts, &field, sizeof(ByteCodeAddr));

bytecode.append({fieldParts[0], fieldParts[1]}); bytecode.append({fieldParts[0], fieldParts[1]});

} }
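The address-splitting step above can be sketched in isolation. This is a minimal illustration, not the patch's code: `Field`, `Addr`, `appendAddr`, and `readAddr` are hypothetical stand-ins, and the 16-bit and 32-bit widths are assumptions chosen so that one address occupies exactly two fields, as the `static_assert` in the diff requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for ByteCodeField and ByteCodeAddr; the widths are
// assumptions made for this sketch.
using Field = std::uint16_t;
using Addr = std::uint32_t;

// Append an address as two consecutive fields, copying the raw bytes so that
// no assumption about endianness is baked into the encoding.
void appendAddr(std::vector<Field> &code, Addr addr) {
  static_assert(sizeof(Addr) / sizeof(Field) == 2, "unexpected address size");
  Field parts[2];
  std::memcpy(parts, &addr, sizeof(Addr));
  code.push_back(parts[0]);
  code.push_back(parts[1]);
}

// Read an address back from the two fields starting at `offset`.
Addr readAddr(const std::vector<Field> &code, std::size_t offset) {
  assert(offset + 2 <= code.size() && "address read out of bounds");
  Addr addr;
  std::memcpy(&addr, &code[offset], sizeof(Addr));
  return addr;
}
```

Because both directions use `memcpy` on the same machine, the round trip is exact regardless of byte order.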

/// Append a successor range to the bytecode, the exact address will need to /// Append a single successor to the bytecode, the exact address will need to

/// be resolved later. /// be resolved later.

void append(SuccessorRange successors) { void append(Block *successor) {

// Add back references to the any successors so that the address can be // Add back a reference to the successor so that the address can be resolved

// resolved later. // later.

for (Block *successor : successors) {

unresolvedSuccessorRefs[successor].push_back(bytecode.size()); unresolvedSuccessorRefs[successor].push_back(bytecode.size());

append(ByteCodeAddr(0)); append(ByteCodeAddr(0));

} }

/// Append a successor range to the bytecode, the exact address will need to

/// be resolved later.

void append(SuccessorRange successors) {

for (Block *successor : successors)

append(successor);

rriddle (Done)

void append(SuccessorRange successors) {

- for (Block *successor : successors) {

+ for (Block *successor : successors)

append(successor);

- }

+ }

/// Append a range of values that will be read as generic PDLValues.

} }

/// Append a range of values that will be read as generic PDLValues. /// Append a range of values that will be read as generic PDLValues.

void appendPDLValueList(OperandRange values) { void appendPDLValueList(OperandRange values) {

bytecode.push_back(values.size()); bytecode.push_back(values.size());

for (Value value : values) for (Value value : values)

appendPDLValue(value); appendPDLValue(value);

} }

/// Append a value as a PDLValue. /// Append a value as a PDLValue.

void appendPDLValue(Value value) { void appendPDLValue(Value value) {

appendPDLValueKind(value); appendPDLValueKind(value);

append(value); append(value);

} }

/// Append the PDLValue::Kind of the given value. /// Append the PDLValue::Kind of the given value.

void appendPDLValueKind(Value value) { void appendPDLValueKind(Value value) { appendPDLValueKind(value.getType()); }

// Append the type of the value in addition to the value itself.

/// Append the PDLValue::Kind of the given type.

void appendPDLValueKind(Type type) {

PDLValue::Kind kind = PDLValue::Kind kind =

TypeSwitch<Type, PDLValue::Kind>(value.getType()) TypeSwitch<Type, PDLValue::Kind>(type)

.Case<pdl::AttributeType>( .Case<pdl::AttributeType>(

[](Type) { return PDLValue::Kind::Attribute; }) [](Type) { return PDLValue::Kind::Attribute; })

.Case<pdl::OperationType>( .Case<pdl::OperationType>(

[](Type) { return PDLValue::Kind::Operation; }) [](Type) { return PDLValue::Kind::Operation; })

.Case<pdl::RangeType>([](pdl::RangeType rangeTy) { .Case<pdl::RangeType>([](pdl::RangeType rangeTy) {

if (rangeTy.getElementType().isa<pdl::TypeType>()) if (rangeTy.getElementType().isa<pdl::TypeType>())

return PDLValue::Kind::TypeRange; return PDLValue::Kind::TypeRange;

return PDLValue::Kind::ValueRange; return PDLValue::Kind::ValueRange;

[... 40 lines hidden; enclosing context: struct ByteCodeWriter { ...]

/// The main generator producing PDL. /// The main generator producing PDL.

Generator &generator; Generator &generator;

}; };

/// This class represents a live range of PDL Interpreter values, containing /// This class represents a live range of PDL Interpreter values, containing

/// information about when values are live within a match/rewrite. /// information about when values are live within a match/rewrite.

struct ByteCodeLiveRange { struct ByteCodeLiveRange {

using Set = llvm::IntervalMap<ByteCodeField, char, 16>; using Set = llvm::IntervalMap<uint64_t, char, 16>;

using Allocator = Set::Allocator; using Allocator = Set::Allocator;

ByteCodeLiveRange(Allocator &alloc) : liveness(alloc) {} ByteCodeLiveRange(Allocator &alloc) : liveness(new Set(alloc)) {}

/// Union this live range with the one provided. /// Union this live range with the one provided.

void unionWith(const ByteCodeLiveRange &rhs) { void unionWith(const ByteCodeLiveRange &rhs) {

for (auto it = rhs.liveness.begin(), e = rhs.liveness.end(); it != e; ++it) for (auto it = rhs.liveness->begin(), e = rhs.liveness->end(); it != e;

liveness.insert(it.start(), it.stop(), /*dummyValue*/ 0); ++it)

liveness->insert(it.start(), it.stop(), /*dummyValue*/ 0);

} }

/// Returns true if this range overlaps with the one provided. /// Returns true if this range overlaps with the one provided.

bool overlaps(const ByteCodeLiveRange &rhs) const { bool overlaps(const ByteCodeLiveRange &rhs) const {

return llvm::IntervalMapOverlaps<Set, Set>(liveness, rhs.liveness).valid(); return llvm::IntervalMapOverlaps<Set, Set>(*liveness, *rhs.liveness)

.valid();

} }

/// A map representing the ranges of the match/rewrite that a value is live in /// A map representing the ranges of the match/rewrite that a value is live in

/// the interpreter. /// the interpreter.

llvm::IntervalMap<ByteCodeField, char, 16> liveness; ///

/// We use std::unique_ptr here, because IntervalMap does not provide a

/// correct copy or move constructor. We can eliminate the pointer once

/// https://reviews.llvm.org/D113240 lands.

std::unique_ptr<llvm::IntervalMap<uint64_t, char, 16>> liveness;

rriddle (Done)

That is... weird. Can you file an llvm bug and reference it here?

sfuniak, Author (Done)

I figured this out: `IntervalMap` has a user-defined destructor but no user-defined copy constructor, so the implicitly declared, compiler-generated copy constructor is wrong. We should just define a move constructor, which is sufficient for most practical purposes. I will submit a diff next week and reference it here.

/// The operation range storage index for this range.

Optional<unsigned> opRangeIndex;

/// The type range storage index for this range. /// The type range storage index for this range.

Optional<unsigned> typeRangeIndex; Optional<unsigned> typeRangeIndex;

/// The value range storage index for this range. /// The value range storage index for this range.

Optional<unsigned> valueRangeIndex; Optional<unsigned> valueRangeIndex;

}; };
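The `std::unique_ptr` workaround discussed in the thread above can be illustrated without LLVM. In this sketch, `OwningSet` is a hypothetical stand-in for a container that owns heap memory and must not be copied member-wise; its copy operations are deleted here to make that restriction explicit, which is an assumption of the example rather than how `IntervalMap` is declared. Holding it behind a pointer makes the enclosing struct movable, so it can live in a growing container.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical container that owns heap memory. A member-wise copy would
// duplicate the raw pointer and free it twice, so copying is deleted.
class OwningSet {
public:
  OwningSet() : data(new int[4]()), size(0) {}
  ~OwningSet() { delete[] data; }
  OwningSet(const OwningSet &) = delete;
  OwningSet &operator=(const OwningSet &) = delete;

  void insert(int value) {
    assert(size < 4 && "sketch capacity exceeded");
    data[size++] = value;
  }
  int count() const { return size; }

private:
  int *data;
  int size;
};

// Wrapping the set in a unique_ptr gives the holder a correct implicit move
// constructor: only the pointer moves, the set itself never does.
struct LiveRange {
  LiveRange() : liveness(new OwningSet()) {}
  std::unique_ptr<OwningSet> liveness;
};

static_assert(!std::is_move_constructible<OwningSet>::value,
              "the raw set can be neither copied nor moved");
static_assert(std::is_move_constructible<LiveRange>::value,
              "the holder is movable");
static_assert(!std::is_copy_constructible<LiveRange>::value,
              "the holder is still not copyable");

// Store holders in a vector and force reallocations; the first set survives.
int countAfterGrowth() {
  std::vector<LiveRange> ranges;
  ranges.emplace_back();
  ranges.back().liveness->insert(42);
  for (int i = 0; i < 100; ++i)
    ranges.emplace_back();
  return ranges.front().liveness->count();
}
```

The cost of this design is one extra heap allocation and indirection per range, which is why the code comment in the diff anticipates removing the pointer once the container itself gains a proper move constructor.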

} // end anonymous namespace } // end anonymous namespace

[... 15 lines hidden; enclosing context: for (FuncOp rewriterFunc : rewriterModule.getOps<FuncOp>()) { ...]

rewriterToAddr.try_emplace(rewriterFunc.getName(), rewriterByteCode.size()); rewriterToAddr.try_emplace(rewriterFunc.getName(), rewriterByteCode.size());

for (Operation &op : rewriterFunc.getOps()) for (Operation &op : rewriterFunc.getOps())

generate(&op, rewriterByteCodeWriter); generate(&op, rewriterByteCodeWriter);

} }

assert(rewriterByteCodeWriter.unresolvedSuccessorRefs.empty() && assert(rewriterByteCodeWriter.unresolvedSuccessorRefs.empty() &&

"unexpected branches in rewriter function"); "unexpected branches in rewriter function");

// Generate code for the matcher function. // Generate code for the matcher function.

DenseMap<Block *, ByteCodeAddr> blockToAddr;

llvm::ReversePostOrderTraversal<Region *> rpot(&matcherFunc.getBody());

ByteCodeWriter matcherByteCodeWriter(matcherByteCode, *this); ByteCodeWriter matcherByteCodeWriter(matcherByteCode, *this);

for (Block *block : rpot) { generate(&matcherFunc.getBody(), matcherByteCodeWriter);

// Keep track of where this block begins within the matcher function.

blockToAddr.try_emplace(block, matcherByteCode.size());

for (Operation &op : *block)

generate(&op, matcherByteCodeWriter);

}

// Resolve successor references in the matcher. // Resolve successor references in the matcher.

for (auto &it : matcherByteCodeWriter.unresolvedSuccessorRefs) { for (auto &it : matcherByteCodeWriter.unresolvedSuccessorRefs) {

ByteCodeAddr addr = blockToAddr[it.first]; ByteCodeAddr addr = blockToAddr[it.first];

for (unsigned offsetToFix : it.second) for (unsigned offsetToFix : it.second)

std::memcpy(&matcherByteCode[offsetToFix], &addr, sizeof(ByteCodeAddr)); std::memcpy(&matcherByteCode[offsetToFix], &addr, sizeof(ByteCodeAddr));

} }
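The forward-reference resolution above follows a common backpatching scheme: emit a placeholder address, remember its offset, and overwrite it once every block's start address is known. The sketch below shows that scheme on its own. `Emitter`, the integer block ids, and the opcode value `1` are all hypothetical, and the field and address widths are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

using Field = std::uint16_t; // assumed field width
using Addr = std::uint32_t;  // assumed address width (two fields)

struct Emitter {
  std::vector<Field> code;
  std::map<int, Addr> blockToAddr;                    // block id -> start offset
  std::map<int, std::vector<std::size_t>> unresolved; // block id -> patch sites

  // Record where a block starts in the emitted stream.
  void beginBlock(int block) {
    blockToAddr.emplace(block, static_cast<Addr>(code.size()));
  }

  // Emit a branch whose target may not have been emitted yet.
  void emitBranch(int target) {
    code.push_back(1); // hypothetical "branch" opcode
    unresolved[target].push_back(code.size());
    code.push_back(0); // placeholder for the two address fields
    code.push_back(0);
  }

  // Overwrite every placeholder with the now-known block address.
  void resolve() {
    for (const auto &entry : unresolved) {
      auto it = blockToAddr.find(entry.first);
      assert(it != blockToAddr.end() && "branch to a block never emitted");
      Addr addr = it->second;
      for (std::size_t offset : entry.second)
        std::memcpy(&code[offset], &addr, sizeof(Addr));
    }
    unresolved.clear();
  }

  // Decode the address stored at `offset`.
  Addr addrAt(std::size_t offset) const {
    Addr addr;
    std::memcpy(&addr, &code[offset], sizeof(Addr));
    return addr;
  }
};
```

Keeping `blockToAddr` alive until `resolve` runs is what makes this work, which matches the author's point in the review thread that the map is always needed and is not a debug-only field.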

[... 30 lines hidden; enclosing context: void Generator::allocateMemoryIndices(FuncOp matcherFunc, ...]

} }

// The matcher function uses a more sophisticated numbering that tries to // The matcher function uses a more sophisticated numbering that tries to

// minimize the number of memory indices assigned. This is done by determining // minimize the number of memory indices assigned. This is done by determining

// a live range of the values within the matcher, then the allocation is just // a live range of the values within the matcher, then the allocation is just

// finding the minimal number of overlapping live ranges. This is essentially // finding the minimal number of overlapping live ranges. This is essentially

// a simplified form of register allocation where we don't necessarily have a // a simplified form of register allocation where we don't necessarily have a

// limited number of registers, but we still want to minimize the number used. // limited number of registers, but we still want to minimize the number used.

DenseMap<Operation *, ByteCodeField> opToIndex; DenseMap<Operation *, unsigned> opToIndex;

matcherFunc.getBody().walk([&](Operation *op) { matcherFunc.getBody().walk([&](Operation *op) {

opToIndex.insert(std::make_pair(op, opToIndex.size())); opToIndex.insert(std::make_pair(op, opToIndex.size()));

}); });

// Liveness info for each of the defs within the matcher. // Liveness info for each of the defs within the matcher.

ByteCodeLiveRange::Allocator allocator; ByteCodeLiveRange::Allocator allocator;

DenseMap<Value, ByteCodeLiveRange> valueDefRanges; DenseMap<Value, ByteCodeLiveRange> valueDefRanges;

// Assign the root operation being matched to slot 0. // Assign the root operation being matched to slot 0.

BlockArgument rootOpArg = matcherFunc.getArgument(0); BlockArgument rootOpArg = matcherFunc.getArgument(0);

valueToMemIndex[rootOpArg] = 0; valueToMemIndex[rootOpArg] = 0;

// Walk each of the blocks, computing the def interval that the value is used. // Walk each of the blocks, computing the def interval that the value is used.

Liveness matcherLiveness(matcherFunc); Liveness matcherLiveness(matcherFunc);

for (Block &block : matcherFunc.getBody()) { matcherFunc->walk([&](Block *block) {

const LivenessBlockInfo *info = matcherLiveness.getLiveness(&block); const LivenessBlockInfo *info = matcherLiveness.getLiveness(block);

assert(info && "expected liveness info for block"); assert(info && "expected liveness info for block");

auto processValue = [&](Value value, Operation *firstUseOrDef) { auto processValue = [&](Value value, Operation *firstUseOrDef) {

// We don't need to process the root op argument, this value is always // We don't need to process the root op argument, this value is always

// assigned to the first memory slot. // assigned to the first memory slot.

if (value == rootOpArg) if (value == rootOpArg)

return; return;

// Set indices for the range of this block that the value is used. // Set indices for the range of this block that the value is used.

auto defRangeIt = valueDefRanges.try_emplace(value, allocator).first; auto defRangeIt = valueDefRanges.try_emplace(value, allocator).first;

defRangeIt->second.liveness.insert( defRangeIt->second.liveness->insert(

opToIndex[firstUseOrDef], opToIndex[firstUseOrDef],

opToIndex[info->getEndOperation(value, firstUseOrDef)], opToIndex[info->getEndOperation(value, firstUseOrDef)],

/*dummyValue*/ 0); /*dummyValue*/ 0);

// Check to see if this value is a range type. // Check to see if this value is a range type.

if (auto rangeTy = value.getType().dyn_cast<pdl::RangeType>()) { if (auto rangeTy = value.getType().dyn_cast<pdl::RangeType>()) {

Type eleType = rangeTy.getElementType(); Type eleType = rangeTy.getElementType();

if (eleType.isa<pdl::TypeType>()) if (eleType.isa<pdl::OperationType>())

defRangeIt->second.opRangeIndex = 0;

else if (eleType.isa<pdl::TypeType>())

defRangeIt->second.typeRangeIndex = 0; defRangeIt->second.typeRangeIndex = 0;

else if (eleType.isa<pdl::ValueType>()) else if (eleType.isa<pdl::ValueType>())

defRangeIt->second.valueRangeIndex = 0; defRangeIt->second.valueRangeIndex = 0;

} }

}; };

// Process the live-ins of this block. // Process the live-ins of this block.

for (Value liveIn : info->in()) for (Value liveIn : info->in()) {

processValue(liveIn, &block.front()); // Only process the value if it has been defined in the current region.

// Other values that span across pdl_interp.foreach will be added higher

// up. This ensures that we keep them alive for the entire duration

// of the loop.

if (liveIn.getParentRegion() == block->getParent())

processValue(liveIn, &block->front());

}

// Process the block arguments for the entry block (those are not live-in).

if (block->isEntryBlock()) {

for (Value argument : block->getArguments())

rriddle (Done)

nit: Drop the trivial braces here.

processValue(argument, &block->front());

}

rriddle (Done)

// Process the live-ins of this block.

- for (Value liveIn : info->in())

+ for (Value liveIn : info->in()) {

// Only process the value if it has been defined in the current region.

// Other values that span across pdl_interp.foreach will be added higher

// up. This ensures that we keep them alive for the entire duration

// of the loop.

if (liveIn.getParentRegion() == block->getParent())

processValue(liveIn, &block->front());

+ }

// Process any new defs within this block.

// Process any new defs within this block. // Process any new defs within this block.

for (Operation &op : block) for (Operation &op : *block)

for (Value result : op.getResults()) for (Value result : op.getResults())

processValue(result, &op); processValue(result, &op);

} });

// Greedily allocate memory slots using the computed def live ranges. // Greedily allocate memory slots using the computed def live ranges.

std::vector<ByteCodeLiveRange> allocatedIndices; std::vector<ByteCodeLiveRange> allocatedIndices;

ByteCodeField numIndices = 1, numTypeRanges = 0, numValueRanges = 0;

// The number of memory indices currently allocated (and its next value).

// Recall that the root gets allocated memory index 0.

rriddle (Done)

// The number of memory indices currently allocated (and its next value).

- // Recall that the roots gets allocated memory index 0.

+ // Recall that the roots get allocated memory index 0.

ByteCodeField numIndices = 1;

ByteCodeField numIndices = 1;

// The number of memory ranges of various types (and their next values).

ByteCodeField numOpRanges = 0, numTypeRanges = 0, numValueRanges = 0;

for (auto &defIt : valueDefRanges) { for (auto &defIt : valueDefRanges) {

ByteCodeField &memIndex = valueToMemIndex[defIt.first]; ByteCodeField &memIndex = valueToMemIndex[defIt.first];

ByteCodeLiveRange &defRange = defIt.second; ByteCodeLiveRange &defRange = defIt.second;

// Try to allocate to an existing index. // Try to allocate to an existing index.

for (auto existingIndexIt : llvm::enumerate(allocatedIndices)) { for (auto existingIndexIt : llvm::enumerate(allocatedIndices)) {

ByteCodeLiveRange &existingRange = existingIndexIt.value(); ByteCodeLiveRange &existingRange = existingIndexIt.value();

if (!defRange.overlaps(existingRange)) { if (!defRange.overlaps(existingRange)) {

existingRange.unionWith(defRange); existingRange.unionWith(defRange);

memIndex = existingIndexIt.index() + 1; memIndex = existingIndexIt.index() + 1;

if (defRange.typeRangeIndex) { if (defRange.opRangeIndex) {

if (!existingRange.opRangeIndex)

existingRange.opRangeIndex = numOpRanges++;

valueToRangeIndex[defIt.first] = *existingRange.opRangeIndex;

} else if (defRange.typeRangeIndex) {

if (!existingRange.typeRangeIndex) if (!existingRange.typeRangeIndex)

existingRange.typeRangeIndex = numTypeRanges++; existingRange.typeRangeIndex = numTypeRanges++;

valueToRangeIndex[defIt.first] = *existingRange.typeRangeIndex; valueToRangeIndex[defIt.first] = *existingRange.typeRangeIndex;

} else if (defRange.valueRangeIndex) { } else if (defRange.valueRangeIndex) {

if (!existingRange.valueRangeIndex) if (!existingRange.valueRangeIndex)

existingRange.valueRangeIndex = numValueRanges++; existingRange.valueRangeIndex = numValueRanges++;

valueToRangeIndex[defIt.first] = *existingRange.valueRangeIndex; valueToRangeIndex[defIt.first] = *existingRange.valueRangeIndex;

} }

break; break;

} }

// If no existing index could be used, add a new one. // If no existing index could be used, add a new one.

if (memIndex == 0) { if (memIndex == 0) {

allocatedIndices.emplace_back(allocator); allocatedIndices.emplace_back(allocator);

ByteCodeLiveRange &newRange = allocatedIndices.back(); ByteCodeLiveRange &newRange = allocatedIndices.back();

newRange.unionWith(defRange); newRange.unionWith(defRange);

// Allocate an index for type/value ranges. // Allocate an index for op/type/value ranges.

if (defRange.typeRangeIndex) { if (defRange.opRangeIndex) {

newRange.opRangeIndex = numOpRanges;

valueToRangeIndex[defIt.first] = numOpRanges++;

} else if (defRange.typeRangeIndex) {

newRange.typeRangeIndex = numTypeRanges; newRange.typeRangeIndex = numTypeRanges;

valueToRangeIndex[defIt.first] = numTypeRanges++; valueToRangeIndex[defIt.first] = numTypeRanges++;

} else if (defRange.valueRangeIndex) { } else if (defRange.valueRangeIndex) {

newRange.valueRangeIndex = numValueRanges; newRange.valueRangeIndex = numValueRanges;

valueToRangeIndex[defIt.first] = numValueRanges++; valueToRangeIndex[defIt.first] = numValueRanges++;

} }

memIndex = allocatedIndices.size(); memIndex = allocatedIndices.size();

++numIndices; ++numIndices;

} }

// Print the index usage and ensure that we did not run out of index space.

LLVM_DEBUG({

llvm::dbgs() << "Allocated " << allocatedIndices.size() << " indices "

<< "(down from initial " << valueDefRanges.size() << ").\n";

});

assert(allocatedIndices.size() <= std::numeric_limits<ByteCodeField>::max() &&

"Ran out of memory for allocated indices");

// Update the max number of indices. // Update the max number of indices.

if (numIndices > maxValueMemoryIndex) if (numIndices > maxValueMemoryIndex)

maxValueMemoryIndex = numIndices; maxValueMemoryIndex = numIndices;

if (numOpRanges > maxOpRangeMemoryIndex)

maxOpRangeMemoryIndex = numOpRanges;

if (numTypeRanges > maxTypeRangeMemoryIndex) if (numTypeRanges > maxTypeRangeMemoryIndex)

maxTypeRangeMemoryIndex = numTypeRanges; maxTypeRangeMemoryIndex = numTypeRanges;

if (numValueRanges > maxValueRangeMemoryIndex) if (numValueRanges > maxValueRangeMemoryIndex)

maxValueRangeMemoryIndex = numValueRanges; maxValueRangeMemoryIndex = numValueRanges;

} }
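The greedy allocation performed by `allocateMemoryIndices` can be shown on plain intervals. The sketch below is a simplified model under stated assumptions: a live range is a list of closed `[start, stop]` intervals over operation indices, slot 0 is reserved for the root, and the op/type/value range bookkeeping is omitted. `Interval`, `LiveRange`, `overlaps`, and `allocateSlots` are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A live range modeled as closed intervals [start, stop] over op indices.
using Interval = std::pair<unsigned, unsigned>;
using LiveRange = std::vector<Interval>;

// Two live ranges overlap if any pair of their intervals intersects.
bool overlaps(const LiveRange &a, const LiveRange &b) {
  for (const Interval &x : a)
    for (const Interval &y : b)
      if (x.first <= y.second && y.first <= x.second)
        return true;
  return false;
}

// Assign each value the first slot whose merged live range does not overlap
// its own, or open a new slot. Slot 0 is reserved, so slots start at 1.
std::vector<unsigned> allocateSlots(const std::vector<LiveRange> &values) {
  std::vector<LiveRange> slots; // merged live range of each allocated slot
  std::vector<unsigned> result;
  for (const LiveRange &range : values) {
    unsigned slot = 0;
    for (std::size_t i = 0; i < slots.size(); ++i) {
      if (!overlaps(range, slots[i])) {
        // Reuse the slot and extend its live range (the "unionWith" step).
        slots[i].insert(slots[i].end(), range.begin(), range.end());
        slot = static_cast<unsigned>(i) + 1;
        break;
      }
    }
    if (slot == 0) {
      slots.push_back(range);
      slot = static_cast<unsigned>(slots.size());
    }
    result.push_back(slot);
  }
  return result;
}
```

With live ranges `[0,3]`, `[2,5]`, `[4,6]`, and `[6,8]`, the four values share two slots. This first-fit strategy does not guarantee the minimum number of slots in general, but it is cheap and needs no fixed register count.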

void Generator::generate(Region *region, ByteCodeWriter &writer) {

llvm::ReversePostOrderTraversal<Region *> rpot(region);

for (Block *block : rpot) {

// Keep track of where this block begins within the matcher function.

blockToAddr.try_emplace(block, matcherByteCode.size());

for (Operation &op : *block)

generate(&op, writer);

}
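The region-level `generate` above visits blocks in reverse post-order, so that, back edges aside, a block is emitted before its successors. A minimal sketch of that ordering on a plain adjacency list follows. It assumes block 0 is the entry block; `postOrder` and `reversePostOrder` are hypothetical helpers, and the visiting order among siblings may differ from what `llvm::ReversePostOrderTraversal` produces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Depth-first walk that appends a block after all of its unvisited successors.
static void postOrder(const std::vector<std::vector<int>> &successors,
                      int block, std::vector<bool> &visited,
                      std::vector<int> &order) {
  visited[static_cast<std::size_t>(block)] = true;
  for (int next : successors[static_cast<std::size_t>(block)])
    if (!visited[static_cast<std::size_t>(next)])
      postOrder(successors, next, visited, order);
  order.push_back(block);
}

// Reverse post-order starting from block 0, assumed to be the entry block.
// Blocks unreachable from the entry are not visited.
std::vector<int> reversePostOrder(
    const std::vector<std::vector<int>> &successors) {
  std::vector<bool> visited(successors.size(), false);
  std::vector<int> order;
  if (!successors.empty())
    postOrder(successors, 0, visited, order);
  std::reverse(order.begin(), order.end());
  return order;
}
```

For a diamond-shaped control-flow graph (0 branches to 1 and 2, both of which branch to 3), the entry comes first and the join block last, which is the property the generator relies on when it records each block's start address as it goes.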

void Generator::generate(Operation *op, ByteCodeWriter &writer) { void Generator::generate(Operation *op, ByteCodeWriter &writer) {

TypeSwitch<Operation *>(op) TypeSwitch<Operation *>(op)

.Case<pdl_interp::ApplyConstraintOp, pdl_interp::ApplyRewriteOp, .Case<pdl_interp::ApplyConstraintOp, pdl_interp::ApplyRewriteOp,

pdl_interp::AreEqualOp, pdl_interp::BranchOp, pdl_interp::AreEqualOp, pdl_interp::BranchOp,

pdl_interp::CheckAttributeOp, pdl_interp::CheckOperandCountOp, pdl_interp::CheckAttributeOp, pdl_interp::CheckOperandCountOp,

pdl_interp::CheckOperationNameOp, pdl_interp::CheckResultCountOp, pdl_interp::CheckOperationNameOp, pdl_interp::CheckResultCountOp,

pdl_interp::CheckTypeOp, pdl_interp::CheckTypesOp, pdl_interp::CheckTypeOp, pdl_interp::CheckTypesOp,

rriddle (Done)

The debug-related functionality feels separable; can you split this out?

sfuniak, Author (Done)

Will do.

pdl_interp::CreateAttributeOp, pdl_interp::CreateOperationOp, pdl_interp::ContinueOp, pdl_interp::CreateAttributeOp,

pdl_interp::CreateTypeOp, pdl_interp::CreateTypesOp, pdl_interp::CreateOperationOp, pdl_interp::CreateTypeOp,

pdl_interp::EraseOp, pdl_interp::FinalizeOp, pdl_interp::CreateTypesOp, pdl_interp::EraseOp,

pdl_interp::FinalizeOp, pdl_interp::ForEachOp,

pdl_interp::GetAttributeOp, pdl_interp::GetAttributeTypeOp, pdl_interp::GetAttributeOp, pdl_interp::GetAttributeTypeOp,

pdl_interp::GetDefiningOpOp, pdl_interp::GetOperandOp, pdl_interp::GetDefiningOpOp, pdl_interp::GetOperandOp,

pdl_interp::GetOperandsOp, pdl_interp::GetResultOp, pdl_interp::GetOperandsOp, pdl_interp::GetResultOp,

pdl_interp::GetResultsOp, pdl_interp::GetValueTypeOp, pdl_interp::GetResultsOp, pdl_interp::GetUsersOp,

pdl_interp::InferredTypesOp, pdl_interp::IsNotNullOp, pdl_interp::GetValueTypeOp, pdl_interp::InferredTypesOp,

pdl_interp::RecordMatchOp, pdl_interp::ReplaceOp, pdl_interp::IsNotNullOp, pdl_interp::RecordMatchOp,

pdl_interp::SwitchAttributeOp, pdl_interp::SwitchTypeOp, pdl_interp::ReplaceOp, pdl_interp::SwitchAttributeOp,

pdl_interp::SwitchTypesOp, pdl_interp::SwitchOperandCountOp, pdl_interp::SwitchTypeOp, pdl_interp::SwitchTypesOp,

pdl_interp::SwitchOperationNameOp, pdl_interp::SwitchResultCountOp>( pdl_interp::SwitchOperandCountOp, pdl_interp::SwitchOperationNameOp,

pdl_interp::SwitchResultCountOp>(

[&](auto interpOp) { this->generate(interpOp, writer); }) [&](auto interpOp) { this->generate(interpOp, writer); })

.Default([](Operation *) { .Default([](Operation *) {

llvm_unreachable("unknown `pdl_interp` operation"); llvm_unreachable("unknown `pdl_interp` operation");

}); });

} }

void Generator::generate(pdl_interp::ApplyConstraintOp op, void Generator::generate(pdl_interp::ApplyConstraintOp op,

ByteCodeWriter &writer) { ByteCodeWriter &writer) {

▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines writer.append(OpCode::CheckResultCount, op.operation(), op.count(),

op.getSuccessors()); op.getSuccessors());

} }

void Generator::generate(pdl_interp::CheckTypeOp op, ByteCodeWriter &writer) { void Generator::generate(pdl_interp::CheckTypeOp op, ByteCodeWriter &writer) {

writer.append(OpCode::AreEqual, op.value(), op.type(), op.getSuccessors()); writer.append(OpCode::AreEqual, op.value(), op.type(), op.getSuccessors());

} }

void Generator::generate(pdl_interp::CheckTypesOp op, ByteCodeWriter &writer) { void Generator::generate(pdl_interp::CheckTypesOp op, ByteCodeWriter &writer) {

writer.append(OpCode::CheckTypes, op.value(), op.types(), op.getSuccessors()); writer.append(OpCode::CheckTypes, op.value(), op.types(), op.getSuccessors());

} }

void Generator::generate(pdl_interp::ContinueOp op, ByteCodeWriter &writer) {

assert(curLoopLevel > 0 && "encountered pdl_interp.continue at top level");

writer.append(OpCode::Continue, ByteCodeField(curLoopLevel - 1));

}

void Generator::generate(pdl_interp::CreateAttributeOp op, void Generator::generate(pdl_interp::CreateAttributeOp op,

ByteCodeWriter &writer) { ByteCodeWriter &writer) {

// Simply repoint the memory index of the result to the constant. // Simply repoint the memory index of the result to the constant.

getMemIndex(op.attribute()) = getMemIndex(op.value()); getMemIndex(op.attribute()) = getMemIndex(op.value());

} }

void Generator::generate(pdl_interp::CreateOperationOp op, void Generator::generate(pdl_interp::CreateOperationOp op,

ByteCodeWriter &writer) { ByteCodeWriter &writer) {

writer.append(OpCode::CreateOperation, op.operation(), writer.append(OpCode::CreateOperation, op.operation(),

▲ Show 20 Lines • Show All 73 Lines • ▼ Show 20 Lines writer.append(OpCode::GetResults,

index.getValueOr(std::numeric_limits<uint32_t>::max()), index.getValueOr(std::numeric_limits<uint32_t>::max()),

op.operation()); op.operation());

if (result.getType().isa<pdl::RangeType>()) if (result.getType().isa<pdl::RangeType>())

writer.append(getRangeStorageIndex(result)); writer.append(getRangeStorageIndex(result));

else else

writer.append(std::numeric_limits<ByteCodeField>::max()); writer.append(std::numeric_limits<ByteCodeField>::max());

writer.append(result); writer.append(result);

} }

void Generator::generate(pdl_interp::GetUsersOp op, ByteCodeWriter &writer) {

Value operations = op.operations();

Optional<uint32_t> index = op.index();

writer.append(OpCode::GetUsers, operations, getRangeStorageIndex(operations),

index.getValueOr(std::numeric_limits<uint32_t>::max()));

MogballUnsubmitted

Not Done

ByteCodeField rangeIndex = getRangeStorageIndex(operations);

- Optional<uint32_t> index = op.index();

- if (index)

+ if (Optional<uint32_t> index = op.index())

writer.append(OpCode::GetUsersAt, operations, rangeIndex, *index);

Mogball:

writer.appendPDLValue(op.value());

}

void Generator::generate(pdl_interp::GetValueTypeOp op, void Generator::generate(pdl_interp::GetValueTypeOp op,

ByteCodeWriter &writer) { ByteCodeWriter &writer) {

if (op.getType().isa<pdl::RangeType>()) { if (op.getType().isa<pdl::RangeType>()) {

Value result = op.result(); Value result = op.result();

writer.append(OpCode::GetValueRangeTypes, result, writer.append(OpCode::GetValueRangeTypes, result,

getRangeStorageIndex(result), op.value()); getRangeStorageIndex(result), op.value());

} else { } else {

writer.append(OpCode::GetValueType, op.result(), op.value()); writer.append(OpCode::GetValueType, op.result(), op.value());

} }

void Generator::generate(pdl_interp::InferredTypesOp op, void Generator::generate(pdl_interp::InferredTypesOp op,

ByteCodeWriter &writer) { ByteCodeWriter &writer) {

// InferType maps to a null type as a marker for inferring result types. // InferType maps to a null type as a marker for inferring result types.

getMemIndex(op.type()) = getMemIndex(Type()); getMemIndex(op.type()) = getMemIndex(Type());

} }

void Generator::generate(pdl_interp::IsNotNullOp op, ByteCodeWriter &writer) { void Generator::generate(pdl_interp::IsNotNullOp op, ByteCodeWriter &writer) {

writer.append(OpCode::IsNotNull, op.value(), op.getSuccessors()); writer.append(OpCode::IsNotNull, op.value(), op.getSuccessors());

} }

void Generator::generate(pdl_interp::ForEachOp op, ByteCodeWriter &writer) {

BlockArgument arg = op.getLoopVariable();

writer.append(OpCode::ForEach, getRangeStorageIndex(op.values()), arg);

writer.appendPDLValueKind(arg.getType());

writer.append(curLoopLevel, op.successor());

++curLoopLevel;

rriddleUnsubmitted

Not Done

Could probably use llvm::SaveAndRestore here.

sfuniakAuthorUnsubmitted

Done

I find llvm::SaveAndRestore a bit too magical and no less verbose. It certainly helps when we set a variable to an arbitrary value, but in this case we are only incrementing and decrementing the level, which I think is cleaner as is. We do not have exceptions to guard against either. But you are the code owner, so the decision ultimately rests with you.
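For readers weighing the two options in this thread, here is a minimal sketch of both styles side by side. It does not use LLVM: `ScopedRestore` is a hypothetical stand-in for `llvm::SaveAndRestore`, and `LoopTracker` is an assumed miniature of the generator's `curLoopLevel`/`maxLoopLevel` bookkeeping, not the code from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::SaveAndRestore: stores the old value,
// assigns the new one, and writes the old value back on scope exit.
template <typename T>
class ScopedRestore {
public:
  ScopedRestore(T &target, T newValue) : ref(target), old(target) {
    ref = newValue;
  }
  ~ScopedRestore() { ref = old; }
  ScopedRestore(const ScopedRestore &) = delete;
  ScopedRestore &operator=(const ScopedRestore &) = delete;

private:
  T &ref;
  T old;
};

// Assumed miniature of the generator's loop-level bookkeeping.
struct LoopTracker {
  unsigned curLoopLevel = 0;
  unsigned maxLoopLevel = 0;

  // Variant A: manual increment and decrement around the loop body.
  template <typename Body>
  void enterManual(Body body) {
    ++curLoopLevel;
    if (curLoopLevel > maxLoopLevel)
      maxLoopLevel = curLoopLevel;
    body();
    --curLoopLevel;
  }

  // Variant B: the guard restores the level when it goes out of scope.
  template <typename Body>
  void enterGuarded(Body body) {
    ScopedRestore<unsigned> guard(curLoopLevel, curLoopLevel + 1);
    if (curLoopLevel > maxLoopLevel)
      maxLoopLevel = curLoopLevel;
    body();
  }
};
```

Both variants leave `curLoopLevel` unchanged after the body returns normally; the guard only adds value when the body can exit early, which is the point sfuniak makes about the absence of exceptions.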

if (curLoopLevel > maxLoopLevel)

maxLoopLevel = curLoopLevel;

generate(&op.region(), writer);

--curLoopLevel;

}

void Generator::generate(pdl_interp::RecordMatchOp op, ByteCodeWriter &writer) { void Generator::generate(pdl_interp::RecordMatchOp op, ByteCodeWriter &writer) {

ByteCodeField patternIndex = patterns.size(); ByteCodeField patternIndex = patterns.size();

patterns.emplace_back(PDLByteCodePattern::create( patterns.emplace_back(PDLByteCodePattern::create(

op, rewriterToAddr[op.rewriter().getLeafReference().getValue()])); op, rewriterToAddr[op.rewriter().getLeafReference().getValue()]));

writer.append(OpCode::RecordMatch, patternIndex, writer.append(OpCode::RecordMatch, patternIndex,

SuccessorRange(op.getOperation()), op.matchedOps()); SuccessorRange(op.getOperation()), op.matchedOps());

writer.appendPDLValueList(op.inputs()); writer.appendPDLValueList(op.inputs());

} }

Show All 37 Lines

// PDLByteCode // PDLByteCode

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

PDLByteCode::PDLByteCode(ModuleOp module, PDLByteCode::PDLByteCode(ModuleOp module,

llvm::StringMap<PDLConstraintFunction> constraintFns, llvm::StringMap<PDLConstraintFunction> constraintFns,

llvm::StringMap<PDLRewriteFunction> rewriteFns) { llvm::StringMap<PDLRewriteFunction> rewriteFns) {

Generator generator(module.getContext(), uniquedData, matcherByteCode, Generator generator(module.getContext(), uniquedData, matcherByteCode,

rewriterByteCode, patterns, maxValueMemoryIndex, rewriterByteCode, patterns, maxValueMemoryIndex,

maxTypeRangeCount, maxValueRangeCount, constraintFns, maxOpRangeCount, maxTypeRangeCount, maxValueRangeCount,

rewriteFns); maxLoopLevel, constraintFns, rewriteFns);

generator.generate(module); generator.generate(module);

// Initialize the external functions. // Initialize the external functions.

for (auto &it : constraintFns) for (auto &it : constraintFns)

constraintFunctions.push_back(std::move(it.second)); constraintFunctions.push_back(std::move(it.second));

for (auto &it : rewriteFns) for (auto &it : rewriteFns)

rewriteFunctions.push_back(std::move(it.second)); rewriteFunctions.push_back(std::move(it.second));

} }

/// Initialize the given state such that it can be used to execute the current /// Initialize the given state such that it can be used to execute the current

/// bytecode. /// bytecode.

void PDLByteCode::initializeMutableState(PDLByteCodeMutableState &state) const { void PDLByteCode::initializeMutableState(PDLByteCodeMutableState &state) const {

state.memory.resize(maxValueMemoryIndex, nullptr); state.memory.resize(maxValueMemoryIndex, nullptr);

state.opRangeMemory.resize(maxOpRangeCount);

state.typeRangeMemory.resize(maxTypeRangeCount, TypeRange()); state.typeRangeMemory.resize(maxTypeRangeCount, TypeRange());

state.valueRangeMemory.resize(maxValueRangeCount, ValueRange()); state.valueRangeMemory.resize(maxValueRangeCount, ValueRange());

state.loopIndex.resize(maxLoopLevel, 0);

state.currentPatternBenefits.reserve(patterns.size()); state.currentPatternBenefits.reserve(patterns.size());

for (const PDLByteCodePattern &pattern : patterns) for (const PDLByteCodePattern &pattern : patterns)

state.currentPatternBenefits.push_back(pattern.getBenefit()); state.currentPatternBenefits.push_back(pattern.getBenefit());

} }

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// ByteCode Execution // ByteCode Execution

namespace { namespace {

/// This class provides support for executing a bytecode stream. /// This class provides support for executing a bytecode stream.

class ByteCodeExecutor { class ByteCodeExecutor {

public: public:

ByteCodeExecutor( ByteCodeExecutor(

const ByteCodeField *curCodeIt, MutableArrayRef<const void *> memory, const ByteCodeField *curCodeIt, MutableArrayRef<const void *> memory,

MutableArrayRef<llvm::OwningArrayRef<Operation *>> opRangeMemory,

MutableArrayRef<TypeRange> typeRangeMemory, MutableArrayRef<TypeRange> typeRangeMemory,

std::vector<llvm::OwningArrayRef<Type>> &allocatedTypeRangeMemory, std::vector<llvm::OwningArrayRef<Type>> &allocatedTypeRangeMemory,

MutableArrayRef<ValueRange> valueRangeMemory, MutableArrayRef<ValueRange> valueRangeMemory,

std::vector<llvm::OwningArrayRef<Value>> &allocatedValueRangeMemory, std::vector<llvm::OwningArrayRef<Value>> &allocatedValueRangeMemory,

ArrayRef<const void *> uniquedMemory, ArrayRef<ByteCodeField> code, MutableArrayRef<unsigned> loopIndex, ArrayRef<const void *> uniquedMemory,

ArrayRef<ByteCodeField> code,

ArrayRef<PatternBenefit> currentPatternBenefits, ArrayRef<PatternBenefit> currentPatternBenefits,

ArrayRef<PDLByteCodePattern> patterns, ArrayRef<PDLByteCodePattern> patterns,

ArrayRef<PDLConstraintFunction> constraintFunctions, ArrayRef<PDLConstraintFunction> constraintFunctions,

ArrayRef<PDLRewriteFunction> rewriteFunctions) ArrayRef<PDLRewriteFunction> rewriteFunctions)

: curCodeIt(curCodeIt), memory(memory), typeRangeMemory(typeRangeMemory), : curCodeIt(curCodeIt), memory(memory), opRangeMemory(opRangeMemory),

typeRangeMemory(typeRangeMemory),

allocatedTypeRangeMemory(allocatedTypeRangeMemory), allocatedTypeRangeMemory(allocatedTypeRangeMemory),

valueRangeMemory(valueRangeMemory), valueRangeMemory(valueRangeMemory),

allocatedValueRangeMemory(allocatedValueRangeMemory), allocatedValueRangeMemory(allocatedValueRangeMemory),

uniquedMemory(uniquedMemory), code(code), loopIndex(loopIndex), uniquedMemory(uniquedMemory), code(code),

currentPatternBenefits(currentPatternBenefits), patterns(patterns), currentPatternBenefits(currentPatternBenefits), patterns(patterns),

constraintFunctions(constraintFunctions), constraintFunctions(constraintFunctions),

rewriteFunctions(rewriteFunctions) {} rewriteFunctions(rewriteFunctions) {}

/// Start executing the code at the current bytecode index. `matches` is an /// Start executing the code at the current bytecode index. `matches` is an

/// optional field provided when this function is executed in a matching /// optional field provided when this function is executed in a matching

/// context. /// context.

void execute(PatternRewriter &rewriter, void execute(PatternRewriter &rewriter,

SmallVectorImpl<PDLByteCode::MatchResult> *matches = nullptr, SmallVectorImpl<PDLByteCode::MatchResult> *matches = nullptr,

Optional<Location> mainRewriteLoc = {}); Optional<Location> mainRewriteLoc = {});

private: private:

/// Internal implementation of executing each of the bytecode commands. /// Internal implementation of executing each of the bytecode commands.

void executeApplyConstraint(PatternRewriter &rewriter); void executeApplyConstraint(PatternRewriter &rewriter);

void executeApplyRewrite(PatternRewriter &rewriter); void executeApplyRewrite(PatternRewriter &rewriter);

void executeAreEqual(); void executeAreEqual();

void executeAreRangesEqual(); void executeAreRangesEqual();

void executeBranch(); void executeBranch();

void executeCheckOperandCount(); void executeCheckOperandCount();

void executeCheckOperationName(); void executeCheckOperationName();

void executeCheckResultCount(); void executeCheckResultCount();

void executeCheckTypes(); void executeCheckTypes();

void executeContinue();

void executeCreateOperation(PatternRewriter &rewriter, void executeCreateOperation(PatternRewriter &rewriter,

Location mainRewriteLoc); Location mainRewriteLoc);

void executeCreateTypes(); void executeCreateTypes();

void executeEraseOp(PatternRewriter &rewriter); void executeEraseOp(PatternRewriter &rewriter);

void executeFinalize();

void executeForEach();

void executeGetAttribute(); void executeGetAttribute();

void executeGetAttributeType(); void executeGetAttributeType();

void executeGetDefiningOp(); void executeGetDefiningOp();

void executeGetOperand(unsigned index); void executeGetOperand(unsigned index);

void executeGetOperands(); void executeGetOperands();

void executeGetResult(unsigned index); void executeGetResult(unsigned index);

void executeGetResults(); void executeGetResults();

void executeGetUsers();

void executeGetValueType(); void executeGetValueType();

void executeGetValueRangeTypes(); void executeGetValueRangeTypes();

void executeIsNotNull(); void executeIsNotNull();

void executeRecordMatch(PatternRewriter &rewriter, void executeRecordMatch(PatternRewriter &rewriter,

SmallVectorImpl<PDLByteCode::MatchResult> &matches); SmallVectorImpl<PDLByteCode::MatchResult> &matches);

void executeReplaceOp(PatternRewriter &rewriter); void executeReplaceOp(PatternRewriter &rewriter);

void executeSwitchAttribute(); void executeSwitchAttribute();

void executeSwitchOperandCount(); void executeSwitchOperandCount();

void executeSwitchOperationName(); void executeSwitchOperationName();

void executeSwitchResultCount(); void executeSwitchResultCount();

void executeSwitchType(); void executeSwitchType();

void executeSwitchTypes(); void executeSwitchTypes();

/// Pushes a code iterator to the stack.

void pushCodeIt(const ByteCodeField *it) { resumeCodeIt.push_back(it); }

/// Pops a code iterator from the stack and resumes execution from it.

void popCodeIt() {

assert(!resumeCodeIt.empty() && "attempt to pop code off empty stack");

rriddleUnsubmitted

Done

void popCodeIt() {

- assert(!resumeCodeIt.empty() && "Attempt to pop code off empty stack");

+ assert(!resumeCodeIt.empty() && "attempt to pop code off empty stack");

curCodeIt = resumeCodeIt.back();

rriddle:

curCodeIt = resumeCodeIt.back();

resumeCodeIt.pop_back();

}

/// Read a value from the bytecode buffer, optionally skipping a certain /// Read a value from the bytecode buffer, optionally skipping a certain

/// number of prefix values. These methods always update the buffer to point /// number of prefix values. These methods always update the buffer to point

/// to the next field after the read data. /// to the next field after the read data.

template <typename T = ByteCodeField> template <typename T = ByteCodeField>

T read(size_t skipN = 0) { T read(size_t skipN = 0) {

curCodeIt += skipN; curCodeIt += skipN;

return readImpl<T>(); return readImpl<T>();

} }

▲ Show 20 Lines • Show All 104 Lines • ▼ Show 20 Lines private:

template <typename T> template <typename T>

std::enable_if_t<std::is_same<T, PDLValue::Kind>::value, T> readImpl() { std::enable_if_t<std::is_same<T, PDLValue::Kind>::value, T> readImpl() {

return static_cast<PDLValue::Kind>(readImpl<ByteCodeField>()); return static_cast<PDLValue::Kind>(readImpl<ByteCodeField>());

} }

/// The underlying bytecode buffer. /// The underlying bytecode buffer.

const ByteCodeField *curCodeIt; const ByteCodeField *curCodeIt;

/// The stack of bytecode positions at which to resume operation.

SmallVector<const ByteCodeField *> resumeCodeIt;

/// The current execution memory. /// The current execution memory.

MutableArrayRef<const void *> memory; MutableArrayRef<const void *> memory;

MutableArrayRef<OpRange> opRangeMemory;

MutableArrayRef<TypeRange> typeRangeMemory; MutableArrayRef<TypeRange> typeRangeMemory;

std::vector<llvm::OwningArrayRef<Type>> &allocatedTypeRangeMemory; std::vector<llvm::OwningArrayRef<Type>> &allocatedTypeRangeMemory;

MutableArrayRef<ValueRange> valueRangeMemory; MutableArrayRef<ValueRange> valueRangeMemory;

std::vector<llvm::OwningArrayRef<Value>> &allocatedValueRangeMemory; std::vector<llvm::OwningArrayRef<Value>> &allocatedValueRangeMemory;

/// The current loop indices.

MutableArrayRef<unsigned> loopIndex;

/// References to ByteCode data necessary for execution. /// References to ByteCode data necessary for execution.

ArrayRef<const void *> uniquedMemory; ArrayRef<const void *> uniquedMemory;

ArrayRef<ByteCodeField> code; ArrayRef<ByteCodeField> code;

ArrayRef<PatternBenefit> currentPatternBenefits; ArrayRef<PatternBenefit> currentPatternBenefits;

ArrayRef<PDLByteCodePattern> patterns; ArrayRef<PDLByteCodePattern> patterns;

ArrayRef<PDLConstraintFunction> constraintFunctions; ArrayRef<PDLConstraintFunction> constraintFunctions;

ArrayRef<PDLRewriteFunction> rewriteFunctions; ArrayRef<PDLRewriteFunction> rewriteFunctions;

}; };

▲ Show 20 Lines • Show All 178 Lines • ▼ Show 20 Lines void ByteCodeExecutor::executeCheckTypes() {

LLVM_DEBUG(llvm::dbgs() << "Executing AreEqual:\n"); LLVM_DEBUG(llvm::dbgs() << "Executing AreEqual:\n");

TypeRange *lhs = read<TypeRange *>(); TypeRange *lhs = read<TypeRange *>();

Attribute rhs = read<Attribute>(); Attribute rhs = read<Attribute>();

LLVM_DEBUG(llvm::dbgs() << " * " << lhs << " == " << rhs << "\n\n"); LLVM_DEBUG(llvm::dbgs() << " * " << lhs << " == " << rhs << "\n\n");

selectJump(*lhs == rhs.cast<ArrayAttr>().getAsValueRange<TypeAttr>()); selectJump(*lhs == rhs.cast<ArrayAttr>().getAsValueRange<TypeAttr>());

} }

void ByteCodeExecutor::executeContinue() {

ByteCodeField level = read();

LLVM_DEBUG(llvm::dbgs() << "Executing Continue\n"

<< " * Level: " << level << "\n");

++loopIndex[level];

popCodeIt();

}

void ByteCodeExecutor::executeCreateTypes() { void ByteCodeExecutor::executeCreateTypes() {

LLVM_DEBUG(llvm::dbgs() << "Executing CreateTypes:\n"); LLVM_DEBUG(llvm::dbgs() << "Executing CreateTypes:\n");

unsigned memIndex = read(); unsigned memIndex = read();

unsigned rangeIndex = read(); unsigned rangeIndex = read();

ArrayAttr typesAttr = read<Attribute>().cast<ArrayAttr>(); ArrayAttr typesAttr = read<Attribute>().cast<ArrayAttr>();

LLVM_DEBUG(llvm::dbgs() << " * Types: " << typesAttr << "\n\n"); LLVM_DEBUG(llvm::dbgs() << " * Types: " << typesAttr << "\n\n");

▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines void ByteCodeExecutor::executeCreateOperation(PatternRewriter &rewriter,

Operation *resultOp = rewriter.createOperation(state); Operation *resultOp = rewriter.createOperation(state);

memory[memIndex] = resultOp; memory[memIndex] = resultOp;

LLVM_DEBUG({ LLVM_DEBUG({

llvm::dbgs() << " * Attributes: " llvm::dbgs() << " * Attributes: "

<< state.attributes.getDictionary(state.getContext()) << state.attributes.getDictionary(state.getContext())

<< "\n * Operands: "; << "\n * Operands: ";

llvm::interleaveComma(state.operands, llvm::dbgs()); llvm::interleaveComma(state.operands, llvm::dbgs());

llvm::dbgs() << "\n * Result Types: "; llvm::dbgs() << "\n * Result Types: ";

llvm::interleaveComma(state.types, llvm::dbgs()); llvm::interleaveComma(state.types, llvm::dbgs());

llvm::dbgs() << "\n * Result: " << *resultOp << "\n"; llvm::dbgs() << "\n * Result: " << *resultOp << "\n";

}); });

} }

rriddleUnsubmitted

Done

operationSet.insert(value.getDefiningOp());

- for (auto it = block->rbegin(); it != block->rend(); ++it)

+ for (auto it = block->rbegin(); it != block->rend(); ++it) {

if (operationSet.count(&*it)) {

LLVM_DEBUG(llvm::dbgs()

<< "Inserting " << state.name << " after " << *it << "\n");

rewriter.setInsertionPointAfter(&*it);

break;

}

+ }

Operation *resultOp = rewriter.createOperation(state);

llvm::reverse(*block)?

rriddleUnsubmitted

Done

This feels expensive, and I'm also not sure it is correct. If we move the insertion point to the operands, we may end up creating a side-effecting operation in the wrong place. For example, what happens in the following situation:

%cst = some.constant
%alloc = some.alloc
...
%load = some.load
... <Current Insertion Point>

If the current insertion point is as shown above and we were about to create a some.store %cst in %alloc operation, wouldn't this incorrectly create it above the load (thus leading to a miscompile)?

sfuniakAuthorUnsubmitted

Done

I can see how this change could be problematic. It was introduced mostly because the current insertion point is at the root of the pattern, and with downward edge traversal there is no guarantee that all the inputs are defined at this point. However, now that pdl.rewrite accepts an optional argument forcing the root, this is less of an issue, because the user can manually specify a root that is known to work. Furthermore, in our use case we rely on graph regions, where the order does not matter anyway. I will revert this change.

void ByteCodeExecutor::executeEraseOp(PatternRewriter &rewriter) { void ByteCodeExecutor::executeEraseOp(PatternRewriter &rewriter) {

LLVM_DEBUG(llvm::dbgs() << "Executing EraseOp:\n"); LLVM_DEBUG(llvm::dbgs() << "Executing EraseOp:\n");

Operation *op = read<Operation *>(); Operation *op = read<Operation *>();

LLVM_DEBUG(llvm::dbgs() << " * Operation: " << *op << "\n"); LLVM_DEBUG(llvm::dbgs() << " * Operation: " << *op << "\n");

rewriter.eraseOp(op); rewriter.eraseOp(op);

} }

void ByteCodeExecutor::executeFinalize() {

LLVM_DEBUG(llvm::dbgs() << "Executing Finalize\n");

}

void ByteCodeExecutor::executeForEach() {

LLVM_DEBUG(llvm::dbgs() << "Executing ForEach:\n");

// Subtract 1 for the op code.

const ByteCodeField *it = curCodeIt - 1;

rriddleUnsubmitted

Done

nit: Please move comments to their own line.

unsigned rangeIndex = read();

unsigned memIndex = read();

const void *value = nullptr;

switch (read<PDLValue::Kind>()) {

case PDLValue::Kind::Operation: {

MogballUnsubmitted

Not Done

You could just assert(kind == PDLValue::Kind::Operation), given that the only other case in this switch is an abort.

sfuniakAuthorUnsubmitted

Done

That's true for now, but we are going to soon follow this with another diff where the iteration type is a ValueRange. I thought I'd anticipate the change.

MogballUnsubmitted

Not Done

Gotcha

unsigned &index = loopIndex[read()];

const ArrayRef<Operation *> &array = opRangeMemory[rangeIndex];

MogballUnsubmitted

Not Done

unsigned &index = loopIndex[read()];

- const ArrayRef<Operation *> &array = opRangeMemory[rangeIndex];

+ ArrayRef<Operation *> array = opRangeMemory[rangeIndex];

assert(index <= array.size() && "iterated past the end");

Mogball:

assert(index <= array.size() && "iterated past the end");

if (index < array.size()) {

LLVM_DEBUG(llvm::dbgs() << " * Result: " << array[index] << "\n");

value = array[index];

break;

}

LLVM_DEBUG(llvm::dbgs() << " * Done\n");

index = 0;

selectJump(size_t(0));

return;

}

default:

llvm_unreachable("unexpected `ForEach` value kind");

}

// Store the iterate value and the stack address.

memory[memIndex] = value;

pushCodeIt(it);

// Skip over the successor (we will enter the body of the loop).

read<ByteCodeAddr>();

}
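The interplay between `executeForEach`, `executeContinue`, and the resume stack can be sketched with a toy interpreter. This is an illustration under stated assumptions only: the opcodes, the `[ForEach, exitAddr]` layout, the single loop level, and the `run` function are all hypothetical and much simpler than the real PDL bytecode.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcodes for a toy interpreter; not the real PDL bytecode.
enum Op : int { ForEach, Emit, Continue, Finalize };

// Runs `code` over `range` and returns every value emitted by the loop body.
// Assumed ForEach layout: [ForEach, exitAddr], with the body right after it.
std::vector<int> run(const std::vector<int> &code,
                     const std::vector<int> &range) {
  std::vector<int> out;
  std::vector<std::size_t> resume; // stack of positions to resume at
  std::size_t loopIndex = 0;       // a single loop level, for brevity
  int current = 0;
  std::size_t pc = 0;
  while (true) {
    // Remember where the opcode itself lives, like `curCodeIt - 1`.
    std::size_t opPos = pc;
    switch (code[pc++]) {
    case ForEach: {
      std::size_t exitAddr = static_cast<std::size_t>(code[pc++]);
      if (loopIndex < range.size()) {
        // Bind the loop variable and mark where Continue should return to.
        current = range[loopIndex];
        resume.push_back(opPos);
      } else {
        // Range exhausted: reset the index and jump to the successor.
        loopIndex = 0;
        pc = exitAddr;
      }
      break;
    }
    case Emit:
      out.push_back(current);
      break;
    case Continue:
      // Advance the loop and re-execute the ForEach at the saved position.
      ++loopIndex;
      assert(!resume.empty() && "attempt to pop code off empty stack");
      pc = resume.back();
      resume.pop_back();
      break;
    case Finalize:
      return out;
    default:
      return out;
    }
  }
}
```

With the program `{ForEach, 4, Emit, Continue, Finalize}`, each `Continue` jumps back to the `ForEach` opcode, which either binds the next element or falls through to the exit address once the range is exhausted.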

void ByteCodeExecutor::executeGetAttribute() { void ByteCodeExecutor::executeGetAttribute() {

LLVM_DEBUG(llvm::dbgs() << "Executing GetAttribute:\n"); LLVM_DEBUG(llvm::dbgs() << "Executing GetAttribute:\n");

unsigned memIndex = read(); unsigned memIndex = read();

Operation *op = read<Operation *>(); Operation *op = read<Operation *>();

Identifier attrName = read<Identifier>(); Identifier attrName = read<Identifier>();

Attribute attr = op->getAttr(attrName); Attribute attr = op->getAttr(attrName);

LLVM_DEBUG(llvm::dbgs() << " * Operation: " << *op << "\n" LLVM_DEBUG(llvm::dbgs() << " * Operation: " << *op << "\n"

▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines

/// This function is the internal implementation of `GetResults` and /// This function is the internal implementation of `GetResults` and

/// `GetOperands` that provides support for extracting a value range from the /// `GetOperands` that provides support for extracting a value range from the

/// given operation. /// given operation.

template <template <typename> class AttrSizedSegmentsT, typename RangeT> template <template <typename> class AttrSizedSegmentsT, typename RangeT>

static void * static void *

executeGetOperandsResults(RangeT values, Operation *op, unsigned index, executeGetOperandsResults(RangeT values, Operation *op, unsigned index,

ByteCodeField rangeIndex, StringRef attrSizedSegments, ByteCodeField rangeIndex, StringRef attrSizedSegments,

MutableArrayRef<ValueRange> &valueRangeMemory) { MutableArrayRef<ValueRange> valueRangeMemory) {

rriddleUnsubmitted

Done

ByteCodeField rangeIndex, StringRef attrSizedSegments,

- const MutableArrayRef<ValueRange> &valueRangeMemory) {

+ MutableArrayRef<ValueRange> valueRangeMemory) {

// Check for the sentinel index that signals that all values should be

We don't really need the reference.

// Check for the sentinel index that signals that all values should be // Check for the sentinel index that signals that all values should be

// returned. // returned.

if (index == std::numeric_limits<uint32_t>::max()) { if (index == std::numeric_limits<uint32_t>::max()) {

LLVM_DEBUG(llvm::dbgs() << " * Getting all values\n"); LLVM_DEBUG(llvm::dbgs() << " * Getting all values\n");

// `values` is already the full value range. // `values` is already the full value range.

// Otherwise, check to see if this operation uses AttrSizedSegments. // Otherwise, check to see if this operation uses AttrSizedSegments.

} else if (op->hasTrait<AttrSizedSegmentsT>()) { } else if (op->hasTrait<AttrSizedSegmentsT>()) {

▲ Show 20 Lines • Show All 71 Lines • ▼ Show 20 Lines void ByteCodeExecutor::executeGetResults() {

void *result = executeGetOperandsResults<OpTrait::AttrSizedResultSegments>( void *result = executeGetOperandsResults<OpTrait::AttrSizedResultSegments>(

op->getResults(), op, index, rangeIndex, "result_segment_sizes", op->getResults(), op, index, rangeIndex, "result_segment_sizes",

valueRangeMemory); valueRangeMemory);

if (!result) if (!result)

LLVM_DEBUG(llvm::dbgs() << " * Invalid result range\n"); LLVM_DEBUG(llvm::dbgs() << " * Invalid result range\n");

memory[read()] = result; memory[read()] = result;

} }

/// This function is a helper that returns the operands of the specified

/// operation for the given operand group.

static Optional<OperandRange> getOperandsAt(Operation *op, unsigned index,

bool single) {

// Check if `index` is a sentinel, indicating that all operands should be

// returned.

if (index == std::numeric_limits<uint32_t>::max())

return op->getOperands();

// Check if the operation uses AttrSizedSegments; if so, return the operand

// slice at group `index`.

if (op->hasTrait<OpTrait::AttrSizedOperandSegments>()) {

auto segmentAttr =

op->getAttrOfType<DenseElementsAttr>("operand_segment_sizes");

if (!segmentAttr || index >= segmentAttr.getNumElements())

rriddleUnsubmitted

Done

Why not grab the users from all of the values?

rriddleUnsubmitted

Not Done

Unresolved?

sfuniakAuthorUnsubmitted

Done

Grabbing only the first value in a range is consistent with the definition of pdl_interp.get_defining_op.

I believe the present definition is correct. You will match the entirety of a value range, so we could take the users of any value. We could also take the intersection of users, but that is more costly and not really needed, because we follow up the users query with an operand comparison.
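The argument above can be illustrated with a small sketch: collect candidate users from the first value only, then let the follow-up operand comparison discard candidates that do not use the whole range. The toy IR below (`ToyOp`, integer values, `usersOfFirst`, `matchingUsers`) is hypothetical and only models the idea, not MLIR's data structures.

```cpp
#include <cassert>
#include <vector>

// Toy IR: values are ints, and an operation is just its operand list.
struct ToyOp {
  std::vector<int> operands;
};

// Candidate users: every op that uses the first value of the range.
std::vector<const ToyOp *> usersOfFirst(const std::vector<ToyOp> &ops,
                                        const std::vector<int> &values) {
  std::vector<const ToyOp *> users;
  if (values.empty())
    return users;
  for (const ToyOp &op : ops) {
    for (int operand : op.operands) {
      if (operand == values.front()) {
        users.push_back(&op);
        break;
      }
    }
  }
  return users;
}

// Follow-up check: keep only users whose operands equal the whole range.
std::vector<const ToyOp *> matchingUsers(const std::vector<ToyOp> &ops,
                                         const std::vector<int> &values) {
  std::vector<const ToyOp *> result;
  for (const ToyOp *op : usersOfFirst(ops, values))
    if (op->operands == values)
      result.push_back(op);
  return result;
}
```

Any op that uses the full range necessarily uses its first value, so the cheaper query never misses a real match; it only produces extra candidates that the comparison then filters out.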

return llvm::None;

auto segments = segmentAttr.getValues<int32_t>();

auto groupIt = segments.begin() + index;

unsigned startIndex = std::accumulate(segments.begin(), groupIt, 0);

return op->getOperands().slice(startIndex, *groupIt);

}

// Otherwise, if we are extracting a single operand, return a singleton range.

rriddleUnsubmitted

Done

SmallVector<Operation *> users;

- if (value)

+ if (value) {

for (OpOperand &use : value.getUses())

if (use.getOperandNumber() == operandNumber)

users.push_back(use.getOwner());

+ }

LLVM_DEBUG(llvm::dbgs() << " * Result: " << users.size() << " operations\n");

rriddle:

if (single && index < op->getNumOperands())

return op->getOperands().slice(index, 1);

// Otherwise, assume this is the last operand group of the operation.

// FIXME: We currently don't support operations with

// SameVariadicOperandSize/SameVariadicResultSize here given that we don't

// have a way to detect its presence.

if (index <= op->getNumOperands())

return op->getOperands().drop_front(index);

rriddleUnsubmitted

Done

Yeah, the opRangeMemory should hold the actual storage. memory should hold the address to the data.

// We couldn't detect a way to compute the values, bail out.

return llvm::None;

}

void ByteCodeExecutor::executeGetUsers() {

LLVM_DEBUG(llvm::dbgs() << "Executing GetUsers:\n");

unsigned memIndex = read();

unsigned rangeIndex = read();

unsigned operandNumber = read<uint32_t>();

OpRange &range = opRangeMemory[rangeIndex];

memory[memIndex] = &range;

rriddleUnsubmitted

Not Done

Walking the users to count them is going to be O(N); what's the trade-off here vs. not reserving? Are we banking on a small number of users in practice?

sfuniakAuthorUnsubmitted

Done

Not reserving will require several allocations, one every time we expand the storage. I figured walking the users upfront would be cheaper (though still O(N)). But on second thought, the number of users will often be small, so maybe SmallVector will do.

bool single = false;

// A single value or a representative value of a range.

Value value;

// A (possibly empty) range of all the values.

ValueRange values;

// Read the value(s).

if (read<PDLValue::Kind>() == PDLValue::Kind::Value) {

single = true;

value = read<Value>();

if (value)

MogballUnsubmitted

Not Done

If a null value is read, you could just exit early (and set range to empty).

MogballUnsubmitted

Not Done

Unrelated, but this makes me think we need an AttrSizedSegments interface that hides a lot of this work for us.


values = ValueRange(value);

rriddleUnsubmitted

Done

typo: acceessing


LLVM_DEBUG(llvm::dbgs() << " * Value: " << value << "\n");

} else {

if (auto *value_ptr = read<ValueRange *>())

rriddleUnsubmitted

Done

Why do we need all of this? Can we augment/add an overload of executeGetOperandsResults for the case we are interested in?


sfuniakAuthorUnsubmitted

Done

We need the comparison, because we need to check if the extracted operand(s) at the specified position match(es) the given value(s). I will create an overload of executeGetOperandsResults as you suggested that does this.
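A minimal sketch of the comparison described above, under simplifying assumptions: operands are plain integers, and only the fallback shown in the snippet (drop the first `index` operands when no segment sizes are available) is modelled. The name `userMatches` is hypothetical and not part of the revision.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the operands of a user, starting at position `index`,
// are exactly the `expected` values. Mirrors "extract the operand group,
// then compare it with the given value range".
bool userMatches(const std::vector<int> &operands, std::size_t index,
                 const std::vector<int> &expected) {
  // No way to compute the operand group: treat it as "no match".
  if (index > operands.size())
    return false;
  return std::equal(operands.begin() + static_cast<std::ptrdiff_t>(index),
                    operands.end(), expected.begin(), expected.end());
}
```

Under these assumptions, a user with operands (a, a, b) matches the range (a, b) at index 1, while a user with operands (a, b) does not, which is consistent with the `test.get_specific_users_of_range` test in this revision.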


values = *value_ptr;

if (!values.empty())

MogballUnsubmitted

Not Done

Same here. If there are no values to read (either null pointer or empty range) you can just return.


value = values.front();

LLVM_DEBUG({

llvm::dbgs() << " * Values (" << values.size() << "): ";

llvm::interleaveComma(values, llvm::dbgs());

llvm::dbgs() << "\n";

});

}

// Extract the users.

if (!value) {

// No value or empty range of values given, so no users.

range = OpRange();

} else if (single && operandNumber == std::numeric_limits<uint32_t>::max()) {

MogballUnsubmitted

Not Done

The implementation of get_users in the bytecode is a little complex, stemming from the four cases formed by whether an index is specified, crossed with whether the operand is a value or a value range. Would this be any better with two different ops, one with an index and one without? It would get rid of the sentinel value too.


sfuniakAuthorUnsubmitted

Done

That's a good point, I will create a new op code and split this up into two execute functions.
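The suggested split can be illustrated with a small standalone sketch. Everything here is hypothetical (the `Op` struct, `sampleOps`, and the three function names); it only contrasts a single entry point that overloads `UINT32_MAX` as "no index" with two separate entry points that need no sentinel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical miniature operation: just a list of operand ids.
struct Op {
  std::vector<int> operands;
};

// Sample users: (7), (7, 7), (3, 7), (4).
std::vector<Op> sampleOps() {
  return {Op{{7}}, Op{{7, 7}}, Op{{3, 7}}, Op{{4}}};
}

// Sentinel style: the maximum uint32_t value means "any operand position".
constexpr std::uint32_t kAnyOperand = std::numeric_limits<std::uint32_t>::max();

std::vector<std::size_t> getUsersSentinel(const std::vector<Op> &ops, int value,
                                          std::uint32_t operandNumber) {
  std::vector<std::size_t> users;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    const std::vector<int> &operands = ops[i].operands;
    bool matches;
    if (operandNumber == kAnyOperand)
      matches = std::find(operands.begin(), operands.end(), value) !=
                operands.end();
    else
      matches = operandNumber < operands.size() &&
                operands[operandNumber] == value;
    if (matches)
      users.push_back(i);
  }
  return users;
}

// Split style, entry point 1: every op that uses `value` anywhere.
std::vector<std::size_t> getAllUsers(const std::vector<Op> &ops, int value) {
  std::vector<std::size_t> users;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    const std::vector<int> &operands = ops[i].operands;
    if (std::find(operands.begin(), operands.end(), value) != operands.end())
      users.push_back(i);
  }
  return users;
}

// Split style, entry point 2: only ops that use `value` at position `index`.
std::vector<std::size_t> getUsersAt(const std::vector<Op> &ops, int value,
                                    std::size_t index) {
  std::vector<std::size_t> users;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    const std::vector<int> &operands = ops[i].operands;
    if (index < operands.size() && operands[index] == value)
      users.push_back(i);
  }
  return users;
}
```

With the split, each function has a single code path and the index parameter no longer carries a reserved value.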


// Special case: all users of a single value.

range = OpRange(std::distance(value.user_begin(), value.user_end()));

llvm::copy(value.getUsers(), range.begin());

} else {

// Default case: users for a specific group or the entire range.

SmallVector<Operation *> users;

// Iterate over all the users of the representative value, extract the

// operands of specified group, and compare with the value range.

for (Operation *op : value.getUsers()) {

Optional<OperandRange> operands =

getOperandsAt(op, operandNumber, single);

if (operands && llvm::equal(*operands, values))

users.push_back(op);

}

// Populate the user range.

range = OpRange(users.size());

llvm::copy(users, range.begin());

}

LLVM_DEBUG(llvm::dbgs() << " * Result: " << range.size() << " operations\n");

}

MogballUnsubmitted

Not Done

For this one here, I think you can extract the loop and move it below the if/else of Value vs ValueRange. These functions look much cleaner now, thanks!


void ByteCodeExecutor::executeGetValueType() {
  LLVM_DEBUG(llvm::dbgs() << "Executing GetValueType:\n");
  unsigned memIndex = read();
  Value value = read<Value>();
  Type type = value ? value.getType() : Type();
  LLVM_DEBUG(llvm::dbgs() << " * Value: " << value << "\n"
                          << " * Result: " << type << "\n");

(206 lines not shown)

    case CheckOperationName:
      executeCheckOperationName();
      break;
    case CheckResultCount:
      executeCheckResultCount();
      break;
    case CheckTypes:
      executeCheckTypes();
      break;

case Continue:

executeContinue();

break;

    case CreateOperation:
      executeCreateOperation(rewriter, *mainRewriteLoc);
      break;
    case CreateTypes:
      executeCreateTypes();
      break;
    case EraseOp:
      executeEraseOp(rewriter);
      break;
    case Finalize:
-     LLVM_DEBUG(llvm::dbgs() << "Executing Finalize\n\n");
+     executeFinalize();
+     LLVM_DEBUG(llvm::dbgs() << "\n");
      return;

case ForEach:

executeForEach();

break;

    case GetAttribute:
      executeGetAttribute();
      break;
    case GetAttributeType:
      executeGetAttributeType();
      break;
    case GetDefiningOp:
      executeGetDefiningOp();

(25 lines not shown, inside while (true) {)

    }
    case GetResultN:
      LLVM_DEBUG(llvm::dbgs() << "Executing GetResultN:\n");
      executeGetResult(read<uint32_t>());
      break;
    case GetResults:
      executeGetResults();
      break;

case GetUsers:

executeGetUsers();

break;

    case GetValueType:
      executeGetValueType();
      break;
    case GetValueRangeTypes:
      executeGetValueRangeTypes();
      break;
    case IsNotNull:
      executeIsNotNull();

(34 lines not shown)

void PDLByteCode::match(Operation *op, PatternRewriter &rewriter,
                        SmallVectorImpl<MatchResult> &matches,
                        PDLByteCodeMutableState &state) const {
  // The first memory slot is always the root operation.
  state.memory[0] = op;

  // The matcher function always starts at code address 0.
  ByteCodeExecutor executor(
-     matcherByteCode.data(), state.memory, state.typeRangeMemory,
-     state.allocatedTypeRangeMemory, state.valueRangeMemory,
-     state.allocatedValueRangeMemory, uniquedData, matcherByteCode,
-     state.currentPatternBenefits, patterns, constraintFunctions,
-     rewriteFunctions);
+     matcherByteCode.data(), state.memory, state.opRangeMemory,
+     state.typeRangeMemory, state.allocatedTypeRangeMemory,
+     state.valueRangeMemory, state.allocatedValueRangeMemory, state.loopIndex,
+     uniquedData, matcherByteCode, state.currentPatternBenefits, patterns,
+     constraintFunctions, rewriteFunctions);
  executor.execute(rewriter, &matches);

  // Order the found matches by benefit.
  std::stable_sort(matches.begin(), matches.end(),
                   [](const MatchResult &lhs, const MatchResult &rhs) {
                     return lhs.benefit > rhs.benefit;
                   });
}
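As a side note on the ordering step at the end of `match`: a stable sort with a "greater benefit first" comparator keeps matches of equal benefit in the order in which they were recorded. The sketch below shows this with a hypothetical `Match` struct and `orderByBenefit` helper; neither is part of the revision.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified match record: a pattern name and its benefit.
struct Match {
  std::string pattern;
  unsigned benefit;
};

// Sorts by descending benefit and returns the pattern names in that order.
// std::stable_sort preserves the discovery order among equal benefits.
std::vector<std::string> orderByBenefit(std::vector<Match> matches) {
  std::stable_sort(matches.begin(), matches.end(),
                   [](const Match &lhs, const Match &rhs) {
                     return lhs.benefit > rhs.benefit;
                   });
  std::vector<std::string> names;
  for (const Match &match : matches)
    names.push_back(match.pattern);
  return names;
}
```

For example, matches recorded as a (benefit 1), b (benefit 2), c (benefit 1) are ordered b, a, c.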

/// Run the rewriter of the given pattern on the root operation `op`.
void PDLByteCode::rewrite(PatternRewriter &rewriter, const MatchResult &match,
                          PDLByteCodeMutableState &state) const {
  // The arguments of the rewrite function are stored at the start of the
  // memory buffer.
  llvm::copy(match.values, state.memory.begin());

  ByteCodeExecutor executor(
      &rewriterByteCode[match.pattern->getRewriterAddr()], state.memory,
-     state.typeRangeMemory, state.allocatedTypeRangeMemory,
-     state.valueRangeMemory, state.allocatedValueRangeMemory, uniquedData,
+     state.opRangeMemory, state.typeRangeMemory,
+     state.allocatedTypeRangeMemory, state.valueRangeMemory,
+     state.allocatedValueRangeMemory, state.loopIndex, uniquedData,
      rewriterByteCode, state.currentPatternBenefits, patterns,
      constraintFunctions, rewriteFunctions);
  executor.execute(rewriter, /*matches=*/nullptr, match.location);
}

mlir/test/Rewrite/pdl-bytecode.mlir

(509 lines not shown)

module @ir attributes { test.check_types_1 } {
  "test.op"() : () -> (i32, i64)
  "test.op"() : () -> i32
}

// -----

//===----------------------------------------------------------------------===//
// pdl_interp::ContinueOp
//===----------------------------------------------------------------------===//

// Fully tested within the tests for other operations.

//===----------------------------------------------------------------------===//
// pdl_interp::CreateAttributeOp
//===----------------------------------------------------------------------===//

// Fully tested within the tests for other operations.

//===----------------------------------------------------------------------===//
// pdl_interp::CreateOperationOp
//===----------------------------------------------------------------------===//

(52 lines not shown)

//===----------------------------------------------------------------------===//
// pdl_interp::FinalizeOp
//===----------------------------------------------------------------------===//

// Fully tested within the tests for other operations.

//===----------------------------------------------------------------------===//
				// pdl_interp::ForEachOp
				//===----------------------------------------------------------------------===//

				module @patterns {
				func @matcher(%root : !pdl.operation) {
				%val1 = pdl_interp.get_result 0 of %root
				%ops1 = pdl_interp.get_users of %val1 : !pdl.value
				pdl_interp.foreach %op1 : !pdl.operation in %ops1 {
				%val2 = pdl_interp.get_result 0 of %op1
				%ops2 = pdl_interp.get_users of %val2 : !pdl.value
				pdl_interp.foreach %op2 : !pdl.operation in %ops2 {
				pdl_interp.record_match @rewriters::@success(%op2 : !pdl.operation) : benefit(1), loc([%root]) -> ^cont
				^cont:
				pdl_interp.continue
				} -> ^cont
				^cont:
				pdl_interp.continue
				} -> ^end
				^end:
				pdl_interp.finalize
				}

				module @rewriters {
				func @success(%matched : !pdl.operation) {
				%op = pdl_interp.create_operation "test.success"
				pdl_interp.erase %matched
				pdl_interp.finalize
				}
				}
				}

				// CHECK-LABEL: test.foreach
				// CHECK: "test.success"
				// CHECK: "test.success"
				// CHECK: "test.success"
				// CHECK: "test.success"
				// CHECK: %[[ROOT:.*]] = "test.op"
				// CHECK: %[[VALA:.*]] = "test.op"(%[[ROOT]])
				// CHECK: %[[VALB:.*]] = "test.op"(%[[ROOT]])
				module @ir attributes { test.foreach } {
				%root = "test.op"() : () -> i32
				%valA = "test.op"(%root) : (i32) -> (i32)
				"test.op"(%valA) : (i32) -> (i32)
				"test.op"(%valA) : (i32) -> (i32)
				%valB = "test.op"(%root) : (i32) -> (i32)
				"test.op"(%valB) : (i32) -> (i32)
				"test.op"(%valB) : (i32) -> (i32)
				}

				// -----
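To see why the ForEachOp test above expects four `"test.success"` operations, the nested iteration can be modelled in a few lines. This is a toy model under stated assumptions, not the bytecode executor: `UserMap`, `foreachTestIR`, and `matchGrandUsers` are hypothetical names, and each operation is identified by a string with a list of the operations that use its first result.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy IR: op name -> names of the ops that use its result 0.
using UserMap = std::map<std::string, std::vector<std::string>>;

// The shape of the test.foreach module: root has two users, each with two users.
UserMap foreachTestIR() {
  UserMap ir;
  ir["root"] = {"valA", "valB"};
  ir["valA"] = {"a0", "a1"};
  ir["valB"] = {"b0", "b1"};
  return ir;
}

// Mirrors the two nested foreach regions: for each user of the root, and for
// each user of that user, record one match.
std::vector<std::string> matchGrandUsers(const UserMap &ir,
                                         const std::string &root) {
  auto usersOf = [&ir](const std::string &op) -> std::vector<std::string> {
    auto it = ir.find(op);
    return it == ir.end() ? std::vector<std::string>() : it->second;
  };
  std::vector<std::string> matches;
  for (const std::string &op1 : usersOf(root))
    for (const std::string &op2 : usersOf(op1))
      matches.push_back(op2);
  return matches;
}
```

In this model only the first operation has users of users, so it is the only root that yields matches: two for each of its two users, four in total.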

				//===----------------------------------------------------------------------===//
				// pdl_interp::GetUsersOp
				//===----------------------------------------------------------------------===//

				module @patterns {
				func @matcher(%root : !pdl.operation) {
				%val = pdl_interp.get_result 0 of %root
				%ops = pdl_interp.get_users of %val : !pdl.value
				pdl_interp.foreach %op : !pdl.operation in %ops {
				pdl_interp.record_match @rewriters::@success(%op : !pdl.operation) : benefit(1), loc([%root]) -> ^cont
				^cont:
				pdl_interp.continue
				} -> ^end
				^end:
				pdl_interp.finalize
				}

				module @rewriters {
				func @success(%matched : !pdl.operation) {
				%op = pdl_interp.create_operation "test.success"
				pdl_interp.erase %matched
				pdl_interp.finalize
				}
				}
				}

				// CHECK-LABEL: test.get_all_users_of_value
				// CHECK: "test.success"
				// CHECK: "test.success"
				// CHECK: %[[OPERAND:.*]] = "test.op"
				module @ir attributes { test.get_all_users_of_value } {
				%operand = "test.op"() : () -> i32
				"test.op"(%operand) : (i32) -> (i32)
				"test.op"(%operand, %operand) : (i32, i32) -> (i32)
				}

				// -----

				module @patterns {
				func @matcher(%root : !pdl.operation) {
				%val = pdl_interp.get_result 0 of %root
				%ops = pdl_interp.get_users 1 of %val : !pdl.value
				pdl_interp.foreach %op : !pdl.operation in %ops {
				pdl_interp.record_match @rewriters::@success(%op : !pdl.operation) : benefit(1), loc([%root]) -> ^cont
				^cont:
				pdl_interp.continue
				} -> ^end
				^end:
				pdl_interp.finalize
				}

				module @rewriters {
				func @success(%matched : !pdl.operation) {
				%op = pdl_interp.create_operation "test.success"
				pdl_interp.erase %matched
				pdl_interp.finalize
				}
				}
				}

				// CHECK-LABEL: test.get_specific_users_of_value
				// CHECK: "test.success"
				// CHECK: "test.success"
				// CHECK: %[[OPERAND:.*]] = "test.op"
				// CHECK: "test.op"(%[[OPERAND]])
				module @ir attributes { test.get_specific_users_of_value } {
				%operand = "test.op"() : () -> i32
				"test.op"(%operand) : (i32) -> (i32)
				"test.op"(%operand, %operand) : (i32, i32) -> (i32)
				"test.op"(%operand, %operand, %operand) : (i32, i32, i32) -> (i32)
				}

				// -----

				module @patterns {
				func @matcher(%root : !pdl.operation) {
				pdl_interp.check_result_count of %root is at_least 2 -> ^next, ^end
				^next:
				%vals = pdl_interp.get_results of %root : !pdl.range<value>
				%ops = pdl_interp.get_users of %vals : !pdl.range<value>
				pdl_interp.foreach %op : !pdl.operation in %ops {
				pdl_interp.record_match @rewriters::@success(%op : !pdl.operation) : benefit(1), loc([%root]) -> ^cont
				^cont:
				pdl_interp.continue
				} -> ^end
				^end:
				pdl_interp.finalize
				}

				module @rewriters {
				func @success(%matched : !pdl.operation) {
				%op = pdl_interp.create_operation "test.success"
				pdl_interp.erase %matched
				pdl_interp.finalize
				}
				}
				}

				// CHECK-LABEL: test.get_all_users_of_range
				// CHECK: "test.success"
				// CHECK: %[[OPERANDS:.*]]:2 = "test.op"
				// CHECK: "test.op"(%[[OPERANDS]]#0, %[[OPERANDS]]#0)
				module @ir attributes { test.get_all_users_of_range } {
				%operands:2 = "test.op"() : () -> (i32, i32)
				"test.op"(%operands#0, %operands#1) : (i32, i32) -> (i32)
				"test.op"(%operands#0, %operands#0) : (i32, i32) -> (i32)
				}

				// -----

				module @patterns {
				func @matcher(%root : !pdl.operation) {
				pdl_interp.check_result_count of %root is at_least 2 -> ^next, ^end
				^next:
				%vals = pdl_interp.get_results of %root : !pdl.range<value>
				%ops = pdl_interp.get_users 1 of %vals : !pdl.range<value>
				pdl_interp.foreach %op : !pdl.operation in %ops {
				pdl_interp.record_match @rewriters::@success(%op : !pdl.operation) : benefit(1), loc([%root]) -> ^cont
				^cont:
				pdl_interp.continue
				} -> ^end
				^end:
				pdl_interp.finalize
				}

				module @rewriters {
				func @success(%matched : !pdl.operation) {
				%op = pdl_interp.create_operation "test.success"
				pdl_interp.erase %matched
				pdl_interp.finalize
				}
				}
				}

				// CHECK-LABEL: test.get_specific_users_of_range
				// CHECK: "test.success"
				// CHECK: %[[OPERANDS:.*]]:2 = "test.op"
				// CHECK: "test.op"(%[[OPERANDS]]#0, %[[OPERANDS]]#1)
				module @ir attributes { test.get_specific_users_of_range } {
				%operands:2 = "test.op"() : () -> (i32, i32)
				"test.op"(%operands#0, %operands#1) : (i32, i32) -> (i32)
				"test.op"(%operands#0, %operands#0, %operands#1) : (i32, i32, i32) -> (i32)
				}

				// -----

//===----------------------------------------------------------------------===//
// pdl_interp::GetAttributeOp
//===----------------------------------------------------------------------===//

// Fully tested within the tests for other operations.

//===----------------------------------------------------------------------===//
// pdl_interp::GetAttributeTypeOp
//===----------------------------------------------------------------------===//

(660 lines not shown)


Revision Contents

Diff 387832

mlir/lib/Rewrite/ByteCode.h

mlir/lib/Rewrite/ByteCode.cpp

mlir/test/Rewrite/pdl-bytecode.mlir
