Do not use the pass yet, except in a test.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 112 | I find it somewhat hard to read that this is a carried state. A better name or comment would already help a lot. | |
| 134 | Is it useful to formulate this as a pattern? It could equally well be just a walk over all operations, as there is no matching going on, at all. | |
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 134 | I'm using notifyMatchFailure, and maybe/theoretically there could be different patterns in the same pass. But to be honest I mainly used RewritePattern because I vaguely remember that River didn't like the raw walk in the all-reduce lowering. If you think a raw walk is better here, I'm happy to change it. | |
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 134 | I don't exactly remember the details there. A normal walk is fine, but I tend to err more towards patterns because they compose better/have better logging/remove the need to write transformation application logic when things are more complicated than a simple walk. Not to say that you shouldn't just use a simple walk here, but I'd say it depends on how this pass is intended to evolve. | |
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 134 | Because there is state in the pattern keeping a value, this is to me a strong indication that we should not use a pattern here. This is just hiding the traversal logic, while it seems core to the algorithm. | |
Use raw walk instead of a RewritePattern.
This does seem a little cleaner, at least at the moment:
- It's easier to make the pass fail when some preconditions aren't met.
- The rewriter.replaceOp() assumes that we replace the root, which is not what we did.
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 39 | Thanks for the suggestion. Yes, that sound better. I will include that change in my next update (it's always a bit of a dance, so I'm waiting if there are more requested changes). | |
| 80 | I thought control flow requires separate blocks, each with their own terminator (which triggers a host synchronization and clears the currentToken). Is that not correct? | |
| 134 | I've removed the pattern, but it did not contain state. The Callback containing the state was instantiated inside the matchAndRewrite(). | |
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 80 | You're right, I didn't notice that bit. | |
| mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
|---|---|---|
| 82 | Is there a reason why this is using Operation::create to clone the operation instead of one of the clone() routine or could we just switch it? | |
The more common term for this is to "thread the token through", WDYT?