Do not use the pass yet, except in a test.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Time | Test | |
---|---|---|
20 ms | linux > LLVM.CodeGen/AArch64/GlobalISel::arm64-irtranslator.ll Script:
--
: 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -O0 -aarch64-enable-atomic-cfg-tidy=0 -stop-after=irtranslator -global-isel -verify-machineinstrs /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll -o - 2>&1 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
| |
20 ms | linux > LLVM.CodeGen/AArch64/GlobalISel::irtranslator-no-op-intrinsics.ll Script:
--
: 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -global-isel -O0 -mtriple=aarch64-- -stop-after=irtranslator -verify-machineinstrs -o - /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-no-op-intrinsics.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-no-op-intrinsics.ll
| |
60 ms | windows > LLVM.CodeGen/AArch64/GlobalISel::arm64-irtranslator.ll Script:
--
: 'RUN: at line 1'; c:\ws\w64\llvm-project\premerge-checks\build\bin\llc.exe -O0 -aarch64-enable-atomic-cfg-tidy=0 -stop-after=irtranslator -global-isel -verify-machineinstrs C:\ws\w64\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\GlobalISel\arm64-irtranslator.ll -o - 2>&1 | c:\ws\w64\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w64\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\GlobalISel\arm64-irtranslator.ll
| |
50 ms | windows > LLVM.CodeGen/AArch64/GlobalISel::irtranslator-no-op-intrinsics.ll Script:
--
: 'RUN: at line 2'; c:\ws\w64\llvm-project\premerge-checks\build\bin\llc.exe -global-isel -O0 -mtriple=aarch64-- -stop-after=irtranslator -verify-machineinstrs -o - C:\ws\w64\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\GlobalISel\irtranslator-no-op-intrinsics.ll | c:\ws\w64\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w64\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\GlobalISel\irtranslator-no-op-intrinsics.ll
|
Event Timeline
mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
---|---|---|
112 | I find it somewhat hard to read that this is a carried state. A better name or comment would already help a lot. | |
134 | Is it useful to formulate this as a pattern? It could equally well be just a walk over all operations, as there is no matching going on, at all. |
mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
---|---|---|
134 | I'm using notifyMatchFailure, and maybe/theoretically there could be different patterns in the same pass. But to be honest I mainly used RewritePattern because I vaguely remember that River didn't like the raw walk in the all-reduce lowering. If you think a raw walk is better here, I'm happy to change it. |
mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
---|---|---|
134 | I don't exactly remember the details there. A normal walk is fine, but I tend to err more towards patterns because they compose better/have better logging/remove the need to write transformation application logic when things are more complicated than a simple walk. Not to say that you shouldn't just use a simple walk here, but I'd say it depends on how this pass is intended to evolve. |
mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
---|---|---|
134 | Because there is state in the pattern keeping a value, this is to me a strong indication that we should not use a pattern here. This is just hiding the traversal logic, while it seems core to the algorithm. |
Use raw walk instead of a RewritePattern.
This does seem a little cleaner, at least at the moment:
- It's easier to make the pass fail when some preconditions aren't met.
- The rewriter.replaceOp() assumes that we replace the root, which is not what we did.
mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
---|---|---|
39 | Thanks for the suggestion. Yes, that sound better. I will include that change in my next update (it's always a bit of a dance, so I'm waiting if there are more requested changes). | |
80 | I thought control flow requires separate blocks, each with their own terminator (which triggers a host synchronization and clears the currentToken). Is that not correct? | |
134 | I've removed the pattern, but it did not contain state. The Callback containing the state was instantiated inside the matchAndRewrite(). |
mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp | ||
---|---|---|
80 | You're right, I didn't notice that bit. |