MemRefDataFlow performs mem2reg style operations for affine load/stores. Unfortunately, it is not presently correct in the presence of external operations such as memref.cast, or function calls. This diff extends the functionality of the pass to remain correct in the presence of such ops.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
@wsmoses MemRefDataFlow has just been moved and been renamed. Could you rebase right away? (so that the rebase becomes easy with no further intervening commits)
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 73–93 | Could we have some doc comments for these functions? | |
| 130–134 | Nit: add braces for these | |
| 142–144 | Nit: expand auto here plz | |
| 185 | ||
| 188 | Nit: ++iter | |
| 195 | Expanding auto will remove this linter error. | |
| 231 | Nit: here and below, please add trailing dots to all sentences. | |
| 362 | Please fix the linter error. | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
| 238–253 | Please explain what needs to be done here. | |
| 659 | Please add a newline. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 73–93 | Since these aren't publicly exposed, I think the convention is to have these comments on the definition instead. | |
| 115–116 | Reflow - use width. | |
| 117 | auto should be fine here. | |
| 127–128 | Reflow. | |
| 393 | You don't need to initialize this to nullptr - it has default null init. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 105 | From the comment itself, it's not clear what it means for an operation to have a memory effect on memOp! An op has a memory effect or not through a Value in this case - it isn't meaningful to say an op has a memory effect on another op. Can you rephrase? | |
| 113 | Can you move this declare/init further below? Also, legal -> hasEffect? | |
| 183 | Terminate with period. | |
| 221–225 | This is covered neither by the commit title nor commit summary and is a new feature/extension completely separate from the fix for side-effecting/non-affine deferencing ops being present. Can you please move this to another revision? | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 117 | I don't believe it is. Since check is called recursively, the actual type is needed. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 113 | Changed name, however, this can't be moved any further down since it is referenced within the check function. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 117 | Indeed. And you need an std::function, function_ref won't work correctly. | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
| 238–253 | Can this be fixed by recognizing specifically the ops with AffineRead/WriteOpInterface before MemoryEffectOpInterface, and keeping track of the affine subscripts? | |
This patch is regressing functionality - on store_load_store_nest_forward. Dependence analysis was already being used by this pass and was able to earlier already detect that a store was able to forward to the load. We shouldn't be disabling that test instead of revising this to not undo that inference.
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 76–77 | Pass dom infos by reference since they can't be null at this stage. | |
| 81 | Likewise. | |
| 104–236 | Ensure that all ... do not have ... Ensure that no operation between ... has the ... It's not clear what "potential memory effect" is. Rephrase. Nit: Enclose any arg names referred to in backticks (eg. memOp, EffectType). | |
| 105 | It still isn't clear here what "effect on an op" means. | |
| 106 | is defined to be an operation -> is an operation ? | |
| 110 | originalMemref -> memref ? It's not clear what "original" here means. | |
| 114 | Document this please. Should this be hasSideEffect? hasEffect is generic. | |
| 117 | Please rename this lambda - check is too general. | |
| 142 | may have a side effect? Nit: Please terminate all comments with a period - here and everywhere else. | |
| 157 | Please rename this lambda. | |
| 168 | Assertion message please. | |
| 183 | Name isn't descriptive enough: todoBlocks? | |
| 252 | to -> endOp or untilOp? | |
| 261 | You don't need the .getOperation() I think. | |
| 268–272 | Update doc comment to cover the additional ..ToErase args. | |
| 322 | "2." is gone and all numbers are off by one. You'll also have to update the top-level class comment on AffineScalarReplacement. | |
| 329–333 | This isn't meaningful as a loop - this is only equivalent to: assert(fwdingCandidates.size() <= 1 && "..."); | |
| 334–335 | Move this up and if (....empty()) return failure(); ... assertion ... lastWriteStoreOp = fwdingCandidates.front(); | |
| 362 | This isn't properly named - loadCandidates? | |
| 383–385 | 2 values -> two values | |
| 394 | Avoid auto here. | |
| 446 | This is a local variable and isn't used any more - no need to clear. | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
| 240–241 | Dependence analysis is already being used by the pass and was able to earlier already detect that this store did not reach the load above. Please see comment above - it's exactly saying that: This patch is regressing on this functionality and undoing that inference. We shouldn't be disabling this test. | |
Additional comments
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 285 | This is a sketch of how the additional dependence analysis can be restored, while maintaining correctness with other ops. This snippet here is not sufficient, see my comment in the test. | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
| 240–241 | I've demonstrated above where such dependence checking code could be inserted. Unfortunately, I do not understand the existing dependence loop structure sufficiently to rewrite it to be correct and apply here, when taking into account that other operators could modify the memory. If you could explain that, or take a stab feel free. Alternatively, if you're okay with it, since this PR fixes existing correctness bugs that I've seen in practice, perhaps we could first fix correctness, then restore that optimization? | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
|---|---|---|
| 240–241 | 
 Can you explain which part isn't clear? I can help if the code comments or doc isn't clear. 
 Actually, we shouldn't be creating such a regression - this is really not a corner case or a special case that this is regressing on but a pretty important pattern and the very reason dependence analysis is being used by this pass! It is common for affine passes to be missing handling of such side-effecting operations that need to be fixed at various places - (for eg. we always assume that %memref_1 and %memref_2 are different even if they are block arguments). Let's fix this while not regressing. For hasNoIntervening effect, can you document what "between" two ops means? It's obvious when the two ops lie in the same block but not otherwise. Also, the class comment on the top has still not been updated to reflect this check. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 77 | This is unused. | |
| 284–285 | an -> and | |
| 285 | It's not clear to me what "check the entire parent op" means here. Rephrase? | |
| 308 | Please put from and to in backticks or else it messes up what' being conveyed. | |
| 319 | If we only expect to find the first op and are asserting below when there is more than one, we should be doing it right here instead of finding more than two ops. We are also completely missing comments here on what the approach is. To start with, postDominanceInfo is now no longer being used, yet it's being computed and passed to this function. All of the related comments are outdated and incorrect and still appear to be present. I don't think I really understand the approach at all. Can you please properly document things and summarize the approach? | |
| 322–333 | This whole comment is outdated and incorrect now. I think several of the code comment in class comment are also similarly outdated. | |
| 334 | .size() == 0 -> .empty() | |
| 423 | This is not even used any more anywhere in the code. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 323–333 | You may also want to rebase on master - post dominance info isn't used anymore and this check was changed to use dominance info. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 275–295 | Instead of dropping the reliance on dominance and the condition (3) that existed to find the "last write" to fwd, what could instead be done to be unified with the existing approach is to simply also consider here all ops that have a memory write side-effect *in addition to* just affine writes for storeOps above. Then ensure that those ops get into depSrcStores as well (effectively treating such non-affine ops conservatively). Find the last write op in the same way as before and this naturally/transparently takes care of it. If the unique last write isn't an AffineWriteOpInterface, no forwarding will happen. This will also be as powerful as the previous approach and will not cause any regression. It's also no more expensive than the existing approach and you really don't need to check for any paths separately. (You are just reusing dominance info.) | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
|---|---|---|
| 240–241 | Understood. I've spent some time recently going through and trying to understand the existing analysis, and have added what I think is a reasonable solution that maintains the aforementioned case, and general correctness. For ease, I've also added some more depth to that part of the code (and perhaps can add a figure if thought to be useful). In essence the reliance of the store dominating the load as a necessary precondition for why the given loop depth range was unclear to me and why I was hesitant to adding something like that before ensuring it would also apply here. Happily it does (with the aforementioned changes and comment describing why it should be correct). It may even be able to be slightly more aggressive than the previous dependence analysis since we can set the min loop depth to that of the proposed replacement op rather than the min across all potential storing operations. | |
I'm going to be mostly away from reviewing the next 1.5 weeks - @ayzhuang, @dcaballe - could you please take over reviewing this? I've pretty much posted all comments I had so far.
Diego is also unavailable AFAIK.
The fixed version without regression looks okay to me. The overall recursive algorithm is to look if there is a non-affine operation that may have side effects preventing the store-to-load forwarding. It starts from the store and eagerly looks into all control-flow successors (operations and blocks) until the load is reached or all control flow paths are visited. This is overly conservative because it is visiting operations that do not lie on any control flow path from the store to the load. Ideally, we would have some reachability analysis and only visit the operations that are (1) reachable from the store and (2) such that the load is reachable from them.
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 106 | Nit: /// | |
| 280–281 | Please reflow and use /// for top-level comments. | |
| 340–352 | Spurious whitespace changes. | |
| 358–372 | Nit: trailing dot plz. | |
| 371 | ||
| 379–381 | ||
| mlir/test/Dialect/Affine/scalrep.mlir | ||
| 577 | Nit: there's no "external" operation in this test AFAICS | |
| 611–622 | Please put the check blocks consistently below or consistently after the function they should match. (Look what the rest of the file does). It's also possible to use the // ----- marker to separate test cases. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 141 | How does this work in presence of things like subview? Or if two different function parameters are the same memref (maybe that's forbidden by the affine dialect I don't remember)? | |
| 290 | Seems like you can early return as soon as hasSideEffect is true here instead of continuing to traverse the IR? | |
| 314 | Likely another place where you can early return when hasSideEffect is true. | |
| 359 | After the call to checkOperation is another place to check for hasSideEffect and early return to limit the traversal. | |
| mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp | ||
|---|---|---|
| 395 | rename depStore to load | |
| 413 | hasSingleElement check is removed. Have you tested multi-block functions? If there is no problem, could you remove the above comment and add a unit test of multi-block function? | |
| 425 | Why remove loadCSE from this and other comments? | |
| mlir/test/Dialect/Affine/scalrep.mlir | ||
| 592 | Comment not correct, please modify. | |
Pass dom infos by reference since they can't be null at this stage.