This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Correct memrefdataflow behavior in the presence of cast and other operations
ClosedPublic

Authored by wsmoses on Jun 10 2021, 12:15 PM.

Details

Summary

MemRefDataFlow performs mem2reg style operations for affine load/stores. Unfortunately, it is not presently correct in the presence of external operations such as memref.cast, or function calls. This diff extends the functionality of the pass to remain correct in the presence of such ops.

Diff Detail

Event Timeline

wsmoses created this revision.Jun 10 2021, 12:15 PM
wsmoses requested review of this revision.Jun 10 2021, 12:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2021, 12:16 PM

@wsmoses MemRefDataFlow has just been moved and been renamed. Could you rebase right away? (so that the rebase becomes easy with no further intervening commits)

wsmoses updated this revision to Diff 351943.Jun 14 2021, 11:32 AM

Rebase after memrefdataflow opt move

ftynse added inline comments.Jun 21 2021, 4:11 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
68–71

Could we have some doc comments for these functions?

108–112

Nit: add braces for these

120–122

Nit: expand auto here plz

163
166

Nit: ++iter

173

Expanding auto will remove this linter error.

209

Nit: here and below, please add trailing dots to all sentences.

334

Please fix the linter error.

mlir/test/Dialect/Affine/scalrep.mlir
248–249

Please explain what needs to be done here.

645

Please add a newline.

wsmoses updated this revision to Diff 353443.Jun 21 2021, 11:43 AM
wsmoses marked 5 inline comments as done.

Fix formatting

bondhugula requested changes to this revision.Jun 21 2021, 6:27 PM
bondhugula added inline comments.
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
68–71

Since these aren't publicly exposed, I think the convention is to have these comments on the definition instead.

93–94

Reflow - use width.

95

auto should be fine here.

105–106

Reflow.

367

You don't need to initialize this to nullptr - it has default null init.

bondhugula added inline comments.Jun 21 2021, 6:27 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
83

From the comment itself, it's not clear what it means for an operation to have a memory effect on memOp! An op has a memory effect or not through a Value in this case - it isn't meaningful to say an op has a memory effect on another op. Can you rephrase?

91

Can you move this declare/init further below?

Also, legal -> hasEffect?

161

Terminate with period.

199–203

This is covered neither by the commit title nor commit summary and is a new feature/extension completely separate from the fix for side-effecting/non-affine deferencing ops being present. Can you please move this to another revision?

This revision now requires changes to proceed.Jun 21 2021, 6:27 PM
wsmoses updated this revision to Diff 353678.Jun 22 2021, 9:38 AM

Address changes and refactor store forwarding

wsmoses marked 12 inline comments as done.Jun 22 2021, 9:40 AM
wsmoses added inline comments.
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
95

I don't believe it is. Since check is called recursively, the actual type is needed.

wsmoses added inline comments.Jun 22 2021, 9:41 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
91

Changed name, however, this can't be moved any further down since it is referenced within the check function.

ftynse added inline comments.Jun 23 2021, 1:32 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
95

Indeed. And you need an std::function, function_ref won't work correctly.

mlir/test/Dialect/Affine/scalrep.mlir
248–249

Can this be fixed by recognizing specifically the ops with AffineRead/WriteOpInterface before MemoryEffectOpInterface, and keeping track of the affine subscripts?

bondhugula requested changes to this revision.EditedJun 26 2021, 12:48 AM

This patch is regressing functionality - on store_load_store_nest_forward. Dependence analysis was already being used by this pass and was able to earlier already detect that a store was able to forward to the load. We shouldn't be disabling that test instead of revising this to not undo that inference.

mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
71–72

Pass dom infos by reference since they can't be null at this stage.

76

Likewise.

82–125

Ensure that all ... do not have ...

Ensure that no operation between ... has the ...

It's not clear what "potential memory effect" is. Rephrase.

Nit: Enclose any arg names referred to in backticks (eg. memOp, EffectType).

83

It still isn't clear here what "effect on an op" means.

84

is defined to be an operation -> is an operation ?

88

originalMemref -> memref ?

It's not clear what "original" here means.

92

Document this please. Should this be hasSideEffect? hasEffect is generic.

95

Please rename this lambda - check is too general.

120

may have a side effect?

Nit: Please terminate all comments with a period - here and everywhere else.

135

Please rename this lambda.

141

to -> endOp or untilOp?

146

Assertion message please.

150

You don't need the .getOperation() I think.

161

Name isn't descriptive enough: todoBlocks?

258–259

Update doc comment to cover the additional ..ToErase args.

306–307

"2." is gone and all numbers are off by one.

You'll also have to update the top-level class comment on AffineScalarReplacement.

306–307

This isn't meaningful as a loop - this is only equivalent to:

assert(fwdingCandidates.size() <= 1 && "...");
307–308

Move this up and

if (....empty()) 
  return failure();
... assertion ...
lastWriteStoreOp = fwdingCandidates.front();
334

This isn't properly named - loadCandidates?

357–359

2 values -> two values

368

Avoid auto here.

409

This is a local variable and isn't used any more - no need to clear.

mlir/test/Dialect/Affine/scalrep.mlir
250–251

Dependence analysis is already being used by the pass and was able to earlier already detect that this store did not reach the load above. Please see comment above - it's exactly saying that:
" Although there is a dependence from the second store to the load, it is
satisfied by the outer surrounding loop, and does not prevent the first
// store to be forwarded to the load."

This patch is regressing on this functionality and undoing that inference. We shouldn't be disabling this test.

This revision now requires changes to proceed.Jun 26 2021, 12:48 AM
wsmoses updated this revision to Diff 354763.Jun 27 2021, 12:04 PM

Address comments

wsmoses updated this revision to Diff 354764.Jun 27 2021, 12:14 PM
wsmoses marked 26 inline comments as done.

Additional comments

mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
174

This is a sketch of how the additional dependence analysis can be restored, while maintaining correctness with other ops. This snippet here is not sufficient, see my comment in the test.

mlir/test/Dialect/Affine/scalrep.mlir
250–251

I've demonstrated above where such dependence checking code could be inserted. Unfortunately, I do not understand the existing dependence loop structure sufficiently to rewrite it to be correct and apply here, when taking into account that other operators could modify the memory.

If you could explain that, or take a stab feel free.

Alternatively, if you're okay with it, since this PR fixes existing correctness bugs that I've seen in practice, perhaps we could first fix correctness, then restore that optimization?

bondhugula requested changes to this revision.Jun 27 2021, 5:15 PM
bondhugula added inline comments.
mlir/test/Dialect/Affine/scalrep.mlir
250–251

Unfortunately, I do not understand the existing dependence loop structure sufficiently

Can you explain which part isn't clear? I can help if the code comments or doc isn't clear.

Alternatively, if you're okay with it, since this PR fixes existing correctness bugs that >I've seen in practice, perhaps we could first fix correctness, then restore that

Actually, we shouldn't be creating such a regression - this is really not a corner case or a special case that this is regressing on but a pretty important pattern and the very reason dependence analysis is being used by this pass! It is common for affine passes to be missing handling of such side-effecting operations that need to be fixed at various places - (for eg. we always assume that %memref_1 and %memref_2 are different even if they are block arguments). Let's fix this while not regressing.

For hasNoIntervening effect, can you document what "between" two ops means? It's obvious when the two ops lie in the same block but not otherwise.

Also, the class comment on the top has still not been updated to reflect this check.

This revision now requires changes to proceed.Jun 27 2021, 5:15 PM
bondhugula added inline comments.Jun 27 2021, 5:53 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
155

Address clang-tidy here please.

179–189

Reflow - please use the entire width for comments here and everywhere else - it'll lead to fewer lines in general.

bondhugula added inline comments.Jun 27 2021, 6:21 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
72

This is unused.

173–174

an -> and

174

It's not clear to me what "check the entire parent op" means here. Rephrase?

197

Please put from and to in backticks or else it messes up what' being conveyed.

302–304

If we only expect to find the first op and are asserting below when there is more than one, we should be doing it right here instead of finding more than two ops.

We are also completely missing comments here on what the approach is. To start with, postDominanceInfo is now no longer being used, yet it's being computed and passed to this function. All of the related comments are outdated and incorrect and still appear to be present. I don't think I really understand the approach at all. Can you please properly document things and summarize the approach?

306–307

This whole comment is outdated and incorrect now. I think several of the code comment in class comment are also similarly outdated.

307

.size() == 0 -> .empty()

397

This is not even used any more anywhere in the code.

bondhugula added inline comments.Jun 27 2021, 6:36 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
306–307

You may also want to rebase on master - post dominance info isn't used anymore and this check was changed to use dominance info.

bondhugula edited reviewers, added: dcaballe; removed: bondhugula.Jun 27 2021, 6:42 PM

I'm going to be mostly away from reviewing the next 1.5 weeks - @ayzhuang, @dcaballe - could you please take over reviewing this? I've pretty much posted all comments I had so far.

This revision now requires review to proceed.Jun 27 2021, 6:42 PM
bondhugula added inline comments.Jun 27 2021, 7:16 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
258–278

Instead of dropping the reliance on dominance and the condition (3) that existed to find the "last write" to fwd, what could instead be done to be unified with the existing approach is to simply also consider here all ops that have a memory write side-effect *in addition to* just affine writes for storeOps above. Then ensure that those ops get into depSrcStores as well (effectively treating such non-affine ops conservatively). Find the last write op in the same way as before and this naturally/transparently takes care of it. If the unique last write isn't an AffineWriteOpInterface, no forwarding will happen. This will also be as powerful as the previous approach and will not cause any regression. It's also no more expensive than the existing approach and you really don't need to check for any paths separately. (You are just reusing dominance info.)

wsmoses updated this revision to Diff 354783.Jun 27 2021, 7:40 PM
wsmoses marked 9 inline comments as done.

Restore dependence analysis and address comments

wsmoses added inline comments.Jun 27 2021, 7:53 PM
mlir/test/Dialect/Affine/scalrep.mlir
250–251

Understood. I've spent some time recently going through and trying to understand the existing analysis, and have added what I think is a reasonable solution that maintains the aforementioned case, and general correctness.

For ease, I've also added some more depth to that part of the code (and perhaps can add a figure if thought to be useful). In essence the reliance of the store dominating the load as a necessary precondition for why the given loop depth range was unclear to me and why I was hesitant to adding something like that before ensuring it would also apply here.

Happily it does (with the aforementioned changes and comment describing why it should be correct).

It may even be able to be slightly more aggressive than the previous dependence analysis since we can set the min loop depth to that of the proposed replacement op rather than the min across all potential storing operations.

ftynse accepted this revision.Jun 28 2021, 3:13 AM

I'm going to be mostly away from reviewing the next 1.5 weeks - @ayzhuang, @dcaballe - could you please take over reviewing this? I've pretty much posted all comments I had so far.

Diego is also unavailable AFAIK.

The fixed version without regression looks okay to me. The overall recursive algorithm is to look if there is a non-affine operation that may have side effects preventing the store-to-load forwarding. It starts from the store and eagerly looks into all control-flow successors (operations and blocks) until the load is reached or all control flow paths are visited. This is overly conservative because it is visiting operations that do not lie on any control flow path from the store to the load. Ideally, we would have some reachability analysis and only visit the operations that are (1) reachable from the store and (2) such that the load is reachable from them.

mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
84

Nit: ///

263–264

Please reflow and use /// for top-level comments.

313–325

Spurious whitespace changes.

330–345

Nit: trailing dot plz.

343
352–355
mlir/test/Dialect/Affine/scalrep.mlir
597

Nit: there's no "external" operation in this test AFAICS

631–642

Please put the check blocks consistently below or consistently after the function they should match. (Look what the rest of the file does). It's also possible to use the // ----- marker to separate test cases.

This revision is now accepted and ready to land.Jun 28 2021, 3:13 AM
wsmoses updated this revision to Diff 354907.Jun 28 2021, 8:19 AM
wsmoses marked 6 inline comments as done.

Post rebase / fixing nits

mehdi_amini added inline comments.Jun 28 2021, 6:13 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
119

How does this work in presence of things like subview? Or if two different function parameters are the same memref (maybe that's forbidden by the affine dialect I don't remember)?

179

Seems like you can early return as soon as hasSideEffect is true here instead of continuing to traverse the IR?

203

Likely another place where you can early return when hasSideEffect is true.

248

After the call to checkOperation is another place to check for hasSideEffect and early return to limit the traversal.

ayzhuang added inline comments.Jun 29 2021, 10:30 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
369

rename depStore to load

387

hasSingleElement check is removed. Have you tested multi-block functions? If there is no problem, could you remove the above comment and add a unit test of multi-block function?

398

Why remove loadCSE from this and other comments?

mlir/test/Dialect/Affine/scalrep.mlir
612

Comment not correct, please modify.