This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Correct memrefdataflow behavior in the presence of cast and other operations
ClosedPublic

Authored by wsmoses on Jun 10 2021, 12:15 PM.

Details

Summary

MemRefDataFlow performs mem2reg style operations for affine load/stores. Unfortunately, it is not presently correct in the presence of external operations such as memref.cast, or function calls. This diff extends the functionality of the pass to remain correct in the presence of such ops.

Diff Detail

Event Timeline

wsmoses created this revision.Jun 10 2021, 12:15 PM
wsmoses requested review of this revision.Jun 10 2021, 12:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2021, 12:16 PM

@wsmoses MemRefDataFlow has just been moved and been renamed. Could you rebase right away? (so that the rebase becomes easy with no further intervening commits)

wsmoses updated this revision to Diff 351943.Jun 14 2021, 11:32 AM

Rebase after memrefdataflow opt move

ftynse added inline comments.Jun 21 2021, 4:11 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
73–93

Could we have some doc comments for these functions?

130–134

Nit: add braces for these

142–144

Nit: expand auto here plz

185
188

Nit: ++iter

195

Expanding auto will remove this linter error.

231

Nit: here and below, please add trailing dots to all sentences.

362

Please fix the linter error.

mlir/test/Dialect/Affine/scalrep.mlir
238–253

Please explain what needs to be done here.

659

Please add a newline.

wsmoses updated this revision to Diff 353443.Jun 21 2021, 11:43 AM
wsmoses marked 5 inline comments as done.

Fix formatting

bondhugula requested changes to this revision.Jun 21 2021, 6:27 PM
bondhugula added inline comments.
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
73–93

Since these aren't publicly exposed, I think the convention is to have these comments on the definition instead.

115–116

Reflow - use width.

117

auto should be fine here.

127–128

Reflow.

393

You don't need to initialize this to nullptr - it has default null init.

bondhugula added inline comments.Jun 21 2021, 6:27 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
105

From the comment itself, it's not clear what it means for an operation to have a memory effect on memOp! An op has a memory effect or not through a Value in this case - it isn't meaningful to say an op has a memory effect on another op. Can you rephrase?

113

Can you move this declare/init further below?

Also, legal -> hasEffect?

183

Terminate with period.

221–225

This is covered neither by the commit title nor commit summary and is a new feature/extension completely separate from the fix for side-effecting/non-affine deferencing ops being present. Can you please move this to another revision?

This revision now requires changes to proceed.Jun 21 2021, 6:27 PM
wsmoses updated this revision to Diff 353678.Jun 22 2021, 9:38 AM

Address changes and refactor store forwarding

wsmoses marked 12 inline comments as done.Jun 22 2021, 9:40 AM
wsmoses added inline comments.
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
117

I don't believe it is. Since check is called recursively, the actual type is needed.

wsmoses added inline comments.Jun 22 2021, 9:41 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
113

Changed name, however, this can't be moved any further down since it is referenced within the check function.

ftynse added inline comments.Jun 23 2021, 1:32 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
117

Indeed. And you need an std::function, function_ref won't work correctly.

mlir/test/Dialect/Affine/scalrep.mlir
238–253

Can this be fixed by recognizing specifically the ops with AffineRead/WriteOpInterface before MemoryEffectOpInterface, and keeping track of the affine subscripts?

bondhugula requested changes to this revision.EditedJun 26 2021, 12:48 AM

This patch is regressing functionality - on store_load_store_nest_forward. Dependence analysis was already being used by this pass and was able to earlier already detect that a store was able to forward to the load. We shouldn't be disabling that test instead of revising this to not undo that inference.

mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
76–77

Pass dom infos by reference since they can't be null at this stage.

81

Likewise.

104–236

Ensure that all ... do not have ...

Ensure that no operation between ... has the ...

It's not clear what "potential memory effect" is. Rephrase.

Nit: Enclose any arg names referred to in backticks (eg. memOp, EffectType).

105

It still isn't clear here what "effect on an op" means.

106

is defined to be an operation -> is an operation ?

110

originalMemref -> memref ?

It's not clear what "original" here means.

114

Document this please. Should this be hasSideEffect? hasEffect is generic.

117

Please rename this lambda - check is too general.

142

may have a side effect?

Nit: Please terminate all comments with a period - here and everywhere else.

157

Please rename this lambda.

168

Assertion message please.

183

Name isn't descriptive enough: todoBlocks?

252

to -> endOp or untilOp?

261

You don't need the .getOperation() I think.

268–272

Update doc comment to cover the additional ..ToErase args.

322

"2." is gone and all numbers are off by one.

You'll also have to update the top-level class comment on AffineScalarReplacement.

329–333

This isn't meaningful as a loop - this is only equivalent to:

assert(fwdingCandidates.size() <= 1 && "...");
334–335

Move this up and

if (....empty()) 
  return failure();
... assertion ...
lastWriteStoreOp = fwdingCandidates.front();
362

This isn't properly named - loadCandidates?

383–385

2 values -> two values

394

Avoid auto here.

446

This is a local variable and isn't used any more - no need to clear.

mlir/test/Dialect/Affine/scalrep.mlir
240–241

Dependence analysis is already being used by the pass and was able to earlier already detect that this store did not reach the load above. Please see comment above - it's exactly saying that:
" Although there is a dependence from the second store to the load, it is
satisfied by the outer surrounding loop, and does not prevent the first
// store to be forwarded to the load."

This patch is regressing on this functionality and undoing that inference. We shouldn't be disabling this test.

This revision now requires changes to proceed.Jun 26 2021, 12:48 AM
wsmoses updated this revision to Diff 354763.Jun 27 2021, 12:04 PM

Address comments

wsmoses updated this revision to Diff 354764.Jun 27 2021, 12:14 PM
wsmoses marked 26 inline comments as done.

Additional comments

mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
285

This is a sketch of how the additional dependence analysis can be restored, while maintaining correctness with other ops. This snippet here is not sufficient, see my comment in the test.

mlir/test/Dialect/Affine/scalrep.mlir
240–241

I've demonstrated above where such dependence checking code could be inserted. Unfortunately, I do not understand the existing dependence loop structure sufficiently to rewrite it to be correct and apply here, when taking into account that other operators could modify the memory.

If you could explain that, or take a stab feel free.

Alternatively, if you're okay with it, since this PR fixes existing correctness bugs that I've seen in practice, perhaps we could first fix correctness, then restore that optimization?

bondhugula requested changes to this revision.Jun 27 2021, 5:15 PM
bondhugula added inline comments.
mlir/test/Dialect/Affine/scalrep.mlir
240–241

Unfortunately, I do not understand the existing dependence loop structure sufficiently

Can you explain which part isn't clear? I can help if the code comments or doc isn't clear.

Alternatively, if you're okay with it, since this PR fixes existing correctness bugs that >I've seen in practice, perhaps we could first fix correctness, then restore that

Actually, we shouldn't be creating such a regression - this is really not a corner case or a special case that this is regressing on but a pretty important pattern and the very reason dependence analysis is being used by this pass! It is common for affine passes to be missing handling of such side-effecting operations that need to be fixed at various places - (for eg. we always assume that %memref_1 and %memref_2 are different even if they are block arguments). Let's fix this while not regressing.

For hasNoIntervening effect, can you document what "between" two ops means? It's obvious when the two ops lie in the same block but not otherwise.

Also, the class comment on the top has still not been updated to reflect this check.

This revision now requires changes to proceed.Jun 27 2021, 5:15 PM
bondhugula added inline comments.Jun 27 2021, 5:53 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
266

Address clang-tidy here please.

290–300

Reflow - please use the entire width for comments here and everywhere else - it'll lead to fewer lines in general.

bondhugula added inline comments.Jun 27 2021, 6:21 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
77

This is unused.

284–285

an -> and

285

It's not clear to me what "check the entire parent op" means here. Rephrase?

308

Please put from and to in backticks or else it messes up what' being conveyed.

319

If we only expect to find the first op and are asserting below when there is more than one, we should be doing it right here instead of finding more than two ops.

We are also completely missing comments here on what the approach is. To start with, postDominanceInfo is now no longer being used, yet it's being computed and passed to this function. All of the related comments are outdated and incorrect and still appear to be present. I don't think I really understand the approach at all. Can you please properly document things and summarize the approach?

322–333

This whole comment is outdated and incorrect now. I think several of the code comment in class comment are also similarly outdated.

334

.size() == 0 -> .empty()

423

This is not even used any more anywhere in the code.

bondhugula added inline comments.Jun 27 2021, 6:36 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
323–333

You may also want to rebase on master - post dominance info isn't used anymore and this check was changed to use dominance info.

bondhugula edited reviewers, added: dcaballe; removed: bondhugula.Jun 27 2021, 6:42 PM

I'm going to be mostly away from reviewing the next 1.5 weeks - @ayzhuang, @dcaballe - could you please take over reviewing this? I've pretty much posted all comments I had so far.

This revision now requires review to proceed.Jun 27 2021, 6:42 PM
bondhugula added inline comments.Jun 27 2021, 7:16 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
275–295

Instead of dropping the reliance on dominance and the condition (3) that existed to find the "last write" to fwd, what could instead be done to be unified with the existing approach is to simply also consider here all ops that have a memory write side-effect *in addition to* just affine writes for storeOps above. Then ensure that those ops get into depSrcStores as well (effectively treating such non-affine ops conservatively). Find the last write op in the same way as before and this naturally/transparently takes care of it. If the unique last write isn't an AffineWriteOpInterface, no forwarding will happen. This will also be as powerful as the previous approach and will not cause any regression. It's also no more expensive than the existing approach and you really don't need to check for any paths separately. (You are just reusing dominance info.)

wsmoses updated this revision to Diff 354783.Jun 27 2021, 7:40 PM
wsmoses marked 9 inline comments as done.

Restore dependence analysis and address comments

wsmoses added inline comments.Jun 27 2021, 7:53 PM
mlir/test/Dialect/Affine/scalrep.mlir
240–241

Understood. I've spent some time recently going through and trying to understand the existing analysis, and have added what I think is a reasonable solution that maintains the aforementioned case, and general correctness.

For ease, I've also added some more depth to that part of the code (and perhaps can add a figure if thought to be useful). In essence the reliance of the store dominating the load as a necessary precondition for why the given loop depth range was unclear to me and why I was hesitant to adding something like that before ensuring it would also apply here.

Happily it does (with the aforementioned changes and comment describing why it should be correct).

It may even be able to be slightly more aggressive than the previous dependence analysis since we can set the min loop depth to that of the proposed replacement op rather than the min across all potential storing operations.

ftynse accepted this revision.Jun 28 2021, 3:13 AM

I'm going to be mostly away from reviewing the next 1.5 weeks - @ayzhuang, @dcaballe - could you please take over reviewing this? I've pretty much posted all comments I had so far.

Diego is also unavailable AFAIK.

The fixed version without regression looks okay to me. The overall recursive algorithm is to look if there is a non-affine operation that may have side effects preventing the store-to-load forwarding. It starts from the store and eagerly looks into all control-flow successors (operations and blocks) until the load is reached or all control flow paths are visited. This is overly conservative because it is visiting operations that do not lie on any control flow path from the store to the load. Ideally, we would have some reachability analysis and only visit the operations that are (1) reachable from the store and (2) such that the load is reachable from them.

mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
106

Nit: ///

280–281

Please reflow and use /// for top-level comments.

340–352

Spurious whitespace changes.

358–372

Nit: trailing dot plz.

371
379–381
mlir/test/Dialect/Affine/scalrep.mlir
577

Nit: there's no "external" operation in this test AFAICS

611–622

Please put the check blocks consistently below or consistently after the function they should match. (Look what the rest of the file does). It's also possible to use the // ----- marker to separate test cases.

This revision is now accepted and ready to land.Jun 28 2021, 3:13 AM
wsmoses updated this revision to Diff 354907.Jun 28 2021, 8:19 AM
wsmoses marked 6 inline comments as done.

Post rebase / fixing nits

mehdi_amini added inline comments.Jun 28 2021, 6:13 PM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
141

How does this work in presence of things like subview? Or if two different function parameters are the same memref (maybe that's forbidden by the affine dialect I don't remember)?

290

Seems like you can early return as soon as hasSideEffect is true here instead of continuing to traverse the IR?

314

Likely another place where you can early return when hasSideEffect is true.

359

After the call to checkOperation is another place to check for hasSideEffect and early return to limit the traversal.

ayzhuang added inline comments.Jun 29 2021, 10:30 AM
mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
395

rename depStore to load

413

hasSingleElement check is removed. Have you tested multi-block functions? If there is no problem, could you remove the above comment and add a unit test of multi-block function?

425

Why remove loadCSE from this and other comments?

mlir/test/Dialect/Affine/scalrep.mlir
592

Comment not correct, please modify.