This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Change the pattern for TiledLoopOp bufferization.
ClosedPublic

Authored by pifon2a on Aug 10 2021, 12:05 PM.

Details

Summary

This version is does not affect the patterns for Extract/InsertSliceOp and
LinalgOps.

Diff Detail

Event Timeline

pifon2a created this revision.Aug 10 2021, 12:05 PM
pifon2a requested review of this revision.Aug 10 2021, 12:05 PM
pifon2a updated this revision to Diff 365578.Aug 10 2021, 12:08 PM

Cosmetic change.

bkramer accepted this revision.Aug 10 2021, 12:25 PM

Thanks, this looks better than the original.

This revision is now accepted and ready to land.Aug 10 2021, 12:25 PM
This revision was landed with ongoing or failed builds.Aug 10 2021, 12:27 PM
This revision was automatically updated to reflect the committed changes.

Can you please add the integration test cases I suggested in the other patch? I'm still pretty sure this will miscompile. Fundamentally you are writing this as an "in-place" bufferization and that cannot be done with the current framework because it involves non-local reasoning.

Can you please revert this commit and the previous one I objected to? I think they are fundamentally going in the wrong direction.

https://llvm.org/docs/CodeReview.html#post-commit-review

If a community member expresses a concern about a recent commit, and this concern would have been significant enough to warrant a conversation during pre-commit review (including around the need for more design discussions), they may ask for a revert to the original author who is responsible to revert the patch promptly. Developers often disagree, and erring on the side of the developer asking for more review prevents any lingering disagreement over code in the tree. This does not indicate any fault from the patch author, this is inherent to our post-commit review practices. Reverting a patch ensures that design discussions can happen without blocking other development; it’s entirely possible the patch will end up being reapplied essentially as-is once concerns have been resolved.

Can you please add the integration test cases I suggested in the other patch? I'm still pretty sure this will miscompile. Fundamentally you are writing this as an "in-place" bufferization and that cannot be done with the current framework because it involves non-local reasoning.

I reverted both PRs. This is not really an in-place bufferization, this is an attempt to reason about subsets that tiled loop operates upon in the presence of an ugly terminator. That's why there is special logic to bufferize extract_slice/insert_slice that have a block arg of tiled loop as their operand. If tiled_loop had a separate region to specify the subset (tiled_range or full_range), then all of that would not be needed. Changing this operation would require time and it is not clear whether it makes sense to do that in MLIR OSS.

Thanks Alex!

Can you please add the integration test cases I suggested in the other patch? I'm still pretty sure this will miscompile. Fundamentally you are writing this as an "in-place" bufferization and that cannot be done with the current framework because it involves non-local reasoning.

I reverted both PRs. This is not really an in-place bufferization, this is an attempt to reason about subsets that tiled loop operates upon in the presence of an ugly terminator. That's why there is special logic to bufferize extract_slice/insert_slice that have a block arg of tiled loop as their operand. If tiled_loop had a separate region to specify the subset (tiled_range or full_range), then all of that would not be needed. Changing this operation would require time and it is not clear whether it makes sense to do that in MLIR OSS.

I'm not as much worried about the handling of extract_slice/insert_slice (I haven't looked in detail at that aspect). The most important problem I see is:

auto newLoop = rewriter.create<TiledLoopOp>(
    loc, adaptor.lowerBound(), adaptor.upperBound(), adaptor.step(),
    adaptor.inputs(), adaptor.outputs(), adaptor.iterator_types(),
    adaptor.distribution_types());

Note the use of adaptor.outputs() directly as the outs. Instead, it should be a copy of the outputs, to avoid miscompiles when the outputs are readonly or aliased. That is what I refer to when I say "in place bufferization".