This is an archive of the discontinued LLVM Phabricator instance.

[mlir] parallel loop canonicalization
ClosedPublic

Authored by gysit on Jun 19 2020, 7:41 AM.

Details

Summary

The patch introduces a canonicalization pattern for parallel loops. The pattern removes single-iteration loop dimensions if the loop bounds and steps are constants.

Diff Detail

Event Timeline

gysit created this revision.Jun 19 2020, 7:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 19 2020, 7:41 AM
herhut accepted this revision.Jun 19 2020, 8:31 AM

Thank you for splitting this out! Just some nits but otherwise this looks great! Thanks for adding this.

mlir/lib/Dialect/SCF/SCF.cpp
752

nit: You can also use op.getInductionVars()which makes the intent clearer.

mlir/test/Dialect/SCF/canonicalize.mlir
12

It would be great to add the case of an empty iteration space. Either with another copy that does not get rewritten or by introducing one in this example.

This revision is now accepted and ready to land.Jun 19 2020, 8:31 AM
gysit added inline comments.Jun 19 2020, 8:59 AM
mlir/test/Dialect/SCF/canonicalize.mlir
12

Does it make sense to have a canonicalization pattern that remove loops with an empty iteration space entirely. I was circumventing this bcs I was not sure if I can simply replace all uses of the loop result values by the init values. And to be honest, I also do not believe it is a very relevant "optimization".

gysit updated this revision to Diff 272103.Jun 19 2020, 9:16 AM
  • added additional test
gysit marked 2 inline comments as done.Jun 19 2020, 9:17 AM
rriddle added inline comments.Jun 19 2020, 11:16 AM
mlir/lib/Dialect/SCF/SCF.cpp
14

I don't think most of these are necessary.

779

Why clone instead of inline?

gysit updated this revision to Diff 272247.Jun 20 2020, 2:22 AM
  • cleanup includes
gysit marked 3 inline comments as done.Jun 20 2020, 2:41 AM
gysit added inline comments.
mlir/lib/Dialect/SCF/SCF.cpp
14

You are right.

779

I didn't find an inline signature that takes a mapping parameter. Luckily the clone is in a code path that will not execute often... If there is a better solution for the entire block argument mapping I am keen to learn it.

mehdi_amini added inline comments.Jun 22 2020, 9:05 PM
mlir/lib/Dialect/SCF/SCF.cpp
779

It isn't clear to me that you need a mapping here, instead of mapping.map(std::get<3>(bounds), std::get<0>(bounds)); couldn't you just do std::get<3>(bounds).replaceAllUsesWith(std::get<0>(bounds)) and inline the region?
(you may have to remap other block args, but it may be handled by the inline region code)

Alternatively, we should extend the inlineRegion methods with one that fits the needs here.

gysit marked 2 inline comments as done.Jun 22 2020, 10:49 PM
gysit added inline comments.
mlir/lib/Dialect/SCF/SCF.cpp
779

My understanding (from an earlier review) is that it is not a good idea to use replaceAllUsesWith with the PatternRewriter since it cannot track these changes (which is needed to rollback?). But I may be wrong there and things may have changed since this discussion. Let me know if there is a misunderstanding on my side.

Having an inline signature with a mapping sounds like a useful extension to me!

gysit marked an inline comment as not done.Jun 23 2020, 8:59 AM
gysit added inline comments.
mlir/lib/Dialect/SCF/SCF.cpp
779

I learned canonicalization patterns are never reverted on success. I thus assume I can use replaceAllUsesWith but only after line 738 where the pattern may return failure()? Unfortunately I also need to erase all remapped arguments. The following code works for me:

for (auto iv : llvm::enumerate(llvm::reverse(op.getInductionVars()))) {
  if (auto lowerBound = mapping.lookupOrNull(iv.value())) {
    iv.value().replaceAllUsesWith(lowerBound);
    op.getBody()->eraseArgument(op.getNumLoops() - iv.index() - 1);
  }
}
rewriter.inlineRegionBefore(op.region(), newOp.region(),
                            newOp.region().begin());

If I do not erase the remapped arguments I get an invalid loop with the original number of induction variables / block arguments. Let me know if there is a nicer way to do this and if I should update the patch.

ftynse added inline comments.Jun 23 2020, 10:26 AM
mlir/lib/Dialect/SCF/SCF.cpp
779

I'd say they are _currently_ never reverted on success if you run the canonicalization pass. If somebody decides to put this pattern into the legalization infra (e.g., applyAnalysisConversion) which can use these patterns just fine, they can be reverted...

That being said, if you use eraseArgument, you can use replaceAllUsesWith and the rest of in-place modifications because none of those would be compatible with the rollback...

@rriddle will be the most knowledgeable here

rriddle added inline comments.Jun 23 2020, 12:12 PM
mlir/lib/Dialect/SCF/SCF.cpp
779

I would not rely on the fact that they aren't currently reverted. All patterns generally have the same restrictions irregardless of the driver, i.e. you should not really be mutating outside of the pattern rewriter. If you code patterns, especially canonicalization patterns, to the quirks of a specific driver it breaks the ability to use it anywhere else and it also makes it much much more difficult if anyone ever wants to change the way the driver works in the future.

mehdi_amini added inline comments.Jun 23 2020, 12:36 PM
mlir/lib/Dialect/SCF/SCF.cpp
779

That's interesting: limiting canonicalization to "cancellable" transformations may incur extra costs (as we see here): do we have an RAUW exposed in a way that the driver could change the behavior to have efficient canonicalizations? (there may be tricky interactions though, I'm not sure how it'd play out)

rriddle added inline comments.Jun 23 2020, 1:04 PM
mlir/lib/Dialect/SCF/SCF.cpp
779

It's not just about cancellable or not, even the greedy driver keeps a mapping of things which can easily become invalidated. There have been many instances of crashes/asan failures because a pattern is doing something without informing the driver. For dialect conversion it is largely about cancellability, for greedy it is largely about allowing the driver to remove mappings for things that are being invalidated.

gysit updated this revision to Diff 273461.Jun 25 2020, 11:11 AM
gysit edited the summary of this revision. (Show Details)
  • Added a comment that reflects the discussions about clone (can be reverted) vs inline (performance).
  • The current implementation is revertible at the cost of cloning the loop body.
  • Use std::tie to improve the code readability.
herhut accepted this revision.Jun 25 2020, 12:23 PM

I think the conclusion is that we want it to be cancelable. So this is ready to land.

mlir/test/Dialect/SCF/canonicalize.mlir
12

I agree it is not very relevant but yes, you could replace the results with the init values in such case.

This revision was automatically updated to reflect the committed changes.