This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][affine-loop-fusion] Fix incorrect slice bounds on reduction producers when fusing
Needs ReviewPublic

Authored by sumesh13 on May 6 2022, 1:42 PM.

Details

Summary

A producer loop that is a reduction into a scalar value is inserted into the consumer as a single iteration. The producer loop needs to be inserted with it original bounds to be correct.

Diff Detail

Event Timeline

sumesh13 created this revision.May 6 2022, 1:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 6 2022, 1:42 PM
sumesh13 requested review of this revision.May 6 2022, 1:42 PM
sumesh13 updated this revision to Diff 428146.May 9 2022, 11:38 AM

clang-format fix

sumesh13 updated this revision to Diff 428226.May 9 2022, 3:57 PM

Update test

Before I go for an in-depth review, one question on the approach:
If the fusion was incorrect, was the dependence analysis incorrect? If that's the case, it needs to be enhanced as a generic solution. I see some matching being done, and that made me wonder if matching is a robust enough solution.

@vinayaka-polymage Thanks for taking a look! Perhaps I should have worded it differently. The result of the entire fusion is what am referring to here . While it is a valid fusion canidadate, what is incorrect is not using the entire slice bounds. A similar fixing of bounds
happens when there is read-read dependency. See https://github.com/llvm/llvm-project/blob/3e6ba89055c80e6360b7605464520711b30084a6/mlir/lib/Dialect/Affine/Analysis/Utils.cpp#L1107 . My change is along those lines and is keeping the origina bounds
when the producer is a memref based reduction. The pattern checking is checking if it is a memref based reduction like suggested in https://discourse.llvm.org/t/detecting-reduction-loops-in-affine-dialect/2527

sumesh13 retitled this revision from [MLIR][affine-loop-fusion] Fix a bug that fuses producers that are reductions incorrectly to [MLIR][affine-loop-fusion] Fix incorrect slice bounds on reduction producers when fusing.
vinayaka-polymage resigned from this revision.Jun 2 2022, 11:36 PM

Thanks @sumesh13 for the clarification. Sorry for the late response here, I am unavailable (travel) until 12th. I will check this thread after that.

I would like to help but I'm having a hard time guessing how the wrong code looks like and what is this actually fixing. Something that could help, and that is very common in LLVM, is to pre-commit the test matching the wrong output, and then introduce the fix so that we can clearly see how the test evolves. Could you please do that? It would be very helpful.

Thanks,
Diego

sumesh13 set the repository for this revision to rG LLVM Github Monorepo.Jun 9 2022, 4:29 PM
sumesh13 updated this revision to Diff 435730.Jun 9 2022, 4:32 PM

Example showing fusion is incorrect after slice with incorrect bounds is inserted.

@dcaballe Thanks for the suggestion. I have added the example that shows the incorrect fusion. You see how the initialization of the reduction is merged into the reduction body due to bounds of the slice being set as 1.

sumesh13 updated this revision to Diff 435745.Jun 9 2022, 5:24 PM

Recommit fix that address the incorrect fusion shown in last commit

@dcaballe Sorry couldnt remind earlier. Was on vacation myself. Did you need anything else for the review ?

I think we need feedback from @bondhugula or @vinayaka-polymage. IIRC, some of the utilities that this patch modifies are used in different places and I think they assume that reductions are represented by SSA values through iter_args. @bondhugula had some thoughts about converting reductions back and forth from SSA to memory representation based on the requirements of each analysis/transformation. I'm missing some context here.

I'm also wondering if we should use affine analysis to detect memory reductions instead of just pattern-matching the IR. That would be less error-prone?