The current tileConsumerAndFuseProducers method only allows replacing
the uses of the tiled consumer, not those of the fused producers.
Given how fusion currently works, it is not always possible or easy to
return a tensor value from the tiled loop nest to replace the uses of
a fused producer, since the same tile of the fused producer might be
recomputed while computing multiple tiles of the consumer op. This
patch adds an option that allows the transformation to return the
tensor replacements for the fused producer ops. It expects the caller
to choose which loops are tiled so that each tile of a fused producer
is computed only once. An analysis might be able to determine this,
but for now it is left to the caller.
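
To make the recomputation issue concrete, here is a minimal sketch in Python. It is a toy model of a tiled loop nest, not the MLIR API: the function names, the `producer_dims` parameter (a stand-in for the producer's indexing map), and the convention that a tile size of 0 means "loop not tiled" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def producer_tile_computations(consumer_shape, tile_sizes, producer_dims):
    """Count how often each producer tile is computed in a toy fused loop nest.

    consumer_shape: iteration-space extent of each consumer loop.
    tile_sizes: tile size per loop; 0 means the loop is not tiled (assumed
        convention for this sketch).
    producer_dims: indices of the consumer loops that index the producer's
        result (hypothetical stand-in for an indexing map).
    """
    # One range of tile offsets per loop; an untiled loop has a single tile.
    offsets = [
        range(0, extent, tile) if tile else range(1)
        for extent, tile in zip(consumer_shape, tile_sizes)
    ]
    counts = Counter()
    for tile_offsets in product(*offsets):
        # The fused producer tile is identified only by the offsets of the
        # loops it depends on, and it is computed once per consumer tile.
        producer_tile = tuple(tile_offsets[d] for d in producer_dims)
        counts[producer_tile] += 1
    return counts


def computed_once(consumer_shape, tile_sizes, producer_dims):
    """Return True if every producer tile is computed exactly once."""
    counts = producer_tile_computations(consumer_shape, tile_sizes, producer_dims)
    return all(n == 1 for n in counts.values())
```

With an 8x8 consumer whose producer is indexed only by the first loop, `computed_once((8, 8), (4, 4), (0,))` is `False` because each producer tile is computed for two consumer tiles, while `computed_once((8, 8), (4, 0), (0,))` is `True`. Only in the second case does a single tensor value assembled in the loop nest correspond to the full producer result.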
Details
- Reviewers
nicolasvasilache
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Discussed this offline. The plan is to start by building small IR transform utilities that can be tested independently, and to reevaluate how they compose once we have better visibility of the parts.