Add a transpose option to hoist padding that transposes the padded tensor before storing it into the packed tensor. The early transpose improves the memory access patterns of the actual compute kernel. The patch introduces a transpose right after the hoisted pad operation and a second transpose inside the compute loop. The second transpose can either be fused into the compute operation or canonicalizes away when lowering to vector instructions.
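To make the claim about memory access patterns concrete, here is a plain C++ analogue (a sketch only: the matmul shape, sizes, and names are illustrative and not taken from the patch). One operand is packed once, outside the compute loops, in transposed layout, so the innermost reduction loop reads both operands with unit stride, which is the effect the early transpose aims for on the packed tensor.

```cpp
// Illustrative analogue only; not the MLIR implementation.
#include <cstdio>
#include <vector>

constexpr int M = 4, N = 4, K = 8;

int main() {
  std::vector<float> A(M * K, 1.0f), B(K * N, 2.0f), C(M * N, 0.0f);

  // Analogue of "hoisted pad + transpose": pack B once, outside the compute
  // loops, in transposed layout (outer dimension n, inner dimension k).
  std::vector<float> Bpacked(N * K);
  for (int k = 0; k < K; ++k)
    for (int n = 0; n < N; ++n)
      Bpacked[n * K + k] = B[k * N + n];

  // Compute kernel: the innermost k-loop now walks both A and Bpacked with
  // stride 1; the analogue of the second transpose is folded into the
  // packed layout.
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k)
        acc += A[m * K + k] * Bpacked[n * K + k];
      C[m * N + n] = acc;
    }

  std::printf("C[0] = %f\n", C[0]); // expect 16.000000
  return 0;
}
```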
Event Timeline
mlir/lib/Dialect/Linalg/Transforms/HoistPadding.cpp

| Line | Comment |
|---|---|
| 160 | You could use `RankedTensorType::Builder`, up to you. |
| 165 | Can we add this as a helper somewhere in linalgops.cpp / .h or utils? |
| 396–398 | Please add some comment on the fact that we do not modify the loop order but just the tensor order, and so the analysis is the same but the contiguous accesses end up being different. |

mlir/test/lib/Dialect/Linalg/TestLinalgCodegenStrategy.cpp

| Line | Comment |
|---|---|
| 114 | Can we make the individual strings `:`-separated (i.e. `1:0,0,0`) until we can find a better list-of-list option? |
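Regarding the `1:0,0,0` suggestion, here is a standalone C++ sketch of one possible way to split such a `:`-separated string into a list of integer lists. It is not the actual test-pass option handling, and the meaning of the individual fields is not taken from the review; only the splitting scheme is illustrated.

```cpp
// Hypothetical parser for strings like "1:0,0,0": split on ':' into fields,
// then split each field on ',' into integers.
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

static std::vector<std::vector<int>> parseColonSeparated(const std::string &s) {
  std::vector<std::vector<int>> result;
  std::stringstream fields(s);
  std::string field;
  while (std::getline(fields, field, ':')) {
    std::vector<int> values;
    std::stringstream items(field);
    std::string item;
    while (std::getline(items, item, ','))
      values.push_back(std::stoi(item));
    result.push_back(values);
  }
  return result;
}

int main() {
  // "1:0,0,0" -> {{1}, {0, 0, 0}}
  for (const auto &list : parseColonSeparated("1:0,0,0")) {
    for (int v : list)
      std::printf("%d ", v);
    std::printf("\n");
  }
  return 0;
}
```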