This patch allows supplying an optional memory space for the promoted buffer.
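For context, a minimal sketch of what this changes (shapes and the memory space value are made up for illustration): promotion allocates a local buffer, and with this patch that allocation can be placed in a caller-chosen memory space instead of the default one.

```mlir
// Default promotion: the buffer lives in the default memory space.
%buf0 = memref.alloc() : memref<32x32xf32>
// With an explicit memory space (here 3, e.g. GPU workgroup memory):
%buf1 = memref.alloc() : memref<32x32xf32, 3>
```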
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
Hello @AviadCo, thanks for your contribution.
I have a question regarding your intended usage of this: do you have a pass pipeline in which this fits (and if so, is it possible to share it)?
The reason I am asking is that most passes in Linalg should be considered "test" passes, as we do not have automatic profitability heuristics upstream.
Instead we have been switching to implementing key functionality as transform dialect ops and apply / test them independently of a pass pipeline.
Separately, pass pipelines with proper heuristics can be built from these building blocks.
Here is a recent example of adding a new transform and a new test, in case it is useful: https://reviews.llvm.org/D153420.
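For concreteness, a minimal sketch of driving promotion through the transform dialect, independently of any pass pipeline (op names follow the upstream transform dialect at the time of writing; exact types and syntax may differ between revisions):

```mlir
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  // Match the ops to transform, then apply promotion to them directly.
  %matmul = transform.structured.match ops{["linalg.matmul"]} in %arg0
      : (!transform.any_op) -> !transform.any_op
  %promoted = transform.structured.promote %matmul
      : (!transform.any_op) -> !transform.any_op
}
```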
You could also directly target a memory space attribute rather than using hardcoded constants that are backend-specific and can be confusing (see also: https://discourse.llvm.org/t/confused-by-inconsistencies-in-gpu-magic-constants/72041).
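To illustrate the difference (types are illustrative):

```mlir
// Backend-specific magic constant; its meaning depends on the target:
%a = memref.alloc() : memref<16xf32, 3>
// Named memory space attribute; the intent is explicit:
%b = memref.alloc() : memref<16xf32, #gpu.address_space<workgroup>>
```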
mlir/include/mlir/Dialect/Linalg/Passes.td
- Line 157: nit: dimensions

mlir/lib/Dialect/Linalg/Transforms/UpdateAddressSpace.cpp
- Line 147 (On Diff #555006): MemRefType::Builder(srcMemRefType).setAddressSpace(...) or something similar, please.
- Line 159 (On Diff #555006): you would likely be better off here with a linalg::CopyOp, as it can further be mapped to e.g. GPU thread ids and vector load/store using something like: https://reviews.llvm.org/D154836
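For reference, a minimal sketch of the suggested linalg::CopyOp-based copy into a promoted buffer (the shapes and the workgroup memory space are illustrative assumptions):

```mlir
// A linalg.copy into the promoted buffer; unlike a raw loop or memref.copy,
// it can later be mapped to GPU thread ids or vectorized.
linalg.copy ins(%src : memref<32x32xf32>)
            outs(%promoted : memref<32x32xf32, #gpu.address_space<workgroup>>)
```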
Actually, here is a contribution that seems related: https://reviews.llvm.org/D144666
Do you see a way to generalize and/or reuse the existing work?
@nicolasvasilache thanks for the feedback, I really appreciate it!
I will take a look at the transform dialect and the references you added here and see how I can generalize a solution.
Thanks,
Aviad Cohen
I really appreciate your references. I now see that I may use the Linalg transform::PromoteOp with a small modification to add memory space support.
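A hypothetical sketch of what that modification could look like (the `memory_space` attribute name and syntax here are assumptions for illustration, not the final upstream API):

```mlir
// Promote the matched op's operands into buffers allocated in workgroup
// memory (the `memory_space` attribute is assumed, not confirmed upstream).
%promoted = transform.structured.promote %matmul
    { memory_space = #gpu.address_space<workgroup> }
    : (!transform.any_op) -> !transform.any_op
```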
Unrelated to this patch: I am a bit confused about why the GPU dialect is so deeply coupled to this transform. IMO we should avoid that coupling. Moreover, I am not using the GPU dialect at all but still need to specify the memory space.
Can you please review this new patch?
@AviadCo thanks, this looks good to me.
As to the existing state, I suspect the person who added these patterns wanted to implement a simple heuristic to raise the level of control/automation in the transform, but ended up coupling the GPU dialect with it in the process.
I'd welcome a refactoring to decouple these concerns indeed.
The operations used for alloc and copy could also be made parametric to avoid hardcoding.
Thanks for reviewing so fast!
I actually have some ideas to improve the promotion a bit (e.g. not copying outputs before the linalg operation when there are no uses inside the linalg block, meaning the output is not an "in/out" operand), so I believe I will make further commits to PromoteOp.
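To illustrate the write-only-output case (a made-up minimal example):

```mlir
#id = affine_map<(d0) -> (d0)>
// The output block argument %o is never read in the body, so the output is
// write-only (not an "in/out" operand) and the pre-op copy into the
// promoted buffer could be skipped.
linalg.generic {indexing_maps = [#id, #id], iterator_types = ["parallel"]}
    ins(%in : memref<32xf32>) outs(%out : memref<32xf32>) {
^bb0(%i: f32, %o: f32):
  linalg.yield %i : f32
}
```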