This revision is a work in progress that tries to entangle the notion of backend-specific type conversion.
In particular it demonstrates that a notional sink pass (or generalized pattern application) helps
avoid footguns that inevitably arise from pass-based thinking that requires keeping all passes in sync with
the same type conversion decision.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/include/mlir/Conversion/LLVMCommon/Pattern.h | ||
---|---|---|
51 | This API should not be used for addressing then because it is decoupled from the memory space? Won't this lead to unrealized_cast because of different index size between arithmetic and indexing? | |
60 | It's a bit weird that the indexing of memref would differ from this API which is more generic and about address spaces. | |
mlir/include/mlir/Dialect/GPU/IR/GPUBase.td | ||
58 | This is just gonna do isSharedMemoryAddressSpace(type.getMemorySpace()); does this indirection pulls its weight here? Also shall it be defined inline? |
mlir/include/mlir/Dialect/GPU/IR/GPUBase.td | ||
---|---|---|
62 | We could add hasConstantMemoryAddressSpace or hasLocalMemoryAddressSpace. | |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
211 | We should use 32b for the local and constant memory spaces as well. Here is the list of address spaces (I don't remember what is 2): generic = 0 global = 1 shared = 3 constant = 4 local = 5 As far as I can see, MLIR does set shared address space, but does not set local or constant address spaces. |
Does all this complexity really belongs to llvm conversion pass?
Maybe add a separate pass which decomposes memrefs index calculations into explicit ops (possibly inserting appropriate trunc casts depending on memory space) and then result is just straightforwardly converted to llvm.
As as bonus, you will be able to additional canonicalizations/transformations between this new pass and llvm conversion.
This API should not be used for addressing then because it is decoupled from the memory space?
Won't this lead to unrealized_cast because of different index size between arithmetic and indexing?