
[mlir][memref] WIP - Reorganize conversions involving memref in the presence of backend-specific type conversions.
Needs Review · Public

Authored by nicolasvasilache on Aug 2 2023, 7:27 AM.

Details

Summary

This revision is a work in progress that tries to disentangle the notion of backend-specific type conversion.
In particular, it demonstrates that a notional sink pass (or generalized pattern application) helps
avoid the footguns that inevitably arise from pass-based thinking, which requires keeping all passes
in sync with the same type-conversion decision.
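
To make the footgun concrete, here is a minimal C++ sketch (not part of this revision, using the upstream populate* helpers as they existed around this time): the backend-specific decision, here the index bitwidth, is made once in a single LLVMTypeConverter, and every pattern set is populated from that same converter. That shared-converter invariant is what a sink pass or generalized pattern application would preserve across dialects.

```cpp
#include "mlir/Conversion/FuncToLLVM/ConvertFuncToLLVM.h"
#include "mlir/Conversion/LLVMCommon/ConversionTarget.h"
#include "mlir/Conversion/LLVMCommon/LoweringOptions.h"
#include "mlir/Conversion/LLVMCommon/TypeConverter.h"
#include "mlir/Conversion/MemRefToLLVM/MemRefToLLVM.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Transforms/DialectConversion.h"
#include <utility>

using namespace mlir;

// Sketch only: the backend-specific decision (the index bitwidth here) is
// made exactly once, and every pattern set is populated from the same
// converter, so no pass-local copy of the decision can drift out of sync.
static LogicalResult lowerWithSingleTypeConverter(ModuleOp module) {
  MLIRContext *ctx = module.getContext();

  LowerToLLVMOptions options(ctx);
  options.overrideIndexBitwidth(32); // e.g. a GPU backend choice

  LLVMTypeConverter converter(ctx, options);

  RewritePatternSet patterns(ctx);
  populateFinalizeMemRefToLLVMConversionPatterns(converter, patterns);
  populateFuncToLLVMConversionPatterns(converter, patterns);

  LLVMConversionTarget target(*ctx);
  return applyPartialConversion(module, target, std::move(patterns));
}
```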

Diff Detail

Event Timeline

Herald added a reviewer: ftynse.
Herald added a reviewer: dcaballe.
Herald added a project: Restricted Project.
nicolasvasilache requested review of this revision. Aug 2 2023, 7:27 AM
mehdi_amini added inline comments. Aug 2 2023, 8:11 PM
mlir/include/mlir/Conversion/LLVMCommon/Pattern.h
51

Should this API then not be used for addressing, since it is decoupled from the memory space?

Won't this lead to unrealized_conversion_cast ops because of the different index sizes between arithmetic and indexing?

60

It's a bit odd that memref indexing would differ from this API, which is more generic and concerned with address spaces.

mlir/include/mlir/Dialect/GPU/IR/GPUBase.td
58

This is just going to do isSharedMemoryAddressSpace(type.getMemorySpace()); does this indirection pull its weight here? Also, should it be defined inline?
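
For reference, the indirection being questioned amounts to a one-line wrapper along these lines; the names are taken from the comment above and are not necessarily the ones in the patch:

```cpp
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Assumed to exist in the patch: a predicate on the memory-space attribute.
bool isSharedMemoryAddressSpace(Attribute memorySpace);

// The wrapper under discussion; whether it pulls its weight, and whether it
// should be defined inline, is exactly the question raised above.
inline bool hasSharedMemoryAddressSpace(MemRefType type) {
  return isSharedMemoryAddressSpace(type.getMemorySpace());
}
```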

guraypp added inline comments. Aug 2 2023, 11:54 PM
mlir/include/mlir/Dialect/GPU/IR/GPUBase.td
62

We could add hasConstantMemoryAddressSpace or hasLocalMemoryAddressSpace.

mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
211

We should use 32-bit indexing for the local and constant memory spaces as well. Here is the list of address spaces (I don't remember what 2 is):

generic = 0
global = 1
shared = 3
constant = 4
local = 5

As far as I can see, MLIR does set the shared address space, but does not set the local or constant address spaces.
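
For illustration, a minimal sketch (not from the patch) of how 32-bit indexing could be selected for the small memory spaces; the helper name is hypothetical and the numbering follows the list above:

```cpp
// Hypothetical helper: pick the index bitwidth from the NVVM address space.
static unsigned chooseIndexBitwidth(unsigned addressSpace) {
  switch (addressSpace) {
  case 3: // shared
  case 4: // constant
  case 5: // local
    // Small, on-chip spaces: 32-bit index arithmetic is sufficient.
    return 32;
  default:
    // generic (0) and global (1): keep the full 64-bit index.
    return 64;
  }
}
```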

Does all this complexity really belong in the LLVM conversion pass?
Maybe add a separate pass that decomposes memref index calculations into explicit ops (possibly inserting appropriate trunc casts depending on the memory space), so that the result is then straightforwardly converted to LLVM.
As a bonus, you would be able to run additional canonicalizations/transformations between this new pass and the LLVM conversion.
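
For illustration, a skeleton of what such a pre-lowering pass could look like; the pass name, command-line flag, and pattern classes are hypothetical, and the actual index-decomposition patterns are left as placeholders:

```cpp
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Support/TypeID.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
#include <utility>

using namespace mlir;

namespace {
// Skeleton of the suggested pre-lowering pass: expand memref index
// computations into explicit arith ops (inserting trunc casts where the
// memory space allows a narrower index), so that the later LLVM conversion
// only has to emit trivial pointer arithmetic.
struct DecomposeMemRefIndexingPass
    : public PassWrapper<DecomposeMemRefIndexingPass,
                         OperationPass<func::FuncOp>> {
  MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(DecomposeMemRefIndexingPass)

  StringRef getArgument() const final { return "decompose-memref-indexing"; }

  void runOnOperation() override {
    RewritePatternSet patterns(&getContext());
    // Placeholder: the real patterns would rewrite memref.load/store (and
    // friends) into explicit offset computations here.
    // patterns.add<ExpandLoadIndexing, ExpandStoreIndexing>(&getContext());
    if (failed(applyPatternsAndFoldGreedily(getOperation(),
                                            std::move(patterns))))
      signalPassFailure();
  }
};
} // namespace
```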