Differential D157723
[mlir][transform] Replace complex test-lower-to-nvvm by an explicit TD listing in transform-mma-sync-matmul-f32.mlir
Accepted, Public. Authored by nicolasvasilache on Aug 11 2023, 8:12 AM.
Summary
This allows exposing the full gpu device and host lowering pipelines in a single place.
Depends on: D157703
Event Timeline
springerm added inline comments. (drive by comment)
This revision is now accepted and ready to land. Sep 4 2023, 4:31 AM
Revision Contents
Diff 549404
mlir/include/mlir/Dialect/Transform/IR/TransformDialect.h
mlir/lib/Dialect/GPU/TransformOps/CMakeLists.txt
mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
mlir/lib/Dialect/Linalg/TransformOps/CMakeLists.txt
mlir/lib/Dialect/Linalg/TransformOps/DialectExtension.cpp
mlir/test/Integration/GPU/CUDA/TensorCore/sm80/transform-mma-sync-matmul-f32.mlir
This is a problem: The code is adding an extension (registry.addExtension) as part of an extension. I haven't figured out the details yet, but this can cause a reallocation in the underlying extensions vector while it is being iterated over.
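To make the concern concrete, here is a minimal sketch of the hazard and one way to sidestep it. ToyRegistry, applyAll, and makeNestedExample are hypothetical names invented for illustration. This is not MLIR's DialectRegistry and not the code under review. The sketch assumes extensions live in a std::vector and are applied in order. A range-based for loop over that vector would hold iterators that a nested push_back can invalidate when the vector reallocates. The index-based loop below re-reads size() on every iteration and copies each callback out before invoking it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registry of extensions. It only models the
// hazard from the review comment; it is not MLIR's DialectRegistry.
struct ToyRegistry {
  using Extension = std::function<void(ToyRegistry &)>;

  std::vector<Extension> extensions;
  std::vector<std::string> log;

  void addExtension(Extension ext) {
    assert(ext && "extension callback must be callable");
    extensions.push_back(std::move(ext));
  }

  // Index-based loop: size() is re-read on every iteration, and the callback
  // is copied out before it runs. A push_back that reallocates the vector
  // during the call therefore cannot leave a dangling reference or iterator.
  void applyAll() {
    for (std::size_t i = 0; i < extensions.size(); ++i) {
      Extension current = extensions[i];
      current(*this);
    }
  }
};

// Builds a registry whose first extension registers a second one while it is
// being applied, mirroring "adding an extension as part of an extension".
inline ToyRegistry makeNestedExample() {
  ToyRegistry registry;
  registry.addExtension([](ToyRegistry &r) {
    r.log.push_back("outer");
    r.addExtension([](ToyRegistry &inner) { inner.log.push_back("nested"); });
  });
  return registry;
}
```

Calling makeNestedExample() and then applyAll() on the result runs the outer extension first and then picks up the nested one that was appended during the loop. Whether the real registry should apply nested extensions in the same pass or defer them is a separate design question that this sketch does not settle.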