This is an archive of the discontinued LLVM Phabricator instance.

[mlir][nvvm] Add `cp.async.bulk.tensor.shared.cluster.global`
ClosedPublic

Authored by guraypp on Jul 12 2023, 2:49 AM.

Details

Summary

This patch introduces `cp.async.bulk.tensor.shared.cluster.global` in the NVVM dialect. The op performs a load using TMA (Tensor Memory Accelerator).

Depends on D155056

Diff Detail

Event Timeline

guraypp created this revision. Jul 12 2023, 2:49 AM
Herald added a reviewer: dcaballe.
Herald added a project: Restricted Project.
guraypp requested review of this revision. Jul 12 2023, 2:49 AM
guraypp updated this revision to Diff 540918. Jul 17 2023, 2:33 AM

rebase and fix argument order

guraypp updated this revision to Diff 540975. Jul 17 2023, 5:42 AM

fix typos

nicolasvasilache accepted this revision. Jul 17 2023, 8:07 AM
nicolasvasilache added inline comments.
mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp

Line 27: Spurious include? (It should already be included transitively.)

Line 36: Spurious include?

This revision is now accepted and ready to land. Jul 17 2023, 8:07 AM
This revision was landed with ongoing or failed builds. Jul 17 2023, 8:10 AM
This revision was automatically updated to reflect the committed changes.