This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][NVGPU] Introducing the `nvgpu.mbarrier.group` Type
Abandoned, Public

Authored by guraypp on Sep 4 2023, 9:14 AM.

Details

Summary

A common practice is to create multiple MBarrier objects for use within loops; see the example below. This is particularly valuable for software pipelining during matmul code generation, where we need to create several barriers and select among them dynamically within a loop.

This work generalizes the `nvgpu.mbarrier.barrier` type into `nvgpu.mbarrier.group`. All MBarrier-related operations now use this type, so each of them can address any barrier in a group through an index.

%barriers = nvgpu.mbarrier.create -> !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>, num_barriers = 3>
nvgpu.mbarrier.init %barriers[%c0], %num_threads : !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>, num_barriers = 3>
nvgpu.mbarrier.init %barriers[%c1], %num_threads : !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>, num_barriers = 3>
nvgpu.mbarrier.init %barriers[%c2], %num_threads : !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>, num_barriers = 3>
...
scf.for %i = %c0 to %n step %c1 {
    %mbarId = arith.remui %i, %c3 : index
    %isDone = nvgpu.mbarrier.test.wait %barriers[%mbarId], %token : !nvgpu.mbarrier.group<memorySpace = #gpu.address_space<workgroup>, num_barriers = 3>, !tokenType
}
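
The round-robin selection in the loop above can be modeled outside MLIR. The following Python sketch is illustrative only: `barrier_slot` is a hypothetical helper, not part of the NVGPU dialect. It maps loop iteration `i` to a barrier index within a group of `num_barriers`, mirroring the `arith.remui` in the example. The phase parity it also returns is an assumption borrowed from common pipelining practice and is not stated in this summary.

```python
def barrier_slot(i: int, num_barriers: int) -> tuple[int, int]:
    """Return (barrier index, phase parity) for loop iteration i.

    Hypothetical helper for illustration; not an NVGPU API.
    """
    if num_barriers <= 0:
        raise ValueError("num_barriers must be positive")
    # Same computation as `arith.remui` on the induction variable.
    index = i % num_barriers
    # Assumption: the phase flips each time the ring of barriers wraps.
    parity = (i // num_barriers) % 2
    return index, parity


# Seven iterations over a group of three barriers, as in the example.
schedule = [barrier_slot(i, 3) for i in range(7)]
```

With three barriers, iterations 0, 3, and 6 all reuse barrier 0, which is why a single group type with an index operand is enough to express the whole loop.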

Event Timeline

guraypp created this revision. Sep 4 2023, 9:14 AM
Herald added a project: Restricted Project. Sep 4 2023, 9:14 AM
guraypp requested review of this revision. Sep 4 2023, 9:14 AM
guraypp abandoned this revision. Sep 11 2023, 5:32 AM