This is an archive of the discontinued LLVM Phabricator instance.

[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types
Closed, Public

Authored by christopherbate on Apr 27 2022, 9:15 AM.

Details

Summary

Adds missing logic in the lowering from NvGPU to NVVM to support f32
(as an accumulator operand) and tf32 (as a multiplicand operand) types.
Fixes the logic in one of the helper functions that converts the result
of an mma.sync operation with multiple 8x256bit output tiles, which is
the case for f32 outputs.
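
To make the "multiple 8x256bit output tiles" case concrete, here is a rough arithmetic sketch. It is not code from this patch. It assumes an m16n8 result shape (the summary does not name one), a 32-thread warp, and an even split of each tile's elements across the warp; the function name is hypothetical.

```python
# Illustrative arithmetic only; this is not code from the patch.
WARP_SIZE = 32          # threads cooperating in one mma.sync (assumed)
TILE_ROWS = 8           # rows per output tile, as in "8x256bit"
TILE_ROW_BITS = 256     # bits per tile row, as in "8x256bit"


def tiles_and_elements_per_thread(m, n, elem_bits):
    """Return (number of output tiles, elements each thread holds per tile)."""
    elems_per_row = TILE_ROW_BITS // elem_bits
    if m % TILE_ROWS or n % elems_per_row:
        raise ValueError("shape is not a whole number of tiles")
    num_tiles = (m // TILE_ROWS) * (n // elems_per_row)
    # Assumption: a tile's elements are divided evenly among the warp's threads.
    elems_per_thread = (TILE_ROWS * elems_per_row) // WARP_SIZE
    return num_tiles, elems_per_thread


# Assumed m16n8 result with 32-bit (f32) elements.
print(tiles_and_elements_per_thread(16, 8, 32))
```

Under these assumptions an f32 result spans more than one tile, so a result-conversion helper has to collect values from several tiles rather than just one.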

Event Timeline

Herald added a project: Restricted Project. Apr 27 2022, 9:15 AM
christopherbate requested review of this revision. Apr 27 2022, 9:15 AM
herhut accepted this revision. Apr 28 2022, 2:49 AM
herhut added a reviewer: csigg.

Thanks. Adding @csigg as an FYI.

This revision is now accepted and ready to land. Apr 28 2022, 2:49 AM

I have a patch for NVVM mma.sync TF32 support. I should land that before this one, then update this one to include additional fixes for TF32.

Updated the diff to include changes that support TF32 operands.

christopherbate retitled this revision from [mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32 types to [mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types. May 5 2022, 10:42 AM
christopherbate edited the summary of this revision.