This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][NVVM] Fix register mapping in `wgmma.mma_async`
ClosedPublic

Authored by guraypp on Aug 14 2023, 1:58 AM.

Details

Summary

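The WgmmaMmaAsync Op generates the wgmma.mma_async PTX instruction, which uses the same registers for both reading and writing through tied operand mapping. Each such register therefore occupies two operand slots, so the indices of the operands that follow the read/write registers must be offset by twice the number of read/write registers.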

This change turns this:

llvm.inline_asm has_side_effects asm_dialect = att "{wgmma.mma_async... {$0, $1, $2, $3, $4}, $5, $6, p", "=f,=f,=f,=f,0,1,2,3,l,l"

Into the form below. The only difference is the numbering of the operands ($8 and $9) that come after the read/write registers.

llvm.inline_asm has_side_effects asm_dialect = att "{wgmma.mma_async... {$0, $1, $2, $3, $4}, $8, $9, p", "=f,=f,=f,=f,0,1,2,3,l,l"
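
The renumbering can be illustrated with a small sketch. This is not the code from the patch; the helper names `operand_placeholders` and `constraint_string` are hypothetical, and the sketch assumes that every read/write register contributes one output constraint (`=f`) plus one tied input constraint (`0`, `1`, ...), as in the constraint string shown above.

```python
def operand_placeholders(num_read_write, num_read_only):
    """Return the $N placeholders for read/write and read-only operands.

    Hypothetical helper for illustration only, not part of MLIR.
    """
    # Read/write registers are referenced through their output slots.
    read_write = [f"${i}" for i in range(num_read_write)]
    # Each read/write register also consumes a tied input slot, so the
    # first read-only operand sits at index 2 * num_read_write.
    first_read_only = 2 * num_read_write
    read_only = [f"${first_read_only + i}" for i in range(num_read_only)]
    return read_write, read_only


def constraint_string(num_read_write, num_read_only):
    """Build a constraint string: outputs, tied inputs, then 64-bit inputs."""
    outputs = ["=f"] * num_read_write
    tied_inputs = [str(i) for i in range(num_read_write)]
    read_only = ["l"] * num_read_only
    return ",".join(outputs + tied_inputs + read_only)


if __name__ == "__main__":
    rw, ro = operand_placeholders(4, 2)
    print(rw, ro)
    print(constraint_string(4, 2))
```

With four read/write registers and two read-only operands, the read-only placeholders come out as $8 and $9, matching the corrected assembly string.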

Event Timeline

guraypp created this revision. Aug 14 2023, 1:58 AM
Herald added a reviewer: dcaballe.
Herald added a project: Restricted Project.
guraypp requested review of this revision. Aug 14 2023, 1:58 AM
qcolombet accepted this revision. Aug 14 2023, 4:11 AM
This revision is now accepted and ready to land. Aug 14 2023, 4:11 AM
This revision was automatically updated to reflect the committed changes.