The 'mma.sp.sync.aligned' family of instructions expects
the sparsity selector as a direct literal (0x0 or 0x1).
The current MLIR inline asm passed this as a value in a
register, which broke the downstream assemblers.
This is a small step towards supporting 2:4 sparsity on
NVIDIA GPUs in the sparse compiler of MLIR.
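
To illustrate the difference, here is a minimal sketch of how the asm
string could be assembled so that the selector ends up as a literal
rather than as one more register placeholder. The helper name
`withSparsitySelector` and the operand list in the comment are
hypothetical and only meant as an example; this is not the code in
the patch.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of MLIR): appends the sparsity selector
// to an inline asm string as a literal instead of a register operand.
//
// `asmPrefix` is assumed to hold the mnemonic and all register operands,
// ending with the metadata operand and a trailing comma, for example:
//   "mma.sp.sync.aligned.m16n8k32.row.col.f16.f16.f16.f16 "
//   "{$0,$1},{$2,$3,$4,$5},{$6,$7,$8,$9},{$10,$11},$12,"
std::string withSparsitySelector(const std::string &asmPrefix, int selector) {
  // The commit message names 0x0 and 0x1 as the accepted values.
  if (selector != 0 && selector != 1)
    throw std::invalid_argument("sparsity selector must be 0x0 or 0x1");
  // Emit the selector directly into the asm text, so the assembler sees
  // "0x0" or "0x1" and never a register holding that value.
  return asmPrefix + " 0x" + std::to_string(selector) + ";";
}
```

With this shape, the selector never appears in the operand or
constraint list; only the final asm text carries it.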