When converting to NVVM, lowering gpu.printf to vprintf allows us to support printing when running on CUDA.
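For reference, CUDA's device-side printf is built on vprintf, which takes the format string plus a single pointer to a buffer containing all the arguments packed together. The sketch below shows the contract the lowering targets; the struct name and layout are illustrative only, not the exact IR this patch emits.

```cpp
#include <cstdint>

// Device-side entry point documented in the CUDA C++ Programming Guide;
// gpu.printf is lowered to a call to it:
//   extern "C" __device__ int vprintf(const char *format, void *args);
//
// Conceptually, for a gpu.printf with format "%d %f\n" and (i32, f64)
// operands, the lowering allocates one local buffer, stores the operands
// into it, and passes its address as `args`. Illustrative layout only:
struct VprintfArgPack {
  int32_t first;   // the i32 operand
  double second;   // the f64 operand
};
// ... store the operands into an alloca'd VprintfArgPack, then:
// vprintf(formatStringGlobal, &pack);
```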
Nice. Just some comments.
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:254
nit: why not add this above?

mlir/test/Integration/GPU/CUDA/printf.mlir:16
Could you test this with multiple arguments that have different sizes? I wonder whether your struct packing approach fulfills the alignment requirements for the arguments to the vprintf call.
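To make the alignment concern concrete: vprintf reads each argument at its naturally aligned offset in the buffer, so an i32 followed by an f64 needs padding before the double. A quick host-side illustration (assumes a typical 64-bit target; not part of the patch):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct Packed {
  int32_t a;  // 4 bytes
  double b;   // requires 8-byte alignment on common 64-bit ABIs
};

int main() {
  // On a typical 64-bit host this prints offset 8 and size 16: the compiler
  // inserts 4 bytes of padding after `a`. A lowering that appends argument
  // bytes back-to-back would place the double at offset 4 and hand vprintf
  // a misaligned f64.
  std::printf("offsetof(b) = %zu, sizeof(Packed) = %zu\n",
              offsetof(Packed, b), sizeof(Packed));
}
```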
F32 needs to be extended to f64. If you add a test, you will see it fails.
mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp:381
This same idiom (lines around 380 through 400) is repeated 3 times in this file, so it might be useful to outline it into a helper function? Since a SymbolTable can't be constructed in the middle of a conversion, maybe it would be helpful if this type of function (e.g. getUniqueSymbolName) were added to the OpTrait::SymbolTable class in IR/SymbolTable.h, but maybe that's too heavy-handed?
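Roughly the kind of helper that could be outlined; the name getUniqueSymbolName and its placement here are only a suggestion, not code from the patch:

```cpp
#include "mlir/IR/Operation.h"
#include "mlir/IR/SymbolTable.h"
#include "llvm/ADT/Twine.h"
#include <string>

// Hypothetical helper: return a symbol name that does not clash with any
// existing symbol nested under `symbolTableOp`, by appending a numeric
// suffix to `base` until the lookup comes back empty.
static std::string getUniqueSymbolName(mlir::Operation *symbolTableOp,
                                       llvm::StringRef base) {
  std::string name = base.str();
  unsigned suffix = 0;
  while (mlir::SymbolTable::lookupSymbolIn(symbolTableOp, name))
    name = (base + llvm::Twine('_') + llvm::Twine(suffix++)).str();
  return name;
}
```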
mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp:408
F32 needs to be extended to f64 before being pushed into the struct. If you add a test, you will see it currently fails for f32 args. Integers need to be extended to 32 bits, but based on my initial test this happens automatically.
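Something along these lines would cover the f32 case; this is a sketch that assumes the lowering has an OpBuilder and the converted operand at hand, and the helper name is made up:

```cpp
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/IR/Builders.h"

// Hypothetical helper: apply the default argument promotion vprintf expects
// before the value is stored into the argument struct. f32 is widened to
// f64 with llvm.fpext; smaller integers are left to the existing lowering,
// which (per the comment above) already appears to widen them to i32.
static mlir::Value promoteForVprintf(mlir::OpBuilder &builder,
                                     mlir::Location loc, mlir::Value value) {
  if (value.getType().isF32())
    return builder.create<mlir::LLVM::FPExtOp>(loc, builder.getF64Type(),
                                               value);
  return value;
}
```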
mlir/test/Integration/GPU/CUDA/printf.mlir:16
F32 needs to be extended to f64 before being pushed into the struct. If you add a test, you will see it currently fails for f32 args.