Lower arith.maxf/minf to __nv_fmax/__nv_fmin in the GPUToNVVM conversion.
Also remove a leftover Linalg pass declaration that was never cleaned up.
Details
- Reviewers
ThomasRaoux herhut nicolasvasilache
Event Timeline
mlir/include/mlir/Dialect/Linalg/Passes.h | ||
---|---|---|
34 ↗ | (On Diff #449859) | Is this on purpose? |
mlir/include/mlir/Dialect/Linalg/Passes.h | ||
---|---|---|
34 ↗ | (On Diff #449859) | Never mind, I see your comment; this declaration is dead. |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
---|---|---|
257 | The semantics of arith::MaxFOp and __nv_fmax_f differ for NaN values. arith::MaxFOp: Returns the maximum of the two arguments, treating -0.0 as less than +0.0. If one of the arguments is NaN, then the result is also NaN. __nv_fmax_f: If one argument is a NaN and the other is legitimate numeric value, the numeric value is chosen. |
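
A minimal host-side sketch of the NaN mismatch raised in the comment on line 257. It assumes `std::fmax` as a stand-in for the `__nv_fmax_f` behaviour quoted there (it also returns the numeric operand when exactly one input is NaN). The names `nanPropagatingMax`, `numberPreferringMax` and `checkNaNSemantics` are hypothetical and are not part of MLIR or libdevice.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of the arith.maxf semantics quoted in the review:
// the result is NaN if either operand is NaN, and -0.0 orders below +0.0.
float nanPropagatingMax(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == b)                        // equal values, including -0.0 == +0.0
    return std::signbit(a) ? b : a;  // prefer the operand without the sign bit
  return a > b ? a : b;
}

// Stand-in (assumption) for the quoted __nv_fmax_f behaviour: std::fmax
// returns the numeric operand when exactly one operand is NaN.
float numberPreferringMax(float a, float b) { return std::fmax(a, b); }

// Shows where the two definitions agree and where they diverge.
void checkNaNSemantics() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  // One NaN operand: the two definitions disagree.
  assert(std::isnan(nanPropagatingMax(nan, 1.0f)));
  assert(numberPreferringMax(nan, 1.0f) == 1.0f);
  // Ordinary operands: the two definitions agree.
  assert(nanPropagatingMax(2.0f, 1.0f) == numberPreferringMax(2.0f, 1.0f));
}
```

Calling `checkNaNSemantics()` exercises the single case that matters for this review: with exactly one NaN input, a direct lowering would silently replace a NaN result with a number.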
mlir/include/mlir/Dialect/Linalg/Passes.h | ||
---|---|---|
34 ↗ | (On Diff #449859) | This pass seems to have been removed in https://reviews.llvm.org/D124145 |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
---|---|---|
257 | This seems to be a problem. BTW, do other math functions have the same problem, e.g. math.sin(NaN) vs. __nv_sin(NaN)? |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
---|---|---|
257 | Yes, I don't think we can have this lowering with the current semantics of arith::MaxFOp. As far as I know, this problem only affects fmax/fmin, because the semantics of TF and other ML frameworks tend to differ from what the hardware natively supports. For other arithmetic operations the semantics are more obvious. |
mlir/include/mlir/Dialect/Linalg/Passes.h | ||
---|---|---|
34 ↗ | (On Diff #449859) | Thanks! |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
---|---|---|
257 | Thanks, should I close this differential? |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
---|---|---|
257 | Yes, unfortunately I don't think there is another solution right now. |