Details
- Reviewers
ThomasRaoux herhut
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
---|---|---|
263 | RemFOp already lowers to llvm instruction frem (https://llvm.org/docs/LangRef.html#frem-instruction). Trying it out this works fine through the NVPTX backend. |
The fix above corrects an edge case in which the result of the remainder operation is NaN when the second argument is inf.
This was detected in TensorFlow (https://github.com/tensorflow/tensorflow/issues/58369). (Sorry for the delay, I had to fix a problem with my bug repro :-))
Requesting review again with this additional information.
Thanks for explaining. I don't think this is the right fix as this would mean the other existing lowering of arith.remf would be incorrect and cause miscompile if applied before this pattern.
Looking at it a bit, llvm instruction frem should return the result you want for frem 1.f, inf : f32. The instruction is defined here https://llvm.org/docs/LangRef.html#frem-instruction and says the semantic should match libm fmod. fmod(1.f, inf) returns 1.f.
The bug seems to be in NVPTX backend: https://github.com/llvm/llvm-project/blob/2784b243e38f7d526a40838a854554b53ebeb41e/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td#L1156
The sequence emitted won't do the right thing for inf, it should be fixed there.
Thanks for pointing this out, I agree that this would be a better fix.
I will work on a fix there.
The proposed fix is at https://reviews.llvm.org/D140846. Thanks again for pointing me there!
RemFOp already lowers to llvm instruction frem (https://llvm.org/docs/LangRef.html#frem-instruction). Trying it out this works fine through the NVPTX backend.
Therefore I don't think we need that. What problem is it trying to solve?