Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1810 | This should match the instruction behavior (although I guess we can ignore the bug on SI) | |
1811–1814 | The specs aren't necessarily relevant here, since this just needs to match the instruction behavior. Talking about the min when it isn't here is potentially confusing |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1810 | Is there a good public reference for that? The Vega ISA Reference Guide doesn't go into much detail. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1810 | This is always a problem, and no. I just go by this comment: // V_FRACT is buggy on SI, so the F32 version is never used and (x-floor(x)) is // used instead. However, SI doesn't have V_FLOOR_F64, so the most efficient // way to implement it is using V_FRACT_F64. // The workaround for the V_FRACT bug is: // fract(x) = isnan(x) ? x : min(V_FRACT(x), 0.99999999999999999) // Convert floor(x) to (x - fract(x)) |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1810 | OK, so it sounds like the (non-buggy) hardware uses the same trick as the OpenCL definition, to avoid ever returning 1.0. I'll try to confirm that on some real hardware. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1810 | I've confirmed this for f16 and f32 types, on some real gfx9 hardware. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1810 | ... and confirmed for f64 too. |
This should match the instruction behavior (although I guess we can ignore the bug on SI)