We previously codegened directly to v_log_f32, which is broken for
denormals. The lowering isn't complicated; you simply need to scale
denormal inputs and adjust the result. Note that log and log10 are
still not accurate enough and will be fixed separately.
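For reference, here is a minimal scalar sketch of the expansion in C++ (illustrative only; the real patch builds SelectionDAG nodes, and std::log2 stands in for the hardware v_log_f32, which computes log2):

```cpp
#include <cmath>

// Illustrative model of the f32 lowering, assuming round-to-nearest.
// v_log_f32 is inaccurate for denormal inputs, so denormals are scaled
// into the normal range first and the exponent shift introduced by the
// scaling is subtracted from the result afterwards.
float loweredLog2(float X) {
  const float SmallestNormal = 0x1.0p-126f;
  bool NeedsScaling = std::fabs(X) < SmallestNormal;
  // Multiplying by 2^32 turns any nonzero denormal into a normal value,
  // exactly (scaling by a power of two loses no mantissa bits).
  float Scaled = NeedsScaling ? X * 0x1.0p+32f : X;
  float R = std::log2(Scaled); // stands in for v_log_f32
  // log2(X * 2^32) == log2(X) + 32, so undo the scaling on the result.
  return NeedsScaling ? R - 32.0f : R;
}
```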
Event Timeline
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2463
Don't you need to produce v_log_f16 if denorm handling is not needed?
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2463
v_log_f16 is supposed to be fully correct. This is the promoted path for SI/CI that don't have it.
llvm/docs/AMDGPUUsage.rst:963
Typo 0.51ULP?
llvm/docs/AMDGPUUsage.rst:963
That's what the last 3 public ISA docs say.
llvm/docs/AMDGPUUsage.rst:963
I.e. I understand 0.5ULP. This is half of a bit, for when you do not know how to round it: up or down. But what is 1/100 of a bit? Is that like your computation is 100 bits more accurate, but you do not know how to round it?
llvm/docs/AMDGPUUsage.rst:963
It's when the true result is something.51 but the hardware rounds it down, or the true result is something.49 but the hardware rounds it up. I assume for f16 someone exhaustively tested all inputs, measured the largest error from the true result, and decided 0.51 bits was a nice neat upper bound that they could publish.
llvm/docs/AMDGPUUsage.rst:963
But the representation is in binary, not in decimal. There can be no true result of 0.51 decimal to have this rounding problem; it must be 0.51 of a bit. This is the definition of the ULP: 1 bit. We have 0.51 of a bit here.
llvm/docs/AMDGPUUsage.rst:963
I.e. anything above 0.5ULP is 1ULP to me. If your previous mantissa bit is 1 you round up; if it is 0 you round down; whatever rules you have, you can only change a single bit. Your 100th bit, which you cannot represent, doesn't matter. If you can have a single-bit error in the lowest mantissa bit, that is 1ULP.
llvm/docs/AMDGPUUsage.rst:963
There is a true, infinitely precise mathematical result. The documented error bound is the difference from the true mathematical result, not the difference from the correctly rounded result.
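To make that concrete, here is a hedged sketch (the helper name and the use of a double-precision reference as the "true" result are assumptions for illustration, not from the patch) that measures a float result's distance from a more precise reference, in units of f32 ulps. A correctly rounded result is within 0.5 ulp by construction; a 0.51 ulp bound additionally allows the hardware to return the second-nearest float when the true value falls very close to a rounding boundary:

```cpp
#include <cmath>
#include <cstdio>

// Error of Got relative to the (assumed more precise) reference True,
// measured in f32 ulps at the magnitude of True.
static double errorInUlps(float Got, double True) {
  int Exp;
  std::frexp(True, &Exp);                       // True = m * 2^Exp, 0.5 <= |m| < 1
  double UlpAtTrue = std::ldexp(1.0, Exp - 24); // spacing of f32 values there
  return std::fabs((double)Got - True) / UlpAtTrue;
}

int main() {
  double True = std::log2(10.0);                   // ~53-bit reference
  float Nearest = (float)True;                     // correctly rounded
  float Neighbor = std::nextafterf(Nearest, 4.0f); // one ulp further up
  // The first value is <= 0.5 by construction; the neighbor lands between
  // 0.5 and 1.5, showing the error is a real number, not a whole bit count.
  std::printf("%f %f\n", errorInUlps(Nearest, True),
              errorInUlps(Neighbor, True));
}
```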
llvm/docs/AMDGPUUsage.rst:963
If that's in the manual, we need to fix it.
llvm/docs/AMDGPUUsage.rst:963
Whatever you think of as the true result, I want to understand what 0.51ULP accuracy of rounding it means. I.e. I am not certain about some transcendental results [in a given notation], but up to this point I was pretty much sure about their rounding.

To proceed I want to see one single example of a computation giving a wrong result with 0.5ULP rounding, a correct result with 0.51ULP rounding [whatever that is], and then again a wrong result with 1ULP rounding.
This has nothing to do with the patch. There's no behavior change here for the f16 intrinsic; we continue to pass directly through to the hardware instruction, and the documentation simply restates what the ISA manual says. The point of the documentation is that the f16 case passes through directly and the f32 case is a small expansion.
Quote from Brian:

> In general, ULP(x) is defined for any real value of x and is roughly the distance from x to the nearest floating point number divided by the distance between the two nearest floating point numbers. So, statements about a maximum relative error being 0.51 or 3.14159 ULP are quite meaningful.
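A direct transcription of that definition (purely illustrative; the function name is made up) also shows why fractional bounds like 0.51ULP are well-defined for any real input:

```cpp
#include <cmath>

// Distance from the real value X to the nearest f32, divided by the
// spacing between the two f32 values bracketing X, per the quoted
// definition. The result is a real number, so 0.51 is as meaningful
// a bound as 0.5 or 1.
static double ulpDistance(double X) {
  float Nearest = (float)X; // round-to-nearest f32
  // The other f32 bracketing X (or a neighbor if X is exactly a float).
  float Other = std::nextafterf(
      Nearest, X >= (double)Nearest ? INFINITY : -INFINITY);
  double Spacing = std::fabs((double)Other - (double)Nearest);
  return std::fabs(X - (double)Nearest) / Spacing;
}
```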