This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Use correct lowering for llvm.log2.f32
ClosedPublic

Authored by arsenm on Jun 12 2023, 6:32 PM.

Details

Reviewers
foad
rampitec
Pierre-vh
cdevadas
b-sumner
Group Reviewers
Restricted Project
Summary

We previously directly codegened to v_log_f32, which is broken for
denormals. The lowering isn't complicated, you simply need to scale
denormal inputs and adjust the result. Note log and log10 are still
not accurate enough, and will be fixed separately.

Diff Detail

Event Timeline

arsenm created this revision.Jun 12 2023, 6:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 6:32 PM
arsenm requested review of this revision.Jun 12 2023, 6:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 6:32 PM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm edited the summary of this revision. (Show Details)Jun 12 2023, 6:33 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2463

Don't you need to produce v_log_f16 if denorm handling is not needed?

arsenm added inline comments.Jun 13 2023, 11:03 AM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2463

v_log_f16 is supposed to be fully correct. This is the promoted path for SI/CI that don't have it

This revision is now accepted and ready to land.Jun 13 2023, 11:24 AM

Please let Brian a chance to review too.

rampitec added inline comments.Jun 15 2023, 12:49 PM
llvm/docs/AMDGPUUsage.rst
963

Typo 0.51ULP?

arsenm added inline comments.Jun 21 2023, 6:47 PM
llvm/docs/AMDGPUUsage.rst
963

That's what the last 3 public ISA docs say

rampitec added inline comments.Jun 22 2023, 1:37 AM
llvm/docs/AMDGPUUsage.rst
963

For what I know this does not make any sense. Someones finger must be mistyped. @b-sumner can you comment please?

rampitec added inline comments.Jun 22 2023, 1:45 AM
llvm/docs/AMDGPUUsage.rst
963

I.e. I understand 0.5ULP. This is half of a bit, when you do not known how to round it: up or down. But what is 1/100 of a bit? Is that like your computation is 100 bits more accurate, but you do not know how to round it?

foad added inline comments.Jun 22 2023, 1:52 AM
llvm/docs/AMDGPUUsage.rst
963

It's when the true result is something.51 but the hardware rounds it down, or the true result is something.49 but the hardware rounds it up.

I assume for f16 someone exhaustively tested all inputs, measured the largest error from the true result, and decided 0.51 bits was a nice neat upper bound that they could publish.

rampitec added inline comments.Jun 22 2023, 1:56 AM
llvm/docs/AMDGPUUsage.rst
963

But the representation is in binary, not in decimal. There can be no true result 0.51 decimal to have this rounding problem, it must be 0.51 of a bit. This is the definition of the ULP: 1 bit. We have 0.51 of a bit here.

rampitec added inline comments.Jun 22 2023, 2:15 AM
llvm/docs/AMDGPUUsage.rst
963

I.e. anything above 0.5ULP is 1ULP to me. If your previous mantissa bit is 1 you round it up, if it is 0 you round it down, or whatever rules you have you can only change a single bit. Your 100th bit which you cannot represent doesn't matter. If you can have a single bit error in the lowest mantissa bit, this is 1ULP.

foad added inline comments.Jun 22 2023, 3:15 AM
llvm/docs/AMDGPUUsage.rst
963

There is a true infinitely precise mathematical result. The documented error bound is the difference from the true mathematical result, not the difference from the correctly rounded result.

rampitec added inline comments.Jun 22 2023, 3:18 AM
llvm/docs/AMDGPUUsage.rst
963

We certainly do not have a common ground here, so I am calling for @b-sumner again.

arsenm added inline comments.Jun 22 2023, 3:19 AM
llvm/docs/AMDGPUUsage.rst
963

I can also just delete the note about the ULP. It's not exactly great to have to repeat the ISA manual on every instruction

963

But yes, the ULP definition is in terms of the true result, not the expected rounded result

rampitec added inline comments.Jun 22 2023, 3:21 AM
llvm/docs/AMDGPUUsage.rst
963

If that's in the manual we need to fix it.

rampitec added inline comments.Jun 22 2023, 3:26 AM
llvm/docs/AMDGPUUsage.rst
963

Whatever you think of a true result, I want to understand what is 0.51ULP accuracy of the rounding of it. I.e. I am not certain about some transcendental results [in a given notation]. But up to this point I was pretty much sure about its rounding.

rampitec added 1 blocking reviewer(s): b-sumner.Jun 22 2023, 3:41 AM
This revision now requires review to proceed.Jun 22 2023, 3:41 AM

To proceed I want to see a one single example of a computation giving a wrong result with 0.5ULP rounding, correct result with 0.51ULP rounding [whatever it is], and then again wrong result with 1ULP rounding.

To proceed I want to see a one single example of a computation giving a wrong result with 0.5ULP rounding, correct result with 0.51ULP rounding [whatever it is], and then again wrong result with 1ULP rounding.

This has nothing to do with the patch. There's no behavior change here for the f16 intrinsic. We continue to directly pass through to the hardware instructions, and the documentation is simply restating what the ISA manual states. The point of the documentation is stating the f16 case passes through directly and the f32 is a small expansion

rampitec removed 1 blocking reviewer(s): b-sumner.Jun 22 2023, 12:51 PM
This revision is now accepted and ready to land.Jun 22 2023, 12:51 PM

OK, got the explanation from @b-sumner. This looks very unusual, but it is correct.

OK, got the explanation from @b-sumner. This looks very unusual, but it is correct.

Please could you summarise it here? Also curious what's going on here

OK, got the explanation from @b-sumner. This looks very unusual, but it is correct.

Please could you summarise it here? Also curious what's going on here

Quote from Brian:

In general, ULP(x) is defined for any real value of x and is roughly the distance from x to the nearest floating point number divided by the distance between the two nearest floating point numbers.

So, statements about a maximum relative error being 0.51 or 3.14159 ULP are quite meaningful.