This is an archive of the discontinued LLVM Phabricator instance.

libclc: Fix FP_ILOGBNAN definition
ClosedPublic

Authored by bbrezillon on Jul 9 2020, 4:46 AM.

Details

Summary

Fix FP_ILOGBNAN definition to match the opencl-c-base.h one and
guarantee that FP_ILOGBNAN and FP_ILOGB0 are different. Doing that
implies fixing ilogb() implementation to return the right value.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Diff Detail

Event Timeline

bbrezillon created this revision.Jul 9 2020, 4:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 9 2020, 4:46 AM
bbrezillon marked an inline comment as done.Jul 9 2020, 4:58 AM
bbrezillon added inline comments.
libclc/generic/lib/math/ilogb.cl
34

My bad, should be FP_ILOGB0 here.

bbrezillon updated this revision to Diff 276706.Jul 9 2020, 5:05 AM

Use FP_ILOGB0 where appropriate.

daniels added a subscriber: daniels.Jul 9 2020, 5:39 AM

The build failure is again unrelated; comment filed at https://github.com/google/llvm-premerge-checks/issues/207

What is the problem this patch is trying to address?
The specs do not mandate these two values to be different.

On the more practical side.
This patch only changes fp32 implementation to return the new value leaving the fp64 implementation to return INT_MIN in both cases.
The implementation now returns FP_ILOGBNAN even for Inf input, which is not correct.

CLC spec doesn't talk about Inf inputs, but the libm behaviour is to return `INT_MAX, which might be useful.
If FP_ILOGBNAN and FP_ILOGB0 need to be different it'd be better to use FP_ILOGBNAN == INT_MIN and FP_ILOGB0 == -INT_MAX.

bbrezillon added a comment.EditedJul 10 2020, 1:02 AM

What is the problem this patch is trying to address?

Well, the primary goal was to have consistent values in clang/lib/Headers/opencl-c-base.h and libclc/generic/include/clc/float/definitions.h to avoid the mess when one links against libclc but includes opencl-c-base.h.
Not to mention that having 2 conflicting definitions in headers that both lives in the same code base and are supposed to represent the same thing is confusing, to say the least.

The specs do not mandate these two values to be different.

It might be me misunderstanding the spec here. I read

"The value of FP_ILOGB0 shall be either INT_MIN or -INT_MAX. The value of FP_ILOGBNAN shall be either INT_MAX or INT_MIN."

as (FP_ILOGB0=INT_MIN and FP_ILOGBNAN=INT_MAX) or (FP_ILOGB0=-INT_MAX and FP_ILOGBNAN=INT_MIN).
But you're probably right, there's nothing stating that FP_ILOGB0 and FP_ILOGBNAN should map to different values, it's just the pattern I've seen in various libc implementations.

On the more practical side.
This patch only changes fp32 implementation to return the new value leaving the fp64 implementation to return INT_MIN in both cases.

Oops. The fp64 version should definitely be patched accordingly.

The implementation now returns FP_ILOGBNAN even for Inf input, which is not correct.

Hm, nope, it still returns 0x7fffffff, which is INT_MAX. I think you're referring to my comment, where I'm emitting the idea of merging the 2 tests into a single one since FP_ILOGBNAN is now also equal to INT_MAX, but as mentioned there, I think clarity prevails over optimization (especially since clang might optimize that for us anyway).

CLC spec doesn't talk about Inf inputs, but the libm behaviour is to return `INT_MAX, which might be useful.

Yep, and I didn't change that part.

If FP_ILOGBNAN and FP_ILOGB0 need to be different it'd be better to use FP_ILOGBNAN == INT_MIN and FP_ILOGB0 == -INT_MAX.

Except you'd then have a mismatch between clang/lib/Headers/opencl-c-base.h and libclc/generic/include/clc/float/definitions.h.
So maybe the answer is don't include opencl-c-base.h when you link against libclc, but as I mentioned above, the fact that both headers living in the same code base define 2 different values for the same thing is confusing.

jvesely added a comment.EditedJul 10 2020, 11:37 AM

What is the problem this patch is trying to address?

Well, the primary goal was to have consistent values in clang/lib/Headers/opencl-c-base.h and libclc/generic/include/clc/float/definitions.h to avoid the mess when one links against libclc but includes opencl-c-base.h.
Not to mention that having 2 conflicting definitions in headers that both lives in the same code base and are supposed to represent the same thing is confusing, to say the least.

That combination is not supported and should not be attempted. If you want to link with libclc. you should use clc headers installed by libclc.
I agree that not having multiple conflicting headers is preferable, and I have no idea why clang added their own header instead of using libclc.
I'd say a better solution would be to drop libclc headers entirely and switch the build to use clang's CLC header, or drop clang's CLC header, trying to sync two different locations is just asking for trouble.

The specs do not mandate these two values to be different.

It might be me misunderstanding the spec here. I read

"The value of FP_ILOGB0 shall be either INT_MIN or -INT_MAX. The value of FP_ILOGBNAN shall be either INT_MAX or INT_MIN."

as (FP_ILOGB0=INT_MIN and FP_ILOGBNAN=INT_MAX) or (FP_ILOGB0=-INT_MAX and FP_ILOGBNAN=INT_MIN).
But you're probably right, there's nothing stating that FP_ILOGB0 and FP_ILOGBNAN should map to different values, it's just the pattern I've seen in various libc implementations.

On the more practical side.
This patch only changes fp32 implementation to return the new value leaving the fp64 implementation to return INT_MIN in both cases.

Oops. The fp64 version should definitely be patched accordingly.

The implementation now returns FP_ILOGBNAN even for Inf input, which is not correct.

Hm, nope, it still returns 0x7fffffff, which is INT_MAX. I think you're referring to my comment, where I'm emitting the idea of merging the 2 tests into a single one since FP_ILOGBNAN is now also equal to INT_MAX, but as mentioned there, I think clarity prevails over optimization (especially since clang might optimize that for us anyway).

It doesn't matter if the value is in hex or decimal or a define, The results of ilogb(Inf) and ilogb(NaN) is now indistinguishable to the caller.
The spec is rather clear that the value of FP_ILOGBNAN shall be returned only if the input is NaN.
Since libclc already uses INT_MAX for Inf, FP_ILOGBNAN cannot map to the same value.

CLC spec doesn't talk about Inf inputs, but the libm behaviour is to return `INT_MAX, which might be useful.

Yep, and I didn't change that part.

If FP_ILOGBNAN and FP_ILOGB0 need to be different it'd be better to use FP_ILOGBNAN == INT_MIN and FP_ILOGB0 == -INT_MAX.

Except you'd then have a mismatch between clang/lib/Headers/opencl-c-base.h and libclc/generic/include/clc/float/definitions.h.
So maybe the answer is don't include opencl-c-base.h when you link against libclc, but as I mentioned above, the fact that both headers living in the same code base define 2 different values for the same thing is confusing.

There are potentially tons of other differences between libclc and clang opencl headers, trying to sync them is IMO futile.
libclc historically supported multiple clang versions (the current head can be built using clang 3.9 -> clang 11), I'd strongly prefer to keep it that way until at least clc 1.2 is fully supported, but in the end, it's a question for @tstellar .

The implementation now returns FP_ILOGBNAN even for Inf input, which is not correct.

Hm, nope, it still returns 0x7fffffff, which is INT_MAX. I think you're referring to my comment, where I'm emitting the idea of merging the 2 tests into a single one since FP_ILOGBNAN is now also equal to INT_MAX, but as mentioned there, I think clarity prevails over optimization (especially since clang might optimize that for us anyway).

It doesn't matter if the value is in hex or decimal or a define, The results of ilogb(Inf) and ilogb(NaN) is now indistinguishable to the caller.
The spec is rather clear that the value of FP_ILOGBNAN shall be returned only if the input is NaN.

Hm, can you point me to the portion of the spec where this is stated? If there's such a rule stating that FP_ILOGBNAN should be unique, the current implementation is wrong since both FP_ILOGBNAN and FP_ILOGB0 map to the same value.

@jvesely, I think libclc needs to change its definition here, as it's the only one out of 3 OpenCL standard lib headers that's different.

Since OpenCL allows apps to provide precompiled intermediate representations to the API, in the form of SPIR or SPIR-V, that means that the app could have embedded references to FP_ILOGBNAN, which are just a constant value since it's a #define, in their own code. They could write a kernel which calls ilogb and compares the result to FP_ILOGBNAN, and compile that with either clang trunk (which uses opencl-c.h automatically), or with Khronos's offline compiler (https://github.com/KhronosGroup/SPIR) using Khronos's standard library headers (https://github.com/KhronosGroup/libclcxx). Both of these standard library implementation headers define FP_ILOGBNAN to be INT_MAX:

Nobody is going to offline compile OpenCL code using the libclc headers. That means that if our implementation wants to leverage libclc behind the scenes, then libclc should use the same definition of this value as the other standard libraries.

Yeah, it's different from CPU land where you don't go around compiling against standard library headers from one library, and then linking against another -- except that you can kind of do that too between e.g. glibc and musl (which have matching definitions of this define for what it's worth).

tldr; I don't mind changing this to be in sync with clang, just get the patch right.
However, it's not the right way to guarantee IR level compatibility between clang-cl-headers and libclc.
details below

@jvesely, I think libclc needs to change its definition here, as it's the only one out of 3 OpenCL standard lib headers that's different.

There are two separate issues
a) whether to change libclc
b) doing the change in the correct way

Since OpenCL allows apps to provide precompiled intermediate representations to the API, in the form of SPIR or SPIR-V, that means that the app could have embedded references to FP_ILOGBNAN, which are just a constant value since it's a #define, in their own code. They could write a kernel which calls ilogb and compares the result to FP_ILOGBNAN, and compile that with either clang trunk (which uses opencl-c.h automatically), or with Khronos's offline compiler (https://github.com/KhronosGroup/SPIR) using Khronos's standard library headers (https://github.com/KhronosGroup/libclcxx). Both of these standard library implementation headers define FP_ILOGBNAN to be INT_MAX:

I've no problem with the compatibility argument. but using INT_MAX for both NaN and Inf is in my understanding wrong.
If the intention is to use INT_MAX for FP_ILOGBNAN, the implementation needs to be changed to return a different value for Inf (presumably 127 for fp32 and 1023 for fp64, since those can reuse the default codepath).

Note that LLVM internally uses INT_MIN + 1 (-INT_MAX) for ilogb(0), INT_MIN for ilogb(NaN) and INT_MAX for ilogb(Inf) [0]. This means that any optimization handling these constants and ilogb operation would be incorrect in LLVM.
IMO, the correct action on LLVM side would be to submit a bug to Khronos to either fix their headers or provide clarification wrt ilogb(Inf) behaviour.

Nobody is going to offline compile OpenCL code using the libclc headers. That means that if our implementation wants to leverage libclc behind the scenes, then libclc should use the same definition of this value as the other standard libraries.

Why not? it's fairly easy and straightforward.
What are the requirements for precompiled portability? The specs allow implementations to use different values for FP_ILOBGNAN and FP_ILOGB0.
afaiu, offline compiled kernels are not expected to work between different implementations. I haven't found a mention of ilogb values in SPIR-V spec.
In this regard clangs opencl headers and libclc are different implementations. Again, I agree that syncing them would be desirable, but keeping two different set of headers is not the right way to do it.

Yeah, it's different from CPU land where you don't go around compiling against standard library headers from one library, and then linking against another -- except that you can kind of do that too between e.g. glibc and musl (which have matching definitions of this define for what it's worth).

switching libraries at link time requires ABI compatibility, this is no different from CPU libraries, CLC spec only details API.

fwiw: glibc definitions of these FP_ILOB0 and FP_ILOGBNAN vary depending on system configuration [1].
on x86 it's

#  define FP_ILOGB0     (-2147483647 - 1)
#  define FP_ILOGBNAN   (-2147483647 - 1)

[2]
while for m68k or is ia64 it's

#  define FP_ILOGB0     (-2147483647 - 1)
#  define FP_ILOGBNAN   2147483647

[3, 4]
on other architectures it's

#  define FP_ILOGB0     (-2147483647)
#  define FP_ILOGBNAN   2147483647

[5]

So copying IR representations of C programs between those platforms won't work for the same reasons.

[0] https://llvm.org/doxygen/APFloat_8h_source.html#l00229
[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=math/math.h;h=e48860e3915b0ec5c0ae0d594d84432c0568ddc6;hb=HEAD#l190
[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/bits/fp-logb.h;h=f180a90754a223547f0d0b965b6fbe6d0132190b;hb=HEAD
[3] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/m68k/m680x0/bits/fp-logb.h;h=bb31eb8d6210033e4be3e8e05bae9a6e09d09e86;hb=HEAD
[4] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/ia64/bits/fp-logb.h;h=45d1bd5e01bb0ecfb9ae7d366f18e315b637c8ca;hb=HEAD
[5] https://sourceware.org/git/?p=glibc.git;a=blob;f=bits/fp-logb.h;h=30effcd52196e7e6a53bde38c66dd6011bbbc3e1;hb=HEAD

Since OpenCL allows apps to provide precompiled intermediate representations to the API, in the form of SPIR or SPIR-V, that means that the app could have embedded references to FP_ILOGBNAN, which are just a constant value since it's a #define, in their own code. They could write a kernel which calls ilogb and compares the result to FP_ILOGBNAN, and compile that with either clang trunk (which uses opencl-c.h automatically), or with Khronos's offline compiler (https://github.com/KhronosGroup/SPIR) using Khronos's standard library headers (https://github.com/KhronosGroup/libclcxx). Both of these standard library implementation headers define FP_ILOGBNAN to be INT_MAX:

I've no problem with the compatibility argument. but using INT_MAX for both NaN and Inf is in my understanding wrong.

Again, I didn't see this clearly stated in the spec. Can you quote the spec or point us to the relevant section?

If the intention is to use INT_MAX for FP_ILOGBNAN, the implementation needs to be changed to return a different value for Inf (presumably 127 for fp32 and 1023 for fp64, since those can reuse the default codepath).

I see several non-CL implementations using the same value for NaN and Inf [1][2][3]. Again, I'm not saying this applies to CL, but I couldn't find anything in the spec forbidding this collision.

Note that LLVM internally uses INT_MIN + 1 (-INT_MAX) for ilogb(0), INT_MIN for ilogb(NaN) and INT_MAX for ilogb(Inf) [0]. This means that any optimization handling these constants and ilogb operation would be incorrect in LLVM.

Hm, don't you have the same problem when such optimizations happen and the binary is linked with a libc that has different definitions for those values?

[1]https://pubs.opengroup.org/onlinepubs/007908799/xsh/ilogb.html
[2]https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/ilogb-ilogbf-ilogbl2?view=vs-2019
[3]http://redhat.polarhome.com/service/man/?qf=ilogb&tf=2&of=OpenDarwin&sf=3

Note that LLVM internally uses INT_MIN + 1 (-INT_MAX) for ilogb(0), INT_MIN for ilogb(NaN) and INT_MAX for ilogb(Inf) [0]. This means that any optimization handling these constants and ilogb operation would be incorrect in LLVM.

Hm, don't you have the same problem when such optimizations happen and the binary is linked with a libc that has different definitions for those values?

I had a quick look, and it seems that ilogb() results are never returned directly. All callers seem to compare the result to the IEK_xxx defs and handle the Inf, NaN and Zero properly or check the arg value before passing it to ilogb(), so I don't think this mismatch is an issue.

jvesely added a comment.EditedJul 14 2020, 11:24 AM

Since OpenCL allows apps to provide precompiled intermediate representations to the API, in the form of SPIR or SPIR-V, that means that the app could have embedded references to FP_ILOGBNAN, which are just a constant value since it's a #define, in their own code. They could write a kernel which calls ilogb and compares the result to FP_ILOGBNAN, and compile that with either clang trunk (which uses opencl-c.h automatically), or with Khronos's offline compiler (https://github.com/KhronosGroup/SPIR) using Khronos's standard library headers (https://github.com/KhronosGroup/libclcxx). Both of these standard library implementation headers define FP_ILOGBNAN to be INT_MAX:

I've no problem with the compatibility argument. but using INT_MAX for both NaN and Inf is in my understanding wrong.

Again, I didn't see this clearly stated in the spec. Can you quote the spec or point us to the relevant section?

It's the same section, the wording "The following macros shall expand to integer constant expressions whose values are returned by ilogb(x) if x is zero or NaN, respectively." implies to me that those integer values should not be returned by other inputs, but you're right it's not explicitly stated.

If the intention is to use INT_MAX for FP_ILOGBNAN, the implementation needs to be changed to return a different value for Inf (presumably 127 for fp32 and 1023 for fp64, since those can reuse the default codepath).

I see several non-CL implementations using the same value for NaN and Inf [1][2][3]. Again, I'm not saying this applies to CL, but I couldn't find anything in the spec forbidding this collision.

Note that LLVM internally uses INT_MIN + 1 (-INT_MAX) for ilogb(0), INT_MIN for ilogb(NaN) and INT_MAX for ilogb(Inf) [0]. This means that any optimization handling these constants and ilogb operation would be incorrect in LLVM.

Hm, don't you have the same problem when such optimizations happen and the binary is linked with a libc that has different definitions for those values?

yes, the same problem exists for all platforms that provide ilogb behaviour different from the one in LLVM.
This is not the case of llvm using the wrong values internally, but rather using an incorrect constant in constant propagation (afaik it doesn't currently exist).

[1]https://pubs.opengroup.org/onlinepubs/007908799/xsh/ilogb.html
[2]https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/ilogb-ilogbf-ilogbl2?view=vs-2019
[3]http://redhat.polarhome.com/service/man/?qf=ilogb&tf=2&of=OpenDarwin&sf=3

yes. there are multiple different behaviors, depending on the platform. the question is which one should libclc follow. I think returning different values for Inf and NaN is preferable.

I added ilogb test to my small test suite[0] and ran it on couple of different platforms:

  • clover (current libclc): matches CPU libm behaviour
  • ROCm (OpenCL 2.0 AMD-APP.internal (3098.0)): Incorrect element(1): nan result: 2147483647 correct: -2147483648
  • Intel NEO (OpenCL 2.1 NEO): Incorrect element(1): nan result: 2147483647 correct: -2147483648
  • Intel CPU (OpenCL 1.2 (Build 117)): Incorrect element(1): nan result: 2147483647 correct: -2147483648
  • Nvidia (OpenCL 1.2 CUDA 10.2.185): matches CPU libm behaviour
  • POCL (OpenCL 1.2 pocl HSTR: pthread-x86_64-unknown-linux-gnu-bdver4): matches CPU libm behaviour

The current libclc implementation was taken from old AMD OpenCL implementation, but I don't have a working setup to test that.

it looks everyone other than pocl, clover, and Nvidia disagrees with me and uses INT_MAX for both FP_ILOGBNAN and ilogb(Inf).
I withdraw my objection to the behaviour, I think compatibility is more important in this regard.

can you use the macros instead of hardcoded values in the patch?
thanks

edit: nvm, you already do.

[0] https://github.com/jvesely/ocl_tests/blob/master/ilogb/ilogb.cpp

  • Updated the commit message to drop the part stating that ILOGB0 and ILOGBNAN should be different (which is not required)
  • Patched the 64b variant of ilogb()
jvesely accepted this revision.Jul 15 2020, 5:15 AM
This revision is now accepted and ready to land.Jul 15 2020, 5:15 AM
This revision was automatically updated to reflect the committed changes.