This is an archive of the discontinued LLVM Phabricator instance.

[X86][clang] Disable long double type for -mno-x87 option
ClosedPublic

Authored by asavonic on Mar 18 2021, 1:45 PM.

Details

Summary

This patch attempts to fix a compiler crash that occurs when long double type
is used with -mno-x87 compiler option.

The option disables x87 target feature, which in turn disables x87 registers, so
CG cannot select them for x86_fp80 LLVM IR type. Long double is lowered as
x86_fp80 for some targets, so it leads to a crash.

The option seems to contradict the SystemV ABI, which requires long double to
be represented as a 80-bit floating point, and it also requires to use x87 registers.

In addition to that, float and double also use x87 registers for
return values on 32-bit x86, so they are disabled as well.

Diff Detail

Event Timeline

asavonic created this revision.Mar 18 2021, 1:45 PM
asavonic requested review of this revision.Mar 18 2021, 1:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 18 2021, 1:45 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
asavonic added a comment.EditedMar 18 2021, 2:01 PM

I'm not sure that this is the right approach, but I wanted to get feedback on
how the issue should be fixed. Currently, the compiler crashes on almost any
code with long double and -mno-x87 (excluding cases where CG does not properly disable x87):

long double foo(long double x, long double y)
{
  long double z = x + y;
  if (z < 0.0)
    return z;
  else
    return 0.0;
}

The option was added in D19658 and D13979, but I'm not sure how it is supposed to work for SystemV ABI.
GCC emits an error if long double type is used with -mno-x87: "test.c:37:1: error: x87 register return with x87 disabled".

Added reviewers of the original patches. Please let me know if this patch makes sense, or we should do something else to avoid the crash.

pengfei added inline comments.May 18 2021, 10:43 PM
clang/test/Sema/x86-no-x87.c
48–61

These seems pass with GCC. https://godbolt.org/z/qM4nWhThx

asavonic added inline comments.May 19 2021, 8:29 AM
clang/test/Sema/x86-no-x87.c
48–61

Right. Assignment of a literal is compiled to just mov without any x87 instructions, so it is not diagnosed by GCC. On the other hand, assignment of a variable does trigger the error:

void assign4(double d) {
  struct st_long_double st;
  st.ld = d; // error: long double is not supported on this target
}

We can update the patch to do the same for some cases, but it does not look very consistent, and makes assumptions on how the code is optimized and compiled.

GCC has an advantage here, because it emits the diagnostic at a lower level after at lease some optimizations are done. For example, the following code does not trigger an error for GCC because of inlining:

double get_const() {
  return 0.42;
}
void assign5(struct st_long_double *st) {
  st->ld = get_const();
}
pengfei added inline comments.May 19 2021, 7:13 PM
clang/test/Sema/x86-no-x87.c
48–61

I see. Another concern is about the 32 bits. @LiuChen3 had tested in D100091 that GCC doesn't error for 32 bits. Do we need to consider it here?

asavonic added inline comments.May 24 2021, 10:14 AM
clang/test/Sema/x86-no-x87.c
48–61

Yes, it should be considered. Both x86 and x86_64 SysV ABI targets require x87 to be used for long double. Other targets have different requirements for long double: double precision for Windows, double and quad precision for Android x86 and x86_64 respectively.

asavonic updated this revision to Diff 347443.May 24 2021, 10:16 AM

Added LIT run lines for i686 and windows targets.

pengfei added inline comments.May 25 2021, 12:08 AM
clang/test/Sema/x86-no-x87.c
3

Should i686 expect no error like GCC?

asavonic added inline comments.May 25 2021, 5:40 AM
clang/test/Sema/x86-no-x87.c
3

GCC seems to fallback to soft-float for i686 if -mno-80387 is used:

long double orig(long double x, long double y)
{
  long double z = x + y;
  if (z < 0.0)
    return z;
  else
    return 0.0;
}

i686-linux-gnu-gcc-8 -c -S -mno-80387 -O3:

	  call	__addxf3@PLT
	  [...]
	  call	__ltxf2@PLT
	  addl	$32, %esp
	  testl	%eax, %eax
	  js	.L3
	  xorl	%esi, %esi
	  xorl	%edi, %edi
	  xorl	%ebp, %ebp
  .L3:
	  addl	$12, %esp
	  movl	%esi, %eax
	  movl	%edi, %edx
	  movl	%ebp, %ecx
	  popl	%ebx
	  popl	%esi
	  popl	%edi
	  popl	%ebp
	  ret

This looks like a different ABI.
X87 instructions are not used, so no error is reported.

pengfei added inline comments.May 25 2021, 7:07 AM
clang/test/Sema/x86-no-x87.c
3

I found it's a bit complex for 32 bits.

  1. i686 ABI specifies the return of floating point type must be put in %st0, so any FP type returning should be error out w/o x87.
  2. GCC doesn't respect above ABI.
  3. FP types are passed from stack, so a function like void orig(long double x, long double y, long double *z) should not be error out w/o x87.

x86_64 only uses ST registers when returning FP80.
Considering it is rare for case 3, I think we can ignore it this time, but I suggest we should add check for float and double on 32 bits.

asavonic updated this revision to Diff 348219.May 27 2021, 4:12 AM
  • Disabled double or float return for x86 targets
  • Refactored checks into a separate function.
pengfei accepted this revision.May 27 2021, 4:57 AM

LGTM. But let's wait one or more days to see if others have more comments.

clang/lib/Sema/SemaChecking.cpp
4602

Nit: We can save the curly bracket for it.

14022

ditto

This revision is now accepted and ready to land.May 27 2021, 4:57 AM

@erichkeane, can you please check if this patch is OK for Clang?

This seems to do a lot of work that I think already exists, I believe I've seen that SPIRV targets already prohibit long-double, so doing another implementation specifically for x86 seems like the wrong approach. I'd suggest seeing if the SPIRV version works for these. If it doesn't, I'd prefer you generalize that implementation.

clang/include/clang/Basic/TargetInfo.h
588

I think SPIR targets already do something like this, can you check and see how they do it instead?

Thanks @erichkeane. SPIR-V diagnostics seem to work fine. There are a few cases that are not handled, so I submitted D109315 to review them separately.

asavonic updated this revision to Diff 381538.Oct 22 2021, 7:27 AM
asavonic retitled this revision from [X86][Draft] Disable long double type for -mno-x87 option to [X86][clang] Disable long double type for -mno-x87 option.
asavonic edited the summary of this revision. (Show Details)

Rebased the patch on top of D109315.

Since the diagnostic is now shared with other targets (SYCL and
OpenMP), I decided to drop all assumptions on optimizations that
we previously discussed. There seems to be no actual use case
where it is important to have them (other than compatibility with
GCC), and they make rules a bit inconsistent.

asavonic requested review of this revision.Oct 25 2021, 4:34 AM

@pengfei, @erichkeane, please let me know if the patch is OK now, or anything else needs to be fixed.

@Fznamznon was into this a bunch at once, so she should probably take a look as well.

clang/lib/Sema/Sema.cpp
1936 ↗(On Diff #381538)

Should this be a return, or do we still intend the device invocations to go through the below checks too?

If so, please write a test for that.

clang/lib/Sema/SemaDecl.cpp
9438–9439

Why are we not checking declarations here? This doesn't seem right to me. At least in the offload languages, we still need to check declarations. Also, if something is a problem with a declaration, why is it not a problem with defaulted/deleted?

asavonic added inline comments.Oct 29 2021, 7:58 AM
clang/lib/Sema/SemaDecl.cpp
9438–9439

Why are we not checking declarations here? This doesn't seem right to me. At least in the offload languages, we still need to check declarations.

I assume that if if a function is declared and not used, then it is discarded and no code is generated for it. So it should not really matter if it uses an "unsupported" type.
This is important for long double, because there are C standard library functions that have long double arguments. We skip diagnostics for declarations to avoid errors from standard library headers when the type is actually not used.

Also, if something is a problem with a declaration, why is it not a problem with defaulted/deleted?

If we can expect that defaulted or deleted functions never result in a code with unsupported types, then we can exclude them as well. Something like this perhaps?

void no_long_double(long double) = delete;
erichkeane added inline comments.Oct 29 2021, 8:03 AM
clang/lib/Sema/SemaDecl.cpp
9438–9439

The problem is that these declarations could be called, right? So are we catching something like:

 void has_long_double(long double d);
....
has_long_double(1.0);

The deleted types shouldn't result in code being generated, but default will absolutely result in code being generated. Though I guess I can't think of a situation where we would have a defaulted function that could take a long double anyway.

asavonic added inline comments.Oct 29 2021, 8:35 AM
clang/lib/Sema/SemaDecl.cpp
9438–9439

The problem is that these declarations could be called, right? So are we catching something like:

 void has_long_double(long double d);
....
has_long_double(1.0);

Yes, we catch this when the function is called (see below). This is done in Sema::DiagnoseUseOfDecl.

The deleted types shouldn't result in code being generated, but default will absolutely result in code being generated. Though I guess I can't think of a situation where we would have a defaulted function that could take a long double anyway.

Thank you. I'll update the patch to exclude deleted functions.

clang/test/Sema/x86-no-x87.c
36

This is the case where a declaration is called.

asavonic added inline comments.Nov 1 2021, 6:46 AM
clang/lib/Sema/Sema.cpp
1936 ↗(On Diff #381538)

It should be fine either way. New diagnostics are disabled if TI.hasLongDoubleType and TI.hasFPReturn return "true", as expected for SYCL and OpenMP device targets.

Added a test for SYCL, OpenMP already has tests with long double type.

asavonic updated this revision to Diff 383794.Nov 1 2021, 6:47 AM
  • Added a test for SYCL.
  • Excluded functions marked with C++ delete from the check.
  • Moved the check function from ActOnFunctionDeclarator to ActOnFinishFunctionBody, because the deleted/defaulted property is not yet available in ActOnFunctionDeclarator.
erichkeane accepted this revision.Nov 1 2021, 6:49 AM
This revision is now accepted and ready to land.Nov 1 2021, 6:49 AM
This revision was landed with ongoing or failed builds.Nov 3 2021, 2:09 AM
This revision was automatically updated to reflect the committed changes.

This seems to be causing build errors for the linux kernel (which uses -mno-x87) for expressions like: unsigned long foo = 300L * 1E6L;
https://github.com/ClangBuiltLinux/linux/issues/1497

// clang -O2 -mno-x87
void foo (unsigned long);
void bar (void) { foo(1 * 1E6L); }
<source>:3:25: error: expression requires  'long double' type support, but target 'x86_64-unknown-linux-gnu' does not support it
void bar (void) { foo(1 * 1E6L); }
                        ^

While we can change kernel sources to not use L suffixes for floating point literals, this is a curious difference from GCC and may lead to compatibility issues with other code bases.

@nickdesaulniers, thanks a lot for reporting this!
Unfortunately it is not easy to keep compatibility with GCC in this case. GCC has this diagnostic very late in the pipeline, and trivial uses of long double type (e.g. unused variables, constant expressions) can be optimized away before that. We cannot do the same with LLVM, because there is no way to emit a nice source-level diagnostic from the codegen.
I suggest to fix the kernel, because it looks like the long double literal is not intentional and can be replaced with just double.

This patch causes a regression.

To reproduce - clang -cc1 -fsycl-is-device -triple spir64 test.cpp

// expected-note@+2 {{'bar<__float128>' defined here}}
template <typename T>
T bar() { return T(); };

void usage() {
  // expected-error@+2 {{'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it}}
  // expected-error@+1 {{'__float128' is not supported on this target}}
  auto malAutoTemp5 = bar<__float128>();
}
template <typename name, typename Func>
__attribute__((sycl_kernel)) void kernel_single_task(const Func &kernelFunc) {
  kernelFunc(); // expected-note {{called by 'kernel_single_task<fake_kernel, (lambda at}}
}

int main() {
  kernel_single_task<class fake_kernel>([=]() {
    usage(); // expected-note {{called by 'operator()'}}
  });
  return 0;
}

This test now fails due to an additional diagnostic generated at template definition.

test.cpp:x:3: error: 'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it
T bar() { return T(); };
  ^

I looked at it briefly, and I believe the issue is call to checkTypeSupport() in ActOnFinishFunctionBody(). I tried deleting the call but it breaks tests (E.g. L26 in x86_64-no-x87.cpp). @asavonic Please take a look. I will be reverting the patch if this cannot be fixed soon.

This patch causes a regression.

To reproduce - clang -cc1 -fsycl-is-device -triple spir64 test.cpp

test.cpp:x:3: error: 'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it
T bar() { return T(); };
  ^

I looked at it briefly, and I believe the issue is call to checkTypeSupport() in ActOnFinishFunctionBody(). I tried deleting the call but it breaks tests (E.g. L26 in x86_64-no-x87.cpp). @asavonic Please take a look. I will be reverting the patch if this cannot be fixed soon.

The diagnostic seems to be correct - this instance of bar returns an unsupported type. Why do you think it should not be diagnosed?

This patch causes a regression.

To reproduce - clang -cc1 -fsycl-is-device -triple spir64 test.cpp

test.cpp:x:3: error: 'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it
T bar() { return T(); };
  ^

I looked at it briefly, and I believe the issue is call to checkTypeSupport() in ActOnFinishFunctionBody(). I tried deleting the call but it breaks tests (E.g. L26 in x86_64-no-x87.cpp). @asavonic Please take a look. I will be reverting the patch if this cannot be fixed soon.

The diagnostic seems to be correct - this instance of bar returns an unsupported type. Why do you think it should not be diagnosed?

I believe the problem is that there are now _3_ different diagnostics for the same thing, the one on 'bar', plus 2 more here:

auto malAutoTemp5 = bar<__float128>();

I think i would expect 1 error on 'bar', 1 error on the deduced 'auto', but the 3rd is superfluous.

This patch causes a regression.

To reproduce - clang -cc1 -fsycl-is-device -triple spir64 test.cpp

test.cpp:x:3: error: 'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it
T bar() { return T(); };
  ^

I looked at it briefly, and I believe the issue is call to checkTypeSupport() in ActOnFinishFunctionBody(). I tried deleting the call but it breaks tests (E.g. L26 in x86_64-no-x87.cpp). @asavonic Please take a look. I will be reverting the patch if this cannot be fixed soon.

The diagnostic seems to be correct - this instance of bar returns an unsupported type. Why do you think it should not be diagnosed?

I believe the problem is that there are now _3_ different diagnostics for the same thing, the one on 'bar', plus 2 more here:

auto malAutoTemp5 = bar<__float128>();

I think i would expect 1 error on 'bar', 1 error on the deduced 'auto', but the 3rd is superfluous.

I will check if there is a way to filter it out. However, I don't think that it is a good reason to revert the patch.

This patch causes a regression.

To reproduce - clang -cc1 -fsycl-is-device -triple spir64 test.cpp

test.cpp:x:3: error: 'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it
T bar() { return T(); };
  ^

I looked at it briefly, and I believe the issue is call to checkTypeSupport() in ActOnFinishFunctionBody(). I tried deleting the call but it breaks tests (E.g. L26 in x86_64-no-x87.cpp). @asavonic Please take a look. I will be reverting the patch if this cannot be fixed soon.

The diagnostic seems to be correct - this instance of bar returns an unsupported type. Why do you think it should not be diagnosed?

I believe the problem is that there are now _3_ different diagnostics for the same thing, the one on 'bar', plus 2 more here:

auto malAutoTemp5 = bar<__float128>();

I think i would expect 1 error on 'bar', 1 error on the deduced 'auto', but the 3rd is superfluous.

I will check if there is a way to filter it out. However, I don't think that it is a good reason to revert the patch.

This a pretty significant regression in our downstream, which is typically more than enough to get a revert.

This patch causes a regression.

To reproduce - clang -cc1 -fsycl-is-device -triple spir64 test.cpp

test.cpp:x:3: error: 'bar<__float128>' requires 128 bit size '__float128' type support, but target 'spir64' does not support it
T bar() { return T(); };
  ^

I looked at it briefly, and I believe the issue is call to checkTypeSupport() in ActOnFinishFunctionBody(). I tried deleting the call but it breaks tests (E.g. L26 in x86_64-no-x87.cpp). @asavonic Please take a look. I will be reverting the patch if this cannot be fixed soon.

The diagnostic seems to be correct - this instance of bar returns an unsupported type. Why do you think it should not be diagnosed?

@asavonic I spoke offline with @erichkeane. I was mistaken. There are only 2 error diagnostics generated (upstream) after this patch. The additional diagnostic generated at bar after your patch is correct. We just need to remove the third diagnostic downstream. So nothing needs to be done here. I apologize for the confusion and trouble.

@asavonic I spoke offline with @erichkeane. I was mistaken. There are only 2 error diagnostics generated (upstream) after this patch. The additional diagnostic generated at bar after your patch is correct. We just need to remove the third diagnostic downstream. So nothing needs to be done here. I apologize for the confusion and trouble.

No worries. Thanks a lot for checking this.

Disabled double or float return for x86 targets

Sorry, I think we still need to match GCC's behavior. I.e., we shouldn't diagnosticate any FP type (float, double, long double) on 32-bits. https://godbolt.org/z/KrbhfWc9o
We have users who require of that. Noticed this problem during D112143.
Could you have a look again?

clang/test/Sema/x86-no-x87.cpp
1 ↗(On Diff #384364)

This seems not used?

22–24 ↗(On Diff #384364)

I'd think we can't emit error diagnostics on 32-bits now. We have users who rely on such behavior.

@asavonic , I put a patch D114162 to enable it on 32-bits. Could you help to review it?

nickdesaulniers added a comment.EditedJan 10 2022, 11:59 AM

Linus Torvalds' thoughts on the patches as a result of this change.

Those uses of 1E6L were perhaps strange, but they were only used in
constant compile-time expressions that were cast to 'unsigned long' in
the end, so how the heck did llvm end up with any floating point in
there?

In this case there was really no excuse not to just use a better
constant, but there are other situations where it might well be quite
reasonable to use floating point calculations to create an integer
constant (eg maybe some spec describes an algorithm in FP, but the
implementation uses fixed-point arithmetic and has initializers that
do the conversion).

Imagine for a moment that you want to do fixed-point math (perhaps
because you have a microcontroller without an FP unit - it's not
limited to just "the kernel doesn't do FP"). Further imagine just for
a moment that you still want some fundamental constants like PI in
that fixed-point format.

The sane way to generate said constants is to do something like

#define FIXPT_1 (1u << FIXPT_SHIFT)
#define FIXPT_FP(x) ((fixpt_t) (((x)*FIXPT_1)+0.5))
#define FIXPT_PI FIXPT_FP(3.14159265)

rather than have to do something incredibly stupid and illogical and
unreadable like

#define FIXPT_PI 205887

So honestly, this seems to be just llvm being completely stupid. The
fact that you don't want to generate floating point code has *NOTHING*
to do with floating point literals for constant expressions.

In fact, even if you don't want to generate floating point code -
again, maybe you don't have a FP unit - doesn't mean that you might
not want to generate normal floating point constants. You may end up
having explicit software floating point, and doing things like passing
the floating point values around manually, ie

 union fp {
     uint64_t val;
     double fp_val;
};

and having code like

static const union fp sqrt2 = { .fp_val = 1.4142.... };

and then doing all the math in 'uint64_t' just because you wrote a
couple of math routines yourself:

fp_mul(&res,  sqrt2.val, x);

again, there's no floating point *code* generated, but that doesn't
mean that the compiler shouldn't allow users to use floating point
*values*.

The two
examples above are very much not about the Linux kernel (although I
think we actually have a couple of places that do things exactly like
that) they are about generic coding issues.

Linus Torvalds' thoughts on the patches as a result of this change.

Those uses of 1E6L were perhaps strange, but they were only used in
constant compile-time expressions that were cast to 'unsigned long' in
the end, so how the heck did llvm end up with any floating point in
there?

In this case there was really no excuse not to just use a better
constant, but there are other situations where it might well be quite
reasonable to use floating point calculations to create an integer
constant (eg maybe some spec describes an algorithm in FP, but the
implementation uses fixed-point arithmetic and has initializers that
do the conversion).

Imagine for a moment that you want to do fixed-point math (perhaps
because you have a microcontroller without an FP unit - it's not
limited to just "the kernel doesn't do FP"). Further imagine just for
a moment that you still want some fundamental constants like PI in
that fixed-point format.

The sane way to generate said constants is to do something like

#define FIXPT_1 (1u << FIXPT_SHIFT)
#define FIXPT_FP(x) ((fixpt_t) (((x)*FIXPT_1)+0.5))
#define FIXPT_PI FIXPT_FP(3.14159265)

rather than have to do something incredibly stupid and illogical and
unreadable like

#define FIXPT_PI 205887

So honestly, this seems to be just llvm being completely stupid. The
fact that you don't want to generate floating point code has *NOTHING*
to do with floating point literals for constant expressions.

In fact, even if you don't want to generate floating point code -
again, maybe you don't have a FP unit - doesn't mean that you might
not want to generate normal floating point constants. You may end up
having explicit software floating point, and doing things like passing
the floating point values around manually, ie

 union fp {
     uint64_t val;
     double fp_val;
};

and having code like

static const union fp sqrt2 = { .fp_val = 1.4142.... };

and then doing all the math in 'uint64_t' just because you wrote a
couple of math routines yourself:

fp_mul(&res,  sqrt2.val, x);

again, there's no floating point *code* generated, but that doesn't
mean that the compiler shouldn't allow users to use floating point
*values*.

The two
examples above are very much not about the Linux kernel (although I
think we actually have a couple of places that do things exactly like
that) they are about generic coding issues.

I saw that and responded to it on the ClangBuiltLinux thread...

Generally, our policy is that we should do all our diagnostics in the CFE/Sema. I believe GCC implements this limitation in the code-generator, so the result is that depending on optimization settings, you don't get these errors. I'm going to disagree with Linus, from a language design perspective, if we want to decide a type is 'illegal', we need to do so consistently, not JUST when it shows up. The Clang project has had this opinion as long as I've been around, and we've had to have GCC incompatibilities in the past as a result. I don't see this as any different from what we have done before.