Page MenuHomePhabricator

[x86] Enable f128 as a legal type in 64-bit mode if SSE is enabled rather than if MMX is enabled.
Needs RevisionPublic

Authored by craig.topper on Aug 31 2017, 5:52 PM.

Details

Summary

Currently if you use f128 in 64-bit mode with sse disabled and mmx enabled, you get an error about not being able to use an SSE register for a return. And then later we throw an assert trying to emit an xmm register copy.

Changing this to check SSE instead of MMX at least stops us from printing the error and asserting.

I'm still not sure this is correct though. Looking at some assembly with sse/mmx enabled we think the f128 library functions like add return their result in xmm0. But if you disable sse/mmx we then think they return their result in rax/rdx. So unless the library functions return their result both ways, I don't think this is correct.

Here's an example that currently crashes when compiled with "clang -c -mno-mmx"

void foo(__float128 a, __float128 b, __float128 *c) {
    *c = a + b;
}

Diff Detail

Event Timeline

craig.topper created this revision.Aug 31 2017, 5:52 PM
chh requested changes to this revision.Sep 1 2017, 6:33 PM

I cannot get clang to crash with "clang -c -mno-mmx" and the given example.
Maybe I missed something?

It also looks opposite to the problem of "... with sse disabled and mmx
enabled, you get an error about not being able to use an SSE register
for a return ..."
Could you include an example to reproduce that error?

Although replacing hasMMX with hasSSE1 looked logical and it passed
most Android used modes, the failed unit test cases showed
incompatible ABI generated for some other modes like SSE and AVX.
We need to find out if those before change output were correct or not.
Even if they were incorrect, I am not sure about breaking the compatibility.

Some of the unit test checks in extract-store.ll were removed.
They should be kept and changed to check new output pattern if necessary.

This revision now requires changes to proceed.Sep 1 2017, 6:33 PM

Oops I had that backwards, it crashes with clang -c -mmmx -mno-sse

I didn't remove any of the checks in extract_store. They just collapsed into a common check-prefix.

chh added a comment.Sep 5 2017, 4:04 PM

Okay, I see the change in extract_store.ll now.
The checks were not reduced.
The test mode +mmx was changed to +sse,
and expected output of +sse2, +sse4.1, +avx were changed.

With -mmmx -mno-sse, I can see a clang assertion fault following an error message:

  .... error: SSE register return with SSE disabled
void foo(__float128 a, __float128 b, __float128 *c) {
     ^
clang-5.0: .../llvm/lib/CodeGen/TargetRegisterInfo.cpp:173: const llvm::TargetRegisterClass* llvm::TargetRegisterInfo::getMinimalPhysRegClass(unsigned int, llvm::MVT) const: Assertion `BestRC && "Couldn't find the register class"' failed.

Which target/plaftform needs "-mmmx -mno-sse"?
Which uses only mmx or sse?

I saw some other code that checked hasMMX() only,
so I added test cases with +sse2 and without +mmx.
I think they should not use the mmx registers.

How about making only the following change?

if (Subtarget.is64Bit() && Subtarget.hasMMX())

to

if (Subtarget.is64Bit() && Subtarget.hasMMX() && Subtarget.hasSSE1())

That will pass existing unit tests and your -mmx -mno-sse test case.

-mno-sse and -mmmx are driver options a user can specify on the clang command line. They aren't a platform requirement.

Did you look at the assembly output for the test case i gave with and without "-mno-sse -mmmx"? I'm pretty sure even with my change or your change we still miscompile the code when SSE and MMX are disabled. As you can see here https://godbolt.org/g/ELe3CW with them enabled we think addtf3 returns in xmm0, but if we disable mmx and sse we expect the return in rax/rdx. addtf3 is a library function that we didn't compile. Does it return both ways or is the rax/rdx version a miscompile? I believe gcc just throws an error for this case.

chh resigned from this revision.Sep 6 2017, 11:56 AM

My original change in lib/Target/X86/X86ISelLowering.cpp
was part of https://reviews.llvm.org/D15134 to fix
calling convention bug of f128 type mentioned in
https://bugs.llvm.org/show_bug.cgi?id=23897.

Android's x64 target assumes both -msse and -mmmx.
So it works with

if (Subtarget.is64Bit() && Subtarget.hasMMX())

or

if (Subtarget.is64Bit() && Subtarget.hasSSE1())

or

if (Subtarget.is64Bit() && Subtarget.hasMMX() && Subtarget.hasSSE1())

I see now with -msse and -mno-mmx, gcc still uses %xmm0.
That seems to be compatible with AMD64 ABI, too.
So your change in X86ISelLowering.cpp:611 is good:

if (Subtarget.is64Bit() && Subtarget.hasSSE1())

Could you change the 3 affected test files to include both
(1) original +mmx mode, (no sse) but expected output should be changed.
(2) add +sse mode, and the expected output is to using %xmm0 for __float128

The remaining big question is what to do with __float128 when there is no SSE.
I think gcc's output is still better matching AMD64 ABI.
Without SSE,

gcc passes __float128 on stack, but reject source code that returns __float128.

$ gcc -S -O2 m.c -mno-sse -mno-mmx -o m.g.s

  1. or gcc -S -O2 m.c -mno-sse -o m.g.mmx.s

m.c: In function 'get1':
m.c:5:1: error: SSE register return with SSE disabled
float128 get1() { return x; }
m.c:13:58: error: SSE register return with SSE disabled
void foo(
float128 a, float128 b, float128 *c) { *c = a + b; }

It should take more changes (and maybe library functions) to make clang's output
match gcc's under -mno-sse. We should fix that problem later in a separate CL.

chh removed a reviewer: chh.Sep 6 2017, 11:59 AM
chh added a reviewer: chh.