This is an archive of the discontinued LLVM Phabricator instance.

[clang][Sparc] Default to -mcpu=v9 for SparcV8 on Solaris
ClosedPublic

Authored by ro on Aug 26 2020, 7:03 AM.

Details

Summary

As reported in Bug 42535, clang doesn't inline atomic ops on 32-bit Sparc, unlike gcc on
Solaris. In a 1-stage build with gcc, only two testcases are affected (currently XFAILed), while in a 2-stage build more than 100 tests FAIL due to this issue.

The reason for this gcc/clang difference is that gcc on 32-bitSolaris/SPARC defaults to -mpcu=v9 where atomic ops are supported, unlike with clang's default of -mcpu=v8. This patch changes clang to use -mcpu=v9 on 32-bit Solaris/SPARC, too.

Doing so uncovered two bugs:

clang -m32 -mcpu=v9 chokes with any Solaris system headers included:

/usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
#error "Both _ILP32 and _LP64 are defined"

While clang currently defines __sparcv9 in a 32-bit -mcpu=v9 compilation, neither gcc nor Studio cc do. In fact, the Studio 12.6 cc(1) man page clearly states:

          These predefinitions are valid in all modes:
[...]
             __sparcv8 (SPARC)
             __sparcv9 (SPARC -m64)

At the same time, the patch defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] for a 32-bit Sparc compilation with any V9 cpu. I've also changed MaxAtomicInlineWidth for V9, matching what gcc does and the Oracle Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic Types, 3.1 Size and Alignment of Atomic C Types).

The two testcases that had been XFAILed for Bug 42535 are un-XFAILed again. However, one of those and another testcase now FAIL due to Bug 42493 and are thus XFAILed.

Tested on sparcv9-sun-solaris2.11 and amd64-pc-solaris2.11.

While the fix proper is trivial: just two lines in lib/Driver/ToolChains/CommonArgs.cpp, finding the right place has been nightmarishly difficult: I'd have expected handling of a Solaris/SPARC CPU default in either of Solaris or SPARC specific files, but not deeply hidden in common code. I've come across issues like this over and over again: configuration information in LLVM is spread all over the place, difficult to find or just to know that it exists.

Diff Detail

Event Timeline

ro created this revision.Aug 26 2020, 7:03 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: Restricted Project, jfb, jrtc27, fedor.sergeev. · View Herald Transcript
ro requested review of this revision.Aug 26 2020, 7:03 AM
ro added a comment.Aug 27 2020, 2:17 AM

While doing 2-stage builds, the 2 XFAILs due to Bug 42493 were required. However, I've since also done a 1-stage build with gcc 10, which shows

Unexpectedly Passed Tests (2):
  UBSan-Standalone-sparc :: TestCases/Float/cast-overflow.cpp
  UBSan-Standalone-sparc :: TestCases/Misc/log-path_test.cpp

which is no wonder since the underlying bug only exists in clang long double handling.

I'm a bit uncertain how to deal with this: since the Solaris/sparcv9 buildbot currently does 1-stage builds with gcc, the resulting XPASSes would turn the bot red. However, currently there's no clang-vs.-gcc feature in lit to distinguish the two builds, AFAICT. libcxx has something like this, but that might be overkill to extract.

Perhaps just remove the XFAILs for now?

Ping? It's been a week.

While the fix proper is trivial: just two lines in lib/Driver/ToolChains/CommonArgs.cpp, finding the right place has been nightmarishly difficult: I'd have expected handling of a Solaris/SPARC CPU default in either of Solaris or SPARC specific files, but not deeply hidden in common code. I've come across issues like this over and over again: configuration information in LLVM is spread all over the place, difficult to find or just to know that it exists.

The clang driver is messy, yes. Other places are pretty good about this, mostly.

For compiler-rt, the XFAILs should probably reflect whatever config the bot is running. (Alternatively, you could use UNSUPPORTED, but that doesn't seem warranted here.)

clang/lib/Basic/Targets/Sparc.cpp
224

This probably should be refactored so the target-independent code generates it based on MaxAtomicInlineWidth, instead of duplicating it for each target. But I guess you don't need to do that here.

From the other code, the getCPUGeneration(CPU) == CG_V9 check should only guard the definition of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8?

clang/lib/Driver/ToolChains/CommonArgs.cpp
350

Do we want to make sparc and sparcel behave differently here?

jrtc27 requested changes to this revision.Aug 31 2020, 10:58 AM

GCC on Linux defines __sparc_v9__ even with -m32. I don't know what Solaris does but please don't break other operating systems just because Solaris has broken headers that conflate the CPU and the ABI.

This revision now requires changes to proceed.Aug 31 2020, 10:58 AM

And notably _doesn't_ define the V8 macros, which this patch then reintroduces.

ro added a comment.Sep 1 2020, 2:33 AM

For compiler-rt, the XFAILs should probably reflect whatever config the bot is running. (Alternatively, you could use UNSUPPORTED, but that doesn't seem warranted here.)

That's been my guess, too (with the goal of getting the bot green eventually in mind). Once the underlying clang bug is fixed, the tests will pass with both compilers, so I agree that UNSUPPORTED isn't warranted here.

clang/lib/Basic/Targets/Sparc.cpp
224

This probably should be refactored so the target-independent code generates it based on MaxAtomicInlineWidth, instead of duplicating it for each target. But I guess you don't need to do that here.

Good: one issue at a time ;-)

From the other code, the getCPUGeneration(CPU) == CG_V9 check should only guard the definition of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8?

I don't think so: at least gcc defines none of the four with -m32 -mcpu=v8 and all with -m32 -mcpu=v9.

clang/lib/Driver/ToolChains/CommonArgs.cpp
350

The only thing about little-endian SPARC is that UltraSPARC CPUs can be run in little-endian mode, presumably to ease a Windows NT port that never materialized.

I could barely find any info on sparcel: the triplet isn't in config.guess, there's very little else about such CPUs and certainly no specification or even an ABI. That's why I tend to leave sparcel-specific code alone unless someone who knows these beasts tells me otherwise.

What I can say for certain that Solaris never ran on little-endian SPARC.

ro added a comment.Sep 1 2020, 2:43 AM

GCC on Linux defines __sparc_v9__ even with -m32. I don't know what Solaris does but please don't break other operating systems just because Solaris has broken headers that conflate the CPU and the ABI.

I'd have sworn that I checked gcc on Linux/sparc64: must have been dreaming.

I quoted the relevant snippet of the Studio cc(1) man page which shows that Solaris compilers predefine __sparcv8 for -m32 and __sparcv9 for -m64. I'd argue that the creators of the CPU, the OS, and the initial compiler can decide to do so as they please (after all, AFAICT that's not part of any ABI and I couldn't find an API document that mentioned those), nothing broken or conflated here.

I've now looked at the GCC sources and found:

  • FreeBSD/sparc64 is 64-bit-only, no 32-bit support in sight
  • while the NetBSD/sparc64 port still includes bi-arch support, it is disabled because the don't ship a 32-bit environment and even if they did, they only define __sparc_v9__ etc. for -m64.
  • OpenBSD/sparc64 again is 64-bit only

I'll see how best to reconcile those.

ro updated this revision to Diff 289128.Sep 1 2020, 4:46 AM

Define __sparcv8 only on 32-bit Solaris/SPARC.
Update testcase.
Remove XFAILs that only apply to clang/2-stage builds.

jrtc27 added a comment.Sep 1 2020, 5:13 AM

Actually, __sparcv8 is only for V8; if you have 32-bit V9 on Solaris it defines __sparcv8plus _instead_:

jrtc27@gcc-solaris11:~$ /opt/solarisstudio12.4/bin/cc -E - -xarch=v9 -m32 -xdumpmacros </dev/null 2>&1 | grep sparc
#define __sparcv8plus 1
#define __sparc 1
#define sparc 1

In fact, -xarch=v9 + -m32 is a bit weird because -xarch=v9 implies -m64 so the argument order matters, and the modern way to do it is (if you read the man page, -xarch=sparc means V9 and -xarch=v9 is an alias for -m64 -xarch=sparc...):

jrtc27@gcc-solaris11:~$ /opt/solarisstudio12.4/bin/cc -E - -m32 -xarch=sparc -xdumpmacros </dev/null 2>&1 | grep sparc
#define __sparcv8plus 1
#define __sparc 1
#define sparc 1

(gcc211 on the GCC compile farm; any open-source developer can register, it's not specific to GCC developers despite the name)

ro added a comment.Sep 1 2020, 5:37 AM

Actually, __sparcv8 is only for V8; if you have 32-bit V9 on Solaris it defines __sparcv8plus _instead_:

jrtc27@gcc-solaris11:~$ /opt/solarisstudio12.4/bin/cc -E - -xarch=v9 -m32 -xdumpmacros </dev/null 2>&1 | grep sparc
#define __sparcv8plus 1
#define __sparc 1
#define sparc 1

That's only true up to Studio 12.4: 12.5 and 12.6 define both __sparcv8 and __sparcv8plus, and strangely even the 12.4 and 12.3 cc(1) man pages only documents __sparcv8 while 12.2 only has __sparc. We could also define __sparcv8plus, but gcc doesn't do that, so it would only help people compiling with cc.

In fact, -xarch=v9 + -m32 is a bit weird because -xarch=v9 implies -m64 so the argument order matters, and the modern way to do it is (if you read the man page, -xarch=sparc means V9 and -xarch=v9 is an alias for -m64 -xarch=sparc...):

Right; 12.5 and 12.6 cc even warn

cc: Warning: -xarch=v9 is deprecated, use -m64 -xarch=sparc instead
jrtc27@gcc-solaris11:~$ /opt/solarisstudio12.4/bin/cc -E - -m32 -xarch=sparc -xdumpmacros </dev/null 2>&1 | grep sparc
#define __sparcv8plus 1
#define __sparc 1
#define sparc 1

(gcc211 on the GCC compile farm; any open-source developer can register, it's not specific to GCC developers despite the name)

I do have a cfarm account already. However, there's no need for that: I do have every version of Studio back to 11 installed locally.

brad added inline comments.Sep 1 2020, 1:01 PM
clang/lib/Driver/ToolChains/CommonArgs.cpp
350

"The Sparcle is an experimental 32-bit microprocessor chip developed in 1992 by a consortium of MIT, LSI Corporation, and Sun Microsystems. It was an evolution Sun's SPARC RISC architecture with features geared towards "large-scale multiprocessing".[1] The chip was manufactured by LSI."

https://people.eecs.berkeley.edu/~kubitron/cs252/handouts/papers/sparcle-micro-final.pdf

From what little I can find I believe this prototype CPU / system was based on Linux.

efriedma added inline comments.Sep 1 2020, 1:15 PM
clang/lib/Basic/Targets/Sparc.cpp
224

This code, the code that sets MaxAtomicInlineWidth, and the code inSPARCISelLowering.cpp that calls setMaxAtomicSizeInBitsSupported() all need to agree about the supported atomic operations.

I guess the current setting of MaxAtomicInlineWidth is wrong?

ro added inline comments.Sep 1 2020, 1:46 PM
clang/lib/Basic/Targets/Sparc.cpp
224

I'd say so, yes: gcc -m32 inlines ops on _Atomic long long while clang-11 -m32 -mcpu=v9 doesn't.

ro added inline comments.Sep 1 2020, 1:51 PM
clang/lib/Driver/ToolChains/CommonArgs.cpp
350

Ah, thanks. So this predates Solaris 2.0 (also released in 1992). But even if not, current Solaris 11 is SPARC V9-only.

However, if this is the only implementation, this bears the question why support for such an experimental CPU almost 30 years old still lives in LLVM. It comes with a cost, if only for reasoning what needs and needs not to be done to the V8el code.

efriedma added inline comments.Sep 1 2020, 2:12 PM
clang/lib/Basic/Targets/Sparc.cpp
224

Oh, hmm, it looks like the backend's support for 32-bit v9 is really limited; we basically generate v8 code, with a couple limited exceptions. Probably okay to make clang assume 64-bit atomics are actually supported, even if we don't inline the implementation at the moment; they should still behave correctly.

I was more wondering about what we do for v8: we set MaxAtomicInlineWidth to 32, but I don't think it supports atomic cmpxchg at all.

efriedma added inline comments.Sep 1 2020, 2:20 PM
clang/lib/Driver/ToolChains/CommonArgs.cpp
350

The sparcv9 manual says the CPU has a little-endian mode; I assume that means all sparcv9 chips support it. If nobody actually does that in practice, though, we should drop the sparcel target.

ro added inline comments.Sep 1 2020, 2:53 PM
clang/lib/Driver/ToolChains/CommonArgs.cpp
350

Right. However, that's a privileged operation AFAIU and non of the OSes I know about (Solaris, Linux, the BSDs) run in little-endian mode. Besides, it would be weird to restrict oneself to the sparcv8 insns if one were using that v9 facility. On top of all that, gcc doesn't support the little-endian mode at all.

ro added inline comments.Sep 1 2020, 3:18 PM
clang/lib/Driver/ToolChains/CommonArgs.cpp
350

Seems we missed something, though: https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5a6e0e311&appId=PPGMS states (p.18):

6.3.5. Endianness
The Leon processors are both working in little endian same as the SHAVE processors. No extra requirements are needed to ensure endianess consistency.

Judging from that, Myriad is the primary user of sparcel these days. So unless we can get rid of that, too, sparcel will have to stay.

The point of no available documentation remains, though.

ro added inline comments.Sep 2 2020, 2:55 AM
clang/lib/Basic/Targets/Sparc.cpp
224

I've now found The V8+ Technical Specification. I've not checked how far LLVM makes use of that, though (gcc seems to be pretty extensive, and it certainly makes use of casx for v8plus).

I'm not really clear on the semantics of MaxAtomicInlineWidth: TargetInfo.h's description of getMaxAtomicInlineWidth only states

/// Return the maximum width lock-free atomic operation which can be
/// inlined given the supported features of the given target.

which would be satisfied given that some 32-bit atomic ops are inlined for v8.

efriedma added inline comments.Sep 2 2020, 11:29 AM
clang/lib/Basic/Targets/Sparc.cpp
224

The definition of C atomic operations requires that we only expose lock-free atomic operations on targets that have a lock-free cmpxchg. This is necessary to allow the implementation in libatomic to work: even if an operation is "atomic", it wouldn't correctly honor the libatomic locks.

Note that the dynamic component of libatomic allows using lockfree operations in a "v8plus" environment. The libatomic implementation should dynamically check whether the CPU supports casx, and use lock-free operations if it does.

ro added inline comments.Sep 2 2020, 12:00 PM
clang/lib/Basic/Targets/Sparc.cpp
224

Honestly, I don't see this as my job, though: the issue is pre-existing, it doesn't affect any of my targets and I see that GCC's libatomic forces -mcpu=v9 for the 32-bit sparc library.

efriedma added inline comments.Sep 2 2020, 2:42 PM
clang/lib/Basic/Targets/Sparc.cpp
224

If you don't want to touch it, please leave a FIXME so the next person looking at this code has a reference.

ro updated this revision to Diff 289658.Sep 3 2020, 1:45 AM

Add FIXME for SparcV8 MaxAtomicInlineWidth.

ro added a comment.Sep 4 2020, 2:06 AM

Is there anything left to do to get approval? Thanks.

ro added a comment.Sep 10 2020, 4:25 PM

Ping? It's been a week and AFAICT there's nothing left for me to do.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 11 2020, 12:54 AM
This revision was automatically updated to reflect the committed changes.