As reported in Bug 42535, clang doesn't inline atomic ops on 32-bit Sparc, unlike gcc on
Solaris. In a 1-stage build with gcc, only two testcases are affected (currently XFAILed), while in a 2-stage build more than 100 tests FAIL due to this issue.
The reason for this gcc/clang difference is that gcc on 32-bit Solaris/SPARC defaults to -mcpu=v9, where atomic ops are supported, whereas clang defaults to -mcpu=v8. This patch changes clang to use -mcpu=v9 on 32-bit Solaris/SPARC, too.
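To make the effect of the CPU default concrete, here is a minimal sketch of the kind of code that is affected. The helper name `cas64` is made up for illustration and is not part of the patch. With a V8 default, the 64-bit compare-and-swap below would be expected to end up as a library call rather than inline code; with a V9 default, it should be inlinable.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not part of the patch: a 64-bit compare-and-swap.
// Returns true if `value` held `expected` and was replaced by `desired`.
bool cas64(std::atomic<std::uint64_t> &value, std::uint64_t expected,
           std::uint64_t desired) {
  // `expected` is passed by value, so the caller's copy is left untouched
  // even though compare_exchange_strong updates it on failure.
  return value.compare_exchange_strong(expected, desired);
}
```

Comparing the assembly generated for this function with -m32 -mcpu=v8 and with -m32 -mcpu=v9 is one way to see whether the operation was inlined.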
Doing so uncovered two bugs:
clang -m32 -mcpu=v9 chokes with any Solaris system headers included:
/usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
#error "Both _ILP32 and _LP64 are defined"
While clang currently defines __sparcv9 in a 32-bit -mcpu=v9 compilation, neither gcc nor Studio cc does. In fact, the Studio 12.6 cc(1) man page clearly states:
These predefinitions are valid in all modes: [...] __sparcv8 (SPARC) __sparcv9 (SPARC -m64)
At the same time, the patch defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] for a 32-bit Sparc compilation with any V9 cpu. I've also changed MaxAtomicInlineWidth for V9, matching both what gcc does and what the Oracle Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic Types, 3.1 Size and Alignment of Atomic C Types).
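A small probe along the following lines can show which of these macros a given compiler invocation actually predefines. This is only a sketch: the function name `predefinedCasSizes` is invented here, and the expectation that a -m32 -mcpu=v9 build reports all four sizes assumes the patch behaves as described above.

```cpp
#include <vector>

// Hypothetical probe, not part of the patch: returns the operand sizes (in
// bytes) for which the compiler predefines
// __GCC_HAVE_SYNC_COMPARE_AND_SWAP_<N>.
std::vector<int> predefinedCasSizes() {
  std::vector<int> sizes;
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1
  sizes.push_back(1);
#endif
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
  sizes.push_back(2);
#endif
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
  sizes.push_back(4);
#endif
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
  sizes.push_back(8);
#endif
  return sizes;
}
```

The result depends entirely on the target and flags used to compile the probe, so it is meant for comparing compilers or -mcpu settings, not as a portable runtime check.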
The two testcases that had been XFAILed for Bug 42535 are un-XFAILed again. However, one of those and another testcase now FAIL due to Bug 42493 and are thus XFAILed.
Tested on sparcv9-sun-solaris2.11 and amd64-pc-solaris2.11.
While the fix proper is trivial (just two lines in lib/Driver/ToolChains/CommonArgs.cpp), finding the right place has been nightmarishly difficult. I'd have expected a Solaris/SPARC CPU default to be handled in either the Solaris-specific or the SPARC-specific files, not deeply hidden in common code. I've come across issues like this over and over again: configuration information in LLVM is spread all over the place, and it is difficult to find or even to know that it exists.
This probably should be refactored so the target-independent code generates it based on MaxAtomicInlineWidth, instead of duplicating it for each target. But I guess you don't need to do that here.
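As a rough illustration of that refactoring idea, the macro list could be derived from the inline width alone. Everything here is hypothetical: the function `syncCasMacros` and its signature are invented and do not correspond to existing clang code, and the sketch assumes that every power-of-two size up to the width is supported, which may not hold for every target.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of the suggested refactoring, not existing clang code.
// MaxAtomicInlineWidth is given in bits. Assumes (as a simplification) that
// every power-of-two operand size up to that width can be inlined.
std::vector<std::string> syncCasMacros(unsigned MaxAtomicInlineWidth) {
  std::vector<std::string> Macros;
  for (unsigned Bytes = 1; Bytes <= 16; Bytes *= 2) {
    if (Bytes * 8 <= MaxAtomicInlineWidth)
      Macros.push_back("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_" +
                       std::to_string(Bytes));
  }
  return Macros;
}
```

Under these assumptions, a width of 64 yields the _1, _2, _4 and _8 macros, and a width of 32 yields only _1, _2 and _4.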
Judging from the other code, shouldn't the getCPUGeneration(CPU) == CG_V9 check only guard the definition of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8?