This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Move section assignment from initializeSymbols to postParse
ClosedPublic

Authored by MaskRay on Feb 26 2022, 11:57 PM.

Details

Summary

https://discourse.llvm.org/t/parallel-input-file-parsing/60164

initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.

Postpone reporting duplicate symbol errors so that the messages have the
section information. (Defined::section is assigned in postParse and another
thread may not have the information).

  • duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now has no section information
  • comdat-binding: %t/w.o %t/g.o leads to an undesired undefined symbol. This is not ideal but we report a diagnostic to inform that this is unsupported. (See release note)
  • comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded section error) makes more sense

Depends on D120640

Diff Detail

Event Timeline

MaskRay created this revision.Feb 26 2022, 11:57 PM
MaskRay requested review of this revision.Feb 26 2022, 11:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2022, 11:57 PM
MaskRay edited the summary of this revision. (Show Details)Feb 27 2022, 12:02 AM
MaskRay updated this revision to Diff 411661.Feb 27 2022, 12:07 AM

improve comment

MaskRay updated this revision to Diff 411726.Feb 27 2022, 7:12 PM
MaskRay edited the summary of this revision. (Show Details)

Depends on D120626

Not supporting comdat-binding: %t/w.o %t/g.o leads to an undesired undefined symbol. This is not ideal but acceptable because valid input cannot form this case. makes me a bit nervous. In a clang/gcc world this is not a problem, but there may be third party compilers that differ. For example arm's legacy compiler armcc from Arm Compiler 5 uses STB_GLOBAL for definitions in COMDAT groups, whereas GCC, at least used to, uses STB_WEAK. It is possible that this is illegal, but I can't find in the spec where it says that.

Perhaps it could be accounted for in resolve as I think we'd be passing a global with the discarded section. Perhaps we could prefer the existing symbol in that case?

lld/ELF/Driver.h
64 ↗(On Diff #411726)

Could be worth a named struct? Something like DuplicateSymbol. This could be accepted by reportUndefined rather than 4 individual tuple field accesses.

lld/ELF/InputFiles.cpp
1149

nit: !sym.file allows a symbol assignment to redefine a symbol without an error.

1163

If this is happening in parallel, is there a chance of interleaved output? For example a traced symbol in multiple groups? I guess that would need something like driver->duplicates?

lld/ELF/Symbols.cpp
544–553

I'm trying to think how this can happen? Would this be a case where there is a definition of symbol S outside of a group and then an existing weak S from prevailing group and a global S from a discarded group.

May be worth expanding on the case in the comment as it isn't easy (or at least I wasn't able to easily work it out) from the code.

lld/test/ELF/comdat-binding.s
13

I'm not so sure about that. It is certainly an area that is ill-specified. I can't find anything ELF prohibiting it though. Certainly it wouldn't within the same tool-chain I'd expect consistency, but I could see toolchain X using weak symbols in comdat groups and toolchain Y using global symbols unless the C++ ABI for the platform is explicit about it.

I can only find references to weak symbols in groups for http://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-static . The Arm C++ ABI references this https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst#426elf-binding-of-static-data-guard-variable-symbols but is silent on other cases.

MaskRay updated this revision to Diff 412282.Mar 1 2022, 3:38 PM
MaskRay marked 3 inline comments as done.

address some comments

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 3:38 PM

Not supporting comdat-binding: %t/w.o %t/g.o leads to an undesired undefined symbol. This is not ideal but acceptable because valid input cannot form this case. makes me a bit nervous. In a clang/gcc world this is not a problem, but there may be third party compilers that differ. For example arm's legacy compiler armcc from Arm Compiler 5 uses STB_GLOBAL for definitions in COMDAT groups, whereas GCC, at least used to, uses STB_WEAK. It is possible that this is illegal, but I can't find in the spec where it says that.

Perhaps it could be accounted for in resolve as I think we'd be passing a global with the discarded section. Perhaps we could prefer the existing symbol in that case?

Avoiding accessing sections[*] is what the patch strives to do: untangle section initialization and symbol initialization/resolution as much as possible.
I have a plan to merge section initialization and initializeLocalSymbols into initSectionsAndLocalSyms, which parallelizes section initialization.
In the long term, this is a necessary step to achieve parallel input file parsing (https://discourse.llvm.org/t/parallel-input-file-parsing/60164) (and perhaps important to keep lld relevant with the emerging of mold.)

Keeping if (sec == &InputSection::discarded) will address the %t/w.o %t/g.o dilemma, but that probably will not fly.
Personally I'd like to say such a dilemma is unsupported. If we have to, it doesn't seem like there is a good way supporting it...

lld/ELF/InputFiles.cpp
1163

Yes. If the determinism is desired, elf::printTraceSymbol( needs to some refactoring: the message call needs to be replaced with something like driver->duplicates.

Placing duplicates in LinkerDriver is a bit arbitrary. I wonder whether we should use ctx->duplicates.

lld/ELF/Symbols.cpp
544–553

Updated the comment (the original one was a bit wrong). It's related to test/ELF/comdat-linkonce.s

Not supporting comdat-binding: %t/w.o %t/g.o leads to an undesired undefined symbol. This is not ideal but acceptable because valid input cannot form this case. makes me a bit nervous. In a clang/gcc world this is not a problem, but there may be third party compilers that differ. For example arm's legacy compiler armcc from Arm Compiler 5 uses STB_GLOBAL for definitions in COMDAT groups, whereas GCC, at least used to, uses STB_WEAK. It is possible that this is illegal, but I can't find in the spec where it says that.

Perhaps it could be accounted for in resolve as I think we'd be passing a global with the discarded section. Perhaps we could prefer the existing symbol in that case?

Avoiding accessing sections[*] is what the patch strives to do: untangle section initialization and symbol initialization/resolution as much as possible.
I have a plan to merge section initialization and initializeLocalSymbols into initSectionsAndLocalSyms, which parallelizes section initialization.
In the long term, this is a necessary step to achieve parallel input file parsing (https://discourse.llvm.org/t/parallel-input-file-parsing/60164) (and perhaps important to keep lld relevant with the emerging of mold.)

Keeping if (sec == &InputSection::discarded) will address the %t/w.o %t/g.o dilemma, but that probably will not fly.
Personally I'd like to say such a dilemma is unsupported. If we have to, it doesn't seem like there is a good way supporting it...

Thanks for the updates so far.

My thoughts so far:

  • I agree that mixing symbol bindings in the same link is a niche use case, likely for compilers not derived from LLVM or GCC.
  • I expect that parallel reading will be in development for some time. It probably isn't worth working extra hard right now.
  • If it isn't possible to cleanly support it, then can we diagnose it with a specific error message? For example LLD does not support COMDAT groups where different groups use different symbol bindings for the same symbol name. is better than an undefined symbol error.
  • We'll need to document and release note the restriction.
lld/ELF/InputFiles.cpp
1163

By interleaved output I was thinking of something along the lines of:

Some people, when confronted with a problem, think, “I know, I’ll use threads,” and then two they hav erpoblesms.

from https://nedbatchelder.com/blog/201204/two_problems.html

Perhaps another mutex for the block if it is susceptible. It is possible that the stream is protected with a lock though.

ctx->duplicates is probably more friendly to the people that want to use LLD in the context of a library.

MaskRay added a comment.EditedMar 8 2022, 7:51 PM

I agree that mixing symbol bindings in the same link is a niche use case, likely for compilers not derived from LLVM or GCC.
I expect that parallel reading will be in development for some time. It probably isn't worth working extra hard right now.

This change is just a necessary step to unlock more parallelism (section initialization), unless there is any other approach I haven't thought of.
(Our progress is not particularly rapid, just incrementally making more stuff parallel)

If it isn't possible to cleanly support it, then can we diagnose it with a specific error message? For example LLD does not support COMDAT groups where different groups use different symbol bindings for the same symbol name. is better than an undefined symbol error.
We'll need to document and release note the restriction.

Unfortunately I can't think of a way to detect the case without introducing a new loop.
The best I can think of is to add a boolean variable to InputFile. After parallel section initialization, add some code to postParse to detect the case, then in a new loop, resolve the symbol to the prevailing (even if weak) COMDAT group.
This seems inelegant and a lot of engineering to report a good diagnostic for this case ...

Lowest cost option I can think of is to additionally set a flag when InputFiles.cpp:1163:

if (sym.file == this)
  sym.replace(Undefined{this, sym.getName(), sym.binding, sym.stOther,
                        sym.type, secIdx});
continue;

Perhaps a new bit field entry in undefined. At the time of reporting undefined symbols we could check for that flag and report some additional context.
`Symbol <sym> was defined in non-prevailing group. Either the symbol was not present in the prevailing group, or the symbol in the prevailing group had STB_WEAK binding and this symbol had STB_GLOBAL binding. LLD does not support mixing of groups with STB_WEAK and STB_GLOBAL bindings for the same symbol name.'

I don't want to hold this up any more than I have already. If it is going to be too much to support then worth adding something to the documentation, or eventually the release notes stating what we don't support.

MaskRay updated this revision to Diff 414597.Mar 10 2022, 11:53 PM

Add a diagnostic.
Add release note

MaskRay updated this revision to Diff 414598.Mar 11 2022, 12:02 AM

Move duplicates from LinkerDriver to a new struct Ctx

Ctx has short name because we may use it in many functions' signatures in the future.
Using LinkerDriver for global states may look a bit weird

MaskRay added inline comments.Mar 11 2022, 12:09 AM
lld/ELF/InputFiles.cpp
1163

There is no interleaving. A single-line or multi-line message is emitted as a whole.

lld/Common/ErrorHandler.cpp uses lock_guard for messages and diagnostics.

peter.smith accepted this revision.Mar 14 2022, 11:16 AM

Thanks for the updates. I've set approved as my comments have been addressed.

This revision is now accepted and ready to land.Mar 14 2022, 11:16 AM

Thanks for accepting the compromise!

MaskRay edited the summary of this revision. (Show Details)Mar 14 2022, 12:06 PM
haowei added a subscriber: haowei.Mar 15 2022, 2:11 PM

We are seeing build failures in cmake config step for the clang runtime builds after this patch. Example of error messages:

[3945/4056] Performing configure step for 'runtimes-i386-unknown-linux-gnu'
-- The C compiler identification is Clang 15.0.0
-- The CXX compiler identification is Clang 15.0.0
-- The ASM compiler identification is Clang with GNU-like command-line
-- Found assembler: /mnt/nvme_sec/SRC/llvm-project/build-repro/./bin/clang
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - failed
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - failed
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find ZLIB (missing: ZLIB_LIBRARY) (found version "1.2.11")
-- Found LibXml2: /usr/lib/x86_64-linux-gnu/libxml2.so (found version "2.9.12")
-- Could NOT find ZLIB (missing: ZLIB_LIBRARY) (found version "1.2.11")
-- Performing Test LLVM_RUNTIMES_LINKING_WORKS
-- Performing Test LLVM_RUNTIMES_LINKING_WORKS - Failed
-- Performing Test LLVM_RUNTIMES_SUPPORT_UNWINDLIB_NONE_FLAG
-- Performing Test LLVM_RUNTIMES_SUPPORT_UNWINDLIB_NONE_FLAG - Failed
-- Performing Test LLVM_RUNTIMES_SUPPORT_NOSTDLIBXX_FLAG
-- Performing Test LLVM_RUNTIMES_SUPPORT_NOSTDLIBXX_FLAG - Failed
-- Performing Test LLVM_RUNTIMES_SUPPORT_NOSTDINCXX_FLAG
-- Performing Test LLVM_RUNTIMES_SUPPORT_NOSTDINCXX_FLAG - Failed
CMake Warning (dev) at /usr/share/cmake-3.22/Modules/GNUInstallDirs.cmake:239 (message):
  Unable to determine default CMAKE_INSTALL_LIBDIR directory because no
  target architecture is known.  Please enable at least one language before
  including GNUInstallDirs.
Call Stack (most recent call first):
  /mnt/nvme_sec/SRC/llvm-project/llvm/cmake/modules/AddLLVM.cmake:1 (include)
  CMakeLists.txt:126 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Linker detection: unknown
-- Performing Test CXX_SUPPORTS_CUSTOM_LINKER
-- Performing Test CXX_SUPPORTS_CUSTOM_LINKER - Failed
CMake Error at /mnt/nvme_sec/SRC/llvm-project/llvm/cmake/modules/HandleLLVMOptions.cmake:308 (message):
  Host compiler does not support '-fuse-ld=lld'
Call Stack (most recent call first):
  CMakeLists.txt:127 (include)

Example of the build failure on the bots: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8819676041146779809/+/u/clang/build/stdout
This breakage was masked by another breakage caused by e049a87f04cff8e81b4177097a6b2fdf37c7b148 so it took us a bit while to figure out. We locally verified 0c3156bd43842d8f0ea54bea3fea628e8fee93d2 (1 patch before) will pass the build while c30e6447c0225f675773d07f2b6d4e3c2b962155 (this patch) will fall with the error message above.

Could you take a look? If it takes a bit long time to investigate, could you revert this change first please?

Could this be related to the patch https://lab.llvm.org/buildbot/#/builders/37/builds/11582 ?

FYI @pcc

I could verify that clang -fuse-ld=lld -m32 a.c failed to link if I used an older crti.o (glibc<2.32)

usr/lib32/crti.o: definition of __x86.get_pc_thunk.bx
usr/lib32/crti.o: reference to __x86.get_pc_thunk.bx
usr/lib32/crti.o: reference to __x86.get_pc_thunk.bx
ld.lld: error: relocation refers to a symbol in a discarded section: __x86.get_pc_thunk.bx
>>> defined in usr/lib32/crti.o
>>> referenced by usr/lib32/crti.o:(.init+0x5)
>>> referenced by usr/lib32/crti.o:(.fini+0x5)

It was related to the partially supported(worked around) .gnu.linkonce.t.__x86.get_pc_thunk.bx broken by this change.
Should be fixed by 6be457c14dafd634989c2c0b702a9231b438e2c4

We are seeing build failures in cmake config step for the clang runtime builds after this patch. Example of error messages:

Thanks for the report, but with Host compiler does not support '-fuse-ld=lld' it might be difficult to know for sure whether the culprit was in an lld change or a brittle configure.
Providing CMakeError.log or CMakeOutput.log with the linker diagnostic could make me more sure that it was related to this lld change.
Anycase, I think the issue was exactly the one for i386 sanitizers and it should have been fixed by the aforementioned commit.

We are seeing build failures in cmake config step for the clang runtime builds after this patch. Example of error messages:

Thanks for the report, but with Host compiler does not support '-fuse-ld=lld' it might be difficult to know for sure whether the culprit was in an lld change or a brittle configure.
Providing CMakeError.log or CMakeOutput.log with the linker diagnostic could make me more sure that it was related to this lld change.
Anycase, I think the issue was exactly the one for i386 sanitizers and it should have been fixed by the aforementioned commit.

I looked up my local CMakeError.log file for this patch and it contains:

[2/2] Linking C executable cmTC_5b689
FAILED: cmTC_5b689
: && /mnt/nvme_sec/SRC/llvm-project/build-repro/./bin/clang --target=i386-unknown-linux-gnu --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --target=i386-unknown-linux-gnu -fuse-ld=lld -v CMakeFiles/cmTC_5b689.dir/CMakeCCompilerABI.c.o -o cmTC_5b689   && :
Fuchsia clang version 15.0.0 (git@github.com:llvm/llvm-project.git c30e6447c0225f675773d07f2b6d4e3c2b962155)
Target: i386-unknown-linux-gnu
Thread model: posix
InstalledDir: /mnt/nvme_sec/SRC/llvm-project/build-repro/./bin
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.8
Selected GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Candidate multilib: .;@m32
Selected multilib: .;@m32
 "/mnt/nvme_sec/SRC/llvm-project/build-repro/./bin/ld.lld" --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --build-id --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o cmTC_5b689 /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crt1.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtbegin.o -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8 -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib CMakeFiles/cmTC_5b689.dir/CMakeCCompilerABI.c.o /mnt/nvme_sec/SRC/llvm-project/build-repro/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a -lc /mnt/nvme_sec/SRC/llvm-project/build-repro/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtend.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crtn.o
ld.lld: error: relocation refers to a symbol in a discarded section: __x86.get_pc_thunk.bx
>>> defined in /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o
>>> referenced by /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o:(.init+0x5)
>>> referenced by /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o:(.fini+0x5)
>>> referenced by elf-init.oS:(__libc_csu_init) in archive /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/libc_nonshared.a
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

So it is indeed the same type of error mentioned by @vitalybuka . I tested your fix in 6be457c14dafd634989c2c0b702a9231b438e2c4 in our builder. but I am still seeing the same error: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8819586782274771361/overview . The CMakeError.log contains:

[2/2] Linking C executable cmTC_ace5f
FAILED: cmTC_ace5f
: && /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/./bin/clang --target=i386-unknown-linux-gnu --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --target=i386-unknown-linux-gnu -fuse-ld=lld -v CMakeFiles/cmTC_ace5f.dir/CMakeCCompilerABI.c.o -o cmTC_ace5f   && :
Fuchsia clang version 15.0.0 (git@github.com:llvm/llvm-project.git 6be457c14dafd634989c2c0b702a9231b438e2c4)
Target: i386-unknown-linux-gnu
Thread model: posix
InstalledDir: /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/./bin
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.8
Selected GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Candidate multilib: .;@m32
Selected multilib: .;@m32
 "/mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/./bin/ld.lld" --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --build-id --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o cmTC_ace5f /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crt1.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtbegin.o -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8 -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib CMakeFiles/cmTC_ace5f.dir/CMakeCCompilerABI.c.o /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a -lc /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtend.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crtn.o
ld.lld: error: duplicate symbol: __x86.get_pc_thunk.bx
>>> defined in /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o
>>> defined in /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/libc_nonshared.a(elf-init.oS)
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

Still complaining the same symbol but with a slightly different error: ld.lld: error: duplicate symbol: __x86.get_pc_thunk.bx. compared to earlier ld.lld: error: relocation refers to a symbol in a discarded section: __x86.get_pc_thunk.bx

Could you take a look?

We are seeing build failures in cmake config step for the clang runtime builds after this patch. Example of error messages:

Thanks for the report, but with Host compiler does not support '-fuse-ld=lld' it might be difficult to know for sure whether the culprit was in an lld change or a brittle configure.
Providing CMakeError.log or CMakeOutput.log with the linker diagnostic could make me more sure that it was related to this lld change.
Anycase, I think the issue was exactly the one for i386 sanitizers and it should have been fixed by the aforementioned commit.

I looked up my local CMakeError.log file for this patch and it contains:

[2/2] Linking C executable cmTC_5b689
FAILED: cmTC_5b689
: && /mnt/nvme_sec/SRC/llvm-project/build-repro/./bin/clang --target=i386-unknown-linux-gnu --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --target=i386-unknown-linux-gnu -fuse-ld=lld -v CMakeFiles/cmTC_5b689.dir/CMakeCCompilerABI.c.o -o cmTC_5b689   && :
Fuchsia clang version 15.0.0 (git@github.com:llvm/llvm-project.git c30e6447c0225f675773d07f2b6d4e3c2b962155)
Target: i386-unknown-linux-gnu
Thread model: posix
InstalledDir: /mnt/nvme_sec/SRC/llvm-project/build-repro/./bin
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.8
Selected GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Candidate multilib: .;@m32
Selected multilib: .;@m32
 "/mnt/nvme_sec/SRC/llvm-project/build-repro/./bin/ld.lld" --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --build-id --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o cmTC_5b689 /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crt1.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtbegin.o -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8 -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib CMakeFiles/cmTC_5b689.dir/CMakeCCompilerABI.c.o /mnt/nvme_sec/SRC/llvm-project/build-repro/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a -lc /mnt/nvme_sec/SRC/llvm-project/build-repro/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtend.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crtn.o
ld.lld: error: relocation refers to a symbol in a discarded section: __x86.get_pc_thunk.bx
>>> defined in /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o
>>> referenced by /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o:(.init+0x5)
>>> referenced by /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o:(.fini+0x5)
>>> referenced by elf-init.oS:(__libc_csu_init) in archive /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/libc_nonshared.a
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

So it is indeed the same type of error mentioned by @vitalybuka . I tested your fix in 6be457c14dafd634989c2c0b702a9231b438e2c4 in our builder. but I am still seeing the same error: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8819586782274771361/overview . The CMakeError.log contains:

[2/2] Linking C executable cmTC_ace5f
FAILED: cmTC_ace5f
: && /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/./bin/clang --target=i386-unknown-linux-gnu --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --target=i386-unknown-linux-gnu -fuse-ld=lld -v CMakeFiles/cmTC_ace5f.dir/CMakeCCompilerABI.c.o -o cmTC_ace5f   && :
Fuchsia clang version 15.0.0 (git@github.com:llvm/llvm-project.git 6be457c14dafd634989c2c0b702a9231b438e2c4)
Target: i386-unknown-linux-gnu
Thread model: posix
InstalledDir: /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/./bin
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Found candidate GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.8
Selected GCC installation: /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8
Candidate multilib: .;@m32
Selected multilib: .;@m32
 "/mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/./bin/ld.lld" --sysroot=/usr/local/google/home/haowei/SRC/fuchsia-sysroot --build-id --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o cmTC_ace5f /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crt1.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtbegin.o -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8 -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/lib -L/usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib CMakeFiles/cmTC_ace5f.dir/CMakeCCompilerABI.c.o /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a -lc /mnt/nvme_sec/SRC/llvm-project/build-repro-mitigate/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/gcc/i586-linux-gnu/4.8/crtend.o /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crtn.o
ld.lld: error: duplicate symbol: __x86.get_pc_thunk.bx
>>> defined in /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/crti.o
>>> defined in /usr/local/google/home/haowei/SRC/fuchsia-sysroot/usr/lib/i386-linux-gnu/libc_nonshared.a(elf-init.oS)
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

Still complaining the same symbol but with a slightly different error: ld.lld: error: duplicate symbol: __x86.get_pc_thunk.bx. compared to earlier ld.lld: error: relocation refers to a symbol in a discarded section: __x86.get_pc_thunk.bx

Could you take a look?

I just suppressed the error in c1d4c67718db88be48f451b91f7bddab8fff2424.

Can you provide i386-linux-gnu/crti.o and i386-linux-gnu/libc_nonshared.a(elf-init.oS)?
I need to understand the issue better to craft a better test. I believe the state before this patch worked for such symbols were accidental.

I just suppressed the error in c1d4c67718db88be48f451b91f7bddab8fff2424.

Can you provide i386-linux-gnu/crti.o and i386-linux-gnu/libc_nonshared.a(elf-init.oS)?
I need to understand the issue better to craft a better test. I believe the state before this patch worked for such symbols were accidental.

These files are available in archive https://chrome-infra-packages.appspot.com/p/fuchsia/third_party/sysroot/linux/+/tp7-Zyo4pv2SVEoK_eaU6yuKmyxJWcR54vtJKTWpTIYC under:

./usr/lib/i386-linux-gnu/crti.o
./usr/lib/i386-linux-gnu/libc_nonshared.a

The bots that failed the build used this exact revision.

haowei added a comment.EditedMar 15 2022, 6:35 PM

While c1d4c67718db88be48f451b91f7bddab8fff2424 clears the cmake error. I am now seeing runtimes-i386-unknown-linux-gnu build failures related to the same symbol:

[367/1127] Linking CXX shared library /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.ubsan_minimal.so
FAILED: /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.ubsan_minimal.so 
: && /b/s/w/ir/x/w/staging/llvm_build/./bin/clang++ --target=i386-unknown-linux-gnu --sysroot=/b/s/w/ir/x/w/cipd/linux -fPIC --target=i386-unknown-linux-gnu -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -ffile-prefix-map=/b/s/w/ir/x/w/staging/llvm_build/runtimes/runtimes-i386-unknown-linux-gnu-bins=../staging/llvm_build/runtimes/runtimes-i386-unknown-linux-gnu-bins -ffile-prefix-map=/b/s/w/ir/x/w/llvm-llvm-project/= -no-canonical-prefixes -Wall -std=c++14 -Wno-unused-parameter -O2 -g -DNDEBUG  -fuse-ld=lld  -Wl,-z,nodelete -fuse-ld=lld -Wl,--color-diagnostics   -nodefaultlibs -Wl,-z,text -nostdlib++ -shared -Wl,-soname,libclang_rt.ubsan_minimal.so -o /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.ubsan_minimal.so compiler-rt/lib/ubsan_minimal/CMakeFiles/RTUbsan_minimal.i386.dir/ubsan_minimal_handlers.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/b/s/w/ir/x/w/staging/llvm_build/./lib"  -lc  /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a && :
ld.lld: error: relocation R_386_PC32 cannot refer to absolute symbol: __x86.get_pc_thunk.bx
>>> defined in /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o
>>> referenced by /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o:(.init+0x5)

ld.lld: error: relocation R_386_PC32 cannot refer to absolute symbol: __x86.get_pc_thunk.bx
>>> defined in /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o
>>> referenced by /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o:(.fini+0x5)
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

The failed build is https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8819579887511161281/overview which was started after your patch was landed.

While c1d4c67718db88be48f451b91f7bddab8fff2424 clears the cmake error. I am now seeing runtimes-i386-unknown-linux-gnu build failures related to the same symbol:

[367/1127] Linking CXX shared library /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.ubsan_minimal.so
FAILED: /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.ubsan_minimal.so 
: && /b/s/w/ir/x/w/staging/llvm_build/./bin/clang++ --target=i386-unknown-linux-gnu --sysroot=/b/s/w/ir/x/w/cipd/linux -fPIC --target=i386-unknown-linux-gnu -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -ffile-prefix-map=/b/s/w/ir/x/w/staging/llvm_build/runtimes/runtimes-i386-unknown-linux-gnu-bins=../staging/llvm_build/runtimes/runtimes-i386-unknown-linux-gnu-bins -ffile-prefix-map=/b/s/w/ir/x/w/llvm-llvm-project/= -no-canonical-prefixes -Wall -std=c++14 -Wno-unused-parameter -O2 -g -DNDEBUG  -fuse-ld=lld  -Wl,-z,nodelete -fuse-ld=lld -Wl,--color-diagnostics   -nodefaultlibs -Wl,-z,text -nostdlib++ -shared -Wl,-soname,libclang_rt.ubsan_minimal.so -o /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.ubsan_minimal.so compiler-rt/lib/ubsan_minimal/CMakeFiles/RTUbsan_minimal.i386.dir/ubsan_minimal_handlers.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/b/s/w/ir/x/w/staging/llvm_build/./lib"  -lc  /b/s/w/ir/x/w/staging/llvm_build/lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.builtins.a && :
ld.lld: error: relocation R_386_PC32 cannot refer to absolute symbol: __x86.get_pc_thunk.bx
>>> defined in /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o
>>> referenced by /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o:(.init+0x5)

ld.lld: error: relocation R_386_PC32 cannot refer to absolute symbol: __x86.get_pc_thunk.bx
>>> defined in /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o
>>> referenced by /b/s/w/ir/x/w/cipd/linux/usr/lib/i386-linux-gnu/crti.o:(.fini+0x5)
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

The failed build is https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8819579887511161281/overview which was started after your patch was landed.

I just reverted the patch in 9b61fff0eb9332553ecdbe00e01e3d08ad392768 since I need to think a bit how to support __x86.get_pc_thunk.bx.
Thanks for the i386 sysroot. It allowed me to improve the test i386-linkonce.s to describe the case we have to work around more precisely.