This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Fixed crashes when linking with incompatible-arch archives/
ClosedPublic

Authored by oontvoo on Jul 27 2023, 11:05 AM.

Details

Summary
   
Two changes:
 - Avoid crashing in predicate functions.
   Querying the property of the Symbols via these is*() functions shouldn't crash the program - the answer should just be "false".
   Currently, having them throw UNREACHABLE already (incorrectly) crash certain code paths involving macho::validateSymbolRelocation() .

 - Simply ignore input archives with incompatible arch (changes from PRESIDENT810@)

Diff Detail

Event Timeline

oontvoo created this revision.Jul 27 2023, 11:05 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 27 2023, 11:05 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
oontvoo requested review of this revision.Jul 27 2023, 11:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 11:05 AM
pirama added a subscriber: pirama.Jul 27 2023, 1:12 PM

Test?

https://github.com/llvm/llvm-project/issues/56386 has a reproducer that could help with the test.

oontvoo updated this revision to Diff 545765.Jul 31 2023, 12:46 PM

updated diff

oontvoo retitled this revision from [lld-macho]Avoid crashing in predicate functions. to [lld-macho] Fixed crashes when linking with incompatible-arch archives/.Jul 31 2023, 12:46 PM
oontvoo edited the summary of this revision. (Show Details)
oontvoo updated this revision to Diff 545774.Jul 31 2023, 1:12 PM
oontvoo edited the summary of this revision. (Show Details)

got rid of TODO in test

oontvoo updated this revision to Diff 546146.Aug 1 2023, 10:57 AM

updated tests

@lld-macho, @thakis, @int3, et al, anyone got a few mins to review this? Thanks!

MaskRay added a subscriber: MaskRay.Aug 3 2023, 8:34 AM
MaskRay added inline comments.
lld/test/MachO/ignore-incompat-arch.s
35

Replace cat command with --input-file=%t/x86_a.map

71

it'd be slightly better to use a call instead of a data relocation.

73

delete blank line

lld/test/MachO/objc.s
28 ↗(On Diff #546146)

-NOT: {{.}} is slightly better than -EMPTY:. The former tests that there is no extra line while -EMPTY: may match an empty line.

oontvoo marked 3 inline comments as done.Aug 3 2023, 10:32 AM
oontvoo added inline comments.
lld/test/MachO/ignore-incompat-arch.s
71

Originally I'd used the normal call[q] but llvm-mc choked when compiling for arm64 and I didn't want to have two separate input for x86 vs arm64.
So I'd figured .quad is more portable - acceptable for both archs

Can you suggest an alternative?

oontvoo updated this revision to Diff 546935.Aug 3 2023, 10:32 AM

updated tests

MaskRay added inline comments.Aug 3 2023, 10:58 AM
lld/test/MachO/ignore-incompat-arch.s
71

Ah, ok, for portability, .quad looks good.

MaskRay added inline comments.Aug 3 2023, 11:24 AM
lld/MachO/InputFiles.cpp
206

return checkCompatibility(file);

1042

if (!compatArch) cannot be triggered. Delete it.

1044

compatArch seems no longer read after the store. Can compatArch be made ArchiveFile specific?

2115

It seems that if (!mbOrErr) can be checked first, then use else if (!err)

2131

consumeError

lld/test/MachO/ignore-incompat-arch.s
15

to prevent a warning from llvm-ar

The summary seems to have an unneeded 4-space indentation.

  • Simply ignore input archives with incompatible arch (changes from PRESIDENT810@)

What is PRESIDENT810@?

oontvoo updated this revision to Diff 546973.Aug 3 2023, 12:28 PM
oontvoo marked 5 inline comments as done.

addressed review comments

The summary seems to have an unneeded 4-space indentation.

  • Simply ignore input archives with incompatible arch (changes from PRESIDENT810@)

What is PRESIDENT810@?

Sorry, I meant the idea of peek at the first member in the archive to check its arch in addLazySymbol is taken from this commit https://github.com/PRESIDENT810/llvm-project/commit/3062d0392a7ac2d3a77a303c6af7272dbdac9db6 by PRESIDENT810

lld/MachO/InputFiles.cpp
1042

I seem to have missed the TODOs on propagating this from the parent archive to its children so we don't need to parse them - implemented that now.

1044

As mentioned in previous comment, this compatibility flag can be propagated from a parent archive to its children so the flag should be present for all InputFile

MaskRay accepted this revision as: MaskRay.Aug 3 2023, 1:17 PM

LGTM.

This revision is now accepted and ready to land.Aug 3 2023, 1:17 PM

The summary seems to have an unneeded 4-space indentation.

  • Simply ignore input archives with incompatible arch (changes from PRESIDENT810@)

What is PRESIDENT810@?

Sorry, I meant the idea of peek at the first member in the archive to check its arch in addLazySymbol is taken from this commit https://github.com/PRESIDENT810/llvm-project/commit/3062d0392a7ac2d3a77a303c6af7272dbdac9db6 by PRESIDENT810

You may want to use https://github.com/PRESIDENT810 . It's unclear that PRESIDENT810 is a username, at least to me...

gulfem added a subscriber: gulfem.Aug 7 2023, 5:34 PM

We started seeing a segmentation fault after this patch in our mac builders.

[12554/40876](1) RUST host_x64/netstack3_core_instrumented_lib_test host_x64/exe.unstripped/netstack3_core_instrumented_lib_test
FAILED: host_x64/netstack3_core_instrumented_lib_test host_x64/exe.unstripped/netstack3_core_instrumented_lib_test 
          PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
          Stack dump:
           #0 0x0000000107d536d8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1057e16d8)
           #1 0x0000000107d515c9 llvm::sys::RunSignalHandlers() (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1057df5c9)
           #2 0x0000000107d53d66 SignalHandler(int) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1057e1d66)
           #3 0x00007fff70b215fd (/usr/lib/system/libsystem_platform.dylib+0x7fff673fc5fd)
           #4 0x0000000000000000 
           #5 0x00000001077bf1d2 llvm::object::ArchiveMemberHeader::getSize() const (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10524d1d2)
           #6 0x00000001077c09cf llvm::object::Archive::Child::getName() const (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10524e9cf)
           #7 0x00000001077c149e llvm::object::Archive::Child::getMemoryBufferRef() const (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10524f49e)
           #8 0x0000000104e9cf7b lld::macho::ArchiveFile::addLazySymbols() (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10292af7b)
           #9 0x0000000104e835c6 addFile(llvm::StringRef, LoadType, bool, bool, bool, bool) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1029115c6)
          #10 0x0000000104e80852 lld::macho::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10290e852)
          #11 0x000000010537f33f lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x102e0d33f)
          #12 0x000000010263af99 lld_main(int, char**, llvm::ToolContext const&) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1000c8f99)
          #13 0x00000001028474bb findTool(int, char**, char const*) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1002d54bb)
          #14 0x00007fff70928cc9 (/usr/lib/system/libdyld.dylib+0x7fff67203cc9)
          #15 0x000000000000009f 
          clang++: error: unable to execute command: Segmentation fault: 11
          clang++: error: linker command failed due to signal (use -v to see invocation)

https://ci.chromium.org/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.x64-host_test_only-mac-subbuild/b8773677091545805297/overview

We started seeing a segmentation fault after this patch in our mac builders.

[12554/40876](1) RUST host_x64/netstack3_core_instrumented_lib_test host_x64/exe.unstripped/netstack3_core_instrumented_lib_test
FAILED: host_x64/netstack3_core_instrumented_lib_test host_x64/exe.unstripped/netstack3_core_instrumented_lib_test 
          PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
          Stack dump:
           #0 0x0000000107d536d8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1057e16d8)
           #1 0x0000000107d515c9 llvm::sys::RunSignalHandlers() (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1057df5c9)
           #2 0x0000000107d53d66 SignalHandler(int) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1057e1d66)
           #3 0x00007fff70b215fd (/usr/lib/system/libsystem_platform.dylib+0x7fff673fc5fd)
           #4 0x0000000000000000 
           #5 0x00000001077bf1d2 llvm::object::ArchiveMemberHeader::getSize() const (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10524d1d2)
           #6 0x00000001077c09cf llvm::object::Archive::Child::getName() const (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10524e9cf)
           #7 0x00000001077c149e llvm::object::Archive::Child::getMemoryBufferRef() const (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10524f49e)
           #8 0x0000000104e9cf7b lld::macho::ArchiveFile::addLazySymbols() (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10292af7b)
           #9 0x0000000104e835c6 addFile(llvm::StringRef, LoadType, bool, bool, bool, bool) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1029115c6)
          #10 0x0000000104e80852 lld::macho::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x10290e852)
          #11 0x000000010537f33f lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x102e0d33f)
          #12 0x000000010263af99 lld_main(int, char**, llvm::ToolContext const&) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1000c8f99)
          #13 0x00000001028474bb findTool(int, char**, char const*) (/opt/s/w/ir/x/w/fuchsia/out/not-default/../../prebuilt/third_party/clang/mac-x64/bin/ld64.lld+0x1002d54bb)
          #14 0x00007fff70928cc9 (/usr/lib/system/libdyld.dylib+0x7fff67203cc9)
          #15 0x000000000000009f 
          clang++: error: unable to execute command: Segmentation fault: 11
          clang++: error: linker command failed due to signal (use -v to see invocation)

https://ci.chromium.org/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.x64-host_test_only-mac-subbuild/b8773677091545805297/overview

Yes, this is caused by the issue described in https://reviews.llvm.org/D157300

Can we revert this for now then? Trunk should be in a known-working state as much as possible.

Can we revert this for now then? Trunk should be in a known-working state as much as possible.

The issue + fix are already in D157300. Can we not get that submitted instead? (it's also fairly trivial)

thakis added a comment.Aug 9 2023, 1:21 PM

Your fix is in now, so that's ok, but there was a 2-day window between things being known-bad and the fix landing. This causes needless work (see e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=1471633). Reverting is usually much faster than fixing, so please default to revert-first to keep trunk green (unless you have a fix that's ready to land as quickly as the revert).

gulfem added a comment.Aug 9 2023, 2:34 PM

Your fix is in now, so that's ok, but there was a 2-day window between things being known-bad and the fix landing. This causes needless work (see e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=1471633). Reverting is usually much faster than fixing, so please default to revert-first to keep trunk green (unless you have a fix that's ready to land as quickly as the revert).

I confirm that https://reviews.llvm.org/D157300 fixed the issue that I reported. Reverting and relanding with a fix can save a lot time engineering time especially to toolchain maintainers when there is a confirmed issue. When a builder is broken for a few days, upcoming breakages might be masked because of that, and this makes it more difficult to bisect the new issues.

Your fix is in now, so that's ok, but there was a 2-day window between things being known-bad and the fix landing. This causes needless work (see e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=1471633). Reverting is usually much faster than fixing, so please default to revert-first to keep trunk green (unless you have a fix that's ready to land as quickly as the revert).

I confirm that https://reviews.llvm.org/D157300 fixed the issue that I reported. Reverting and relanding with a fix can save a lot time engineering time especially to toolchain maintainers when there is a confirmed issue. When a builder is broken for a few days, upcoming breakages might be masked because of that, and this makes it more difficult to bisect the new issues.

I understand. Sorry about this! It was breaking our internal builds too but I'd underestimated the time it took to fix-forward. Will watch out in the future.