Because of https://llvm.org/PR50675, we can end up producing a PLT
relocation referencing a symbol that's dropped from the dynamic symbol
table, which in turn causes a crash at runtime. We ran into this again
recently, resulting in crashes for our users. A subsequent diff will fix
that issue, but add an assert to catch it if it happens again.
Details
- Reviewers
MaskRay - Commits
- rG610a0e8b5368: [ELF] Assert on invalid GOT or PLT relocations
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I haven't read other patches yet. If the issue only happens due to a bug, I think guarding in #ifndef NDEBUG is probably more appropriate and goes in line with checkDynamicRelocsDefault.
lld/ELF/SyntheticSections.cpp | ||
---|---|---|
1591–1592 | if (!needsDynSymIndex()) return 0; |
Yup, this is happening due to a bug which is fixed by the next patch.
By guarding with #ifndef NDEBUG, did you mean keeping the current error and guarding it, or just switching to an assert? In either case, should I keep the test in this patch? (The next patch which fixes the bug has to get rid of the test anyway, because I couldn't think of another way to trigger this bug.)
This assertion triggers when building the Linux kernel for arm64:
$ make -skj"$(nproc)" ARCH=arm64 LLVM=1 mrproper defconfig all ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. ld.lld: /home/nathan/cbl/src/llvm-project/lld/ELF/SyntheticSections.cpp:1597: uint32_t lld::elf::DynamicReloc::getSymIndex(lld::elf::SymbolTableBaseSection *) const: Assertion `(index != 0 || type != target->gotRel && type != target->pltRel) && "GOT or PLT relocation must refer to symbol in dynamic symbol table"' failed. scripts/link-vmlinux.sh: line 164: 2580614 Aborted (core dumped) ${ld} ${ldflags} -o ${output} ${wl}--whole-archive ${objs} ${wl}--no-whole-archive ${wl}--start-group ${libs} ${wl}--end-group $@ ${ldlibs} make[1]: *** [Makefile:1158: vmlinux] Error 134
What can I provide to help investigate this further?
Hmm, this is interesting. I think the assertion should only fire when the linker would have otherwise produced an invalid binary, but maybe there's something in the Linux kernel which would make this work.
Could you create a tarball with --reproduce and upload that somewhere? Feel free to revert this in the meantime if it's blocking you.
It looks like Fangrui already fixed this with f4a3569d0ad61907692fe10b1ffe1fa7763e9c26, thanks a lot! I can confirm that ARCH=arm64 defconfig and allmodconfig build fine now. I will test other architectures tonight and report back if there are any other issues I uncover.