This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Demote lazy symbols relative to a discarded section to Undefined
ClosedPublic

Authored by MaskRay on May 2 2020, 6:56 PM.

Details

Summary

Fixes PR45594.

In ObjFile<ELFT>::initializeSymbols(), for a defined symbol relative to
a discarded section (due to section group rules), it may have been
inserted as a lazy symbol. We need to demote it to an Undefined to
enable the discarded section error happened in a later pass.

Add LazyObjFile::fetched (if true) and ArchiveFile::parsed (if false) to represent that there is an
ongoing lazy symbol fetch and we should replace the current lazy symbol
with an Undefined, instead of calling Symbol::resolve (the lazy symbol
was added by an unrelated archive/lazy object).

As a side result, one small issue in start-lib-comdat.s is now fixed.
The hack motivating D51892 will be unsupported: if
.gnu.linkonce.t.__i686.get_pc_thunk.bx in an archive is referenced
by another section, this will likely be errored unless the function is
also defined in a regular object file.
(Bringing back rL330869 would error undefined symbol instead of the
more relevant discarded section.)

Note, glibc i386's crti.o still works (PR31215), because
.gnu.linkonce.t.__x86.get_pc_thunk.bx is in crti.o (one of the first
regular object files in a linker command line).

Diff Detail

Event Timeline

MaskRay created this revision.May 2 2020, 6:56 PM
MaskRay marked an inline comment as done.May 2 2020, 6:58 PM
MaskRay added inline comments.
lld/ELF/InputFiles.cpp
1645

s/has been processed/is being fetched/

grimar added inline comments.May 12 2020, 6:12 AM
lld/ELF/InputFiles.cpp
1125

I am thinking about adding some flag like isFetched or alike to InputFile instead of using symbols.empty() this and in other places.
(Or perhaps only for ArchiveFile/LazyObjFile`)
It should look cleaner and avoid hacks like push_back(nullptr). What do you think?

1647

You could use std::swap(file->symbols, symbols) instead I think.

lld/test/ELF/comdat-discarded-lazy.s
12

I think it should mention somewhere why it is ordered before/after.
I.e. that symbols are normally ordered by name in the symbol table.

19

Perhaps AA->BEFORE, ZZ->AFTER

MaskRay marked 5 inline comments as done.May 12 2020, 5:11 PM
MaskRay added inline comments.
lld/ELF/InputFiles.cpp
1125

Thanks. I agree that the usage is subtle. Added a member to ArchiveFile and LazyObjFile, respectively.

lld/test/ELF/comdat-discarded-lazy.s
19

It is not clear what the subject and object are when I see BEFORE or AFTER. I will just stick with AA ZZ

MaskRay updated this revision to Diff 263578.May 12 2020, 5:16 PM
MaskRay marked an inline comment as done.
MaskRay edited the summary of this revision. (Show Details)

Address comments

ruiu added a comment.May 12 2020, 10:59 PM

I wonder in what condition this error occurs. In particular, can you trigger this with standard-compliant object files?

I wonder in what condition this error occurs. In particular, can you trigger this with standard-compliant object files?

Compiled code may not cause such object files because their use of section groups has some patterns. It is not common to see more than one global symbol in a section group.

As @rprichard described in https://bugs.llvm.org//show_bug.cgi?id=45594 , this issue happened with some handwritten assembly files.

psmith added inline comments.May 13 2020, 3:54 AM
lld/ELF/InputFiles.cpp
1125

It took me a while to understand what was going on here. Some suggestions

ArchiveFile::parsed or !LazyObjFile::fetched means that the file containing this object has not finished processing, i.e. this symbol is a result of a lazy symbol fetch. We should demote the lazy symbol to an Undefined so that any relocations to it will trigger a discarded section error.
lld/test/ELF/comdat-discarded-lazy.s
13

Can we mention that the symbols are ordered lexically. Something like

The test relies on the symbol table order of llvm-mc (lexical), which will need ...
MaskRay updated this revision to Diff 263745.May 13 2020, 9:13 AM
MaskRay marked 2 inline comments as done.

Update comments

Thanks for the updates, I don't have any more comments. Happy if George is happy.

ruiu added a comment.May 13 2020, 7:42 PM

There might be a better way of doing this. As far as I know, directly referring a non-group-leader symbol is not allowed, so the problem we have here is to find an object file that violates the spec. Do you think you can directly implement that logic? I mean, when a symbol is being resolved, we can check if a symbol is directly referring a group internal symbol, and if that's the case, we can emit an error as soon as such a bad symbol use is found.

There might be a better way of doing this. As far as I know, directly referring a non-group-leader symbol is not allowed, so the problem we have here is to find an object file that violates the spec. Do you think you can directly implement that logic? I mean, when a symbol is being resolved, we can check if a symbol is directly referring a group internal symbol, and if that's the case, we can emit an error as soon as such a bad symbol use is found.

This approach is the best I can find.

I have thought about restoring rL330869. It can give us an error, but not a specific error like "relocations refer to a symbol in a discarded section". So it is inferior.

Our current symbol resolution errs that some symbols are incorrectly considered Lazy while should be Undefined. This patch fixes the bug. More precisely, some symbols which should be considered undefined according to the spec:

A symbol table entry with STB_GLOBAL or STB_WEAK binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be converted to an undefined symbol (its section index must be changed to SHN_UNDEF) if the group members are discarded. References to this symbol table entry from outside the group are allowed.

lld/ELF/InputFiles.cpp
1125

Thanks the suggestion! Adapted.

Ping.. I think we should do this.

psmith accepted this revision.Jun 8 2020, 2:37 PM

LGTM, I don't have a better idea.

As I understand it, I don't think that we can detect this case when reading the object. As these symbols are global they could be referred to from outside the group, even another object file (references to a global from outside a group are permitted in ELF). If we scan the relocations in the object we can detect references to the discarded symbol from the object, but not the other objects. We'd be able to detect all relocations to local symbols.

This revision is now accepted and ready to land.Jun 8 2020, 2:37 PM
MaskRay edited the summary of this revision. (Show Details)Jun 9 2020, 9:48 AM
This revision was automatically updated to reflect the committed changes.