Without this patch, lld emits "error: undefined symbol: _start" if it encountered only weak references to that symbol.
Details
Diff Detail
Event Timeline
Is there any practical reason to do this? A system having _start as a weak symbol looks a bit odd.
We came across the issue in our environment, where an ELF linker is used only as an intermediate step and the value of entry field in the header is ignored. But in one of the libraries, we have a weak reference to _start, because sometimes this symbol might be defined for a special purpose. I can guess that for some other embedded environments and third-party libraries the same situation might take place.
ELF/Driver.cpp | ||
---|---|---|
1038 | What the comment describing this code says is that we do this to load a start symbol from a static library. So maybe it is better. if (SymbolBody *B = Symtab->find(Config->Entry)) if (B->isLazy()) Symtab->addUndefined<ELFT>(Config->Entry); |
Yes, this variant is more straightforward. However, as addUndefined() implicitly marks the symbol to be preserved by LTO, we now have to set that flag directly; overwise, many tests in the lto test-suite fail. Moreover, without that flag, the entry field in the ELF header is not filled correctly in case of LTO because symbol _start is missing.
ELF/Driver.cpp | ||
---|---|---|
1039 | It seems a bit odd because we are not doing this for symbols given by the -undefined option. Is this a bug of -undefined, or don't we need it here as well? |
ELF/Driver.cpp | ||
---|---|---|
1039 | I can't tell about "-u", but if we miss this for the entry symbols, and it is defined in a BC file, we won't be able to fill the entry field in the ELF header correctly. |
It looks like a symbol specified by -e can be handled as if it were specified by -u. However, it won't work because I believe -u doesn't keep symbols over LTO. So, if -u has the same bug, then it is probably better to fix it and then add a symbol specified by -e to Config->Undefined.
Did you mean something like this?
I am inclined not to mix an entry symbol into Config->Undefined in order to preserve clarity.
ELF/SymbolTable.cpp | ||
---|---|---|
587 | Unfortunately, we should set the flag for regular symbols, too. If it is set only for lazy symbols, lots of tests in LTO suit fail: lld :: ELF/lto/archive-2.ll lld :: ELF/lto/codemodel.ll lld :: ELF/lto/combined-lto-object-name.ll lld :: ELF/lto/dynamic-list.ll lld :: ELF/lto/internalize-basic.ll lld :: ELF/lto/lto-start.ll lld :: ELF/lto/opt-level.ll lld :: ELF/lto/parallel-internalize.ll lld :: ELF/lto/shlib-undefined.ll lld :: ELF/lto/thin-archivecollision.ll lld :: ELF/lto/undefined-puts.ll Some of them might be fixed in tests themselves, but others show why regular symbols should be marked. See ELF/lto/lto-start.ll for example. |
ELF/SymbolTable.cpp | ||
---|---|---|
587 | But still, setting IsUsedInRegularObj to true even if B is Undefined seems a bit odd. ("An undefined/lazy symbol that is used in regular obj" sounds like an odd concept.) |
ELF/SymbolTable.cpp | ||
---|---|---|
587 | Do you mean something like if (!B->isUndefined()) B->symbol()->IsUsedInRegularObj = true; or even if (!B->isUndefined() && B->symbol()->File && B->symbol()->File->kind() == InputFile::BitcodeKind) B->symbol()->IsUsedInRegularObj = true; ? |
ELF/SymbolTable.cpp | ||
---|---|---|
599–600 | This changes the semantics of the -u option because the new code sets IsUsedInRegularObj to true. And I believe that's not a desired change. |
I assumed that here...
I believe -u doesn't keep symbols over LTO. So, if -u has the same bug, then it is probably better to fix it
... you suggested that --undefined symbols should be preserved during LTO, as well as the entry symbol. Am I right?
In any case, the entry symbol should survive LTO; otherwise, we won't be able to fill the entry field in the header.
... you suggested that --undefined symbols should be preserved during LTO, as well as the entry symbol. Am I right?
Oh sorry, I forgot about my comment! But regardless of its correctness, it should be done in a separate patch, as this patch doesn't have any test for that change.
What the comment describing this code says is that we do this to load a start symbol from a static library. So maybe it is better.