This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Improve error message for relocations to symbols defined in discarded sections
ClosedPublic

Authored by MaskRay on Mar 21 2019, 9:18 AM.

Details

Summary

Rather than report "undefined symbol: ", give more informative message
about the object file that defines the discarded section.

In particular, PR41133, if the section is a discarded COMDAT, print the
section group signature and the object file with the prevailing
definition. This is useful to track down some ODR issues.

We need to

  • add uint32_t DiscardedSecIdx to Undefined for this feature.
  • make ComdatGroups public and change its type to DenseMap<CachedHashStringRef, const InputFile *>

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.Mar 21 2019, 9:18 AM

Thank you!

If I'm understanding correctly, for the case where the relocation is not in the same object file as the discarded section, we'll still get the "undefined symbol" error message. I don't know if it would be possible to enhance that case to print the "relocation against discarded section" message as well. (gold does print that message in this case; it just doesn't print the additional information about the section group signature and prevailing definition.)

ruiu added inline comments.Mar 21 2019, 1:11 PM
ELF/Relocations.cpp
698 ↗(On Diff #191716)

This is a big chunk of new code. Is there any way to factor out?

709–711 ↗(On Diff #191716)

I think the following message would be better:

relocation refers to a symbol in a discarded section: <symbol name here>
MaskRay updated this revision to Diff 191802.Mar 21 2019, 4:59 PM

Factor out maybeReportDiscardedComdat

MaskRay marked 2 inline comments as done.Mar 21 2019, 5:01 PM

Thank you!

If I'm understanding correctly, for the case where the relocation is not in the same object file as the discarded section, we'll still get the "undefined symbol" error message. I don't know if it would be possible to enhance that case to print the "relocation against discarded section" message as well. (gold does print that message in this case; it just doesn't print the additional information about the section group signature and prevailing definition.)

Symbol has the File field but not the section index (st_shndx). If the relocation is not in the same object as the discarded section, we can't get the defined section easily.

Thank you!

If I'm understanding correctly, for the case where the relocation is not in the same object file as the discarded section, we'll still get the "undefined symbol" error message. I don't know if it would be possible to enhance that case to print the "relocation against discarded section" message as well. (gold does print that message in this case; it just doesn't print the additional information about the section group signature and prevailing definition.)

Symbol has the File field but not the section index (st_shndx). If the relocation is not in the same object as the discarded section, we can't get the defined section easily.

Could we do so via symbol table lookup? Rui said in https://bugs.llvm.org/show_bug.cgi?id=41133 that "We already insert all symbols (including ones for discarded sections) to the symbol table".

Thank you!

If I'm understanding correctly, for the case where the relocation is not in the same object file as the discarded section, we'll still get the "undefined symbol" error message. I don't know if it would be possible to enhance that case to print the "relocation against discarded section" message as well. (gold does print that message in this case; it just doesn't print the additional information about the section group signature and prevailing definition.)

Symbol has the File field but not the section index (st_shndx). If the relocation is not in the same object as the discarded section, we can't get the defined section easily.

Could we do so via symbol table lookup? Rui said in https://bugs.llvm.org/show_bug.cgi?id=41133 that "We already insert all symbols (including ones for discarded sections) to the symbol table".

We do insert all symbols into the symbol table (InputFiles.cpp:ObjFile<ELFT>::initializeSymbols()), however, Symbol does not encode all fields of Elf_Sym. Symbol has several derived classes, among them, only Defined keeps st_shndx, Undefined doesn't. Symbols defined in discarded COMDAT sections are represented by Undefined so the section index is lost. I think I can probably add SectionBase *Sec to Undefined so that we do not have the limitation.

MaskRay updated this revision to Diff 191826.Mar 22 2019, 12:18 AM
MaskRay retitled this revision from [ELF] Improve error message for relocations to symbols defined in discarded COMDAT to [ELF] Improve error message for relocations to symbols defined in discarded sections.
MaskRay edited the summary of this revision. (Show Details)
MaskRay removed a reviewer: grimar.

generalize to SHF_EXCLUDE and other discarded sections

MaskRay updated this revision to Diff 191829.Mar 22 2019, 12:47 AM

Binding != STB_WEAK

grimar added inline comments.Mar 22 2019, 3:07 AM
ELF/Driver.cpp
1296 ↗(On Diff #191830)

I would use something like 0 /*DiscardedSecIdx*/ because it is not clear what is 0.
(here and below)

ELF/Relocations.cpp
682 ↗(On Diff #191830)

Early return?

684 ↗(On Diff #191830)

Use lookup?

691 ↗(On Diff #191830)

I.e. this code seems a bit better looking as:

// If the discarded section is a COMDAT.
Elf_Shdr_Impl<ELFT> ELFSec = ObjSections[Sym.DiscardedSecIdx - 1];
if (ELFSec.sh_type != SHT_GROUP)
  return Msg;

StringRef Signature = File->getShtGroupSignature(ObjSections, ELFSec);
if (const InputFile *InFile =
        Symtab->ComdatGroups.lookup(CachedHashStringRef(Signature)))
  Msg += "\n>>> section group signature: " + Signature.str() +
         "\n>>> prevailing definition is in " + toString(InFile);
return Msg;
707 ↗(On Diff #191830)

If you'll do -> std::string here, then ...

723 ↗(On Diff #191830)

... you can remove .str() here.

test/ELF/exclude-discarded-error2.s
9 ↗(On Diff #191830)

What about testing the full error message explicitly here?

MaskRay updated this revision to Diff 191882.Mar 22 2019, 8:07 AM
MaskRay marked 7 inline comments as done.
MaskRay edited the summary of this revision. (Show Details)

Address grimar's comments

MaskRay marked an inline comment as done.Mar 22 2019, 8:08 AM
MaskRay added inline comments.
ELF/Relocations.cpp
684 ↗(On Diff #191830)

Thank you! I didn't know the useful lookup...

MaskRay updated this revision to Diff 191883.Mar 22 2019, 8:11 AM

0 -> /*DiscardedSecIdx=*/0

I have no more comments, thanks!

ELF/Driver.cpp
1296 ↗(On Diff #191830)

Side note: Hmmm. We are not consistent. We use both /*DiscardedSecIdx*/ and /*DiscardedSecIdx=*/ forms across LLD.

MaskRay updated this revision to Diff 191917.Mar 22 2019, 11:39 AM
MaskRay marked an inline comment as done.

/*DiscardedSecIdx*/ 0 -> /*DiscardedSecIdx=*/0

MaskRay marked an inline comment as done.Mar 22 2019, 11:39 AM
MaskRay added inline comments.
ELF/Driver.cpp
1296 ↗(On Diff #191830)

Missed that case. Fixed!

Thank you! This does even better than gold now :)

Ping :) I had come across an ODR violation case (which took me quite a while to figure out the cause) before I noticed smeenai's report so I believe this improved error message will be helpful.

ruiu added a comment.Apr 7 2019, 11:08 PM

I'm honestly not very happy about the plumbing work in this patch, as the new value, DiscardedSecIdx, is not semantically required but handed over to many functions just to get to the location where the error message is generated.

I wonder if we can do this without that variable. When we need that variable, we are generating an error message, so the code doesn't have to be fast. I.e. you can scan through object file's symbol table to find an undefined symbol, for example. What do you think?

I wonder if we can do this without that variable. ... you can scan through object file's symbol table to find an undefined symbol, for example. What do you think?

I have thought about this option. I believe that will make code slower and more complex.

We use Undefined for two purposes: 1) undefined symbol 2) symbol in a discarded section (has only one occurrence: InputFiles.cpp: ObjFile<ELFT>::createSymbol). (I added some comments in this patch).

If we don't store Undefined::DiscardedSecIdx, we can iterate the symbol table of Symbol::File to figure out the section index and then check if the section is discarded. That would repeat some work done by ObjFile<ELFT>::createSymbol. Because we may emit many maybeReportUndefined errors, we don't want symbol table iterating to be too slow. We may end up with some cache. I think that is more complex than an extra DiscardedSecIdx.

If /*DiscardedSecIdx=0*/ looks too messy, it can be made a default argument.
That is simpler than having two member functions SymbolTable::addSymbolInDiscarded and SymbolTable::addUndefined.

MaskRay updated this revision to Diff 194104.Apr 8 2019, 1:43 AM

explict /*DiscardedSecIdx=0*/ -> default argument

Does the default argument DiscardedSecIdx look good?

MaskRay updated this revision to Diff 198257.May 6 2019, 6:11 AM

Remove template parameter class RelTy

MaskRay updated this revision to Diff 200506.May 21 2019, 7:30 AM

Rebase after recent SymbolTable changes. Combined with D61583, we will give good warnings for PR41693

ruiu added a comment.May 22 2019, 1:04 AM

Overall looking good.

Somewhat unrelated, but maybe we should clear ComdatGroups map once name resolution pass is done? COMDAT signatures are long strings, and discarding them from memory might save nontrivial amount of memory.

ELF/InputFiles.cpp
117 ↗(On Diff #200506)

Why did you move this to Symtab?

MaskRay marked an inline comment as done.May 22 2019, 1:33 AM

Overall looking good.

Somewhat unrelated, but maybe we should clear ComdatGroups map once name resolution pass is done? COMDAT signatures are long strings, and discarding them from memory might save nontrivial amount of memory.

Good point. I can do that in a separate change.

ELF/InputFiles.cpp
117 ↗(On Diff #200506)

Relocations.cpp needs to access the variable to check which section is prevailing.

It was originally in Symtab until you moved it to a function-local static variable.. It can also be a global variable if you prefer.

ruiu accepted this revision.May 22 2019, 1:39 AM

LGTM

This revision is now accepted and ready to land.May 22 2019, 1:39 AM
This revision was automatically updated to reflect the committed changes.
MaskRay marked an inline comment as done.May 22 2019, 4:33 AM
MaskRay added inline comments.
lld/trunk/ELF/Relocations.cpp
688

We may improve the search for SHT_GROUP. Probably some validation is required. Sym.DiscardedSecIdx - 1 may not find the belonged SHT_GROUP...