This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Show error location for 'undefined symbol' errors
ClosedPublic

Authored by evgeny777 on Oct 20 2016, 7:10 AM.

Details

Summary

The 'undefined symbol' is likely most common linker error, so linker typically show some location information
to make identifying problem source easier. With this patch lld will show following information about error source:

  • Source file name in case STT_FILE symbol is present in object file
  • Function name, if it can be retrieved.
  • section name + offset

In addition to this ld/gold can show line number if debug info is present. This is the next thing I plan to do.

Diff Detail

Repository
rL LLVM

Event Timeline

evgeny777 updated this revision to Diff 75294.Oct 20 2016, 7:10 AM
evgeny777 retitled this revision from to [ELF] Show error location for 'undefined symbol' errors.
evgeny777 updated this object.
evgeny777 added reviewers: ruiu, rafael.
evgeny777 set the repository for this revision to rL LLVM.
evgeny777 added a project: lld.
evgeny777 added subscribers: ikudrin, grimar, llvm-commits.
evgeny777 updated this revision to Diff 75298.Oct 20 2016, 7:57 AM
evgeny777 removed rL LLVM as the repository for this revision.

Fixed memory corruption

grimar added inline comments.Oct 20 2016, 8:10 AM
ELF/InputFiles.cpp
411 ↗(On Diff #75298)

May be

if (!D || D->Section != S)
ELF/Relocations.cpp
543 ↗(On Diff #75298)

Looks you do not need .str() here.

evgeny777 added inline comments.Oct 20 2016, 8:13 AM
ELF/Relocations.cpp
543 ↗(On Diff #75298)

Yes, I do. The type of expression is llvm::Twine because of S.Name and Twine::utohexstr(Offset)

ruiu edited edge metadata.Oct 20 2016, 11:47 AM

Thank you for doing this. This would be very useful.

ELF/Error.cpp
42 ↗(On Diff #75298)

I think I'm inclined to showing ARGV[0] even for undefined symbol errors to keep them consistent with other linker-issued errors. I do not see a reason to not do except for making error lines a bit shorter.

ELF/InputFiles.cpp
405 ↗(On Diff #75298)

A brief comment would be appreciated. Please note that this is not an optimized function but it is okay because this is for creating an error message.

407 ↗(On Diff #75298)

This function can be implemented as a non-member function, no? Non-member functions are preferred over member functions if they access only public members.

409–411 ↗(On Diff #75298)
for (...)
   if (auto *D = ...)
     if (D->Value <= ...)
        return D;
return nullptr;
ELF/InputFiles.h
178–179 ↗(On Diff #75298)

I'd make SourceFile a public member variable and remove the accessor, as the accessor doesn't contain a logic.

182 ↗(On Diff #75298)

Probably a personal taste, but how about getSymbolAt?

evgeny777 added inline comments.Oct 20 2016, 11:57 AM
ELF/Error.cpp
42 ↗(On Diff #75298)

No problem, but you'll get both path to linker and path to file in a single error message. Isn't this confusing? BTW ld and gold don't show argv0 for this error type.

ruiu added inline comments.Oct 20 2016, 12:56 PM
ELF/Error.cpp
42 ↗(On Diff #75298)

Clang doesn't seem to print out a full path but instead of the last path component (so it prints out "clang-4.0" instead of "/foo/bar/clang-4.0"). We might want to do the same. But for now I think we want to handle all errors in the same way. We can tweak later if we want it.

evgeny777 added inline comments.Oct 21 2016, 2:07 AM
ELF/InputFiles.cpp
407 ↗(On Diff #75298)

No, it can't. It iterates SymbolBodies, which is member variable.

evgeny777 updated this revision to Diff 75400.Oct 21 2016, 2:53 AM
evgeny777 edited edge metadata.

Addressed review comments

evgeny777 updated this revision to Diff 75427.Oct 21 2016, 7:43 AM

Added debugging information parser. See undef.s and undef-debug.s for test case.
Let me know if separate review required for this.

grimar added inline comments.Oct 21 2016, 8:04 AM
ELF/InputFiles.cpp
61 ↗(On Diff #75427)

I wonder if this class can be not templated.
You probably can pass uint64_t Addr and DILineInfo &LineInfo only.

evgeny777 added inline comments.Oct 21 2016, 8:11 AM
ELF/InputFiles.cpp
61 ↗(On Diff #75427)

I was thinking about this also, but I need ObjectFile<ELFT> to construct object::ObjectFile. Also I need to know endianess and pointer size.

evgeny777 updated this revision to Diff 75437.Oct 21 2016, 8:43 AM

Code cleanups

ruiu added inline comments.Oct 21 2016, 11:01 AM
ELF/InputFiles.cpp
407 ↗(On Diff #75298)

But you can use ObjectFile::getSymbols() instead, no?

evgeny777 added inline comments.Oct 21 2016, 11:05 AM
ELF/InputFiles.cpp
407 ↗(On Diff #75298)

Makes sense. May be also move it to Relocations.cpp?

ruiu added inline comments.Oct 21 2016, 11:06 AM
ELF/InputFiles.cpp
407 ↗(On Diff #75298)

Yeah, I'd do that and make this a static function.

ruiu added inline comments.Oct 21 2016, 12:43 PM
ELF/InputFiles.h
51 ↗(On Diff #75437)

I think this is too smart, at least as the initial implementation. This function is run only when an error occurs, and for error cases, we don't pursue performance. Instead I'd like to keep it simple as possible. So, could you please remove this class and define a non-member function? I think it is okay to parse debug info for each error message.

ELF/Relocations.cpp
569–571 ↗(On Diff #75437)

Now that you can just concatenate strings before passing to the functions.

warn(Location + ": " + Msg);
error(Location + ": " + Msg);

So I think you don't need to add new overloaded functions to error.cpp.

evgeny777 added inline comments.Oct 21 2016, 1:46 PM
ELF/Relocations.cpp
569–571 ↗(On Diff #75437)

Actually location and message are separated by message type, i.e:

<Location> : warning : <Msg>
ruiu added inline comments.Oct 21 2016, 2:18 PM
ELF/Relocations.cpp
569–571 ↗(On Diff #75437)

Did you have a reason to break consistency that way here?

evgeny777 added inline comments.Oct 22 2016, 2:08 AM
ELF/InputFiles.h
51 ↗(On Diff #75437)

could you please remove this class

I really really wouldn't like to do this. The main reason is that LLVM DWARF parser is very slow (main reason why lldb is so slow). If you want the numbers, here they are:

command (reports about 3K errors):

ld.lld LoopConvertCheck.cpp.o -o a.out

time w/o caching: 3m31.165s
time with caching: 0m1.659s

This is release build of lld on i7-2600K / SSD
Debug build had worked over 15 minutes, before I terminated it with Ctrl-C

May be it's better to simplify DIHelper somehow?

evgeny777 updated this revision to Diff 75682.Oct 25 2016, 4:26 AM

Addressed review comments

rafael accepted this revision.Oct 25 2016, 5:34 AM
rafael edited edge metadata.

This is awesome.

Fine by me with a nit, but please wait for Rui to LGTM too.

ELF/InputFiles.cpp
65 ↗(On Diff #75682)

Can you move the call to getLineTable to the constructor and cache LineTbl?

This revision is now accepted and ready to land.Oct 25 2016, 5:34 AM
evgeny777 added inline comments.Oct 25 2016, 5:40 AM
ELF/InputFiles.cpp
65 ↗(On Diff #75682)

Yes, it's possible to make LineTbl a member variable (together with DwarfLine), but getLineTable(Offset) is actually very fast. It's a single lookup in std::map with one element.

ruiu accepted this revision.Oct 25 2016, 10:41 AM
ruiu edited edge metadata.

LGTM with nits.

ELF/InputFiles.cpp
47 ↗(On Diff #75682)

lineData -> LineData

56 ↗(On Diff #75682)

Use unique_ptr to remove this dtor.

59–60 ↗(On Diff #75682)

nit: move them after if (!DrawfLine) return "" so that we won't instantiate them if we return early.

ELF/Relocations.cpp
574–575 ↗(On Diff #75682)
std::string Msg = getLocation(...) + ": undefined symbol: '" + ...
evgeny777 added inline comments.Oct 25 2016, 11:08 AM
ELF/InputFiles.cpp
56 ↗(On Diff #75682)

It's not possible unless you include DWARFContext.h to InputFiles.h or override unique_ptr deleter (second template parameter). Is this better?

This revision was automatically updated to reflect the committed changes.