This fixes PR42407.
Details
- Reviewers
ruiu rnk ychen MaskRay jdoerfert - Commits
- rG1d06d48bb346: [LLD] [COFF] Resolve source locations for undefined references using dwarf
rLLD372843: [LLD] [COFF] Resolve source locations for undefined references using dwarf
rL372843: [LLD] [COFF] Resolve source locations for undefined references using dwarf
Diff Detail
Event Timeline
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
15 | clang -S output is verbose. Many directives in this file are not significant and they should be deleted to improve readability. The relocations are not resolved. Does that cause any problem? |
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
15 |
Yes, I guess I could trim out some. This is also built with -O0 to make sure the inline function ends up in a separate comdat (.section .text$_ZN1A5afuncEv below) to test that resolving references in different sections work, but I guess I could use some optimization with -fno-inline or something similar to keep them separate. However, I'm a little weary of doing too much manual cleanups of it, if it would make the debug info mismatch the actual code. I'm not very proficient in DWARF so I don't know for sure exactly what needs to match what here, and which of the debug sections could be removed without breaking it.
Um, what do you mean? The errors about undefined symbols? That's the whole point of the test; to have undefined references, that we try to resolve back to source line numbers using the debug info. |
Shrunk the test file a little by generating it using -Os -fno-exceptions -fno-unwind-tables -fno-inline.
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
15 | Can you take a look at test/ELF/debug-line-str.s and test/ELF/debug-line-obj.s? They are still a bit verbose in my opinion but is much conciser than this file.. .debug* sections (DWARF) in object files contains relocations (e.g. .secrel32 .Linfo_string1). They have to be resolved so that they can be symbolized by lib/DebugInfo/Symbolize. In the ELF port of lld, ELF/DWARF.cpp provides the relocation resolver functionality. Does COFF need a counterpart? To be honest I know very little about COFF. I'll make some investigation into this. Meanwhile, can you test this patch on some larger programs, DLLs with multiple object files, etc? |
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
15 | There are a few directives left I could strip out, and apparently it's ok to remove most of the .debug_str section, and some comments. With that I don't think I would be too far off from the ELF debug-line tests. My test case has two text sections, just like debug-line-obj.s, but I guess I could try to transcribe that one to COFF for a smaller testcase of the same situation. I tested it on a larg DLL (with lots of object files) and seems to work fine there as well. (I don't see how the number of object files would matter though as only one object file is given to Symbolizer at a time.) However it doesn't seem to work with debug info generated by GCC, so I guess I need to look into what's differing there. The libObject RelocationResolver.cpp seems to be involved to some part for resolving relocations at least, but I'm not sure if that's enough for handling it in general, or if something like what is done for ELF is needed here. |
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
15 |
This turned out to just be me missing to pass the mingw flag to the linker when I tested it; it seems to work just as well with dwarf debug info from GCC.
As far as I can see when reading through the code, this is supposed to just work, thanks to getRelocationResolver(Obj), which for COFF handles .secrel32 relocations. |
Reduced the size of the test case by manually cutting out unneeded parts, similar to some of the ELF tests.
Yes, it works fine in large test cases as well.
Do you have an opinion on whether I should keep the LLVMSymbolizer object around between calls?
And would it be better to use symbolizeInlinedCode instead of symbolizeCode? (That requires a small patch to the symbolizer, to allow that one to operate on an in-memory object file.)
Nice.
Do you have an opinion on whether I should keep the LLVMSymbolizer object around between calls?
This is the error path and we don't expect this code to be called many times so its contribution to the overall time may be negligible. Keep the same LLVMSymbolizer will benefit when you symbolize the same object multiple times. If reusing LLVMSymbolizer does not take too many lines, you can do that.
And would it be better to use symbolizeInlinedCode instead of symbolizeCode? (That requires a small patch to the symbolizer, to allow that one to operate on an in-memory object file.)
In the diagnostics I think we just need one line, so symbolizeCode should be sufficient. symbolizeInlinedCode can give you multiple frames but I'm not sure how to pretty print them.
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
110 | Convert tabs to spaces so they will align regardless of the tabstop setting (e.g. in phabricator it is nicely rendered) |
@ruiu - Do you have any suggestion on where to store it, to have it reused in multiple calls here? I could add it as an std::unique_ptr in the config object, but I think that might end up requiring including the symbolizer header in the config header.
And would it be better to use symbolizeInlinedCode instead of symbolizeCode? (That requires a small patch to the symbolizer, to allow that one to operate on an in-memory object file.)
In the diagnostics I think we just need one line, so symbolizeCode should be sufficient. symbolizeInlinedCode can give you multiple frames but I'm not sure how to pretty print them.
Ok, thanks for the info!
test/COFF/undefined-symbol-dwarf.s | ||
---|---|---|
110 | Ok, will do |
Reduced the test case further, converted it to use spaces instead of tabs. (A few manual edits had given some parts get mixed indentation in the middle of lines before.) Keeping the symbolizer around in the global config object (without std::unique_ptr, as it's allocated by the arena allocator with make<LLVMSymbolizer>()).
COFF/SymbolTable.cpp | ||
---|---|---|
93 | This funtionality could be in coff::getFileLine() ? Also, I have not idea why coff::getFileLine is replying on {"", 0} and ExitOnError instead of Expected<>. |
COFF/SymbolTable.cpp | ||
---|---|---|
93 |
Well, despite the file name, it should maybe be named getPDBFileLine() or something like that; it's defined in PDB.cpp and specific to that format.
That's a good point, but also feels orthogonal to this patch. |
COFF/SymbolTable.cpp | ||
---|---|---|
93 |
I'm not sure if coff::getFileLine is meant to be PDB specific since symbolizing a COFF file should be debug-format agnostic and getSymbolLocations is not in a PDB specific path in 3108802f1696a3635d7e1edcc3fbe523381ed5df. It more looks like a missing case for ELF.
Agree. Considering it is just a LLD internal API, it is not a major issue. |
COFF/SymbolTable.cpp | ||
---|---|---|
93 |
Um, what? Add ELF support for what, where? Currently the COFF linker in lld only supports symbolizing with PDB debug info. This patch adds support for DWARF debug info as well. Maybe it would be clearer if this patch also renamed getFileLine to getPDBFileLine (that function is not part of any public API anywhere, it's just an internal function within the lld COFF linker), changed the name of the new symbolizeFileLine to getDwarfFileLine and added a wrapper frontend getFileLine which will call getPDBFileLine and getDwarfFileLine? |
COFF/SymbolTable.cpp | ||
---|---|---|
93 | Sorry for the typo. I meant DWARF.
This sounds great to me. |
This funtionality could be in coff::getFileLine() ?
Also, I have not idea why coff::getFileLine is replying on {"", 0} and ExitOnError instead of Expected<>.