This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Add gnu-debuglink support for Windows PE/COFF
ClosedPublic

Authored by alvinhochun on May 25 2022, 4:11 AM.

Details

Summary

The specification of gnu-debuglink can be found at:
https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

The file CRC or the CRC value from the .gnu_debuglink section is now
used to calculate the module UUID as a fallback, to allow verifying that
the debug object does match the executable. Note that if a CodeView
build id exists, it still takes precedence. This works even for MinGW
builds because LLD writes a synthetic CodeView build id which does not
get stripped from the debug object.

The Minidump/Windows/find-module test also needs a fix by adding a
CodeView record to the exe to match the one in the minidump, otherwise
it fails due to the new UUID calculated from the file CRC.

Fixes https://github.com/llvm/llvm-project/issues/54344

Diff Detail

Event Timeline

alvinhochun created this revision.May 25 2022, 4:11 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: mgorny. · View Herald Transcript

Updating thc patch properly

alvinhochun published this revision for review.May 25 2022, 8:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2022, 8:37 AM
mstorsjo added a subscriber: mstorsjo.

Changed GetPluginNameStatic to "PE-COFF", and rebased onto main.

mstorsjo added inline comments.May 31 2022, 4:32 AM
lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
120

I'm curious - this adds new logic (copied from SymbolVendorELF afaik?) for iterating over sections and finding the ones that contain the dwarf debug info. Prior to this change, finding debug info within the executable itself has worked just fine.

What codepath and where has handled that? Has it fallen back on SymbolVendorELF so far?

(If that used to work, why is a specific plugin for PECOFF needed at this point for debuglink?)

alvinhochun added inline comments.May 31 2022, 5:30 AM
lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
120

The SymbolVendorELF seems to only handle loading the external debug info specified by the .gnu_debuglink section. The dwarf debug info already embedded in the main executable should be handled somewhere else, which has worked fine for both ELF and PE/COFF. Though I don't know where specifically this was handled.

SymbolVendorELF tries to downcast the object file to ObjectFileELF *, so it could not have worked for PE/COFF.

mstorsjo added inline comments.May 31 2022, 5:38 AM
lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
120

Ok, so what I misunderstood is that this function doesn't seem to handle dwarf debug info in the main executable after all - this only tries to dig up dwarf sections from GetSymbolFileFileSpec() or GetDebugLink(). So the existing code that locates dwarf sections in the executables themselves still runs as before.

So then this seems reasonable.

So essentially, if GetDebugLink() would be a virtual method, both SymbolVendorELF and SymbolVendorPECOFF could theoretically be merged into one?

alvinhochun added inline comments.May 31 2022, 6:04 AM
lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
120

So essentially, if GetDebugLink() would be a virtual method, both SymbolVendorELF and SymbolVendorPECOFF could theoretically be merged into one?

Yes, they may very well be merged if no other platform-specific logic is needed. There are some small differences now: SymbolVendorELF checks that obj_file->GetUUID() is valid, but this fails the test I added so I removed this check in SymbolVendorPECOFF. I also removed a few entries (those that doesn't show up in ObjectFilePECOFF) from the g_sections list, but this change is probably not necessary.

Fixed loading gnu-debuglink for x86 Windows and added a test for it.

DavidSpickett added a subscriber: DavidSpickett.

Can you include some links to documentation of the debuglink feature in the commit message? I assume https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html is what you'd want.

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
882

Is "leading directory components removed" significant here?

A filename, with any leading directory components removed, followed by a zero byte,

I think that means for C:\a\b\c\foo you get foo, and here we expect to just get the file name not a path?

883

Is this aligning up or down, can you add a comment explaining?

I think it is something like read a filename, return the offset of the end of it. Align that up to a 4 byte boundary then read the crc from there?

884

You read the crc but it is unused.

If you just want to make sure that the section actually has enough data to have a crc in it, then check somehow that it was read or better, check the size explicitly.

As for calculating the crc I guess you'd have to read the whole linked file so the most you could do here is stash the crc value somewhere for later when you have opportunity to open it. Most of the time you won't hit an issue but I guess it's here for a reason. Perhaps if you were to debug a file that linked to something that had been rebuilt since?

lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
74

Is this preference defined by some standard or are you assuming you won't see both?

FWIW it does make sense to me to use the filename in the usual place first, then the debuglink, just as you've got it here.

lldb/source/Symbol/LocateSymbolFile.cpp
368

What produces this special case?

lldb/test/Shell/ObjectFile/PECOFF/dwarf-gnu-debuglink-i686.yaml
3

For these tests you are making a file, then giving it a debug link to itself then checking that lldb finds it? Seems like if the parsing of the debuglink failed then it could just find the original file and look like it passed.

Except you strip that original file of everything *but* the debug link. So if lldb tried to use the orignal file, it wouldn't have any debug info.

Assuming I got that right could you add comments to this and the other test stating that intent.

FWIW, most of the logic for how to deal with the gnu debuglink contents here is copied from the corresponding preexisting methods for ELF.

alvinhochun planned changes to this revision.Jun 6 2022, 8:04 AM

Can you include some links to documentation of the debuglink feature in the commit message? I assume https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html is what you'd want.

I suppose I can add that link.

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
882

I think that's the expected result. (Also see the reply below.)

883
884

I didn't think much about it at first. I thought ObjectFileELF only used this CRC as one of the few ways of generating an UUID for the object file. But just now re-reading the code I realize that Symbols::LocateExecutableSymbolFile actually uses this UUID to check the validity of the debug file, so it now makes more sense.

I think I may have to do the same for ObjectFilePECOFF. The tricky bit is that it already has code to calculate an UUID using the PDB info if it exists. What should happen if both PDB and gnu-debuglink exists for the COFF object file?

lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
74

I think that may actually be the option set by the user (target symbols add perhaps?) but I am not sure. (This was just copied from SymbolVendorELF.)

lldb/source/Symbol/LocateSymbolFile.cpp
368

ObjectFilePECOFF::GetModuleSpecifications returns two specs for MachineX86 -- "i386-pc-windows" and "i686-pc-windows" (with -msvc after being normalized implicitly). I do not know why it does this and I am scared of touching it.

lldb/test/Shell/ObjectFile/PECOFF/dwarf-gnu-debuglink-i686.yaml
3

No, I think the gnu-debuglink is only added to the output file %t.stripped and after it has already been stripped, the input file %t isn't touched at all. So no gnu-debuglink is being stripped from any files.

(By the way this test is a variation of lldb/test/Shell/ObjectFile/ELF/gnu-debuglink.yaml.)

DavidSpickett added inline comments.Jun 6 2022, 8:57 AM
lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
883

Please confirm it and add a comment stating the logic.

884

You'd have to work out if one can create such a file without "hacking" some how like adding a debuglink with objcopy. And even if lldb could load both, would they mostly be copies of each other. Seems like an odd scenario.

If it's not something a default toolchain use is going to produce, I'd prefer the more "normal" one (the PDB, just a guess) and if people get confused because it chose the PDB then they've somewhat opted into paying that investigation cost by choosing this unusual workflow.

You can always log these sort of things too. I believe a lot of the Apple code will emit things like ok I'm loading this file here, but actually substituted it with this one etc. 99% of people never need to read them but it's easy to say to someone who has an issue "enable this log and tell me what you see".

lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
74

Ok so we support debuglink from ELF already? Cool.

lldb/test/Shell/ObjectFile/PECOFF/dwarf-gnu-debuglink-i686.yaml
3

Right, you load the stripped version which debug links to the unstripped version (that doesn't have a debug link).

The way it's done is fine, please document the logic though as it's not the most obvious thing on first look.

alvinhochun added inline comments.Jun 6 2022, 10:26 AM
lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
883

Ah yes, the linked spec states

zero to three bytes of padding, as needed to reach the next four-byte boundary within the section

before the crc checksum.

884

It is doable, Clang can be asked to produce both PDB and DWARF debug info by passing both -gdwarf -gcodeview, and link with -Wl,-pdb=.

It turns out LLD always writes a synthetic "build id" for mingw even if no PDB is generated (https://github.com/llvm/llvm-project/blob/0498415f1d6ac0bfbda72486505c11a7b3d464c4/lld/COFF/Writer.cpp#L1926). Seems that this mechanism is already working to check that the executable and the debug file are from the same build at least for LLD-linked objects, because this "build id" does not get stripped. I have verified that lldb does generate the UUID for such objects.

As a fallback for objects linked by the GNU linker, I can use the CRC to generate the UUID like how ObjectFileELF does it.

I will add tests for these cases too.

alvinhochun edited the summary of this revision. (Show Details)

Addressed comments, added CRC checking for debuglink (only applies if the object does not contain a PDB build id) and added tests for this.

DavidSpickett accepted this revision.Jun 7 2022, 1:32 AM
DavidSpickett added a reviewer: DavidSpickett.

This LGTM.

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
877

You can drop the {} from the if here.

This revision is now accepted and ready to land.Jun 7 2022, 1:32 AM

Applied suggested change

alvinhochun planned changes to this revision.Jun 7 2022, 7:11 AM

Sorry, I just realized the crc change broke a test.

The test https://github.com/llvm/llvm-project/blob/main/lldb/test/Shell/Minidump/Windows/find-module.test fails with this patch. That test opens a minidump and checks that the associated exe is loaded. The minidump contains a codeview PDB record for the exe which is used to generate the UUID, but the same codeview PDB record is missing from the exe. The test still worked because LLDB ignores the empty UUID (it skips the UUID equality check). However, when this patch adds a UUID using the CRC, that test fails because the UUIDs no longer match up.

I see two ways of fixing it:

  1. Add the codeview PDB record to the test exe. I already prepared the change, but I don't know if it is the expected behaviour (that the exe should have a matching codeview PDB record).
diff
diff --git a/lldb/test/Shell/Minidump/Windows/Inputs/find-module.exe.yaml b/lldb/test/Shell/Minidump/Windows/Inputs/find-module.exe.yaml
index 42ccc6d6e053..c4e2b931c4a3 100644
--- a/lldb/test/Shell/Minidump/Windows/Inputs/find-module.exe.yaml
+++ b/lldb/test/Shell/Minidump/Windows/Inputs/find-module.exe.yaml
@@ -16,6 +16,9 @@ OptionalHeader:
   SizeOfStackCommit: 4096
   SizeOfHeapReserve: 1048576
   SizeOfHeapCommit: 4096
+  Debug:
+    RelativeVirtualAddress: 20480
+    Size:            28
 header:
   Machine:         IMAGE_FILE_MACHINE_I386
   Characteristics: [ IMAGE_FILE_EXECUTABLE_IMAGE, IMAGE_FILE_32BIT_MACHINE ]
@@ -28,5 +31,10 @@ sections:
     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
     VirtualAddress:  16384
     VirtualSize:     48
+  - Name:            .buildid
+    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ ]
+    VirtualAddress:  20480
+    VirtualSize:     194
+    SectionData:     00000000E5038E620000000002000000A60000001C5000001C400000525344533ED87D89C8A8184197F3A925EE4BF74101000000433A5C70726F6A656374735C746573745F6170705C436F6E736F6C654170706C69636174696F6E315C44656275675C436F6E736F6C654170706C69636174696F6E312E70646200
 symbols:         []
 ...
  1. Don't set the UUID with the CRC. In SymbolVendorCOFF, don't rely on the UUID but instead implement a separate CRC check for debuglink (only necessary if there is no codeview PDB record). This may be problematic though -- Symbols::LocateExecutableSymbolFile has more than one search path and it can ignore files which doesn't match, but if I put the check in SymbolVendorCOFF it can only get the first matching filename, which may not actually be the correct match.

@labath I see you added the test in https://reviews.llvm.org/D65955, do you have any comment on this?

alvinhochun edited the summary of this revision. (Show Details)

Added the fix to the Minidump find-module test, also updated the commit message.

This revision is now accepted and ready to land.Jun 8 2022, 9:36 AM

When testing this patch, I ran into crashes in the SymbolFile/NativePDB/load-pdb.cpp testcase. To fix it, I had to add a change like this:

diff --git a/lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp b/lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
index 26df0035b37c..e08753b86d5b 100644
--- a/lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
+++ b/lldb/source/Plugins/SymbolVendor/PECOFF/SymbolVendorPECOFF.cpp
@@ -108,6 +108,8 @@ SymbolVendorPECOFF::CreateInstance(const lldb::ModuleSP &module_sp,
   // that.
   SectionList *module_section_list = module_sp->GetSectionList();
   SectionList *objfile_section_list = dsym_objfile_sp->GetSectionList();
+  if (!objfile_section_list || !module_section_list)
+    return nullptr;
 
   static const SectionType g_sections[] = { 
       eSectionTypeDWARFDebugAbbrev,   eSectionTypeDWARFDebugAranges,

Also, when built with assertions enabled (which often is to prefer during development, to catch things before the buildbots do), reading the debuglink info failed an assert:

lldb/source/Utility/DataExtractor.cpp:135: lldb_private::DataExtractor::DataExtractor(const void*, lldb::offset_t, lldb::ByteOrder, uint32_t, uint32_t): Assertion `addr_size >= 1 && addr_size <= 8' failed.

This is easy to fix with a change like this:

diff --git a/lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp b/lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
index ba4612178be7..657fcdc5af6d 100644
--- a/lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
+++ b/lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
@@ -63,7 +63,7 @@ static bool GetDebugLinkContents(const llvm::object::COFFObjectFile &coff_obj,
       }
       DataExtractor data(
           content->data(), content->size(),
-          coff_obj.isLittleEndian() ? eByteOrderLittle : eByteOrderBig, 0);
+          coff_obj.isLittleEndian() ? eByteOrderLittle : eByteOrderBig, 4);
       lldb::offset_t gnu_debuglink_offset = 0;
       gnu_debuglink_file = data.GetCStr(&gnu_debuglink_offset);
       // Align to the next 4-byte offset

If it's ok with you, I can squash these changes (plus a rewording regarding GNU ld and the codeview build ID) and push the patch.

lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
63

Actually, the GNU linker does write such a build id if you pass it -Wl,--build-id, but it's not default like in LLD.

Ah I did not realize that GNU ld also has the option to add the same build id for mingw. I am fine with you committing these changes. Thanks!

This revision was automatically updated to reflect the committed changes.

Is there any difference in functionality between SymbolVendorELF and the new class introduced here? Could this have been achieved by teaching SymbolVendorELF (after renaming it to something else) to handle COFF files as well?

Is there any difference in functionality between SymbolVendorELF and the new class introduced here? Could this have been achieved by teaching SymbolVendorELF (after renaming it to something else) to handle COFF files as well?

There is a slight difference with the elements in the g_sections list, which I mentioned in an inline comment. (https://reviews.llvm.org/D126367#inline-1214432 I also mentioned removing the UUID check but it has been re-added with the CRC handling in place.) Yes, it could probably be combined with SymbolVendorELF.

Perhaps I should also say, what led me to make a separate class SymbolVendorPECOFF instead, is the pre-existing SymbolVendorWasm class, which also has very similar code to SymbolVendorELF. It gave the impression that each object file format should have its own SymbolVendor plugin.