This is an archive of the discontinued LLVM Phabricator instance.

[DWARFLibrary] Add support to re-construct cu-index
AbandonedPublic

Authored by ayermolo on Nov 8 2022, 9:58 AM.

Details

Reviewers
jhenderson
Summary

According to DWARF5 specification and gnu specification for DWARF4 the offset
entry in the CU/TU Index is 32 bits. This presents a problem when
.debug_info.dwo in DWP file grows beyond 4GB. The CU Index becomes partially
corrupted.

This diff adds manual parsing of .debug_info.dwo/.debug_abbrev.dwo to
reconstruct CU index in general, and TU index for DWARF5. This is a work around
until DWARF6 spec is finalized.

Next patch will change internal CU/TU struct to 64 bit, and change uses as
necessary. The plan is to land all the patches in one go after all are approved.

This patch originates from the discussion in: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902

Diff Detail

Event Timeline

ayermolo created this revision.Nov 8 2022, 9:58 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: hoy, modimo, wenlei and 3 others. · View Herald Transcript
ayermolo requested review of this revision.Nov 8 2022, 9:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 8 2022, 9:58 AM
ayermolo retitled this revision from [DWARFLibrary] Add support re-constructing cu-index to [DWARFLibrary] Add support to re-construct cu-index.Nov 8 2022, 10:25 AM
dblaikie added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

I'm a bit more worried about implementing this for DWARFv4 due to needing to rebuild .debug_abbrev sections together, which is less reliable/guaranteed (there's no guarantee that the abbrev contributions are written in the same order as the .debug_info sections - though it's the case in reality I guess) than .debug_info parsing.

Any chance this workaround can be restricted to only DWARFv5?

llvm/lib/DebugInfo/DWARF/DWARFUnitIndex.cpp
188–199

Could we put this after the old table reading code (& then the old code wouldn't need to be modified - yeah, it'd read the corrupted values, but they'd be overwritten afterwards anyway)? Might be a smaller change?

202
ayermolo added inline comments.Nov 9 2022, 9:53 AM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

I would prefer to keep it more generic and include DWARF4. Part of the reason is that we still heavily rely on DWARF4.
One thing we can do is that if DWOID is "garbage". As in when we populate CU Index and can't find the signature in a map we stop.
With the current implementation we can just clear the map and parse CU/TU index as before. If we move implementation of updating contributions entires after parsing we just stop updating further. I think with this approach we can handle the common case of how things are now in reality, and on of chance the assumption fails we are no worse than we were before. Although it does assume that with wrong abbrev. getDWOID doesn't crash.... What do you think?

dblaikie added inline comments.Nov 9 2022, 11:40 AM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

maybe a way to make this more robust for DWARFv4 would be to parse the index as-written first, and using the index entries to find the suitable .debug_abbrev contribution for the CU (found via the manual walk, associated via the DWO ID)

As for the fallback behavior in case of "garbage" DWO IDs - given how tenuous all this already is, perhaps we should stop/error if that happens, rather than producing corrupted data?

ayermolo added inline comments.Nov 9 2022, 12:09 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

Sorry, not sure I follow. When we manually walk there is implicit assumption that .debug_info.dwo section and corresponding .debug_abbrev.dwo are in the same order. Thus we can use abbrev to parse unit die to get DWO ID (since this is DWARF4 and it's one of the attributes). Then use DWO ID to figure out which contribution we need to modify.

When we parse Index CUs are in the order in which DWOIDs got hashed into the table. Since .debug_info offsets column is corrupt (at least for some), we still can't map the abbrev section to the corresponding debug section.

The sentiment from users is better to have some/most of debug information than none. Which is why we have dwp with over 4GB of debug info even thought in trunk it's an error in llvm-dwp....

(could you link in the patch description to the thread where we were discussing this so I can page the context back in, etc? I'm probably considering/rehashing some of that conversation in my head - like introducing a new/custom index version number to handle these cases, etc)

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

Ah, right, right - circular problem. Sorry.

I'm still a bit worried about the complexity of supporting v4 here... not sure it's the right thing to do.

819–820

I guess we could at least validate that the abbrev offset was probably correct - once we have the DWO ID we could look up the index entry, find the abbrev section offset, and confirm it was the one we used?

ayermolo edited the summary of this revision. (Show Details)Nov 9 2022, 12:53 PM
ayermolo edited the summary of this revision. (Show Details)
ayermolo added inline comments.Nov 9 2022, 5:49 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

We could. For the case where we use wrong abbrev, interpret some random 8 bytes as DWO ID, and they happen to be same ones as a valid signature?

clayborg added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
797–802

We shouldn't need to do anything for .debug_abbrev stuff right? That section will never exceed 4GB, so no need to do anything when manually parsing.

814–815

This will be wrong for DWARF64. Just use the DWARFUnitHeader::getNextUnitOffset() function as it correctly figures out the length.

819–820

Both solutions are a bit messy. We must get the abbrev offset correct somehow. If we parse the CU index first, you can most probably identify the right .debug_abbrev offset. It will be very doubtful that you will have two 32 bit CU offsets that truncate to the same offset. It could happen, but very unlikely. So I would rather we don't assume anything about .debug_abbrev layout and parse the standard 32 bit CU index first, and then to find the .debug_abbrev offset, we find the truncated 32 bit CU offset in the list.

llvm/lib/DebugInfo/DWARF/DWARFUnitIndex.cpp
188–199

I agree here. Just let it parse normally, and then run some code to fix up the wrong values only if needed.

@dblaikie How would you like to proceed?

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

Not sure I follow. Only algo I could come up with that would do it is:

Parse headers and have a map from truncated and normal ones under 4GB to 64bit real offsets. In case of collision
Parse the CU Index, and iterate over it.
As we do it, we find a cu offset in a map 
   get its real 64 bit offsets. 
   Get abbrev offset from cu index
   Until DWO ID matches signature iterate over 64 bit real offsets
       Use it to parse UnitDIE and check
       update cu index with real 64bit offset if check passes

*Assumption is that parsing CU with wrong abbrev won't just crash somewhere

or did you had something else in mind?

clayborg added inline comments.Nov 10 2022, 9:26 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

yeah, my flow would be:

  • parse CU index using previous code and let 32 bit CU offsets get truncated
  • if (sizeof(.debug_info|types) > 4GB) then call this function
  • parse all CU headers
    • find the current CU's .debug_abbrev offset by searching truncated map from first step using a truncated 32 bit offset
    • parse the CU's first DIE using the correct abbreviation table
ayermolo added inline comments.Nov 10 2022, 9:40 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

To confirm the last step so we can verify DWO ID in a UnitDIE is the same as a signature in contribution?

clayborg added inline comments.Nov 10 2022, 10:19 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

You would only need to do this if you have multiple CUs whose truncated offset is in the parsed CU index. If there is only one match, we are good to go. So maybe a better flow is:

  • parse CU index using previous code and let 32 bit CU offsets get truncated
  • if (sizeof(.debug_info|types) <= 4GB) then we are done
  • parse all CU headers
    • search the CU index for a truncated offset that matches the current CU's truncated 64 bit offset
      • If the there is only one match, just fix the truncated CU offset and continue, no need to read the CU DIE at all for the DWO ID
      • if there are multiple matches, then you need to try and get the the right abbreviation so you can verify the DWO ID. If we have the wrong abbrev table, then we might get decoding errors, or we might not, so this will be the tricky part to get right. You could rely on ordering etc to help try and get the right abbrev table as you were doing, but you only need to do this _if_ you have multiple matches.
clayborg added inline comments.Nov 10 2022, 10:21 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

If we have DWARF 5, then the DWO ID is in the DWARFUnitHeader, so no need for .debug_abbrev parsing at all in that case. So we only need to do heroics for DWARF 4 and earlier.

ayermolo added inline comments.Nov 11 2022, 7:59 AM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

OK, sounds like we are on same page on what you proposed.
@dblaikie what do you think?

dblaikie added inline comments.Nov 11 2022, 3:54 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

Nice not to have to read the abbrev more than necessary (ie: only when there's an ambiguity in the truncated offset) - but the code is still a bit iffy to have to figure out the abbrev section to be able to read the DWO ID to use...

Would it be acceptable to skip that, and still fail in those cases & rely only on the unambiguous truncation logic?

ayermolo added inline comments.Nov 11 2022, 4:02 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

Hard to tell since we don't know how often collision will happen in real world, but probably as a first attempt good enough?
If we start seeing issues we can revisit?

Although TBH personally I don't see a big deal with current approach in this diff. Yes realistically speaking we are relying on implicit behavior of debug sections and abbrev sections being in the same order, but tool that is being used is llvm-dwp and it's in the same llvm toolchain. I believe it's more robust than relying on assumption that collisions will be rare.

clayborg added inline comments.Nov 11 2022, 5:23 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

I'd doubt you will have many if any cases where truncated compile units match. Can we rely on the ordering of the CU Offsets at all in the CU index? Can we assume they are ever increasing? That might make it easier to match up things to CUs. Or is the CU index ordered by the DWO ID or is there no order to it at all? I can't remember.

We have rock solid ways to do this with no collisions. With collisions, which will be really rare, we can make a best effort. Don't care how it is done. Might be worth checking out a binary from the other dwp tool? Is there one from GCC?

ayermolo added inline comments.Nov 11 2022, 5:39 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
819–820

For CU Index we cannot rely on ordering. They are ordered on based on hash of DWO ID in the table.

I tried DWP GNU dwp version 2.30-119.el8 with bzip2 build with clang-15 and it coredumped.
It also coredumped with bzip2 build with gcc (GCC) 11.x 20221024

Wasn't sure whether to override this diff or create a new one. For clarity went with a new one. Based on Gregs suggestion.
https://reviews.llvm.org/D137882

ayermolo abandoned this revision.Dec 8 2022, 3:47 PM