Page MenuHomePhabricator

[DWARF] Don't process .debug_info relocations for DWO Context
ClosedPublic

Authored by ayermolo on Jul 22 2021, 6:05 PM.

Details

Summary

When we build with split dwarf in single mode the .o files that contain both "normal" debug sections and dwo sections, along with relocaiton sections for "normal" debug sections.
When we create DWARF context in DWARFObjInMemory we process relocations and store them in the map for .debug_info, etc section.
For DWO Context we also do it for non dwo dwarf sections. Which I believe is not necessary. This leads to a lot of memory being wasted. We observed 70GB extra memory being used.

I went with context sensitive approach, flag is passed in. I am not sure if it's always safe not to process relocations for regular debug sections if Obj contains .dwo sections.
If it is alternatvie might be just to scan, in constructor, sections and if there are .dwo sections not to process regular debug ones.

Diff Detail

Event Timeline

ayermolo created this revision.Jul 22 2021, 6:05 PM
ayermolo requested review of this revision.Jul 22 2021, 6:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2021, 6:05 PM

70GB seems like quite a bit - do you have a description of the scenario (ideally something reproducible - like "build clang at this revision with these flags and run these commands & the memory profile looks like this")? I'm guessing this is running llvm-symbolizer on a large Split DWARF input (without a DWP file, so using loose/separate .dwo files) and passing in many addresses to symbolize (forcing many of those .dwo files to be loaded)? I'd be curious to know exactly which data structures, etc,

Could this be tested - with, say, an invalid relocation or something, demonstrating that relocations are no longer applied? (I guess we never applied them to the .dwo sections, right? But we'd parse them for the non-dwo sections in the .dwo file (eg, when using split-dwarf=single?), so I'm not sure if there's any kind of invalidity we could put in the relocation section of the non-dwo sections - oh, I see we try to apply the relocations early, so maybe there is some invalidity that would expose this? not perfect, but maybe a little motivating/demonstrating.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1742

How come this tests both these properties? I guess there's some case where IsDWOContext is inadequate/isn't correct and the file extension test is needed? What's that case?

70GB seems like quite a bit - do you have a description of the scenario (ideally something reproducible - like "build clang at this revision with these flags and run these commands & the memory profile looks like this")? I'm guessing this is running llvm-symbolizer on a large Split DWARF input (without a DWP file, so using loose/separate .dwo files) and passing in many addresses to symbolize (forcing many of those .dwo files to be loaded)? I'd be curious to know exactly which data structures, etc,

Could this be tested - with, say, an invalid relocation or something, demonstrating that relocations are no longer applied? (I guess we never applied them to the .dwo sections, right? But we'd parse them for the non-dwo sections in the .dwo file (eg, when using split-dwarf=single?), so I'm not sure if there's any kind of invalidity we could put in the relocation section of the non-dwo sections - oh, I see we try to apply the relocations early, so maybe there is some invalidity that would expose this? not perfect, but maybe a little motivating/demonstrating.

Sorry, forgot to mention. This is regarding BOLT with -update-debug-sections turned on using it on an internal service build with -O2 mode and -g debug information. Without ThinLTO.
Right we are building it with -split-dwarf=single so the .dwo sections are in the .o file not in .dwo. Actually with -split-dwarf=split this problem would not manifest because it will only have .dwo sections.
I attached a screen shot with stack trace and memory consumption. Basically I am going through Skeleton CUs and creating DWARFUnits for non-skeleton CUs that are in .o files. This goes through getDWOContext and eventually calls DWARFObjInMemory. It doesn't know for what purpose it is invoked so it goes through all the debug sections. Including the regular one that are in .o file. Which we don't really care about I think since we are creating Context for DWO. If relocations for that exist then sure we should apply them (is that even a thing?), but for regular debug section, when creating context for DWO, I don't think it's necessary.
Thus a check for a flag and for .dwo. If it's true then we only care about .dwo sections.

After the fix. Both collected by @rafaelauler

If relocations for that exist then sure we should apply them (is that even a thing?)

No, relocations don't exist for dwo sections (if there are any relocations for dwo sections, that's a bug in the producer) - so skipping all relocation application for .dwo sections is fine/correct (could error out if there are relocation sections for dwo sections, though, if you like - but probably should do that in a separate patch).

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1742

Still curious about this.

If relocations for that exist then sure we should apply them (is that even a thing?)

No, relocations don't exist for dwo sections (if there are any relocations for dwo sections, that's a bug in the producer) - so skipping all relocation application for .dwo sections is fine/correct (could error out if there are relocation sections for dwo sections, though, if you like - but probably should do that in a separate patch).

This is related to your question about (IsDWOContext && !Name.contains(".dwo") check. Why I though both of them are needed. Looking at code on line 1776

else if (RelSecName == "debug_info.dwo")
          Map = &static_cast<DWARFSectionMap &>(
                     InfoDWOSections[*RelocatedSection])
                     .Relocs;

I thought there might be relocations for .debug_info.dwo so I structured a check in two parts flag plus ".dwo" so that if context is DWO and relocations for ".dwo" exist it will still process them.
Now I am confused. If relocations don't exist for dwo, why is that code there?
I traced it to https://reviews.llvm.org/D53907. Which seems to be DWARF V5 related.

If there are no relocations for .dwo then just checking IsDWOContext is enough, and !Name.contains(".dwo") is not necessary.

If relocations for that exist then sure we should apply them (is that even a thing?)

No, relocations don't exist for dwo sections (if there are any relocations for dwo sections, that's a bug in the producer) - so skipping all relocation application for .dwo sections is fine/correct (could error out if there are relocation sections for dwo sections, though, if you like - but probably should do that in a separate patch).

This is related to your question about (IsDWOContext && !Name.contains(".dwo") check. Why I though both of them are needed. Looking at code on line 1776

else if (RelSecName == "debug_info.dwo")
          Map = &static_cast<DWARFSectionMap &>(
                     InfoDWOSections[*RelocatedSection])
                     .Relocs;

I thought there might be relocations for .debug_info.dwo so I structured a check in two parts flag plus ".dwo" so that if context is DWO and relocations for ".dwo" exist it will still process them.
Now I am confused. If relocations don't exist for dwo, why is that code there?
I traced it to https://reviews.llvm.org/D53907. Which seems to be DWARF V5 related.

Yeah, I think that's just added for symmetry - probably copying what was done for .debug_types.dwo too (which I probably wrote) which also doesn't need relocation handling.

If there are no relocations for .dwo then just checking IsDWOContext is enough, and !Name.contains(".dwo") is not necessary.

Yeah, if you could remove that (& if you're willing, a separate patch to warn about relocations on .dwo sections) check, that'd be great!

Might be worth replacing the bool "IsDWO" parameters with an enum to make the callsites more legible rather than an anonymous "false" value? (might also be an opportunity to/worth creating a 3rd value for the enum - some kind of "indeterminate" state (& that one might be good to skip and warn on relocations in dwo sections) - this would come up when directly dumping a .dwo file, I think?)

ayermolo updated this revision to Diff 361805.Jul 26 2021, 2:27 PM

Good point. Changed to ENUM. I left as two enums as I am not quite following the "interminate" remark.
Maybe this is related but running dwarfdump-header.s it fails because relocation section for .debug_info.dwo is being generated.
Looking at ELFWriter::writeObject looks like there was a code added to check the mode, but I haven't 100% internalized what all of it means. Do you have insights to shortcut it?
It was added in https://reviews.llvm.org/D47051

Also leaving it as one patch until all the changes are finalized. Will split later.

Digging a bit more in to a test and llvm-mc. I think modifying encoding of .debug_abbrev.dwo and .debug_info.dwo should do the trick...

ayermolo updated this revision to Diff 362195.Jul 27 2021, 3:03 PM

Updatd failing test, added a test.

ayermolo retitled this revision from [WIP][DWARF] Don't process .debug_info relocations for DWO Context to [DWARF] Don't process .debug_info relocations for DWO Context.Jul 27 2021, 3:04 PM
dblaikie added inline comments.Jul 28 2021, 2:15 PM
llvm/include/llvm/DebugInfo/DWARF/DWARFContext.h
390

This name might be a bit vague - and I don't think LLVM usually uses all-upper for enum constants. So maybe "SplitOrObjectSections" with "{Both, DWO, Object}"? I'm not really sure - open to better names.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1743–1747

Any thoughts/details on why one is an error and the other is a warning? (since they're about the same content, I think?)

Hmm, also it might not be invalid to encounter relocation sections in a dwo file in general - that's the split-dwarf=single case, isn't it? (ie: in split-dwarf=single, we'd be in this codiepath for ContextType::DWO, but the dwo file is really the .o file, so it'll have relocations for the non-.dwo sections - though they can/should be silently ignored, but they aren't an error/problem)

llvm/test/DebugInfo/X86/dwarfdump-header.s
24 ↗(On Diff #362195)

The changes to this file could probably be committed separately in a standalone patch before this relocation handling patch, yeah? If so, please do that and rebase this patch on top.

llvm/test/DebugInfo/X86/dwarfdump-rela-dwo.s
2

This test seems a bit long (I see it's got structure types? That seems unnecessary for this purpose) - maybe a simple/single file with "extern int i; int i = 3;".

ayermolo added inline comments.Jul 28 2021, 4:27 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1743–1747

My original thinking was if consumer explicitly goes through the DWO context path there is expectation that DWO sections won't have any relocations, and should fail early, and warning might get ignored/missed. The "ALL" path is less critical. It's used for things such as llvm-dwarfdump where it's not necessary for it to fail and warning is more visible.
Re-thinking about it now, it doesn't really seem like a good argument. So maybe just keep it as a warning for both?
Also there is a bug. It should be:

if (RelocatedSection != Obj.section_end() && Name.contains(".dwo")) {
        if (Type == DWARFContext::ContextType::DWO)
          HandleError(createError(
              "Creating DWO Context with relocation section for " + Name));
        else if (Type == DWARFContext::ContextType::ALL)
          HandleWarning(createError("Relocation " + Name + " exists"));
      }

So dropping the difference:

if (RelocatedSection != Obj.section_end() && Name.contains(".dwo")) {
        if (Type == DWARFContext::ContextType::DWO || 
            Type == DWARFContext::ContextType::ALL)
            HandleWarning(createError("Relocation " + Name + " exists"));
      }

Or maybe just to simplify the whole thing and remove enum and

if (RelocatedSection != Obj.section_end() && Name.contains(".dwo"))
            HandleWarning(createError("Relocation " + Name + " exists"));

If we are removing the code that handles the relocations for dwo sections, and establish that this is not a valid scenario, then there is no real reason to distinguish what context this is. Since even if they are present we will print a warning, and output won't be correct.

dblaikie added inline comments.Jul 29 2021, 8:56 AM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1743–1747

If we are removing the code that handles the relocations for dwo sections, and establish that this is not a valid scenario, then there is no real reason to distinguish what context this is. Since even if they are present we will print a warning, and output won't be correct.

Almost - but I think you want to skip /all/ relocations when you're in a DWO context, right? (for split-dwarf=single - where you're parsing the .dwo file (which is really the .o file) and don't want to handle all the non-dwo sections, etc because you know its only the dwo parts that are relevant) so you'd still need the context to implement that, I think?

ayermolo added inline comments.Jul 29 2021, 3:33 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1743–1747

Derp moment on my part. The enum is still necessary.

dblaikie added inline comments.Jul 30 2021, 8:54 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFContext.h
390

Ping on this. (and maybe "NonSplit" rather than "Object" would be more clear - but finding a more specific name than "ContextType" is probably the important part of making this more self-descriptive/less vague)

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1743–1747

OK, so coming back to this code - seems like we could error any time we see relocations for .dwo sections though - without checking the "ContextType". But, separately, we should check the "ContextType" and if it's DWO, no relocations should be processed, yeah? (that latter part seems to be the code a few lines lower, at 1749 \/)

So this warning/error code could be simplified to:

if (RelocatedSection != Obj.section_end() && Name.contains(".dwo"))
  HandleWarning(createError("Unexpected relocations for dwo section " + Name));

Something like that?

llvm/test/DebugInfo/X86/dwarfdump-rela-dwo.s
2

Ping on this - still seems a bit long/convoluted.

ayermolo updated this revision to Diff 363185.Jul 30 2021, 12:38 PM

Sorry was waiting for modified test to land before circling back.
Updated warning, and enum (more in inline comments).

ayermolo marked 6 inline comments as done.Jul 30 2021, 12:41 PM
ayermolo added inline comments.
llvm/include/llvm/DebugInfo/DWARF/DWARFContext.h
390

How about this?
Thinking is what we really care about is whether to process or not relocations. With warning being consolidated to one case (dwo relocations are present), doesn't seem we need to distinguish between various cases.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1743–1747

Yep, yep, exactly.

llvm/test/DebugInfo/X86/dwarfdump-rela-dwo.s
2

OK, let me see if I can modify this/regenerate a new one.

ayermolo updated this revision to Diff 363199.Jul 30 2021, 1:54 PM
ayermolo marked an inline comment as done.

Updated the test.

ayermolo marked 2 inline comments as done.Jul 30 2021, 1:55 PM
dblaikie accepted this revision.Jul 30 2021, 6:05 PM

Sounds good - thanks for your patience!

llvm/include/llvm/DebugInfo/DWARF/DWARFContext.h
390

While I think the new name is more descriptive - it still feels a bit confusing for readers at call sites - still feels like the caller passing "ProcessDebugRelocations::Ignore" would benefit from a comment explaining why they should be ignored here (& maybe even similarly at the other call sites passing "Process" - be nice if it were a defaulted argument, but the other mix of defaulted arguments means it's hard to actually use the default (because other callers want to specify some subset of the arguments)...

Hmm, but I keep coming up with issues with all the other ideas for naming/API design I have here, so let's just go with this one. It's not used prolifically and could be renamed/improved later without significant cost.

This revision is now accepted and ready to land.Jul 30 2021, 6:05 PM

Sounds good - thanks for your patience!

Thank you for review, will commit.