Page MenuHomePhabricator

[DebugInfo] Fix crash with -fdebug-info-for-profiling and split dwarf
Needs ReviewPublic

Authored by rupprecht on Dec 16 2020, 8:15 PM.

Details

Reviewers
dblaikie
Group Reviewers
debug-info
Summary

Compiling the following:

struct b {
  template <typename c>
  void d(c e) { d(e); }
};
void f() { b a; a.d(0); }

With clang -fsplit-dwarf-inlining -g -gsplit-dwarf -fdebug-info-for-profiling -O2 triggers assertion failures that the subprogram die should have already been created by getOrCreateSubprogramDIE. It appears to not be created because we skip that for split dwarf in some cases, so make sure we don't skip that when the -fdebug-info-for-profiling option is used.

Note: this fix only addresses one issue common to building this with either dwarf 4 or 5. There is a second crash here still present with dwarf 5.

Diff Detail

Event Timeline

rupprecht created this revision.Dec 16 2020, 8:15 PM
rupprecht requested review of this revision.Dec 16 2020, 8:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 16 2020, 8:15 PM
MaskRay added a subscriber: MaskRay.

Any idea if this was a recent regression or the like? Worth noting what patch might've caused it, etc.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
985–987

Idle thoughts, not especially related to this patch:

This may highlight an interesting situation:

Currently gmlt + split-dwarf-inlining + fission => non-fission gmlt, because split-dwarf-inlining is essentially the gmlt-like data, kept in the object file. So if the only thing that's being fissioned into the dwo is gmlt data anyway, and the gmlt-like data is also going in the object file, why bother with the dwo gmlt data? But if debug-info-for-profiling makes for particularly larger debug info, then it could be reasonable to have gmlt + split-dwarf-inlining + fission + debug-info-for-profiling all active at once. (though, perhaps now we have a good way to turn fission off (-gno-split-dwarf) perhaps we could remove that quirk I added that caused gmlt + split-dwarf-inlining + fission not to compos, but overriding/disabling fission)

Might be interesting to measure object/executable size metrics for debug-info-for-profiling + gmlt and debug-info-for-profiling + gmlt + fission + split-dwarf-inlining. See how much obj/exe space is saved by that and whether it's worth supporting.

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
2

Generally tests should verify more than "does anything other than crash" - that's a fairly broad test/definition of acceptable behavior. So some checking that the resulting DWARF is as desired - especially around the specific codepath that crashed (what construct was meant to be created but never got created because of the assertion, for instance - or, with a non-assertions build did it produce some specific mangled DWARF that we can check is now correct)

  • Update test case to check dwarfdump structure

Any idea if this was a recent regression or the like? Worth noting what patch might've caused it, etc.

We started noticing this with switching to dwarf5, which crashes in release builds too, and I ran into the assertion when trying to figure out the root cause with a debug build.

I checked as far back as d157a9bc8ba1 (Oct 28), and the crash is present there. I'd guess this has been present for a long time.

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
2

Sure, I switched to a release build and let it run, with the expected dwarfdump skeleton listed here.

In short, the .debug_info contents now looks more similar to the .debug_info.dwo contents, namely with the DW_TAG_subprogram for _ZN1b1dIiEEvT_ (i.e. void b::d<int>(int)) now present in .debug_info too. The previous version doesn't look mangled AFAICT, it's just missing sections.

Yeah - would you be able to see if rather than making changes to support debug-info-for-profiling further in split-dwarf-inlining, we could go the other way and support it less/not at all? (it's going to be a bit annoying, now splitting hairs of having gmlt-without-profiling and gmlt-with-profiling, but hopefully not too drastic *fingers crossed*)

Though given this seems pretty generically broken (even the simplest examples assert when all 3 features (split-dwarf-inlining, split-dwarf, and debug-info-for-profiling) are combined) I'd probably be OK with this patch landing first, but then revisiting the general direction separately. This could reduce sanitizer object/executable size (in builds where debug-info-for-profiling and split-dwarf are enabled), for instance - since split-dwarf-inlining is used there. (though perhaps we restrict debug-info-for-profiling more to only the places we really need it (release builds that get sample based profiling) and so the overlap doesn't come up as often?)

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
985–987

Hmm, now that I look at it, I'm not so sure this is an idle thought.

This proposed patch looks like it might be doing the thing I'm not certain about - it seems like this patch is expanding -fdebug-info-for-profiling to apply to split-dwarf-inlining info, which I'm not sure is accurate/useful. At least given Google implemented -fdebug-info-for-profiling and our use case for it doesn't include needing it "online" (it's processed offline, with access to dwo/dwp files when using Split DWARF) we don't need this info in .o/executable files, so perhaps we shouldn't add it there?

(hmm, well - debug-info-for-profiling is already somewhat used (I see the extra function name data and function decl file/line numbers) in the split-dwarf-inlining data, so I'm suggesting rolling that back/"fixing" that bug too/going more in that direction, rather than in the direction of adding more debug-info-for-profiling data into the split-dwarf-inlining data)

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
5–9

A slightly simpler/more canonical test that seems to reproduce the crash would be:

void f1();
__attribute__((always_inline)) inline void f2() { f1(); }
void f3() { f2(); }

This is the smallest/simplest example of inlining, reproduces without extra instructions at -O0, etc.

27–41

I'd probably skip this bit - if anything, maybe a comparison with -gmlt -fdebug-info-for-profiling would be a more Apples to Apples take (the split-dwarf-inlining is meant to, roughly, give the same experience as -gmlt in the object file).

Yeah - would you be able to see if rather than making changes to support debug-info-for-profiling further in split-dwarf-inlining, we could go the other way and support it less/not at all? (it's going to be a bit annoying, now splitting hairs of having gmlt-without-profiling and gmlt-with-profiling, but hopefully not too drastic *fingers crossed*)

Do you mean clang -fsplit-dwarf-inlining -fdebug-info-for-profiling should fail during command line parsing? Or something else?

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
5–9

Looks like that runs into the second crash I'm looking at, but isn't dwarf-5 specific, so I'll take a look at that :)

Yeah - would you be able to see if rather than making changes to support debug-info-for-profiling further in split-dwarf-inlining, we could go the other way and support it less/not at all? (it's going to be a bit annoying, now splitting hairs of having gmlt-without-profiling and gmlt-with-profiling, but hopefully not too drastic *fingers crossed*)

Do you mean clang -fsplit-dwarf-inlining -fdebug-info-for-profiling should fail during command line parsing? Or something else?

Nah - I mean that debug-info-for-profiling would apply to the split DWARF info (stuff in the dwo file) when using split DWARF, not to the split-dwarf-inlining info.

(this /might/ mean that debug-info-for-profiling would be a no-op with -g2+split-dwarf-inlining - I'm not 100% sure/don't remember whether -g2 and above are a strict superset of split-dwarf-inlining or not. And -g1/-gmlt+split-dwarf-inlining+split-dwarf actually resolves to "not split DWARF" because they'd be totally redundant - whereas with this proposed direction they wouldn't necessarily be totally redundant, since the gmlt-in-dwo-with-debug-info-for-profiling would be more verbose/expressive than the split-dwarf-inlining... there's a lot of slices here and it's hard to explain succinctly. If this isn't making sense, I can try to write up something more verbose/clearer)

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
5–9

ah, fair enough - maybe there's a common fix, especially if we move in the direction of debug-info-for-profiling not applying to split-dwarf-inlining info.

rupprecht updated this revision to Diff 312641.Dec 17 2020, 4:19 PM
  • Fix the other crash
rupprecht marked 3 inline comments as done.Dec 17 2020, 4:24 PM
rupprecht added inline comments.
llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
5–9

Managed to fix that crash too, but it looks like the original repro (with dwarf 5) still has a crash, so I replaced the IR with that example.

27–41

Removed.

I do still need the ; CHECK: .debug_info.dwo contents:line, because otherwise the section above may match the .debug_info.dwo section. (Hmm, I guess dwarfdump can probably filter on just a single section instead...)

rupprecht updated this revision to Diff 312672.Dec 17 2020, 7:33 PM
rupprecht marked 2 inline comments as done.
  • Fix the dwarf 5 crash

This now "fixes" the dwarf 5 crash I originally went down this rabbit hole for, but I'd like feedback on if it's actually the correct fix (e.g. I'm not generating bogus debug info here :) )

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
27–41

I didn't find a dwarfdump option to do that, so I left in the ; CHECK: .debug_info.dwo contents: with a comment on why.

dblaikie added inline comments.Dec 22 2020, 2:42 PM
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
985–987

Ping for thoughts on this ^.

If even the most basic case of split-dwarf+split-dwarf-inlining+debug-info-for-profiling is/was broken I'm sort of inclined to suggest this should move away from the split-dwarf-inlining having the debug-info-for-profiling data in it.

llvm/test/DebugInfo/X86/debug-info-for-profiling-crash.ll
5–9

Ah - so this is now two fixes and tests in one patch? Sorry for the runaround, but they should generally be separate patches. Is one a subset of the other? (eg: with one fix one of the tests passes, but the other test doesn't pass unless both fixes are applied?)

27–41

Yeah - when we added dwo sections we didn't end up adding a whole separate set of flags for dumping only a dwo or non-dwo variant of a section. So you can select, say -debug-info, but you get debug_info and debug_info.dwo sections.

That CHECK: * contents: is about right/as good as we can do currently.