In a LTO build, the end_sequence in debug_line table for each compile unit (CU) points the end of text section which merged all CUs. The end_sequence needs to point to the end of each CU's range. This bug often causes invalid debug_line table in the final .dSYM binary for MachO after running dsymutil which tries to compensate an out-of-range address of end_sequence.
The fix is to sync the line table termination with the range operations that are already maintained in DwarfDebug. When CU or section changes, or nodebug functions appear or module is finished, the prior pending line table is terminated using the last range label. In the MC path where no range is tracked, the old logic is conservatively used to end the line table using the section end symbol.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Would it be possible and easier to test this issue without any LTO concerns, with code like this:
void f1() { }
void __attribute__((nodebug)) void f2() { }Looks like the bug manifests here by having the line table cover f2, when it should end at the end of f1 instead?
(admittedly there's still a problem, maybe this isn't sufficiently representative, if there was void f3() { } at the end - we don't end the sequence around f2 and restart it afterwards... )
Well, if you want to keep the existing LTO-related testing, you could link two IR files together with simple empty functions, rather than interesting functions like you have in your examples. Just void f1() { } and void f2() { } in two files, llvm-linkd together would suffice, I think?
| llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp | ||
|---|---|---|
| 2167–2169 | What's this refactoring for? Looks like it causes duplicate lookups of the DwarfCompileUnit, that might be nice to avoid/not introduce? (if it is necessary/preferred for some reason, might be worth doing separately from the rest of this patch?) (both callers already have the DwarfCompileUnit available - perhaps this function should take that instead of the MachineFunction? that way it could avoid the extra lookups) | |
| 2226–2230 | Perhaps this could be done once outside the loop? // MBBSectionRanges is always non-empty at this point, right? (could assert just in case, but I think it's a reasonable invariant)
Asm->OutStreamer->getContext()
        .getMCDwarfLineTable(getDwarfCompileUnitID(MF))
        .getMCLineSections()
        .updateEndLabel(Asm->MBBSectionRanges.back().second.EndLabel); | |
| llvm/lib/MC/MCObjectStreamer.cpp | ||
| 517–518 ↗ | (On Diff #367079) | When is this condition false? (is it tested?) | 
| llvm/test/DebugInfo/lto-debugline-main.ll | ||
| 1 ↗ | (On Diff #367079) | This patch only applies to llvm's codegen - so the test probably shouldn't include opt or LTO tools - the post-LTO'd IR should be checked in directly & then fed into llc in the test to exercise the codegen codepath, most likely? | 
| llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp | ||
|---|---|---|
| 2230 | Are the "Asm->MBBSectionRanges" sorted? If not, do we need to find the highest EndLabel and only add that one? | |
Update the change based on the feedback
- Refactor NFC change to a separate revision, https://reviews.llvm.org/D108271
- Use a non-LTO unit test
- Set EndLabel once after the loop
- Simpliy the logic to directly use EndLabel for end_sequence address
Rebase. getDwarfCompileUnitID -> getDwarfCompileUnitIDForLineTable in https://reviews.llvm.org/D108271
| llvm/lib/MC/MCObjectStreamer.cpp | ||
|---|---|---|
| 517–518 ↗ | (On Diff #367079) | Still curious about this question ^ | 
| llvm/lib/MC/MCObjectStreamer.cpp | ||
|---|---|---|
| 517–518 ↗ | (On Diff #367079) | I've attempted to delete this symbol diff check in https://reviews.llvm.org/D108261?id=367123 but Assertion failed: Abs && "We created a line delta with an invalid expression" So, I restored this logic back. As for InSet = false, I think it confirms the absolute symbol difference, or this check can be true optimistically by https://github.com/llvm/llvm-project/blob/d480f968ad8b56d3ee4a6b6df5532d485b0ad01e/llvm/lib/MC/MachObjectWriter.cpp#L680-L681. DebugInfo/X86/gmlt.test | 
| llvm/lib/MC/MCObjectStreamer.cpp | ||
|---|---|---|
| 517–518 ↗ | (On Diff #367079) | This seems suspicious to me - "isSymbolRefDifferenceFullyResolvedImpl" (also a bit suspicious that this is calling an "Impl" function - usually those are only used as implementation details for some non-impl function, where the latter is intended as the general entry point) should only return false if the two symbols aren't in the same section. I wouldn't've thought that should happen - when/where/how/why does that happen? | 
Use isSymbolRefDifferenceFullyResolved instead of isSymbolRefDifferenceFullyResolvedImpl
| llvm/lib/MC/MCObjectStreamer.cpp | ||
|---|---|---|
| 517–518 ↗ | (On Diff #367079) | I've just replaced isSymbolRefDifferenceFullyResolvedImpl by isSymbolRefDifferenceFullyResolved.  Are you suggesting a new API isSymbolRefDifferenceFullyResolved(constMCAssembler &Asm, constMCSymbolRefExpr *A, constMCSymbolRefExpr *B) that defaults with InSet = false, and use it here? | 
| llvm/lib/MC/MCObjectStreamer.cpp | ||
|---|---|---|
| 517–518 ↗ | (On Diff #367079) | Sorry, no I don't mean to rename or refactor this API. I'd like to understand why this function call ever produces a result of false - it seems to me that EndLabel and LastLabel should always be from the same section, and if they are, I think this function should never return false - so I think I'm misunderstanding something/don't understand where these labels are coming from such that they could end up being from different sections. I'd like to understand how that situation can arise before approving this patch - to understand better why this approach/test is suitable. | 
| llvm/lib/MC/MCObjectStreamer.cpp | ||
|---|---|---|
| 517–518 ↗ | (On Diff #367079) | I'm not familiar with the whole logic, but I've traced a particular case below. DebugInfo/X86/cu-ranges.ll This test runs with -function-sections so it appears each function goes to each section (comdat?). | 
Set EndLabel per Section. A compilation unit (CU) can have multiple sections (e.g., comdat) so track EndLabel per section.
This seems a way to avoid unnecessary symbol diff check because EndLabel and LastLabel would be in the same section.
I'm a bit confused/unclear how this algorithm works/how this fixes things and whether it's the right direction. It'll be a while before I can find the time/brain bandwidth to really dig into this - so perhaps you can help me.
What I'm confused by is why multiple line tables would be unterminated & then end up needing to terminate them all/figure out which one to terminate? Instead I'd expect whenever another line table got an entry, the previously current line table entry would need to end - so there shouldn't be a need for the extra maps & maybe we can just keep track of the current "open" line table (whichever was the last one to emit an entry)?
Here is my understanding. Basically a line table is emitted per compilation unit (CU). For simplicity, I consider two cases:
- When multiple sections exist in a CU, we still want to emit them into a single line table (with multiple end_sequences per section).
- In a LTO case, multiple CUs appear in a single (merged) object. In that case, we want to emit multiple line tables per CU.
So, this algorithm tries to track the end label of each function range per section in a CU. Likewise each line entries are already aggregated per section in a CU. This ensures emitting the end label per section per CU.
In case for asm parser path (instead of normal compilation path), those line entries/function range may not be emitted, so in that case (EndLabel = null), I fall back to the old logic that just uses the section end label conservatively.
This is two examples that this change tries to fix/improve.
- Multiple sections in a CU case -- similar to Comdat (function per section).
void f2()
{
}
// The end of .text in CU line table
__attribute__((section(".s1")))
void f1()
{
}
// The end of .s1 in CU line table
__attribute__((nodebug,section(".s1")))
void f1nodebug()
{
}
// --> Before this change: The line table pointed to here because it's the end of the section .s1.
#if 0
// llvm-objdump -d t1.o
Disassembly of section .text:
0000000000000000 <f2>:
..
       5: c3                            retq
Disassembly of section .s1:
0000000000000000 <f1>:
..
       5: c3                            retq
0000000000000010 <f1nodebug>:
..
      15: c3                            retq
// llvm-dwarfdump --debugline t1.o
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000000000      2      0      1   0             0  is_stmt
0x0000000000000004      3      1      1   0             0  is_stmt prologue_end
0x0000000000000006      3      1      1   0             0  is_stmt end_sequence
0x0000000000000000      8      0      1   0             0  is_stmt
0x0000000000000004      9      1      1   0             0  is_stmt prologue_end
0x0000000000000006      9      1      1   0             0  is_stmt end_sequence ---> Before this change, it pointed to offset 16, the end of the section .s1.
#endif- LTO case: Multiple CUs in a single section
// LTO object effectively has the view of the merged CUs.
// CU1 from t1.c
void t1()
{
}
// End of .text in CU1 line table
// CU2 from t2.c
void t2()
{
}
// End of .text in CU2 line table
#if 0
$ llvm-objdump -d lto.o
0000000000000000 <t1>:
...
       5: c3                            retq
0000000000000010 <t2>:
..
      15: c3                            retq
$ llvm-dwarfdump --debug-line lto.o
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000000000      4      0      1   0             0  is_stmt
0x0000000000000004      5      1      1   0             0  is_stmt prologue_end
0x0000000000000006      5      1      1   0             0  is_stmt end_sequence --> Before this change, it pointed to offset 16 (the end of .text section).
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000000010      4      0      1   0             0  is_stmt
0x0000000000000014      5      1      1   0             0  is_stmt prologue_end
0x0000000000000016      5      1      1   0             0  is_stmt end_sequence
#endif^ Presumably multiple line tables per Module, one per CU.
So, this algorithm tries to track the end label of each function range per section in a CU. Likewise each line entries are already aggregated per section in a CU. This ensures emitting the end label per section per CU.
In case for asm parser path (instead of normal compilation path), those line entries/function range may not be emitted, so in that case (EndLabel = null), I fall back to the old logic that just uses the section end label conservatively.
Ah, I see, and you've in the "normal compilation path" the change here in DwarfDebug's endFunctionImpl to add an end label for the section. Wouldn't this have issues if you interleave functions from CUs in the same section - like CU1:func1:.text, then CU2:func2:.text, then CU1:func3:.text. (ah, right, I mentioned this in my first comment, but I'm feeling more like fixing the issue your seeing should involve the more general fix to that interleaved issue too)
Yeah, in that case you get CU1's line table covering the whole range, including func2, which isn't intended/desirable.
So I think this solution you have is incomplete & I'd say it's essentially the same bug - at least I think of it that way. But I guess it depends what sort of "invalid debug_line table in the final dsym" you're dealing with - they're all not great, but maybe there's a more severe invalidity in the end case?
I guess what I'm suggesting would probably still require that extra signal from DwarfDebug in the normal compilation path, but it could be more robust/address the interleaved issue which wouldn't be fixed by this approach.
What if whenever the section changes (in raw MC) or when a new line entry is emitted to another CU (or in DwarfDebug, if a nodebug function starts) - then emit an end_prologue to whatever the current open line table is? (have to keep track of that, I guess - "last emitted line table MCLineDivision")
That would miss some opportunities for shorter line table encodings (in the func1/2/3 scenario above, if func2 was in another section, then the line table for func1/3 wouldn't actually need to have two separate chunks - they could be contiguous) - so more advanced would be keeping "Last MCLineDivision per section". Terminate the last MCLineDivision whenever somethingr is emitted to that section that isn't the same division. DwarfDebug would have one extra bonus: If it starts a nodebug function, it could also pre-emptively terminate any current MCLineDivision.
I think that'd fix all these issues, probably?
Thanks for comments! I roughly get your points but may need sometime to digest it.
I may misunderstand an important piece on your example -- CU1:func1:.text, then CU2:func2:.text, then CU1:func3:.text.
My understanding is CU is a compilation unit which seems to be processed sequentially (in LLVM). How could we interleave CUs like this? I thought we only interleave sections within a CU.
Does this interleaving come from a ThinLTO compilation? I would appreciate it if you can provide an example.
Thanks for comments! I roughly get your points but may need sometime to digest it.
I may misunderstand an important piece on your example -- CU1:func1:.text, then CU2:func2:.text, then CU1:func3:.text.
My understanding is CU is a compilation unit which seems to be processed sequentially (in LLVM).
Ah, that's not quite correct - a CU exists both by its existence in the cu list named module metadata, and by its reference from DISubprograms (which exist by reference from llvm::Functions). A CU is not processed atomically - some handling is done up front (at the start of the module, iterating the named module metadata list of CUs) and some is done at the end of the module - but between those, for all the DISubprograms, they're visited in the order of the llvm::Functions that reference them - so they can be arbitrarily interleaved.
How could we interleave CUs like this? I thought we only interleave sections within a CU.
Does this interleaving come from a ThinLTO compilation? I would appreciate it if you can provide an example.
I don't know that I have a concrete example that creates the interleaving I can reproduce by hand - but it's not an IR invariant that all Functions that reference CUs through DISubprograms must be contiguous for that CU.
To create such a situation by hand, create two files, let's say a.cpp and b.cpp and have two functions a1 and a2 in a.cpp and main in b.cpp, have main call a1 and a2. Compile to IR, link the IR, hand-modify the IR to reorder the functions so it goes a1, then main, then a2 - then you should be able to reproduce/see the strange situation where the line table for a.cpp continues/extends over main/b.cpp.
Incorporated the feedback.
- Bookkeep the current line table for CU so that when a line entry needs to be emitted for different CU, terminate the current line table by injecting an end sequence.
- Didn't break this on section change, becuase the line table is already being tracked per section.
- Still didn't handle nodebug function case -- no debug function may span debug functions in the line table. Terminating the line table was a bit tricky becuse no debug function basically does not have a reference line entry. But I think this impreciion is less concern in practice.
Rather than trying to do this at the MCDwarf/streamer level - what if we did this in the AsmPrinter/DwarfDebug level? It could call some kind of MC function to terminate the line table because it knows when to do so? (it'd terminate basically whenever DwarfDebug was about to change the value of its PrevCU member (so in DwarfCompileUnit::addRange/DwarfDebug::setPrevCU and in DwarfDebug::skippedNonDebugFunction))
Incorporate the feedback
- Terminate the line table when switching CU in DwarfDebug.
- Update the test since this now handles nodebug case correctly.
| llvm/include/llvm/MC/MCDwarf.h | ||
|---|---|---|
| 221 | When does this case come up? I think all of this would only happen when PrevCU was non-null/had been populated with some content first, right? | |
| llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp | ||
| 2180 | presumably this function shouldn't be called if PrevCU == NewCU, right? (maybe that could be asserted, rather than tested) | |
| 2184 | Does is this empty? I /think/ that "PrevCU" is only set when PrevCU has already been populated with some content, right? | |
| llvm/lib/MC/MCDwarf.cpp | ||
| 245–250 | Would be nice if we could eliminate the need for this case - since any time this gets used it risks being because a line table was allowed to "flow off the end" further than it should. What would it be like if we handled the line table similar to the way ranges are handled? Always terminated at the end of each function, but then extended if the next function happens to start immediately after the last entry ended? | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 245–250 | Yeah. That sounds great, but it seems similar to what I initially tried, which tracked the end label of function as functions were added. | |
Fix the test failures by restoring PrevCU check instead of assertion.
- Line table entrty insertion is independent of PrevCU creation.
| llvm/include/llvm/MC/MCDwarf.h | ||
|---|---|---|
| 221 | It turned out that there is the case where the line entry is empty in PrevCU. So I restored it with this check. | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 245–250 | From the second thought, addLineEntry is used during function emission (which is independent of Range operation) while Range is added at the end of function. It doesn't seem to work to extend the line table using the end of function range. | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 196–198 | When does this occur? I guess when the last function with debug info in this section is followed by a function in another section (with or without debug info) or a nodebug function? That does seem a bit subtle and like it'd be nicer if this API's invariant was just that all callers terminated their own lists. (this would involve fixing up the GenDwarfForAssembly codepath to do that too, I'd guess) For DwarfDebug/etc I guess this would be addressed by calling terminateLineTableForPrevCU (or perhaps resetPrevCU) in finishModule or whatever it's called? | |
| 245–250 | Yeah, though I was thinking more driven by the DwarfDebug code so it's not tracked in two different places that might diverge/reduce the code/logic needed to handle these cases. I think alignment doesn't break this strategy (it doesn't seem to break it for the ranges data using this approach) - though the alignment does come between the start of one function and the beginning of the next function - the code in DwarfDebug/DwarfCompileUnit/etc extends the CU ranges to cover that & hopefully the line table could be powered by the same logic. | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 196–198 | I can see we can terminate the line table for DwarfDebug in finishModule via resetPrevCU -- this will use an end label of range. However, for assembly path, I don't think we have explicit labels/symbols to terminate. I may introduce an api to patch the line table for all sections that do not have an end entry at the end. But in that case, I still need to synthesize/emit a label using section symbol like MCStreamer::endSection. It's just moving the place when we fill this gap -- either here or before coming here. | |
- Ensure the line table is terminated in DwarfDebug (compiler) path.
- Still preserve the old code to synthesize the end entry for the assembly path.
Looks roughly right - one trailing question I wouldn't mind knowing the answer to (might merit a comment so it can be cleaned up later) - and also, maybe worth adding an assembly test case. Something like this:
.text .file 1 "small.c" .loc 1 1 0 nop .section .text.2 .loc 1 2 0 nop .text nop
Showing that line table locations in assembly do flow on to the next chunk of a section even if there's an intermediate section switch. So unlike with the DwarfDebug case, which can end the line table after the function/whenever switching compilation units - the assembly mode might have any number of outstanding sections that need to be terminated at the end. (I guess in theory we could optimize the IR/DwarfDebug case slightly by keeping some of these open - but I don't think it likely in practice & seems cleaner with what this patch does now)
| llvm/include/llvm/MC/MCDwarf.h | ||
|---|---|---|
| 221 | When does that case arise? (maybe when a function is zero-length/has no instructions? That's something I/we would like to fix at some point, FWIW) | |
| llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp | ||
| 497 | Unrelated change - could commit this separately. | |
Terminate the line table aligning with Range creation:
- When a new range is formed due to section or CU change
- When nodebug function starts
- When module is finished
From the above, I think the line table is well-formed in DwarfDebug (normal compiler path).
No need specialization for asm streamer or object stremer to end the line table.
However, the MC path seems to still terminate using the section end symbol conservatively,
because we do not track the range like the DwarfDebug path as above.
So, I still preserve emitDwarfLineEndEntry when no end entry is emitted in the loop.
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | When does this situation come up? (I think this question got lost from its previous place here: https://reviews.llvm.org/D108261#inline-1086075 during refactoring) (if it is necessary, it might be suitable to use find on the MCLineDivisions rather than operator[] (so this call doesn't create an entry when it's empty/unused)) | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | Even though we have a range, we may not have any debug line entry in unit tests. Makes sense using find. Will update it with additional assembly test. | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | Seems like we shouldn't be adding extra API surface area only for unit tests, though? But it's possible this comes up in real cases of empty functions - if that's the case (worth testing/validating), leaving a FIXME for when we remove/fix the existence of empty functions would be good. | |
- Add an assembly test to cover the MC path that still emits the end entry using the section end symbol.
- Add a check for addEndEntry before adding an end entry. The MCStreamer path may not populate the line entry.
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | find is preferred over count - so that the result can be used rather than a duplicate lookup: auto I = MCLineDivisions.find(Sec);
if (I != MCLineDivisions.end()) {
  auto &Entries = I->second;
  ...
}Though I'm still curious to better understand under what conditions this is needed. | |
| llvm/test/DebugInfo/XCOFF/empty.ll | ||
| 230 | Hmm, what aspect of the change caused these labels to change name? | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | It turned out the MCAsmStreamer path (as opposed to MCObjectStreamer) typically doesn't populate the line entry. | |
| 148 | Yea. find seems better since I need to use the map anyhow inside. Will update it. As commented above in the code, the assembly output path (MCAsmStreamer as opposed to MCObjectStreamer), the line entry is not added instead .loc directives are emitted during the function emission. So, the line entries are often irrelevant in the assembly output except a certain target -- I guess XCOFF shown below in the tests. | |
| llvm/test/DebugInfo/XCOFF/empty.ll | ||
| 230 | I think this XCOFF seems a unique path that generates the line table for the assembly output in DwarfDebug. | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | Ah, OK! Right - the solution to that would be to have a virtual function in MCStreamer that's differently implemented (a no-op in the asm streamer case - maybe someone'll eventually extend the assembly syntax to support terminating line contributions - so nodebug could be properly done in assembly) this would be side-by-side with the MCStreamer::emitDwarfLocDirective - or, perhaps it could use that specific function, with an extra parameter or flag value for "end entry"? | |
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | Yea. I tried to create emitTerminateLineTable in MCStreamer and differently extends it for MCAsmStreamer, which resolved the most cases. define i32 @main() !dbg !960 {
entry:
  %call1.i = tail call %"class.std::basic_ostream"* @test(%"class.std::basic_ostream"* @_ZSt4cout, i8* getelementptr inbounds ([6 x i8], [6 x i8]* @.str, i64 0, i64 0), i64 5)
  ret i32 0
}This means the function appears to have a debug info, but there is no tag like DILocation for each line. | |
Thanks for all the work/iteration/research here - sorry it was a bit fussy.
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | Ah, hmm - yeah, there's certainly ways we could arrive at a function without any instructions having a location. Though I wonder what the line table should look like for that? Not having the line table cover those instructions seems a bit weird too. But if that's the state of things today, might as well leave ti as-is. Could you include a comment describing how that occurs with the MCObjectStreamer use case as well as the one outlined with MCAsmStreamer? | |
Thanks for the thorough review! I had a chance to dive a bit in this area.
Indeed, this debug world seems subtle to make it right.
| llvm/lib/MC/MCDwarf.cpp | ||
|---|---|---|
| 148 | For the above case where instructions have no location at all, the line entries in the table didn't appear in dwarf although other contents appeared. Yeah. Will update the comments to include two cases (MCAsmStreamer and MCObjectStreamer). | |
When does this case come up? I think all of this would only happen when PrevCU was non-null/had been populated with some content first, right?