Page MenuHomePhabricator

[DEBUG_INFO] fix .loc directives emitted for missing prologues
Needs ReviewPublic

Authored by TWeaver on Jul 17 2018, 7:54 AM.

Details

Reviewers
probinson
CarlosAlbertoEnciso
Group Reviewers
debug-info
Summary

Debug line entry tables are generated from .loc directives in the assembly. .loc directives should be generated for function prologues when a prologue is generated. They should be omitted when a prologue is stripped.

The current behaviour of the DwarfDebug ASMPrinter is to print a .loc directive for a prologue, missing or otherwise if a function has user code.

As a result, a .loc directive was being emitted for prologue code that had been stripped.

This lead to illegal DWARF line entries being generated, where the first instruction of a function with a stripped prologue being mapped to two lines of source line - the function opening brace and the first line of user code.

This patch implements a fix that checks for prologue code and avoids emitting .loc directives when it's not found.

I've also written a regression test that aims to force the backend to generate functions in assembly with missing prologues and checks that .loc directives are missing for stripped prologues.

This patch has been tested on Linux Ubuntu (VM) to elf and Windows 10 to exe, targetting x86_64 architectures.

I've also run a before and after with the LLDB and GDB test suites and found no new failing tests.

Diff Detail

Event Timeline

TWeaver created this revision.Jul 17 2018, 7:54 AM

Could you provide a small example dump of the invalid line table you're addressing?

TWeaver edited the summary of this revision. (Show Details)Jul 17 2018, 10:05 AM
TWeaver added a comment.EditedJul 17 2018, 10:27 AM

Could you provide a small example dump of the invalid line table you're addressing?

Hi David,

thanks for your review.

I built a pre-patch version of clang and recompiled the source in the test (which I've updated as was out of date).

I fed the resulting .elf into dwarfdump with -l and got the following output:

root@clangbox-VirtualBox:~/dev/clang/lldb_release/bin# dwarfdump -l leafCall.elf

.debug_line: line number info for a single cu
Source lines (from CU-DIE at .debug_info offset 0x0000000b)

NS new statement, BB new basic block, ET end of text sequence
PE prologue end, EB epilogue begin
IS=val ISA number, DI=val discriminator value

<pc>        [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
0x00400480  [   2, 0] NS
0x00400480  [   3, 3] NS PE
0x00400490  [   7, 0] NS
0x00400490  [   8, 3] NS PE
0x00400496  [   8, 3] NS ET

You can see there's two sets of two line entries (one for each function) where a single instruction is mapped to two different source lines.

TWeaver updated this revision to Diff 155916.Jul 17 2018, 10:37 AM

updated test .cpp source description with correct source.

The repeated address is syntactically valid per the rules for the line-number program, but semantically undefined per the rationale in the non-normative description of the table that the line-number program is supposed to be encoding (i.e., a table indexed by instruction address).

I feel like maybe the solution here should be in the MC integrated
assembler - to not emit zero-length sequences in the line table, regardless
of whether it's in the prologue or anywhere else?

Could you provide a small example dump of the invalid line table you're addressing?

I fed the resulting .elf into dwarfdump with -l and got the following output:

What is the ouput generated by llvm-dwarfdump?

I feel like maybe the solution here should be in the MC integrated
assembler - to not emit zero-length sequences in the line table, regardless
of whether it's in the prologue or anywhere else?

I agree.

There is future work planned (at my office) to address the interpretation of a sequence of .loc directives with no interleaved instructions.

I've hand rolled a .asm file with multiple sequential .loc directives and find similar results, regardless of prologue or otherwise.

However, this particular piece of work addresses what I perceived to be an artefact, a .loc directive for a missing prologue.

If you believe it best to address both issues within this one patch i'd be happy to accommodate.

However, I'm being ushered onto another piece of work post this patch, so may take a little bit of time to get it done.

Thanks for your time
Tom

Could you provide a small example dump of the invalid line table you're addressing?

I fed the resulting .elf into dwarfdump with -l and got the following output:

What is the ouput generated by llvm-dwarfdump?

Hiya Carlos,

thanks for your review.

heres the line table dump from llvm-dwarfdump.

.debug_line contents:
debug_line[0x00000000]
Line table prologue:
    total_length: 0x00000070
         version: 4
 prologue_length: 0x0000004e
 min_inst_length: 1
max_ops_per_inst: 1
 default_is_stmt: 1
       line_base: -5
      line_range: 14
     opcode_base: 13
standard_opcode_lengths[DW_LNS_copy] = 0
standard_opcode_lengths[DW_LNS_advance_pc] = 1
standard_opcode_lengths[DW_LNS_advance_line] = 1
standard_opcode_lengths[DW_LNS_set_file] = 1
standard_opcode_lengths[DW_LNS_set_column] = 1
standard_opcode_lengths[DW_LNS_negate_stmt] = 0
standard_opcode_lengths[DW_LNS_set_basic_block] = 0
standard_opcode_lengths[DW_LNS_const_add_pc] = 0
standard_opcode_lengths[DW_LNS_fixed_advance_pc] = 1
standard_opcode_lengths[DW_LNS_set_prologue_end] = 0
standard_opcode_lengths[DW_LNS_set_epilogue_begin] = 0
standard_opcode_lengths[DW_LNS_set_isa] = 1
include_directories[  1] = "/home/clangbox/dev/clang/lldb_release/bin"
file_names[  1]:
           name: "leafCall.cpp"
      dir_index: 1
       mod_time: 0x00000000
         length: 0x00000000

Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x0000000000400480      2      0      1   0             0  is_stmt
0x0000000000400480      3      3      1   0             0  is_stmt prologue_end
0x0000000000400490      7      0      1   0             0  is_stmt
0x0000000000400490      8      3      1   0             0  is_stmt prologue_end
0x0000000000400496      8      3      1   0             0  is_stmt end_sequence

There is future work planned (at my office) to address the interpretation of a sequence of .loc directives with no interleaved instructions.

Ah, great - I think that's the right fix to cover this isuse as well.

I've hand rolled a .asm file with multiple sequential .loc directives and find similar results, regardless of prologue or otherwise.

Is that only the behavior of LLVM's integrated assembler - does gas (GCC assembler) skip these zero-length regions? (so we can more strongly justify this as a bug-fix/quality improvement (rather than a divergence from GCC's behavior))

However, this particular piece of work addresses what I perceived to be an artefact, a .loc directive for a missing prologue.

I'm not sure it's really a bug here, though - if the assembler addressed the sub-optimality of producing zero-length regions. It seems to me like a natural implementation to change the location, emit the prologue, then change the location again - and if the prologue happens to be empty/zero-length, the correct output falls out naturally. Avoiding special casing the empty prologue keeps the code simple/easier to read/understand, I think.

However, this particular piece of work addresses what I perceived to be an artefact, a .loc directive for a missing prologue.

I'm not sure it's really a bug here, though - if the assembler addressed the sub-optimality of producing zero-length regions. It seems to me like a natural implementation to change the location, emit the prologue, then change the location again - and if the prologue happens to be empty/zero-length, the correct output falls out naturally. Avoiding special casing the empty prologue keeps the code simple/easier to read/understand, I think.

Intuitively, if we see multiple .loc for the same address, the last one should win. If we can be sure that ignoring the first .loc is the right solution, in all cases where we find a sequence of .loc directives for the same address, then handling it in the assembler is probably the right fix.

Certainly hand-written assembler can have multiple .loc in a row, and what gas does with this is a fair question.

I wonder if we should experiment with making the line-table builder complain about a .loc that doesn't increment the address. That would tell us whether these zero-length cases show up anywhere else. Or, if we decide the assembler should handle it correctly, whether anything slips through.

Hello fellow LLVM contributors.

As my back log at work is now cleared I'm being refocused onto fixing optimized debugging issues.
As a result, I'm resurrecting this piece of work in the hope of getting the line table entries for striped prologues sorted hence-forth.

The current version of this patch was "semi" rejected due to special casing around missing prologues..
it was propsed that a cleaner way to fix this issue would be to avoid emitting line entries with address deltas with 0 values.

I've done a little bit of poking around to see if this is a viable and ran into some issues - primarily that address delta calculations tend to happen in the assembler at the fragment relaxation stage. This is probably to late in the pipe line to be avoiding emitted line entries (I think, I welcome discussion on that point).

as a result, I've instead decided to look at changing the MCObjectStreamer behaviour around reading .loc directives. That is, when two or more directives are read one after another with not interceding instruction, only the last of the set should be streamed/emitted/interpreted for later emission as a line table entry.

What are peoples thoughts on this approach?

Thanks in advance
Tom W

@probinson

I've also taken a look at GAS and tested it's behaviour and have found it mirrors clangs, that is, it will happily output line entries for multiple .loc directives with no interceding instructions - thus leading to line table entries with address delta's of 0.

So, I've been poking around in MCObjectStreamer.cpp and have found the area where multiple .loc directives are emitted, however,

A previous commit was made in 2013 that intentionally implemented this behaviour, you can see the commit here:

https://github.com/llvm-mirror/llvm/commit/4df4bccc71ea0477836db9a417d3da202c2baa09

I've commented out line 269 (in the commit linked) and get the desired beheviour for this patch, however, judging from the commit message, the original author is relying on this behaviour for other functionality (debugger related).

So GAS does the same thing as Clang's integrated assembler currently (pre-patch) and that behavior is problematic for GDB and LLDB? (only for LLDB) so the difference is that the assembly clang generates includes these zero-address-length locations, but GCC doesn't?

What's the problematic cases/behavior in the debuggers or other DWARF consumers?

> So GAS does the same thing as Clang's integrated assembler currently (pre-patch) and that behavior is problematic for GDB and LLDB?

GAS has the same behaviour as Clang's integrated assembler. When an assembly file with multiple .loc directives with no interveaning instructions are assembled, GAS produces a DWARF line table that maps one instruction to multiple entries.

Commit r184357 copied this behaviour from GAS/GCC to Clang. Before this commit Clang would only produce a line table entry for the last .loc directive read for a particular instruction.

This behaviour is not problematic for LLDB or GDB, having tested both debuggers and single stepped through the disassembly, source and set break points on the "troublesome" lines that correspond to the line entries, both LLDB and GDB handle the issue gracefully by stepping to the line corresponding to first bit of user code.

I've tried this with and without a prologue_end flag and both GDB and LLDB handle this case fine.

> (only for LLDB) so this difference is that the assembly clang generates includes these zero-address-length locations, but GCC doesn't?

GCC and Clang share the same behaviour. Both produce .loc directives and subsequent line table entries for missing prologues and user code that map to the same assembly instruction.

0 // simple.cpp
1 #include "basic.h"
2 int main(int argc, const char * argv[])
3 {
4   int result = basic(argc);
5   return result;
6 }

0 // basic.cpp
1 int basic(const int arg)
2 {
3   int result = arg + arg;
4   return result;
5 }

Then the following line table dumps for both clang and gcc builds:

Clang bleeding edge with -O2 -g -fno-inline:
> snipped
<pc>        [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
0x00400480  [   3, 0] NS uri: "/home/clangbox/dev/temp/./simple.cpp"
0x00400483  [   4,16] NS PE
0x00400488  [   5,17] NS
> snipped

> snipped
<pc>        [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
0x00400490  [   2, 0] NS uri: "/home/clangbox/dev/temp/./basic.cpp"
0x00400490  [   4,17] NS PE
0x00400493  [   4, 3]
> snipped

GCC version 7.3.0 on Ubuntu 1~18.04 with -O2 -g -fno-inline:
> snipped
<pc>        [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
0x000004f0  [   3, 0] NS uri: "/home/clangbox/dev/temp/simple.cpp"
0x000004f1  [   3, 0] NS
0x000004f3  [   4, 0] NS
> snipped

> snipped
<pc>        [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
0x00000610  [   2, 0] NS uri: "/home/clangbox/dev/temp/basic.cpp"
0x00000610  [   4, 0] NS
0x00000613  [   5, 0] NS
> snipped

> What's the problematic cases/behaviour in the debuggers or other DWARF consumer's?

as stated above, GDB and LLDB both handle this case, line table entries for missing prologue, gracefully. I've tested several ways of stepping and setting break points and both debuggers always manage to figure out that the instruction with 2 mappings is, in fact, mapped to the first line of source, not the curly brace denoted by the prologue line entry.

However, our internal debugger, on the example above, when stepping into basic.cpp pauses on the curly brace at line 2, then skips over the expression on line 3 straight to the return value. Giving the user a "wonky" optimized debugging experience.

very gentle and polite ping.

I'm OK with this change from a "slightly reduces debug info size" but I hesitate to suggest that this isn't a thing your consumer should support & this "When an assembly file with multiple .loc directives with no interveaning instructions are assembled, GAS produces a DWARF line table that maps one instruction to multiple entries." sounds like a bit of a mischaracterization of how I'd read the line table entries in question - I think this line table doesn't "map one instruction to multiple entries" but instead has an empty entry that contains no instructions. Everything between one .loc and the next is at that first location - if there's nothing between them, then nothing is at that location. I suspect that's how other consumers are viewing this situation.

If this is an invariant you want (that there are no zero-length ranges in the line table), maybe it makes sense to implement it as an option in the integrated assembler, rather than special casing this particular instance of zero-length ranges in the line table?

They seem useless enough, but important to reproduce accurately if that's what the assembly says and other assemblers portray that information - but for our own uses we can say "these are meaningless, elide any zero-length location range"?

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1343–1346

I'd probably swap this around for an early return:

if (!isPrologueInstruction(MI))
  return std::make_pair(...);
hasPrologue = true;