This is an archive of the discontinued LLVM Phabricator instance.

[llvm-dwarfdump] - Change how dwarfdump dumps .debug_ranges
Abandoned (Public)

Authored by grimar on Aug 3 2017, 8:37 AM.

Details

Summary

In the D36097 thread it was pointed out that we currently have no
testcases for the section index API in libDebugInfo.

The idea was to add some generally useful functionality to a tool
such as llvm-dwarfdump, which would use that API and so require testcases.

This patch introduces the following changes:

  1. Teaches the .debug_ranges dumper to render the section index of each range.
  2. Teaches the dumper about base address selection entries and how to render ranges properly when the section contains them. (Previously the resulting range was shown incorrectly, because the base address was not added to it.)
  3. Cosmetic: adds a header to the output.
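
To make item 2 concrete, here is a minimal C++ sketch of how a raw .debug_ranges entry can be classified under the DWARF v4 rules: an all-zero pair ends the list, and a pair whose first value is the largest representable address selects a new base. The names RawEntry, EntryKind, maxAddress and classify are hypothetical; they are not taken from this patch or from libDebugInfo.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only (not LLVM's DWARFDebugRangeList).
enum class EntryKind { EndOfList, BaseAddressSelection, OffsetPair };

struct RawEntry {
  uint64_t Start;
  uint64_t End;
};

// Largest value representable in an address of AddrSize bytes.
uint64_t maxAddress(unsigned AddrSize) {
  assert((AddrSize == 4 || AddrSize == 8) && "only 4- and 8-byte addresses");
  return AddrSize == 8 ? UINT64_MAX : UINT32_MAX;
}

EntryKind classify(const RawEntry &E, unsigned AddrSize) {
  if (E.Start == 0 && E.End == 0)
    return EntryKind::EndOfList;
  if (E.Start == maxAddress(AddrSize))
    return EntryKind::BaseAddressSelection; // E.End holds the new base address
  return EntryKind::OffsetPair;             // offsets relative to the current base
}
```

Note that classification alone says nothing about the initial base address, which is not stored in the list itself.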

Event Timeline

grimar created this revision. Aug 3 2017, 8:37 AM
dblaikie edited edge metadata. Aug 3 2017, 11:27 AM

A couple of things

  1. If possible, I think it'd be better if it printed the section name, rather than the section index
  2. I'm not sure how practical it is to actually process base address selection entries when rendering debug_ranges. I was looking at some related behavior today: parsing the debug_ranges section depends on knowing which CU refers to which range list, so short of parsing all the CUs to find every reference to a range list, it's hard to tell which initial base address applies to which range list. Note that what is currently dumped for debug_ranges looks quite different from what is dumped for a range inside debug_info. The debug_ranges dump includes the literal base address selection entries just as normal entries, without showing their effect, whereas the debug_info dump processes the list and shows the effect of the base address selection entries (and of the default base address).

How does your patch deal with handling the default base address? Given that it varies between each range list in debug_ranges and would only be known by determining which CU refers to that range list (which can only be known by walking all the DWARF DIEs to find the points that refer to range lists)? Does it not use a default base address? (I guess that's not the case) Does it use the default base address of the first CU?

I'd say probably leave the debug_ranges dumping as-is (I think there's a minor change that could be made to improve it a little*, but still leave it basically dumping the raw bytes, not the processed/semantic range list) but improve the range dumping in debug_info to include this info, perhaps?

For example: how does your patch dump... oh, that's a bug in debug_ranges emission (or a suboptimal feature). So in the example you're testing there shouldn't be any relocations or a base address selection entry - LLVM should produce DWARF that relies on the existing default base address, perhaps. Oh, maybe that's circular - so not a bug, but...

OK, so what we want to do is get LLVM to produce a situation where it does emit a range but the CU has a default base address. Let's see what I can conjure up...

OK, here we go. It's not the simplest reproduction (I don't know how to tickle LLVM to produce code like this naturally - but it's easy with a minor manual IR edit):

Take this source code:

void f1();
void f2() {
  f1();
  {
    int i;
    f1();
    f1();
  }
}

Compile to LLVM textual IR with debug info (clang++ test.cpp -g -c -S -emit-llvm).

Modify the resulting IR by swapping the order of the first two calls in f2. This creates a hole in the scope's range, forcing the use of DW_AT_ranges and the debug_ranges section, while the CU still has a low/highpc, which provides a default base address for the range list.
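
Why the swap creates a hole can be sketched in a few lines of C++. This is only an illustration: scopeRanges is a hypothetical helper, and the scope ids 7 and 12 simply mirror the !7 (function) and !12 (lexical block) metadata nodes of the IR. It computes the contiguous runs of instructions attributed to one scope; more than one run means the scope cannot be described by a single low/high pc pair.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns half-open [first, second) instruction-index ranges whose
// instructions are attributed to Scope. Hypothetical helper, not LLVM code.
std::vector<std::pair<std::size_t, std::size_t>>
scopeRanges(const std::vector<int> &ScopeOfInstr, int Scope) {
  std::vector<std::pair<std::size_t, std::size_t>> Ranges;
  std::size_t I = 0, N = ScopeOfInstr.size();
  while (I < N) {
    if (ScopeOfInstr[I] != Scope) {
      ++I;
      continue;
    }
    std::size_t Begin = I;
    while (I < N && ScopeOfInstr[I] == Scope)
      ++I;
    assert(Begin < I && "a range always covers at least one instruction");
    Ranges.push_back({Begin, I});
  }
  return Ranges;
}
```

With the original instruction order {7, 12, 12, 7} the lexical block yields one range; with the swapped order {12, 7, 12, 7} it yields two, which is the situation that needs a range list.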

Here's the modified IR:

define void @_Z2f2v() local_unnamed_addr !dbg !7 {
entry:
  tail call void @_Z2f1v(), !dbg !15
  tail call void @_Z2f1v(), !dbg !14
  tail call void @_Z2f1v(), !dbg !16
  ret void, !dbg !17
}
declare void @_Z2f1v() local_unnamed_addr
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5}
!llvm.ident = !{!6}
!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 6.0.0 (trunk 309873) (llvm/trunk 309879)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
!1 = !DIFile(filename: "range.cpp", directory: "/usr/local/google/home/blaikie/dev/scratch")
!2 = !{}
!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 4}
!6 = !{!"clang version 6.0.0 (trunk 309873) (llvm/trunk 309879)"}
!7 = distinct !DISubprogram(name: "f2", linkageName: "_Z2f2v", scope: !1, file: !1, line: 2, type: !8, isLocal: false, isDefinition: true, scopeLine: 2, flags: DIFlagPrototyped, isOptimized: true, unit: !0, variables: !10)
!8 = !DISubroutineType(types: !9)
!9 = !{null}
!10 = !{!11}
!11 = !DILocalVariable(name: "i", scope: !12, file: !1, line: 5, type: !13)
!12 = distinct !DILexicalBlock(scope: !7, file: !1, line: 4, column: 3)
!13 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!14 = !DILocation(line: 3, column: 3, scope: !7)
!15 = !DILocation(line: 6, column: 5, scope: !12)
!16 = !DILocation(line: 7, column: 5, scope: !12)
!17 = !DILocation(line: 9, column: 1, scope: !7)

Compiling that (clang ranges.ll) produces a ranges section that looks like this:

        .section        .debug_ranges,"",@progbits
.Ldebug_ranges0:
        .quad   .Lfunc_begin0-.Lfunc_begin0
        .quad   .Ltmp0-.Lfunc_begin0
        .quad   .Ltmp1-.Lfunc_begin0
        .quad   .Ltmp2-.Lfunc_begin0
        .quad   0
        .quad   0

So in this case you can only tell which section these offsets refer to by knowing which CU refers to this range list, since that CU supplies the default base address needed to interpret the list.

This mapping from generically parsed range lists to 'resolved' range lists happens in DWARFDebugRangeList::getAbsoluteRanges - so the computation/addition of the section name/index should probably happen here. (maybe this code maps to section index, and then the dumping code higher up maps the index to the name to print it out)
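
A rough sketch of what such a resolution step involves, assuming the DWARF v4 encoding. This is not the body of DWARFDebugRangeList::getAbsoluteRanges; resolveRanges, RawEntry and AddressRange are hypothetical names, and the default base is assumed to be supplied by the caller from the referring CU.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct RawEntry {
  uint64_t Start, End;
};
struct AddressRange {
  uint64_t LowPC, HighPC;
};

// Turns raw offset pairs into absolute ranges. DefaultBase is the base
// address of the CU that refers to this list; it is not stored in the list.
std::vector<AddressRange> resolveRanges(const std::vector<RawEntry> &List,
                                        uint64_t DefaultBase,
                                        unsigned AddrSize) {
  assert((AddrSize == 4 || AddrSize == 8) && "only 4- and 8-byte addresses");
  const uint64_t MaxAddr = AddrSize == 8 ? UINT64_MAX : UINT32_MAX;
  uint64_t Base = DefaultBase;
  std::vector<AddressRange> Result;
  for (const RawEntry &E : List) {
    if (E.Start == 0 && E.End == 0)
      break; // end-of-list entry
    if (E.Start == MaxAddr) {
      Base = E.End; // base address selection entry
      continue;
    }
    Result.push_back({Base + E.Start, Base + E.End});
  }
  return Result;
}
```

The same raw list resolves to different absolute ranges for different values of DefaultBase, which is why the section alone is not enough to produce them.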

* Both debug_ranges and debug_loc need the address size to even parse/dump the raw contents (not accounting for more CU-specific things like default base addresses, etc.), but they take different strategies to get it. debug_ranges dumping only works if you also dump debug_info: it scrapes the address size while dumping debug_info and uses it when dumping debug_ranges. debug_loc, on the other hand, forces the parsing of the first CU to retrieve its pointer size, so debug_loc dumping works even if debug_info is not dumped (and only the first unit header is parsed to achieve this). I think that's probably the right solution, and debug_ranges should do the same thing.
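
For a sense of how little parsing the debug_loc approach needs, here is a sketch that reads the address size from the first unit header of a .debug_info buffer. It assumes a 32-bit DWARF unit of version 2 to 4 and little-endian data; firstUnitAddressSize and readLE16 are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 16-bit value at Off (little-endian is an assumption).
static uint16_t readLE16(const std::vector<uint8_t> &Buf, std::size_t Off) {
  assert(Off + 2 <= Buf.size() && "read past end of buffer");
  return static_cast<uint16_t>(Buf[Off] | (Buf[Off + 1] << 8));
}

// Header layout assumed (32-bit DWARF, versions 2..4):
//   unit_length (4) | version (2) | debug_abbrev_offset (4) | address_size (1)
// Returns the address size, or 0 if the header cannot be interpreted.
unsigned firstUnitAddressSize(const std::vector<uint8_t> &DebugInfo) {
  const std::size_t HeaderSize = 11;
  if (DebugInfo.size() < HeaderSize)
    return 0;
  // A unit_length of 0xffffffff introduces 64-bit DWARF; not handled here.
  if (DebugInfo[0] == 0xff && DebugInfo[1] == 0xff && DebugInfo[2] == 0xff &&
      DebugInfo[3] == 0xff)
    return 0;
  uint16_t Version = readLE16(DebugInfo, 4);
  if (Version < 2 || Version > 4)
    return 0; // other versions use a different header layout
  return DebugInfo[10];
}
```

No DIEs or abbreviations are touched, so this works even when debug_info itself is not being dumped.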

grimar added a comment. Aug 4 2017, 5:50 AM

Thanks for the explanations and the testcase, David!
My comments are below.

A couple of things

  1. If possible, I think it'd be better if it printed the section name, rather than the section index

Then I believe we want to print both name and index, because there can be multiple sections with the same name:
.section .foo, "aw", @progbits, unique, 1
.section .foo, "aw", @init_array, unique, 2
.section .foo, "aw", @preinit_array, unique, 3
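
One way to render that, as a small sketch: the label format and the sectionLabel helper below are assumptions for illustration, not the output format of llvm-dwarfdump.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Builds a label such as ".foo [2]" so that sections sharing a name remain
// distinguishable. SectionNames is indexed by section index (hypothetical).
std::string sectionLabel(const std::vector<std::string> &SectionNames,
                         uint64_t Index) {
  assert(Index < SectionNames.size() && "section index out of range");
  return SectionNames[Index] + " [" + std::to_string(Index) + "]";
}
```

With three sections all named .foo at indices 2, 3 and 4, the labels differ only in the bracketed index, which is exactly the information the name alone would lose.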

How does your patch deal with handling the default base address? Given that it varies between each range list in debug_ranges and would only be known by determining which CU refers to that range list (which can only be known by walking all the DWARF DIEs to find the points that refer to range lists)? Does it not use a default base address? (I guess that's not the case) Does it use the default base address of the first CU?

I'd say probably leave the debug_ranges dumping as-is (I think there's a minor change that could be made to improve it a little*, but still leave it basically dumping the raw bytes, not the processed/semantic range list) but improve the range dumping in debug_info to include this info, perhaps?

You're right, this patch does not know about the default base address; it works only with base address selection entries, if there are any. I think I'll abandon it and try to improve .debug_info dumping instead, just as you suggested.

grimar abandoned this revision. Aug 4 2017, 8:13 AM

D36313 posted instead.