This is an archive of the discontinued LLVM Phabricator instance.

DwarfDebug: Pick next location in case of missing location at block begin
ClosedPublic

Authored by MatzeB on Oct 3 2018, 10:21 PM.

Details

Summary

Context: Compiler generated instructions do not have a debug location
assigned to them. However emitting 0-line records for all of them bloats
the line tables for very little benefit so we usually avoid doing that.

Not emitting anything will lead to the previous debug location getting
applied to the locationless instructions. This is not desirable for
block begin and after labels. Previously we would emit simply emit
line-0 records in this case, this patch changes the behavior to do a
forward search for a debug location in these cases before emitting a
line-0 record to further reduce line table bloat.

Inspired by the discussion in https://reviews.llvm.org/D52125

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB created this revision.Oct 3 2018, 10:21 PM

Comparing two test-suite builds gives me:

Tests: 1010
Metric: size

Program                                         base   changes diff

SingleSour...itTests/SignlessTypes/cast-bug    4600   4608     0.2%
SingleSour...tTests/ms_struct_pack_layout-1    4632   4640     0.2%
SingleSour...tTests/2010-05-24-BitfieldTest    4632   4640     0.2%
SingleSour...ion-C-2003-06-16-VolatileError    4720   4728     0.2%
SingleSour...egression-C-pointer_arithmetic    4864   4872     0.2%
SingleSour...ression/C/Regression-C-PR10189    5552   5560     0.1%
MultiSourc...enchmarks/VersaBench/dbms/dbms    50456  50504    0.1%
SingleSour...nitTests/2006-12-01-float_varg    8800   8808     0.1%
SingleSour...nitTests/2003-10-13-SwitchTest    8800   8808     0.1%
SingleSour...tTests/2007-04-10-BitfieldTest    8808   8816     0.1%
SingleSour...ession-C-2008-01-07-LongDouble    8832   8840     0.1%
SingleSour...s/Shootout/Shootout-nestedloop    8840   8848     0.1%
SingleSour.../2004-06-20-StaticBitfieldInit    8880   8888     0.1%
SingleSour...sion-C-2004-03-15-IndirectGoto    8888   8896     0.1%
SingleSour...UnitTests/2002-08-02-CastTest2    8896   8904     0.1%
               base       changes         diff
count  1.010000e+03  1.010000e+03  1010.000000
mean   1.773759e+05  1.773819e+05  0.000102
std    8.762489e+05  8.762656e+05  0.000246
min    4.600000e+03  4.608000e+03  0.000000
25%    1.456500e+04  1.456500e+04  0.000000
50%    6.281800e+04  6.283400e+04  0.000000
75%    9.593600e+04  9.593600e+04  0.000023
max    1.585930e+07  1.585930e+07  0.001739

> 0.1% smaller files on average.

vsk added a comment.Oct 4 2018, 11:02 AM

The core idea and test updates look good to me. I'm curious about the size changes: are you just measuring the size of executables, or of .o's? If it's the former, on Darwin, any size change resulting from this patch might indicate bugs.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1317

Why not use MachineBasicBlock::findDebugLoc?

1344

DebugLoc is pointer-sized, so it's fine to just copy it instead of returning a reference. This might create a little extra traffic through TrackingMDRef's operator=, but in the common case I'd expect the optimizer to get rid of most of it.

In D52862#1255529, @vsk wrote:

The core idea and test updates look good to me. I'm curious about the size changes: are you just measuring the size of executables, or of .o's? If it's the former, on Darwin, any size change resulting from this patch might indicate bugs.

I measured executable sizes. I tried some flags of llvm-size or llvm-dwarfdump but could not figure out how to measure the size of the line tables directly...

vsk added a comment.Oct 4 2018, 11:13 AM
In D52862#1255529, @vsk wrote:

The core idea and test updates look good to me. I'm curious about the size changes: are you just measuring the size of executables, or of .o's? If it's the former, on Darwin, any size change resulting from this patch might indicate bugs.

I measured executable sizes.

Ah, ok. Well, we know we have more work to do to eliminate codegen differences with/without debug info. We're planning on discussing this at the debug info BoF. (It's not a blocker for this patch.)

I tried some flags of llvm-size or llvm-dwarfdump but could not figure out how to measure the size of the line tables directly...

You can look at the .dSYMs or .o's if you have those around:
$ size -m main.dSYM/Contents/Resources/DWARF/main | grep debug_line
Section __debug_line: 60

In D52862#1255529, @vsk wrote:

The core idea and test updates look good to me. I'm curious about the size changes: are you just measuring the size of executables, or of .o's? If it's the former, on Darwin, any size change resulting from this patch might indicate bugs.

I measured executable sizes. I tried some flags of llvm-size or llvm-dwarfdump but could not figure out how to measure the size of the line tables directly...

llvm-objdump -h is what I usually use. 'bloaty' (a 3rd party tool) does slightly more precise accounting, though (can account for object file headers, etc), but only ELF I think. Be sure to measure both the section and the .rela section containing the relocations - though I suppose your change shouldn't be changing the number of relocations.

(& absolute changes - compared to the file size as a whole, as well as changes within the section are useful measures)

In D52862#1255543, @vsk wrote:
In D52862#1255529, @vsk wrote:

The core idea and test updates look good to me. I'm curious about the size changes: are you just measuring the size of executables, or of .o's? If it's the former, on Darwin, any size change resulting from this patch might indicate bugs.

I measured executable sizes.

Ah, ok. Well, we know we have more work to do to eliminate codegen differences with/without debug info. We're planning on discussing this at the debug info BoF. (It's not a blocker for this patch.)

The actual instructions are exactly the same, at least in the couple examples I picked.

I tried some flags of llvm-size or llvm-dwarfdump but could not figure out how to measure the size of the line tables directly...

You can look at the .dSYMs or .o's if you have those around:
$ size -m main.dSYM/Contents/Resources/DWARF/main | grep debug_line
Section __debug_line: 60

There are no .dSYM files around, should there be? Example compilation of 1 of the benchmarks:

$ /var/tmp/locbase/bin/clang -DNDEBUG  -save-temps=obj   -O0 -g -isysroot /Volumes/XCode/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk   -w -Werror=date-time -MD -MT SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o -MF SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o.d -o SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o   -c /Users/mbraun/dev/test-suite/SingleSource/Benchmarks/Misc/pi.c
$ /var/tmp/locbase/bin/clang -save-temps=obj   -O0 -g -isysroot /Volumes/XCode/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names  SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o  -o SingleSource/Benchmarks/Misc/pi  -Wl,-no_pie -lm
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1317

As far as I can tell findDebugLoc will only give you the DebugLoc of the next non-DEBUG_VALUE instruction. This here will work better if there are mutliple instructions without DebugLoc in sequence.

MatzeB marked 3 inline comments as done.Oct 4 2018, 11:23 AM
MatzeB added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1344

I was trying to avoid unnecessary refcounting operations. But I'd happy change this to DebugLoc DL as that indeed makes the code look simpler.

vsk added a comment.Oct 4 2018, 11:31 AM
In D52862#1255543, @vsk wrote:
In D52862#1255529, @vsk wrote:

The core idea and test updates look good to me. I'm curious about the size changes: are you just measuring the size of executables, or of .o's? If it's the former, on Darwin, any size change resulting from this patch might indicate bugs.

I measured executable sizes.

Ah, ok. Well, we know we have more work to do to eliminate codegen differences with/without debug info. We're planning on discussing this at the debug info BoF. (It's not a blocker for this patch.)

The actual instructions are exactly the same, at least in the couple examples I picked.

Hm, I wonder what changed, then.

I tried some flags of llvm-size or llvm-dwarfdump but could not figure out how to measure the size of the line tables directly...

You can look at the .dSYMs or .o's if you have those around:
$ size -m main.dSYM/Contents/Resources/DWARF/main | grep debug_line
Section __debug_line: 60

There are no .dSYM files around, should there be?

On macOS, typically, yes. You can check whether or not your compiler is doing this with: xcrun clang -g -x c - -o - '-###'. The last step is usually a dsymutil link.

Is it possible that your test harness is removing .dSYMs? .. ah, no, what's happening is that clang declines to add a dsymutil step when it's being used to link .o's into an executable.

Maybe it'd be easier to run measurements on the .o's?

Example compilation of 1 of the benchmarks:

$ /var/tmp/locbase/bin/clang -DNDEBUG  -save-temps=obj   -O0 -g -isysroot /Volumes/XCode/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk   -w -Werror=date-time -MD -MT SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o -MF SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o.d -o SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o   -c /Users/mbraun/dev/test-suite/SingleSource/Benchmarks/Misc/pi.c
$ /var/tmp/locbase/bin/clang -save-temps=obj   -O0 -g -isysroot /Volumes/XCode/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names  SingleSource/Benchmarks/Misc/CMakeFiles/pi.dir/pi.c.o  -o SingleSource/Benchmarks/Misc/pi  -Wl,-no_pie -lm
vsk added inline comments.Oct 4 2018, 11:33 AM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1317

Sounds good.

MatzeB marked an inline comment as done.Oct 4 2018, 11:39 AM

Is it possible that your test harness is removing .dSYMs? .. ah, no, what's happening is that clang declines to add a dsymutil step when it's being used to link .o's into an executable.

So I assume when linking .o files the debug information is still part of the executable afterwards. And running dsymutil is an optional step that a buildsystem can do to split up the dwarf data into a .dSYM...

vsk added a comment.Oct 4 2018, 11:42 AM

Is it possible that your test harness is removing .dSYMs? .. ah, no, what's happening is that clang declines to add a dsymutil step when it's being used to link .o's into an executable.

So I assume when linking .o files the debug information is still part of the executable afterwards. And running dsymutil is an optional step that a buildsystem can do to split up the dwarf data into a .dSYM...

That's how it works on most non-Apple platforms, I think. ld64 is different in that it doesn't copy any debug info into the final executable: it just embeds pointers to .o's in the mach-o header. Later, the debugger (or dsymutil) can find debug info for the program by chasing these pointers.

That explains why I couldn't find those line tables in the executables :)

So here's a comparison of the line table sections in the .o files:

Tests: 5100
Metric: size.__DWARF_debug_line

Program                                         debug_line  debug_line0 diff

External/S...PU2006/403.gcc/src/insn-opinit    9345        6081        -34.9%
External/S...ec/CPU2006/458.sjeng/src/sjeng    4829        3529        -26.9%
External/S...pec/CPU2006/458.sjeng/src/eval    4524        3353        -25.9%
External/S...spec/CINT95/124.m88ksim/src/go    1622        1208        -25.5%
MultiSourc...Files/timberwolfmc.dir/findloc    4667        3488        -25.3%
External/S...CINT2000/186.crafty/src/option    20585       15408       -25.1%
External/S...U2006/403.gcc/src/insn-attrtab    198287      148800      -25.0%
External/S...nchspec/CINT95/099.go/src/g27a    5325        4025        -24.4%
External/S...nchspec/CINT95/099.go/src/g27b    2073        1567        -24.4%
External/S...c/CINT2000/186.crafty/src/time    2405        1819        -24.4%
MultiSourc...CMakeFiles/compiler.dir/parser    7185        5443        -24.2%
External/S...ec/CPU2006/458.sjeng/src/moves    18948       14424       -23.9%
External/S...c/CINT2000/254.gap/src/scanner    8954        6840        -23.6%
External/S...enchspec/CINT95/099.go/src/g25    41351       31651       -23.5%
External/S...c/CINT2000/186.crafty/src/edit    1649        1263        -23.4%
          debug_line    debug_line0         diff
count  5100.000000    5100.000000    5100.000000
mean   7859.688627    7537.172157   -0.030294
std    17028.127781   16122.239159   0.044775
min    135.000000     136.000000    -0.349278
25%    1242.000000    1200.750000   -0.043018
50%    3508.500000    3341.000000   -0.012748
75%    9079.500000    8857.750000    0.000462
max    521041.000000  515397.000000  0.025559

So it's an average 3% reduction of the line table section ranging from -2% to 35% reduction. (Looking at the few size increases I see less .loc files in the assembly diff, so I suspect it is just an effect of the directory name being 1 character longer in the patched version...)

vsk accepted this revision.Oct 4 2018, 12:28 PM

Looks great, thanks!

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1344

I don't feel strongly about this, I'll leave it to you.

This revision is now accepted and ready to land.Oct 4 2018, 12:28 PM
MatzeB marked 2 inline comments as done.Oct 5 2018, 11:31 AM
MatzeB added inline comments.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1344

I reorganized the code a bit so I don't need to switch to a pointer for reassignment and we can keep using a const reference.

MatzeB closed this revision.Oct 5 2018, 11:32 AM
MatzeB marked an inline comment as done.

Landed in rL343874