This is an archive of the discontinued LLVM Phabricator instance.

Trust the arange accelerator tables in dSYMs
ClosedPublic

Authored by JDevlieghere on Oct 8 2019, 11:34 AM.

Details

Summary

When ingesting aranges from a dSYM it makes sense to always trust the contents of the accelerator table since it always comes from dsymutil. According to Instruments, skipping the decoding of all CU DIEs to get at the DW_AT_ranges attribute removes ~3.5 seconds from setting a breakpoint by file/line when debugging clang with a dSYM. Interestingly on the wall clock the speedup is less noticeable, but still present.

rdar://problem/56057688

Diff Detail

Event Timeline

aprantl created this revision.Oct 8 2019, 11:34 AM
jasonmolenda accepted this revision.Oct 8 2019, 1:02 PM

Nice fix. The real cost of computing the aranges manually is expanding all the line tables and calling GetContiguousFileAddressRanges on them. I wonder if avoiding the full linetable expansion here is shifting that to a different place for this operation.

This revision is now accepted and ready to land.Oct 8 2019, 1:02 PM

Nice fix. The real cost of computing the aranges manually is expanding all the line tables and calling GetContiguousFileAddressRanges on them. I wonder if avoiding the full linetable expansion here is shifting that to a different place for this operation.

No, the total user CPU time reported by Instruments goes from ~7000ms -> ~3500ms, but the wall clock from ~6s -> ~5s (both ±.5s).

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 8 2019, 2:15 PM

I had to revert this. Apparently for very small / gapless programs, clang emits single-value DW_AT_low_pc/high_pc in the compile unit, and dsymutil does not rewrite that into DW_AT_ranges. Would be a nice speedup though.

I had to revert this. Apparently for very small / gapless programs, clang emits single-value DW_AT_low_pc/high_pc in the compile unit, and dsymutil does not rewrite that into DW_AT_ranges. Would be a nice speedup though.

So is this a bug in dsymutil? It is also legal for compilers to do this. I have also seen a lot of DWARF that has issues where the DW_AT_ranges for a DW_TAG_compile_unit is not complete and is missing functions.

I wouldn't call it a bug, but if we changed dsymutil to always rewrite DW_AT_low_pc into DW_AT_ranges, we could trust the .debug_ranges section to be complete.

JDevlieghere reopened this revision.Jan 9 2023, 11:15 AM
This revision is now accepted and ready to land.Jan 9 2023, 11:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 9 2023, 11:15 AM
JDevlieghere commandeered this revision.Jan 9 2023, 11:15 AM
JDevlieghere edited reviewers, added: aprantl; removed: JDevlieghere.

Apparently for very small / gapless programs, clang emits single-value DW_AT_low_pc/high_pc in the compile unit, and dsymutil does not rewrite that into DW_AT_ranges.

Although this statement is correct, it shouldn't affect this patch. Even when the compile units uses DW_AT_low_pc/high_pc, dsymutil still emits the correct aranges. Without a counter-example we should go ahead with this patch.

I agree with landing this change.

This revision was automatically updated to reflect the committed changes.

SymbolVendorELF.cpp is using this as well:

dsym_objfile_sp->SetType(ObjectFile::eTypeDebugInfo);

They must have copied and pasted some code. So not sure this is safe enough as is to implement. There are also a few places that try to check for a dSYM file using this kind of method, might be nice to add a method, that is done right, that detects if we really have a dSYM or not in ObjectFile.h/.cpp. Wasm also sets the file type to debug info and so does breakpad.

There have also been some bugs in .debug_aranges and some folks want to get rid of .debug_aranges all together and rely only upon the DW_TAG_compile_unit's DW_AT_ranges. Tons of details in this patch:

https://reviews.llvm.org/D136395

So long story short, some people believe we should not produce or use .debug_aranges anymore, and want LLDB changed to ignore it.

SymbolVendorELF.cpp is using this as well:

dsym_objfile_sp->SetType(ObjectFile::eTypeDebugInfo);

They must have copied and pasted some code. So not sure this is safe enough as is to implement. There are also a few places that try to check for a dSYM file using this kind of method, might be nice to add a method, that is done right, that detects if we really have a dSYM or not in ObjectFile.h/.cpp. Wasm also sets the file type to debug info and so does breakpad.

The best way to check for dSYM would be to read the bytes from the mach-o header that specify MH_DSYM from the data for the object file.

There have also been some bugs in .debug_aranges and some folks want to get rid of .debug_aranges all together and rely only upon the DW_TAG_compile_unit's DW_AT_ranges. Tons of details in this patch:

https://reviews.llvm.org/D136395

So long story short, some people believe we should not produce or use .debug_aranges anymore, and want LLDB changed to ignore it.

I'm honestly curious how the DW_TAG_compile_unit's DW_AT_ranges (or DW_AT_high/low_pc) would be authoritative, but the ranges for the CU in debug_aranges would be incorrect. If the linker is allowed to reorder functions (does this not happen on linux?), then the compiler can't know the address ranges of the CU, only the linker (or post-linker) can know what address ranges are owned by this compile unit. So the linker had to create the DW_TAG_compile_unit's DW_AT_ranges list after linking is complete, and the linker is also the part of the toolchain that creates debug_aranges. If it is creating an incorrect debug_aranges entry, how can it possibly do the correct thing when rewriting the DW_TAG_compile_unit's DW_AT_ranges?

On Darwin, we usually get a DW_AT_high/low_pc on the compile_unit in the .o files, where all of the functions are contiguous. When dsymutil links the debug information into the dSYM, it includes the address ranges of every DW_TAG_subprogram that was included in the final executable in the debug_aranges. It doesn't modify DW_AT_high/low_pc in the compile_unit, it merely updates the addresses to their linked addresses. But lldb doesn't use those for anything - either it trusts the debug_aranges entries, or lldb scans all of the subprograms itself to construct the aranges (same thing dsymutil did).

tl;dr I don't see what using DW_AT_ranges in the compile_unit accomplishes versus debug_aranges.

There have also been some bugs in .debug_aranges and some folks want to get rid of .debug_aranges all together and rely only upon the DW_TAG_compile_unit's DW_AT_ranges. Tons of details in this patch:

https://reviews.llvm.org/D136395

So long story short, some people believe we should not produce or use .debug_aranges anymore, and want LLDB changed to ignore it.

I'm honestly curious how the DW_TAG_compile_unit's DW_AT_ranges (or DW_AT_high/low_pc) would be authoritative, but the ranges for the CU in debug_aranges would be incorrect. If the linker is allowed to reorder functions (does this not happen on linux?), then the compiler can't know the address ranges of the CU, only the linker (or post-linker) can know what address ranges are owned by this compile unit. So the linker had to create the DW_TAG_compile_unit's DW_AT_ranges list after linking is complete, and the linker is also the part of the toolchain that creates debug_aranges. If it is creating an incorrect debug_aranges entry, how can it possibly do the correct thing when rewriting the DW_TAG_compile_unit's DW_AT_ranges?

They are all the same from the linker perspective: both .debug_aranges (for accelerator table) and .debug_ranges (for DW_AT_ranges) have relocations on them. As linkers move stuff around they apply the relocations and they all stay up to date. Anything without a relocation has its start address zeroed out.

On Darwin, we usually get a DW_AT_high/low_pc on the compile_unit in the .o files, where all of the functions are contiguous. When dsymutil links the debug information into the dSYM, it includes the address ranges of every DW_TAG_subprogram that was included in the final executable in the debug_aranges. It doesn't modify DW_AT_high/low_pc in the compile_unit, it merely updates the addresses to their linked addresses. But lldb doesn't use those for anything - either it trusts the debug_aranges entries, or lldb scans all of the subprograms itself to construct the aranges (same thing dsymutil did).

dsymutil generates correct .debug_aranges by looking at the final DWARF. Note that there are no .debug_aranges sections in .o files since we don't use them, they only get produced if we have a dSYM for darwin.

tl;dr I don't see what using DW_AT_ranges in the compile_unit accomplishes versus debug_aranges.

Different things are included in DW_AT_ranges, like address ranges for global and static variables. .debug_aranges only has functions, no globals or statics, so if you are trying to find a global variable by address, you can't rely on .debug_aranges. Nothing in the DWARF spec states things clearly enough for different compilers to know what to include in .debug_aranges and the compiler uint DW_AT_ranges. Each compiler (GCC and clang) do their own thing and often do things differently. So the other way to add functionality like this would be to check the target triple and check for Apple as the vendor maybe. But we still need know if we have a dSYM file or not, because if not, we can't use .debug_aranges as .o files don't have them, but they also don't claim to be of type ObjectFile::eTypeDebugInfo.

But we still need know if we have a dSYM file or not, because if not, we can't use .debug_aranges as .o files don't have them, but they also don't claim to be of type ObjectFile::eTypeDebugInfo.

I came to the same realization and fixed that in https://reviews.llvm.org/rG9b737f148d88501a0a778e1adacf342108286bb0

But we still need know if we have a dSYM file or not, because if not, we can't use .debug_aranges as .o files don't have them, but they also don't claim to be of type ObjectFile::eTypeDebugInfo.

I came to the same realization and fixed that in https://reviews.llvm.org/rG9b737f148d88501a0a778e1adacf342108286bb0

nice! approved that one as the detection is solid.

I know this is all moot because the dSYM-specific patch landed, but I am curious about this part,

Different things are included in DW_AT_ranges, like address ranges for global and static variables. .debug_aranges only has functions, no globals or statics, so if you are trying to find a global variable by address, you can't rely on .debug_aranges. Nothing in the DWARF spec states things clearly enough for different compilers to know what to include in .debug_aranges and the compiler uint DW_AT_ranges.

The standard says,

"Each descriptor is a triple consisting of a segment selector, the beginning address within that segment of a range of text or data covered by some entry owned by the corresponding compilation unit, followed by the non-zero length of that range"

It is pretty clear on the point that any part of the address space that can be attributed to a compile_unit can be included in the debug_aranges range list - if only code is included, that's a choice of the aranges writer. lldb itself, if debug_aranges is missing or doesn't include a CU, steps through the line table concatenating addresses for all the line entries in the CU (v. DWARFCompileUnit::BuildAddressRangeTable ) - it doesn't include data. (it will also use DW_AT_ranges from the compile_unit but I worry more about this being copied verbatim from the .o file into the linked DWARF than I worry about debug_aranges, personally)

In a DW_TAG_compile_unit, the DW_AT_ranges that the compiler puts in the .o file isn't relevant to the final linked debug information, unless it included the discrete range of every item which might be linked into the final binary, and the DWARF linker only copies the range entries that were in the final binary in the linked dwarf DW_AT_ranges range list. Or if the dwarf linker knows how to scan the DWARF compile_unit like lldb does, concatenating line table entries or DW_TAG_subprogram's.

If any dwarf linker is doing all of this work to construct an accurate & comprehensive compile_unit DW_AT_ranges, why couldn't it be trusted to also have an identical/accurate dwarf_aranges for this cu? I don't see the difference.

I guess to say it shorter. If I have a dwarf_aranges, that means the dwarf linker created it. And if it created it, surely its at least based off of the subprogram address ranges or the line table -- that is, the text address ranges. If I have a DW_TAG_compile_unit DW_AT_ranges, either the compiler (to the .o file) created it, in which case I really am suspicious of those ranges because the compiler can't know which symbols will end up in the final executable, and the addresses in the ranges were simply translated to the final executable address equivalents. Or it was rewritten by a dwarf linker that parsed the DWARF and knew how to correctly calculate the addresses that correspond to that compile unit.

if anything, I would trust a dwarf_aranges entry for a CU before I would trust the CU's DW_AT_ranges list. Both have to be written by the dwarf linker to be correct, but only the former is written ONLY by the dwarf linker.

labath added a subscriber: labath.Jan 12 2023, 4:33 AM

Both have to be written by the dwarf linker to be correct, but only the former is written ONLY by the dwarf linker.

I don't think that's right:

$ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - --sections | grep aranges
  [ 6] .debug_aranges    PROGBITS        0000000000000000 0000a0 000030 00      0   0  1
  [ 7] .rela.debug_aranges RELA          0000000000000000 000380 000030 18   I 20   6  8

Both have to be written by the dwarf linker to be correct, but only the former is written ONLY by the dwarf linker.

I don't think that's right:

$ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - --sections | grep aranges
  [ 6] .debug_aranges    PROGBITS        0000000000000000 0000a0 000030 00      0   0  1
  [ 7] .rela.debug_aranges RELA          0000000000000000 000380 000030 18   I 20   6  8

Ah thanks Pavel, clang on Darwin doesn't emit this in .o files. So in this case, the .o file has a DW_TAG_compile_unit with a DW_AT_ranges and a .debug_aranges, both generated pre-link-time by the compiler. And Greg is saying that the DW_AT_ranges list in the final executable will be better than the .debug_aranges? I don't know how strongly he asserts this - today if a .debug_aranges has an entry, we don't look at DW_TAG_compile_unit's DW_AT_ranges, or create the ranges ourself via the line table; the debug_aranges is trusted above the other methods if it includes CU.

Both have to be written by the dwarf linker to be correct, but only the former is written ONLY by the dwarf linker.

I don't think that's right:

$ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - --sections | grep aranges
  [ 6] .debug_aranges    PROGBITS        0000000000000000 0000a0 000030 00      0   0  1
  [ 7] .rela.debug_aranges RELA          0000000000000000 000380 000030 18   I 20   6  8

Ah thanks Pavel, clang on Darwin doesn't emit this in .o files.

It does. It just doesn't do it by default -- same as on linux (but not on PS4 I believe). You have to use the -gdwarf-aranges flag explicitly (which, quite possibly, noone does),

So in this case, the .o file has a DW_TAG_compile_unit with a DW_AT_ranges and a .debug_aranges, both generated pre-link-time by the compiler.

Yes.

And Greg is saying that the DW_AT_ranges list in the final executable will be better than the .debug_aranges? I don't know how strongly he asserts this - today if a .debug_aranges has an entry, we don't look at DW_TAG_compile_unit's DW_AT_ranges, or create the ranges ourself via the line table; the debug_aranges is trusted above the other methods if it includes CU.

That I don't know. Personally, I would expect the two to be equivalent, except that the debug_aranges could theoretically be faster to parse (and takes up more space).

I know this is all moot because the dSYM-specific patch landed, but I am curious about this part,

Different things are included in DW_AT_ranges, like address ranges for global and static variables. .debug_aranges only has functions, no globals or statics, so if you are trying to find a global variable by address, you can't rely on .debug_aranges. Nothing in the DWARF spec states things clearly enough for different compilers to know what to include in .debug_aranges and the compiler uint DW_AT_ranges.

The standard says,

"Each descriptor is a triple consisting of a segment selector, the beginning address within that segment of a range of text or data covered by some entry owned by the corresponding compilation unit, followed by the non-zero length of that range"

It is pretty clear on the point that any part of the address space that can be attributed to a compile_unit can be included in the debug_aranges range list - if only code is included, that's a choice of the aranges writer. lldb itself, if debug_aranges is missing or doesn't include a CU, steps through the line table concatenating addresses for all the line entries in the CU (v. DWARFCompileUnit::BuildAddressRangeTable ) - it doesn't include data. (it will also use DW_AT_ranges from the compile_unit but I worry more about this being copied verbatim from the .o file into the linked DWARF than I worry about debug_aranges, personally)

Sorry, looks like the spec is pretty clear. From what I remember David Blakie saying, clang won't put globals or static (no data, just code) into the .debug_aranges. GCC might do this differently. I thought that LLDB would try to use the following in order:

  • .debug_aranges for a CU if available
  • fall back to DW_AT_ranges
  • fall back to line tables only if DW_AT_ranges is not there

In a DW_TAG_compile_unit, the DW_AT_ranges that the compiler puts in the .o file isn't relevant to the final linked debug information, unless it included the discrete range of every item which might be linked into the final binary, and the DWARF linker only copies the range entries that were in the final binary in the linked dwarf DW_AT_ranges range list. Or if the dwarf linker knows how to scan the DWARF compile_unit like lldb does, concatenating line table entries or DW_TAG_subprogram's.

The Darwin workflow has a smart DWARF linker, so yes, it will regenerate only what is needed for the final output in dsymutil. The compiler have been trained to not emit .debug_aranges for Darwin triples because we know we don't need them since dsymutil will regenerate from the final output only what is needed.

All other platforms use standard linker technology where they will concatenate all DWARF sections to create the final DWARF sections and then apply relocations. So if you have a DW_AT_ranges in a CU that has 100 addresses in a .o file, and the final output only uses 10 functions and dead strips 90 functions, the final .debug_aranges for that CU will still contain ranges for all functions, 10 of which will have had relocations applied and will have valid address ranges, and 90 others will start at the sentinel addresss of zero or -1 to indicate they should have been dead stripped, but there can be tons of wasted data that isn't needed in each .debug_aranges CU data.

If any dwarf linker is doing all of this work to construct an accurate & comprehensive compile_unit DW_AT_ranges, why couldn't it be trusted to also have an identical/accurate dwarf_aranges for this cu? I don't see the difference.

Only dsymutil does this as this is the only smart DWARF linker at the moment! See above comment as to why. We do have llvm-dwarfutil now for ELF files that can optimize DWARF after it has been linked to remove all of this extra junk and unique types. llvm-dwarfutil uses the same DWARF linking engine as dsymutil. But all other systems are concatenate all sections, then apply relocations, and anything that doesn't have a relocation gets a sentinel address applied as the relocation.

I guess to say it shorter. If I have a dwarf_aranges, that means the dwarf linker created it. And if it created it, surely its at least based off of the subprogram address ranges or the line table -- that is, the text address ranges.

True for Darwin and dsymutil yes.

If I have a DW_TAG_compile_unit DW_AT_ranges, either the compiler (to the .o file) created it, in which case I really am suspicious of those ranges because the compiler can't know which symbols will end up in the final executable, and the addresses in the ranges were simply translated to the final executable address equivalents. Or it was rewritten by a dwarf linker that parsed the DWARF and knew how to correctly calculate the addresses that correspond to that compile unit.

This still works in all cases as relocations will be applied to each range. If we are doing DWARF in .o files, then we know to check each range to see if it ended up in the debug map or not and we throw out the ones that weren't.

if anything, I would trust a dwarf_aranges entry for a CU before I would trust the CU's DW_AT_ranges list. Both have to be written by the dwarf linker to be correct, but only the former is written ONLY by the dwarf linker.

It would be great if the compilers actually followed the DWARF spec, but alas they don't do this well for .debug_aranges. I know clang and GCC do things differently for these sections.

Basically the contents of the .debug_aranges and the DW_AT_ranges can and should be the same. It is kind of a waste of space. It would be nice if the .debug_aranges could be modified to say "use this offset in the .debug_ranges section" so it would re-use the data and allow for quicker access when doing address lookups. I don't like having to parse ANY DWARF DIEs, like the CU DIE, just to get to the DW_AT_ranges, so it would be nice to be able to access this from .debug_aranges but be able to share the data between the two references. I can't remember how a DW_TAG_compile_unit defines what should be included in the DW_AT_ranges... I am not sure if the DWARF spec says that data should be included in these ranges or not.

Both have to be written by the dwarf linker to be correct, but only the former is written ONLY by the dwarf linker.

I don't think that's right:

$ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - --sections | grep aranges
  [ 6] .debug_aranges    PROGBITS        0000000000000000 0000a0 000030 00      0   0  1
  [ 7] .rela.debug_aranges RELA          0000000000000000 000380 000030 18   I 20   6  8

Ah thanks Pavel, clang on Darwin doesn't emit this in .o files. So in this case, the .o file has a DW_TAG_compile_unit with a DW_AT_ranges and a .debug_aranges, both generated pre-link-time by the compiler. And Greg is saying that the DW_AT_ranges list in the final executable will be better than the .debug_aranges? I don't know how strongly he asserts this - today if a .debug_aranges has an entry, we don't look at DW_TAG_compile_unit's DW_AT_ranges, or create the ranges ourself via the line table; the debug_aranges is trusted above the other methods if it includes CU.

It isn't better or worse and it varies per compiler. I believe, but I don't remember for sure, that DW_AT_ranges only currently has code address ranges (no data, so no globals or statics). So if a compiler in fact does include data in the .debug_aranges, it would be better to use .debug_aranges.

I tried to add a verifier into llvm-dwarfdump to catch when either .debug_aranges or DW_AT_ranges on the CU didn't contain some address ranges that were in the CU with:

https://reviews.llvm.org/D136395

The problem is the debugger has issues if either of these tables are incorrect, so you would think a verifier would be a good thing since if the address isn't in either .debug_aranges or DW_AT_ranges, we won't ever find a DIE for a function or global variable. I had seen bugs where we have .debug_aranges that didn't include all ranges for all DW_TAG_subprogram DIEs, and also cases where DW_AT_ranges didn't have some ranges. This usually happens when someone tries to LTO or optimize a binary and the DWARF didn't have a dedicate entry for a function, so it would drop address ranges from the .debug_aranges or DW_AT_ranges. Since everything non Darwin uses "concatenate + relocate" in the linker, image if you have a .debug_aranges entry that had [0x1000-0x2000), and you had DWARF with two functions that fall into those ranges: [0x1000-0x1500) for "func1" and [0x1500-0x2000) for "func2", and those functions get optimized into different places in the binary (not right next to each other), then there is no valid relocation to can apply to the range [0x1000-0x2000) that will make the table continue to work after linking. You will end up relocating the 0x1000 address correctly for "func1", and if you apply the relocation for the end address of 0x2000, you can end up setting it to the end address of 0x2000, which could be less that the new address for the start of "func1". So lots of stuff goes wrong in the linking phase, especially when LTO is involved.