This is an archive of the discontinued LLVM Phabricator instance.

Add the ability to verify the .debug_aranges section.
Needs Review · Public

Authored by clayborg on Oct 20 2022, 4:56 PM.

Details

Summary

.debug_aranges is a DWARF index that maps addresses back to compile units. This index should contain a full list of address ranges for the compile unit. This fix verifies:

  • If the DW_TAG_compile_unit has a DW_AT_ranges or a DW_AT_low_pc/DW_AT_high_pc:
    • all addresses from a DW_TAG_compile_unit's DW_AT_ranges are contained in the .debug_aranges index
    • all address ranges from the .debug_aranges for a compile unit are contained in the DW_AT_ranges of the DW_TAG_compile_unit
  • If the DW_TAG_compile_unit doesn't have a DW_AT_ranges or a DW_AT_low_pc/DW_AT_high_pc:
    • verify all ranges from any DW_TAG_subprogram DIEs are all contained in the .debug_aranges.
  • Run through all .debug_line line table entries from each compile unit and make sure that the .debug_aranges contains ranges for any row addresses and also verify if the CU DW_AT_ranges/low/high pc or subprogram ranges contain the address.
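The checks above all reduce to asking whether one address range is covered by a sorted list of ranges. A minimal sketch of that containment test follows; the Range struct and the containsRange name are hypothetical and are not taken from the patch, and it assumes the list is sorted and non-overlapping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [Low, High). Hypothetical type for illustration.
struct Range {
  uint64_t Low;
  uint64_t High;
};

// Returns true if R is fully covered by a single entry of Sorted.
// Assumes Sorted is ordered by Low and its entries do not overlap.
bool containsRange(const std::vector<Range> &Sorted, const Range &R) {
  assert(std::is_sorted(
      Sorted.begin(), Sorted.end(),
      [](const Range &A, const Range &B) { return A.Low < B.Low; }));
  // First entry that starts after R.Low; the candidate is the one before it.
  auto It = std::upper_bound(
      Sorted.begin(), Sorted.end(), R.Low,
      [](uint64_t Addr, const Range &X) { return Addr < X.Low; });
  if (It == Sorted.begin())
    return false;
  --It;
  return It->Low <= R.Low && R.High <= It->High;
}
```

A line table row address A can be checked the same way by passing the one-byte range [A, A + 1).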

Below are example errors.

When a compile unit has a DW_AT_ranges or DW_AT_low_pc/high_pc that isn't in the .debug_aranges data for that compile unit:

error: .debug_aranges[0x00000000] compile unit range [0x0000000000001000 - 0x0000000000002000) not in .debug_aranges.

When the .debug_aranges has a range that isn't in the compile unit's DW_AT_ranges or DW_AT_low_pc/high_pc:

error: .debug_aranges[0x00000000][1] range [0x0000000000010000 - 0x0000000000011000) not in compile unit @ 0x00000000 ranges.

When the .debug_aranges has a range that isn't in any of the compile unit's DW_TAG_subprogram's ranges and the compile unit has no DW_AT_ranges or DW_AT_low_pc/high_pc:

error: .debug_aranges[0x00000000][1] range [0x0000000000010000 - 0x0000000000011000) not in compile unit @ 0x00000000 subprogram ranges.

When we have line table rows that have addresses that don't exist in the .debug_aranges and/or the compile unit's ranges:

error: .debug_aranges[0x00000000] compile unit @ 0x00000000 line table sequence [0-2) has row[1] with address 0x0000000000002010 that is not in .debug_aranges nor in compile unit subprogram ranges.
error: .debug_aranges[0x00000000] compile unit @ 0x00000000 line table sequence [0-2) has row[1] with address 0x0000000000002010 that is not in .debug_aranges nor in compile unit ranges.
error: .debug_aranges[0x00000000] compile unit @ 0x00000000 line table sequence [0-2) has row[1] with address 0x0000000000002010 that is not in .debug_aranges.

Event Timeline

clayborg created this revision. Oct 20 2022, 4:56 PM
clayborg requested review of this revision. Oct 20 2022, 4:56 PM
Herald added a project: Restricted Project. Oct 20 2022, 4:56 PM

If possible, I'd rather not add this - I think .debug_aranges should be removed (it's already been off-by-default for a decade in Clang) in favor of using CU-level address ranges. They're cheap-enough to parse that it doesn't substantially change the performance of tools so far as I'm aware and they save space by not duplicating the address range information in two places.

Adding a verifier feels like endorsing/encouraging/maintaining .debug_aranges which seems like the wrong direction we should be going.

Can you elaborate on why they should be removed? Is it because of the aforementioned duplication of information, or are there other reasons also?

Yeah, basically only that - they're redundant, and maintaining different paths is a burden (verifying them, fixing bugs in them, having tools that either consume one or the other or both, etc) - having a single way to represent things would be better for the DWARF ecosystem of consumers and producers.

(also aranges haven't been updated to benefit from the more compact/fewer-relocation-using encoding of .debug_rnglists introduced in DWARFv5 - so it's bigger/less efficient now as well)

I see. That makes sense, but unfortunately that option still exists and is being used. We hit an issue where there is inconsistency that I am looking into right now on llvm side. Thus this patch on verify side. I am also trying to follow up internally if we can just stop using this section.

We (Sony) also have some tooling that relies on .debug_aranges; while I sympathize with wanting to get rid of it, and I've filed a ticket to update the relevant tooling, we're not there yet.
Actually getting rid of .debug_aranges would probably deserve an RFC to raise the visibility, because there are clearly tools in various odd places that expect it to be there. But if that process is not imminent, I'm not opposed to introducing verification to make sure we get it right.

I see. That makes sense, but unfortunately that option still exists and is being used. We hit an issue where there is inconsistency that I am looking into right now on llvm side.

Could you provide more detail about this inconsistency?

(one issue I can say is that if you're comparing GCC and Clang .debug_aranges you'll see differences - Clang includes global variable addresses in the table and GCC does not)

Thus this patch on verify side.

Has it discovered anything interesting so far during development?

I am also trying to follow up internally if we can just stop using this section.

Thanks!

We (Sony) also have some tooling that relies on .debug_aranges; while I sympathize with wanting to get rid of it, and I've filed a ticket to update the relevant tooling, we're not there yet.
Actually getting rid of .debug_aranges would probably deserve an RFC to raise the visibility, because there are clearly tools in various odd places that expect it to be there. But if that process is not imminent, I'm not opposed to introducing verification to make sure we get it right.

Yeah, this review isn't getting rid of .debug_aranges and I agree, when we get to that point an RFC would be suitable.

For now I'm just of the opinion that the feature may not be worth certain amounts of work, code review, design, etc, if we can help it, but I'm happy to discuss what sort of value this might provide.

Basically what's in this patch.
error: .debug_aranges[0x0074c110][0] range [0x00000000036d0b0c - 0x00000000036d0b0d) not in compile unit @ 0x1447119f ranges.
error: .debug_aranges[0x00025d20] compile unit @ 0x002e2f30 line table sequence [5411-5417) has row[5416] with address 0x0000000000d58265 that is not in .debug_aranges nor in compile unit ranges.

I would be happy to see .debug_aranges go away, but we have toolchains that produce it incorrectly and it causes bugs in symbolizers and in debuggers if the debuggers try to use this section. Having a verifier to tell us when there are issues can also help prove that we should get rid of this section, so I believe this is actually a good reason that it should be included so we can say "look at this build, it has X number of errors in the .debug_aranges section, so this is proof we should remove it".

So we are seeing this issue with clang built binaries where we have some ranges missing in .debug_aranges, and also some ranges that are missing in the DW_AT_ranges of the DW_TAG_compile_unit. So this is also a good way to check that the DW_AT_ranges of the compile unit are ok. To make matters worse, we also are seeing some line table entries that are not in either the .debug_aranges _or_ the DW_AT_ranges of the DW_TAG_compile_unit. That is really bad.

Having a complete DWARF verifier is my goal for this tool as there is no tool provided by the DWARF group that does any verification. We have compilers, linkers and post production tools that modify and emit DWARF, and often the DWARF is in really bad shape, but it gets shipped and then all of the tools that try to consume this DWARF are left trying to do their best with DWARF that is anywhere from perfect to really bad.

If possible, I'd rather not add this - I think .debug_aranges should be removed (it's already been off-by-default for a decade in Clang) in favor of using CU-level address ranges. They're cheap-enough to parse that it doesn't substantially change the performance of tools so far as I'm aware and they save space by not duplicating the address range information in two places.

Adding a verifier feels like endorsing/encouraging/maintaining .debug_aranges which seems like the wrong direction we should be going.

I believe the opposite in that if we can prove tools are having a tough time producing these accelerator tables correctly, it can be a reason to vote for removal of this section. I agree this section is not needed if the compile unit has a DW_AT_ranges attribute and would be happy to see this section go away.

But this section is part of the DWARF specification and I believe we should be able to verify it.

We (Sony) also have some tooling that relies on .debug_aranges; while I sympathize with wanting to get rid of it, and I've filed a ticket to update the relevant tooling, we're not there yet.
Actually getting rid of .debug_aranges would probably deserve an RFC to raise the visibility, because there are clearly tools in various odd places that expect it to be there. But if that process is not imminent, I'm not opposed to introducing verification to make sure we get it right.

I agree here. We have some internal tools that rely on this section.

LLDB currently will use it if it is available, then it falls back to the DW_AT_ranges or DW_AT_low_pc/DW_AT_high_pc of the compile unit if any of the attributes are present, and falls back to looking for ranges manually if neither are there.

After seeing issues with the .debug_aranges with modern clang builds, I am tempted to stop using .debug_aranges at all in LLDB and might make a patch for this.

Basically what's in this patch.
error: .debug_aranges[0x0074c110][0] range [0x00000000036d0b0c - 0x00000000036d0b0d) not in compile unit @ 0x1447119f ranges.
error: .debug_aranges[0x00025d20] compile unit @ 0x002e2f30 line table sequence [5411-5417) has row[5416] with address 0x0000000000d58265 that is not in .debug_aranges nor in compile unit ranges.

Oh, sorry, I meant an example I can build with clang/llvm to reproduce to see what might be going on?

I'd like to see at least one/some examples before we add this verifier check, to get a sense of what sort of things you're seeing/what the most suitable tooling would be to deal with/investigate them.

So we are seeing this issue with clang built binaries where we have some ranges missing in .debug_aranges, and also some ranges that are missing in the DW_AT_ranges of the DW_TAG_compile_unit.

(similarly, I'd like to see examples so we have something more concrete to consider what this tooling is for/about)

llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
971–978

I think we already have a function for this in libDebugInfoDWARF - it's used by the symbolizer to drop tombstoned addresses earlier on/at a lower level probably? (& probably doesn't handle the zero case, since that can be ambiguous)

-2 should only be used in .debug_loc (& theoretically in .debug_ranges but I don't think binutils ld uses it there, strangely) - so we probably shouldn't include it here?

991

This is to account for overflow?

1043–1046

Pretty sure this won't be true for clang/llvm's debug info - LLVM includes global variable addresses in aranges, but not in CU ranges. We've had various discussions about whether this is correct/useful, and so far people seem to think it is correct and maybe useful? I don't really know. GCC doesn't put global variable addresses in either aranges or cu ranges.

(I'm pretty sure it's technically correct, but maybe not all that useful - at least if GCC doesn't do it (pretty sure it's generally correct that CU ranges shouldn't include global variable addresses, at least - so just a question of what should/shouldn't be in aranges))

1078–1088

Rather than scanning in both directions (does everything in aranges appear in CU ranges, then does everything in CU ranges appear in aranges) - maybe sort both and do a single scan through & just flag the first place they're inconsistent? (or, if accounting for the "aranges has extra stuff for global variables", skip over those entries if you can until you pick up in the CU ranges - if you get to the end without having visited all the CU ranges then there's something wrong at least)

1093–1130

Do we already have something like this for CU ranges? Could we reuse it?

Ah sorry. Sure. I'll post something once I figure out what is happening. Was juggling a couple of things so kind of slow progress on this. :)

clayborg added inline comments. Oct 24 2022, 5:06 PM
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
991

And to catch ranges that have been zeroed out or set to the same address to effectively dead strip the function. So this is mostly to catch LowPC == HighPC, but it can catch overflow issues as well.
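As a rough illustration of that check (the helper name is hypothetical and not from the patch), a single comparison covers both the collapsed case and the inverted or overflowed case:

```cpp
#include <cstdint>

// Returns true when [LowPC, HighPC) should be ignored by the verifier:
// either empty (LowPC == HighPC, e.g. a dead-stripped function collapsed
// to one address) or inverted (LowPC > HighPC, e.g. after overflow).
constexpr bool isEmptyOrInvertedRange(uint64_t LowPC, uint64_t HighPC) {
  return LowPC >= HighPC;
}

static_assert(isEmptyOrInvertedRange(0, 0), "zeroed-out range is skipped");
static_assert(isEmptyOrInvertedRange(0x2000, 0x1000), "inverted range is skipped");
static_assert(!isEmptyOrInvertedRange(0x1000, 0x2000), "normal range is kept");
```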

1043–1046

Pretty sure this won't be true for clang/llvm's debug info - LLVM includes global variable addresses in aranges, but not in CU ranges. We've had various discussions about whether this is correct/useful, and so far people seem to think it is correct and maybe useful? I don't really know. GCC doesn't put global variable addresses in either aranges or cu ranges.

Interesting. We should be including all address ranges for functions and globals in both .debug_aranges and the DW_AT_ranges for the compile unit, or not for either. Otherwise we are making a case for .debug_aranges to continue to exist.

(I'm pretty sure it's technically correct, but maybe not all that useful - at least if GCC doesn't do it (pretty sure it's generally correct that CU ranges shouldn't include global variable addresses, at least - so just a question of what should/shouldn't be in aranges))

I would vote to include globals in both, but with the hopes to move to DW_AT_ranges and getting rid of .debug_aranges. Since it hasn't been clarified, maybe we can leave this in for now to see if we can detect this issue with GCC? No one is using llvm-dwarfdump's verify result as something that stops a compilation or build that I am aware of, so it would be good to see this inconsistency IMHO.

1078–1088

Rather than scanning in both directions (does everything in aranges appear in CU ranges, then does everything in CU ranges appear in aranges) - maybe sort both and do a single scan through & just flag the first place they're inconsistent?

I would personally prefer to see all ranges that are missing to indicate how many times we have issues in the .debug_aranges. If we just flag the first inconsistency then we only know that there was at least one error. The reason I like seeing all of these issues is when things fail in a debugger, like "I was looking up address 0x123456 in this binary and it failed", it is nice to see an error in the "llvm-dwarfdump --verify" output that lists the range as having an error so we know this can be the reason it failed. Similar issues with anyone doing symbolication would exist as well.

(or, if accounting for the "aranges has extra stuff for global variables", skip over those entries if you can until you pick up in the CU ranges - if you get to the end without having visited all the CU ranges then there's something wrong at least)

I am thinking of solutions for the global variable problem. One idea is to try and extract all ranges for any data sections (read only and read + write), and then if we don't find a .debug_aranges range in the DW_AT_ranges, then we check if that address is in the DataRanges and if so, don't produce an error? Do we want to warn?
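A minimal sketch of that idea, under the assumption that the data section ranges have already been collected into a list; the Range, Verdict and classifyArangesEntry names are hypothetical, and whether DataOnly becomes a warning or is silently accepted is left open, as in the question above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [Low, High). Hypothetical type for illustration.
struct Range {
  uint64_t Low;
  uint64_t High;
};

enum class Verdict { InCURanges, DataOnly, Missing };

// True if some entry of Ranges fully covers R (linear scan for brevity).
static bool coveredBy(const std::vector<Range> &Ranges, const Range &R) {
  return std::any_of(Ranges.begin(), Ranges.end(), [&](const Range &X) {
    return X.Low <= R.Low && R.High <= X.High;
  });
}

// Classify one .debug_aranges entry: covered by the CU ranges, only inside
// a data section (likely a global variable), or not accounted for at all.
Verdict classifyArangesEntry(const Range &Entry,
                             const std::vector<Range> &CURanges,
                             const std::vector<Range> &DataRanges) {
  assert(Entry.Low < Entry.High && "expects a non-empty range");
  if (coveredBy(CURanges, Entry))
    return Verdict::InCURanges;
  if (coveredBy(DataRanges, Entry))
    return Verdict::DataOnly;
  return Verdict::Missing;
}
```

Only the Missing verdict would need to be reported as an error in this scheme.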

I am not opposed to seeing the full inconsistency though as if something is in .debug_aranges, it should really also be in DW_AT_ranges IMHO. Do you agree? If we already have something that differs where global vars are in .debug_aranges but not in DW_AT_ranges, this should be fixed somehow if we want to get rid of .debug_aranges.

1093–1130

We don't currently. It would be easy to move this elsewhere so we can use it when checking .debug_line sections where that code would verify against either the DW_TAG_compile_unit's DW_AT_ranges, or in this case maybe it would only check against the .debug_aranges ranges. The hard part there is determining which sequences start at an invalid address, as most dead stripping will just zero out the start address of a function. So we end up with tons of overlapping address ranges at zero all the time and this info really shouldn't be checked.

dblaikie added inline comments. Oct 25 2022, 2:45 PM
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
1043–1046

We should be including all address ranges for functions and globals in both .debug_aranges and the DW_AT_ranges for the compile unit, or not for either.

My reading of the spec is that they're different (aranges includes code and data, ranges only includes code) - see below for spec quotations.

Otherwise we are making a case for .debug_aranges to continue to exist.

Owing to the practical implementation differences/de facto standard GCC provides, I don't think we need to worry about that - the majority of tooling won't be depending on global variables being in aranges.

If anyone knows of anything that depends on aranges that can't be reimplemented with similar performance using CU ranges (so, especially, anything that depends on the data addresses being in aranges - which only clang does, and it doesn't enable aranges by default anyway...) I'd love to hear about it.

I would vote to include globals in both, but with the hopes to move to DW_AT_ranges and getting rid of .debug_aranges. Since it hasn't been clarified, maybe we can leave this in for now to see if we can detect this issue with GCC? No one is using llvm-dwarfdump's verify result as something that stops a compilation or build that I am aware of, so it would be good to see this inconsistency IMHO.

It won't detect an issue in GCC - more likely an issue in clang where there will be entries in .debug_aranges that aren't in CU ranges.

(but yeah, if you looked further, and believe that CU ranges should include global variable addresses - then verifying that relationship (unrelated to aranges) would show another issue with Clang, and /then/ if you verified aranges against raw debug_info addresses (skipping CU ranges) you'd see the GCC issue, maybe)

The DWARF spec says that aranges entries contain "the beginning address ... of text or data covered by some entry owned by the corresponding CU" which supports the idea of having variable data in aranges. But the DWARF spec says that DW_AT_low_pc/high_pc are about describing "an entity that has a machine code address or range of machine code addresses".

So I believe Clang is technically correct, and aranges should/can contain things that aren't in CU ranges (& cu ranges shouldn't include addresses for global data).

But I don't think a consumer can reasonably rely on this guarantee due to GCC's different choice of what it puts in aranges.
test.cpp:

extern int x;
int x = 3;
void f1() { }
$ g++-12 -g -c test.cpp && llvm-dwarfdump-tot test.o -debug-aranges -debug-info
DW_TAG_compile_unit
...
  DW_AT_low_pc      (0x0000000000000000)
  DW_AT_high_pc     (0x0000000000000007)
...
  DW_TAG_variable
    DW_AT_location  (DW_OP_addrx 0x0)
  DW_TAG_subprogram
    DW_AT_low_pc    (0x0000000000000000)
    DW_AT_high_pc   (0x0000000000000006)
.debug_aranges contents:
Address Range Header: length = 0x0000002c, format = DWARF32, version = 0x0002, cu_offset = 0x00000000, addr_size = 0x08, seg_size = 0x00
[0x0000000000000000, 0x0000000000000007)
Whereas Clang's aranges:
$ clang++-tot -gdwarf-aranges -g -c test.cpp && llvm-dwarfdump-tot test.o -debug-aranges 
...
.debug_aranges contents:
Address Range Header: length = 0x0000003c, format = DWARF32, version = 0x0002, cu_offset = 0x00000000, addr_size = 0x08, seg_size = 0x00
[0x0000000000000000, 0x0000000000000004)
[0x0000000000000000, 0x0000000000000006)

(the .debug_info/CU is roughly the same as for GCC - nothing interesting to show there)

1078–1088

I would personally prefer to see all ranges that are missing to indicate how many times we have issues in the .debug_aranges.

Fair enough - could probably still be achieved in a single scan if both lists are sorted first? Do it like a merge deduplication - walk each list so long as its less than the next entry in the other list - skip over any entries that are earlier than the other list, then report all those as missing once you find the next matching entry?
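A sketch of such a single merge-style pass, reporting every mismatch rather than only the first. This is a simplification and not the patch's logic: it treats two ranges as matching only when they are identical, whereas the verifier in the patch checks containment. The Range, RangeDiff and diffSortedRanges names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Half-open address range [Low, High). Hypothetical type for illustration.
struct Range {
  uint64_t Low;
  uint64_t High;
};

static bool lessRange(const Range &A, const Range &B) {
  return std::tie(A.Low, A.High) < std::tie(B.Low, B.High);
}

struct RangeDiff {
  std::vector<Range> OnlyInAranges;
  std::vector<Range> OnlyInCU;
};

// Walk both sorted lists once, like a merge, collecting every range that
// has no identical counterpart in the other list.
RangeDiff diffSortedRanges(const std::vector<Range> &Aranges,
                           const std::vector<Range> &CURanges) {
  assert(std::is_sorted(Aranges.begin(), Aranges.end(), lessRange));
  assert(std::is_sorted(CURanges.begin(), CURanges.end(), lessRange));
  RangeDiff D;
  size_t I = 0, J = 0;
  while (I < Aranges.size() && J < CURanges.size()) {
    if (lessRange(Aranges[I], CURanges[J]))
      D.OnlyInAranges.push_back(Aranges[I++]);
    else if (lessRange(CURanges[J], Aranges[I]))
      D.OnlyInCU.push_back(CURanges[J++]);
    else {
      ++I; // Identical ranges: consume both.
      ++J;
    }
  }
  // Whatever is left in either list is unmatched.
  D.OnlyInAranges.insert(D.OnlyInAranges.end(), Aranges.begin() + I,
                         Aranges.end());
  D.OnlyInCU.insert(D.OnlyInCU.end(), CURanges.begin() + J, CURanges.end());
  return D;
}
```

After sorting, the pass is linear in the combined length of the two lists, and each entry of OnlyInAranges or OnlyInCU could be turned into one verifier error.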

I am not opposed to seeing the full inconsistency though as if something is in .debug_aranges, it should really also be in DW_AT_ranges IMHO. Do you agree?

Nah, I don't think that's something to warn or error on - the spec seems pretty clear that aranges should include data and code addresses and high_pc/low_pc/ranges should only include code addresses.

So, yeah, if you want to implement this on-spec, you'd need to search through all the DIEs looking for code addresses (which is non-trivial, since you have to go look in global variable exprlocs and go scrumping through the operations to pick the address operations out of them - and determining the length you want to use to verify against aranges would be non-trivial)

It's complexity like this that is why I'd rather not do any of this work - flag aranges as deprecated and move on.

1093–1130

Here's the function I was thinking of: https://github.com/llvm/llvm-project/blob/2bdfececef4330b3a6489cdb67c57eb771d5f9e4/llvm/include/llvm/BinaryFormat/Dwarf.h#L780

Looks like for old school .debug_ranges we just subtract 1 from that with a comment to explain: https://github.com/llvm/llvm-project/blob/2bdfececef4330b3a6489cdb67c57eb771d5f9e4/llvm/lib/DebugInfo/DWARF/DWARFDebugRangeList.cpp#L92

So we could roll that into a helper function (either the same helper function with an extra parameter to the helper function or some other way).

Perhaps all this code could just use a higher level abstraction - but I guess maybe we don't have a higher level abstraction for aranges, as we do for ranges (see the code above and the general DIE::getAddressRanges, which returns ranges not including dead code using whatever the appropriate tombstone values are) - these don't support 0 as a tombstone owing to its ambiguity/uncertainty.
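A rough sketch of what such a helper with an extra parameter might look like. Both function names are hypothetical and this is not the LLVM code linked above; it only assumes that the tombstone is the all-ones value for the address size, and that .debug_ranges needs one less because the all-ones value is already reserved there for base address selection entries.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not LLVM's API: the all-ones value for an address
// that is AddrSize bytes wide. AddrSize must be in the range 1..8.
inline uint64_t maxAddressTombstone(uint8_t AddrSize) {
  return std::numeric_limits<uint64_t>::max() >> ((8 - AddrSize) * 8);
}

// Hypothetical wrapper with the extra parameter: .debug_ranges cannot use
// the all-ones value because it marks a base address selection entry, so
// the tombstone there is one less.
inline uint64_t tombstoneFor(uint8_t AddrSize, bool IsDebugRanges) {
  return maxAddressTombstone(AddrSize) - (IsDebugRanges ? 1 : 0);
}

// A range whose low address equals the tombstone describes dead code.
inline bool isDeadRange(uint64_t Low, uint8_t AddrSize, bool IsDebugRanges) {
  return Low == tombstoneFor(AddrSize, IsDebugRanges);
}
```

As noted above, 0 is deliberately not treated as a tombstone here, since it is ambiguous with a legitimate address.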

OK, got to the bottom of this.
To refresh memory this is for these two types of verify errors:
error: .debug_aranges[0x00000000] compile unit range [0x0000000000000600 - 0x000000000000063f) not in .debug_aranges.
error: .debug_aranges[0x00000000] compile unit @ 0x00000000 line table sequence [0-5) has row[0] with address 0x0000000000000600 that is not in .debug_aranges.

The binary was built with -g1 and debug info profiling is not enabled.
When machine function is processed things diverge here:
https://github.com/llvm/llvm-project/blob/a3a9fffea1bfd14ea509007cbf4f9fdb4d602c7c/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L2223

For functions that have abstract scopes, it falls through and creates a DIE and an aranges entry. For the others, it just adds the range to the ranges in the CU.

So should it always add to aranges?

Repurposed one of the tests

namespace test {
  extern int a;
}
using test::a;
int GlobalConst = 42;
int Global;
struct S {
  static const int constant = 24;
  int (*fn)(int);
} s;
int __attribute__((always_inline)) square(int i) { return i * i; }
int cube(int i) {
  int squared = square(i);
  return squared*i;
}

int main(){
  return 0;
}
0x0000000b: DW_TAG_compile_unit [1] *
              DW_AT_producer [DW_FORM_strp] ( .debug_str[0x00000000] = "clang version 16.0.0")
              DW_AT_language [DW_FORM_data2]  (DW_LANG_C_plus_plus_14)
              DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000086] = "example.cpp")
              DW_AT_stmt_list [DW_FORM_sec_offset]  (0x00000000)
              DW_AT_comp_dir [DW_FORM_strp] ( .debug_str[0x00000092] = "examples")
              DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
              DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
                 [0x0000000000000600, 0x0000000000000610)
                 [0x0000000000000610, 0x0000000000000630)
                 [0x0000000000000630, 0x000000000000063f))

0x0000002a:   DW_TAG_subprogram [2]   (0x0000000b)
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000c1] = "square")
                DW_AT_inline [DW_FORM_data1]  (DW_INL_inlined)

0x00000030:   DW_TAG_subprogram [3] * (0x0000000b)
                DW_AT_low_pc [DW_FORM_addr] (0x0000000000000610)
                DW_AT_high_pc [DW_FORM_data4] (0x00000020)
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000c8] = "cube")

0x00000041:     DW_TAG_inlined_subroutine [4]   (0x00000030)
                  DW_AT_abstract_origin [DW_FORM_ref4]  (cu + 0x002a => {0x0000002a} "square")
                  DW_AT_low_pc [DW_FORM_addr] (0x000000000000061d)
                  DW_AT_high_pc [DW_FORM_data4] (0x00000007)
                  DW_AT_call_file [DW_FORM_data1] ("example.cpp")
                  DW_AT_call_line [DW_FORM_data1] (13)
                  DW_AT_call_column [DW_FORM_data1] (0x11)

0x00000055:     NULL

0x00000056:   NULL

.debug_aranges contents:
Address Range Header: length = 0x0000002c, format = DWARF32, version = 0x0002, cu_offset = 0x00000000, addr_size = 0x08, seg_size = 0x00
[0x0000000000000610, 0x0000000000000630)
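
To make the mismatch in the dump above concrete, here is a small sketch that merges the CU's DW_AT_ranges and checks them against the single .debug_aranges entry. The `Range` struct and the functions `coalesce`, `contains`, and `uncovered` are hypothetical names for illustration, and it is an assumption that the verifier merges adjacent ranges this way (the reported [0x600 - 0x63f) range is at least consistent with that).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [Low, High); not an LLVM type.
struct Range {
  uint64_t Low;
  uint64_t High;
};

// Sort by low address and merge overlapping or adjacent ranges.
std::vector<Range> coalesce(std::vector<Range> Rs) {
  std::sort(Rs.begin(), Rs.end(),
            [](const Range &A, const Range &B) { return A.Low < B.Low; });
  std::vector<Range> Out;
  for (const Range &R : Rs) {
    if (!Out.empty() && R.Low <= Out.back().High)
      Out.back().High = std::max(Out.back().High, R.High);
    else
      Out.push_back(R);
  }
  return Out;
}

// True if R lies entirely inside one of the already-coalesced Cover ranges.
bool contains(const std::vector<Range> &Cover, const Range &R) {
  for (const Range &C : Cover)
    if (C.Low <= R.Low && R.High <= C.High)
      return true;
  return false;
}

// Merged CU ranges that are not fully covered by the aranges entries.
std::vector<Range> uncovered(const std::vector<Range> &CURanges,
                             const std::vector<Range> &Aranges) {
  std::vector<Range> Cover = coalesce(Aranges);
  std::vector<Range> Bad;
  for (const Range &R : coalesce(CURanges))
    if (!contains(Cover, R))
      Bad.push_back(R);
  return Bad;
}
```

With the three CU ranges and the one aranges entry from the dump, the CU ranges merge into [0x600, 0x63f), which [0x610, 0x630) does not cover, so that merged range is the one flagged.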

So should it always add to aranges?

Possibly - further discussion on the fix over on the review.

Maybe - discussion over there.

llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
1078–1088

(oh, and in something like -g1, you wouldn't find any global variable DIEs - not sure if they currently appear in aranges, but if they do, then there's no way to say "there's something in aranges that's missing from the DIEs" because they could all be global variables - but it's probably OK if we wanted to fix that by removing global variable aranges entries in -g1 since there's no info/DIEs about them anyway)

DianQK added a subscriber: DianQK. Nov 2 2022, 6:36 PM