verifyDieRanges function checks for the intersected address ranges.
It adds child DieRangeInfo into parent DieRangeInfo to check
whether children have overlapping address ranges. It is safe to not add
DieRangeInfo with empty address range into parent's children list.
This decreases the number of children which should be navigated and as a result
decreases execution time(parents having a lot of children with empty ranges
spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl"
execution time decreased from 220 sec till 75 sec.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Is this only for DIEs that have ranges (I guess DIEs that have no ranges/high/low they'll go through the error path at line 379? or do DIEs without those attributes silently create an empty range?)?
If it's only for DIEs with ranges (high/low/ranges) - then I guess they only end up empty if using the new tombstone addresses or using the now default lld/bfd relocations and ranges (or address encoding (rather than length encoding) for DW_AT_high_pc), so the address ranges end up empty? (rather than gold's tombstoning which would produce zero-based but non-empty ranges)
DIEs without those attributes create an empty range. DWARFDie::getAddressRanges() function returns DWARFAddressRangesVector() for all dies that have no ranges/high/low. Thus, it would not go through the error path at line 379. Instead it would create an empty DieRangeInfo RI(RI.Ranges.empty());
If it's only for DIEs with ranges (high/low/ranges) - then I guess they only end up empty if using the new tombstone addresses or using the now default lld/bfd relocations and ranges (or address encoding (rather than length encoding) for DW_AT_high_pc), so the address ranges end up empty? (rather than gold's tombstoning which would produce zero-based but non-empty ranges)
That would be another case. RI.Ranges would be non empty in this case. It would contain an empty address range.
Ah, yeah, if this produces empty ranges for every DIE that has nothing to do with addresses... then yeah, that's going to be a problem/this is a good thing to do for sure, it seems to me.
If it's only for DIEs with ranges (high/low/ranges) - then I guess they only end up empty if using the new tombstone addresses or using the now default lld/bfd relocations and ranges (or address encoding (rather than length encoding) for DW_AT_high_pc), so the address ranges end up empty? (rather than gold's tombstoning which would produce zero-based but non-empty ranges)
That would be another case. RI.Ranges would be non empty in this case. It would contain an empty address range.
Ah *nod* wouldn't be covered here. Perhaps a more general fix could be made to the way DieRangeInfo does its work? Specifically DieRangeInfo::intersects(DieRangeInfo) - perhaps it could favor whichever has the smaller list? (currently it iterates all the LHS and checks them against the RHS - but usually, especially for this use in the verifier, the LHS will be much larger (containing the aggregate of all the ranges seen so far) than the RHS - I'd hazard that even just always flipping it (always iterating RHS to check for presence in LHS) would be faster, but to be general/symmetric, perhaps checking which is smaller and iterating that would be best). That would let the empty case fall out naturally, and hopefully make the several zero-length ranges (like linker-gc'd functions) better too, hopefully?
This looks good to me but needs a test.
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
386 | As I was looking at this code and seeing as this patch wants to improve performance, can we change this to a const reference to avoid copying the ranges here? Compile units can have many ranges. | |
419 | On the other topic of dead stripping, I would suggest doing something like llvm-gsymutil does in llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp in the function handleObjectFile(...) where it iterates over all sections in a object file and comes up with .text address ranges for any sections that is a code section. This allows llvm-gsymutil to ignore any functions that should have been dead stripped by ignoring any functions with ranges that do not exist in any code ranges. The code to find all text ranges looks like: AddressRanges TextRanges; for (const object::SectionRef &Sect : Obj.sections()) { if (!Sect.isText()) continue; const uint64_t Size = Sect.getSize(); if (Size == 0) continue; const uint64_t StartAddr = Sect.getAddress(); TextRanges.insert(AddressRange(StartAddr, StartAddr + Size)); } We would store the "TextRanges" in the verifier itself and we would check any ranges we find and ignore any ranges that don't fall into any of the TextRanges. This will take care of all dead stripped functions and stop them from emitting extra errors. This can be done in a separate patch though. |
Ah *nod* wouldn't be covered here. Perhaps a more general fix could be made to the way DieRangeInfo does its work? Specifically DieRangeInfo::intersects(DieRangeInfo) - perhaps it could favor whichever has the smaller list? (currently it iterates all the LHS and checks them against the RHS - but usually, especially for this use in the verifier, the LHS will be much larger (containing the aggregate of all the ranges seen so far) than the RHS - I'd hazard that even just always flipping it (always iterating RHS to check for presence in LHS) would be faster, but to be general/symmetric, perhaps checking which is smaller and iterating that would be best). That would let the empty case fall out naturally, and hopefully make the several zero-length ranges (like linker-gc'd functions) better too, hopefully?
I think I did not get the idea. DieRangeInfo::intersects always compares ranges pair by pair. It looks like no sense to change iterating from RHS to LHS:
bool DWARFVerifier::DieRangeInfo::intersects(const DieRangeInfo &RHS) const { auto I1 = Ranges.begin(), E1 = Ranges.end(); auto I2 = RHS.Ranges.begin(), E2 = RHS.Ranges.end(); while (I1 != E1 && I2 != E2) { if (I1->intersects(*I2)) return true; if (I1->LowPC < I2->LowPC) ++I1; else ++I2; } return false; }
Also, it looks like there is no problem with ranges of zero length. The problem is when there are no ranges at all.
In that case, we do not need to add DieRangeInfo RI into the Children list, to prevent redundant checks and navigation of a lot of empty structures:
DWARFVerifier::DieRangeInfo::die_range_info_iterator DWARFVerifier::DieRangeInfo::insert(const DieRangeInfo &RI) { auto End = Children.end(); auto Iter = Children.begin(); while (Iter != End) { if (Iter->intersects(RI)) <<<<<<<<<<<<<<<<<<<<< redundant check if *Iter does not have a range return Iter; ++Iter; <<<<<<<<<<<<<<<<<<<< redundant navigation if *Iter does not have a range } Children.insert(RI); return Children.end(); }
Probably, I misunderstood your suggestion.
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
419 | Yes, that should be done as a separate patch. Though, the question is do we want to make verifier silent on these cases? Or do we want that dwarf producer fixes the generated dwarf instead(removes debuginfo related to dead striped code)? |
So I ran converted the yaml into an object file and ran my current llvm-dwarfdump against it:
$ llvm-dwarfdump /tmp/a.out --verify --all Verifying /tmp/a.out: file format Mach-O 64-bit x86-64 Verifying .debug_abbrev... Verifying .debug_info Unit Header Chain... Verifying .debug_info references... Verifying .debug_types Unit Header Chain... Verifying .debug_line... No errors.
Don't we need the "no_range1" and "no_range2" to have an address range with zero size where high and low PC values are the same?
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
419 | There is really no hope if this ever happening AFAIK. Most linkers do not understand DWARF at all, all they do is concatenate all DWARF sections from each .o file to each other and apply relocations. Anything that has no relocations gets a tombstone value of 0 or -1. The reason is DWARF is a stream that can't easily be manipulated with the standard toolbox available in most linkers, so it is really hard to remove anything without affecting the entire DWARF CU stream. I would love it if producers were able to do this, but without major changes to DWARF itself, I don't think a solution is coming anytime soon. |
Right. This patch does not fix any error. The only effect is that performance has been improved. Is there a way to write a lit test checking that the performance improved?
Currently the test checks that this patch does not introduce new error.
Don't we need the "no_range1" and "no_range2" to have an address range with zero size where high and low PC values are the same?
No we do not. zero length address range is a separate case. This patch fixes a problem when dies, not having any range, affect performance.
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
437–445 | This line will actually stop errors from being emitted where they wouldn't before. If you can make a DW_TAG_subprogram like this: DW_TAG_subprogram DW_AT_name ("main") DW_AT_low_pc(0x0) DW_AT_high_pc(0x0) << Encode with a DW_FORM_addr whose value is the same as the low PC DW_TAG_lexical_block DW_AT_low_pc(0x1000) DW_AT_high_pc(0x2000) This would have been reported before, but wouldn't now because there are no valid ranges in the parent right? |
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
437–445 | no. in this case parent would have a zero length address range ParentRI.Ranges[0]={0,0}. RI.Ranges would be non empty since RI.Ranges[0]={1000,2000}. and then the error would be reported. Will add a test case proves that the error is still reported. |
Hmm - this is a new proposal ^ (it's not how the code's currently written)? Not sure I follow what this suggestion would provide.
Also, it looks like there is no problem with ranges of zero length. The problem is when there are no ranges at all.
In that case, we do not need to add DieRangeInfo RI into the Children list, to prevent redundant checks and navigation of a lot of empty structures:DWARFVerifier::DieRangeInfo::die_range_info_iterator DWARFVerifier::DieRangeInfo::insert(const DieRangeInfo &RI) { auto End = Children.end(); auto Iter = Children.begin(); while (Iter != End) { if (Iter->intersects(RI)) <<<<<<<<<<<<<<<<<<<<< redundant check if *Iter does not have a range return Iter; ++Iter; <<<<<<<<<<<<<<<<<<<< redundant navigation if *Iter does not have a range } Children.insert(RI); return Children.end(); }Probably, I misunderstood your suggestion.
Sorry, yeah, I'm still confused. By "RHS' and "LHS" I didn't mean the start/end of each pair, but the two DieRangeInfos - LHS == this, RHS == the 'RHS' parameter to the function.
I mean having insert look something like this:
bool DWARFVerifier::DieRangeInfo::intersects(const DieRangeInfo &OriginalRHS) const { bool Swap = Ranges.size() > RHS.Ranges.size(); auto &LHS = Swap ? OriginalRHS : *this; auto &RHS = Swap ? *this : OriginalRHS; auto End = LHS.end(); auto Iter = LHS.begin(); while (Iter != End) { bool Intersects; iterator IntersectingIterator; std::tie(Intersects, IntersectingIterator) = Iter->intersects(RHS); // imagine a DieRangeInfo::intersects that returns both the bool, but also an iterator about where the intersection was if (Intersects) return Swap ? IntersectingIterator : Iter; ++Iter; } Children.insert(OriginalRHS); return Children.end(); }
This way, not only does the empty case perform better (if either LHS or RHS is empty, the algorithm will be quick - because the while loop would never run) - but also non-empty cases would be more efficient, I think, maybe?
Though honestly - maybe a broader change in direction would be better: What if the DieRangeInfos were sorted (I guess they are, since they're in a std::set?)? Then it wouldn't be an exhaustive search, but use find to find the relevant part of the list to look at instead. Hmm, maybe sorted in a contiguous data structure (like std::vector) and then binary searching the flat structure - and also then being able to binary search only the remainder of the list when looking for the next entry? Not sure, though.
Above is current upstream implementation of DieRangeInfo::intersects(const DieRangeInfo &RHS).
Also, it looks like there is no problem with ranges of zero length. The problem is when there are no ranges at all.
In that case, we do not need to add DieRangeInfo RI into the Children list, to prevent redundant checks and navigation of a lot of empty structures:DWARFVerifier::DieRangeInfo::die_range_info_iterator DWARFVerifier::DieRangeInfo::insert(const DieRangeInfo &RI) { auto End = Children.end(); auto Iter = Children.begin(); while (Iter != End) { if (Iter->intersects(RI)) <<<<<<<<<<<<<<<<<<<<< redundant check if *Iter does not have a range return Iter; ++Iter; <<<<<<<<<<<<<<<<<<<< redundant navigation if *Iter does not have a range } Children.insert(RI); return Children.end(); }Probably, I misunderstood your suggestion.
Sorry, yeah, I'm still confused. By "RHS' and "LHS" I didn't mean the start/end of each pair, but the two DieRangeInfos - LHS == this, RHS == the 'RHS' parameter to the function.
I mean having insert look something like this:
bool DWARFVerifier::DieRangeInfo::intersects(const DieRangeInfo &OriginalRHS) const { bool Swap = Ranges.size() > RHS.Ranges.size(); auto &LHS = Swap ? OriginalRHS : *this; auto &RHS = Swap ? *this : OriginalRHS; auto End = LHS.end(); auto Iter = LHS.begin(); while (Iter != End) { bool Intersects; iterator IntersectingIterator; std::tie(Intersects, IntersectingIterator) = Iter->intersects(RHS); // imagine a DieRangeInfo::intersects that returns both the bool, but also an iterator about where the intersection was if (Intersects) return Swap ? IntersectingIterator : Iter; ++Iter; } Children.insert(OriginalRHS); return Children.end(); }This way, not only does the empty case perform better (if either LHS or RHS is empty, the algorithm will be quick - because the while loop would never run) - but also non-empty cases would be more efficient, I think, maybe?
Though honestly - maybe a broader change in direction would be better: What if the DieRangeInfos were sorted (I guess they are, since they're in a std::set?)? Then it wouldn't be an exhaustive search, but use find to find the relevant part of the list to look at instead. Hmm, maybe sorted in a contiguous data structure (like std::vector) and then binary searching the flat structure - and also then being able to binary search only the remainder of the list when looking for the next entry? Not sure, though.
I see. So you are speaking not about DieRangeInfo::intersects(const DieRangeInfo &RHS) but about DieRangeInfo::insert(const DieRangeInfo &RI).
In suggested implementation we still need to check for OriginalRHS.Ranges.empty() to avoid storing it into children list:
if ( !OriginalRHS.Ranges.empty() ) { Children.insert(OriginalRHS); return Children.end(); }
But I understood your suggestion and will add binary search inside Children for DieRangeInfo::insert(const DieRangeInfo &RI).
I mean having insert look something like this:
bool DWARFVerifier::DieRangeInfo::intersects(const DieRangeInfo &OriginalRHS) const { bool Swap = Ranges.size() > RHS.Ranges.size(); auto &LHS = Swap ? OriginalRHS : *this; auto &RHS = Swap ? *this : OriginalRHS; auto End = LHS.end(); auto Iter = LHS.begin(); while (Iter != End) { bool Intersects; iterator IntersectingIterator; std::tie(Intersects, IntersectingIterator) = Iter->intersects(RHS); // imagine a DieRangeInfo::intersects that returns both the bool, but also an iterator about where the intersection was if (Intersects) return Swap ? IntersectingIterator : Iter; ++Iter; } Children.insert(OriginalRHS); return Children.end(); }This way, not only does the empty case perform better (if either LHS or RHS is empty, the algorithm will be quick - because the while loop would never run) - but also non-empty cases would be more efficient, I think, maybe?
Though honestly - maybe a broader change in direction would be better: What if the DieRangeInfos were sorted (I guess they are, since they're in a std::set?)? Then it wouldn't be an exhaustive search, but use find to find the relevant part of the list to look at instead. Hmm, maybe sorted in a contiguous data structure (like std::vector) and then binary searching the flat structure - and also then being able to binary search only the remainder of the list when looking for the next entry? Not sure, though.
I see. So you are speaking not about DieRangeInfo::intersects(const DieRangeInfo &RHS) but about DieRangeInfo::insert(const DieRangeInfo &RI).
Ah, sorry, no - I did mean insert, got the function name wrong when I was assembling the example code. (the implementation's still the implementation of insert - since it has the Children.insert code, etc, just the function name is incorrect)
In suggested implementation we still need to check for OriginalRHS.Ranges.empty() to avoid storing it into children list:
if ( !OriginalRHS.Ranges.empty() ) { Children.insert(OriginalRHS); return Children.end(); }
That's fair. Is that causing scaling problems? In terms of making 'Children' large and so it's really slow to iterate through? Fair enough - yeah, then I don't object so much to just checking that more up-front (but maybe up-front inside "insert" rather than in the caller - to make "insert" more generically robust?
But I understood your suggestion and will add binary search inside Children for DieRangeInfo::insert(const DieRangeInfo &RI).
There might still be room for this ^ work too, but probably in a separate patch/in addition, not as an alternative to the patch you're proposing here - I think this patch here can move forward, just with the check moved into "insert" so it's more generically robust.
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
419 | yep. There is no clear perspective that linkers/dwarf producers would support removing unused parts from the DWARF info. But there is a suggestion which is much easier and might be supported by DWARF producers - http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 . I agree that the suggested functionality(gsymutil like) would be helpful. It allows to filter out a ton of messages about "overlapping address ranges". Probably it would be good if it would not be done by default, but might be requested when necessary. Then the problem would be clearly seen and it will motivate to support "Reserve an address value for "not present"" solution. |
llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp | ||
---|---|---|
386 | done. | |
452 | The meaning of "ShouldBeContained" looks to be different than "contains". f.e. contains() returns false if (ParentRI/this).Ranges is empty. So it looks like contains could not have the same handling. |
As I was looking at this code and seeing as this patch wants to improve performance, can we change this to a const reference to avoid copying the ranges here? Compile units can have many ranges.