This is an archive of the discontinued LLVM Phabricator instance.

Add DWARF verifiers to verify address ranges are correct and scoped correctly.
Needs ReviewPublic

Authored by clayborg on May 3 2017, 12:02 PM.

Details

Summary

This change adds:

  • Verify all DIEs with address ranges have valid ranges
  • If a CU has a low/high PC or has a DW_AT_ranges, then verify that and DW_TAG_subprogram is completely contained in the CU ranges
  • Keep a list of all address ranges for any DW_TAG_subprogram DIEs and verify none overlap
  • verify that all DW_TAG_lexical_block and DW_TAG_inlined_subroutine address ranges are completely contained in the parent address ranges

Diff Detail

Event Timeline

clayborg created this revision.May 3 2017, 12:02 PM
dblaikie edited edge metadata.May 3 2017, 1:00 PM

I haven't followed some of the design direction Adrian's been pointing this work in - so all of this is subject to discussion, etc:

  1. I'd expect not to build an entire representation of all ranges up-front, then verify them (uses more memory, etc) - but rather verify during a traversal of the DIEs instead
  1. for/for loops to check if one range is a subset of another's probably rather inefficient. Not sure if LLVM already has better tools for this, though I suspect not.
  1. I'm not sure that function ranges need to be special/separate - I'd expect any nested ranges (eg: the range of two scopes in a function) to be distinct as well, so I'd expect a more general algorithm that walks nodes and their children looking for ranges and checking that children are subsets of parents, and children are distinct from each other.

I haven't followed some of the design direction Adrian's been pointing this work in - so all of this is subject to discussion, etc:

  1. I'd expect not to build an entire representation of all ranges up-front, then verify them (uses more memory, etc) - but rather verify during a traversal of the DIEs instead

The whole reason we need this is to look for overlapping functions between all CUs. This helps us to identify exactly which DIEs overlap. It is also just a DIE (2 pointers) + vector<pair<uint64_t, uint64_t>. Not too costly.

  1. for/for loops to check if one range is a subset of another's probably rather inefficient. Not sure if LLVM already has better tools for this, though I suspect not.

It currently works and I have seen many errors introduced by the compilers when they optimize debug info and they get it wrong. We only check blocks and inline functions against the containing ranges. We can make sure the ranges are coalesced when needed, but other than that, what is innefficient about what is there? 99% of the time it is one range seeing if one other range is contained.

  1. I'm not sure that function ranges need to be special/separate - I'd expect any nested ranges (eg: the range of two scopes in a function) to be distinct as well, so I'd expect a more general algorithm that walks nodes and their children looking for ranges and checking that children are subsets of parents, and children are distinct from each other.

Again, not sure what is wrong with the way it is doing it. We just keep a stack >= depth of the current DIE with ranges filled in. If a block or inlined function is found, it verifies it is in the parent. The DWARF doesn't store this information in any useful way we can access, so I believe this approach work well.

And lets just get something to start with and optimize when we actually find something is taking too much memory or an algorithm is not efficient?

One thing to clarify:

std::vector<RangeInfoType> DieRangeInfos;

is a vector that only gets as deep as the DIE depth. It doesn't get every range for every function, block, or inlined function, it just has a stack of ranges for the _current_ DIE depth.

std::set<RangeInfoType> AllFunctionRangeInfos;

Contains one entry for each DW_TAG_subprogram.

clayborg updated this revision to Diff 97854.May 4 2017, 11:29 AM
  • Make DIE verification recursive
  • Remove all Depth based code and just pass info from the parent
  • add detection of overalap between DW_TAG_lexical_block or DW_TAG_inlined_subroutine DIEs at the same level
  • rename RangeInfoType to DieRangeInfo
  • create a new VerifyDieInfo to help with verification

David: this approach uses your suggested recursive approach and should make things clearer. It also adds verification of overlapping DW_TAG_lexical_block and DW_TAG_inlined_subroutine that are siblings at any level.

clayborg updated this revision to Diff 97857.May 4 2017, 11:43 AM

Fixed address ranges not being contained errors from incorrectly being reported.

Would love to get this one in if anyone any time. I have only one more patch after this for .debug_aranges

aprantl added inline comments.May 5 2017, 9:24 AM
include/llvm/DebugInfo/DWARF/DWARFVerifier.h
78

We usually use \param for consistency.

lib/DebugInfo/DWARF/DWARFVerifier.cpp
44

It doesn't look like this is tested? (as far as I can tell by searching for the error message).

73

test?

129

This entire function has really weird indentation. Please clang-format.

149

. at the end

187

.

197

remove extra {

aprantl resigned from this revision.May 5 2017, 9:31 AM

David, are all your concerns addressed?
Otherwise from my end this change LGTM (with all outstanding issues addressed).

David, are all your concerns addressed?

Nah - I'll go over it on Monday.

Otherwise from my end this change LGTM (with all outstanding issues addressed).

lib/DebugInfo/DWARF/DWARFVerifier.cpp
104–105

Else after return

clayborg updated this revision to Diff 97975.May 5 2017, 9:41 AM

Fixes:

  • addressed Adrian's issues
  • added test for invalid DIE ranges where high pc is less than low pc
clayborg marked 2 inline comments as done.May 5 2017, 9:43 AM
clayborg added inline comments.
lib/DebugInfo/DWARF/DWARFVerifier.cpp
44

I added a test.

73

There are two: TestDwarfVerifyOverlappingFunctionRanges and TestDwarfVerifyOverlappingLexicalBlockRanges

clayborg updated this revision to Diff 97976.May 5 2017, 9:44 AM

Fixed else after return.

clayborg updated this revision to Diff 97999.May 5 2017, 12:21 PM

Moved ranges comparisons over into header file so we can unit test. The unit tests will help in case we make AnyDWARFRangesIntersect or DWARFRangesContains changes that assume sorted lists and try to do things more efficiently.

dblaikie added inline comments.May 8 2017, 1:46 PM
lib/DebugInfo/DWARF/DWARFVerifier.cpp
60–71

This still looks like it only compares adjacent pairs:

{{1, 2}, {2, 3}, {3, 4}}

Is that not the case? If it does' what if {1,3} overlap, but neither overlaps with 2? Will that be diagnosed?

134–163

I /think/ a lot of this code could be simplified if there were (as I mention in another comment) a generalized representation of the "scope of non-overlapping things with a possible parent they should all be within" that could be used at the CU level, subprogram level, and inter-subprogram level.

Let me know if you'd like me to sketch out such a design in more detail.

179–180

Why not do this as elements are visited, rather than at the end in another pass?

I'd imagine a small class that would take DIEs verify their ranges, merge them, flag merge failures (between previous DIEs or parents) & by the end there'd be nothing left to do?

246

If this change is independent of the rest of this patch, please commit it separately (doesn't need to be sent for review if you're just constifying something that should've been const to begin with)

314

Similarly, commit without pre-commit review spelling corrections like this - makes it easier to review the real changes without having them conflated with cleanup changes like this.

335–339

I'd expect to do the subprogram checking incrementally, the same as the lexical scopes are done incrementally when another lexical scope is visited? Is there a reason to do it differently?

(possibly even using the same machinery/abstraction - even if it needs to be special cased to handle the interesting nesting cases)

349

Probably do this with

getUnitDIE(/* ExtractUnitDIEOnly = */ false)

rather than a local constant. Or try the trick Adrian did in a recent patch - an enum with a single value as a constant:

enum { ExtractAllDIEs = false };

(& could include that in the header where getUnitDIE is declared)

clayborg updated this revision to Diff 98226.May 8 2017, 3:41 PM
  • Made a new NonOverlappingRanges class.
  • Use it to verify address ranges on the fly for
    • functions within all CUs
    • lexical block and inlined functions in DW_TAG_subprogram DIEs
clayborg marked 5 inline comments as done.May 8 2017, 3:42 PM
clayborg added inline comments.
lib/DebugInfo/DWARF/DWARFVerifier.cpp
71

These are sorted ranges since the data is in a std::set. You can't have 1 overlap with 3 without 1 and 2 overlapping and 2 and 3 overlapping.

163

That is exactly what ParentVRI is? The class DieRangeInfo maintains this info. I didn't use one for the CU, but I use it for everything else. What am I missing here?

180

We now check for overlapping functions on the fly and overlapping lexical blocks and inlined subroutines on the fly.

314

Sure thing.

339

It has now been inlined.

So now we have DIERangeInfo:

struct DieRangeInfo {
  DWARFDie Die;
  DWARFAddressRangesVector Ranges;
};

And a class to track ranges that can't overlap:

struct NonOverlappingRanges {
  std::set<DieRangeInfo> NonOverlappingRanges;
};

We store a DieRangeInfo for the compile unit name UnitRI and a NonOverlappingRanges instance named AllFunctionRanges in the Verify instance. Then we pass a "const DieRangeInfo &ParentRI" and a "NonOverlappingRanges &NRO" to verifyDIE. Since the parent range info and DIE can't change, they should be in 1 class (DieRangeInfo) and the non overlapping address ranges are in a modifiable in/out argument. This effectively implements what you asked for David. We also do checking on the fly in all cases.

clayborg updated this revision to Diff 98305.May 9 2017, 9:42 AM

Fixed the core issues David was asking for. Main differences are:

  • Now have SortedRanges class that guarantees ranges are sorted instead of relying on DWARFAddressRangesVector being sorted
  • SortedRanges can report if there are invalid ranges in a DWARFAddressRangesVector and if there are overlaps during insert
  • DieRangeInfo now uses SortedRanges instead of DWARFAddressRangesVector
  • DieRangeInfo now reports errors which really cleans up the DWARFVerifier::verifyDie() function
  • NonOverlappingRanges now can report errors when DIEs have overlapping address ranges which really cleans up the DWARFVerifier::verifyDie() function
clayborg updated this revision to Diff 98310.May 9 2017, 9:44 AM

Remove unused DWARFVerifier::verifyNoRangesOverlap() function.

I can work up a prototype/example of what I'm suggesting, if you like - but here are some comments to give the basic idea. It still seems to me like there are more parts to this than necessary.

include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h
25–40 ↗(On Diff #98310)

Might be tidier to make this a struct with a couple of functions:

struct DWARFRange {
  uint64_t Start;
  uint64_t End;
  bool intersects(const DWARFRange&);
  bool contains(const DWARFRange&);
}

That way the function names don't have to repeat the "DWARFRange" prefix.

43 ↗(On Diff #98310)

I don't think this typedef adds a lot now - the original name is shorter & more descriptive (since it makes it clear it's just a plain old std::vector, not anything fancier)

46–63 ↗(On Diff #98310)

Surprised to see these O(N^2) algorithms still here - if the ranges are always sorted first, the pairwise-style algorithm you implemented for overlap within a single range should be usable on these cases I would think. A bit trickier to walk the two lists together, but do-able.

I wonder if LLVM's IntervalMap would be usable here (maps an integer range to something - in this case to a DWARFDie). If it provides an error message or a failable insert - you could walk each element of a range, adding them to the IntervalMap - if they ever collide, error out. Otherwise you have a compact/efficient representation of the ranges you can quickly look up/insert into...

include/llvm/DebugInfo/DWARF/DWARFVerifier.h
30–46

Does this class need to exist? I'm not sure it adds much of an improved abstraction boundary compared to being an implementation detail of DieRangeInfo?

48–74

I'm surprised there's a need for both DieRangeInfo and NonOverlappingRanges - it seems like DieRangeInfo should/could do all the non-overlapping testing too, no?

52

I think the "Die(), Ranges()" is redundant/unnecessary & probably best to write this as:

DieRangeInfo() = default;
lib/DebugInfo/DWARF/DWARFVerifier.cpp
75–103

Ah, here's the other version of "intersects" - it's an "insert and test intersection". I'd expect to only need this version - not the other one that does the O(N^2) set of comparisons. Is that practical? (to only have the one version)

I also wouldn't be terribly sad if ranges were sorted first - that way they'd be quick to compare with the parent range and with the sibling ranges (O(N Log(N)) to sort, then two O(N) walks - one for the parent and one for the siblings. Rather than two O(N log(N)) walks - I mean it's still O(N log(N)) in the end... I don't mind much either way)

114–117

This diagnostic won't include both the DIEs that overlap, which seems difficult for the user. (test case should test this - that both the DIEs are dumped for cases where an error is about two DIEs)

201–217

Why are there 3 different things a subprogram is being added to/processed by/errored by?

I'd have expected only one - the outer/current CU. It'd also need to initialize the current SP (which would have to be a stack of some kind, I guess (like the patch I provided last week), to handle the nested SP definition case) for lexical scopes to compare against/be added to.

207–208

I don't think the CU DIE needs a special case for the diagnostic, does it? I would've expected the diagnostics to be in the RangeInfo & not need to be customized per caller.

209–212

The non-empty check I would guess isn't needed here ("reportErrorIfNotContained" seems like it'd be trivially true if the range is empty - it couldn't possibly be not contained in anything)

214–215

Why would all functions need to be compared? I would expect that functions would be compared to their enclosing CU - if all CUs are not overlapping and all functions are contained within their CU's range then no function could be overlapping (in the same way that if all lexical scopes are contained within their parent, and their parents are non-overlapping (eg: different, non-overlapping functions) then such a pair of lexical scopes couldn't overlap)

222–238

Also here, I wouldn't expect 3 different cases (RI, ParentRI, and NOR) - I'd expect one, maybe two but hopefully abstracted into a single operation I think - "to the current relevant scope add this DIE - which will retrieve its range, check it for local validity, check it doesn't overlap with any siblings in the current scope and is contained in the parent".

clayborg updated this revision to Diff 98859.May 12 2017, 4:17 PM
  • Created a struct named DWARFRange and test it
  • Move all contains and intersects logic into DWARFRange and DieRangeInfo
  • Create a new DWARFVerify::DIEInfo that contains a DieRangeInfo and a NonOverlappingRanges.
  • Simplify verifyDIE as requested to do everything in one place
  • Modified DieRangeInfo::contains and DieRangeInfo::intersects to be faster
  • Added tests for DieRangeInfo::contains and DieRangeInfo::intersects
  • Removed the SortedRanges class and moved it into DieRangeInfo

Looking pretty good - thanks for all your patience!

A few places this could be further tidied up/streamlined - and if you can do anything to pare down the unit tests based on equivalence classes/partitions, that'd be great, it looks like more tests/cases than I'd expect, given the code under test.

include/llvm/DebugInfo/DWARF/DWARFVerifier.h
48

Usually any operator overload that can be a non-member should be (it can be a friend, though in this instance even that's not necessary)

49–51

One quick way to implement this is:

return std::tie(Start, End) < std::tie(RHS.Start, RHS.End);
83

Does llvm's IntervalMap make sense here? Keeping an IntervalMap from the address range to the DIE that occupies that range seems good & less code (no need to write a custom op<, etc) I think... probably.

lib/DebugInfo/DWARF/DWARFVerifier.cpp
37–51

Wouldn't the presence of overlapping ranges in the Ranges list mess with the correctness of other algorithms (like intersection) - you might then have to search arbitrarily far back in the range list from the lower_bound because an earlier range might be longer and extend past the current one?

This seems unreliable and may be better to error on an invalid range and then not process it further? (not produce follow-on errors for intersections, etc)? (also may be better in general rather than producing a bunch of related errors about an invalid range - errors about it overlapping with many things, being outside the parent raneg, etc)

Not sure.

62–72

Seems like a job for erase/remove, maybe:

UnsortedRanges.erase(llvm:::remove_if(UnsortedRanges, [&](const std::pair<uint64_t, uint64_t> &R) { if (R.first < R.second) return false; if (R.first > R.second) HasErrors = true; return false; } , UnsortedRanges.end());

(appropriately indented, etc)

124–134

This and contains still seems more expensive (O(N log(N))) than I'd expect - it should be doable without too much complexity, linearly - by walking both sorted lists at the same time.

Ah, I see, by moving the starting point forward it's a smaller search space - but still not as efficient, I think, as searching linearly from the start of that range, so maybe replacing lower_bound with std::find with a comparator of some kind?

Similar idea for 'contains' above.

139–153

Can probably implement this with:

return std::tie(Ranges, Die.getOffset()) < std::tie(RHS.Ranges, RHS.Die.getOffset());

Also - does this need to support < comparison of two entities with exactly the same ranges but a different DIE? When/where does that come up/what behavior is desirable there?

158–165

Probably invert this to reduce indentation:

if (contains(RI))
  return false;
OS << Error;
...
171–173

Drop braces on single line ifs, and probably change this code to:

bool HasErrors = DieRI.extractRangesAndReportErrors(OS);
178–180

Why is the CU a special case here? (a subprogram can have ranges in a CU with no ranges, but a lexical scope can't have ranges in a subprogram without ranges - is that the condition? I'm not sure why that's the case - could you explain it?)

181–188

Why is there a special case for CU DIEs here? I'd expect the same error message and behavior regardless of whether it's a CU or not. The DIEs should be dumped and that'll make it clear to the user what kind of DIE it is without needing a different error message.

202

Invert to reduce indentation:

if (RangeSet.empty())
  return nullptr;
203–210

Might be worth a couple of comments explaining the conditions these two cases cover.

204–205

Could collapse these together:

if (Iter != RangeSet.end() && Iter->intersects(RI))
208–210

Could collapse these two together:

if (Iter != RangeSet.begin() && (--Iter)->intersects(RI))

Though maybe that doesn't help readability - not sure.

220–223

Probably makes sense for "GetOverlappingRangeInfo" to return just the DWARFDie (assuming it has a null/empty state that's boolean testable) rather than something more complicated - maybe simpler name too "GetOverlappingDIE" I guess.

& you can collapse the declaration:

if (DWARFDie Die = GetOverlappingDie(RI)) {
  ...
228

If you use an IntervalMap, might be worth seeing if you could do an insert+retrieve previous value (but that's probably too niche - since an insert into the IntervalMap could cover multiple previous values). This would avoid GetOverlappingRangeInfo doing similar/the same work as 'insert'.

241–245

Where does the cross-CU checking happen (ensuring no two CUs ranges overlap). I'd expect it to happen here, but don't see it?

unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
2501

/might/ be able to simplify these lines by omitting the "DWARFVerifier::DIERangeInfo(" ")" part (& just leaving the {} init) - you might need to add an extra set of {} around it instead, I'm not sure.

clayborg updated this revision to Diff 99021.May 15 2017, 10:39 AM
  • Address all of David's previous issues
dblaikie added inline comments.May 15 2017, 11:23 AM
lib/DebugInfo/DWARF/DWARFVerifier.cpp
59

Perhaps this should go in 'insert' which is only called from this function anyway?

69

Guessing this should be 'return true;' - otherwise no elements will ever be removed?

Is that tested? (what happens if elements aren't removed here? what would that do to the rest of the code/validation?)

80–83

Only dumps one of the two DIEs that are relevant to this situation. It should probably dump both - and at least some test coverage for this to demonstrate the usability of dumping two DIEs, etc.

124–130

This idiom seems to have appeared in a few places (similar code in DieRangeInfo::insert & sort of similar stuff in 'contains') any chance of factoring that out?

124–134

Still wondering about this ^

139–153

Still curious about this: does this need to support < comparison of two entities with exactly the same ranges but a different DIE? When/where does that come up/what behavior is desirable there?

155–159

Why is a rangeless CU a special case/acceptable here, in comparison to say, a rangeless subprogram or lexical_block?

178

grammar/missing word "Iter <something here> might the address range that follows RI"?

205

Roll this into its one use:

switch (Die.getTag())
212–217

Where, if anywhere, does the case of overlapping CUs get handled? That should be an error, right?

214–215

If the RI is the only part of UnitDI that's used - maybe only have the RI in the first place here?

But maybe the NOR in UnitDI should be used - that would catch cases where two CUs overlap with each other, right? & then all 3 cases (compile_unit, subprogram, lexical_block) would look quite the same, I think, just operating on different DieInfo objects?

224–227

Inconsistent bracing (some blocks in this switch have braces, some don't, for essentially the same code)

clayborg updated this revision to Diff 99147.May 16 2017, 8:01 AM
  • Added an assert to catch if we don't remove overlapping or invalid ranges that catches the issue where the lambda was returning false instead of true and modified DieRangeInfo::insert() to assert as mentioned.
  • fixed ::contains and ::intersects to use a more efficient traversal.
  • fixed all other issues
dblaikie added inline comments.May 16 2017, 11:34 AM
lib/DebugInfo/DWARF/DWARFVerifier.cpp
59

Still curious about this

69

While adding an assert is useful - I'm still curious about what would go wrong if/when this wasn't removing the elements it's intended to? Is this purely a performance improvement (if so, it's fine as-is, with the assert). Does this change behavior at all, in a way that matters/is intentional? (eg: does it make follow-on error messages better?) If so, that behavior should be tested.

80–83

Still curious about this

124–130

Still curious about this - this code and DieRangeInfo::insert (& similar, though not identical code for 'contains' tests) were duplicate, and now this code is fixed but DieRangeInfo::insert is not. Please refactor so the code's written once, not twice to avoid this kind of divergence/duplication/repeated work.

130–132

I'd probably skip this - the algorithm's linear anyway & pretty tidy without it. I suppose it catches the common case where they don't overlap at all, but might be better done another way (range lists could keep their max/min bound & shortcircuit here, etc)

134–140

This looks more expensive than what I had in mind - I figured a linear search to find the item, then testing 'intersects', rather than testing intersects on every item along the way.

139–153

Still curious about this

155–159

Still curious about this

175–176

Probably "emit (or report, to match the function name) an error" ("make an error" sounds curious, like maybe it returns an llvm::Error (though even that I'd expect to be described as "returns an Error") that the caller is responsible for emitting to the user, etc)

182

DWARFDie is already boolean testable - does it need to be wrapped in Optional? Or could this return an invalid DWARFDie to signal the "no overlap" case?

212–217

Still wondering about this.

214–215

Still curious about this

clayborg added inline comments.May 16 2017, 11:37 AM
include/llvm/DebugInfo/DWARF/DWARFVerifier.h
46

Yes this can be moved into DieRangeInfo

52

Will change it.

74

We need both. NonOverlappingRanges keeps a set of "DIE + ranges" so we can tell which DIE had ranges that don't overlap. DieRangeInfo is ranges for a specific Die. For all function address ranges (AllFunctionRanges below), we need to know which DIE the range came from.

83

I don't it does. Looking at the IntervalMap::insert() function headerdoc:

/// insert - Add a mapping of [a;b] to y, coalesce with adjacent intervals.
/// It is assumed that no key in the interval is mapped to another value, but
/// overlapping intervals already mapped to y will be coalesced.

So it is says it can have no keys mapped to different values, and we are watching for possible overlaps and possible DIEs that have the same range.

lib/DebugInfo/DWARF/DWARFVerifier.cpp
51

Longer ranges will appear at the end since the comparison does both the Start and End so this should work right?

59

DieRangeInfo::extractRangesAndReportErrors() is called from two places. This function calls DWARFVerifier::DieRangeInfo::insert() and also the DieRangeInfo constructor calls it for testing.

69

It didn't affect the tests because DWARFVerifier::DieRangeInfo::insert() was removing them when they were inserted. I have removed the check from insert and added an assert that will fire and catch the above error when the final "return false;" is in, but succeeds when it is true. The DWARFVerifier::DieRangeInfo::insert() is used for testing with the DieRangeInfo constructor and those ranges must to be valid (sorted and non overlapping) and the assert that was added enforces that.

83

Again, this is only checking if the ranges within 1 DIE overlap while it is adding these ranges. Only 1 die needs to be printed.

117

This is detecting ranges within a Die that overlap or are invalid and the one and only Die is dumped below. This isn't checking if multiple Dies overlap, that is elsewhere. I do need to add a test for this though

130

The logic is simpler now, let me know if it still needs to be broken out.

153

Yes, we need to find errors if two DIEs have the same low/high PC.

159

rangless subprograms are fine, but not rangeless subprograms with ranged lexical blocks. So:

1 - Ranged CU with ranged subprogram -> OK as long as contained
2 - Rangless CU with rangeless subprogram -> OK
3 - Ranged CU with rangeless subprogram -> OK

1 - Rangless subprogram/block/inlined with contained rangeless block/inlined -> OK
2 - Rangless subprogram/block/inlined with contained ranged block/inlined -> bad
3 - Ranged subprogram/block/inlined with contained ranged block/inlined -> OK as long as contained

#2 is where they differ

180

You got it right. A CU may or may not have ranges and if it doesn't we don't need to check if DW_TAG_subprogram DIEs are contained. Lexical blocks and inline functions must be contained in parent.

188

One error message will be fine, I'll change it.

208

DW_TAG_subprogram DIEs are always direct children of the CU DIE so this special case is needed. We are always passing a parent range info in ParentRI, but that might be a DW_TAG_class_type for a DW_TAG_subprogram. So for DW_TAG_subprogram we always check against the CU and only check against the CU if it has ranges. For blocks we always check if the block has ranges, then the parent must have the ranges. So this needs to stay.

212

It is ok for the CU to be empty, not when you have a DW_TAG_lexical_block. If it has address range(s), then the parent must contain it. Not true for compile units. So the empty check is saying "it is ok if you just have a DW_AT_low_pc". If the CU has a low/high PC or a DW_AT_ranges, then it must contain all functions.

215

NOR is used to detect any functions overlapping. So the UnitDI.RI is used to track the Unit DIE and its ranges so each subprogram in the CU can verify it is in the CU ranges if the CU has range(s), and the NOR is never cleared across all CUs so we can verify that no functions overlap within a CU. So this is already happening.

216

This is the easiest way to do it.

I would expect that functions would be compared to their enclosing CU - if all CUs are not overlapping and all functions are contained within their CU's range then no function could be overlapping (in the same way that if all lexical scopes are contained within their parent, and their parents are non-overlapping (eg: different, non-overlapping functions) then such a pair of lexical scopes couldn't overlap)

So what if a CU has no ranges? Then we need to actually make and keep a list of all CUs and then compare them at the end. And we would still need to tell which DIEs overlapped in the end. This is easier and cleaner IMHO.

217

It checks 3 things:
1 - if the DIE has bad ranges or ranges that overlap within itself
2 - if the CU has ranges, make sure its ranges are in them
3 - make sure that the function doesn't overlap globally with any other function's address range

This seems pretty simple.

218

overlapping CUs happens below. Each DW_TAG_subprogram will check if it is in the UnitDI.RI's ranges and also add itself to the UnitDI.NOR to verify no functions globally overlap.

224–225

The above two lines need to be removed.

238

Sounds good.

245

The "NOR" in the UnitDI never gets cleared, so it is doing the cross CU checking. UnitDI.RI gets updated with the current CU info, but UnitDI.NOR never gets cleared.

Sorry I hadn't posted my reply comments, just did that. Will look at your new comments and answer as needed.

clayborg updated this revision to Diff 99177.May 16 2017, 11:53 AM
clayborg added inline comments.
lib/DebugInfo/DWARF/DWARFVerifier.cpp
59

This should be answered above, let me know if it isn't?

140

I don't follow.

I believe this patch is close enough for an initial checkin. We can iterate on it once it is in? We seem to be down to how to best iterate over ranges in ranges for intersect and for contains. Let me know your thoughts. Or feel free to commandeer this and check it in as you want it...