This is an archive of the discontinued LLVM Phabricator instance.

Fix DWARFContext::getCompileUnitForOffset().
ClosedPublic

Authored by friss on Sep 9 2014, 5:05 AM.

Details

Summary

The header comment for this funtction states that it is usable to find the
CompileUnit containing a given offset, but right now it only works for exact
start offsets of the Units.

To generalize it to any offset, change the comparator function to compare
against the Unit->getNextUnitOffset() and change the search function from
lower_bound to upper_bound. This way, the returned unit is the first one
where Unit->getNextUnitOffset() is strictly greater than the searched offset
which is exactly what we want.

Further patches will use this generalized function to lookup cross-unit
references, and test coverage for this fix will then come naturally through
the tests for that new functionality.

Diff Detail

Repository
rL LLVM

Event Timeline

friss updated this revision to Diff 13452.Sep 9 2014, 5:05 AM
friss retitled this revision from to Fix DWARFContext::getCompileUnitForOffset()..
friss updated this object.
friss added reviewers: dblaikie, samsonov.
friss added a subscriber: Unknown Object (MLST).
samsonov added inline comments.Sep 10 2014, 5:50 PM
lib/DebugInfo/DWARFContext.cpp
415 ↗(On Diff #13452)

You still need to check that Offset you're looking for is inside the CU.

friss updated this revision to Diff 13695.Sep 15 2014, 2:23 AM

Rebased the fix on trunk where the search code is now located in the DWARFUnitSection class.

samsonov added inline comments.Sep 15 2014, 10:33 AM
lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

You still need to check that Offset you're looking for is inside the CU.

friss added inline comments.Sep 15 2014, 11:05 AM
lib/DebugInfo/DWARFContext.cpp
415 ↗(On Diff #13452)

I don't think so as units in a section are necessarily contiguous. The range of acceptable offsets goes from 0 inclusive to lastUnit->getNextUnitOffset() exclusive. And upper_bound will rightly return CUs.end() if we pass an offset >= the last admissible one.

lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

I might be wrong, but as I answered last time you told me that, I think this is not true. The admissible offsets go from 0 to lastUnit->getNextUnitOffset() exclusive. There is no hole in the section.

samsonov accepted this revision.Sep 15 2014, 1:08 PM
samsonov edited edge metadata.

LGTM

lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

Sorry, I've received both answers at the same time (probably, your first comment was saved as draft but not submitted).

You are right - there are no holes. Please add a comment about it and feel free to submit this.

This revision is now accepted and ready to land.Sep 15 2014, 1:08 PM
friss added inline comments.Sep 15 2014, 1:26 PM
lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

Thanks! And sorry, I should have checked that the old review reply got sent out.
David are you also OK with this? I gathered that maybe you wanted to look more into the implications of this.

dblaikie accepted this revision.Sep 15 2014, 1:48 PM
dblaikie edited edge metadata.
dblaikie added inline comments.
lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

Thanks for looping back around/checking.

I'm still unclear on a few things here:

Do we need all 3 comparison operations? (if we only need one, I won't have to think so hard about the others - if we only need one, which I think we do (specifically the "(unique_ptr<Unit>, unsigned offset)")) - if we don't, please just delete the two unneeded ones in a preliminary cleanup patch to simplify this one.

getNextUnitOffset < Offset seems... OK, thinking about it I see that it makes sense. I'm not sure if a comment would be helpful to explain why (I can't think of a simple comment that really helps).

Looks fine to me with the optional preliminary cleanup mentioned above.

(we should be able to use code coverage to see that the other two comparison functions are never executed... someone maintains a code coverage website for LLVM but I forget the URL)

dblaikie added inline comments.Sep 15 2014, 1:50 PM
lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

(further comment - if those other two comparisons are required, I think they may be incorrect, specifically I think the RHS should be getOffset in (1) and (3), not "getNextUnitOffset") - I can explain more if it turns out these are going to remain in the code, if they're being deleted then there's no need to discuss them)

friss added inline comments.Sep 15 2014, 2:12 PM
lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

I don't think we need the 3, but it depends on the point of view, and I can't seem to find a definitive reference on the subject.

The documentation I found states that the functor needs to be convertible to a conversion between the value and the collection contents (the direction of the comparison varies between lower_/upper_bound). I tested with libcxx by only providing a simple function, and it works.
I also read that the parameter should adhere to the Compare concept. This means among other things that comp(a,a) must return false. With a simple function comparing an offset and a Unit, comp(a,a) is simply illegal.

I'm quite sure that no implementation would use anything else that a simple comparison function, but this doesn't mean it is right.

dblaikie added inline comments.Sep 15 2014, 2:33 PM
lib/DebugInfo/DWARFUnit.h
66 ↗(On Diff #13695)

Since we're already relying on heterogenous comparisons here, I think it's safe to remove the other two.

As for language lawyering - it looks like the changes proposed here: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#270 were applied to C++11, where the final wording says:

"The elements e of [first,last) shall be partitioned with respect to the expression ... comp(e, value)."

friss updated this revision to Diff 13748.Sep 16 2014, 6:27 AM
friss edited edge metadata.

Remove unneeded methods in the comparison functor.

dblaikie added inline comments.Sep 16 2014, 10:38 AM
lib/DebugInfo/DWARFUnit.h
50 ↗(On Diff #13748)

I'm rather surprised if this is the one function that's required. According to the spec, it looks like the specified value is passed as the RHS, not the LHS...

friss added inline comments.Sep 16 2014, 1:32 PM
lib/DebugInfo/DWARFUnit.h
50 ↗(On Diff #13748)

Note I'm using upper_bound bellow. The link you provided says:
Change 25.3.3.2 [lib.upper.bound] paragraph 1 from:
-1- Requires: Type T is LessThanComparable (lib.lessthancomparable).
to:
-1- Requires: The elements e of [first, last) are partitioned with respect to the expression !(value < e) or !comp(value, e)

So value is on the LHS and the element at the RHS.

dblaikie added inline comments.Sep 16 2014, 2:09 PM
lib/DebugInfo/DWARFUnit.h
50 ↗(On Diff #13748)

Oh, right - thanks for the explanation.

In that case, could we write this comparison as "LHS < RHS->getOffset()". That would be the real less-than comparison of the offset.

But then I suppose you'd have to switch around to lower_bound? Would there be a problem with that?

Sorry I'm being rather confused by all this...

It's just given the definition of upper_bound, it's sort of weird to give a less than comparison that's subtly different (doesn't actually compare !< for the same unit) and then access the element returned - when the element should be one /past/ the equal elements you're looking for... except for the subtle comparison which ends up doing what you want.

So I suppose what I'm trying to suggest is:

operator()(unique_ptr LHS, unsigned RHS)
  return LHS->getNextUnitOffset() <= RHS; // this is actually a "less than" comparison, because "next unit offset" is past the end of this unit, this could alternatively be written as "getNextUnitOffset() - 1 < RHS"
}
...
auto I = lower_bound... 
...
friss added inline comments.Sep 16 2014, 11:27 PM
lib/DebugInfo/DWARFUnit.h
50 ↗(On Diff #13748)

For me 2 solutions work:

  • upper_bound with Offset < RHS->getNextUnitOffset()
  • lower_bound with LHS->getNextUnitOffset() <= Offset

upper_bound returns the first element comparing greater than Offset, thus the first Unit where the getNextUnitOffset is strictly greater than Offset. I find this definition quite logical.

lower_bound returns the first element 'not less than', i.e. the first element where the comparator returns false. Which is strictly equivalent, but IMHO not less than applied to the non-strict comparison requires a bit more thinking.

But I guess which one seems more logical depends on the mental model you have for both functions. I managed to convince myself that both are equivalent (I hope all this back and forth didn't confuse my logic :-) ), thus I really don't mind the version that gets committed.

samsonov added inline comments.Sep 17 2014, 3:12 PM
lib/DebugInfo/DWARFUnit.h
50 ↗(On Diff #13748)

FWIW I'm perfectly fine with the current implementation.

friss closed this revision.Sep 18 2014, 2:48 AM
friss updated this revision to Diff 13823.

Closed by commit rL218040 (authored by @friss).