This is an archive of the discontinued LLVM Phabricator instance.

[llvm-debuginfo-analyzer] 04 - Locations and ranges
ClosedPublic

Authored by CarlosAlbertoEnciso on May 17 2022, 6:44 AM.

Diff Detail

Build Status
Buildable 188371

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMay 17 2022, 6:44 AM
CarlosAlbertoEnciso requested review of this revision.May 17 2022, 6:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 17 2022, 6:44 AM
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)May 17 2022, 7:07 AM
CarlosAlbertoEnciso removed a project: Restricted Project.
CarlosAlbertoEnciso removed subscribers: mgorny, hiraditya.
Herald added a project: Restricted Project. · View Herald TranscriptMay 17 2022, 7:07 AM
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)May 18 2022, 1:33 AM

Good job! Here is a series of comments, please have a look.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVLocation.h
43

Either the class should be marked as final or the destructor should be virtual.

84
129

Should this and a couple of const methods below return non-const pointers?

164–169

Either virtual or override is enough.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVRange.h
26

The class should be marked as final or have a virtual destructor (or protected destructor if you have no intent to access the derived from LVRangeEntry classes by pointers to the basic class).

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
122

This cast of a const pointer to a non-const, so casting away constness may cause undefined behavior if you write/change something in the non-const function member. If you don't (it's not clear now because not all the involved functions are implemented yet), it could make sense to move the actual implementation of the function member into the const version and cast this to const (or use it as is, I believe it should compile too).

Also I saw this pattern a few times in the previous patch: https://reviews.llvm.org/D125778

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

Is there any enum in the dwarf namespace which is equals to zero? If not it could make sense to use a named constant instead of magic number 0.

623
625
628

There is not check that Locations is not empty.

646

There is no check that Locations is not null.

llvm/lib/DebugInfo/LogicalView/Core/LVRange.cpp
28

I'm not sure who will be the owner of the heap-allocated Scope and who will delete it. Maybe a comment is required.

37

This check is not required, if RageEntries is empty, the loop below will just do no iterations.

75

This should create a RangeEntry in place.

78
llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
136–137

These two lines should be swapped.

llvm/lib/DebugInfo/LogicalView/Core/LVSort.cpp
59

I'm not sure these wrapping () are required.

llvm/lib/DebugInfo/LogicalView/Core/LVSymbol.cpp
173

This method looks like it can be marked as const if you have no plan to add a call to calculateCoverage() or to another method that changes the object.

llvm/unittests/DebugInfo/LogicalView/LocationRangesTest.cpp
95

There is an overload of the operator== for two StringRefs, so there is no needs in EXPECT_STREQ, moreover the data() method of StringRef doesn't guarantee that the returned string will be null terminated.

psamolysov added inline comments.May 31 2022, 5:45 AM
llvm/unittests/DebugInfo/LogicalView/LocationRangesTest.cpp
59

A std::bad_alloc will be thrown if the test is out of memory. It makes sense to wrap the line in the try-catch block or use the non-throwing overload of the new operator.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVLocation.h
43

Marked as final.

84

Good simplification. Changed.

129

The getLowerLine and getUpperLine now return a const pointer.

164–169

Removed virtual.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVRange.h
26

Marked as final.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
122

The implementation of printActiveRanges it is not changed at later patches.

The const version is only required when it is called by other const functions:

void LVScopeNamespace::printExtra(raw_ostream &OS, bool Full) const {
  ...
  // Print any active ranges.
  if (Full) {
    printActiveRanges(OS, Full);
    ...
  }
}
llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

@probinson How do you feel in creating something like dwarf::DW_OP_null similar to the existing dwarf::DW_TAG_null.

623

Added the assertion.

625

Added the assertion. Small correction: !Locations->empty()

628

Added the following checks to cover both cases: null ptr and/or empty.

...
assert(Locations && "Locations must not be nullptr");
// Print the symbol coverage.
if (Locations && !Locations->empty() && options().getAttributeCoverage()) {
..
646

Added the following check:

if (Locations && getReader().doPrintLocation(/*Location=*/nullptr))
llvm/lib/DebugInfo/LogicalView/Core/LVRange.cpp
28

That is a good finding.

using LVTuple = std::tuple<LVAddress, LVAddress, LVScope *>;
LVTuple getEntry(LVAddress Low, LVAddress High, LVLevel Level) {
  LVScope *Scope = new LVScope();
  Scope->setLevel(Level);
  return std::make_tuple(Low, High, Scope);
}

That code is leftovers. Removed all together.

37

Removed the check.

75

Replaced.

78

Added assertion.

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
136–137

Done.

llvm/lib/DebugInfo/LogicalView/Core/LVSort.cpp
59

Removed the wrapping ().

llvm/lib/DebugInfo/LogicalView/Core/LVSymbol.cpp
173

Marked the method as const.

llvm/unittests/DebugInfo/LogicalView/LocationRangesTest.cpp
59

Changed to use the non-throwing overload.

llvm/unittests/DebugInfo/LogicalView/LocationRangesTest.cpp
95

Changed to EXPECT_EQ(Element->getName(), Name);

probinson added inline comments.Jul 1 2022, 5:45 AM
llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

DW_TAG_NULL is meaningful in the sense that a 0 tag indicates no more siblings at this nesting level, in the debug info. But a 0 operator should never appear; are you inserting these yourself for some reason?

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

Thanks for the clear answer on DW_TAG_null.

This is an extract from the llvm-dwarfdump output for an ELF test case:

DW_TAG_member
     DW_AT_name	("name")
     DW_AT_type	("int")
     DW_AT_decl_line	(1858)
     DW_AT_data_member_location	(0x00)

Which llvm-dva converts into

1858           {Member} private 'name' -> 'int'
                 {Location}
                   {Entry} offset 0

The 0 operator corresponds to the DW_AT_data_member_location value.

probinson added inline comments.Jul 5 2022, 1:20 PM
llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

What I'm understanding from this is that when llvm-dva reads the DIE for "name" it finds DW_AT_data_member_location, and encodes that into an LVOperation with Opcode = 0. Then LVOperation::getOperandsDWARFInfo() sees the 0 opcode and knows it means this is a simple offset.
If I am understanding that correctly, then this is something llvm-dva is doing for its own benefit, and should not be considered a DWARF expression opcode of any kind. llvm-dva should define its own constant for this, and really should not use a DW_OP_* spelling for the name.

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

Your analysis is correct.

324

@psamolysov, @probinson: based on the previous analysis any suggestions to define a llvm-dva constant to describe the zero as an indication for member-offset?

May be LV_LOCATION_MEMBER_OFFSET

That constant will be used in

// Add a Location Entry.
void LVSymbol::addLocationConstant(dwarf::Attribute Attr, LVUnsigned Constant,
                                   uint64_t LocDescOffset) {
  ...
  // Add records to Location Entry.
  addLocationOperands(/*Opcode=*/LV_LOCATION_MEMBER_OFFSET,
                      /*Operand1=*/Constant, /*Operand2=*/0);
}

std::string LVOperation::getOperandsDWARFInfo() {
  ...
  switch (Opcode) {
  ...
  //-------------------------------------------------------------------------
  // Member location.
  //-------------------------------------------------------------------------
  case LV_LOCATION_MEMBER_OFFSET:
  ...
  }
}
probinson added inline comments.Jul 8 2022, 6:01 AM
llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

I think LLVM style would prefer a CamelCase name, such as LVLocationMemberOffset.

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
324

Thanks for your suggestion. Added the following definition in LVLocation.h

// The DW_AT_data_member_location attribute is a simple member offset.
const LVSmall LVLocationMemberOffset = 0;
  • Addressed the reviewer’s feedback.
  • Tool renamed as: llvm-debuginfo-analyzer.
CarlosAlbertoEnciso retitled this revision from [llvm-dva] 04 - Locations and ranges to [llvm-debuginfo-analyzer] 04 - Locations and ranges.Jul 25 2022, 2:17 AM
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)
probinson added inline comments.Sep 19 2022, 10:41 AM
llvm/include/llvm/DebugInfo/LogicalView/Core/LVRange.h
49

Can they overlap?

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
122

I think I said something about this in a previous patch, that a non-const print function is counter-intuitive. If it can be designed not to need a non-const version, that would be better.

224

Why have both calculateCoverage and CalculateCoverageFactor ? They appear to do the same thing.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
174

updateOperationCode implies modifying something, which this function does not do. getCVOperationCode maybe?

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
624

This is okay, but when a pointer parameter must not be null, it's better practice to make it a reference parameter instead. Then tests for non-null go away.

627

The if above already checked !Locations->empty() so this assertion is redundant.

llvm/lib/DebugInfo/LogicalView/Core/LVRange.cpp
27

The same heading is here and below where the entire resulting tree is printed. Maybe these should be different? "Range Tree entries" here, and leave "Ranges Tree" for when the whole tree is printed.

119

This comment does not describe what the method does, from the caller's perspective. Also maybe should be named hasEntry instead of findEntry

132

I see the caller is stable_sort so is there an ordering guarantee for multiple ranges with the same lower bound? If so, where is that documented and enforced?

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
712

"get" usually implies a method that is returning data of interest, but these are all doing something else.

762

Looks like if Ranges is null, this ends up returning null. So, cheaper to do

if (!Ranges)
  return nullptr;

and then remove the if (Ranges) from inside the while loop.

942

It looks odd to call something named getRanges using local variables that are immediately thrown away. This is essentially the same comment about the names of the get methods I made previously.

llvm/lib/DebugInfo/LogicalView/Core/LVSymbol.cpp
190
203

Same comment as the other place that made this same calculation, in a previous patch.

llvm/unittests/DebugInfo/LogicalView/LocationRangesTest.cpp
59

Use ASSERT_NE so the test will abort if the allocation fails.

Update patch due to changes in previous patches.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVRange.h
49

As they are recorded using the IntervalTree, they can overlap.
Added to the comment:

// Class to represent a list of range addresses associated with a
// scope; the addresses are stored in ascending order and can overlap.
llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
224

Removed CalculateCoverageFactor. Left over from some renaming.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
174

Changed to getCodeViewOperationCode.

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
624

I see your point. Unfortunately Locations is a pointer and can be null.

The caller is already checking for a null pointer.

void LVSymbol::printExtra(raw_ostream &OS, bool Full) const {
  ...
    if (Locations)
      // Print location information.
      LVLocation::print(Locations, OS, Full);
  ...
}

Change to

void LVSymbol::printExtra(raw_ostream &OS, bool Full) const {
  ...
  // Print location information.
  LVLocation::print(Locations, OS, Full);
  ...
}
// Print location (formatted version).
void LVLocation::print(LVLocations *Locations, raw_ostream &OS, bool Full) {
  if (!Locations || Locations->empty())
    return;

  // Print the symbol coverage.
  if (options().getAttributeCoverage()) {
    ...
  }

  // Print the symbol location, including the missing entries.
  if (getReader().doPrintLocation(/*Location=*/nullptr))
    ...
}

The assertion is not needed.

627

Removed assertion.

llvm/lib/DebugInfo/LogicalView/Core/LVRange.cpp
27

Good point. Changed to your suggested heading.

llvm/lib/DebugInfo/LogicalView/Core/LVSymbol.cpp
190

Changed.

203

It solved the issue of differences in the percentage when analyzing the same binary on Windows and Linux. For some cases:

Expect value:
Totals by lexical level:
[001]:        106 (100.00%)
[002]:         71 ( 66.99%)    <---
[003]:         20 ( 18.87%)
Actual value
Totals by lexical level:
[001]:        106 (100.00%)
[002]:         71 ( 66.98%)    <---
[003]:         20 ( 18.87%)
llvm/unittests/DebugInfo/LogicalView/LocationRangesTest.cpp
59

ASSERT_NE can be used only inside functions that don't return a value.

See https://google.github.io/googletest/advanced.html#assertion-placement

The one constraint is that assertions that generate a fatal failure (FAIL* and ASSERT_*) can only be used in void-returning functions.

llvm/lib/DebugInfo/LogicalView/Core/LVRange.cpp
119

Comment changed to:
// True if the range addresses contain the pair [LowerAddress, UpperAddress].

Method renamed as hasEntry.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
122

Yes. You mentioned in the previous patch:

Looks like the method shouldn't be declared const. Ordinarily you wouldn't think a print() method would be non-const, but obviously this one is.

Removed the non-const version. Changed printActiveRanges to be const as it does not modify any class members.

llvm/lib/DebugInfo/LogicalView/Core/LVRange.cpp
132

Very good point. Added code to handle that case:

auto CompareRangeEntry = [](const LVRangeEntry &lhs,
                            const LVRangeEntry &rhs) -> bool {
  if (lhs.lower() < rhs.lower())
    return true;

  // If the lower address is the same, use the upper address value in
  // order to put first the smallest interval.
  if (lhs.lower() == rhs.lower())
    return lhs.upper() < rhs.upper();

  return false;
};
llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
712

It seems that is the same issue as with the getRanges method.

Added explanatory comments in getLocations to describe the functions performed by this method.

// Traverse the symbol location ranges and for each range:
// - Apply the 'ValidLocation' validation criteria.
// - Add any failed range to the 'LocationList'.
// - Calculate location coverage.
void LVScope::getLocations(LVLocations &LocationList,
                           LVValidLocation ValidLocation, bool RecordInvalid) {
  ...
}
762

I think you found an issue with the code.
The Ranges is not updated when we move scopes. We must use the Ranges for the new scope and check if the given Address is contained in the specific set.

LVScope *LVScope::outermostParent(LVAddress Address) {
  LVScope *Parent = this;
  while (Parent) {
    const LVLocations *ParentRanges = Parent->getRanges();
    if (ParentRanges)
      for (const LVLocation *Location : *ParentRanges)
        if (Location->getLowerAddress() <= Address)
          return Parent;
    Parent = Parent->getParentScope();
  }
  return Parent;
}
942

You are correct as the local Locations is immediately thrown away.

What the getRanges method does is:

  • traverse the scope ranges
  • apply a validation criteria.
  • add a failed range to the Locations
  • calculate coverage.

It was hard to split getRanges between this patch and the 06-warning-internals patch and avoid big changes.

In summary:
For this patch the only needed functionality in getRanges is to calculate the location coverage. As RecordInvalid = false the getRanges method will not update the local Locations variable.

Added explanatory comments in getRanges and getLocations to describe the functions performed by those methods.

// Traverse the scope ranges and for each range:
// - Apply the 'ValidLocation' validation criteria.
// - Add any failed range to the 'LocationList'.
// - Calculate location coverage.
void LVScope::getRanges(LVLocations &LocationList,
                        LVValidLocation ValidLocation, bool RecordInvalid) {
  ...
}
llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
762

The internal regression suite detected 2 cases where the location coverage changed from:
0.00 -> 19.05 and 0.00 -> 23.53.

Please update to the most recent patch; some changes are not in the review yet.

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
762

Please make sure the unittest has coverage for this.

Address points raised by @probinson.

probinson accepted this revision.Sep 30 2022, 7:26 AM

LGTM conditional on a test for LVScope::getRanges change. Detecting the difference in an internal test suite is good, but having something upstream is better.

This revision is now accepted and ready to land.Sep 30 2022, 7:26 AM

LGTM conditional on a test for LVScope::getRanges change. Detecting the difference in an internal test suite is good, but having something upstream is better.

Added test case.

Updated patch:

  • Removes some redundant public: modifiers.
  • Add pending unittest requested by @probinson.
This revision was landed with ongoing or failed builds.Oct 20 2022, 12:32 AM
This revision was automatically updated to reflect the committed changes.

@psamolysov , @probinson Thanks very much for all your valuable reviews.

The following patch fixed some buildbot sanitizer issues: https://reviews.llvm.org/D136333

CarlosAlbertoEnciso edited the summary of this revision. (Show Details)Nov 14 2022, 4:01 AM