This is an archive of the discontinued LLVM Phabricator instance.

[llvm-debuginfo-analyzer] 08a - Memory Management
ClosedPublic

Authored by CarlosAlbertoEnciso on Nov 14 2022, 3:53 AM.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2022, 3:53 AM
CarlosAlbertoEnciso requested review of this revision.Nov 14 2022, 3:53 AM
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)Nov 14 2022, 3:53 AM

(some suggestions here might be too noisy to do all in this review - maybe a matter of splitting things up - the discussion about new data structures/changes in ownership might be one of those worth breaking down into smaller steps, not sure)

Perhaps some general discussion about the ownership/lifetime model of everything here would be useful to add to the patch description to get a general lay of the land?

llvm/include/llvm/DebugInfo/LogicalView/Core/LVObject.h
84–85

Looks like these are unused, should they be removed?

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
82

This addition of a new owning data structure (similar to some other comments I've made) makes me a bit uncomfortable - were there existing memory ownership issues in this code that this is addressing, or was the existing ownership hard to model in C++ explicit ownership? Or some other reason this is being introduced?

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
119–132

Don't need the = nullptr for std::unique_ptr, that's the default behavior of the default ctor. (this is one of the reasons I'm a bit reticent to use aliases for unique_ptr names - I'm not sure the name shortening is worth the benefit compared to the obfuscation of these sort of core types with known semantics - might be better to keep the explicit unique_ptr usage here/generally?)

492

Remove this, since it's restating the default behavior of the implicit dtor?

553–563

Are these incidental/unrelated fixes? Maybe pre-commit/post-commit them to keep this patch more focussed? (oh, I guess maybe they're motivated by this patch because without unique_ptr these types were copyable, but probably weren't intended to be copied here anyway - so precommit cleanup might be suitable?)

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
134–135

Maybe simplify this?

150–151

Remove this, since it's the default?
(& could probably remove the default ctor too, separately, since that's the default as well)

173–174
173–174

(I'd probably skip naming a variable here, FWIW - and roll the initialization into the return)

199–201

(or otherwise avoid the need to call .get() on one line only to dereference on the next (eg: if you find the named variable important for readability, change its type to a reference - and do type &x = *y; for (... : x) instead of .get()+*, perhaps))

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSymbol.h
76

Remove this default?

llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
30

If there are going to be aliases for unique_ptr types, maybe worth using a consistent suffix (this one is Obj, others are Ptr) - and maybe SP rather than Ptr, to make it clear it's a smart pointer? But generally I'd avoid the aliases & find that seeing unique_ptr is probably sufficiently more clear to readers so as to be worth the extra characters required.

73–74

remove this default?

83

I /think/ we can write this as return Err; now we're using C++17?

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
155–156

Perhaps?

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.h
286 ↗(On Diff #475097)

Remove this because it's the same as the default?

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
583

push_back is probably fine here, since it's just calling an implicit move ctor, rather than any explicit operation?

608–609

(or remove the named variable entirely, perhaps - and roll the expressions together into Entries->front()->getOpcode())

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
130–131

There's a lot of code like this - Lines here, Children above, Ranges and Scopes below, etc - are they all needed, or could some of them be turned into direct values (eg: LVLines Lines rather than unique_ptr<LVLines> Lines?)?

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
299–307

Perhaps this'd be simpler as:

also - if this function never returns null, it should probably return by reference instead of pointer?

And if this map never contains null elements, perhaps it could contain values, instead of unique_ptrs? (they still have pointer stability in a std::map - so that probably is enough?)

372

What's the ownership model for "Instructions"/why did DiscoveredLines get introduced in this patch? (was there a memory ownership issue in the raw-new-using code that this is addressing? Otherwise I Wouldn't expect to see new containers in this patch)

806–807

Probably turn this into a reference? (could do that sort of thing pre or postpatch I guess)

(some suggestions here might be too noisy to do all in this review - maybe a matter of splitting things up - the discussion about new data structures/changes in ownership might be one of those worth breaking down into smaller steps, not sure)

Perhaps some general discussion about the ownership/lifetime model of everything here would be useful to add to the patch description to get a general lay of the land?

First of all, thanks very much for your reviews.
I have addressed some of them. Tomorrow I will finish the remaining (including a general discussion about the ownership/lifetime) and upload a new patch.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVObject.h
84–85

Good finding. They are unused. Removed

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
119–132

Happy to remove all the aliases for std::unique_ptr and keep the explicit unique_ptr.

std::unique_ptr<LVElements> Children;

492

Changed to ~LVScopeCompileUnit() = default; just for consistency with other LVScopexxx definitions.

May be in a post-commit: remove all default dtors.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
134–135

Nice simplification. Changed to
(*Map)[Key].push_back(Value);

150–151

Removed both default ctor and dtor.

173–174

Removed the extra parenthesis.

173–174

Removed the extra variable.

199–201

Removed extra parenthesis.
Keep named variable for readability.

Changed to

for (typename LVFirstMapType::const_reference FirstEntry : FirstMap) {
  LVSecondMapType &SecondMap = *FirstEntry.second;
  for (typename LVSecondMapType::const_reference SecondEntry : SecondMap)
    Values.push_back(SecondEntry.second);
}
llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
30

Happy to remove all the aliases for std::unique_ptr and keep the explicit unique_ptr.

using LVReaders = std::vector<std::unique_ptr<LVReader>>;

73–74

Removed.

83

You are correct. Changed to return Err;.

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
155–156

Nice simplification.

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.h
286 ↗(On Diff #475097)

Removed.

llvm/lib/DebugInfo/LogicalView/Core/LVLocation.cpp
583

Good point. Changed to push_back.

608–609

Nice simplification. Changed to
if (dwarf::DW_OP_fbreg == Entries->front()->getOpcode())

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
130–131

This is a good question. They are all needed.
The reason to define Lines, Children, Ranges, Scopes, etc in that way is to keep the size of LVScope as small as possible.

struct LVLines {};

class ClassPointer {
  std::unique_ptr<SmallVector<LVLines *>> Lines;
};

class ClassVector {
  SmallVector<LVLines *,2> Lines; 
};

ClassPointer size = 8 bytes
ClassVector size = 32 bytes

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
806–807

Removed the extra parenthesis.
Changing LVLines *InlineeLines to reference can be done as a postpatch.
Adding

// TODO: Convert this into a reference.
LVLines *InlineeLines = InlineeIter->second.get();

(pulling this discussion about new data structures out of line as I think it'll take more words/quoting/etc than are easy to do in inline comments)

There's a lot of code like this - Lines here, Children above, Ranges and Scopes below, etc - are they all needed, or could some of them be turned into direct values (eg: LVLines Lines rather than unique_ptr<LVLines> Lines?)?

This is a good question. They are all needed.
The reason to define Lines, Children, Ranges, Scopes, etc in that way is to keep the size of LVScope as small as possible.

struct LVLines {};

class ClassPointer {
  std::unique_ptr<SmallVector<LVLines *>> Lines;
};

class ClassVector {
 SmallVector<LVLines *,2> Lines; 
};

ClassPointer size = 8 bytes
ClassVector size = 32 bytes

Could you expand on this a bit - what's the current ownership model (before moving to smart pointers)? (which things are owned by what/tracked where - where are the raw pointers stored/what code is responsible for cleaning them up) Maybe take one of these lists as an example.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
553–563

but probably weren't intended to be copied here anyway

You are correct. Created https://reviews.llvm.org/D138092

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSymbol.h
76

Keep it just for consistency with other LVScopexxx definitions.

May be in a post-commit: remove all default dtors.

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
299–307

Nice simplification.
Changed to

LVSectionRanges::iterator IterSection = SectionRanges.find(SectionIndex);
if (IterSection == SectionRanges.end())
  IterSection =
      SectionRanges.emplace(SectionIndex, std::make_unique<LVRange>()).first;
LVRange *Range = IterSection->second.get();
assert(Range && "Range is null.");
return Range;
299–307

also - if this function never returns null, it should probably return by reference instead of pointer?

And if this map never contains null elements, perhaps it could contain values, instead of unique_ptrs? (they still have pointer stability in a std::map - so that probably is enough?)

What is your preference on those changes:
Part of this patch or post-commit?

CarlosAlbertoEnciso added a comment.EditedNov 16 2022, 4:30 AM

The main logical unit is the LVReader, which is the main access point from the user point of view. Its functions are:

  • Load the binary file
  • Parse the debug information
  • Create the logical view which is a tree composed of logical elements (LVScopes, LVSymbols, LVTypes and LVLines, etc.). Depending on the command line options, additional information is added to the those logical elements.
  • Print the logical view.
  • Select logical elements using user criterias.
  • Compare two `logical views'.

This is a very broad relationship between the reader and the logical elements.

class LVReader {
  ...
  LVScopeRoot *Root = nullptr;
  ...
  // Allocated logical elements.
  SmallVector<std::unique_ptr<LVObject>> AllocatedObjects;
  ...
  createScopes() {
    Root = createObject<LVScopeRoot>();
  }
  ...
  doPrint()
  ...
  createObject(...)
}

class LVELFReader <- LVReader {
  ...
}

class LVCodeViewReader <- LVReader {
  ...
}

Root represents the loaded binary file.

The different Readers are created by the Reader Handler based on the binary files format. It records each created reader.

class LVReaderHandler {
  LVReaders TheReaders;
  ...
  createReaders();
  printReaders();
  compareReaders();
  ...
}
class LVScope <- LVObject {
  std::unique_ptr<LVTypes> Types;
  std::unique_ptr<LVSymbols> Symbols;
  std::unique_ptr<LVScopes> Scopes;
  std::unique_ptr<LVLines> Lines;
}

class LVScopeRoot <- LVScope {
   ...
}

A LVScope can represent a compile unit, function, lexical block, etc.

The other logical elements:

class LVLine <- LVObject {
  ...
}

class LVSymbol <- LVObject {
  ...
}

class LVType <- LVObject {
  ...
}
CarlosAlbertoEnciso added a comment.EditedNov 16 2022, 5:22 AM
OwnerWhatTracked
LVReaderHanderLVReaderTheReaders
LVReaderLogical ElementsAllocatedObjects
LVReaderLVLineAssemblerDiscoveredLines

The raw pointers for the logical elements are stored in the LVScope: Types, Symbols, Scopes, Lines, etc.
The AllocatedObjects was introduced to record their smart pointers, to minimize the changes across the tool.

// Creates a logical object and records it in the Reader.
template <typename ObjectType> ObjectType *createObject() {
  std::unique_ptr<LVObject> ObjectSP = std::make_unique<ObjectType>();
  LVObject *Object = ObjectSP.get();
  AllocatedObjects.emplace_back(std::move(ObjectSP));
  return (ObjectType *)Object;
}

createObject is use to create any logical element within the tool or in the unittests.
DiscoveredLines are the lines created to represent the disassembly instructions. Those LVLineAssembler are moved to the Lines container in the LVScope.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
82

or was the existing ownership hard to model in C++ explicit ownership? Or some other reason this is being introduced?

Correct.

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
372

The raw-new-using code stored those 'Instructions' in the 'LVDoubleMap` container. It was not very clear.
Now they are stored at the Reader scope.

Updated patch addressing @dblaikie review.

The AllocatedObjects was introduced to record their smart pointers, to minimize the changes across the tool.

This is probably the core of what I'm concerned about & the reason for more up-front review. If the logical ownership is for each node to own its children, then that's what I'd probably want to see here, rather than introducing a side table ownership "bucket".

It'd be good if this change didn't move ownership around - but mapped the ownership directly from the C code, where it's not too convoluted. Or if there's good reason (other than code churn, like "even if we were continuing to use raw new/delete, we would still want to change the ownership in this way") to change the ownership models at the same time I think more details about the old ownership and new ownership and why the new one is better would be suitable to have in the patch description.

As it stands, trees owning their children seems fine (though possibly slow to cleanup but the currently proposed change in ownership doesn't look like it'd speed that up much, and would increase memory usage I think - if perf is a problem, it might be worth considering one of LLVM's BumpPtrAllocator - if nodes are only ever allocated in one pass, never manipulated/created/destroyed later - maybe that's the good/correct and simpler path (not correct because it's simpler, but independently)?)

llvm/unittests/DebugInfo/LogicalView/WarningInternalTest.cpp
497

Perhaps as follow-up, this could/should be changed to a reference variable, then the next line (checking non-null) would be more obviously unneeded for instance.

(similarly in the several other cases above/below)

@dblaikie: I am sorry for my delay in addressing your feedback/concerns. I just came last week from a long annual leave (from Nov 19th).

CarlosAlbertoEnciso retitled this revision from [llvm-debuginfo-analyzer] 10 - Smart Pointers to [llvm-debuginfo-analyzer] 08a - Memory Management.
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)
CarlosAlbertoEnciso edited the summary of this revision. (Show Details)Jan 19 2023, 4:18 AM

Address @dblaikie concerns on the use of a bucket with smart pointers.

This patch uses a SpecificBumpPtrAllocator to manage the creation of logical elements.

I have explored several options for how to model the pointer ownership, and I think SpecificBumpPtrAllocator is probably the best, my reasoning is below.

Explored options:
Move ownership to scope nodes with SmartPointers (Very difficult).
The containers within the scope, will contain smart pointers. The main complication is that there are quite few functions that are executed from the point where the logical element is created up to the point where the element is added to its scope:

Element is created: createElement() -> SmartPointer
...
Element attributes are updated: func1(), ... -> rawPointer
...
Element is added to scope parent: addElement() --> rawPointer
  • The functions addElement(), func1(), ... can use a raw pointer extracted from the smart pointer.
  • This involves quite a lot of changes specially in the CodeView Reader which uses visitor modules (logical, types and symbols) to traverse the debug information.

Using LLVM's BumpPtrAllocator (Discarded).
Destructors are not called.

Using SpecificBumpPtrAllocator (Small changes).
As the Reader creates different variants of logical elements (see above), it is required to have individual allocators for each variant:

SpecificBumpPtrAllocator<LVScope> Scopes;
SpecificBumpPtrAllocator<LVScopeRoot> ScopesRoot;
...
SpecificBumpPtrAllocator<LVType> Types;
SpecificBumpPtrAllocator<LVTypeDefinition> TypesDefinition;
...
SpecificBumpPtrAllocator<LVLine> Lines;
SpecificBumpPtrAllocator<LVLineDebug> LinesDebug;
...

Each variant will be created by a specific function:

createScope(), createScopeRoot(), ...
createType(), createTypeDefinition(), ...
createLine(), createLineDebug(), ...

Options summary:
a) Keep the bucket with smart pointers

class LVReader {
  // Allocated logical elements.
  SmallVector<std::unique_ptr<LVObject>> AllocatedObjects;
};

b) Use SpecificBumpPtrAllocator

class LVReader {
  ...
  SpecificBumpPtrAllocator<LVScope> Scopes;
  SpecificBumpPtrAllocator<LVScopeRoot> ScopesRoot;
  ...
};

The current patch implements option (b).

This involves quite a lot of changes specially in the CodeView Reader which uses visitor modules (logical, types and symbols) to traverse the debug information.

Whenever I see this mentioned, I get pretty skittish - the solution chosen shouldn't be dependent on how much code churn is involved (because it's the LLVM coding standards we're concerned with before accepting this code into the LLVM project).

But if it's the right solution regardless of cost, I'm more inclined/comfortable with it.

Is this API purely a "build/growth" without ever modifying? (like Clang's AST which uses bump/arena allocation - because you can't mutate the AST after it's built - you can't remove nodes, change nodes, etc)

In which case was the old design excessively chatty about allocation management, given that monolithic nature of the data structure?

Hi @CarlosAlbertoEnciso and @dblaikie - I've had a look at this patch too and have left some inline nits / questions / suggestions.

Is this API purely a "build/growth" without ever modifying? (like Clang's AST which uses bump/arena allocation - because you can't mutate the AST after it's built - you can't remove nodes, change nodes, etc)

This is the impression I get. AFAIK the tree is built and then printed, or compared with another tree, then the work is done and the nodes can be deleted (at once, together). Saying that, there is something interesting going on during the comparison where it looks like nodes are removed from one tree (which will be discarded) to another. Although some nodes are mutated in that case it doesn't involve deleting any nodes. I think the node-mutation works due to the fact that the containers used by nodes to point to their children are heap allocated using the standard allocator (i.e., some content of the nodes are spilled out from the bumpptrallocators into the general heap - this may be a necessary design choice, I'm not familiar enough to be able to say?).

FWIW I think this looks like a generally tidier memory management strategy than was previously used. It seems like a reasonable solution to me - it is now clear that all raw pointers are referencing memory owned by some other object (almost always the reader).

llvm/include/llvm/DebugInfo/LogicalView/Core/LVLocation.h
175

Why is this using a unique_ptr - can or should this be a container of raw pointers, allocating the objects using the new bumpptr allocator?

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
82

I think this comment should be copied or moved to the class LVReader level (and should be a doxygen comment there - using /// rather than //).

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
120–125

Not really related to the core issue being discussed here but do these members need to be heap allocated at all? Is the space saving between using SmallVector<T, 0> and std::unique_ptr<SmallVector<T, N> worth the additional complexity of having to explicably allocate the container when adding the first element? The answer might simply be yes.

In either case, if that change is worthwhile it doesn't need to be made in this patch.

132

Another point for a future patch - as discussed offline it would probably be better to use a chaining iterator over the other containers rather than keep a separate container Children that mirrors their contents.

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

What benefit do we get from std::vector<std::unique_ptr<LVLines>> over std::vector<LVLines>? Are the elements compared by address?

EDIT: see comment later on asking if we actually need this container at all.

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
372

Why do we need DiscoveredLines at all now? Since the reader now owns the line objects when createLineAssembler is called below.

llvm/unittests/DebugInfo/LogicalView/LogicalElementsTest.cpp
113–125

I think a macro that combined Z = createX(); EXPECT_NE(Z, nullptr) would've been reasonable in order to reduce the line count of this change. It's probably not worth going back and adding retrospectively though.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVLocation.h
175

Good observation. Changed to use the bumpptr mechanism.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
82

Moved to the class LVReader level.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVScope.h
120–125

The issue is the space saving:

SmallVector<T, 0> --> size: 16
std::unique_ptr<SmallVector<T, N> --> size: 8

132

Good point. This is something that it should be changed as it would reduce the space usage.
A future patch is the best way to proceed.

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

To answer both questions, this is the logic behind the debug and assembler line processing:

In createScopes for each Compile Unit:

  • call traverseDieAndChildren to parse the debug info
  • Call createLineAndFileRecords that creates LVLineDebugs for each entry in the debug line section
  • call createInstructions that creates a vector of LVLineAssembler for the associated text section. Its unique pointer is stored in DiscoveredLines and never updated. Its raw pointer is stored in ScopeInstructions. The AssemblerMappings is updated to record specific section index, current logical scope and text address.
  • call addSectionRange to collect all debug ranges
  • call processLines to allocate the collected lines (debug and assembler) to their logical scope. It uses the ScopeInstructions to locate specific assembler lines using data from AssemblerMappings. It moves lines (debug and assembler) to their logical scope.

At the end of the scopes creation, DiscoveredLines constains the unique pointers to empty vector.

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
372

See previous comment that explains the role of DiscoveredLines.

llvm/unittests/DebugInfo/LogicalView/LogicalElementsTest.cpp
113–125

I think creating a macro that combines the creation and check is worth it, as it reduces the extra noise introduced by the current patch.

Address the feedback from @Orlando:

  • Use SpecificBumpPtrAllocator for LVOperation.
  • Replace the sequences
Z = createX(); EXPECT_NE(Z, nullptr);
Z = createX(); EXPECT_NE(Z, nullptr); Z->functionX();

with the macro

CREATE(Z, createX);
CREATE(Z, createX, FunctionX);
Orlando added inline comments.Jan 30 2023, 2:34 AM
llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

It moves lines (debug and assembler) to their logical scope.

Where does this happen? I can't see any uses of DiscoveredLines in this patch other than where entries are added. Sorry if this is a silly question - just trying to get my head around this field still.

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

This function creates instructions (LVLineAssembler) for the specific SectionIndex in the .text section. Those
instructions are associated with the logica 'Scope'.

Error LVBinaryReader::createInstructions(LVScope *Scope,
                                         LVSectionIndex SectionIndex,
                                         const LVNameInfo &NameInfo) {
  ...
  std::unique_ptr<LVLines> InstructionsSP = std::make_unique<LVLines>();
  LVLines *Instructions = InstructionsSP.get();
  DiscoveredLines.emplace_back(std::move(InstructionsSP));
  ...
  // Traverse the .text section and create the assembler lines.
  // Adding each line to the 'Instructions' vector.
  Instructions->push_back(Line);
  ...
  // Create the following association:
  // the logical 'Scope' for the function in the .text section
  // identified with 'SectionIndex', has the assembler lines
  // contained in 'Instructions'.
  ScopeInstructions.add(SectionIndex, Scope, Instructions);

  // Note: The last item in 'DiscoveredLines' is the smart
  // pointer to the allocated vector 'InstructionsSP' and its
  // raw pointer is 'Instructions'.
}

During the traversal of the debug information sections, we created the logical lines representing the disassembled instructions from the text section and the logical lines representing the line records from the debug line section. Using the ranges associated with the logical scopes, we will allocate those logical lines to their logical scopes.

void LVBinaryReader::processLines(LVLines *DebugLines,
                                  LVSectionIndex SectionIndex,
                                  LVScope *Function) {
  ...
  // The 'DebugLines' vector contains the debug lines
  // for the current compile unit.
  ...
  // Get the associated instructions for the found 'Scope'.
  LVLines InstructionLines;
  LVLines *Lines = ScopeInstructions.find(SectionIndex, Scope);
  if (Lines)
    InstructionLines = std::move(*Lines);
  // Note: We get the vector of instructions from 'ScopeInstructions'
  // using the 'SectionIndex' and 'Scope' and move its contents
  // to a local working vector 'InstructionLines'.
  // 'ScopeInstructions' was updated in 'createInstructions'.
  ...
  for (LVLine *InstructionLine : InstructionLines) {
     ...
     // Using the 'address' information, insert some
     // of these 'InstructionLine' into the 'DebugLines'.
     // Basically merge some 'Instructions' with the
     // lines representing the .debug_line records.
     DebugLines->push_back(InstructionLine);
     ...
  }
  ...
  // Get 'ranges' information.
  LVRange *ScopesWithRanges = getSectionRanges(SectionIndex);
  ScopesWithRanges->startSearch();
  ...
  for (LVLine *Line : *DebugLines) {
    ...
    // Add line object to scope.
    Scope->addElement(Line);
    ...
  }

This function creates the logical scopes for all the compilation units.

Error LVELFReader::createScopes() {
  ...
  // Traverse compile units.
  for (const std::unique_ptr<DWARFUnit> &CU : CompileUnits) {
    ...
    traverseDieAndChildren(CUDie, Root, SkeletonDie);
    ...
    // Create debug lines for current compile unit and
    // store them in the 'CULines' vector. 
    createLineAndFileRecords(DwarfContext->getLineTableForUnit(CU.get()));
    ...
    // Create instruction lines
    createInstructions()
    ...
    processLines(&CULines, SectionIndex);
    ...
    CULines.clear();
  }
}

The moving of the lines (debug and/or assembler) is done in LVBinaryReader::processLines.

That's all of my concerns addressed now, so this LGTM. Thanks @CarlosAlbertoEnciso for your detailed responses.

I won't hit accept though in case @dblaikie has further comments on this new approach.

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

Thanks a lot for the detailed breakdown. I fundamentally misunderstood what DiscoveredLines represented and had misread its type as vector<unique_ptr<Line*>> (which is incorrect) rather than vector<unique_ptr<vector<Line*>>>. I'm happy with this now.

That's all of my concerns addressed now, so this LGTM. Thanks @CarlosAlbertoEnciso for your detailed responses.

I won't hit accept though in case @dblaikie has further comments on this new approach.

@Orlando Thanks very much for your review and feedback.

dblaikie added inline comments.Feb 2 2023, 10:25 AM
llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

Any chance of avoiding the extra unique_ptr indirection in there, and having a vector<vector<Line*>> instead?

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
103

The nested vector in DiscoveredLines has its address taken by ScopeInstructions so we need the additional indirection of the unique_ptr to ensure the pointer remains valid.

@dblaikie Do you have any additional questions or concerns that I have not answered it? Thanks.

dblaikie added inline comments.Feb 13 2023, 4:54 PM
llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
188

Undefine the macro after its uses - and maybe add an LLVM prefix to make this less likely to collide with other things?

Be nice to avoid the macros if possible, but nothing comes up quickly for me - probably some sort of type map could be used for the allocators and then these creation functions could be a function template instead. But not sure if we have a convenient type map already at hand.

238

This macro's only used once, could it be removed (for now/until it's needed)?

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
151–152

This seems unnecessary - could be sunk into the uses.

154–157
156–157
158–160
llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
692

If Traverse always takes something non-null, it'd be nice if it took its argument by reference - then these call sites could be:

Traverse(*Parent->Types, SortFunction);

(& avoid the need for the explicit .get() call)

Similarly below for SetCompareState

llvm/lib/DebugInfo/LogicalView/Core/LVType.cpp
258–259

idly: Should this be else-if?
And there seems to be some extra {} here than is usually used in LLVM (except where ambiguous, single line blocks should omit the {})

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
370–372
371

Address feedback from @dblaikie.

@dblaikie: Thanks for your valuable feedback.

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
188

Added undef for all the macros.
Added the LV prefix which is the one used across the tool.

Would you be OK if we leave the macros for the time being and find a more elegant solution based on your suggestion (type map and function templates) for a later patch (improvement).

238

I introduced the macro, just for consistency with the other macros.
But, may be I am misunderstanding your point.
Do you want to remove the macro and instead use the createLocation() function?
Something like:

LVOperation *createOperation(LVSmall OpCode, LVUnsigned Operand1, LVUnsigned Operand2)
    return new (AllocatedOperation.Allocate()) LVOperation(OpCode, Operand1, Operand2);
}
llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
151–152

Good point. Removed unnecessary definitions.

154–157

Changed to use auto.

156–157

Removed SecondMap.

158–160

Added LVSecondMapType *.

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
692

Good point. Replaced to use reference for both Traverse and SetCompareState.

llvm/lib/DebugInfo/LogicalView/Core/LVType.cpp
258–259

Good observation. Changed to else-if. Removed extra {}.

llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
370–372

Changed to use auto.

371

Changed to reference.

dblaikie accepted this revision.Feb 15 2023, 11:31 AM

Thanks

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
188

LV doesn't seem like a sufficiently unique prefix - I'd expect all the LLVM macros to start with LLVM - do they not? Hmm, a quick grep seems to indicate yeah, a bunch of LLVM macros don't use a more strict prefix. Pity about that. Ah well.

*shrug*

Sure, separate patch is OK. Probably a standard (or LLVM) map with typeinfo (would need a specialization to expose equality and hasher) as the key might suffice.

238

yeah, since there's only one, that seems OK?

llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
692

Actually now it looks like traverse takes the unique_ptr by reference, rather than the underlying value by reference, which I guess is necessary if they can be null. So, sounds good.

This revision is now accepted and ready to land.Feb 15 2023, 11:31 AM

Update the patch to reflect a change based on @dblaikie feedback.

Replaced the macro LV_CREATE_OPERATION with the explicit function:

// Operations creation.
LVOperation *createOperation(LVSmall OpCode, LVUnsigned Operand1,
                             LVUnsigned Operand2) {
  return new (AllocatedOperation.Allocate())
      LVOperation(OpCode, Operand1, Operand2);
}
llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
188

I will include the following item for a separate patch:

  • Replace the macros used for the memory management with your suggested approach (map with typeinfo).
238

Removed the macro LV_CREATE_OPERATION.
Use the explicit function:

// Operations creation.
LVOperation *createOperation(LVSmall OpCode, LVUnsigned Operand1,
                             LVUnsigned Operand2) {
  return new (AllocatedOperation.Allocate())
      LVOperation(OpCode, Operand1, Operand2);
}
llvm/lib/DebugInfo/LogicalView/Core/LVScope.cpp
692

Thanks.

@Orlando Do you have any additional comments on the updated patch?

Do you have any additional comments on the updated patch?

It looks like all the comments have been addressed or acknowledged, so LGTM.

I've put one question inline but I'm happy for you to address it (if it does need addressing) without another round of review.

llvm/lib/DebugInfo/LogicalView/Core/LVType.cpp
265

Before your recent change this branch was taken if options().getAttributeArgument() is false, but now this branch is taken if options().getAttributeArgument() is true (and both Type->getIsKindType() and Type->getIsKindScope()) are false). Just checking - is the new behaviour correct?

And if so, what was the result of the previous incorrect behaviour?

llvm/lib/DebugInfo/LogicalView/Core/LVType.cpp
265

Very good catch. The behaviour has not being changed.

Incorrec placement for the closing }. The code should be:

if (options().getAttributeArgument()) {
  if (Type->getIsKindType())
    TypesParam->push_back(Type->getTypeAsType());
  else if (Type->getIsKindScope())
    ScopesParam->push_back(Type->getTypeAsScope());
} else
  TypesParam->push_back(Type);
Orlando added inline comments.Feb 16 2023, 1:41 AM
llvm/lib/DebugInfo/LogicalView/Core/LVType.cpp
265

nit-pick: when you make the change, please add a set of braces around the final else-block (to keep it uniform since the if-block does use braces (covered here).

Thanks!

This revision was landed with ongoing or failed builds.Feb 16 2023, 2:32 AM
This revision was automatically updated to reflect the committed changes.

@dblaikie, @Orlando Thanks very much for your valuable feedback.