This is an archive of the discontinued LLVM Phabricator instance.

Differential D102634

Calculate indexes of last child of each DWARF entry once during tryExtractDIEsIfNeeded.
Needs Review · Public

Authored by simon.giesecke on May 17 2021, 9:11 AM.

Details

Reviewers

clayborg
dblaikie
aprantl
JDevlieghere

Summary

This ensures that the last child indexes are calculated in linear time and
can later be queried in constant time by getLastChild.

The baseline situation was that individual calls to getLastChild were linear in the
size of DieArray. Calling getLastChild once for every DWARFDebugInfoEntry was
amortized quadratic in the size of DieArray.
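A minimal sketch of the single-pass idea described above, not the code from this patch. It assumes a hypothetical `FlatDie` struct standing in for DWARFDebugInfoEntry, a hypothetical function name `computeLastChildren`, and a well-formed array in which every child list is closed by a NULL entry whose depth is one more than its parent's. An index of 0 means "no last child recorded", which is unambiguous because index 0 is the unit DIE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for DWARFDebugInfoEntry.
struct FlatDie {
  uint32_t Depth;   // 0 for the unit DIE.
  bool HasChildren; // The abbreviation says a child list follows.
  bool IsNull;      // This entry is a DW_TAG_null terminator.
};

// For each entry, returns the index of the NULL entry that closes its child
// list, or 0 if none was found. Each index is pushed and popped at most once,
// so the whole pass is linear in the number of entries.
std::vector<size_t> computeLastChildren(const std::vector<FlatDie> &Dies) {
  std::vector<size_t> LastChildren(Dies.size(), 0);
  std::vector<size_t> OpenParents; // Parents whose child list is still open.
  for (size_t I = 0; I < Dies.size(); ++I) {
    const FlatDie &E = Dies[I];
    if (E.IsNull) {
      // Discard parents that were never terminated (possible with corrupted
      // input) until the innermost candidate is at most one level up.
      while (!OpenParents.empty() &&
             Dies[OpenParents.back()].Depth + 1 > E.Depth)
        OpenParents.pop_back();
      if (!OpenParents.empty() &&
          Dies[OpenParents.back()].Depth + 1 == E.Depth) {
        size_t Parent = OpenParents.back();
        assert(LastChildren[Parent] == 0 && "each parent is closed at most once");
        LastChildren[Parent] = I;
        OpenParents.pop_back();
      }
    } else if (E.HasChildren) {
      OpenParents.push_back(I);
    }
  }
  return LastChildren;
}
```

With such a table, a lookup for a given entry is a single array access, which is the constant-time query the summary refers to.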

Running "llvm-gsymutil --convert llvm-gsymutil --quiet" using a RelWithDebInfo build
of llvm-gsymutil (after recent other optimizations) and gathering a profile using
perf showed that llvm::DWARFUnit::getLastChild is the no. 1 function, accounting
for 9.9% of the CPU time, and llvm::DWARFUnit::getSibling is the no. 4 function,
accounting for 4.76% of the CPU time.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

simon.giesecke created this revision. May 17 2021, 9:11 AM

Herald added subscribers: arphaman, hiraditya. May 17 2021, 9:12 AM

simon.giesecke requested review of this revision. May 17 2021, 9:12 AM

Herald added a project: Restricted Project. May 17 2021, 9:12 AM

Herald added a subscriber: llvm-commits.

One thing I don't know here is whether all relevant uses of DWARFUnit will end up iterating the children. If that's not the case, then this should be done only optionally. We can't easily do it on demand on the first call to getLastChild, though, because then callers such as GsymCreator would need to synchronize accesses from multiple threads. But maybe this could be passed as an option to tryExtractDIEsIfNeeded?

If this is worthwhile, a similar thing could be done for the siblings.

simon.giesecke edited the summary of this revision. May 17 2021, 9:14 AM

Harbormaster completed remote builds in B104846: Diff 345907. May 17 2021, 9:58 AM

You should probably get Phabricator working: https://llvm.org/docs/Phabricator.html

This will ensure your patches have context. If you are submitting patches manually, you need to specify more context lines:

git diff -U999999

The DWARFDebugInfoEntry in LLDB contains more data. See inlined comment on improving performance for the LLVM DWARF reader. But if we are going to fix perf issues with DWARFDie navigation, we should improve the DWARFDebugInfoEntry class to contain the information needed to navigate better. FYI: be very careful when trying to port any code from the LLDB DWARF parser as it actually removes the NULL terminator DIEs from its DWARFUnit vector of DWARFDebugInfoEntry objects as this saves a ton of memory and they aren't needed if you create your DWARFDebugInfoEntry with enough info.

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
833–842	This is a side effect of how the DWARFDebugInfoEntry class is created. LLDB has a more efficient way of doing things, as each DWARFDebugInfoEntry knows its sibling index. Since all of the DIEs are contained in a std::vector inside of the DWARFUnit, we can just store the parent index and the sibling index. See the file lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h:

uint32_t m_parent_idx; // How many to subtract from "this" to get the parent.
uint32_t m_sibling_idx; // How many to add to "this" to get the sibling.

The current DWARFDebugInfoEntry just contains an offset, a depth, and an abbrev pointer. This info doesn't make it easy to navigate to a DWARFDie object's sibling or parent. If the LLVM DWARF parser adopts this parent index and sibling index, then navigation can happen much quicker and this function can simply get the sibling index and subtract 1, as that will always be the NULL tag that terminates the previous DIE child chain.

This revision now requires changes to proceed. May 17 2021, 7:44 PM

In D102634#2764987, @clayborg wrote:

You should probably get Phabricator working: https://llvm.org/docs/Phabricator.html

This will ensure your patches have context. If you are submitting patches manually, you need to specify more context lines:

git diff -U999999

I have set up arc now.

The DWARFDebugInfoEntry in LLDB contains more data. See inlined comment on improving performance for the LLVM DWARF reader. But if we are going to fix perf issues with DWARFDie navigation, we should improve the DWARFDebugInfoEntry class to contain the information needed to navigate better.

Ok, I added this separately first, because I wasn't sure if it should be optional, and not take up memory for use cases that might not need it. But from your comments I now understand that the extra memory usage should not be an issue.

FYI: be very careful when trying to port any code from the LLDB DWARF parser as it actually removes the NULL terminator DIEs from its DWARFUnit vector of DWARFDebugInfoEntry objects as this saves a ton of memory and they aren't needed if you create your DWARFDebugInfoEntry with enough info.

Should we remove the NULL terminator DIEs here as well, eventually? But I guess even if that should be done, this requires a lot of further adaptations. However, we would probably save more memory than we're adding through the extra sibling and parent index entries.

Update using arc

Harbormaster completed remote builds in B104977: Diff 346103. May 18 2021, 4:11 AM

What's your use case such that this performance concern has come up for you?

avl added a subscriber: avl. May 18 2021, 11:42 AM

avl added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
833–842	> uint32_t m_parent_idx; // How many to subtract from "this" to get the parent.
> uint32_t m_sibling_idx; // How many to add to "this" to get the sibling.
>
> The current DWARFDebugInfoEntry just contains an offset, a depth, and an abbrev pointer. This info doesn't make it easy to navigate to a DWARFDie object's sibling or parent. If the LLVM DWARF parser adopts this parent index and sibling index, then navigation can happen much quicker and this function can simply get the sibling index and subtract 1, as that will always be the NULL tag that terminates the previous DIE child chain.

If the LLVM DWARF parser adopts the above parent index and sibling index solution, then it would also be useful for dsymutil. Currently it creates links to parents in a separate pass.

In D102634#2766582, @dblaikie wrote:

What's your use case such that this performance concern has come up for you?

I think anyone who goes to any DIE that contains a bunch of children and uses the DWARFDie::iterator (like for a CU Die) and iterates over its children is probably the worst case.

So any users of DWARFDie::end() like DWARFDie::children(). This is used in many places like in llvm/lib/CodeGen/AsmPrinter/AsmPrinterDwarf.cpp, llvm/lib/CodeGen/AsmPrinter/DIEHash.cpp, llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp, llvm/lib/DWARFLinker/DWARFLinker.cpp and llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp. There may be more and I just searched for "Die.children()".

Any of the DWARFUnit::getSibling(...), DWARFUnit::getParent(...), DWARFUnit::getLastChild(...) would be greatly sped up if we use m_parent_idx and m_sibling_idx in DWARFDebugInfoEntry.
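To illustrate why relative indexes make these lookups cheap, here is a minimal sketch under stated assumptions. `IndexedEntry`, `NoIndex`, `parentOf`, `siblingOf` and `lastChildOf` are hypothetical names, not the LLVM or LLDB API. The sketch assumes the NULL terminator entries stay in the array, that a stored value of 0 means "none", and that the unit DIE at index 0 is the only entry with children but no sibling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified entry; not the real LLVM or LLDB class.
struct IndexedEntry {
  uint32_t ParentIdx;  // Subtract from own index to reach the parent; 0 = none.
  uint32_t SiblingIdx; // Add to own index to reach the next sibling; 0 = none.
  bool HasChildren;
};

// Sentinel returned when the requested relative does not exist.
const size_t NoIndex = static_cast<size_t>(-1);

// Constant time: one subtraction.
size_t parentOf(const std::vector<IndexedEntry> &Dies, size_t I) {
  assert(I < Dies.size());
  return Dies[I].ParentIdx ? I - Dies[I].ParentIdx : NoIndex;
}

// Constant time: one addition.
size_t siblingOf(const std::vector<IndexedEntry> &Dies, size_t I) {
  assert(I < Dies.size());
  return Dies[I].SiblingIdx ? I + Dies[I].SiblingIdx : NoIndex;
}

// Constant time: the entry just before the sibling is the NULL terminator
// that closes this entry's child list.
size_t lastChildOf(const std::vector<IndexedEntry> &Dies, size_t I) {
  assert(I < Dies.size());
  if (!Dies[I].HasChildren)
    return NoIndex;
  if (Dies[I].SiblingIdx)
    return I + Dies[I].SiblingIdx - 1;
  // Assumption: only the unit DIE (index 0) has children but no sibling, and
  // its child list is closed by the final entry of the array.
  return I == 0 ? Dies.size() - 1 : NoIndex;
}
```

None of these functions scans the array, so iterating all children of a DIE costs time proportional to the number of children rather than to the size of the whole unit.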

In D102634#2765564, @simon.giesecke wrote:

In D102634#2764987, @clayborg wrote:

The DWARFDebugInfoEntry in LLDB contains more data. See inlined comment on improving performance for the LLVM DWARF reader. But if we are going to fix perf issues with DWARFDie navigation, we should improve the DWARFDebugInfoEntry class to contain the information needed to navigate better.

Ok, I added this separately first, because I wasn't sure if it should be optional, and not take up memory for use cases that might not need it. But from your comments I now understand that the extra memory usage should not be an issue.

The LLVM DWARFDebugInfoEntry contains:

class DWARFDebugInfoEntry {
  /// Offset within the .debug_info of the start of this entry.
  uint64_t Offset = 0;

  /// The integer depth of this DIE within the compile unit DIEs where the
  /// compile/type unit DIE has a depth of zero.
  uint32_t Depth = 0;

  const DWARFAbbreviationDeclaration *AbbrevDecl = nullptr;
};

This will end up being 24 bytes, with 4 bytes of padding after the Depth.

If we modified the LLDB DWARFDebugInfoEntry for LLVM it could contain:

class DWARFDebugInfoEntry {
  uint64_t Offset; // Offset within the .debug_info/.debug_types
  uint32_t ParentIdx; // How many to subtract from "this" to get the parent. If zero this die has no parent
  uint32_t SiblingIdx; // How many to add to "this" to get the sibling.
  const DWARFAbbreviationDeclaration *AbbrevDecl = nullptr;
};

This would be the same 24 byte size as the current version and it would improve many of the DWARFUnit::getXXX() calls that get siblings, parents, etc.

So this wouldn't take up any more memory.
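The size claim can be checked at compile time. The sketch below assumes a typical 64-bit target (8-byte pointers, 8-byte alignment for uint64_t). `CurrentLayout`, `ProposedLayout` and `AbbrevDecl` are hypothetical stand-ins that only mirror the field types of the two class definitions quoted above. The checks are written so that they are skipped on targets with other pointer sizes, where the two layouts need not match.

```cpp
#include <cstdint>

// Opaque, hypothetical stand-in for DWARFAbbreviationDeclaration.
struct AbbrevDecl;

// Mirrors the field types of the existing entry: 8 + 4 (+ 4 padding) + 8.
struct CurrentLayout {
  uint64_t Offset = 0;
  uint32_t Depth = 0;
  const AbbrevDecl *Abbrev = nullptr;
};

// Mirrors the proposed entry: the second 32-bit field fills the padding.
struct ProposedLayout {
  uint64_t Offset = 0;
  uint32_t ParentIdx = 0;
  uint32_t SiblingIdx = 0;
  const AbbrevDecl *Abbrev = nullptr;
};

// Only meaningful where pointers are 8 bytes wide.
static_assert(sizeof(void *) != 8 || sizeof(CurrentLayout) == 24,
              "existing layout is expected to occupy 24 bytes on 64-bit targets");
static_assert(sizeof(void *) != 8 ||
                  sizeof(ProposedLayout) == sizeof(CurrentLayout),
              "proposed layout is expected to reuse the padding");
```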

FYI: be very careful when trying to port any code from the LLDB DWARF parser as it actually removes the NULL terminator DIEs from its DWARFUnit vector of DWARFDebugInfoEntry objects as this saves a ton of memory and they aren't needed if you create your DWARFDebugInfoEntry with enough info.

Should we remove the NULL terminator DIEs here as well, eventually? But I guess even if that should be done, this requires a lot of further adaptations. However, we would probably save more memory than we're adding through the extra sibling and parent index entries.

Again, there will be no memory differences between the old and new due to the padding that exists in the older DWARFDebugInfoEntry structs. I would avoid removing the NULL tags for now because DWARF dumping code relies on these being in there for DWARF printing. This is something we could do later though.

In D102634#2766582, @dblaikie wrote:

What's your use case such that this performance concern has come up for you?

It's a hotspot when running llvm-gsymutil --convert. I can provide some numbers if you want.

In D102634#2767781, @simon.giesecke wrote:

In D102634#2766582, @dblaikie wrote:

What's your use case such that this performance concern has come up for you?

It's a hotspot when running llvm-gsymutil --convert. I can provide some numbers if you want.

If you happen to have them, might be nice to have in the patch description/commit message.

(@clayborg's suggestion of replacing "Depth" with two smaller fields seems roughly plausible to me/reasonable thing to try, for what it's worth)

Updated patch description with numbers from perf profile.

Harbormaster completed remote builds in B105433: Diff 346746. May 20 2021, 9:24 AM

avl added inline comments. Sep 6 2021, 5:44 AM

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
833–842	> This is a side effect of how the DWARFDebugInfoEntry class is created. LLDB has a more efficient way of doing things, as each DWARFDebugInfoEntry knows its sibling index. Since all of the DIEs are contained in a std::vector inside of the DWARFUnit, we can just store the parent index and the sibling index. See the file lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h:
>
> uint32_t m_parent_idx; // How many to subtract from "this" to get the parent.
> uint32_t m_sibling_idx; // How many to add to "this" to get the sibling.
>
> The current DWARFDebugInfoEntry just contains an offset, a depth, and an abbrev pointer. This info doesn't make it easy to navigate to a DWARFDie object's sibling or parent. If the LLVM DWARF parser adopts this parent index and sibling index, then navigation can happen much quicker and this function can simply get the sibling index and subtract 1, as that will always be the NULL tag that terminates the previous DIE child chain.

Since it was mentioned in this review - Is somebody going to implement that idea (replacing "Depth" with two smaller fields)?

simon.giesecke added inline comments. Sep 6 2021, 6:12 AM

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
833–842	> Since it was mentioned in this review - Is somebody going to implement that idea (replacing "Depth" with two smaller fields)?

I would consider that when working on this again. However, I am not sure when I will be able to get back to this. If someone else wants to pick it up, that would be great.

avl mentioned this in D96035: [dsymutil][DWARFlinker] implement separate multi-thread processing for compile units. Sep 6 2021, 6:52 AM

I at least added a comment to DWARFDebugInfoEntry to reflect the possible directions (such as adding parent/sibling index) and reference this review here: 821954f97c6b978cca72cb412e98d35caee4cac3 - so if/whenever someone feels like poking at this they might have a good chance of coming across some of this history/context.

avl added inline comments. Sep 16 2021, 1:51 PM

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
833–842	> > Since it was mentioned in this review - Is somebody going to implement that idea (replacing "Depth" with two smaller fields)?
>
> I would consider that when working on this again. However, I am not sure when I will be able to get back to this. If someone else wants to pick it up, that would be great.

I am interested in picking this up. Will do this. Thanks.

avl mentioned this in D110363: [DWARF][NFC] add ParentIdx and SiblingIdx to DWARFDebugInfoEntry for faster navigation. Sep 23 2021, 2:00 PM

avl mentioned this in rG0b8c50812b59: [DWARF][NFC] add ParentIdx and SiblingIdx to DWARFDebugInfoEntry for faster…. Oct 1 2021, 10:12 PM

Revision Contents

Path	Size
llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h	1 line
llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp	68 lines

Diff 345907

llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h

Context not available.
	llvm::Optional<object::SectionedAddress> BaseAddr;	llvm::Optional<object::SectionedAddress> BaseAddr;
	/// The compile unit debug information entry items.	/// The compile unit debug information entry items.
	std::vector<DWARFDebugInfoEntry> DieArray;	std::vector<DWARFDebugInfoEntry> DieArray;
		std::vector<size_t> LastChildren;

	/// Map from range's start address to end address and corresponding DIE.	/// Map from range's start address to end address and corresponding DIE.
	/// IntervalMap does not support range removal, as a result, we use the	/// IntervalMap does not support range removal, as a result, we use the
Context not available.

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp

Context not available.
	#include <cstddef>	#include <cstddef>
	#include <cstdint>	#include <cstdint>
	#include <cstdio>	#include <cstdio>
		#include <stack>
	#include <utility>	#include <utility>
	#include <vector>	#include <vector>

Context not available.
	Context.getRecoverableErrorHandler()(std::move(e));	Context.getRecoverableErrorHandler()(std::move(e));
	}	}

		template <typename In, typename GetDepthFunc, typename HasChildrenPred,
		typename IsEndTagPred>
		auto findLastChildren(const In &InRange, const GetDepthFunc &GetDepth,
		const HasChildrenPred &HasChildren,
		const IsEndTagPred &IsEndTag) {
		using DepthType = decltype(GetDepth(*InRange.begin()));
		using IndexType = typename In::size_type;

		std::vector<IndexType> LastChildren(InRange.size(), 0);

		std::stack<DepthType, std::vector<IndexType>> ParentIdxs;
		for (IndexType I = 0, EndIdx = InRange.size(); I < EndIdx; ++I) {
		if (!ParentIdxs.empty()) {
		DepthType CurrentDepth = GetDepth(InRange[I]);

		DepthType StackSize = ParentIdxs.size();
		if (CurrentDepth <= StackSize) {
		// a child of the current parent
		while (CurrentDepth < StackSize--) {
		ParentIdxs.pop();
		}

		assert(!LastChildren[ParentIdxs.top()]);

		if (IsEndTag(InRange[I])) {
		// the last child of the current parent
		LastChildren[ParentIdxs.top()] = I;
		}
		}
		}

		if (HasChildren(InRange[I])) {
		assert(GetDepth(InRange[I]) == ParentIdxs.size());
		ParentIdxs.push(I);
		}
		}

		return LastChildren;
		}

	Error DWARFUnit::tryExtractDIEsIfNeeded(bool CUDieOnly) {	Error DWARFUnit::tryExtractDIEsIfNeeded(bool CUDieOnly) {
	if ((CUDieOnly && !DieArray.empty()) ||	if ((CUDieOnly && !DieArray.empty()) || DieArray.size() > 1)
	DieArray.size() > 1)
	return Error::success(); // Already parsed.	return Error::success(); // Already parsed.

	bool HasCUDie = !DieArray.empty();	bool HasCUDie = !DieArray.empty();
Context not available.
	isLittleEndian, getAddressByteSize()));	isLittleEndian, getAddressByteSize()));
	}	}

		LastChildren =
		findLastChildren(DieArray, std::mem_fn(&DWARFDebugInfoEntry::getDepth),
		std::mem_fn(&DWARFDebugInfoEntry::hasChildren),
		[](const DWARFDebugInfoEntry &DIE) {
		return DIE.getTag() == dwarf::DW_TAG_null;
		});

	// Don't fall back to DW_AT_GNU_ranges_base: it should be ignored for	// Don't fall back to DW_AT_GNU_ranges_base: it should be ignored for
	// skeleton CU DIE, so that DWARF users not aware of it are not broken.	// skeleton CU DIE, so that DWARF users not aware of it are not broken.
	return Error::success();	return Error::success();
Context not available.
	if (!Die->hasChildren())	if (!Die->hasChildren())
	return DWARFDie();	return DWARFDie();

	uint32_t Depth = Die->getDepth();	// We do not want access out of bounds when parsing corrupted debug data.
	for (size_t I = getDIEIndex(Die) + 1, EndIdx = DieArray.size(); I < EndIdx;	size_t I = getDIEIndex(Die);
	++I) {	if (I >= LastChildren.size())
	if (DieArray[I].getDepth() == Depth + 1 &&	return DWARFDie();
	DieArray[I].getTag() == dwarf::DW_TAG_null)	size_t LastChild = LastChildren[I];
	return DWARFDie(this, &DieArray[I]);	if (!LastChild) {
	assert(DieArray[I].getDepth() > Depth && "Not processing children?");	// TODO Shouldn't Die->hasChildren() be false in that case?
		return DWARFDie();
	}	}
	return DWARFDie();	return DWARFDie(this, &DieArray[LastChild]);
	}	}

	const DWARFAbbreviationDeclarationSet *DWARFUnit::getAbbreviations() const {	const DWARFAbbreviationDeclarationSet *DWARFUnit::getAbbreviations() const {
Context not available.