
Details

Reviewers

labath
dblaikie
netforce

Commits

rG6c12ae8163c7: Exposes interface to free up caching data structure in DWARFDebugLine and…

Summary

This is the minimal set of changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of the caching data structures. Because users could be going through multiple layers of interfaces, it was not clear where to put that functionality. While we work out where it belongs, it would be good to add this minimal interface change so that users can implement their own memory management. More specifically:

Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
DWARFContext adds the clearLineTableForUnit method, which uses clearLineTable to remove the line table object corresponding to a given compile unit, for memory management purposes. When the line table is referred to again, the object will be repopulated.
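To make the intended behaviour concrete, here is a minimal sketch of an offset-keyed cache with an explicit clear operation. It is a simplified stand-in, not the LLVM code: `ToyLineTable`, `ToyDebugLine`, and `parseCount` are hypothetical names, and "parsing" is reduced to filling in a string.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a parsed line table; not the LLVM type.
struct ToyLineTable {
  std::string Contents;
};

// Toy cache keyed by section offset, loosely modelled on the LineTableMap
// described in the summary. All names here are illustrative assumptions.
class ToyDebugLine {
public:
  // Return the cached table, or nullptr if nothing is cached at Offset.
  const ToyLineTable *getLineTable(uint64_t Offset) const {
    auto It = Map.find(Offset);
    return It == Map.end() ? nullptr : &It->second;
  }

  // Return the cached table, "parsing" it first if it is missing.
  const ToyLineTable *getOrParseLineTable(uint64_t Offset) {
    auto Result = Map.emplace(Offset, ToyLineTable{});
    if (Result.second) {
      // Newly inserted entry: simulate the parse and count it.
      Result.first->second.Contents = "table@" + std::to_string(Offset);
      ++ParseCount;
    }
    return &Result.first->second;
  }

  // Drop the cached entry; the next getOrParseLineTable call re-parses.
  void clearLineTable(uint64_t Offset) { Map.erase(Offset); }

  unsigned parseCount() const { return ParseCount; }

private:
  std::map<uint64_t, ToyLineTable> Map;
  unsigned ParseCount = 0;
};
```

In this model, clearing an offset makes `getLineTable` return `nullptr` for it, and the next `getOrParseLineTable` call pays for one more parse. Pointers obtained before the clear must not be used afterwards.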

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

netforce created this revision. Oct 22 2020, 9:20 PM

Herald added a project: Restricted Project. Oct 22 2020, 9:20 PM

Herald added subscribers: llvm-commits, hiraditya.

netforce requested review of this revision. Oct 22 2020, 9:20 PM

netforce mentioned this in D78950: Adds LRU caching of compile units in DWARFContext. Oct 22 2020, 9:25 PM

Harbormaster completed remote builds in B76125: Diff 300158. Oct 22 2020, 9:59 PM

The main motivation for going down the D78950 route was to ensure the code was testable (and preferably in a live codepath that other folks are using, etc) - so having some test coverage for this would still be necessary/highly preferred, though perhaps unit test coverage could be achieved/would suffice?

(looks like the patch description is incomplete? Seems there's a partial sentence at the end?)

One thing to note is that a line table can be shared by multiple dwarf units. This regularly happens with type units. Theoretically, compile units can share a line table too, though that would be a pretty unusual setup...
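A small sketch of the sharing concern, under the assumption that the cache is keyed only by the line table's section offset. `ToyUnit` and `ToyContext` are hypothetical names, not LLVM types: two units that point at the same offset share one cache entry, so clearing through either unit evicts it for both.

```cpp
#include <cstdint>
#include <set>

// Hypothetical unit: only its line table offset matters for this sketch.
struct ToyUnit {
  uint64_t StmtListOffset;
};

// Toy context whose cache is keyed by offset alone (an assumption made
// for illustration; it only tracks which offsets are currently cached).
class ToyContext {
public:
  // Ensure the table for U is cached, counting each (re)parse.
  void getLineTableForUnit(const ToyUnit &U) {
    if (Cache.insert(U.StmtListOffset).second)
      ++Parses;
  }

  // Evict the table U refers to, regardless of who else refers to it.
  void clearLineTableForUnit(const ToyUnit &U) {
    Cache.erase(U.StmtListOffset);
  }

  unsigned parses() const { return Parses; }

private:
  std::set<uint64_t> Cache;
  unsigned Parses = 0;
};
```

With two units at the same offset, the second lookup is a cache hit, but a lookup after a clear through either unit triggers a fresh parse.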

In D90006#2367734, @labath wrote:

> One thing to note is that a line table can be shared by multiple dwarf units. This regularly happens with type units. Theoretically, compile units can share a line table too, though that would be a pretty unusual setup...

FWIW, that does come up for LLVM when doing any kind of LTO + non-integrated assembler. (the assembly syntax doesn't provide a way to describe two distinct line tables, so both CUs end up having to refer to one line table)

Though in that case, using this API wouldn't break things, right? (unless they were being used concurrently, which would be problematic from the start - since the lazy-query/line table parsing API would race to begin with, I think?) It would mean that two CUs sharing a line table, if you invalidate one CU's line table (which is shared) then both/either CU, on its next line table query, would see a performance hit?

I guess the more involved alternative would be a reference counting scheme so that one could clear the line table from one CU and it would be a no-op unless all CUs sharing that line table had cleared it. I don't think that use case is necessary to support at the moment, so I'm fine with the more aggressive version that's implemented here.
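For comparison, the reference counting alternative mentioned above might look roughly like the following. This is only a sketch of the idea, not something the patch implements; `RefCountedLineTables`, `retain`, and `release` are hypothetical names.

```cpp
#include <cstdint>
#include <map>

// Hypothetical bookkeeping for shared line tables: an entry is dropped
// only once every unit that retained it has released it.
class RefCountedLineTables {
public:
  // A unit announces that it uses the table at Offset.
  void retain(uint64_t Offset) { ++Users[Offset]; }

  // A unit is done with the table. Returns true only when the last user
  // released it and the entry was actually dropped.
  bool release(uint64_t Offset) {
    auto It = Users.find(Offset);
    if (It == Users.end())
      return false; // Nothing cached at this offset.
    --It->second;
    if (It->second != 0)
      return false; // Other units still share this table: no-op.
    Users.erase(It);
    return true;
  }

  bool isCached(uint64_t Offset) const { return Users.count(Offset) != 0; }

private:
  std::map<uint64_t, unsigned> Users;
};
```

The extra state and the need for every caller to pair `retain` with `release` are the cost of this scheme, which is consistent with the view above that the simpler, more aggressive clear is enough for now.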

In D90006#2368823, @dblaikie wrote:

> In D90006#2367734, @labath wrote:
>
> > One thing to note is that a line table can be shared by multiple dwarf units. This regularly happens with type units. Theoretically, compile units can share a line table too, though that would be a pretty unusual setup...
>
> FWIW, that does come up for LLVM when doing any kind of LTO + non-integrated assembler. (the assembly syntax doesn't provide a way to describe two distinct line tables, so both CUs end up having to refer to one line table)

Interesting. I am moderately surprised that this does not cause any problems in the consumers. I don't know if this would actually break anything, but I believe it will cause lldb to (re)parse the line table for each CU that refers to it.

> Though in that case, using this API wouldn't break things, right? (unless they were being used concurrently, which would be problematic from the start - since the lazy-query/line table parsing API would race to begin with, I think?) It would mean that two CUs sharing a line table, if you invalidate one CU's line table (which is shared) then both/either CU, on its next line table query, would see a performance hit?

Yeah, this shouldn't break anything. I don't think this needs to be a show-stopper, but I wanted to make sure you are aware of that and are fine with that kind of a performance hit.

> I guess the more involved alternative would be a reference counting scheme so that one could clear the line table from one CU and it would be a no-op unless all CUs sharing that line table had cleared it. I don't think that use case is necessary to support at the moment, so I'm fine with the more aggressive version that's implemented here.

Another option would be to decouple line table and die clearing and expose separate interfaces for clearing each one -- thereby pushing the reference counting problem to the user.

In D90006#2370774, @labath wrote:

> In D90006#2368823, @dblaikie wrote:
>
> > In D90006#2367734, @labath wrote:
> >
> > > One thing to note is that a line table can be shared by multiple dwarf units. This regularly happens with type units. Theoretically, compile units can share a line table too, though that would be a pretty unusual setup...
> >
> > FWIW, that does come up for LLVM when doing any kind of LTO + non-integrated assembler. (the assembly syntax doesn't provide a way to describe two distinct line tables, so both CUs end up having to refer to one line table)
>
> Interesting. I am moderately surprised that this does not cause any problems in the consumers. I don't know if this would actually break anything, but I believe it will cause lldb to (re)parse the line table for each CU that refers to it.

Yeah, could totally believe that. Might be unfortunate for a large full LTO binary built without the integrated assembler, but otherwise wouldn't come up.

> > Though in that case, using this API wouldn't break things, right? (unless they were being used concurrently, which would be problematic from the start - since the lazy-query/line table parsing API would race to begin with, I think?) It would mean that two CUs sharing a line table, if you invalidate one CU's line table (which is shared) then both/either CU, on its next line table query, would see a performance hit?
>
> Yeah, this shouldn't break anything. I don't think this needs to be a show-stopper, but I wanted to make sure you are aware of that and are fine with that kind of a performance hit.

Fair, for sure - good to be aware/explicit about that.

> > I guess the more involved alternative would be a reference counting scheme so that one could clear the line table from one CU and it would be a no-op unless all CUs sharing that line table had cleared it. I don't think that use case is necessary to support at the moment, so I'm fine with the more aggressive version that's implemented here.
>
> Another option would be to decouple line table and die clearing and expose separate interfaces for clearing each one -- thereby pushing the reference counting problem to the user.

*nod* I don't think this /probably/ merits that nuance, but certainly something to keep in mind.

dblaikie mentioned this in D119784: [Symbolize] LRU cache binaries in llvm-symbolizer. Feb 15 2022, 10:43 AM

@netforce is this still of interest? To you/anyone else? (just curious/not important to me personally)

florinpapa added a subscriber: florinpapa. Mar 24 2022, 7:20 PM

Herald added a project: Restricted Project. Mar 24 2022, 7:20 PM

Taking over this change, as it is still useful and I'd like to get it through.

Rebase against origin/master

Harbormaster completed remote builds in B156200: Diff 418114. Mar 24 2022, 9:42 PM

Based on offline discussion: It'd be nice to at least have unit testing for this functionality (if/since it won't be used by llvm-symbolizer or other llvm tools) & some comments in the unit test about why it exists.

This revision now requires changes to proceed. Mar 25 2022, 1:33 PM

Address comments

florinpapa edited the summary of this revision. May 17 2022, 2:53 PM

Update test comment.

Looks OK otherwise

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
Line 402: Could this be written as: ASSERT_TRUE(ExpectedLineTable4); ? (similarly for other bool tests - in general, explicitly calling operator overloads is a bit "weird" and best avoided if possible - if there's an issue with gtest only testing implicit conversion, then maybe it'd be suitable to explicitly cast to bool: `ASSERT_TRUE((bool)ExpectedLineTable4)`)

This revision is now accepted and ready to land. May 17 2022, 3:22 PM

Avoid calling operator overloads in unit test.

florinpapa added inline comments. May 17 2022, 4:08 PM

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
Line 402:

> if there's an issue with gtest only testing implicit conversion, then maybe it'd be suitable to explicitly cast to bool: `ASSERT_TRUE((bool)ExpectedLineTable4)`

How do I test that? I have been using `ninja check-llvm-unit` for testing so far; is that enough?

Line 402: I had to go with the latter, as the build failed without the explicit cast.
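The build failure is consistent with the general C++ rule for explicit conversion operators, illustrated below. This is a sketch with a hypothetical `ToyExpected` type standing in for a type whose bool conversion is declared `explicit`; it does not reproduce gtest's macro internals, which are not shown in this review.

```cpp
#include <type_traits>

// Hypothetical stand-in for a type with an explicit bool conversion.
struct ToyExpected {
  bool HasValue;
  explicit operator bool() const { return HasValue; }
};

// Hypothetical helper with a plain bool parameter. Passing a ToyExpected
// here needs an implicit conversion, which "explicit" forbids.
inline bool takesBool(bool B) { return B; }

// Implicit conversion (copy-initialization) is rejected...
static_assert(!std::is_convertible<ToyExpected, bool>::value,
              "implicit conversion to bool should not be available");
// ...while an explicit cast (direct-initialization) is allowed.
static_assert(std::is_constructible<bool, ToyExpected>::value,
              "explicit conversion to bool should be available");
```

So `takesBool(E)` would not compile for a `ToyExpected E`, while `takesBool((bool)E)` or `takesBool(static_cast<bool>(E))` would. A condition such as `if (E)` also works, because that is a contextual conversion.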

Harbormaster completed remote builds in B164996: Diff 430203. May 17 2022, 5:13 PM

Perform rebase.

Harbormaster completed remote builds in B165896: Diff 431456. May 23 2022, 1:35 PM

Hello, this change was approved. Could one of the reviewers help commit this, since I don't have permissions? The failing tests look like unrelated flakes, as they are timeouts and seem to fail differently after each rebase.

This revision was landed with ongoing or failed builds. May 23 2022, 8:23 PM

Closed by commit rG6c12ae8163c7: Exposes interface to free up caching data structure in DWARFDebugLine and… (authored by netforce, committed by dblaikie).

This revision was automatically updated to reflect the committed changes.

dblaikie added a commit: rG6c12ae8163c7: Exposes interface to free up caching data structure in DWARFDebugLine and….

Diff 431573

llvm/include/llvm/DebugInfo/DWARF/DWARFContext.h

[327 lines not shown; enclosing context: public:]
   const DWARFDebugLine::LineTable *getLineTableForUnit(DWARFUnit *U);

   /// Get a pointer to a parsed line table corresponding to a compile unit.
   /// Report any recoverable parsing problems using the handler.
   Expected<const DWARFDebugLine::LineTable *>
   getLineTableForUnit(DWARFUnit *U,
                       function_ref<void(Error)> RecoverableErrorHandler);

+  // Clear the line table object corresponding to a compile unit for memory
+  // management purpose. When it's referred to again, it'll be re-populated.
+  void clearLineTableForUnit(DWARFUnit *U);
+
   DataExtractor getStringExtractor() const {
     return DataExtractor(DObj->getStrSection(), false, 0);
   }
   DataExtractor getStringDWOExtractor() const {
     return DataExtractor(DObj->getStrDWOSection(), false, 0);
   }
   DataExtractor getLineStringExtractor() const {
     return DataExtractor(DObj->getLineStrSection(), false, 0);
[121 lines not shown]

llvm/include/llvm/DebugInfo/DWARF/DWARFDebugLine.h

[298 lines not shown; enclosing context: bool lookupAddressRangeImpl(object::SectionedAddress Address, uint64_t Size,]
                                 std::vector<uint32_t> &Result) const;
   };

   const LineTable *getLineTable(uint64_t Offset) const;
   Expected<const LineTable *>
   getOrParseLineTable(DWARFDataExtractor &DebugLineData, uint64_t Offset,
                       const DWARFContext &Ctx, const DWARFUnit *U,
                       function_ref<void(Error)> RecoverableErrorHandler);
+  void clearLineTable(uint64_t Offset);

   /// Helper to allow for parsing of an entire .debug_line section in sequence.
   class SectionParser {
   public:
     using LineToUnitMap = std::map<uint64_t, DWARFUnit *>;

     SectionParser(DWARFDataExtractor &Data, const DWARFContext &C,
                   DWARFUnitVector::iterator_range Units);
[102 lines not shown]

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp

[997 lines not shown; enclosing context: Expected<const DWARFDebugLine::LineTable *> DWARFContext::getLineTableForUnit(]

   // We have to parse it first.
   DWARFDataExtractor lineData(*DObj, U->getLineSection(), isLittleEndian(),
                               U->getAddressByteSize());
   return Line->getOrParseLineTable(lineData, stmtOffset, *this, U,
                                    RecoverableErrorHandler);
 }

+void DWARFContext::clearLineTableForUnit(DWARFUnit *U) {
+  if (!Line)
+    return;
+
+  auto UnitDIE = U->getUnitDIE();
+  if (!UnitDIE)
+    return;
+
+  auto Offset = toSectionOffset(UnitDIE.find(DW_AT_stmt_list));
+  if (!Offset)
+    return;
+
+  uint64_t stmtOffset = *Offset + U->getLineTableOffset();
+  Line->clearLineTable(stmtOffset);
+}
+
 void DWARFContext::parseNormalUnits() {
   if (!NormalUnits.empty())
     return;
   DObj->forEachInfoSections([&](const DWARFSection &S) {
     NormalUnits.addUnitsForSection(*this, S, DW_SECT_INFO);
   });
   NormalUnits.finishedInfoUnits();
   DObj->forEachTypesSections([&](const DWARFSection &S) {
[1,018 lines not shown]

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp

[595 lines not shown; enclosing context: if (Pos.second) {]
     if (Error Err =
             LT->parse(DebugLineData, &Offset, Ctx, U, RecoverableErrorHandler))
       return std::move(Err);
     return LT;
   }
   return LT;
 }

+void DWARFDebugLine::clearLineTable(uint64_t Offset) {
+  LineTableMap.erase(Offset);
+}
+
 static StringRef getOpcodeName(uint8_t Opcode, uint8_t OpcodeBase) {
   assert(Opcode != 0);
   if (Opcode < OpcodeBase)
     return LNStandardString(Opcode);
   return "special";
 }

 uint64_t DWARFDebugLine::ParsingState::advanceAddr(uint64_t OperationAdvance,
[889 lines not shown]

llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp

[374 lines not shown]
 void DWARFUnit::clear() {
   Abbrevs = nullptr;
   BaseAddr.reset();
   RangeSectionBase = 0;
   LocSectionBase = 0;
   AddrOffsetSectionBase = None;
   SU = nullptr;
   clearDIEs(false);
+  AddrDieMap.clear();
+  if (DWO)
+    DWO->clear();
   DWO.reset();
 }

 const char *DWARFUnit::getCompilationDir() {
   return dwarf::toString(getUnitDIE().find(DW_AT_comp_dir), nullptr);
 }

 void DWARFUnit::extractDIEsToVector(
[747 lines not shown]

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp

[325 lines not shown; enclosing context: #endif]
   EXPECT_FALSE(Recoverable);
   EXPECT_EQ(Expected2, *ExpectedLineTable4);

   // TODO: Add tests that show that the body of the programs have been read
   // correctly.
 }

 #ifdef _AIX
+TEST_P(DebugLineParameterisedFixture, DISABLED_ClearLineValidTable) {
+#else
+TEST_P(DebugLineParameterisedFixture, ClearLineValidTable) {
+#endif
+  if (!setupGenerator(Version))
+    GTEST_SKIP();
+
+  SCOPED_TRACE("Checking Version " + std::to_string(Version) + ", Format " +
+               (Format == DWARF64 ? "DWARF64" : "DWARF32"));
+
+  LineTable &LT = Gen->addLineTable(Format);
+  LT.addExtendedOpcode(9, DW_LNE_set_address, {{0xadd4e55, LineTable::Quad}});
+  LT.addStandardOpcode(DW_LNS_copy, {});
+  LT.addByte(0xaa);
+  LT.addExtendedOpcode(1, DW_LNE_end_sequence, {});
+
+  LineTable &LT2 = Gen->addLineTable(Format);
+  LT2.addExtendedOpcode(9, DW_LNE_set_address, {{0x11223344, LineTable::Quad}});
+  LT2.addStandardOpcode(DW_LNS_copy, {});
+  LT2.addByte(0xbb);
+  LT2.addExtendedOpcode(1, DW_LNE_end_sequence, {});
+  LT2.addExtendedOpcode(9, DW_LNE_set_address, {{0x55667788, LineTable::Quad}});
+  LT2.addStandardOpcode(DW_LNS_copy, {});
+  LT2.addByte(0xcc);
+  LT2.addExtendedOpcode(1, DW_LNE_end_sequence, {});
+
+  generate();
+
+  // Check that we have what we expect before calling clearLineTable().
+  auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 0, *Context,
+                                                    nullptr, RecordRecoverable);
+  ASSERT_TRUE((bool)ExpectedLineTable);
+  EXPECT_FALSE(Recoverable);
+  const DWARFDebugLine::LineTable *Expected = *ExpectedLineTable;
+  checkDefaultPrologue(Version, Format, Expected->Prologue, 16);
+  EXPECT_EQ(Expected->Sequences.size(), 1u);
+
+  uint64_t SecondOffset =
+      Expected->Prologue.sizeofTotalLength() + Expected->Prologue.TotalLength;
+  Recoverable = Error::success();
+  auto ExpectedLineTable2 = Line.getOrParseLineTable(
+      LineData, SecondOffset, *Context, nullptr, RecordRecoverable);
+  ASSERT_TRUE((bool)ExpectedLineTable2);
+  EXPECT_FALSE(Recoverable);
+  const DWARFDebugLine::LineTable *Expected2 = *ExpectedLineTable2;
+  checkDefaultPrologue(Version, Format, Expected2->Prologue, 32);
+  EXPECT_EQ(Expected2->Sequences.size(), 2u);
+
+  // Check that we no longer get the line tables after clearLineTable().
+  Line.clearLineTable(0);
+  Line.clearLineTable(SecondOffset);
+  EXPECT_EQ(Line.getLineTable(0), nullptr);
+  EXPECT_EQ(Line.getLineTable(SecondOffset), nullptr);
+
+  // Check that if the same offset is requested, the contents match what we
+  // had before.
+  Recoverable = Error::success();
+  auto ExpectedLineTable3 = Line.getOrParseLineTable(
+      LineData, 0, *Context, nullptr, RecordRecoverable);
+  ASSERT_TRUE((bool)ExpectedLineTable3);
+  EXPECT_FALSE(Recoverable);
+  const DWARFDebugLine::LineTable *Expected3 = *ExpectedLineTable3;
+  checkDefaultPrologue(Version, Format, Expected3->Prologue, 16);
+  EXPECT_EQ(Expected3->Sequences.size(), 1u);
+
+  Recoverable = Error::success();
+  auto ExpectedLineTable4 = Line.getOrParseLineTable(
+      LineData, SecondOffset, *Context, nullptr, RecordRecoverable);
+  ASSERT_TRUE((bool)ExpectedLineTable4);
+  EXPECT_FALSE(Recoverable);
+  const DWARFDebugLine::LineTable *Expected4 = *ExpectedLineTable4;
+  checkDefaultPrologue(Version, Format, Expected4->Prologue, 32);
+  EXPECT_EQ(Expected4->Sequences.size(), 2u);
+}
+
+#ifdef _AIX
 TEST_F(DebugLineBasicFixture, DISABLED_ErrorForReservedLength) {
 #else
 TEST_F(DebugLineBasicFixture, ErrorForReservedLength) {
 #endif
   if (!setupGenerator())
     GTEST_SKIP();

   LineTable &LT = Gen->addLineTable();
[1,580 lines not shown]

This is an archive of the discontinued LLVM Phabricator instance.

Exposes interface to free up caching data structure in DWARFDebugLine and DWARFUnit for memory management
ClosedPublic
