This is an archive of the discontinued LLVM Phabricator instance.


Differential D63591

DWARFDebugLoc: Make parsing and error reporting more robust
ClosedPublic

Authored by labath on Jun 20 2019, 3:01 AM.


Details

Reviewers

dblaikie
JDevlieghere
probinson

Commits

rGbd546e59026d: DWARFDebugLoc: Make parsing and error reporting more robust
rL370363: DWARFDebugLoc: Make parsing and error reporting more robust

Summary

While examining this class for possible use in lldb, I noticed two
things:

- it spits out parsing errors directly to stderr
- the loclists parser can incorrectly return valid location lists when parsing malformed (truncated) data

I improve the stderr situation by making the parseOneLocationList
functions return Expected<T>s. The errors are still dumped to stderr by
their callers, so this is only a partial fix, but it is enough for my
use case, as I intend to parse the locations lists one by one.

I fix the behavior in the truncated scenario by using the newly
introduced DataExtractor Cursor API.

I also add tests for handling the error cases, as they currently have no
coverage.
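As a rough illustration of the first point, the sketch below shows a parser that hands the error back to its caller instead of printing it. It is a simplified model in standard C++, not LLVM code: `ParsedList`, `ParseResult` and `parseOneList` are hypothetical names, and `std::variant` stands in for `Expected<T>`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a parsed location list.
struct ParsedList {
  std::vector<uint8_t> Entries;
};

// Either a parsed list or an error message (assumption: models Expected<T>).
using ParseResult = std::variant<ParsedList, std::string>;

// Parses a zero-terminated run of bytes starting at *Offset.
// On success *Offset is advanced past the terminator; on truncated input an
// error is returned and *Offset is left untouched. Nothing is printed here.
ParseResult parseOneList(const std::vector<uint8_t> &Data, uint64_t *Offset) {
  assert(Offset && "offset pointer must not be null");
  ParsedList List;
  uint64_t Pos = *Offset;
  while (Pos < Data.size()) {
    uint8_t Byte = Data[Pos++];
    if (Byte == 0) {
      *Offset = Pos; // Commit the offset only for a complete list.
      return List;
    }
    List.Entries.push_back(Byte);
  }
  return std::string("unexpected end of data");
}
```

A caller that parses lists one by one can then decide for itself whether to log the message, wrap it, or discard it.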

Diff Detail

Repository: rL LLVM

Event Timeline

labath created this revision. Jun 20 2019, 3:01 AM

Herald added a project: Restricted Project. Jun 20 2019, 3:01 AM

Harbormaster completed remote builds in B33675: Diff 205770. Jun 20 2019, 3:01 AM

Looks pretty good, and thanks especially for the error-case tests!
I'll give other folks a chance to chime in if they want to.

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
101 ↗	(On Diff #205770)	This identical createError call occurs many times, maybe add a createLocListOverflowError() helper?
115 ↗	(On Diff #205770)	You could do `SavedOffset = Offset;` here, and then add a `SavedOffset == Offset` check to the next one. There's no harm to calling a `get*` function with an invalid offset.
218 ↗	(On Diff #205770)	Maybe put an llvm_unreachable here.
test/DebugInfo/X86/dwarfdump-debug-loc-error-cases.s
1 ↗	(On Diff #205770)	I was not aware of `--defsym` that looks incredibly useful! In a test that generates multiple .o files I prefer to give each one a unique name, e.g. `%t0.o` and `%t1.o` etc. It can make it easier to debug a broken test.
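The `SavedOffset == Offset` check suggested in the inline comment on line 115 can be sketched roughly as follows. This is a toy model, assuming a reader that returns 0 and leaves the offset unchanged when a read fails; `ByteReader` and `readPair` are hypothetical names, not part of DataExtractor.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical offset-pointer style reader: on failure it returns 0 and
// leaves *Offset untouched.
struct ByteReader {
  std::vector<uint8_t> Bytes;

  uint8_t getU8(uint64_t *Offset) const {
    assert(Offset && "offset pointer must not be null");
    if (*Offset >= Bytes.size())
      return 0;
    return Bytes[(*Offset)++];
  }
};

// Reads two bytes and detects failure by comparing the offset before and
// after the second read. If the first read fails, the second one fails too,
// so a single check at the end covers both.
bool readPair(const ByteReader &R, uint64_t *Offset, uint8_t &A, uint8_t &B) {
  A = R.getU8(Offset);
  uint64_t SavedOffset = *Offset;
  B = R.getU8(Offset);
  return *Offset != SavedOffset;
}
```

The check relies on every successful read advancing the offset, which is why calling a `get*` function with an already invalid offset is harmless here.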

add createOverflowError helper
use unique file names in tests

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
115 ↗	(On Diff #205770)	The debug_loc function doesn't use the SavedOffset pattern, because it is always reading data in fixed-size chunks. I think it would be better to keep it that way, as this is slightly more readable.
test/DebugInfo/X86/dwarfdump-debug-loc-error-cases.s
1 ↗	(On Diff #205770)	I'm writing a bunch of tests in assembly these days, so I've learned a lot of interesting tricks there. :) I'll update the tests to use distinct file names.

Harbormaster completed remote builds in B33682: Diff 205792. Jun 20 2019, 6:25 AM

LGTM but give the West Coast folks a chance to look at it.

This revision is now accepted and ready to land. Jun 20 2019, 6:41 AM

dblaikie added inline comments. Jun 20 2019, 12:47 PM

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
27 ↗	(On Diff #205792)	I guess "Ts &&... Vals" should be "const Ts &... Vals" since they're taken by const ref by createStringError anyway - no need for the fancy &&.
31 ↗	(On Diff #205792)	Should this be StringRef rather than const char*?
171 ↗	(On Diff #205792)	Looks to me like getULEB128 doesn't quite have the right error handling, if I'm reading it correctly:

  unsigned shift = 0;
  uint32_t offset = *offset_ptr;
  uint8_t byte = 0;
  while (isValidOffset(offset)) {
    byte = Data[offset++];
    result |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0)
      break;
  }
  *offset_ptr = offset;
  return result;

I /imagine/ it shouldn't update offset_ptr if it breaks out of the loop via !isValidOffset, rather than via the break?

More broadly, I wonder if we should consider a more convenient way to do error handling here - since it's a bit unfortunate that you've had to split the logic for parsing these things across two switch statements - makes it a bit hard to follow what shape each LLE entry has, since it's spread out like this.
220 ↗	(On Diff #205792)	Given the loop condition is "while (true)" this unreachable seems a bit unnecessary (& the function has non-void return, so if there was a path that got through the loop I imagine the compiler would warn us about that?) Or is this working around a compiler that warns here despite the lack of any path out of the loop?
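The getULEB128 behaviour asked for in the comment on line 171 (do not advance the offset when the data runs out mid-value) might look roughly like the sketch below. It is a standalone approximation over a `std::vector`, not the fix that later landed in DataExtractor; `readULEB128` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Decodes an unsigned LEB128 value starting at *Offset. On success, advances
// *Offset past the value. If the data ends before a terminating byte (high
// bit clear) is seen, returns 0 and leaves *Offset unchanged.
// Assumption: bits beyond the 64th are silently dropped rather than reported.
uint64_t readULEB128(const std::vector<uint8_t> &Data, uint64_t *Offset) {
  assert(Offset && "offset pointer must not be null");
  uint64_t Result = 0;
  unsigned Shift = 0;
  uint64_t Pos = *Offset;
  while (Pos < Data.size()) {
    uint8_t Byte = Data[Pos++];
    if (Shift < 64) // Avoid an out-of-range shift on overlong encodings.
      Result |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0) {
      *Offset = Pos; // Commit only once a complete value has been read.
      return Result;
    }
  }
  return 0; // Truncated input: *Offset is deliberately not updated.
}
```

With this shape, an unchanged offset reliably signals a failed read.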

probinson added inline comments. Jun 20 2019, 1:41 PM

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
220 ↗	(On Diff #205792)	I have had to add llvm_unreachable before in this kind of situation, IIRC, which is why I suggested it. Might not be necessary, if all 3 supported toolchains are smart enough nowadays.

remove fancy references
remove llvm_unreachable

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
31 ↗	(On Diff #205792)	createStringError uses a printf format string. Taking a StringRef would mean I'd have to add `.str().c_str()` blurb.
171 ↗	(On Diff #205792)	Nice catch about the getULEB function. I'll create a separate patch for that.

Overall, I'm not really happy about how the error handling is implemented here. The DataExtractor functions seem to be really good at making sure you don't crash while using them, but they also make it incredibly hard to check the result for errors. We could change them to return an Optional<T> or something, but that would make using them a lot more verbose. It sounds to me like it may be best to have something similar to what std::istream has. I.e., have an object which encapsulates three things:

- the data to parse
- current offset in that data
- an error flag

This would mean one can still call GetXXX functions in sequence without additional error checking. However, at a suitable point in time (e.g., after parsing a single record/DIE/...), one can have a peek at the error flag to verify that the data one got is actually valid. WDYT?
220 ↗	(On Diff #205792)	The debug_loc parser already uses this pattern without the terminating llvm_unreachable so I'd say we can assume the current compilers are fine with that..
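A minimal sketch of the istream-like idea from the comment on line 171, assuming a sticky error flag that turns every later read into a no-op. The names `Cursor`, `Reader`, `Record` and `parseRecord` are made up for this illustration and do not reflect the API that was eventually added to DataExtractor.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical cursor: bundles the current offset with a sticky error flag.
struct Cursor {
  uint64_t Offset = 0;
  bool Failed = false;
};

// Stateless reader over a byte buffer; all mutable state lives in the Cursor.
class Reader {
  const std::vector<uint8_t> &Bytes;

public:
  explicit Reader(const std::vector<uint8_t> &B) : Bytes(B) {}

  // Returns 0 and sets the flag on failure; once the flag is set, every
  // later read also returns 0 without touching the offset.
  uint8_t getU8(Cursor &C) const {
    if (C.Failed || C.Offset >= Bytes.size()) {
      C.Failed = true;
      return 0;
    }
    return Bytes[C.Offset++];
  }

  // Little-endian 16-bit read built from two byte reads.
  uint16_t getU16LE(Cursor &C) const {
    uint16_t Lo = getU8(C);
    uint16_t Hi = getU8(C);
    return C.Failed ? 0 : static_cast<uint16_t>(Lo | (Hi << 8));
  }
};

struct Record {
  uint8_t Kind = 0;
  uint16_t Length = 0;
};

// Reads a whole record with no per-field checks, then inspects the flag once.
bool parseRecord(const Reader &R, Cursor &C, Record &Out) {
  Out.Kind = R.getU8(C);
  Out.Length = R.getU16LE(C);
  return !C.Failed;
}
```

Because the reader itself holds no mutable state, several cursors can walk the same data independently, which matters for the statelessness concern raised later in the thread.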

Harbormaster completed remote builds in B33724: Diff 205973. Jun 21 2019, 4:56 AM

labath mentioned this in D63645: [Support] Fix error handling in DataExtractor::get[US]LEB128. Jun 21 2019, 5:08 AM

Removing that llvm_unreachable is fine, in that case.
The idea for error handling for DataExtractor sounds reasonable, looks like adding an error flag wouldn't even increase the size.

In D63591#1553416, @probinson wrote:

The idea for error handling for DataExtractor sounds reasonable, looks like adding an error flag wouldn't even increase the size.

Hmm... Originally I was thinking of building something on top of DataExtractor. Putting the logic *into* the DataExtractor is an interesting idea. I kind of like it (it would solve the problem I had of how to capture the DataExtractor vs. DWARFDataExtractor relationship in the "on top" model), but there's also something that bothers me about that. I think it's the fact that this would make the DataExtractor class stateful, whereas previously it was completely stateless. That may not be all bad, but it would mean the transition has to be done more carefully (watch out for thread races, and other unintended effects of the error bit leaking out). However, it also feels weird to have the error flag be a part of DataExtractor state, while the offset isn't. So, e.g. if one extracts from the DataExtractor using two independent offsets simultaneously, the error state set by one extraction would impact the other. This would be most obvious with the strtab-style data extractors, which almost always get a bunch of completely independent queries.

One way to achieve this while keeping the DataExtractor stateless would be to pass the error flag as an additional argument to the extraction methods, just like the offset is now. But that would make things more verbose, which means one might still want to implement some kind of an abstraction on top of that to keep these things together...

Ah, hadn't considered statefulness. But if you layer another class on top of DataExtractor to handle the error flag, it would have to be replicating all the offset-is-valid checks, because of course DataExtractor itself doesn't return errors.

I have a couple more ideas to toss out there...

A DataExtractorBase class that returns Optional<whatever>, and then DataExtractor layers on top and converts None to zero, which preserves the non-statefulness as well as the current API. This adds some runtime overhead, not sure how much.
Or, a template DataExtractorBase that takes an error-handling class as a parameter (sort of like how STL containers take an allocator) and a DataExtractor specialization uses a no-op error-handling class. Should avoid runtime overhead at the cost of template cruft.
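The first of these two ideas could be sketched as below, with `std::optional` standing in for `Optional` and a single `getU8` accessor for brevity. `ExtractorBase`, `Extractor` and `tryGetU8` are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical base layer: reports failure explicitly via std::optional and
// leaves *Offset unchanged when nothing could be read.
class ExtractorBase {
protected:
  std::vector<uint8_t> Bytes;

public:
  explicit ExtractorBase(std::vector<uint8_t> B) : Bytes(std::move(B)) {}

  std::optional<uint8_t> tryGetU8(uint64_t *Offset) const {
    assert(Offset && "offset pointer must not be null");
    if (*Offset >= Bytes.size())
      return std::nullopt;
    return Bytes[(*Offset)++];
  }
};

// Compatibility layer: keeps the zero-on-failure behaviour of the existing
// API by converting an empty optional to 0. It adds no state of its own.
class Extractor : public ExtractorBase {
public:
  using ExtractorBase::ExtractorBase;

  uint8_t getU8(uint64_t *Offset) const {
    return tryGetU8(Offset).value_or(0);
  }
};
```

Callers that care about errors would use the `tryGet*` form and check each result, while existing callers keep the terse zero-returning form.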

In D63591#1553600, @probinson wrote:

Ah, hadn't considered statefulness. But if you layer another class on top of DataExtractor to handle the error flag, it would have to be replicating all the offset-is-valid checks, because of course DataExtractor itself doesn't return errors.

Not really. The data extractor kind of does return errors, as we've seen in this patch, it's just that they're incredibly hard to check for. I was thinking of having the new class rely on the SavedOffset == Offset behavior, but only internally. In my prototype, I managed to tuck it all away into a single template function which takes a member function pointer argument :D. Unfortunately, that was ICE-ing on gcc :P, but that was because I was being too clever -- I'm sure it can be dumbed down a bit.

I have a couple more ideas to toss out there...

A DataExtractorBase class that returns Optional<whatever>, and then DataExtractor layers on top and converts None to zero, which preserves the non-statefulness as well as the current API. This adds some runtime overhead, not sure how much.

That would work. It wouldn't even have to be a separate class, if you just make sure the function names are somehow different. However, it would mean that one has to explicitly check the result of every get operation. Not the end of the world, but it would make the code using it more noisy.

Or, a template DataExtractorBase that takes an error-handling class as a parameter (sort of like how STL containers take an allocator) and a DataExtractor specialization uses a no-op error-handling class. Should avoid runtime overhead at the cost of template cruft.

I'm not exactly sure how you imagined that, but I'm sure it could be made to work, as templates can be made to do almost anything. :P I'm not sure if it would be simpler than having a wrapper class or not. I have a feeling it might end up looking fairly similar from the outside.

Pick whatever mechanism you like, we should debate it in that patch not here. :-)

Given figuring out error handling for DataExtractor is perhaps a wider issue - if you want to go ahead with this change (continue with the review & defer error handling improvements for later, leave a FIXME, etc) that seems fine.

labath mentioned this in rL364169: [Support] Fix error handling in DataExtractor::get[US]LEB128. Jun 24 2019, 2:12 AM

labath mentioned this in rGbb6d0b8e7b0d: [Support] Fix error handling in DataExtractor::get[US]LEB128.

Leave a TODO in the code.

Harbormaster completed remote builds in B33785: Diff 206205. Jun 24 2019, 6:08 AM

In D63591#1553757, @dblaikie wrote:

Given figuring out error handling for DataExtractor is perhaps a wider issue - if you want to go ahead with this change (continue with the review & defer error handling improvements for later, leave a FIXME, etc) that seems fine.

How about this ? Theoretically I could also back out the SavedOffset changes. The main thing I was trying to fix is the stderr messages, this is just something I found while trying to write tests for the error handling code. I'm not too worried about the extra "zero" location lists being reported, as those are unlikely to be valid (but it would definitely be nice to fix them).

I also have a kind of a WIP patch for doing the error handling in a better way. I'm going to put that up separately so we can discuss it there.

PS: I'm going to have about two more patches here to make this stuff usable from lldb.

labath mentioned this in D63713: Add error handling to the DataExtractor class. Jun 24 2019, 6:29 AM

DataExtractor is a copy of the one from LLDB from a while back and changes have been made to adapt it to llvm. DataExtractor was designed so that you can have one of them (like for .debug_info or any other DWARF section) and use this same extractor from multiple threads. This is why it is currently stateless.

One solution to allowing for correct error handling would be to replace the current "uint32_t *offset_ptr" arguments to DataExtractor decoding functions with a "DataCursor &Pos" where DataCursor is something like:

class DataCursor {
  llvm::Expected<uint32_t> OffsetOrError;
};

Then all of the state, like the offset and any error state, would live in the cursor. Or it could be two members, an offset and an error.

The main issue is to not decrease parsing performance by introducing error checking on each byte. The current DataExtractor will return zeroes when things fail to extract, which is kind of tuned for DWARF since zeros are not valid DW_TAG, DW_AT, DW_FORM and many other DWARF values. But it does allow for fast parsing. The idea was to quickly try and parse a bunch of data, and then make sure things are ok after doing some work (like parsing an entire DIE). So be careful with any changes to ensure DWARF parsing doesn't seriously regress.

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
190 ↗	(On Diff #206205)	We should switch the LEB functions in DataExtractor over to use the ones from:

  #include <llvm/Support/LEB128.h>

and use the:

  inline uint64_t decodeULEB128(const uint8_t *p, unsigned *n = nullptr, const uint8_t *end = nullptr, const char **error = nullptr);
  inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr, const uint8_t *end = nullptr, const char **error = nullptr);

functions... They have all the error checking and are quite efficient. Since DataExtractor had been converted from LLDB over into LLVM, the person that moved DataExtractor into LLVM (might have been me) hadn't realized these functions were there when the move happened.

labath mentioned this in rGb1f29cec2511: Add error handling to the DataExtractor class. Aug 27 2019, 4:27 AM

labath mentioned this in rL370042: Add error handling to the DataExtractor class. Aug 27 2019, 4:33 AM

Rebase the patch on top of DataExtractor Cursor changes.

labath requested review of this revision. Aug 27 2019, 5:22 AM

labath edited the summary of this revision.

Harbormaster completed remote builds in B37356: Diff 217378. Aug 27 2019, 5:23 AM

LGTM

JDevlieghere accepted this revision. Aug 27 2019, 8:35 AM

This revision is now accepted and ready to land. Aug 27 2019, 8:35 AM

Closed by commit rL370363: DWARFDebugLoc: Make parsing and error reporting more robust (authored by labath). Aug 29 2019, 7:25 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

llvm/

trunk/

include/

llvm/

DebugInfo/

DWARF/

DWARFDebugLoc.h

13 lines

lib/

DebugInfo/

DWARF/

DWARFDebugLoc.cpp

97 lines

DWARFVerifier.cpp

12 lines

test/

DebugInfo/

X86/

dwarfdump-debug-loc-error-cases.s

58 lines

dwarfdump-debug-loclists-error-cases.s

71 lines

Diff 217872

llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h

Show All 23 Lines
public:		public:
/// A single location within a location list.		/// A single location within a location list.
struct Entry {		struct Entry {
/// The beginning address of the instruction range.		/// The beginning address of the instruction range.
uint64_t Begin;		uint64_t Begin;
/// The ending address of the instruction range.		/// The ending address of the instruction range.
uint64_t End;		uint64_t End;
/// The location of the variable within the specified range.		/// The location of the variable within the specified range.
SmallString<4> Loc;		SmallVector<uint8_t, 4> Loc;
};		};

/// A list of locations that contain one variable.		/// A list of locations that contain one variable.
struct LocationList {		struct LocationList {
/// The beginning offset where this location list is stored in the debug_loc		/// The beginning offset where this location list is stored in the debug_loc
/// section.		/// section.
uint64_t Offset;		uint64_t Offset;
/// All the locations in which the variable is stored.		/// All the locations in which the variable is stored.
Show All 22 Lines	public:

/// Parse the debug_loc section accessible via the 'data' parameter using the		/// Parse the debug_loc section accessible via the 'data' parameter using the
/// address size also given in 'data' to interpret the address ranges.		/// address size also given in 'data' to interpret the address ranges.
void parse(const DWARFDataExtractor &data);		void parse(const DWARFDataExtractor &data);

/// Return the location list at the given offset or nullptr.		/// Return the location list at the given offset or nullptr.
LocationList const *getLocationListAtOffset(uint64_t Offset) const;		LocationList const *getLocationListAtOffset(uint64_t Offset) const;

Optional<LocationList> parseOneLocationList(DWARFDataExtractor Data,		static Expected<LocationList>
uint64_t *Offset);		parseOneLocationList(const DWARFDataExtractor &Data, uint64_t *Offset);
};		};

class DWARFDebugLoclists {		class DWARFDebugLoclists {
public:		public:
struct Entry {		struct Entry {
uint8_t Kind;		uint8_t Kind;
uint64_t Value0;		uint64_t Value0;
uint64_t Value1;		uint64_t Value1;
SmallVector<char, 4> Loc;		SmallVector<uint8_t, 4> Loc;
};		};

struct LocationList {		struct LocationList {
uint64_t Offset;		uint64_t Offset;
SmallVector<Entry, 2> Entries;		SmallVector<Entry, 2> Entries;
void dump(raw_ostream &OS, uint64_t BaseAddr, bool IsLittleEndian,		void dump(raw_ostream &OS, uint64_t BaseAddr, bool IsLittleEndian,
unsigned AddressSize, const MCRegisterInfo *RegInfo,		unsigned AddressSize, const MCRegisterInfo *RegInfo,
DWARFUnit *U, unsigned Indent) const;		DWARFUnit *U, unsigned Indent) const;
Show All 11 Lines
public:		public:
void parse(DataExtractor data, unsigned Version);		void parse(DataExtractor data, unsigned Version);
void dump(raw_ostream &OS, uint64_t BaseAddr, const MCRegisterInfo *RegInfo,		void dump(raw_ostream &OS, uint64_t BaseAddr, const MCRegisterInfo *RegInfo,
Optional<uint64_t> Offset) const;		Optional<uint64_t> Offset) const;

/// Return the location list at the given offset or nullptr.		/// Return the location list at the given offset or nullptr.
LocationList const *getLocationListAtOffset(uint64_t Offset) const;		LocationList const *getLocationListAtOffset(uint64_t Offset) const;

static Optional<LocationList>		static Expected<LocationList> parseOneLocationList(const DataExtractor &Data,
parseOneLocationList(DataExtractor Data, uint64_t *Offset, unsigned Version);		uint64_t *Offset,
		unsigned Version);
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_DEBUGINFO_DWARF_DWARFDEBUGLOC_H		#endif // LLVM_DEBUGINFO_DWARF_DWARFDEBUGLOC_H

llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp

Show All 22 Lines

using namespace llvm;		using namespace llvm;

// When directly dumping the .debug_loc without a compile unit, we have to guess		// When directly dumping the .debug_loc without a compile unit, we have to guess
// at the DWARF version. This only affects DW_OP_call_ref, which is a rare		// at the DWARF version. This only affects DW_OP_call_ref, which is a rare
// expression that LLVM doesn't produce. Guessing the wrong version means we		// expression that LLVM doesn't produce. Guessing the wrong version means we
// won't be able to pretty print expressions in DWARF2 binaries produced by		// won't be able to pretty print expressions in DWARF2 binaries produced by
// non-LLVM tools.		// non-LLVM tools.
static void dumpExpression(raw_ostream &OS, ArrayRef<char> Data,		static void dumpExpression(raw_ostream &OS, ArrayRef<uint8_t> Data,
bool IsLittleEndian, unsigned AddressSize,		bool IsLittleEndian, unsigned AddressSize,
const MCRegisterInfo *MRI, DWARFUnit *U) {		const MCRegisterInfo *MRI, DWARFUnit *U) {
DWARFDataExtractor Extractor(StringRef(Data.data(), Data.size()),		DWARFDataExtractor Extractor(toStringRef(Data), IsLittleEndian, AddressSize);
IsLittleEndian, AddressSize);
DWARFExpression(Extractor, dwarf::DWARF_VERSION, AddressSize).print(OS, MRI, U);		DWARFExpression(Extractor, dwarf::DWARF_VERSION, AddressSize).print(OS, MRI, U);
}		}

void DWARFDebugLoc::LocationList::dump(raw_ostream &OS, bool IsLittleEndian,		void DWARFDebugLoc::LocationList::dump(raw_ostream &OS, bool IsLittleEndian,
unsigned AddressSize,		unsigned AddressSize,
const MCRegisterInfo *MRI,		const MCRegisterInfo *MRI,
DWARFUnit *U,		DWARFUnit *U,
uint64_t BaseAddress,		uint64_t BaseAddress,
Show All 34 Lines	if (Offset) {
return;		return;
}		}

for (const LocationList &L : Locations) {		for (const LocationList &L : Locations) {
DumpLocationList(L);		DumpLocationList(L);
}		}
}		}

Optional<DWARFDebugLoc::LocationList>		Expected<DWARFDebugLoc::LocationList>
DWARFDebugLoc::parseOneLocationList(DWARFDataExtractor Data, uint64_t *Offset) {		DWARFDebugLoc::parseOneLocationList(const DWARFDataExtractor &Data,
		uint64_t *Offset) {
LocationList LL;		LocationList LL;
LL.Offset = *Offset;		LL.Offset = *Offset;
		DataExtractor::Cursor C(*Offset);

// 2.6.2 Location Lists		// 2.6.2 Location Lists
// A location list entry consists of:		// A location list entry consists of:
while (true) {		while (true) {
Entry E;		Entry E;
if (!Data.isValidOffsetForDataOfSize(*Offset, 2 * Data.getAddressSize())) {
WithColor::error() << "location list overflows the debug_loc section.\n";
return None;
}

// 1. A beginning address offset. ...		// 1. A beginning address offset. ...
E.Begin = Data.getRelocatedAddress(Offset);		E.Begin = Data.getRelocatedAddress(C);

// 2. An ending address offset. ...		// 2. An ending address offset. ...
E.End = Data.getRelocatedAddress(Offset);		E.End = Data.getRelocatedAddress(C);

		if (Error Err = C.takeError())
		return std::move(Err);
// The end of any given location list is marked by an end of list entry,		// The end of any given location list is marked by an end of list entry,
// which consists of a 0 for the beginning address offset and a 0 for the		// which consists of a 0 for the beginning address offset and a 0 for the
// ending address offset.		// ending address offset.
if (E.Begin == 0 && E.End == 0)		if (E.Begin == 0 && E.End == 0) {
		*Offset = C.tell();
return LL;		return LL;

if (!Data.isValidOffsetForDataOfSize(*Offset, 2)) {
WithColor::error() << "location list overflows the debug_loc section.\n";
return None;
}		}

unsigned Bytes = Data.getU16(Offset);		unsigned Bytes = Data.getU16(C);
if (!Data.isValidOffsetForDataOfSize(*Offset, Bytes)) {
WithColor::error() << "location list overflows the debug_loc section.\n";
return None;
}
// A single location description describing the location of the object...		// A single location description describing the location of the object...
StringRef str = Data.getData().substr(*Offset, Bytes);		Data.getU8(C, E.Loc, Bytes);
*Offset += Bytes;
E.Loc.reserve(str.size());
llvm::copy(str, std::back_inserter(E.Loc));
LL.Entries.push_back(std::move(E));		LL.Entries.push_back(std::move(E));
}		}
}		}

void DWARFDebugLoc::parse(const DWARFDataExtractor &data) {		void DWARFDebugLoc::parse(const DWARFDataExtractor &data) {
IsLittleEndian = data.isLittleEndian();		IsLittleEndian = data.isLittleEndian();
AddressSize = data.getAddressSize();		AddressSize = data.getAddressSize();

uint64_t Offset = 0;		uint64_t Offset = 0;
while (data.isValidOffset(Offset + data.getAddressSize() - 1)) {		while (Offset < data.getData().size()) {
if (auto LL = parseOneLocationList(data, &Offset))		if (auto LL = parseOneLocationList(data, &Offset))
Locations.push_back(std::move(*LL));		Locations.push_back(std::move(*LL));
else		else {
		logAllUnhandledErrors(LL.takeError(), WithColor::error());
break;		break;
}		}
if (data.isValidOffset(Offset))		}
WithColor::error() << "failed to consume entire .debug_loc section\n";
}		}

Optional<DWARFDebugLoclists::LocationList>		Expected<DWARFDebugLoclists::LocationList>
DWARFDebugLoclists::parseOneLocationList(DataExtractor Data, uint64_t *Offset,		DWARFDebugLoclists::parseOneLocationList(const DataExtractor &Data,
unsigned Version) {		uint64_t *Offset, unsigned Version) {
LocationList LL;		LocationList LL;
LL.Offset = *Offset;		LL.Offset = *Offset;
		DataExtractor::Cursor C(*Offset);

// dwarf::DW_LLE_end_of_list_entry is 0 and indicates the end of the list.		// dwarf::DW_LLE_end_of_list_entry is 0 and indicates the end of the list.
while (auto Kind =		while (auto Kind = static_cast<dwarf::LocationListEntry>(Data.getU8(C))) {
static_cast<dwarf::LocationListEntry>(Data.getU8(Offset))) {

Entry E;		Entry E;
E.Kind = Kind;		E.Kind = Kind;
switch (Kind) {		switch (Kind) {
case dwarf::DW_LLE_startx_length:		case dwarf::DW_LLE_startx_length:
E.Value0 = Data.getULEB128(Offset);		E.Value0 = Data.getULEB128(C);
// Pre-DWARF 5 has different interpretation of the length field. We have		// Pre-DWARF 5 has different interpretation of the length field. We have
// to support both pre- and standartized styles for the compatibility.		// to support both pre- and standartized styles for the compatibility.
if (Version < 5)		if (Version < 5)
E.Value1 = Data.getU32(Offset);		E.Value1 = Data.getU32(C);
else		else
E.Value1 = Data.getULEB128(Offset);		E.Value1 = Data.getULEB128(C);
break;		break;
case dwarf::DW_LLE_start_length:		case dwarf::DW_LLE_start_length:
E.Value0 = Data.getAddress(Offset);		E.Value0 = Data.getAddress(C);
E.Value1 = Data.getULEB128(Offset);		E.Value1 = Data.getULEB128(C);
break;		break;
case dwarf::DW_LLE_offset_pair:		case dwarf::DW_LLE_offset_pair:
E.Value0 = Data.getULEB128(Offset);		E.Value0 = Data.getULEB128(C);
E.Value1 = Data.getULEB128(Offset);		E.Value1 = Data.getULEB128(C);
break;		break;
case dwarf::DW_LLE_base_address:		case dwarf::DW_LLE_base_address:
E.Value0 = Data.getAddress(Offset);		E.Value0 = Data.getAddress(C);
break;		break;
default:		default:
WithColor::error() << "dumping support for LLE of kind " << (int)Kind		cantFail(C.takeError());
<< " not implemented\n";		return createStringError(errc::illegal_byte_sequence,
return None;		"LLE of kind %x not supported", (int)Kind);
}		}

if (Kind != dwarf::DW_LLE_base_address) {		if (Kind != dwarf::DW_LLE_base_address) {
unsigned Bytes =		unsigned Bytes = Version >= 5 ? Data.getULEB128(C) : Data.getU16(C);
Version >= 5 ? Data.getULEB128(Offset) : Data.getU16(Offset);
// A single location description describing the location of the object...		// A single location description describing the location of the object...
StringRef str = Data.getData().substr(*Offset, Bytes);		Data.getU8(C, E.Loc, Bytes);
*Offset += Bytes;
E.Loc.resize(str.size());
llvm::copy(str, E.Loc.begin());
}		}

LL.Entries.push_back(std::move(E));		LL.Entries.push_back(std::move(E));
}		}
		if (Error Err = C.takeError())
		return std::move(Err);
		*Offset = C.tell();
return LL;		return LL;
}		}

void DWARFDebugLoclists::parse(DataExtractor data, unsigned Version) {		void DWARFDebugLoclists::parse(DataExtractor data, unsigned Version) {
IsLittleEndian = data.isLittleEndian();		IsLittleEndian = data.isLittleEndian();
AddressSize = data.getAddressSize();		AddressSize = data.getAddressSize();

uint64_t Offset = 0;		uint64_t Offset = 0;
while (data.isValidOffset(Offset)) {		while (Offset < data.getData().size()) {
if (auto LL = parseOneLocationList(data, &Offset, Version))		if (auto LL = parseOneLocationList(data, &Offset, Version))
Locations.push_back(std::move(*LL));		Locations.push_back(std::move(*LL));
else		else {
		logAllUnhandledErrors(LL.takeError(), WithColor::error());
return;		return;
}		}
}		}
		}

DWARFDebugLoclists::LocationList const *		DWARFDebugLoclists::LocationList const *
DWARFDebugLoclists::getLocationListAtOffset(uint64_t Offset) const {		DWARFDebugLoclists::getLocationListAtOffset(uint64_t Offset) const {
auto It = partition_point(		auto It = partition_point(
Locations, [=](const LocationList &L) { return L.Offset < Offset; });		Locations, [=](const LocationList &L) { return L.Offset < Offset; });
if (It != Locations.end() && It->Offset == Offset)		if (It != Locations.end() && It->Offset == Offset)
return &(*It);		return &(*It);
return nullptr;		return nullptr;
▲ Show 20 Lines • Show All 59 Lines • Show Last 20 Lines

llvm/trunk/lib/DebugInfo/DWARF/DWARFVerifier.cpp

Show First 20 Lines • Show All 460 Lines • ▼ Show 20 Lines	if (auto SectionOffset = AttrValue.Value.getAsSectionOffset()) {
if (*SectionOffset >= DObj.getLineSection().Data.size())		if (*SectionOffset >= DObj.getLineSection().Data.size())
ReportError("DW_AT_stmt_list offset is beyond .debug_line bounds: " +		ReportError("DW_AT_stmt_list offset is beyond .debug_line bounds: " +
llvm::formatv("{0:x8}", *SectionOffset));		llvm::formatv("{0:x8}", *SectionOffset));
break;		break;
}		}
ReportError("DIE has invalid DW_AT_stmt_list encoding:");		ReportError("DIE has invalid DW_AT_stmt_list encoding:");
break;		break;
case DW_AT_location: {		case DW_AT_location: {
auto VerifyLocationExpr = [&](StringRef D) {		auto VerifyLocationExpr = [&](ArrayRef<uint8_t> D) {
DWARFUnit *U = Die.getDwarfUnit();		DWARFUnit *U = Die.getDwarfUnit();
DataExtractor Data(D, DCtx.isLittleEndian(), 0);		DataExtractor Data(toStringRef(D), DCtx.isLittleEndian(), 0);
DWARFExpression Expression(Data, U->getVersion(),		DWARFExpression Expression(Data, U->getVersion(),
U->getAddressByteSize());		U->getAddressByteSize());
bool Error = llvm::any_of(Expression, [](DWARFExpression::Operation &Op) {		bool Error = llvm::any_of(Expression, [](DWARFExpression::Operation &Op) {
return Op.isError();		return Op.isError();
});		});
if (Error || !Expression.verify(U))		if (Error || !Expression.verify(U))
ReportError("DIE contains invalid DWARF expression:");		ReportError("DIE contains invalid DWARF expression:");
};		};
if (Optional<ArrayRef<uint8_t>> Expr = AttrValue.Value.getAsBlock()) {		if (Optional<ArrayRef<uint8_t>> Expr = AttrValue.Value.getAsBlock()) {
// Verify inlined location.		// Verify inlined location.
VerifyLocationExpr(llvm::toStringRef(*Expr));		VerifyLocationExpr(*Expr);
} else if (auto LocOffset = AttrValue.Value.getAsSectionOffset()) {		} else if (auto LocOffset = AttrValue.Value.getAsSectionOffset()) {
// Verify location list.		// Verify location list.
if (auto DebugLoc = DCtx.getDebugLoc())		if (auto DebugLoc = DCtx.getDebugLoc())
if (auto LocList = DebugLoc->getLocationListAtOffset(*LocOffset))		if (auto LocList = DebugLoc->getLocationListAtOffset(*LocOffset))
for (const auto &Entry : LocList->Entries)		for (const auto &Entry : LocList->Entries)
VerifyLocationExpr(Entry.Loc);		VerifyLocationExpr(Entry.Loc);
}		}
break;		break;
▲ Show 20 Lines • Show All 781 Lines • ▼ Show 20 Lines	unsigned DWARFVerifier::verifyNameIndexEntries(
return NumErrors;		return NumErrors;
}		}

static bool isVariableIndexable(const DWARFDie &Die, DWARFContext &DCtx) {		static bool isVariableIndexable(const DWARFDie &Die, DWARFContext &DCtx) {
Optional<DWARFFormValue> Location = Die.findRecursively(DW_AT_location);		Optional<DWARFFormValue> Location = Die.findRecursively(DW_AT_location);
if (!Location)		if (!Location)
return false;		return false;

auto ContainsInterestingOperators = [&](StringRef D) {		auto ContainsInterestingOperators = [&](ArrayRef<uint8_t> D) {
DWARFUnit *U = Die.getDwarfUnit();		DWARFUnit *U = Die.getDwarfUnit();
DataExtractor Data(D, DCtx.isLittleEndian(), U->getAddressByteSize());		DataExtractor Data(toStringRef(D), DCtx.isLittleEndian(), U->getAddressByteSize());
DWARFExpression Expression(Data, U->getVersion(), U->getAddressByteSize());		DWARFExpression Expression(Data, U->getVersion(), U->getAddressByteSize());
return any_of(Expression, [](DWARFExpression::Operation &Op) {		return any_of(Expression, [](DWARFExpression::Operation &Op) {
return !Op.isError() && (Op.getCode() == DW_OP_addr ||		return !Op.isError() && (Op.getCode() == DW_OP_addr ||
Op.getCode() == DW_OP_form_tls_address \|\|		Op.getCode() == DW_OP_form_tls_address \|\|
Op.getCode() == DW_OP_GNU_push_tls_address);		Op.getCode() == DW_OP_GNU_push_tls_address);
});		});
};		};

if (Optional<ArrayRef<uint8_t>> Expr = Location->getAsBlock()) {		if (Optional<ArrayRef<uint8_t>> Expr = Location->getAsBlock()) {
// Inlined location.		// Inlined location.
if (ContainsInterestingOperators(toStringRef(*Expr)))		if (ContainsInterestingOperators(*Expr))
return true;		return true;
} else if (Optional<uint64_t> Offset = Location->getAsSectionOffset()) {		} else if (Optional<uint64_t> Offset = Location->getAsSectionOffset()) {
// Location list.		// Location list.
if (const DWARFDebugLoc *DebugLoc = DCtx.getDebugLoc()) {		if (const DWARFDebugLoc *DebugLoc = DCtx.getDebugLoc()) {
if (const DWARFDebugLoc::LocationList *LocList =		if (const DWARFDebugLoc::LocationList *LocList =
DebugLoc->getLocationListAtOffset(*Offset)) {		DebugLoc->getLocationListAtOffset(*Offset)) {
if (any_of(LocList->Entries, [&](const DWARFDebugLoc::Entry &E) {		if (any_of(LocList->Entries, [&](const DWARFDebugLoc::Entry &E) {
return ContainsInterestingOperators(E.Loc);		return ContainsInterestingOperators(E.Loc);
... (187 lines not shown) ...

llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loc-error-cases.s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE1=0 -o %t1.o
				# RUN: llvm-dwarfdump -debug-loc %t1.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE2=0 -o %t2.o
				# RUN: llvm-dwarfdump -debug-loc %t2.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE3=0 -o %t3.o
				# RUN: llvm-dwarfdump -debug-loc %t3.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE4=0 -o %t4.o
				# RUN: llvm-dwarfdump -debug-loc %t4.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE5=0 -o %t5.o
				# RUN: llvm-dwarfdump -debug-loc %t5.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE6=0 -o %t6.o
				# RUN: llvm-dwarfdump -debug-loc %t6.o 2>&1 | FileCheck %s

				# CHECK: error: unexpected end of data

				.section .debug_loc,"",@progbits
				.ifdef CASE1
				.byte 1 # bogus
				.endif
				.ifdef CASE2
				.long 0 # starting offset
				.endif
				.ifdef CASE3
				.long 0 # starting offset
				.long 1 # ending offset
				.endif
				.ifdef CASE4
				.long 0 # starting offset
				.long 1 # ending offset
				.word 0 # Loc expr size
				.endif
				.ifdef CASE5
				.long 0 # starting offset
				.long 1 # ending offset
				.word 0 # Loc expr size
				.long 0 # starting offset
				.endif
				.ifdef CASE6
				.long 0 # starting offset
				.long 1 # ending offset
				.word 0xffff # Loc expr size
				.endif

				# A minimal compile unit is needed to deduce the address size of the location
				# lists
				.section .debug_info,"",@progbits
				.long .Lcu_end0-.Lcu_begin0 # Length of Unit
				.Lcu_begin0:
				.short 4 # DWARF version number
				.long 0 # Offset Into Abbrev. Section
				.byte 8 # Address Size (in bytes)
				.byte 0 # End Of Children Mark
				.Lcu_end0:
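For readers who want to see why every case in the test above ends in "unexpected end of data", here is a minimal, standalone sketch of a bounds-checked parser for a DWARF v4 style location list. It is not the code from this revision and does not use LLVM's DataExtractor or Cursor API; the names `LocEntry`, `LocListResult`, `readLE` and `parseLocList` are hypothetical, and base address selection entries are ignored for brevity. The point it illustrates is the one from the summary: a truncated list must yield an error, never a partially parsed list that looks valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not LLVM's DWARFDebugLoc API.
struct LocEntry {
  uint64_t Begin;
  uint64_t End;
  std::vector<uint8_t> Expr;
};

struct LocListResult {
  std::vector<LocEntry> Entries; // empty whenever Error is set
  std::string Error;             // empty on success
};

// Reads a Size-byte little-endian integer at Offset. On success advances
// Offset and returns true; if fewer than Size bytes remain, returns false
// and leaves Offset untouched.
static bool readLE(const std::vector<uint8_t> &Data, size_t &Offset,
                   size_t Size, uint64_t &Out) {
  assert(Offset <= Data.size() && Size <= 8);
  if (Size > Data.size() - Offset)
    return false;
  Out = 0;
  for (size_t I = 0; I < Size; ++I)
    Out |= static_cast<uint64_t>(Data[Offset + I]) << (8 * I);
  Offset += Size;
  return true;
}

// Parses one DWARF v4 style location list: (begin, end, 2-byte length,
// expression bytes) entries terminated by a (0, 0) pair. Base address
// selection entries are deliberately not handled in this sketch.
static LocListResult parseLocList(const std::vector<uint8_t> &Data,
                                  size_t AddrSize) {
  LocListResult Result;
  size_t Offset = 0;
  for (;;) {
    uint64_t Begin = 0, End = 0, Len = 0;
    bool Ok = readLE(Data, Offset, AddrSize, Begin) &&
              readLE(Data, Offset, AddrSize, End);
    if (Ok && Begin == 0 && End == 0)
      return Result; // end-of-list marker reached: the list is complete
    Ok = Ok && readLE(Data, Offset, 2, Len) && Len <= Data.size() - Offset;
    if (!Ok) {
      // Discard entries parsed so far so that truncated input can never be
      // mistaken for a valid (shorter) list.
      Result.Entries.clear();
      Result.Error = "unexpected end of data";
      return Result;
    }
    LocEntry E{Begin, End,
               std::vector<uint8_t>(Data.begin() + Offset,
                                    Data.begin() + Offset + Len)};
    Result.Entries.push_back(E);
    Offset += Len;
  }
}
```

With a 4-byte address size, the byte sequence for one complete entry with no terminator (the shape of CASE4) produces the error, while the same entry followed by eight zero bytes parses as a one-entry list.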

llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists-error-cases.s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE1=0 -o %t1.o
				# RUN: llvm-dwarfdump -debug-loclists %t1.o 2>&1 | FileCheck %s --check-prefix=ULEB

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE2=0 -o %t2.o
				# RUN: llvm-dwarfdump -debug-loclists %t2.o 2>&1 | FileCheck %s --check-prefix=ULEB

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE3=0 -o %t3.o
				# RUN: llvm-dwarfdump -debug-loclists %t3.o 2>&1 | FileCheck %s --check-prefix=ULEB

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE4=0 -o %t4.o
				# RUN: llvm-dwarfdump -debug-loclists %t4.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE5=0 -o %t5.o
				# RUN: llvm-dwarfdump -debug-loclists %t5.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE6=0 -o %t6.o
				# RUN: llvm-dwarfdump -debug-loclists %t6.o 2>&1 | FileCheck %s

				# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux --defsym CASE7=0 -o %t7.o
				# RUN: llvm-dwarfdump -debug-loclists %t7.o 2>&1 | FileCheck %s --check-prefix=UNIMPL

				# CHECK: error: unexpected end of data
				# ULEB: error: malformed uleb128, extends past end
				# UNIMPL: error: LLE of kind 47 not supported

				.section .debug_loclists,"",@progbits
				.long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0
				.Ldebug_loclist_table_start0:
				.short 5 # Version.
				.byte 8 # Address size.
				.byte 0 # Segment selector size.
				.long 0 # Offset entry count.
				.Lloclists_table_base0:
				.Ldebug_loc0:
				.ifdef CASE1
				.byte 4 # DW_LLE_offset_pair
				.endif
				.ifdef CASE2
				.byte 4 # DW_LLE_offset_pair
				.uleb128 0x0 # starting offset
				.endif
				.ifdef CASE3
				.byte 4 # DW_LLE_offset_pair
				.uleb128 0x0 # starting offset
				.uleb128 0x10 # ending offset
				.endif
				.ifdef CASE4
				.byte 4 # DW_LLE_offset_pair
				.uleb128 0x0 # starting offset
				.uleb128 0x10 # ending offset
				.byte 1 # Loc expr size
				.endif
				.ifdef CASE5
				.byte 4 # DW_LLE_offset_pair
				.uleb128 0x0 # starting offset
				.uleb128 0x10 # ending offset
				.byte 1 # Loc expr size
				.byte 117 # DW_OP_breg5
				.endif
				.ifdef CASE6
				.byte 4 # DW_LLE_offset_pair
				.uleb128 0x0 # starting offset
				.uleb128 0x10 # ending offset
				.uleb128 0xdeadbeef # Loc expr size
				.endif
				.ifdef CASE7
				.byte 0x47
				.endif

				.Ldebug_loclist_table_end0:
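The ULEB cases above (CASE1 to CASE3) stop the data in a position where the parser expects another ULEB128 field. As a rough illustration of what "malformed uleb128, extends past end" means, the following standalone sketch decodes a ULEB128 value with an explicit bounds check. The function name `readULEB128` and its signature are assumptions made for this example; it is not LLVM's decoder and not the code from this revision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only. Decodes one ULEB128 value
// starting at Offset. On success stores the value in Out, advances Offset
// past the last byte consumed, and returns true. Returns false and leaves
// Offset untouched if the data ends before a byte without the continuation
// bit is seen, or if the value does not fit in 64 bits.
static bool readULEB128(const std::vector<uint8_t> &Data, size_t &Offset,
                        uint64_t &Out) {
  assert(Offset <= Data.size());
  uint64_t Value = 0;
  unsigned Shift = 0;
  for (size_t Pos = Offset; Pos < Data.size(); ++Pos) {
    uint8_t Byte = Data[Pos];
    uint64_t Slice = Byte & 0x7f; // low 7 bits carry the payload
    // Reject encodings whose payload would be shifted out of 64 bits.
    if (Shift >= 64 || ((Slice << Shift) >> Shift) != Slice)
      return false;
    Value |= Slice << Shift;
    if ((Byte & 0x80) == 0) { // continuation bit clear: last byte
      Out = Value;
      Offset = Pos + 1;
      return true;
    }
    Shift += 7;
  }
  // Ran off the end of the buffer while still expecting more bytes.
  return false;
}
```

Leaving `Offset` untouched on failure is a design choice in this sketch: the caller can report the position where the bad field started rather than wherever decoding happened to stop.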
