This is an archive of the discontinued LLVM Phabricator instance.

Differential D73618

[DebugInfo] Check that we do not run past a line table end when parsing
Abandoned · Public

Authored by jhenderson on Jan 29 2020, 3:41 AM.

Details

Reviewers

dblaikie
JDevlieghere
ikudrin
probinson
MaskRay

Summary

This change adds another check to the debug line table parsing code, namely to make sure the final offset after parsing a table matches the end offset as expected based on the unit length field. These two won't match if the unit length points to part way through an opcode, as the parsing does not stop midway through opcodes.

If the problem is detected, the offset will be reset to the expected value and an error reported.
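
The check described above can be illustrated with a minimal sketch. It is not the patch itself: the toy encoding (one length byte followed by that many operand bytes) and the names `ToyParseResult` and `parseToyTable` are assumptions made up for this example. It only shows the shape of the logic: the loop tests the boundary between opcodes, so the final offset has to be compared against the expected end afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy encoding (NOT real DWARF): every "opcode" is one length
// byte N followed by N operand bytes.
struct ToyParseResult {
  uint64_t FinalOffset = 0;        // Offset handed back to the caller.
  std::vector<std::string> Errors; // Recoverable errors that were reported.
};

ToyParseResult parseToyTable(const std::vector<uint8_t> &Section,
                             uint64_t Start, uint64_t EndOffset) {
  ToyParseResult Result;
  uint64_t Offset = Start;
  // The boundary is only tested between opcodes, so an opcode that starts
  // before EndOffset can still finish after it.
  while (Offset < EndOffset && Offset < Section.size())
    Offset += 1 + Section[Offset];
  if (Offset > EndOffset) {
    Result.Errors.push_back(
        "last opcode ended at offset " + std::to_string(Offset) +
        " which is past the expected table end at offset " +
        std::to_string(EndOffset));
    Offset = EndOffset; // Recover by resetting to the expected end.
  }
  Result.FinalOffset = Offset;
  return Result;
}
```

With the bytes `{1, 0xaa, 2, 0xbb, 0xcc, 0xdd}` and an expected end of 4, the second toy opcode starts at offset 2 and finishes at offset 5, so one error is recorded and the offset is reset to 4.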

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhenderson created this revision. Jan 29 2020, 3:41 AM

Herald added a project: Restricted Project. Jan 29 2020, 3:41 AM

Herald added a subscriber: hiraditya.

ikudrin added inline comments. Jan 30 2020, 2:50 AM

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
924	I guess that the comment is not 100% accurate. Let's imagine the following purely illustrative and meaningless sequence at the end of the section: `0`, `5`, `DW_LNE_set_address`, `0`, `1`, `DW_LNE_end_sequence`. The extractor will not read an argument of `DW_LNE_set_address` and will not increment the offset, but after that, something which looks like a correct termination will be read.
927	Maybe it is more consistent to put this code before the previous check. I mean, this check is comparatively low level, and if it triggers, the input is probably corrupted so deeply that it makes no sense to interpret it, no?

jhenderson marked 2 inline comments as done. Jan 30 2020, 3:08 AM

jhenderson added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
924	In your example, the set address opcode would consume the three bytes comprising the end sequence opcode, followed by reading a single zero value for the final byte. The offset would be adjusted to the end of the three readable bytes (thus going past the end of the DW_LNE_end_sequence opcode). The parser would then stop as `*OffsetPtr == EndOffset`. This warning wouldn't then be triggered but a no end sequence one would be. Would changing this statement "As the offset isn't incremented by the data extractor when reading past the end of the available data" to "As the offset isn't incremented by the data extractor past the end of the available data" be clearer?
927	Just to be clear, are you saying that we shouldn't report the no end sequence error if we hit this case? That's fine with me. If I did that, I'd probably pull the sort below to before this check, as that would allow me to simply return after the error has been reported, whilst still allowing clients to use the data, if they want to.
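
The extractor behaviour this exchange relies on (a read that runs past the available data yields zero and leaves the offset alone, as the patch's code comment puts it) can be sketched as follows. `MiniExtractor` is a hypothetical miniature written for illustration, not LLVM's `DataExtractor`, and it does not model the extra offset fix-up at the end of the `Opcode == 0` branch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical miniature extractor, not LLVM's DataExtractor.
struct MiniExtractor {
  std::vector<uint8_t> Data;

  // Reads Size (1 to 8) little-endian bytes starting at *Offset. If fewer
  // than Size bytes remain, returns 0 and leaves *Offset unchanged.
  uint64_t getUnsigned(uint64_t *Offset, unsigned Size) const {
    if (*Offset > Data.size() || Size > Data.size() - *Offset)
      return 0;
    uint64_t Value = 0;
    for (unsigned I = 0; I < Size; ++I)
      Value |= static_cast<uint64_t>(Data[*Offset + I]) << (8 * I);
    *Offset += Size;
    return Value;
  }
};
```

Because a failed read does not move the offset, a caller cannot tell from the offset alone that an 8-byte operand was requested when only 3 bytes were left, which is why a plain `*OffsetPtr > EndOffset` comparison cannot catch an overflowing opcode at the very end of the section.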

ikudrin added inline comments. Jan 30 2020, 4:44 AM

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
924	Ah, yes, I forgot about the special code which fixes `OffsetPtr` at the end of the branch for `Opcode == 0`. Well, I cannot imagine another malicious sequence. It is sad that the whole code is not straightforward and requires a long explanatory comment. I don't insist on any specific wording, but maybe it is worth adding a note about the code at the end of the `Opcode == 0` branch, which I missed.
927	Right, I believe that we do not need to report the unterminated sequence if we have already reported the incorrect offset. The latter is more fundamental, to my taste.
llvm/unittests/DebugInfo/DWARF/DwarfGenerator.cpp
178	Do I understand it right that this fix along with the corresponding changes in other places may be extracted into a separate patch?

@labath - maybe some other parts of the DWARF parsing that could benefit from a constrained DWARFDataExtractor

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
39	Should this dump in verbose mode to show more clearly which operations were parsed and which ones were not? So they match up with the LNE_* descriptions in the comments in the test?
185	(similar to other comment) - this warning sounds problematic to me & any chunk of debug info that has a specified length shouldn't be read beyond that length (as if the section itself ended at the end of the length - we should get the same error messages in both those cases)
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
568–569	This phrasing doesn't match up with not reading past the end of a specified length - I'd expect something more like "last opcode in line table at offset 0x..1 is incomplete/truncated at offset 0x...34" or the like. Maybe with "expected to extend to 0x35" if we know that some localized implied/expressed length extends beyond some broader length that was specified.
794–796	Could you explain this further? What's incorrect about the existing usage (where is the opcode length field? Is that the 0x2 in the line above? Why would it be too short? (should the DWARFGenerator API be changed? is it computing a length that's too short for the table?))

In D73618#1850803, @dblaikie wrote:

@labath - maybe some other parts of the DWARF parsing that could benefit from a constrained DWARFDataExtractor

I think that pretty much everything would benefit from a data extractor constrained in this way. Prefixing the content with length is used in nearly every dwarf section, and so in theory, everything should be checking that it does not cross the specified length. I've seen code which attempts to do that via something like `while(!endReached() && data.isValidOffset(*Offset) && *Offset < EndOffset) parseOneThing(Offset)`, but that is:
a) complicated
b) probably incorrect, because the end boundary is only checked at the end of the parseOneThing call, so we can still cross that boundary if the "one thing" is sitting on both sides of the boundary

If we had a "constrained" data extractor, then we wouldn't need the *Offset < EndOffset check, because the extractor would check that for us (and it would do that _everywhere_). It would also allow us to treat the "'thing' crosses a contribution boundary, but there is another contribution after it" and "'thing' crosses a contribution boundary, but hits the end of the section" cases uniformly, because as far as the code would be concerned, everything would be at the end of the section.
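
A rough sketch of the "constrained" extractor idea, under stated assumptions: `BoundedReader` and its `read` method are hypothetical names for this illustration and say nothing about how the real `DWARFDataExtractor` change was eventually shaped. The point is that the reader is built with the end of the contribution as its limit, so a read that would cross that limit fails in exactly the same way as a read that would cross the end of the section.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bounded reader: it never looks past Limit, whether Limit is
// the end of one contribution or the end of the whole section.
class BoundedReader {
  const uint8_t *Bytes;
  uint64_t Limit;

public:
  BoundedReader(const std::vector<uint8_t> &Section, uint64_t ContributionEnd)
      : Bytes(Section.data()),
        Limit(std::min<uint64_t>(ContributionEnd, Section.size())) {}

  // Reads Size (1 to 8) little-endian bytes. On failure, Offset and Value
  // are left untouched and false is returned.
  bool read(uint64_t &Offset, unsigned Size, uint64_t &Value) const {
    if (Offset > Limit || Size > Limit - Offset)
      return false;
    uint64_t Result = 0;
    for (unsigned I = 0; I < Size; ++I)
      Result |= static_cast<uint64_t>(Bytes[Offset + I]) << (8 * I);
    Value = Result;
    Offset += Size;
    return true;
  }
};
```

With this shape the parsing loop needs no separate `*Offset < EndOffset` test: a 2-byte read at offset 2 of a contribution ending at offset 3 is rejected even though more section data follows, just as a 2-byte read at the last byte of the section would be.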

In D73618#1851154, @labath wrote:

In D73618#1850803, @dblaikie wrote:

@labath - maybe some other parts of the DWARF parsing that could benefit from a constrained DWARFDataExtractor

I think that pretty much everything would benefit from a data extractor constrained in this way. Prefixing the content with length is used in nearly every dwarf section, and so in theory, everything should be checking that it does not cross the specified length. I've seen code which attempts to do that via something like `while(!endReached() && data.isValidOffset(*Offset) && *Offset < EndOffset) parseOneThing(Offset)`, but that is:
a) complicated
b) probably incorrect, because the end boundary is only checked at the end of the parseOneThing call, so we can still cross that boundary if the "one thing" is sitting on both sides of the boundary

If we had a "constrained" data extractor, then we wouldn't need the *Offset < EndOffset check, because the extractor would check that for us (and it would do that _everywhere_). It would also allow us to treat the "'thing' crosses a contribution boundary, but there is another contribution after it" and "'thing' crosses a contribution boundary, but hits the end of the section" cases uniformly, because as far as the code would be concerned, everything would be at the end of the section.

Ok, I think I now understand (after reading the other emails) that you were letting me know of the use cases, and not asking me to provide some. :)

Given the proposed changes to the DWARFDataExtractor to limit it to a certain region, I think this and similar changes I have in the pipeline should probably be shelved, as the behaviour will change and the error messages will need updating. I'll focus on other areas instead in the meantime.

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
39	This last check is just for the last (good) section, so doesn't really need to show things in that much detail. However, I could certainly see a benefit for checking things more verbosely. We already do verbose dumps on line 29 of the test, but the checks don't check the operands. I'll look at updating that, but it should be a separate change.
185	That's fair. Saying it's been truncated at this point is somewhat lying, but would be correct once we switch to the limited data extractor.
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
568–569	I think your proposal makes sense with the proposed data extractor changes, but possibly not before. See my out-of-line comment.
794–796	`addExtendedOpcode` takes a length for an extended opcode, followed by the opcode itself, and then data for the operands. I realise this byte should be moved into the operands argument, so I'll fix that. I don't think the API here needs changing, as it would prevent us creating broken situations like this (i.e. an extended opcode without sufficient data for its claimed length).
llvm/unittests/DebugInfo/DWARF/DwarfGenerator.cpp
178	It probably can be. It's a requirement for this patch, but doesn't need to be a part of the same commit. I'll look at splitting it out.

dblaikie added inline comments. Feb 3 2020, 5:12 PM

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
39	Oh, I figured this change related to the other change in this review (Case 13) which wasn't super clear to me & figured verbosity might help.
185	Right - I meant that the warning implies the implementation isn't what I'd hope/expect (& that both message and implementation should be fixed in the direction of not reading more than the prefixed length indicates for any given entity).
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
568–569	Right right - implementation and messaging should be fixed in that direction.
794–796	What's the mention of the "table end" here, then? I guess "table end" is indicated by the table length field elsewhere? Is it autogenerated (with a possible override for invalid situations)? Should it be autogenerated differently?

jhenderson marked 2 inline comments as done. Feb 4 2020, 1:01 AM

jhenderson added inline comments.

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
39	It's only indirectly related. Case 13 required adding new data to the input, which pushed the last table later, meaning the offset needed updating.
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
794–796	Yeah, the table length is auto-generated, but it's auto-generated based on the raw data we add (i.e. not the interpretation of the data, but the physical number of bytes we add). See `writeDefaultPrologue` in DwarfGenerator.cpp. The ability to override the length is provided by the `createBasicPrologue` function which returns a line table prologue that can be updated for use in the test. See e.g. `ErrorForTooLargePrologueLength`. I think this is a reasonable approach. In this test case, we are overriding the DW_LNE_end_sequence length with something different, by specifying the "2" argument. Previously, by not specifying any data after it in the test, the majority of the bytes covered by this size were past the end of the table, which is itself an interesting test case, albeit not the focus of this test. Consequently it was triggering the new error, which I didn't want. When I next update this patch, I'll replace this and the previous line of code and comment with: `LT.addExtendedOpcode(0x2, DW_LNE_end_sequence, {{0xaa, LineTable::Byte}});`
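
To make the byte layout under discussion concrete, here is a small sketch of how an extended opcode is laid out: a zero byte, a ULEB128 length covering the opcode byte and its operands, the opcode byte, then the operand bytes. `appendULEB128` and `encodeExtendedOpcode` are hypothetical helpers written for this note, not the test generator's API. As with the `addExtendedOpcode` call described above, the claimed length is passed in separately so it can deliberately disagree with the data that follows.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends Value in unsigned LEB128 form (7 bits per byte, low bits first,
// top bit set on every byte except the last).
void appendULEB128(std::vector<uint8_t> &Out, uint64_t Value) {
  do {
    uint8_t Byte = static_cast<uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

// Hypothetical helper: emits 0x00, the claimed length, the opcode and the
// operands. ClaimedLength is NOT validated against the operand count, so a
// malformed opcode can be produced on purpose.
std::vector<uint8_t> encodeExtendedOpcode(uint64_t ClaimedLength,
                                          uint8_t Opcode,
                                          const std::vector<uint8_t> &Operands) {
  std::vector<uint8_t> Out;
  Out.push_back(0x00); // Marks an extended opcode.
  appendULEB128(Out, ClaimedLength);
  Out.push_back(Opcode);
  Out.insert(Out.end(), Operands.begin(), Operands.end());
  return Out;
}
```

A well-formed `DW_LNE_end_sequence` (opcode value 0x01) has length 1 and no operands, giving the bytes `0, 1, 1` seen in the assembly input. Claiming length 2 and supplying one operand byte `0xaa` keeps all claimed bytes inside the table while still mismatching the expected length.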

Abandoning in favour of D80796 and related patches.

Herald added a subscriber: cmtice. Jun 9 2020, 1:58 AM

Revision Contents

Path	Size
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp	19 lines
llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s	29 lines
llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test	14 lines
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp	53 lines
llvm/unittests/DebugInfo/DWARF/DwarfGenerator.cpp	1 line

Diff 241098

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp

[... 910 lines not shown; hunk is in: Error DWARFDebugLine::LineTable::parse( ...]

   if (!State.Sequence.Empty)
     RecoverableErrorCallback(createStringError(
         errc::illegal_byte_sequence,
         "last sequence in debug line table at offset 0x%8.8" PRIx64
         " is not terminated",
         DebugLineOffset));

+  // Check that we haven't gone past the expected end of the table. This can
+  // only happen when an opcode starts before the expected end, and finishes
+  // after. As the offset isn't incremented by the data extractor when
+  // reading past the end of the available data, this will not detect cases
+  // where the overflowing opcode appears at the end of the section. However, in
+  // such cases, an unterminated sequence error will be raised instead, as the
+  // data extractor will return zeroes for the trailing bytes (which do not
+  // correspond to an end sequence instruction).
+  if (*OffsetPtr > EndOffset) {
+    RecoverableErrorCallback(createStringError(
+        errc::invalid_argument,
+        "last opcode in line table at offset 0x%8.8" PRIx64
+        " ended at offset 0x%8.8" PRIx64
+        " which is past the expected table end at offset 0x%8.8" PRIx64,
+        DebugLineOffset, *OffsetPtr, EndOffset));
+    // Recover from this error by resetting the offset back to the expected end.
+    *OffsetPtr = EndOffset;
+  }
+
   // Sort all sequences so that address lookup will work faster.
   if (!Sequences.empty()) {
     llvm::sort(Sequences, Sequence::orderByHighPC);
     // Note: actually, instruction address ranges of sequences should not
     // overlap (in shared objects and executables). If they do, the address
     // lookup would still work, though, but result would be ambiguous.
     // We don't report warning in this case. For example,
     // sometimes .so compiled from multiple object files contains a few

[... 284 lines not shown ...]

llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s

[... 377 lines not shown ...]

 .asciz "xyz" # File name | 3 special opcodes + DW_LNE_set_address start
 .byte 9 # MD5 hash value | DW_LNE_set_address length
 # Header end
 .byte 2 # DW_LNE_set_address opcode
 .quad 0x4321432143214321
 .byte 0, 1, 1 # DW_LNE_end_sequence
 .Linvalid_md5_end1:

+# Opcode extends past the end of the table, as claimed by the unit length field.
+.long .Lshort_unit_length_end - .Lshort_unit_length_start # Length of Unit
+.Lshort_unit_length_start:
+.short 4 # DWARF version number
+.long .Lprologue_short_unit_length_end-.Lprologue_short_unit_length_start # Length of Prologue
+.Lprologue_short_unit_length_start:
+.byte 1 # Minimum Instruction Length
+.byte 1 # Maximum Operations per Instruction
+.byte 1 # Default is_stmt
+.byte -5 # Line Base
+.byte 14 # Line Range
+.byte 13 # Opcode Base
+.byte 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 # Standard Opcode Lengths
+.asciz "dir1" # Include table
+.asciz "dir2"
+.byte 0
+.asciz "file1" # File table
+.byte 0, 0, 0
+.asciz "file2"
+.byte 1, 0, 0
+.byte 0
+.Lprologue_short_unit_length_end:
+.byte 0, 9, 2 # DW_LNE_set_address
+.quad 0xfeedfeed
+.byte 1 # DW_LNS_copy
+.byte 0, 9, 2 # DW_LNE_set_address
+.long 0xf001 # Truncated address (should be 8 bytes)
+.Lshort_unit_length_end:
+
 # Trailing good section.
 .long .Lunit_good_end - .Lunit_good_start # Length of Unit (DWARF-32 format)
 .Lunit_good_start:
 .short 4 # DWARF version number
 .long .Lprologue_good_end-.Lprologue_good_start # Length of Prologue
 .Lprologue_good_start:
 .byte 1 # Minimum Instruction Length
 .byte 1 # Maximum Operations per Instruction

[... 18 lines not shown ...]

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test

[... 30 lines not shown ...]

 # RUN: FileCheck %s --input-file=%t-malformed-verbose.err --check-prefixes=ALL,OTHER

 ## We should still produce warnings for malformed tables after the specified unit.
 # RUN: llvm-dwarfdump -debug-line=0 %t-malformed.o 2> %t-malformed-off-first.err \
 # RUN: | FileCheck %s --check-prefixes=FIRST,NOLATER
 # RUN: FileCheck %s --input-file=%t-malformed-off-first.err --check-prefix=ALL

 ## Don't stop looking for the later unit if non-fatal issues are found.
-# RUN: llvm-dwarfdump -debug-line=0x2ec %t-malformed.o 2> %t-malformed-off-last.err \
+# RUN: llvm-dwarfdump -debug-line=0x339 %t-malformed.o 2> %t-malformed-off-last.err \
 # RUN: | FileCheck %s --check-prefix=LAST --implicit-check-not='debug_line[{{.*}}]'
 # RUN: FileCheck %s --input-file=%t-malformed-off-last.err --check-prefix=ALL

 # FIRST: debug_line[0x00000000]
 # FIRST: 0x000000000badbeef {{.*}} end_sequence
 # NOFIRST-NOT: debug_line[0x00000000]
 # NOFIRST-NOT: 0x000000000badbeef {{.*}} end_sequence
 # NOLATER-NOT: debug_line[{{.*}}]

[... 97 lines not shown ...]

 ## been read before the MD5 problem is identified.
 # NONFATAL: debug_line[0x000002ae]
 # NONFATAL-NEXT: Line table prologue
 # NONFATAL: include_directories[ 0] = "/tmp"
 # NONFATAL-NOT: file_names
 # NONFATAL: 0x0000000000000000 {{.*}} epilogue_begin
 # NONFATAL: 0x4321432143214321 {{.*}} is_stmt end_sequence

-# LAST: debug_line[0x000002ec]
+## Case 13: Last operand goes past unit end.
+# NONFATAL: debug_line[0x000002ec]
+# NONFATAL-NEXT: Line table prologue
+# NONFATAL: 0x00000000feedfeed {{.*}} is_stmt
+## The truncated row is not added by any operation, so is not recorded in the
+## table.
+# NONFATAL-NOT: is_stmt
+
+# LAST: debug_line[0x00000339]
 # LAST: 0x00000000cafebabe {{.*}} end_sequence

 # RESERVED: warning: parsing line table prologue at offset 0x00000048 unsupported reserved unit length found of value 0xfffffffe

 # ALL-NOT: warning:
 # ALL: warning: parsing line table prologue at offset 0x00000048 found unsupported version 0x00
 # ALL-NEXT: warning: parsing line table prologue at offset 0x0000004e found unsupported version 0x01
 # ALL-NEXT: warning: parsing line table prologue at 0x00000054 found an invalid directory or file table description at 0x00000073
 # ALL-NEXT: warning: failed to parse entry content descriptions because no path was found
 # ALL-NEXT: warning: parsing line table prologue at 0x00000081 should have ended at 0x000000b9 but it ended at 0x000000ba
 # ALL-NEXT: warning: parsing line table prologue at 0x000000c8 should have ended at 0x00000103 but it ended at 0x00000102
 # OTHER-NEXT: warning: unexpected line op length at offset 0x00000158 expected 0x02 found 0x01
 # OTHER-NEXT: warning: unexpected line op length at offset 0x0000015c expected 0x01 found 0x02
 # OTHER-NEXT: warning: last sequence in debug line table at offset 0x0000016c is not terminated
 # ALL-NEXT: warning: parsing line table prologue at 0x000001b2 should have ended at 0x000001ce but it ended at 0x000001e1
 # ALL-NEXT: warning: parsing line table prologue at 0x000001ee should have ended at 0x00000219 but it ended at 0x00000220
 # ALL-NEXT: warning: parsing line table prologue at 0x0000022f should have ended at 0x00000251 but it ended at 0x0000025e
 # ALL-NEXT: warning: parsing line table prologue at 0x0000026b found an invalid directory or file table description at 0x0000029f
 # ALL-NEXT: warning: failed to parse file entry because the MD5 hash is invalid
 # ALL-NEXT: warning: parsing line table prologue at 0x000002ae found an invalid directory or file table description at 0x000002e0
 # ALL-NEXT: warning: failed to parse file entry because the MD5 hash is invalid
 # ALL-NEXT: warning: parsing line table prologue at 0x000002ae should have ended at 0x000002d9 but it ended at 0x000002e0
+# OTHER-NEXT: warning: last sequence in debug line table at offset 0x000002ec is not terminated
+# OTHER-NEXT: warning: last opcode in line table at offset 0x000002ec ended at offset 0x0000033d which is past the expected table end at offset 0x00000339
 # ALL-NOT: warning:

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp

Show First 20 Lines • Show All 372 Lines • ▼ Show 20 Lines
TEST_P(DebugLineParameterisedFixture, ErrorForTooLargePrologueLength) {		TEST_P(DebugLineParameterisedFixture, ErrorForTooLargePrologueLength) {
if (!setupGenerator(Version))		if (!setupGenerator(Version))
return;		return;

SCOPED_TRACE("Checking Version " + std::to_string(Version) + ", Format " +		SCOPED_TRACE("Checking Version " + std::to_string(Version) + ", Format " +
(Format == DWARF64 ? "DWARF64" : "DWARF32"));		(Format == DWARF64 ? "DWARF64" : "DWARF32"));

LineTable &LT = Gen->addLineTable(Format);		LineTable &LT = Gen->addLineTable(Format);
		LT.addByte(0xaa);
DWARFDebugLine::Prologue Prologue = LT.createBasicPrologue();		DWARFDebugLine::Prologue Prologue = LT.createBasicPrologue();
++Prologue.PrologueLength;		++Prologue.PrologueLength;
LT.setPrologue(Prologue);		LT.setPrologue(Prologue);

generate();		generate();

auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 0, *Context,		auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 0, *Context,
nullptr, RecordRecoverable);		nullptr, RecordRecoverable);
ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());		ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());
DWARFDebugLine::LineTable Result(**ExpectedLineTable);		DWARFDebugLine::LineTable Result(**ExpectedLineTable);
// Undo the earlier modification so that it can be compared against a		// Undo the earlier modification so that it can be compared against a
// "default" prologue.		// "default" prologue.
--Result.Prologue.PrologueLength;		--Result.Prologue.PrologueLength;
checkDefaultPrologue(Version, Format, Result.Prologue, 0);		checkDefaultPrologue(Version, Format, Result.Prologue, 1);

uint64_t ExpectedEnd =		uint64_t ExpectedEnd = Prologue.TotalLength + Prologue.sizeofTotalLength();
Prologue.TotalLength + 1 + Prologue.sizeofTotalLength();
checkError(		checkError(
(Twine("parsing line table prologue at 0x00000000 should have ended at "		(Twine("parsing line table prologue at 0x00000000 should have ended at "
"0x000000") +		"0x000000") +
Twine::utohexstr(ExpectedEnd) + " but it ended at 0x000000" +		Twine::utohexstr(ExpectedEnd) + " but it ended at 0x000000" +
Twine::utohexstr(ExpectedEnd - 1))		Twine::utohexstr(ExpectedEnd - 1))
.str(),		.str(),
std::move(Recoverable));		std::move(Recoverable));
}		}
[115 lines not shown]	TEST_F(DebugLineBasicFixture, ErrorForUnitLengthTooLarge) {
LineTable &Padding = Gen->addLineTable();		LineTable &Padding = Gen->addLineTable();
// Add some padding to show that a non-zero offset is handled correctly.		// Add some padding to show that a non-zero offset is handled correctly.
Padding.setCustomPrologue({{0, LineTable::Byte}});		Padding.setCustomPrologue({{0, LineTable::Byte}});
LineTable &LT = Gen->addLineTable();		LineTable &LT = Gen->addLineTable();
LT.addStandardOpcode(DW_LNS_copy, {});		LT.addStandardOpcode(DW_LNS_copy, {});
LT.addStandardOpcode(DW_LNS_const_add_pc, {});		LT.addStandardOpcode(DW_LNS_const_add_pc, {});
LT.addExtendedOpcode(1, DW_LNE_end_sequence, {});		LT.addExtendedOpcode(1, DW_LNE_end_sequence, {});
DWARFDebugLine::Prologue Prologue = LT.createBasicPrologue();		DWARFDebugLine::Prologue Prologue = LT.createBasicPrologue();
// Set the total length to 1 higher than the actual length. The program body		// Set the total length to 1 higher than the actual length.
// has size 5.		++Prologue.TotalLength;
Prologue.TotalLength += 6;
LT.setPrologue(Prologue);		LT.setPrologue(Prologue);

generate();		generate();

auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 1, *Context,		auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 1, *Context,
nullptr, RecordRecoverable);		nullptr, RecordRecoverable);
checkError("line table program with offset 0x00000001 has length 0x00000034 "		checkError("line table program with offset 0x00000001 has length 0x00000034 "
"but only 0x00000033 bytes are available",		"but only 0x00000033 bytes are available",
std::move(Recoverable));		std::move(Recoverable));
ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());		ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());
EXPECT_EQ((*ExpectedLineTable)->Rows.size(), 2u);		EXPECT_EQ((*ExpectedLineTable)->Rows.size(), 2u);
EXPECT_EQ((*ExpectedLineTable)->Sequences.size(), 1u);		EXPECT_EQ((*ExpectedLineTable)->Sequences.size(), 1u);
}		}

		TEST_F(DebugLineBasicFixture, ErrorForUnitLengthTruncatingOpcode) {
		if (!setupGenerator())
		return;

		LineTable &Padding = Gen->addLineTable();
		// Add some padding to show that a non-zero offset is handled correctly.
		Padding.setCustomPrologue({{0, LineTable::Byte}});
		LineTable &LT = Gen->addLineTable();
		LT.addStandardOpcode(DW_LNS_copy, {});
		LT.addStandardOpcode(DW_LNS_const_add_pc, {});
		LT.addExtendedOpcode(1, DW_LNE_end_sequence, {});
		DWARFDebugLine::Prologue Prologue = LT.createBasicPrologue();
		--Prologue.TotalLength;
		LT.setPrologue(Prologue);

		generate();

		DWARFDebugLine::LineTable Table;
		uint64_t Offset = 1;
		ASSERT_THAT_ERROR(
		Table.parse(LineData, &Offset, *Context, nullptr, RecordRecoverable),
		Succeeded());
		checkError(
		"last opcode in line table at offset 0x00000001 ended at offset "
		"0x00000034 which is past the expected table end at offset 0x00000033",
		dblaikie: This phrasing doesn't match up with not reading past the end of a specified length - I'd expect something more like "last opcode in line table at offset 0x..1 is incomplete/truncated at offset 0x...34" or the like. Maybe with "expected to extend to 0x35" if we know that some localized implied/expressed length extends beyond some broader length that was specified.
		jhenderson (author): I think your proposal makes sense with the proposed data extractor changes, but possibly not before. See my out-of-line comment.
		dblaikie: Right right - implementation and messaging should be fixed in that direction.
		std::move(Recoverable));
		EXPECT_EQ(Table.Rows.size(), 2u);
		EXPECT_EQ(Table.Sequences.size(), 1u);
		// Show that the output offset is based on the unit length, not the actual
		// amount of data parsed.
		EXPECT_EQ(Offset, Prologue.TotalLength + Prologue.sizeofTotalLength() + 1);
		}
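
The check exercised by `ErrorForUnitLengthTruncatingOpcode` can be sketched without any LLVM types. This is a minimal illustration only, not the code in DWARFDebugLine.cpp: `checkTableEnd` and its parameters are hypothetical names, and the unit length used in the usage note is chosen purely so that the offsets match those in the test's expected message. It compares the offset reached by the parser with the end implied by the unit length field and, on a mismatch, resets the offset and returns a message.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of LLVM. Compares the offset reached after
// parsing a line table with the end offset implied by its unit length field.
// Returns an empty string if they match. Otherwise returns a diagnostic and
// resets Offset to the expected end so the next table starts where the unit
// length says it should.
std::string checkTableEnd(uint64_t TableOffset, uint64_t UnitLength,
                          uint64_t SizeofUnitLength, uint64_t &Offset) {
  assert(Offset >= TableOffset && "parser cannot end before the table starts");
  const uint64_t ExpectedEnd = TableOffset + SizeofUnitLength + UnitLength;
  if (Offset == ExpectedEnd)
    return std::string();
  std::string Msg = "line table at offset " + std::to_string(TableOffset) +
                    " ended at offset " + std::to_string(Offset) +
                    " but was expected to end at offset " +
                    std::to_string(ExpectedEnd);
  Offset = ExpectedEnd; // Recover by trusting the unit length.
  return Msg;
}
```

For a table at offset 1 with a 4-byte length field and an assumed unit length of 0x2e, a parser that stopped at 0x34 would be reported and pulled back to 0x33, mirroring the final `EXPECT_EQ(Offset, ...)` in the test above.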

TEST_F(DebugLineBasicFixture, ErrorForMismatchedAddressSize) {		TEST_F(DebugLineBasicFixture, ErrorForMismatchedAddressSize) {
if (!setupGenerator(4, 8))		if (!setupGenerator(4, 8))
return;		return;

LineTable &LT = Gen->addLineTable();		LineTable &LT = Gen->addLineTable();
// The line data extractor expects size 8 (Quad) addresses.		// The line data extractor expects size 8 (Quad) addresses.
uint64_t Addr1 = 0x11223344;		uint64_t Addr1 = 0x11223344;
LT.addExtendedOpcode(5, DW_LNE_set_address, {{Addr1, LineTable::Long}});		LT.addExtendedOpcode(5, DW_LNE_set_address, {{Addr1, LineTable::Long}});
[199 lines not shown]	checkError({"parsing line table prologue at offset 0x00000000 found "
std::move(Unrecoverable));		std::move(Unrecoverable));
}		}

TEST_F(DebugLineBasicFixture, ParserReportsNonPrologueProblemsWhenParsing) {		TEST_F(DebugLineBasicFixture, ParserReportsNonPrologueProblemsWhenParsing) {
if (!setupGenerator())		if (!setupGenerator())
return;		return;

LineTable &LT = Gen->addLineTable(DWARF32);		LineTable &LT = Gen->addLineTable(DWARF32);
LT.addExtendedOpcode(0x42, DW_LNE_end_sequence, {});		LT.addExtendedOpcode(0x2, DW_LNE_end_sequence, {});
		// Arbitrary padding byte to ensure the end of the opcode as claimed by the
		// opcode length field does not go past the table end.
		LT.addByte(0xaa);
		dblaikie: Could you explain this further? What's incorrect about the existing usage? Where is the opcode length field? Is that the 0x2 in the line above? Why would it be too short? Should the DWARFGenerator API be changed? Is it computing a length that's too short for the table?
		jhenderson (author): `addExtendedOpcode` takes a length for an extended opcode, followed by the opcode itself, and then data for the operands. I realise this byte should be moved into the operands argument, so I'll fix that. I don't think the API here needs changing, as it would prevent us creating broken situations like this (i.e. an extended opcode without sufficient data for its claimed length).
		dblaikie: What's the mention of the "table end" here, then? I guess "table end" is indicated by the table length field elsewhere? Is it autogenerated (with a possible override for invalid situations)? Should it be autogenerated differently?
		jhenderson (author): Yeah, the table length is auto-generated, but it's auto-generated based on the raw data we add (i.e. not the interpretation of the data, but the physical number of bytes we add). See `writeDefaultPrologue` in DWARFGenerator.cpp. The ability to override the length is provided by the `createBasicPrologue` function, which returns a line table prologue that can be updated for use in the test. See e.g. `ErrorForTooLargePrologueLength`. I think this is a reasonable approach. In this test case, we are overriding the DW_LNE_end_sequence length with something different, by specifying the "2" argument. Previously, by not specifying any data after it in the test, the majority of the bytes covered by this size were past the end of the table, which is itself an interesting test case, albeit not the focus of this test. Consequently it was triggering the new error, which I didn't want. When I next update this patch, I'll replace this and the previous line of code and comment with: `LT.addExtendedOpcode(0x2, DW_LNE_end_sequence, {{0xaa, LineTable::Byte}});`
LineTable &LT2 = Gen->addLineTable(DWARF32);		LineTable &LT2 = Gen->addLineTable(DWARF32);
LT2.addExtendedOpcode(9, DW_LNE_set_address,		LT2.addExtendedOpcode(9, DW_LNE_set_address,
{{0x1234567890abcdef, LineTable::Quad}});		{{0x1234567890abcdef, LineTable::Quad}});
LT2.addStandardOpcode(DW_LNS_copy, {});		LT2.addStandardOpcode(DW_LNS_copy, {});
LT2.addByte(0xbb);		LT2.addByte(0xbb);
generate();		generate();

DWARFDebugLine::SectionParser Parser(LineData, *Context, CUs, TUs);		DWARFDebugLine::SectionParser Parser(LineData, *Context, CUs, TUs);
Parser.parseNext(RecordRecoverable, RecordUnrecoverable);		Parser.parseNext(RecordRecoverable, RecordUnrecoverable);
EXPECT_FALSE(Unrecoverable);		EXPECT_FALSE(Unrecoverable);
ASSERT_FALSE(Parser.done());		ASSERT_FALSE(Parser.done());
checkError(		checkError(
"unexpected line op length at offset 0x00000030 expected 0x42 found 0x01",		"unexpected line op length at offset 0x00000030 expected 0x02 found 0x01",
std::move(Recoverable));		std::move(Recoverable));

// Reset the error state so that it does not confuse the next set of checks.		// Reset the error state so that it does not confuse the next set of checks.
Unrecoverable = Error::success();		Unrecoverable = Error::success();
Parser.parseNext(RecordRecoverable, RecordUnrecoverable);		Parser.parseNext(RecordRecoverable, RecordUnrecoverable);

EXPECT_TRUE(Parser.done());		EXPECT_TRUE(Parser.done());
checkError("last sequence in debug line table at offset 0x00000031 is not "		checkError("last sequence in debug line table at offset 0x00000032 is not "
"terminated",		"terminated",
std::move(Recoverable));		std::move(Recoverable));
EXPECT_FALSE(Unrecoverable);		EXPECT_FALSE(Unrecoverable);
}		}
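
The point made in the inline thread above, that the table length is derived from the physical bytes added while an extended opcode's length is whatever the test claims, can be sketched as follows. `RawLineTable` and `extendedOpcodeRunsPastEnd` are hypothetical stand-ins written for this illustration, not the DWARFGenerator API, and the sketch assumes the claimed length fits in a single ULEB128 byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a test line table body, not the DWARFGenerator API.
struct RawLineTable {
  std::vector<uint8_t> Contents;

  // Appends an extended opcode: a 0 byte, the claimed length, the opcode and
  // then the operand bytes. Len is written exactly as given, so it can be
  // made inconsistent with the operands on purpose.
  void addExtendedOpcode(uint8_t Len, uint8_t Opcode,
                         const std::vector<uint8_t> &Operands) {
    assert(Len < 0x80 && "sketch only handles single-byte ULEB128 lengths");
    Contents.push_back(0);
    Contents.push_back(Len);
    Contents.push_back(Opcode);
    Contents.insert(Contents.end(), Operands.begin(), Operands.end());
  }
};

// Returns true if the extended opcode starting at Pos claims more bytes than
// are physically present in the table. The claimed length counts the opcode
// byte and its operands, which start two bytes after Pos.
bool extendedOpcodeRunsPastEnd(const RawLineTable &T, size_t Pos) {
  assert(Pos + 1 < T.Contents.size() && T.Contents[Pos] == 0);
  const size_t ClaimedEnd = Pos + 2 + T.Contents[Pos + 1];
  return ClaimedEnd > T.Contents.size();
}
```

With a claimed length of 0x42 and no operands, only three bytes exist, so the opcode runs well past the end of the table. With a claimed length of 0x2 and one operand byte such as 0xaa, the claimed end coincides with the physical end, which is the situation the padding byte in the test is meant to create.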

TEST_F(DebugLineBasicFixture,		TEST_F(DebugLineBasicFixture,
ParserReportsPrologueErrorsInEachTableWhenSkipping) {		ParserReportsPrologueErrorsInEachTableWhenSkipping) {
if (!setupGenerator())		if (!setupGenerator())
[92 lines not shown]

llvm/unittests/DebugInfo/DWARF/DwarfGenerator.cpp

[169 lines not shown]	case 5:
break;		break;
default:		default:
llvm_unreachable("unsupported version");		llvm_unreachable("unsupported version");
}		}
if (Format == DWARF64) {		if (Format == DWARF64) {
P.TotalLength += 4;		P.TotalLength += 4;
P.FormParams.Format = DWARF64;		P.FormParams.Format = DWARF64;
}		}
		P.TotalLength += Contents.size();
		ikudrin: Do I understand it right that this fix along with the corresponding changes in other places may be extracted into a separate patch?
		jhenderson (author): It probably can be. It's a requirement for this patch, but doesn't need to be a part of the same commit. I'll look at splitting it out.
P.FormParams.Version = Version;		P.FormParams.Version = Version;
P.MinInstLength = 1;		P.MinInstLength = 1;
P.MaxOpsPerInst = 1;		P.MaxOpsPerInst = 1;
P.DefaultIsStmt = 1;		P.DefaultIsStmt = 1;
P.LineBase = -5;		P.LineBase = -5;
P.LineRange = 14;		P.LineRange = 14;
P.OpcodeBase = 13;		P.OpcodeBase = 13;
P.StandardOpcodeLengths = {0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1};		P.StandardOpcodeLengths = {0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1};
[368 lines not shown]