This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/DebugInfo/DWARF/
-
DebugInfo/
-
DWARF/
1/1
DWARFListTable.cpp
-
test/DebugInfo/X86/
-
DebugInfo/
-
X86/
-
dwarfdump-rnglists-zero-length.s

Differential D82886

[DebugInfo] Fix a possible crash when reading a malformed .debug_*lists section.
ClosedPublic

Authored by ikudrin on Jun 30 2020, 8:50 AM.

Download Raw Diff

Details

Reviewers

jhenderson
dblaikie
aprantl

Commits

rGec9f0c7d4ae2: [DebugInfo] Fix a possible crash when reading a malformed .debug_*lists section.

Summary

DWARFListTableHeader::length() handles the zero value of HeaderData.Length in a special way, which makes the result different from the calculated value of FullLength, which leads to triggering the assertion. The patch moves the assertion a bit later when FullLength is already checked for minimal allowed value.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ikudrin created this revision.Jun 30 2020, 8:50 AM

Herald added subscribers: hiraditya, aprantl. · View Herald TranscriptJun 30 2020, 8:50 AM

Harbormaster failed remote builds in B62347: Diff 274506!Jun 30 2020, 10:18 AM

aprantl accepted this revision.Jun 30 2020, 2:26 PM

aprantl added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFListTable.cpp
38	seems reasonable. Could you add an `&& "some explanation"` to the assert that explains what went wrong?

This revision is now accepted and ready to land.Jun 30 2020, 2:26 PM

jhenderson added inline comments.Jul 1 2020, 1:11 AM

llvm/test/DebugInfo/X86/dwarfdump-rnglists-format-mix.s
1 ↗	(On Diff #274506)	I think you can get rid of the "with enabled assertions" comment here. The test also shows that llvm-dwarfdump without assertions doesn't crash! As this is testing the library code, rather than something to do with llvm-dwarfdump specifically, perhaps you can just drop `dwarfdump-` from the test name.
5–7 ↗	(On Diff #274506)	Two thoughts, and not asking you to change anything if you prefer the current approach: I'm not a fan of piping llvm-mc output into llvm-dwarfdump directly, because it makes it harder to debug the test if there's a problem, since I then have to change this line to generate the intermediate object etc. I've come to the conclusion that I personally prefer having the pipe operator at the end of the previous line, since it clearly shows that the next line is new command, and that you don't need to look at it to find more arguments for the command on the first line, i.e: # RUN: llvm-mc -triple x86_64 %s -filetype=obj -o - \| \ ## Indentation shows this is a continuation; llvm-dwarfdump shows it's a new executable being called # RUN: not llvm-dwarfdump -debug-info - 2>&1 \| \ # RUN: FileCheck %s
6 ↗	(On Diff #274506)	Do you actually need to dump the debug info, or could you just dump the .debug_rnglists section explicitly? That would allow you to simplify your test input accordingly.

Added an explanatory comment to the assert.
Simplified the test.

llvm/test/DebugInfo/X86/dwarfdump-rnglists-format-mix.s
1 ↗	(On Diff #274506)	I think you can get rid of the "with enabled assertions" comment here. The test also shows that llvm-dwarfdump without assertions doesn't crash! I'll update the wording, thanks! As this is testing the library code, rather than something to do with llvm-dwarfdump specifically, perhaps you can just drop dwarfdump- from the test name. We have a group of tests here with assembly inputs that check various aspects of the library. I wanted the new test to be in the same group, next to other tests which also start with "dwarfdump-rnglists".
5–7 ↗	(On Diff #274506)	Will do. Thanks.
6 ↗	(On Diff #274506)	The test was based on the sample I run into when worked on producing DWARF64 debugging info. Probably you are right, the test may be simplified to check only the specific issue.

Okay, looks good.

Is this assertion really valuable? "length()" does the same addition that is done to create the value that length() is being compared to - it's pretty trivial? Perhaps "FullLength" should be initialized with a call to length instead of duplicating that addition?

In D82886#2126027, @dblaikie wrote:

Is this assertion really valuable? "length()" does the same addition that is done to create the value that length() is being compared to - it's pretty trivial? Perhaps "FullLength" should be initialized with a call to length instead of duplicating that addition?

In that case, we have a small inconsistency in the error message. In most cases, we report the full length of the data, which is the value of the unit length field plus the size of the field itself, but for the unit length of 0, we would not add the size of the field and would report only the raw value. This inconsistency is the only reason we calculate FullLength directly.

In D82886#2127149, @ikudrin wrote:

In D82886#2126027, @dblaikie wrote:

Is this assertion really valuable? "length()" does the same addition that is done to create the value that length() is being compared to - it's pretty trivial? Perhaps "FullLength" should be initialized with a call to length instead of duplicating that addition?

In that case, we have a small inconsistency in the error message. In most cases, we report the full length of the data, which is the value of the unit length field plus the size of the field itself, but for the unit length of 0, we would not add the size of the field and would report only the raw value. This inconsistency is the only reason we calculate FullLength directly.

Is that difference necessary? I tried removing the length == 0 special case from "length()" and no tests fail. Perhaps we could go that route instead?

In D82886#2134722, @dblaikie wrote:

Is that difference necessary? I tried removing the length == 0 special case from "length()" and no tests fail. Perhaps we could go that route instead?

For example, dumpRnglistsSection() in DWARFContext.cpp terminates the loop when length() returns 0. With a specially constructed input, your variant would result in several additional unsuccessful reads with additional error messages:

    .section .debug_rnglists,"",@progbits
    .long 0xffffffff
    .long 0xffffffff
    .byte 0xff
...
error: parsing .debug_rnglists table at offset 0x0: unexpected end of data at offset 0xb while reading [0x4, 0xc)
error: parsing .debug_rnglists table at offset 0x4: unexpected end of data at offset 0xb while reading [0x8, 0x10)
error: parsing .debug_rnglists table at offset 0x8: unexpected end of data at offset 0xb while reading [0x8, 0xc)

In D82886#2135527, @ikudrin wrote:
In D82886#2134722, @dblaikie wrote:

Is that difference necessary? I tried removing the length == 0 special case from "length()" and no tests fail. Perhaps we could go that route instead?

For example, dumpRnglistsSection() in DWARFContext.cpp terminates the loop when length() returns 0. With a specially constructed input, your variant would result in several additional unsuccessful reads with additional error messages:
    .section .debug_rnglists,"",@progbits
    .long 0xffffffff
    .long 0xffffffff
    .byte 0xff
...
error: parsing .debug_rnglists table at offset 0x0: unexpected end of data at offset 0xb while reading [0x4, 0xc)
error: parsing .debug_rnglists table at offset 0x4: unexpected end of data at offset 0xb while reading [0x8, 0x10)
error: parsing .debug_rnglists table at offset 0x8: unexpected end of data at offset 0xb while reading [0x8, 0xc)

Ah, thanks! Would be handy to have a test case for that & perhaps some other way to communicate "end of list" that's a bit more explicit?

Hmm, I'm not sure why this produce the repetition - if length() accurately returned the length that was read rather than zero, then it'd go to the end and stop, right?

In D82886#2137907, @dblaikie wrote:

Ah, thanks! Would be handy to have a test case for that & perhaps some other way to communicate "end of list" that's a bit more explicit?

For my understanding, that is not yet broken, so does not need to be fixed.

Hmm, I'm not sure why this produce the repetition - if length() accurately returned the length that was read rather than zero, then it'd go to the end and stop, right?

0xffffffff is a DWARF64 mark, so than it is read, the library expects to read the next 8 bytes.

In D82886#2139279, @ikudrin wrote:

In D82886#2137907, @dblaikie wrote:

Ah, thanks! Would be handy to have a test case for that & perhaps some other way to communicate "end of list" that's a bit more explicit?

For my understanding, that is not yet broken, so does not need to be fixed.

I'm not suggesting it needs to be fixed - but that that codepath (the one that returns zero) is untested - so when it was committed, it was committed without test coverage. It'd be good to add test coverage where it is missing like this.

Hmm, I'm not sure why this produce the repetition - if length() accurately returned the length that was read rather than zero, then it'd go to the end and stop, right?

0xffffffff is a DWARF64 mark, so than it is read, the library expects to read the next 8 bytes.

Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?

In D82886#2139657, @dblaikie wrote:

I'm not suggesting it needs to be fixed - but that that codepath (the one that returns zero) is untested - so when it was committed, it was committed without test coverage. It'd be good to add test coverage where it is missing like this.

Isn't adding that test coverage orthogonal to this particular patch?

Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?

I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me, but you may provide the patch if you feel like the code will be improved.

In D82886#2140817, @ikudrin wrote:

In D82886#2139657, @dblaikie wrote:

I'm not suggesting it needs to be fixed - but that that codepath (the one that returns zero) is untested - so when it was committed, it was committed without test coverage. It'd be good to add test coverage where it is missing like this.

Isn't adding that test coverage orthogonal to this particular patch?

Oh yeah, for sure! Didn't mean to suggest it should go here!

Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?

I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me,

Not sure I follow - 10 bytes of the length (as many bytes as were available to read) were read from the section, right? Ah, you mean the length bytes did not describe a length of 10, but 10 bytes were read. *nod* I can see how that's a bit ambiguous, indeed.

In D82886#2145774, @dblaikie wrote:

In D82886#2140817, @ikudrin wrote:

In D82886#2139657, @dblaikie wrote:

I'm not suggesting it needs to be fixed - but that that codepath (the one that returns zero) is untested - so when it was committed, it was committed without test coverage. It'd be good to add test coverage where it is missing like this.

Isn't adding that test coverage orthogonal to this particular patch?

Oh yeah, for sure! Didn't mean to suggest it should go here!

Well, I've created D83673 to add some tests. Please, take a look.

Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?

I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me,

Not sure I follow - 10 bytes of the length (as many bytes as were available to read) were read from the section, right? Ah, you mean the length bytes did not describe a length of 10, but 10 bytes were read. *nod* I can see how that's a bit ambiguous, indeed.

In fact, these 10 bytes even were never read. Only the first 4 bytes were actually read, after which the check for insufficient space triggered an error. Also note, that getInitialLength() preserves the value of Offset in case of an error, not sets it to the position of the error.

In D82886#2147198, @ikudrin wrote:

In D82886#2145774, @dblaikie wrote:

In D82886#2140817, @ikudrin wrote:

In D82886#2139657, @dblaikie wrote:

I'm not suggesting it needs to be fixed - but that that codepath (the one that returns zero) is untested - so when it was committed, it was committed without test coverage. It'd be good to add test coverage where it is missing like this.

Isn't adding that test coverage orthogonal to this particular patch?

Oh yeah, for sure! Didn't mean to suggest it should go here!

Well, I've created D83673 to add some tests. Please, take a look.

Looks good! (though perhaps a full functionality test would be good - but I guess if we end up changing this API contract at some point, the test'll fail & hopefully someone'll then look at the end-to-end situation)

Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?

I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me,

Not sure I follow - 10 bytes of the length (as many bytes as were available to read) were read from the section, right? Ah, you mean the length bytes did not describe a length of 10, but 10 bytes were read. *nod* I can see how that's a bit ambiguous, indeed.

In fact, these 10 bytes even were never read. Only the first 4 bytes were actually read, after which the check for insufficient space triggered an error. Also note, that getInitialLength() preserves the value of Offset in case of an error, not sets it to the position of the error.

*nod* sure enough - got some mildly mixed feelings still, but seems OK. Thanks for walking me through it!

Closed by commit rGec9f0c7d4ae2: [DebugInfo] Fix a possible crash when reading a malformed .debug_*lists section. (authored by ikudrin). · Explain WhyJul 14 2020, 2:36 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

llvm/

lib/

DebugInfo/

DWARF/

DWARFListTable.cpp

2 lines

test/

DebugInfo/

X86/

dwarfdump-rnglists-zero-length.s

12 lines

Diff 277722

llvm/lib/DebugInfo/DWARF/DWARFListTable.cpp

Show All 23 Lines	Error DWARFListTableHeader::extract(DWARFDataExtractor Data,
if (Err)		if (Err)
return createStringError(		return createStringError(
errc::invalid_argument, "parsing %s table at offset 0x%" PRIx64 ": %s",		errc::invalid_argument, "parsing %s table at offset 0x%" PRIx64 ": %s",
SectionName.data(), HeaderOffset, toString(std::move(Err)).c_str());		SectionName.data(), HeaderOffset, toString(std::move(Err)).c_str());

uint8_t OffsetByteSize = Format == dwarf::DWARF64 ? 8 : 4;		uint8_t OffsetByteSize = Format == dwarf::DWARF64 ? 8 : 4;
uint64_t FullLength =		uint64_t FullLength =
HeaderData.Length + dwarf::getUnitLengthFieldByteSize(Format);		HeaderData.Length + dwarf::getUnitLengthFieldByteSize(Format);
assert(FullLength == length());
if (FullLength < getHeaderSize(Format))		if (FullLength < getHeaderSize(Format))
return createStringError(errc::invalid_argument,		return createStringError(errc::invalid_argument,
"%s table at offset 0x%" PRIx64		"%s table at offset 0x%" PRIx64
" has too small length (0x%" PRIx64		" has too small length (0x%" PRIx64
") to contain a complete header",		") to contain a complete header",
SectionName.data(), HeaderOffset, FullLength);		SectionName.data(), HeaderOffset, FullLength);
		assert(FullLength == length() && "Inconsistent calculation of length.");
		aprantlUnsubmitted Done Reply Inline Actions seems reasonable. Could you add an `&& "some explanation"` to the assert that explains what went wrong? aprantl: seems reasonable. Could you add an `&& "some explanation"` to the assert that explains what…
uint64_t End = HeaderOffset + FullLength;		uint64_t End = HeaderOffset + FullLength;
if (!Data.isValidOffsetForDataOfSize(HeaderOffset, FullLength))		if (!Data.isValidOffsetForDataOfSize(HeaderOffset, FullLength))
return createStringError(errc::invalid_argument,		return createStringError(errc::invalid_argument,
"section is not large enough to contain a %s table "		"section is not large enough to contain a %s table "
"of length 0x%" PRIx64 " at offset 0x%" PRIx64,		"of length 0x%" PRIx64 " at offset 0x%" PRIx64,
SectionName.data(), FullLength, HeaderOffset);		SectionName.data(), FullLength, HeaderOffset);

HeaderData.Version = Data.getU16(OffsetPtr);		HeaderData.Version = Data.getU16(OffsetPtr);
▲ Show 20 Lines • Show All 62 Lines • Show Last 20 Lines

llvm/test/DebugInfo/X86/dwarfdump-rnglists-zero-length.s

This file was added.

				## The test checks that llvm-dwarfdump can handle a malformed input file without
				## crashing.

				# RUN: llvm-mc -triple x86_64 %s -filetype=obj -o %t
				# RUN: not llvm-dwarfdump -debug-rnglists %t 2>&1 \| FileCheck %s

				# CHECK: error: .debug_rnglists table at offset 0x0 has too small length (0x4) to contain a complete header

				## An assertion used to trigger in the debug build of the DebugInfo/DWARF
				## library if the unit length field in a range list table was 0.
				.section .debug_rnglists,"",@progbits
				.long 0