
Details

Reviewers

ikudrin
JDevlieghere
dblaikie
probinson

Commits

rGfe6983a75ae0: [DebugInfo] Error if unsupported address size detected in line table

Summary

Prior to this patch, if a DW_LNE_set_address opcode was parsed with an address size (i.e. with a length after the opcode) of anything other than 1, 2, 4, or 8, an llvm_unreachable would be hit, as the data extractor does not support other values. This patch introduces a new error check that verifies the address size is one of the supported sizes, in common with other places within the DWARF parsing.
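The check described above can be sketched as a small standalone predicate. This is only an illustration under stated assumptions, not the code from the patch: `isSupportedAddressSize` and `describeUnsupportedSize` are hypothetical names, and the message layout is modelled on the error string asserted in the unit tests of this revision.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: true only for the operand widths named in the summary.
bool isSupportedAddressSize(uint64_t Size) {
  return Size == 1 || Size == 2 || Size == 4 || Size == 8;
}

// Hypothetical helper: builds a message in the style of the patch's error text.
std::string describeUnsupportedSize(uint64_t Size, uint64_t Offset) {
  char Buf[128];
  std::snprintf(Buf, sizeof(Buf),
                "address size 0x%2.2llx of DW_LNE_set_address opcode at "
                "offset 0x%8.8llx is unsupported",
                static_cast<unsigned long long>(Size),
                static_cast<unsigned long long>(Offset));
  return Buf;
}
```

A caller would report the message through whatever recoverable-error channel it has and then carry on parsing, rather than aborting.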

This patch also fixes calculation of a generated line table's size in unit tests. One of the tests in this patch highlighted a bug introduced in 1271cde4745, when non-byte operands were used as arguments for extended or standard opcodes.
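To illustrate why operand width matters for the size calculation, here is a minimal sketch assuming a simplified model of the test generator in which each operand carries a width tag. The `Width` enum and both functions are hypothetical; they only mirror the `LineTable::Byte` and `LineTable::Quad` style names used in the tests. The "naive" variant is one plausible form of miscount, not necessarily the exact bug in 1271cde4745, and the ULEB128 length prefix of extended opcodes is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical width tags; the enumerator value is the encoded size in bytes.
enum class Width { Byte = 1, Half = 2, Long = 4, Quad = 8 };

struct Operand {
  uint64_t Value;
  Width W;
};

// One byte for the opcode itself plus the encoded width of each operand.
std::size_t opcodeBodySize(const std::vector<Operand> &Operands) {
  std::size_t Size = 1;
  for (const Operand &Op : Operands)
    Size += static_cast<std::size_t>(Op.W);
  return Size;
}

// Assumed miscount: every operand is treated as a single byte.
std::size_t naiveBodySize(const std::vector<Operand> &Operands) {
  return 1 + Operands.size();
}
```

With a single `Quad` operand the two functions disagree (9 versus 2 bytes), which is the kind of discrepancy that only shows up once a test uses non-byte operands.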

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhenderson created this revision. Feb 4 2020, 6:15 AM

Herald added a project: Restricted Project. Feb 4 2020, 6:15 AM

Herald added subscribers: hiraditya, aprantl.

jhenderson marked an inline comment as done. Feb 4 2020, 6:15 AM

jhenderson added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
635	I'm going to write a follow-up patch for this issue.

jhenderson mentioned this in D72154: [DebugInfo] Make debug line address size mismatch non-fatal to parsing. Feb 4 2020, 6:16 AM

jhenderson mentioned this in D73901: [DebugInfo][test] Fix calculation of generated line table size.

jhenderson edited the summary of this revision. (Show Details)

jhenderson added a parent revision: D73901: [DebugInfo][test] Fix calculation of generated line table size.

ikudrin added inline comments. Feb 4 2020, 7:11 AM

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
635	You probably meant `Len >= 1`, right?

Address FIXME by using uint64_t for longer. If the size is greater than uint8_t max, then it is clearly not 4 or 8, so the address size check will catch it. Also added a unit test to cover this case, and made a minor change to a struct used in the test to ensure the test could pass.
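The reasoning in this update can be sketched as follows. This is a standalone illustration, not code from the patch: the function names are hypothetical, and the 4-or-8 check reflects the state of the patch at this point in the review.

```cpp
#include <cstdint>

// The size check as it stood at this stage of the review.
bool isFourOrEight(uint64_t Size) { return Size == 4 || Size == 8; }

// Narrowing first discards the high bits, so 0x104 becomes 0x04 and a
// bogus length slips through the check.
bool checkAfterNarrowing(uint64_t Len) {
  uint8_t Narrow = static_cast<uint8_t>(Len - 1);
  return isFourOrEight(Narrow);
}

// Keeping the full width lets the same check reject oversized lengths.
bool checkAtFullWidth(uint64_t Len) {
  uint64_t Wide = Len - 1;
  return isFourOrEight(Wide);
}
```

For a length field of 0x105, the narrowed value is 4 and passes, whereas the full-width value 0x104 is rejected, so no separate range check on the length is needed.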

jhenderson marked an inline comment as done. Feb 4 2020, 7:15 AM

jhenderson added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
635	I'm guessing this comment is related to the now-deleted FIXME?

probinson added inline comments. Feb 4 2020, 10:20 AM

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
648	I just happened to notice D73961 which says AVR usually has 2-byte addresses. Maybe that's a broader issue for a separate patch but there are a lot of 16-bit machines still out there in the world.

dblaikie added inline comments. Feb 4 2020, 4:32 PM

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
645–665	I'd generally like to assume that the /first/ time a value/field/length/etc is seen, that is correct and anything later that conflicts with it is incorrect. In what cases would we reach here anyway? Could we check earlier and ensure the extractor has a meaningful (non-zero) address size?

jhenderson marked an inline comment as done. Feb 5 2020, 7:38 AM

jhenderson added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
648	> I just happened to notice D73961 which says AVR usually has 2-byte addresses. Maybe that's a broader issue for a separate patch but there are a lot of 16-bit machines still out there in the world.

Thanks for flagging that up. I saw the 4/8 byte support check in several other places, so copied those. I'll expand this to 2 and 1 too, since that's what the DataExtractor can support. The other places can be fixed as needed.

> I'd generally like to assume that the /first/ time a value/field/length/etc is seen, that is correct and anything later that conflicts with it is incorrect.

I've been following the approach of assuming the length field is correct (in this case, the length of the extended opcode). Indeed, in this case, if we were to follow a different approach, we could read past the end of the instruction.

> In what cases would we reach here anyway?

We hit here any time the length field of a DW_LNE_set_address opcode is not either 5 or 9 (i.e. 1 for the opcode, plus 4/8 for the address).

> Could we check earlier and ensure the extractor has a meaningful (non-zero) address size?

I'm deliberately not checking that any earlier stated address makes sense, because it's not actually needed until here. If a V5 line table had an address of, say, 7, it wouldn't matter if the table had no DW_LNE_set_address opcodes, so an earlier validation would lead to an error that has no impact. @probinson (I think) previously encouraged me to be lazy in reporting this sort of thing. Additionally, we don't always have any address size information until this point - in the case where e.g. llvm-dwarfdump is dumping just the whole of a V4 .debug_line section and there is no corresponding debug info unit, the parser knows nothing about the address until it hits this point.
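The "trust the length field, validate lazily" approach described in this reply could look roughly like the sketch below. It is a simplified standalone reader rather than the LLVM implementation: `readSetAddress` is a hypothetical function, the data is assumed to be little-endian, and the caller is assumed to have already ensured that the length is at least 1 and that the operand lies within the buffer. It requires C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical reader: Offset points just past the DW_LNE_set_address opcode
// byte, and Len is the extended opcode's length field (opcode byte included).
// Assumes little-endian data, Len >= 1, and an in-bounds operand.
std::optional<uint64_t> readSetAddress(const std::vector<uint8_t> &Data,
                                       std::size_t &Offset, uint64_t Len) {
  uint64_t OperandSize = Len - 1;
  if (OperandSize != 1 && OperandSize != 2 && OperandSize != 4 &&
      OperandSize != 8) {
    // Unsupported width: trust the length field and step over the operand so
    // that parsing resumes at the next opcode.
    Offset += OperandSize;
    return std::nullopt;
  }
  uint64_t Address = 0;
  for (uint64_t I = 0; I < OperandSize; ++I)
    Address |= static_cast<uint64_t>(Data[Offset + I]) << (8 * I);
  Offset += OperandSize;
  return Address;
}
```

Because the size is only examined when a DW_LNE_set_address opcode is actually encountered, a table with an odd address size but no such opcode produces no error in this model.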

jhenderson marked an inline comment as done. Feb 5 2020, 8:00 AM

jhenderson added inline comments.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
645–665	(oops, Phabricator is showing my reply above your one @dblaikie)

Address review comments:

Add support for 1-byte and 2-byte address sizes.
Fold in unit test bug fix from D73901.

jhenderson removed a parent revision: D73901: [DebugInfo][test] Fix calculation of generated line table size. Feb 5 2020, 8:04 AM

I just noticed this patch from D73961. I am not yet familiar with the DWARF support in LLVM but just wanted to say two things:

The address size (2) that LLVM uses for AVR might be wrong: avr-gcc uses 4 instead. I don't know why exactly but perhaps because there are AVR devices that have more than 64K words of flash (such as the ATmega2560).
So far it looks like code compiled with Clang has faulty debug info. dwarfdump complains ERROR: dwarf_srcfiles: DW_DLE_HEADER_LEN_BIGGER_THAN_SECSIZE (342) Corrupt dwarf. while llvm-dwarfdump doesn't even try to print the contents of .debug-info.

To reproduce, code:

int main(void)
{
  while (1) {
    __asm__ __volatile__("nop");
  }
}

Compiled with avr-gcc (the -gdwarf-4 is important because avr-gcc on Debian defaults to stabs for some reason):

avr-gcc -o avr-nop.elf -g -gdwarf-4 avr-nop.c

Compiled with Clang:

clang --target=avr -c -o avr-nop.o -Os -g avr-nop.c
avr-gcc -o avr-nop.elf avr-nop.o

Noting it here in case it is of any use, or you happen to have any hints how I could debug this further (I've already spent hours on this).

In D73962#1860331, @aykevl wrote:

I just noticed this patch from D73961. I am not yet familiar with the DWARF support in LLVM but just wanted to say two things:

The address size (2) that LLVM uses for AVR might be wrong: avr-gcc uses 4 instead. I don't know why exactly but perhaps because there are AVR devices that have more than 64K words of flash (such as the ATmega2560).

So far it looks like code compiled with Clang has faulty debug info. dwarfdump complains ERROR: dwarf_srcfiles: DW_DLE_HEADER_LEN_BIGGER_THAN_SECSIZE (342) Corrupt dwarf. while llvm-dwarfdump doesn't even try to print the contents of .debug-info.

Noting it here in case it is of any use, or you happen to have any hints how I could debug this further (I've already spent hours on this).

Hi @aykevl,

It might be worth filing a bug against LLVM to report and discuss this further. Also, if you are able to attach the object file to that bug which you are having trouble with, it might be helpful for debugging purposes.

@jhenderson thanks!
It took me a while but finally I figured out that the problem is with the AVR backend - at least the first bug I was dealing with. For some reason it adjusted absolute relocations, breaking references between DWARF sections.

jhenderson mentioned this in D74197: [DebugInfo] Simplify DWARFDebugAddr.. Feb 10 2020, 1:34 AM

Ping?

Fix unit test following rebase.

Sounds alright

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
643	A comment on the "addExtendedOpcode" might help make it clearer that that's the "interesting" part of the input & how half+byte == 3 bytes, which is the invalid value being tested for on this line? (what's the "addByte(0xaa)" for? that looks a bit opaque)

This revision is now accepted and ready to land. Feb 13 2020, 5:13 PM

Closed by commit rGfe6983a75ae0: [DebugInfo] Error if unsupported address size detected in line table (authored by jhenderson). Feb 14 2020, 3:09 AM

This revision was automatically updated to reflect the committed changes.

jhenderson marked 4 inline comments as done.

Diff 242314

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp

... 626 lines not shown; context: if (Opcode == 0) {
         // relocatable address. All of the other statement program opcodes
         // that affect the address register add a delta to it. This instruction
         // stores a relocatable value into it instead.
         //
         // Make sure the extractor knows the address size. If not, infer it
         // from the size of the operand.
         {
           uint8_t ExtractorAddressSize = DebugLineData.getAddressSize();
-          if (ExtractorAddressSize != Len - 1 && ExtractorAddressSize != 0)
+          // FIXME: Check to make sure Len <= sizeof(uint8_t).
+          uint8_t OpcodeAddressSize = Len - 1;
+          if (ExtractorAddressSize != OpcodeAddressSize &&
+              ExtractorAddressSize != 0)
             RecoverableErrorCallback(createStringError(
                 errc::invalid_argument,
                 "mismatching address size at offset 0x%8.8" PRIx64
                 " expected 0x%2.2" PRIx8 " found 0x%2.2" PRIx64,
                 ExtOffset, ExtractorAddressSize, Len - 1));

           // Assume that the line table is correct and temporarily override the
-          // address size.
+          // address size. If the size is unsupported, give up trying to read
+          // the address and continue to the next opcode.
+          if (OpcodeAddressSize != 4 && OpcodeAddressSize != 8) {
+            RecoverableErrorCallback(createStringError(
+                errc::invalid_argument,
+                "address size 0x%2.2" PRIx8
+                " of DW_LNE_set_address opcode at offset 0x%8.8" PRIx64
+                " is unsupported",
+                OpcodeAddressSize, ExtOffset));
+            *OffsetPtr += OpcodeAddressSize;
+          } else {
             DebugLineData.setAddressSize(Len - 1);
             State.Row.Address.Address = DebugLineData.getRelocatedAddress(
                 OffsetPtr, &State.Row.Address.SectionIndex);

             // Restore the address size if the extractor already had it.
             if (ExtractorAddressSize != 0)
               DebugLineData.setAddressSize(ExtractorAddressSize);
+          }
+
           if (OS)
             *OS << format(" (0x%16.16" PRIx64 ")", State.Row.Address.Address);
         }
         break;

       case DW_LNE_define_file:
         // Takes 4 arguments. The first is a null terminated string containing
         // a source file name. The second is an unsigned LEB128 number
... 558 lines not shown

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp

... 615 lines not shown; context: checkError(
       std::move(Recoverable));
   ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());
   ASSERT_EQ((*ExpectedLineTable)->Rows.size(), 2u);
   EXPECT_EQ((*ExpectedLineTable)->Sequences.size(), 1u);
   EXPECT_EQ((*ExpectedLineTable)->Rows[0].Address.Address, Addr1);
   EXPECT_EQ((*ExpectedLineTable)->Rows[1].Address.Address, Addr2);
 }

+TEST_F(DebugLineBasicFixture,
+       ErrorForUnsupportedAddressSizeInSetAddressLength) {
+  // Use DWARF v4, and 0 for data extractor address size so that the address
+  // size is derived from the opcode length.
+  if (!setupGenerator(4, 0))
+    return;
+
+  LineTable &LT = Gen->addLineTable();
+  LT.addExtendedOpcode(2, DW_LNE_set_address,
+                       {{0x42, LineTable::Byte}});
+  LT.addStandardOpcode(DW_LNS_copy, {});
+  LT.addByte(0xaa);
+  LT.addExtendedOpcode(1, DW_LNE_end_sequence, {});
+
+  generate();
+
+  auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 0, *Context,
+                                                    nullptr, RecordRecoverable);
+  checkError(
+      "address size 0x01 of DW_LNE_set_address opcode at offset 0x00000030 is "
+      "unsupported",
+      std::move(Recoverable));
+  ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());
+  ASSERT_EQ((*ExpectedLineTable)->Rows.size(), 3u);
+  EXPECT_EQ((*ExpectedLineTable)->Sequences.size(), 1u);
+  // Show that the set address opcode is ignored in this case.
+  EXPECT_EQ((*ExpectedLineTable)->Rows[0].Address.Address, 0);
+}
+
+TEST_F(DebugLineBasicFixture, ErrorForUnsupportedAddressSizeDefinedInHeader) {
+  // Use 0 for data extractor address size so that it does not clash with the
+  // header address size.
+  if (!setupGenerator(5, 0))
+    return;
+
+  LineTable &LT = Gen->addLineTable();
+  uint8_t AddressSize = 9;
+  LT.addExtendedOpcode(AddressSize + 1, DW_LNE_set_address,
+                       {{0x12345678, LineTable::Quad}, {0, LineTable::Byte}});
+  LT.addStandardOpcode(DW_LNS_copy, {});
+  LT.addByte(0xaa);
+  LT.addExtendedOpcode(1, DW_LNE_end_sequence, {});
+  DWARFDebugLine::Prologue Prologue = LT.createBasicPrologue();
+  Prologue.FormParams.AddrSize = AddressSize;
+  LT.setPrologue(Prologue);
+
+  generate();
+
+  auto ExpectedLineTable = Line.getOrParseLineTable(LineData, 0, *Context,
+                                                    nullptr, RecordRecoverable);
+  checkError(
+      "address size 0x09 of DW_LNE_set_address opcode at offset 0x00000035 is "
+      "unsupported",
+      std::move(Recoverable));
+  ASSERT_THAT_EXPECTED(ExpectedLineTable, Succeeded());
+  ASSERT_EQ((*ExpectedLineTable)->Rows.size(), 3u);
+  EXPECT_EQ((*ExpectedLineTable)->Sequences.size(), 1u);
+  // Show that the set address opcode is ignored in this case.
+  EXPECT_EQ((*ExpectedLineTable)->Rows[0].Address.Address, 0);
+}
+
 TEST_F(DebugLineBasicFixture, CallbackUsedForUnterminatedSequence) {
   if (!setupGenerator())
     return;

   LineTable &LT = Gen->addLineTable();
   LT.addExtendedOpcode(9, DW_LNE_set_address,
                        {{0x1122334455667788, LineTable::Quad}});
   LT.addStandardOpcode(DW_LNS_copy, {});
... 273 lines not shown

This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Error if unsupported address size detected in line table
Closed · Public
