This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Error if unsupported address size detected in line table
ClosedPublic

Authored by jhenderson on Feb 4 2020, 6:15 AM.

Details

Summary

Prior to this patch, if a DW_LNE_set_address opcode was parsed with an address size (i.e. with a length after the opcode) of anything other 1, 2, 4, or 8, an llvm_unreachable would be hit, as the data extractor does not support other values. This patch introduces a new error check that verifies the address size is one of the supported sizes, in common with other places within the DWARF parsing.

This patch also fixes calculation of a generated line table's size in unit tests. One of the tests in this patch highlighted a bug introduced in 1271cde4745, when non-byte operands were used as arguments for extended or standard opcodes.

Diff Detail

Event Timeline

jhenderson created this revision.Feb 4 2020, 6:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2020, 6:15 AM
jhenderson marked an inline comment as done.Feb 4 2020, 6:15 AM
jhenderson added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
668–670

I'm going to write a follow-up patch for this issue.

ikudrin added inline comments.Feb 4 2020, 7:11 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
668–670

You probably meant Len >= 1, right?

jhenderson updated this revision to Diff 242331.Feb 4 2020, 7:13 AM

Address FIXME by using uint64_t for longer. If the size is greater than uint8_t max, then it is clearly not 4 or 8, so the adddress size check will catch it. Also added a unit test to cover this case, and made a minor change to a struct used in the test to ensure the test could pass.

jhenderson marked an inline comment as done.Feb 4 2020, 7:15 AM
jhenderson added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
668–670

I'm guessing this comment is related to the now-deleted FIXME?

probinson added inline comments.Feb 4 2020, 10:20 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
680

I just happened to notice D73961 which says AVR usually has 2-byte addresses. Maybe that's a broader issue for a separate patch but there are a lot of 16-bit machines still out there in the world.

dblaikie added inline comments.Feb 4 2020, 4:32 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
677–698

I'd generally like to assume that the /first/ time a value/field/length/etc is seen, that is correct and anything later that conflicts with it is incorrect.

In what cases would we reach here anyway? Could we check earlier and ensure the extractor has a meaningful (non-zero) address size?

jhenderson marked an inline comment as done.Feb 5 2020, 7:38 AM
jhenderson added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
680

I just happened to notice D73961 which says AVR usually has 2-byte addresses. Maybe that's a broader issue for a separate patch but there are a lot of 16-bit machines still out there in the world.

Thanks for flagging that up. I saw the 4/8 byte support check in several other places, so copied those. I'll expand this to 2 and 1 too, since that's what the DataExtractor can support. The other places can be fixed as needed.

I'd generally like to assume that the /first/ time a value/field/length/etc is seen, that is correct and anything later that conflicts with it is incorrect.

I've been following the approach of assuming the length field is correct (in this case, the length of the extended opcode). Indeed, in this case, if we were to follow a different approach, we could read past the end of the instruction.

In what cases would we reach here anyway?

We hit here any time the length field of a DW_LNE_set_address opcode is not either 5 or 9 (i.e. 1 for the opcode, plus 4/8 for the address).

Could we check earlier and ensure the extractor has a meaningful (non-zero) address size?

I'm deliberately not checking that any earlier stated address makes sense, because it's not actually needed until here. If a V5 line table had an address of, say, 7, it wouldn't matter if the table had no DW_LNE_set_address opcodes, so an earlier validation would lead to an error that has no impact. @probinson (I think) previously encouraged me to be lazy in reporting this sort of thing.

Additionally, we don't always have any address size information until this point - in the case where e.g. llvm-dwarfdump is dumping just the whole of a V4 .debug_line section and there is no corresponding debug info unit, the parser knows nothing about the address until it hits this point.

jhenderson marked an inline comment as done.Feb 5 2020, 8:00 AM
jhenderson added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
677–698

(oops, Phabricator is showing my reply above your one @dblaikie)

jhenderson updated this revision to Diff 242621.Feb 5 2020, 8:03 AM
jhenderson edited the summary of this revision. (Show Details)

Address review comments:

  1. Add support for 1-byte and 2-byte address sizes.
  2. Fold in unit test bug fix from D73901.
aykevl added a subscriber: aykevl.Feb 5 2020, 11:31 AM

I just noticed this patch from D73961. I am not yet familiar with the DWARF support in LLVM but just wanted to say two things:

  • The address size (2) that LLVM uses for AVR might be wrong: avr-gcc uses 4 instead. I don't know why exactly but perhaps because there are AVR devices that have more than 64K words of flash (such as the ATmega2560).
  • So far it looks like code compiled with Clang has faulty debug info. dwarfdump complains ERROR: dwarf_srcfiles: DW_DLE_HEADER_LEN_BIGGER_THAN_SECSIZE (342) Corrupt dwarf. while llvm-dwarfdump doesn't even try to print the contents of .debug-info.

To reproduce, code:

int main(void)
{
  while (1) {
    __asm__ __volatile__("nop");
  }
}

Compiled with avr-gcc (the -gdwarf-4 is important because avr-gcc on Debian defaults to stabs for some reason):

avr-gcc -o avr-nop.elf -g -gdwarf-4 avr-nop.c

Compiled with Clang:

clang --target=avr -c -o avr-nop.o -Os -g avr-nop.c
avr-gcc -o avr-nop.elf avr-nop.o

Noting it here in case it is of any use, or you happen to have any hints how I could debug this further (I've already spent hours on this).

I just noticed this patch from D73961. I am not yet familiar with the DWARF support in LLVM but just wanted to say two things:

  • The address size (2) that LLVM uses for AVR might be wrong: avr-gcc uses 4 instead. I don't know why exactly but perhaps because there are AVR devices that have more than 64K words of flash (such as the ATmega2560).
  • So far it looks like code compiled with Clang has faulty debug info. dwarfdump complains ERROR: dwarf_srcfiles: DW_DLE_HEADER_LEN_BIGGER_THAN_SECSIZE (342) Corrupt dwarf. while llvm-dwarfdump doesn't even try to print the contents of .debug-info.

Noting it here in case it is of any use, or you happen to have any hints how I could debug this further (I've already spent hours on this).

Hi @aykevl,

It might be worth filing a bug against LLVM to report and discuss this further. Also, if you are able to attach the object file to that bug which you are having trouble with, it might be helpful for debugging purposes.

aykevl added a comment.Feb 7 2020, 3:18 AM

@jhenderson thanks!
It took me a while but finally I figured out that the problem is with the AVR backend - at least the first bug I was dealing with. For some reason it adjusted absolute relocations, breaking references between DWARF sections.

Fix unit test following rebase.

dblaikie accepted this revision.Feb 13 2020, 5:13 PM

Sounds alright

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
662

A comment on the "addExtendedOpcode" might help make it clearer that that's the "interesting" part of the input & how half+byte == 3 bytes, which is the invalid value being tested for on this line? (what's the "addByte(0xaa)" for? that looks a bit opaque)

This revision is now accepted and ready to land.Feb 13 2020, 5:13 PM
This revision was automatically updated to reflect the committed changes.
jhenderson marked 4 inline comments as done.