This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] Better detect errors in Address Range Tables.
ClosedPublic

Authored by ikudrin on Dec 27 2019, 8:21 AM.

Details

Summary

The patch tries to cover most remaining cases of wrong data.

Diff Detail

Event Timeline

ikudrin created this revision.Dec 27 2019, 8:21 AM

Could these tests all be in one run? (ie: even though the length is "invalid" in terms of parsing the contents of the section - but valid enough to jump to the next contribution and parse that - except, I guess, at an invalid 64 bit length perhaps, because that might be too long to be bothered testing - so perhaps that could be the last test case in the file)

llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp
143–147

Worth having two messages here to more specifically communicate the kind of invalidity? ("invalid length" is a bit vague/doesn't really say what to do about the length/why it's a problem - even if this remains as one message, perhaps the message should describe the two kinds of invalidity that could be the problem)

ikudrin updated this revision to Diff 237667.Jan 13 2020, 6:56 AM
  • Made the diagnostic a bit more distinct.
  • Moved the tests to a gtest-based suite.

Could these tests all be in one run? (ie: even though the length is "invalid" in terms of parsing the contents of the section - but valid enough to jump to the next contribution and parse that - except, I guess, at an invalid 64 bit length perhaps, because that might be too long to be bothered testing - so perhaps that could be the last test case in the file)

As I said in D71876, that requires a separate patch and a separate consideration. Hope that gtest-based tests solve the issue in a better way.

dblaikie accepted this revision.Jan 17 2020, 3:10 PM

Looks good - future work done as separate patches for separable changes/diagnostics would probably be good (& if you've got some broader goals in terms of what functionality you need/would like in llvm-dwarfdump, an email to llvm-dev explaining the broader goals (do they overlap with James Henderson's goals, improving the line table error handling?) would probably be good - to link to in future reviews, etc)

This revision is now accepted and ready to land.Jan 17 2020, 3:10 PM
dblaikie requested changes to this revision.Jan 17 2020, 3:17 PM

Sounds good - thanks!

This revision now requires changes to proceed.Jan 17 2020, 3:17 PM
dblaikie accepted this revision.Jan 17 2020, 3:19 PM

Let's try that again - guess I'd already approved this before.

This revision is now accepted and ready to land.Jan 17 2020, 3:19 PM

Thanks for the review!

Looks good - future work done as separate patches for separable changes/diagnostics would probably be good (& if you've got some broader goals in terms of what functionality you need/would like in llvm-dwarfdump, an email to llvm-dev explaining the broader goals (do they overlap with James Henderson's goals, improving the line table error handling?) would probably be good - to link to in future reviews, etc)

My main intention is to add support for 64-bit DWARF. All other changes are just accompanying the main goal, like fixing issues when I found them or making other small improvements. I believe there should be no conflicts with James Henderson's changes.

Thanks for the review!

Looks good - future work done as separate patches for separable changes/diagnostics would probably be good (& if you've got some broader goals in terms of what functionality you need/would like in llvm-dwarfdump, an email to llvm-dev explaining the broader goals (do they overlap with James Henderson's goals, improving the line table error handling?) would probably be good - to link to in future reviews, etc)

My main intention is to add support for 64-bit DWARF.

Ah, cool - thanks for that! (out of curiosity, but by no means necessary - what's the motivation for that? LLVM doesn't currently produce 64-bit DWARF, I think & haven't seen any user-requests/had any discussions about adding it that I recall, but I might've missed some. Useful to know what companies/projects/etc are interested in these things, if there's more context you'd like to share)

All other changes are just accompanying the main goal, like fixing issues when I found them or making other small improvements. I believe there should be no conflicts with James Henderson's changes.

ikudrin updated this revision to Diff 239108.Jan 20 2020, 6:14 AM
  • Make expressions constexpr instead of static const.
This revision was automatically updated to reflect the committed changes.
jhenderson added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp
167–171

@ikudrin, I know this was committed a while back, but I only just ran into this.

Why was this check added? There's nothing in the DWARF spec that I know of that requires a non-zero length for all non-terminating entries. In our Sony linker, prior to the new tombstoning functionality that has recently been implemented, we would patch a reference in the .debug_aranges to have a zero length, and -1 for the address. This used to be fine, but recent versions of llvm-dwarfdump now can't dump this. Even without the Sony changes, there are occasionally functions in the DWARF with zero length (see also my post earlier today on the llvm-dev list for .debug_loc and zero length entry descriptions).

dblaikie added inline comments.Jul 21 2020, 1:30 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp
167–171

Yep, +1 we should allow empty ranges here. Even with better tombstoning - a) backwards compatibility with existing implementations (ld.bfd) b) even with proper tombstoning, it'd effectively create an empty range (-2, -2) - though arguably in that case we should detect the tombstoning first, before considering the range size/other issues

Here's a reproduction with commonly accessible tools:

$ cat range2.cpp
void f1() {
}
void f2() {
}
int main() {
  f1();
}
$ clang++ range2.cpp -g -fuse-ld=bfd -ffunction-sections -gdwarf-aranges -Wl,-gc-sections
$ llvm-dwarfdump -verify a.out
Verifying a.out:        file format elf64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
error: DIE address ranges are not contained in its parent's ranges:
0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("clang version 12.0.0 (git@github.com:llvm/llvm-project.git 80e6844e28f10c1c2c31b301c3ef980c7a11ee6b)")
              DW_AT_language    (DW_LANG_C_plus_plus_14)
              DW_AT_name        ("range2.cpp")
              DW_AT_stmt_list   (0x00000000)
              DW_AT_comp_dir    ("/usr/local/google/home/blaikie/dev/scratch")
              DW_AT_low_pc      (0x0000000000000000)
              DW_AT_ranges      (0x00000000
                 [0x0000000000401110, 0x0000000000401116)
                 [0x0000000000000001, 0x0000000000000001)
                 [0x0000000000401120, 0x000000000040112d))

0x00000043:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc   (0x0000000000000006)
                DW_AT_frame_base        (DW_OP_reg6)
                DW_AT_linkage_name      ("_Z2f2v")
                DW_AT_name      ("f2")
                DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/range2.cpp")
                DW_AT_decl_line (3)
                DW_AT_external  (true)

Verifying .debug_info references...
Verifying .debug_types Unit Header Chain...
Errors detected.

Similarly, with GCC (which produces debug_aranges by default):

$ g++ range2.cpp -g -fuse-ld=bfd -ffunction-sections -Wl,-gc-sections
$ llvm-dwarfdump -verify a.out
Verifying a.out:        file format elf64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
error: DIE address ranges are not contained in its parent's ranges:
0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("GNU C++14 10.0.0 20200111 (experimental) -mtune=generic -march=x86-64 -g -fuse-ld=bfd -ffunction-sections")
              DW_AT_language    (DW_LANG_C_plus_plus)
              DW_AT_name        ("range2.cpp")
              DW_AT_comp_dir    ("/usr/local/google/home/blaikie/dev/scratch")
              DW_AT_ranges      (0x00000000
                 [0x0000000000401102, 0x0000000000401109)
                 [0x0000000000000001, 0x0000000000000001)
                 [0x0000000000401109, 0x0000000000401119))
              DW_AT_low_pc      (0x0000000000000000)
              DW_AT_stmt_list   (0x00000000)

0x0000004e:   DW_TAG_subprogram
                DW_AT_external  (true)
                DW_AT_name      ("f2")
                DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/range2.cpp")
                DW_AT_decl_line (3)
                DW_AT_decl_column       (0x06)
                DW_AT_linkage_name      ("_Z2f2v")
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc   (0x0000000000000007)
                DW_AT_frame_base        (DW_OP_call_frame_cfa)
                DW_AT_GNU_all_call_sites        (true)

Verifying .debug_info references...
Verifying .debug_types Unit Header Chain...
Errors detected.

Thanks @dblaikie - I've filed https://bugs.llvm.org/show_bug.cgi?id=46805 to track this. Not got time currently to fix it myself - I've hacked out the error message for now locally to unblock my work, but that removes the useful check for a premature terminator.

ikudrin marked an inline comment as done.Jul 22 2020, 7:22 PM
ikudrin added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp
167–171

Why was this check added? There's nothing in the DWARF spec that I know of that requires a non-zero length for all non-terminating entries.

I tried to follow the Standard here, see 6.1.2@DWARFv5, p. 148:

Each descriptor is a triple consisting of a segment selector, the beginning address within that segment of a range of text or data covered by some entry owned by the corresponding compilation unit, followed by the non-zero length of that range.

It looks like the DWARF spec should be updated to reflect the practice.

@jhenderson, it would be great if you prepare the fix. I'll be AFK for a couple of weeks.

jhenderson added inline comments.Jul 23 2020, 1:06 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp
167–171

Huh, you're right. We must have missed that bit in the spec when we implemented our internal solution for stripped references. I guess the trouble is that if you do remove the non-zero length bit from the spec, a system with a zero address would treat it as a terminator, unless we also removed the terminator from the spec (I don't see why we couldn't - an aranges table has a known length). I'll post on https://bugs.llvm.org/show_bug.cgi?id=46805 that point, to see if I can get some feedback from other DWARF folks.

I may not get around to the fix for a bit - I've got a few other things on my plate currently, but this is certainly going to stay on my radar so if I do get a chance I'll come back to it. I don't think it's an urgent enough issue to warrant a release branch patch, unless somebody else feels it is.