The patch tries to cover most remaining cases of wrong data.
Details
Diff Detail
Event Timeline
Could these tests all be in one run? (ie: even though the length is "invalid" in terms of parsing the contents of the section - but valid enough to jump to the next contribution and parse that - except, I guess, at an invalid 64 bit length perhaps, because that might be too long to be bothered testing - so perhaps that could be the last test case in the file)
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp | ||
---|---|---|
138–142 | Worth having two messages here to more specifically communicate the kind of invalidity? ("invalid length" is a bit vague/doesn't really say what to do about the length/why it's a problem - even if this remains as one message, perhaps the message should describe the two kinds of invalidity that could be the problem) |
As I said in D71876, that requires a separate patch and a separate consideration. Hope that gtest-based tests solve the issue in a better way.
Looks good - future work done as separate patches for separable changes/diagnostics would probably be good (& if you've got some broader goals in terms of what functionality you need/would like in llvm-dwarfdump, an email to llvm-dev explaining the broader goals (do they overlap with James Henderson's goals, improving the line table error handling?) would probably be good - to link to in future reviews, etc)
Thanks for the review!
My main intention is to add support for 64-bit DWARF. All other changes are just accompanying the main goal, like fixing issues when I found them or making other small improvements. I believe there should be no conflicts with James Henderson's changes.
Ah, cool - thanks for that! (out of curiosity, but by no means necessary - what's the motivation for that? LLVM doesn't currently produce 64-bit DWARF, I think & haven't seen any user-requests/had any discussions about adding it that I recall, but I might've missed some. Useful to know what companies/projects/etc are interested in these things, if there's more context you'd like to share)
All other changes are just accompanying the main goal, like fixing issues when I found them or making other small improvements. I believe there should be no conflicts with James Henderson's changes.
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp | ||
---|---|---|
162–166 | @ikudrin, I know this was committed a while back, but I only just ran into this. Why was this check added? There's nothing in the DWARF spec that I know of that requires a non-zero length for all non-terminating entries. In our Sony linker, prior to the new tombstoning functionality that has recently been implemented, we would patch a reference in the .debug_aranges to have a zero length, and -1 for the address. This used to be fine, but recent versions of llvm-dwarfdump now can't dump this. Even without the Sony changes, there are occasionally functions in the DWARF with zero length (see also my post earlier today on the llvm-dev list for .debug_loc and zero length entry descriptions). |
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp | ||
---|---|---|
162–166 | Yep, +1 we should allow empty ranges here. Even with better tombstoning - a) backwards compatibility with existing implementations (ld.bfd) b) even with proper tombstoning, it'd effectively create an empty range (-2, -2) - though arguably in that case we should detect the tombstoning first, before considering the range size/other issues Here's a reproduction with commonly accessible tools: $ cat range2.cpp void f1() { } void f2() { } int main() { f1(); } $ clang++ range2.cpp -g -fuse-ld=bfd -ffunction-sections -gdwarf-aranges -Wl,-gc-sections $ llvm-dwarfdump -verify a.out Verifying a.out: file format elf64-x86-64 Verifying .debug_abbrev... Verifying .debug_info Unit Header Chain... error: DIE address ranges are not contained in its parent's ranges: 0x0000000b: DW_TAG_compile_unit DW_AT_producer ("clang version 12.0.0 (git@github.com:llvm/llvm-project.git 80e6844e28f10c1c2c31b301c3ef980c7a11ee6b)") DW_AT_language (DW_LANG_C_plus_plus_14) DW_AT_name ("range2.cpp") DW_AT_stmt_list (0x00000000) DW_AT_comp_dir ("/usr/local/google/home/blaikie/dev/scratch") DW_AT_low_pc (0x0000000000000000) DW_AT_ranges (0x00000000 [0x0000000000401110, 0x0000000000401116) [0x0000000000000001, 0x0000000000000001) [0x0000000000401120, 0x000000000040112d)) 0x00000043: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000000) DW_AT_high_pc (0x0000000000000006) DW_AT_frame_base (DW_OP_reg6) DW_AT_linkage_name ("_Z2f2v") DW_AT_name ("f2") DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/range2.cpp") DW_AT_decl_line (3) DW_AT_external (true) Verifying .debug_info references... Verifying .debug_types Unit Header Chain... Errors detected. Similarly, with GCC (which produces debug_aranges by default): $ g++ range2.cpp -g -fuse-ld=bfd -ffunction-sections -Wl,-gc-sections $ llvm-dwarfdump -verify a.out Verifying a.out: file format elf64-x86-64 Verifying .debug_abbrev... Verifying .debug_info Unit Header Chain... error: DIE address ranges are not contained in its parent's ranges: 0x0000000b: DW_TAG_compile_unit DW_AT_producer ("GNU C++14 10.0.0 20200111 (experimental) -mtune=generic -march=x86-64 -g -fuse-ld=bfd -ffunction-sections") DW_AT_language (DW_LANG_C_plus_plus) DW_AT_name ("range2.cpp") DW_AT_comp_dir ("/usr/local/google/home/blaikie/dev/scratch") DW_AT_ranges (0x00000000 [0x0000000000401102, 0x0000000000401109) [0x0000000000000001, 0x0000000000000001) [0x0000000000401109, 0x0000000000401119)) DW_AT_low_pc (0x0000000000000000) DW_AT_stmt_list (0x00000000) 0x0000004e: DW_TAG_subprogram DW_AT_external (true) DW_AT_name ("f2") DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/range2.cpp") DW_AT_decl_line (3) DW_AT_decl_column (0x06) DW_AT_linkage_name ("_Z2f2v") DW_AT_low_pc (0x0000000000000000) DW_AT_high_pc (0x0000000000000007) DW_AT_frame_base (DW_OP_call_frame_cfa) DW_AT_GNU_all_call_sites (true) Verifying .debug_info references... Verifying .debug_types Unit Header Chain... Errors detected. |
Thanks @dblaikie - I've filed https://bugs.llvm.org/show_bug.cgi?id=46805 to track this. Not got time currently to fix it myself - I've hacked out the error message for now locally to unblock my work, but that removes the useful check for a premature terminator.
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp | ||
---|---|---|
162–166 |
I tried to follow the Standard here, see 6.1.2@DWARFv5, p. 148:
It looks like the DWARF spec should be updated to reflect the practice. @jhenderson, it would be great if you prepare the fix. I'll be AFK for a couple of weeks. |
llvm/lib/DebugInfo/DWARF/DWARFDebugArangeSet.cpp | ||
---|---|---|
162–166 | Huh, you're right. We must have missed that bit in the spec when we implemented our internal solution for stripped references. I guess the trouble is that if you do remove the non-zero length bit from the spec, a system with a zero address would treat it as a terminator, unless we also removed the terminator from the spec (I don't see why we couldn't - an aranges table has a known length). I'll post on https://bugs.llvm.org/show_bug.cgi?id=46805 that point, to see if I can get some feedback from other DWARF folks. I may not get around to the fix for a bit - I've got a few other things on my plate currently, but this is certainly going to stay on my radar so if I do get a chance I'll come back to it. I don't think it's an urgent enough issue to warrant a release branch patch, unless somebody else feels it is. |
Worth having two messages here to more specifically communicate the kind of invalidity? ("invalid length" is a bit vague/doesn't really say what to do about the length/why it's a problem - even if this remains as one message, perhaps the message should describe the two kinds of invalidity that could be the problem)