This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] support for .debug_addr (consumer)
ClosedPublic

Authored by vleschuk on Jul 23 2018, 9:52 AM.

Details

Summary

This patch implements basic support for parsing and dumping DWARFv5 .debug_addr section. There is still work to be done, further patches are on their way but I think it would be better to review and commit the "sceleton" first.

What is currently missing:

  • symbol resolution (we dump only addresses)
  • support for non-zero segment selector size
  • support for 64bit dwarf

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

cc llvm-commits. Please do this when first creating the review.

Repeating the description here so it appears on the list:

This patch implements basic support for parsing and dumping DWARFv5 .debug_addr section. There is still work to be done, further patches are on their way but I think it would be better to review and commit the "sceleton" first.

What happens when dumping a file using pre-standard debug_addr? (is the version of the CU checked to make sure that the debug_addr section has a header, etc) I know this is super annoying - every time there's a section that lacks its own self-describing, versioned header, it hurts the ability to dump that section in the future :/ (in any object file that contains CUs of the old version that might have a contribution from that section, and thus cause entries in the section without a self-describing/standalone/versioned header)

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Not specific to your change, but I noticed that a lot of the DWARFContext code uses 32-bit offsets. Would it not make sense where we are writing new stuff, to use 64-bit offsets, since DWARF does support this?

I haven't looked at the details of the new address table class. I'll look at that later.

include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
63

I'm wondering if this should return an Error instance instead of Optional for the out-of-range case. It seems like a non-existent index request would indicate a malformed section, right?

79

Missing trailing ')'

lib/DebugInfo/DWARF/DWARFContext.cpp
254

I see that this is practically identical to dumpRnglistsSection below (and it may well be similar to other stuff too). Perhaps we should generalise the mechanism for extracting and dumping a given DWARF section, made up of distinct sub-sections/tables?

262–266

I'm not particularly comfortable with the DWARFContext knowing that a Length 0 is the indicator of whether the parsing can continue. I had similar issues with the .debug_line behaviour until I added a parser to wrap the extraction process. My personal inclination (but others may disagree) is that the AddrTable should have a "validLength()" method or similar, that can be queried.

506–511

This should probably not happen in between .debug_ranges and .debug_rnglists, just because those two sections are effectively equivalent.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
26

I feel like this function is something we use repeatedly throughout the DWARF library, and quite probably elsewhere too. That's probably an indication that it needs moving into a support library.

test/tools/llvm-dwarfdump/X86/debug_addr_X86.s
10 ↗(On Diff #156809)

There's a lot of unnecessary noise in this test. Could you remove everything apart from what is actually needed to test the section dumping, please. You can even use explicit addresses in the .debug_addr section, so that there doesn't need to be any labels.

test/tools/llvm-dwarfdump/X86/debug_addr_empty.s
8

Why .debug_rnglists here?

test/tools/llvm-dwarfdump/X86/debug_addr_invalid.s
10

Again, most of this is probably unnecessary.

JDevlieghere set the repository for this revision to rL LLVM.Jul 24 2018, 3:31 AM
JDevlieghere added a project: debug-info.
probinson added inline comments.Jul 24 2018, 9:35 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
51

DataSize and AddrCount look like they don't need to be class members.

71

The extra detail after the first sentence of this comment is all correct, but I think too much. How DWARF handles initial-length is consistent across all sections, there's no need to describe it.

85

getLength would be more consistent with LLVM style, not to mention the rest of the getters in this class.

lib/DebugInfo/DWARF/DWARFContext.cpp
28

LLVM style has the include files sorted by name, so this needs to be moved.

test/tools/llvm-dwarfdump/X86/debug_addr_X86.s
2 ↗(On Diff #156809)

If all you're dumping is .debug_addr, the test doesn't need line-table directives or .cfi directives or even any actual code. It just needs a couple of labels in the .text section separated by a known number of bytes. This will keep the test focused on exactly what it's trying to exercise.

test/tools/llvm-dwarfdump/X86/debug_addr_empty.s
8

Why this section directive?

test/tools/llvm-dwarfdump/X86/debug_addr_invalid.s
3

As with the other test, you don't need to bother with line-table or .cfi directives or actual code.

Not specific to your change, but I noticed that a lot of the DWARFContext code uses 32-bit offsets. Would it not make sense where we are writing new stuff, to use 64-bit offsets, since DWARF does support this?

I agree, but this requires refactoring of existing code first: for example at least DataExtractor should be able to work with 64bit offset. I'd prefer not to change it in this revision.

vleschuk marked 11 inline comments as done.Jul 25 2018, 2:10 AM
vleschuk added inline comments.
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
51

Actually they need: DataSize is used in getter when we are working with headerless .debug_addr section (pre-DWARFv5 implementation) (see new diff when I upload it.

AddrCount is computed in extract() method and used later in dump() method. The easiest way is to save in private class data.

63

I think that returning None is also a good indication of an error. Also non-existent index can indicate an error in upper-level logic. And just IMHO for getters like this returning a requested data as function result looks better that returning it in its argument:

Error getAddrEntry(uint32_t Index, uint64_t &Result);
lib/DebugInfo/DWARF/DWARFContext.cpp
254

I thought about it too, but after that I implemented support for pre-DWARFv5 .debug_addr and it required additional parameters for extract() method. I'd prefer to leave this refactoring out of scope of this revision.

262–266

Done.

506–511

Agree, moved it above .debug_ranges.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
26

Done for .debug_addr, getting rid of implementations in other DWARF classes requires separate NFC commit.

vleschuk marked 13 inline comments as done.Jul 25 2018, 2:15 AM
jhenderson added inline comments.Jul 25 2018, 2:19 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
63

Sorry, I should have said "Expected<uint64_t>":
Expected<uint64_t> getAddrEntry(uint32_t Index);

(Error would be used where the return type would be void or where bool or an error code would be returned). To me, Optional indicates that a "no result" is a normal and reasonable usage of this interface, whereas Expected indicates that it is bad usage of the function (whether due to malformed input or due to bad upper-level logic). It also ensures that the result must be tested for error, and cannot be ignored.

What happens when dumping a file using pre-standard debug_addr? (is the version of the CU checked to make sure that the debug_addr section has a header, etc) I know this is super annoying - every time there's a section that lacks its own self-describing, versioned header, it hurts the ability to dump that section in the future :/ (in any object file that contains CUs of the old version that might have a contribution from that section, and thus cause entries in the section without a self-describing/standalone/versioned header)

Added support for pre-DWARFv5 format. Thanks for pointing this out.

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

vleschuk marked 4 inline comments as done.Jul 25 2018, 3:06 AM
vleschuk updated this revision to Diff 157210.Jul 25 2018, 3:11 AM
  • Added support for pre-DWARFv5 .debug_addr format (e.g. w/o header)
  • Shortened .debug_addr consumer tests
  • Moved createError() implementation to llvm::Support
  • Relocated dumping above range_list dump
  • Moved table length verification inside AddrTable class
  • Renamed length() -> getLength()
  • Shortened comments
  • Fixed mistype in error message
probinson added inline comments.Jul 25 2018, 7:35 AM
include/llvm/DebugInfo/DWARF/DWARFContext.h
330

No, this is a bad idea. While we can reasonably expect address size to be the same across all units, the same is not true for DWARF version. Please remove this method and solve the problem another way (see comment where this is called in DWARFContext.cpp).

include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
51

I should think Addrs.size() would be identical to the recorded AddrCount, meaning AddrCount is redundant.

include/llvm/Support/Error.h
1132 ↗(On Diff #157210)

Because the intent is for this to be useful more widely, it should be committed separately. That way if you have to revert this patch, the utility template function remains in place for other users.

lib/DebugInfo/DWARF/CMakeLists.txt
18

This wants to be in alphabetical order.

lib/DebugInfo/DWARF/DWARFContext.cpp
485

The .debug_str_offsets section has the exact same problem, please use the same solution.

491–492

given that getCUAddrSize() is now implementing the for loop, shouldn't you delete this one? Similarly delete the comment as you moved it to the new function.

1615

While it is mechanically possible for address size to vary across CUs, in practice I think we are not going to see this come up within a single object file. The address size field is repeated across various DWARF headers (at least in v5) to make it easier to dump them independently, not to enable varying the address size.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
2

Needs a module header comment.

60

else goes on the same line as the preceding brace.

There's a very handy tool, clang-format-diff, which will solve this sort of thing for you.

113

Addrs.size() should work here, then you don't need AddrCount.

jhenderson added inline comments.Jul 25 2018, 7:50 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
25

Nit: unnecessary comment characters.

56

Please don't use unnecessary abbreviations in comments: addrs -> addresses (here and elsewhere).

87

This should be a static method, in my opinion, but better would be to leave it as a member and just test the class's Header Length field directly. In the latter case, you might want to rename the function to "hasValidLength"

89

length of array -> length of the array

include/llvm/Support/Error.h
1126 ↗(On Diff #157210)

This should probably be unit tested.

lib/DebugInfo/DWARF/DWARFContext.cpp
254

Okay, that's reasonable.

491–492

Do you need this code comment duplicated in full in both places?

1616

This FIXME should not be here.

1617

LLVM style uses capitalized variables i.e. addr -> Addr

(here and elsewhere)

test/tools/llvm-dwarfdump/X86/debug_addr_X86.s
1 ↗(On Diff #157210)

You don't need "X86" in the name of this test, because it is implied by the folder directory. Unless you want to distinguish between x86 and x86_64 (i.e. 4 byte and 8 byte addresses), which you may want to do anyway (but don't currently).

22–23 ↗(On Diff #157210)

If you do .long 1234 etc instead of .long .Lfunc_begin0, you can remove the .text section entirely, and not have to worry about knowing the size of nops etc.

test/tools/llvm-dwarfdump/X86/debug_addr_dwarf4.s
11

This looks to have the same issue as the other tests. Please remove everything that isn't needed. I think you'll find that even if the .debug_info etc sections are required, you can probably remove a number of the tags.

test/tools/llvm-dwarfdump/X86/debug_addr_empty.s
2

There's another test case probably needed, similar to this one: no .debug_addr section at all.

test/tools/llvm-dwarfdump/X86/debug_addr_invalid.s
22

I'd change this to be "unsupported" rather than invalid, since technically, 1 is a valid segment selector size.

vleschuk marked 2 inline comments as done.Jul 25 2018, 3:52 PM
vleschuk added inline comments.
include/llvm/Support/Error.h
1126 ↗(On Diff #157210)

Created separate review with simple unit test: https://reviews.llvm.org/D49824 .

1132 ↗(On Diff #157210)

Created separate review with simple unit test: https://reviews.llvm.org/D49824 .

vleschuk marked 4 inline comments as done.Jul 25 2018, 3:53 PM
vleschuk marked 14 inline comments as done.Jul 25 2018, 4:21 PM
vleschuk marked 2 inline comments as done.Jul 25 2018, 5:16 PM
vleschuk marked an inline comment as done.Jul 25 2018, 6:02 PM
vleschuk marked an inline comment as done.Jul 25 2018, 6:18 PM
vleschuk marked 3 inline comments as done.Jul 26 2018, 12:43 AM
vleschuk updated this revision to Diff 157429.Jul 26 2018, 12:45 AM
  • Used new createStringError() function
  • Shortened tests and added new trivial test-case w/o .debug_addr section
  • Used getMaxVersion() to guess which version of .debug_addr we are looking at
  • Fixed comment and style issues
jhenderson added inline comments.Jul 26 2018, 2:53 AM
include/llvm/BinaryFormat/Dwarf.def
874

These look to be in alphabetical order for the most part too, so you should probably move this.

include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
67

I'm wondering whether errc::invalid_argument might be more applicable. It looks like argument_out_of_domain is more to do with Mathematical functions, which this isn't.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
28

Again, I'd say this should be invalid_argument (also applies to other uses of argument_out_of_domain below).

34

Please don't warn directly in library functions like this! And even if you do, please use WithColor, to ensure it is consistent.

Different executables will have different styles for how they handle warnings (some might want them printing, others might want to ignore them, and yet more might want them to be treated as a failure). Since this warning doesn't prevent continuation, I recommend you use some sort of callback. You can see similar things in the DWARFDebugLine::LineTable class, for what I mean, and would recommend.

44

There is errc::not_supported, which I'd recommend for this.

72

It may be worth saying that Version != 5 is not supported (except for the special case)?

80

This should probably be not_supported too.

93

not_supported again.

125

Are you going to do 64-bit addresses immediately after this?

test/tools/llvm-dwarfdump/X86/debug_addr_absent.s
5

You'll want to redirect stderr to stdout too, if you are going to check for warnings etc. I'd also go as far as to say that you should do CHECK-NOT: {{.}} at the end to show that we printed nothing else.

test/tools/llvm-dwarfdump/X86/debug_addr_dwarf4.s
21

I think you can delete this label.

test/tools/llvm-dwarfdump/X86/debug_addr_empty.s
2

And one more I thought of: an addr section with a header but no members. You might want to add that to the basic test case, rather than as a new file, as that will show what happens when there are multiple tables in the same file.

test/tools/llvm-dwarfdump/X86/debug_addr_invalid.s
6

You should probably add tests for each of the different error messages you can emit, at least reasonably. The ones that I can see are where the section is too small for a length field, the section is shorter than the (otherwise valid) length field (both as too short for the header and too short for the supposed payload), if the version is not specified (i.e. Version == 0), mismatching or invalid versions, and mismatching addresses (essentially that's all the ones that are not "not_supported".

You might also want tests for the currently unsupported cases, such as DWARF64 too.

24

You should probably make this a valid section, given a non-zero segment_selector size.

probinson added inline comments.Jul 26 2018, 7:07 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
125

Other parts of the library (e.g. DWARFFormValue) unconditionally print a 64-bit address, regardless of the target address size. This probably should be considered a bug in DWARFFormValue but if you print a 64-bit value here, at least it would be consistent with the rest of the library.

vleschuk marked 14 inline comments as done.Jul 27 2018, 1:52 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
44

There is only errc::function_not_supported. I used this one.

jhenderson added inline comments.Jul 27 2018, 1:57 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
44

There is only errc::function_not_supported. I used this one.

Nope, errc::not_supported does definitely exist. Go to https://en.cppreference.com/w/cpp/error/errc and search for ENOTSUP for reference.

If it isn't there, then your standard library is not C++11 compliant...

vleschuk updated this revision to Diff 157656.Jul 27 2018, 1:58 AM
vleschuk marked an inline comment as done.Jul 27 2018, 1:58 AM
  • Added more tests to verify all generated error messages
  • Added invalidateLength() method to DebugAddrTable class to mark the point when we need to stop extraction
  • Added support for dumping 64bit addresses
vleschuk marked 2 inline comments as done.Jul 27 2018, 2:17 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
44

Indeed, sorry for that. Committed rL338114: [Support] Bring std::errc::not_supported to llvm::errc.

vleschuk updated this revision to Diff 157657.Jul 27 2018, 2:24 AM
vleschuk marked an inline comment as done.

Changed errc::function_not_supported to errc::not_supported.

vleschuk marked 2 inline comments as done.Jul 27 2018, 2:25 AM
jhenderson added inline comments.Jul 27 2018, 8:39 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
70

It might be worth making this message a little more verbose, to mention the fact that it's out of range of the address table at offset x, if possible.

92

Nit: missing full stop.

lib/DebugInfo/DWARF/DWARFContext.cpp
253

Since .debug_addr can pre-date DWARF v5, you should probably just remove this comment (or at least the bit to do with the version).

258

"CB" is not an obvious abbreviation to me. Please just call it "WarnCallback" or similar (ditto in the actual extract function).

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
53

Missing full stop.

55

You might as well use the existing function for this.

But actually, here and elsewhere below, I suggest you just assume that the Length is valid. I'd only invalidate it when the version is unsupported, DWARF64 is detected or the section is too short for a length field. That way the caller can just skip the rest of the table and attempt to read another one a bit later, which might work.

72

It might have lost track. Which error message shows this exactly?

88

We do support DWARF version 5 for now as well as... -> We support DWARF version 5 and the...

table which -> table, which

header -> a header

series of -> a series of

93

Probably worth putting here where the version is suggested from.

98

Here and elsewhere, use the PRIu8 and similar instead of explicit "hhu" when the type is uint8_t etc.

test/tools/llvm-dwarfdump/X86/debug_addr.s
23

It would be nice to have a test case that doesn't produce this warning.

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

Oh, sure - it was added here: https://reviews.llvm.org/D36313

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

Oh, sure - it was added here: https://reviews.llvm.org/D36313

Thank you David! If you don't mind I will implement this in a separate revision: there is already a lot on the plate for this one :-)

vleschuk marked 11 inline comments as done.Jul 27 2018, 8:30 PM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
72

Example:

# ERR: section is not large enough to contain a .debug_addr table of length 0x10 at offset 0x0
# ERR-NOT: {{.}}                                                                
                                                                                
# too small section to contain section of given length                          
  .section  .debug_addr,"",@progbits                                            
.Ldebug_addr0:                                                                  
  .long 12 # unit_length                                                        
  .short 5 # version                                                            
  .byte 4 # address_size                                                        
  .byte 0 # segment_selector_size

I added a test for this.

test/tools/llvm-dwarfdump/X86/debug_addr.s
23

We already have on: see debug_addr_version_mismatch.s - the first table produces the warning, the second one doesn't.

vleschuk updated this revision to Diff 157834.Jul 27 2018, 8:40 PM
vleschuk marked 2 inline comments as done.
  • More verbose error messages
  • Fixes for comment style
  • New test for the situation when section size too small for given length
jhenderson added inline comments.Jul 30 2018, 2:50 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
74

Nit: Since this and the function before are multi-line definitions, I'd suggest putting blank lines either side of them.

lib/DebugInfo/DWARF/DWARFContext.cpp
1618

Nit: I think this needs to be the explicit type, rather than auto, based on the coding style guide.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
17

I'd call invalidateLength here (yes, the reinitializing of HeaderData renders this redundant, but that then means you are relying on knowing that the value 0 for Length is the method used for knowing if the Length is invalid).

26

I think the common thing to do in other parts of the DWARF library is to call clear() here, to ensure we don't try to extract into an already-filled in table.

37

This should probably say "DWARF version is not defined in a CU" etc, since the Version is defined in the table usually.

55

Do you have any comments on the second half of this inline comment? Because I don't think that's been done.

72

Sorry, I think I meant this point here:

It may be worth saying that Version != 5 is not supported (except for the special case)?

I don't see this being done at all.

86

implementation -> implementations

93

by DWARF -> by the DWARF.

But if I've understood the behaviour correctly this may be a bad error message to have, actually. Consider the case where we have both DWARF4 and DWARF5 style debug info tables in a single file (e.g. an executable). Won't this result in a version mismatch error? Or am I misunderstanding how this code works? If so, it probably needs comments explaining what is meant by "SuggestedVersion" and where it typically might come from. It might help to rename it "UnitVersion" as well.

Nit: by DWARF -> by the DWARF...

155

Don't use auto here, as per the style guide.

test/tools/llvm-dwarfdump/X86/debug_addr.s
23

Right, in that case, I don't think you should check the warning in this test. In fact, it might be worth crafting a small .debug_info unit to suppress the warning in this basic test case.

test/tools/llvm-dwarfdump/X86/debug_addr_size_mismatch.s
1 ↗(On Diff #157834)

This test should be renamed to debug_addr_address_size_mismatch or similar, since the size could be referring to several different things.

vleschuk marked an inline comment as done.Jul 30 2018, 3:25 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFContext.cpp
1618

Actually this will differ from the style used in this particular file. I think we should leave auto here.

vleschuk marked 11 inline comments as done.Jul 30 2018, 3:41 AM
vleschuk added inline comments.Jul 30 2018, 4:44 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

Actually I don't see how it might work. The caller relies on the fact that length value from the header is valid and parses the rest of the section based on this assumption. In this case the length field is definitely invalid and telling caller that everything is ok will result in incorrect parsing. Here is an example:

.Ldebug_addr0:                                                                  
  .long 1 # unit_length (should be 12 here)                                                        
  .short 5 # version                                                            
  .byte 4 # address_size                                                        
  .byte 0 # segment_selector_size                                               
  .long 0x00000000                                                              
  .long 0x00000001
if (!AddrTable.hasValidLength())                                          
  break; // If we do not break here we will increase Offset by 5 and not by 16 -> incorrect result.                                                                   
uint64_t Length = AddrTable.getLength();                                  
Offset = TableOffset + Length;
72

I am sorry, I think now that's me who lost track here? Are talking about the comment below?

// We support DWARF version 5 for now as well as pre-DWARF5 ...

93

You understood it correctly. But there is no ideal way to handle this. I used the same approach here as used in extraction of DebugStrOffsets, CU "suggests" MaxVersion found and version mismatch is treated as error: in case of version mismatch one can't tell exactly how to parse the header.

vleschuk updated this revision to Diff 157941.EditedJul 30 2018, 4:46 AM
  • Remove unnecessary warnings from tests
  • Separated test for 64bit addresses
  • Explicit length invalidation in clear()
  • Explicit clear() before extraction
  • Style and comment fixes
  • Rename debug_addr_size_not_multiple.s ==> debug_addr_address_size_not_multiple.s and debug_addr_size_not_multiple.s ==> debug_addr_address_size_not_multiple.s
vleschuk marked 2 inline comments as done.Jul 30 2018, 4:48 AM
vleschuk updated this revision to Diff 158172.Jul 30 2018, 10:30 PM

As mentioned in https://reviews.llvm.org/D50005 changed "header" message:

"Addr Section params" -> "Addr Section".

jhenderson added inline comments.Jul 31 2018, 2:47 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

I think this comes down to "who should we trust more?". What do we do in other cases in the DWARF library?

72

I'll reset this by commenting on the line in question...

91

Before this check, we should check that the Version is a supported Version. The only ones supported are <= 5 (I would argue 4 and 5 explicitly, but I'm not sure at what point pre-standardisated .debug_addr tables started getting emitted, so the lower bound might be wrong). So if we encounter a Version >= 6, we should emit an unsupported error, because in that case, we can't guarantee that our parser can handle it. I'd then assume the Length is valid and jump to the next table when iterating over it.

93

Okay, I think I get you now. I think it shouldn't block this going in, in its first form, but I think I'd consider it a bug. Consider the following case:

  • test.o has been compiled with DWARF version 5
  • test2.o has been compiled with DWARF version 4 (and has a .debug_addr table)
  • test.elf is test.o and test2.o linked together.
  • llvm-dwarfdump will fail to dump both .debug_addr tables, because there will always be a mismatch.

Such a scenario is far from unlikely. It's quite common for static libraries built with older compiler versions (and therefore potentially different DWARF versions) to be linked together with objects or libraries from newer compilers.

By my understanding of the DWARF5 standard, the correct way to associate a .debug_addr table with a .debug_info table is to look at the DW_AT_addr_base attribute in the info table. In this case, each of the two .debug_info tables would point at a different .debug_addr table.

vleschuk marked an inline comment as done.Jul 31 2018, 4:08 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

The most similar case is .debug_rnglists implementation where the same "mechanism" (break if length is invalid) is used, however the RnglistsTable implementation doesn't use it all: there is know explicit invalidation in case of error (I think we should work on it and see if there are any problems).

In our case I suggest to leave invalidation here: we assume that the header is correct and read the Length value from it, after that we see that our data contradicts our assumption and either Length is invalid or the section is malformed. In any case it is better to report an error than produce potentially invalid output.

93

I think we can put a FIXME mark here for now and get back to it later if you don't mind.

vleschuk updated this revision to Diff 158213.Jul 31 2018, 4:12 AM
  • Added FIXME comment for version mismatch situation
  • Added check for the version <= 5 and corresponding test.
vleschuk added inline comments.Jul 31 2018, 4:13 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

there is know explicit invalidation

there is no explicit invalidation

JDevlieghere added inline comments.Jul 31 2018, 7:42 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
56

Nit: a newline between these functions would improve readability, especially with the doxygen comments.

64

Please give the DIDumpOptions a default value to make the argument optional. (i.e. DIDumpOptions DumpOpts = {})

67

Does this have to be implemented in the header?

lib/DebugInfo/DWARF/DWARFContext.cpp
1612

Nit: s/fact/theory/

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
121

Please use a full sentences here.

156

I think a raw_string_ostream might better communicate your intentions of building up a string.

vleschuk marked 4 inline comments as done.Jul 31 2018, 8:03 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
156

I don't think so, in this situation usage of additional ostreams to build a format string will result in more code which will be less readable. I'd prefer to keep current code.

vleschuk updated this revision to Diff 158269.EditedJul 31 2018, 8:06 AM
  • Fixed comments and gaps between lines.
  • Moved get-by-index implementation from header to cpp.
  • Added default value to DumpOptions.

Okay, I think I'm happy with this, but please let the others give their feedback and thumbs up too.

Have you considered writing unit tests for this at all? I don't think they're strictly necessary, but they would be nice.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
96–99

I'd move this FIXME to where the UnitVersion is calculated in DWARFContext.cpp.

I also think the last two lines of this comment make no sense on their own, and can probably just be deleted.

Nit: as error, however the -> as an error. However, the

Aside: you don't need to manually wrap comments, because clang-format will do it for you at the correct column width.

test/tools/llvm-dwarfdump/X86/debug_addr_unsupported_version.s
20–24 ↗(On Diff #158213)

The spacing here (and in other tests) is a little all over the place. Please tidy it up a little.

vleschuk updated this revision to Diff 158280.Jul 31 2018, 8:25 AM
  • Fixed spacing in tests
  • Minor grammar fixes in comments

Okay, I think I'm happy with this, but please let the others give their feedback and thumbs up too.

@aprantl , @dblaikie , @probinson, do you have any objections on this?

Have you considered writing unit tests for this at all? I don't think they're strictly necessary, but they would be nice.

I thought about it, I will give it more thought when we have more mature implementation.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
96–99

I'd move this FIXME to where the UnitVersion is calculated in DWARFContext.cpp.

I think it better to leave it here where the actual comparison is performed.

I also think the last two lines of this comment make no sense on their own, and can probably just be deleted.

Agreed.

Aside: you don't need to manually wrap comments, because clang-format will do it for you at the correct column width.

Yep, sorry, always forget to use this tool, need to hotkey it to my vimrc =)

Related aside to my comments regarding version mismatches: is it likely that future versions of DWARF (e.g. DWARF 6) will update the version number of the .debug_addr tables without changing anything in them? I know in previous versions of DWARF, some sections didn't increase in version number in line with the overall standard version. If not, then the version mismatch check may not make sense anyway.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
96–99

But the error message will still want to exist, I believe, so the issue is finding the right CU, which is not done here. If somebody where to want to call this from another place, they might copy the existing call site, and will not be aware of the FIXME in that instance, resulting in them copying potentially dangerous code.

  • Fixed spacing in tests

The space between directive/value/comment still look quite messy to me. Could you fix that too, please.

Sorry about the late reply, I've been on vacation.

include/llvm/DebugInfo/DIContext.h
157

This looks like the wrong place to store this. Many dwarf sections have version numbers that are independent form one another. For example on Darwin we often use dwarf 4 debug_info and dwarf 2 debug_line in the same file.

158

Why is this a property of the dump options and not of whatever object that is being dumped?

vleschuk added inline comments.Jul 31 2018, 2:43 PM
include/llvm/DebugInfo/DIContext.h
158

I agree with the comment above about version, but address size should be the same for every section, am I right?

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

Oh, sure - it was added here: https://reviews.llvm.org/D36313

Thank you David! If you don't mind I will implement this in a separate revision: there is already a lot on the plate for this one :-)

Yeah, better as a separate patch in any case. And there's lots of things we can improve in that regard anyway - many places addresses show up & how they could be rendered more informatively with things like this.

aprantl accepted this revision.Jul 31 2018, 3:02 PM

lgtm with the two parameters moved from DIDumpOptions into explict arguments of dumpAddrSection.

This revision is now accepted and ready to land.Jul 31 2018, 3:02 PM

Closed by rL338447

Hi @vleschuk, you didn't address my final comments before committing this, as far as I can see. Please could you go back and fix them.

Additionally, please remember to follow the guidelines at https://llvm.org/docs/Phabricator.html#committing-a-change when committing a change reviewed via Phabricator, specifically the bit about "Differential Revision" as that auto-closes the review and syncs up the diff.

It looks like my request wasn't addressed either. Can you please fix this soon?

@vleschuk, It looks like you *still* haven't addressed my comment

lgtm with the two parameters moved from DIDumpOptions into explict arguments of dumpAddrSection.

Can you please get this done ASAP?