Page MenuHomePhabricator

[DWARF] support for .debug_addr (consumer)
ClosedPublic

Authored by vleschuk on Jul 23 2018, 9:52 AM.

Details

Summary

This patch implements basic support for parsing and dumping DWARFv5 .debug_addr section. There is still work to be done, further patches are on their way but I think it would be better to review and commit the "sceleton" first.

What is currently missing:

  • symbol resolution (we dump only addresses)
  • support for non-zero segment selector size
  • support for 64bit dwarf

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vleschuk marked an inline comment as done.Jul 27 2018, 1:58 AM
  • Added more tests to verify all generated error messages
  • Added invalidateLength() method to DebugAddrTable class to mark the point when we need to stop extraction
  • Added support for dumping 64bit addresses
vleschuk marked 2 inline comments as done.Jul 27 2018, 2:17 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
45

Indeed, sorry for that. Committed rL338114: [Support] Bring std::errc::not_supported to llvm::errc.

vleschuk updated this revision to Diff 157657.Jul 27 2018, 2:24 AM
vleschuk marked an inline comment as done.

Changed errc::function_not_supported to errc::not_supported.

vleschuk marked 2 inline comments as done.Jul 27 2018, 2:25 AM
jhenderson added inline comments.Jul 27 2018, 8:39 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
70

It might be worth making this message a little more verbose, to mention the fact that it's out of range of the address table at offset x, if possible.

92

Nit: missing full stop.

lib/DebugInfo/DWARF/DWARFContext.cpp
253

Since .debug_addr can pre-date DWARF v5, you should probably just remove this comment (or at least the bit to do with the version).

258

"CB" is not an obvious abbreviation to me. Please just call it "WarnCallback" or similar (ditto in the actual extract function).

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
53

Missing full stop.

55

You might as well use the existing function for this.

But actually, here and elsewhere below, I suggest you just assume that the Length is valid. I'd only invalidate it when the version is unsupported, DWARF64 is detected or the section is too short for a length field. That way the caller can just skip the rest of the table and attempt to read another one a bit later, which might work.

73

It might have lost track. Which error message shows this exactly?

88

We do support DWARF version 5 for now as well as... -> We support DWARF version 5 and the...

table which -> table, which

header -> a header

series of -> a series of

93

Probably worth putting here where the version is suggested from.

98

Here and elsewhere, use the PRIu8 and similar instead of explicit "hhu" when the type is uint8_t etc.

test/tools/llvm-dwarfdump/X86/debug_addr.s
23

It would be nice to have a test case that doesn't produce this warning.

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

Oh, sure - it was added here: https://reviews.llvm.org/D36313

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

Oh, sure - it was added here: https://reviews.llvm.org/D36313

Thank you David! If you don't mind I will implement this in a separate revision: there is already a lot on the plate for this one :-)

vleschuk marked 11 inline comments as done.Jul 27 2018, 8:30 PM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
73

Example:

# ERR: section is not large enough to contain a .debug_addr table of length 0x10 at offset 0x0
# ERR-NOT: {{.}}                                                                
                                                                                
# too small section to contain section of given length                          
  .section  .debug_addr,"",@progbits                                            
.Ldebug_addr0:                                                                  
  .long 12 # unit_length                                                        
  .short 5 # version                                                            
  .byte 4 # address_size                                                        
  .byte 0 # segment_selector_size

I added a test for this.

test/tools/llvm-dwarfdump/X86/debug_addr.s
23

We already have on: see debug_addr_version_mismatch.s - the first table produces the warning, the second one doesn't.

vleschuk updated this revision to Diff 157834.Jul 27 2018, 8:40 PM
vleschuk marked 2 inline comments as done.
  • More verbose error messages
  • Fixes for comment style
  • New test for the situation when section size too small for given length
jhenderson added inline comments.Jul 30 2018, 2:50 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
74

Nit: Since this and the function before are multi-line definitions, I'd suggest putting blank lines either side of them.

lib/DebugInfo/DWARF/DWARFContext.cpp
1625

Nit: I think this needs to be the explicit type, rather than auto, based on the coding style guide.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
17

I'd call invalidateLength here (yes, the reinitializing of HeaderData renders this redundant, but that then means you are relying on knowing that the value 0 for Length is the method used for knowing if the Length is invalid).

26

I think the common thing to do in other parts of the DWARF library is to call clear() here, to ensure we don't try to extract into an already-filled in table.

37

This should probably say "DWARF version is not defined in a CU" etc, since the Version is defined in the table usually.

55

Do you have any comments on the second half of this inline comment? Because I don't think that's been done.

73

Sorry, I think I meant this point here:

It may be worth saying that Version != 5 is not supported (except for the special case)?

I don't see this being done at all.

86

implementation -> implementations

93

by DWARF -> by the DWARF.

But if I've understood the behaviour correctly this may be a bad error message to have, actually. Consider the case where we have both DWARF4 and DWARF5 style debug info tables in a single file (e.g. an executable). Won't this result in a version mismatch error? Or am I misunderstanding how this code works? If so, it probably needs comments explaining what is meant by "SuggestedVersion" and where it typically might come from. It might help to rename it "UnitVersion" as well.

Nit: by DWARF -> by the DWARF...

155

Don't use auto here, as per the style guide.

test/tools/llvm-dwarfdump/X86/debug_addr.s
23

Right, in that case, I don't think you should check the warning in this test. In fact, it might be worth crafting a small .debug_info unit to suppress the warning in this basic test case.

test/tools/llvm-dwarfdump/X86/debug_addr_size_mismatch.s
1 ↗(On Diff #157834)

This test should be renamed to debug_addr_address_size_mismatch or similar, since the size could be referring to several different things.

vleschuk marked an inline comment as done.Jul 30 2018, 3:25 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFContext.cpp
1625

Actually this will differ from the style used in this particular file. I think we should leave auto here.

vleschuk marked 11 inline comments as done.Jul 30 2018, 3:41 AM
vleschuk added inline comments.Jul 30 2018, 4:44 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

Actually I don't see how it might work. The caller relies on the fact that length value from the header is valid and parses the rest of the section based on this assumption. In this case the length field is definitely invalid and telling caller that everything is ok will result in incorrect parsing. Here is an example:

.Ldebug_addr0:                                                                  
  .long 1 # unit_length (should be 12 here)                                                        
  .short 5 # version                                                            
  .byte 4 # address_size                                                        
  .byte 0 # segment_selector_size                                               
  .long 0x00000000                                                              
  .long 0x00000001
if (!AddrTable.hasValidLength())                                          
  break; // If we do not break here we will increase Offset by 5 and not by 16 -> incorrect result.                                                                   
uint64_t Length = AddrTable.getLength();                                  
Offset = TableOffset + Length;
73

I am sorry, I think now that's me who lost track here? Are talking about the comment below?

// We support DWARF version 5 for now as well as pre-DWARF5 ...

93

You understood it correctly. But there is no ideal way to handle this. I used the same approach here as used in extraction of DebugStrOffsets, CU "suggests" MaxVersion found and version mismatch is treated as error: in case of version mismatch one can't tell exactly how to parse the header.

vleschuk updated this revision to Diff 157941.EditedJul 30 2018, 4:46 AM
  • Remove unnecessary warnings from tests
  • Separated test for 64bit addresses
  • Explicit length invalidation in clear()
  • Explicit clear() before extraction
  • Style and comment fixes
  • Rename debug_addr_size_not_multiple.s ==> debug_addr_address_size_not_multiple.s and debug_addr_size_not_multiple.s ==> debug_addr_address_size_not_multiple.s
vleschuk marked 2 inline comments as done.Jul 30 2018, 4:48 AM
vleschuk updated this revision to Diff 158172.Jul 30 2018, 10:30 PM

As mentioned in https://reviews.llvm.org/D50005 changed "header" message:

"Addr Section params" -> "Addr Section".

jhenderson added inline comments.Jul 31 2018, 2:47 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

I think this comes down to "who should we trust more?". What do we do in other cases in the DWARF library?

73

I'll reset this by commenting on the line in question...

91

Before this check, we should check that the Version is a supported Version. The only ones supported are <= 5 (I would argue 4 and 5 explicitly, but I'm not sure at what point pre-standardisated .debug_addr tables started getting emitted, so the lower bound might be wrong). So if we encounter a Version >= 6, we should emit an unsupported error, because in that case, we can't guarantee that our parser can handle it. I'd then assume the Length is valid and jump to the next table when iterating over it.

93

Okay, I think I get you now. I think it shouldn't block this going in, in its first form, but I think I'd consider it a bug. Consider the following case:

  • test.o has been compiled with DWARF version 5
  • test2.o has been compiled with DWARF version 4 (and has a .debug_addr table)
  • test.elf is test.o and test2.o linked together.
  • llvm-dwarfdump will fail to dump both .debug_addr tables, because there will always be a mismatch.

Such a scenario is far from unlikely. It's quite common for static libraries built with older compiler versions (and therefore potentially different DWARF versions) to be linked together with objects or libraries from newer compilers.

By my understanding of the DWARF5 standard, the correct way to associate a .debug_addr table with a .debug_info table is to look at the DW_AT_addr_base attribute in the info table. In this case, each of the two .debug_info tables would point at a different .debug_addr table.

vleschuk marked an inline comment as done.Jul 31 2018, 4:08 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

The most similar case is .debug_rnglists implementation where the same "mechanism" (break if length is invalid) is used, however the RnglistsTable implementation doesn't use it all: there is know explicit invalidation in case of error (I think we should work on it and see if there are any problems).

In our case I suggest to leave invalidation here: we assume that the header is correct and read the Length value from it, after that we see that our data contradicts our assumption and either Length is invalid or the section is malformed. In any case it is better to report an error than produce potentially invalid output.

93

I think we can put a FIXME mark here for now and get back to it later if you don't mind.

vleschuk updated this revision to Diff 158213.Jul 31 2018, 4:12 AM
  • Added FIXME comment for version mismatch situation
  • Added check for the version <= 5 and corresponding test.
vleschuk added inline comments.Jul 31 2018, 4:13 AM
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
55

there is know explicit invalidation

there is no explicit invalidation

JDevlieghere added inline comments.Jul 31 2018, 7:42 AM
include/llvm/DebugInfo/DWARF/DWARFDebugAddr.h
56

Nit: a newline between these functions would improve readability, especially with the doxygen comments.

64

Please give the DIDumpOptions a default value to make the argument optional. (i.e. DIDumpOptions DumpOpts = {})

67

Does this have to be implemented in the header?

lib/DebugInfo/DWARF/DWARFContext.cpp
1619

Nit: s/fact/theory/

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
121

Please use a full sentences here.

156

I think a raw_string_ostream might better communicate your intentions of building up a string.

vleschuk marked 4 inline comments as done.Jul 31 2018, 8:03 AM
vleschuk added inline comments.
lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
156

I don't think so, in this situation usage of additional ostreams to build a format string will result in more code which will be less readable. I'd prefer to keep current code.

vleschuk updated this revision to Diff 158269.EditedJul 31 2018, 8:06 AM
  • Fixed comments and gaps between lines.
  • Moved get-by-index implementation from header to cpp.
  • Added default value to DumpOptions.

Okay, I think I'm happy with this, but please let the others give their feedback and thumbs up too.

Have you considered writing unit tests for this at all? I don't think they're strictly necessary, but they would be nice.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
96–99

I'd move this FIXME to where the UnitVersion is calculated in DWARFContext.cpp.

I also think the last two lines of this comment make no sense on their own, and can probably just be deleted.

Nit: as error, however the -> as an error. However, the

Aside: you don't need to manually wrap comments, because clang-format will do it for you at the correct column width.

test/tools/llvm-dwarfdump/X86/debug_addr_unsupported_version.s
21–25

The spacing here (and in other tests) is a little all over the place. Please tidy it up a little.

vleschuk updated this revision to Diff 158280.Jul 31 2018, 8:25 AM
  • Fixed spacing in tests
  • Minor grammar fixes in comments

Okay, I think I'm happy with this, but please let the others give their feedback and thumbs up too.

@aprantl , @dblaikie , @probinson, do you have any objections on this?

Have you considered writing unit tests for this at all? I don't think they're strictly necessary, but they would be nice.

I thought about it, I will give it more thought when we have more mature implementation.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
96–99

I'd move this FIXME to where the UnitVersion is calculated in DWARFContext.cpp.

I think it better to leave it here where the actual comparison is performed.

I also think the last two lines of this comment make no sense on their own, and can probably just be deleted.

Agreed.

Aside: you don't need to manually wrap comments, because clang-format will do it for you at the correct column width.

Yep, sorry, always forget to use this tool, need to hotkey it to my vimrc =)

Related aside to my comments regarding version mismatches: is it likely that future versions of DWARF (e.g. DWARF 6) will update the version number of the .debug_addr tables without changing anything in them? I know in previous versions of DWARF, some sections didn't increase in version number in line with the overall standard version. If not, then the version mismatch check may not make sense anyway.

lib/DebugInfo/DWARF/DWARFDebugAddr.cpp
96–99

But the error message will still want to exist, I believe, so the issue is finding the right CU, which is not done here. If somebody where to want to call this from another place, they might copy the existing call site, and will not be aware of the FIXME in that instance, resulting in them copying potentially dangerous code.

  • Fixed spacing in tests

The space between directive/value/comment still look quite messy to me. Could you fix that too, please.

Sorry about the late reply, I've been on vacation.

include/llvm/DebugInfo/DIContext.h
157

This looks like the wrong place to store this. Many dwarf sections have version numbers that are independent form one another. For example on Darwin we often use dwarf 4 debug_info and dwarf 2 debug_line in the same file.

158

Why is this a property of the dump options and not of whatever object that is being dumped?

vleschuk added inline comments.Jul 31 2018, 2:43 PM
include/llvm/DebugInfo/DIContext.h
158

I agree with the comment above about version, but address size should be the same for every section, am I right?

Also - I know for DW_AT_ranges attributes we have some advanced dumping in some situations that can print section names and the like, I think? Could we reuse that here too so the dumped addresses also have section name+offset, perhaps?

Sorry, actually I don't see this functionality for DW_AT_ranges, could you please tell me where I missed it?

Oh, sure - it was added here: https://reviews.llvm.org/D36313

Thank you David! If you don't mind I will implement this in a separate revision: there is already a lot on the plate for this one :-)

Yeah, better as a separate patch in any case. And there's lots of things we can improve in that regard anyway - many places addresses show up & how they could be rendered more informatively with things like this.

aprantl accepted this revision.Jul 31 2018, 3:02 PM

lgtm with the two parameters moved from DIDumpOptions into explict arguments of dumpAddrSection.

This revision is now accepted and ready to land.Jul 31 2018, 3:02 PM

Closed by rL338447

Hi @vleschuk, you didn't address my final comments before committing this, as far as I can see. Please could you go back and fix them.

Additionally, please remember to follow the guidelines at https://llvm.org/docs/Phabricator.html#committing-a-change when committing a change reviewed via Phabricator, specifically the bit about "Differential Revision" as that auto-closes the review and syncs up the diff.

It looks like my request wasn't addressed either. Can you please fix this soon?

@vleschuk, It looks like you *still* haven't addressed my comment

lgtm with the two parameters moved from DIDumpOptions into explict arguments of dumpAddrSection.

Can you please get this done ASAP?