This is an archive of the discontinued LLVM Phabricator instance.

lldb-dwarfdump -verify [-quiet]
ClosedPublic

Authored by clayborg on May 1 2017, 11:05 AM.

Details

Summary

Adds initial llvm-dwarfdump --verify support with unit tests.

lldb-dwarfdump gets a new "--verify" option that will verify a single file's DWARF debug info and will print out any errors that it finds. It will return an non-zero exit status if verification fails, and a zero exit status if verification succeeds. Adding the --quiet option will suppress any output the STDOUT or STDERR.

The first part of the verify does the following:

  • verifies that all CU relative references (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata) have valid CU offsets
  • verifies that all DW_FORM_ref_addr references have valid .debug_info offsets
  • verifies that all DW_AT_ranges attributes have valid .debug_ranges offsets
  • verifies that all DW_AT_stmt_list attributes have valid .debug_line offsets
  • verifies that all DW_FORM_strp attributes have valid .debug_str offsets

Unit tests were added for each of the above cases.

Diff Detail

Event Timeline

clayborg created this revision.May 1 2017, 11:05 AM
probinson edited edge metadata.May 1 2017, 1:00 PM

Mostly formatting and commentary points. I haven't looked at the test yet.

lib/DebugInfo/DWARF/DWARFContext.cpp
289

I know this is the idiom used everywhere else, but here instead of the 'if' guarding a call or a just a few lines, it's guarding essentially the entire method. I'd rather see this inverted to an early exit here, and reduce indentation in the rest of the method.

301

So, most cross-section references are verified based on the attribute (because they use forms that can refer to different sections based on context), but sometimes they are just based on the form (because the form only allows a reference to one other section, like FORM_strp). Did I get that right? It would be nice to see a comment alluding to why some references are done one way and some another way.

302

This indentation doesn't look right; maybe use clang-format?

391

I wonder if forms that should have been handled by-attribute should be explicitly documented here. Or maybe that would be too much clutter? I am not sure.

clayborg added inline comments.May 1 2017, 1:09 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
289

I am breaking my change up into many piece more to come. This if should remain for now since the a future patch will need this to be here.

301

I can add a comment.

391

We would mention them here, but I think that might clutter things.

LGTM but I think Adrian should have a chance at it, as he had many comments on the previous iteration.

lib/DebugInfo/DWARF/DWARFContext.cpp
289

ok but will probably want to be broken up into easier-to-digest pieces later then.

aprantl added inline comments.May 1 2017, 2:14 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
288

Success

290

Would it make sense to split this entire verifier into a separate function?
Alternatively, if many verifiers need to iterate over the same data, would it make sense to split this out into a verify/visitCompileUnit function?

293

please clang-format

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
107

Do we really want to exit() here? What happens when llvm-dwarfdump is invoked on a .dSYM or static archive with more than one object file?

unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
1711

Whoa. This looks nice! Out of curiosity, is there a specific advantage to doing this as a unit test as opposed to a FileCheck-based test?

clayborg added inline comments.May 1 2017, 2:24 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
290

One issue with splitting it out is we will need to pass all data structures to those functions. For instance, the .debug_arange verifier will need access to a set of FunctionInfos (see previous patch). So I can split it out, but this function needs to be called even when .debug_info isn't being verified if we are verifying .debug_aranges because we will need to truth of what function ranges are in a CU and be able to compare it to the contents of the .debug_aranges. Let me know what you think...

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
107

I could make a static variable that gets set to true and exit after the iterations of all files and objects is done. It would be a bit messy to try and plumb the bool return value through since the dump and verify share the same code right now?

unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
1711

Easier to debug and edit and fix is my best answer.

clayborg updated this revision to Diff 97341.May 1 2017, 2:35 PM
  • clang-format
  • fixes for Adrian
clayborg marked 2 inline comments as done.May 1 2017, 2:36 PM

Marked things done

clayborg updated this revision to Diff 97343.May 1 2017, 2:40 PM

Don't exit right away in llvm-dwarfdump if we encounter errors, exit with non-zero status after we have iterated across all files.

clayborg marked an inline comment as done.May 1 2017, 2:40 PM
aprantl added inline comments.May 1 2017, 2:40 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
290

Perhaps we can model it after llvm/lib/IR/Verifier.cpp.
It uses a private Verifier class that has visit[Enity] methods that implement various checks. State is passed through class members.

Would something like that work?

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
107

We should probably just make verify a separate action from dump.

117

... and rename this to ProcessInput or something

129
... {
  if (Dump && !Quiet)
     DumpObjectFile(*Obj, Filename);
  if (Verify)
     Result |= VerifyObjectFile(...)
}

etc.

unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
1711

As long as we don't need any regex-matching on the output, this should be fine. And we can easily convert it to FileCheck later on.

clayborg added inline comments.May 1 2017, 2:51 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
290

I can make it work. Can I do that in the next patch that will add the next level of functionality?

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
129

There are two calls to DumpObjectFile here, see the call below. Do we want to duplicate the twine below twice? Seems cleaner to do it the way it is currently done? The top loop uses std::for_each to we can't get the return value so we have to use a global anyway.

aprantl accepted this revision.May 1 2017, 2:58 PM

LGTM once outstanding issues are addressed.

lib/DebugInfo/DWARF/DWARFContext.cpp
290

Sure! I think we agree on the direction the code should evolve to and that is what is important here.

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
129

If we stick with boolean return values, we could use std::all_of(Objects.begin(), Objects.end(), Verify) and it would still look kind of nice.

This revision is now accepted and ready to land.May 1 2017, 2:58 PM
aprantl added inline comments.May 1 2017, 3:01 PM
tools/llvm-dwarfdump/llvm-dwarfdump.cpp
129

Hmm.. I guess it will be more like:

auto Action = [&](Object *Obj)->bool {
 bool Result = true;
  if (Dump && !Quiet)
     DumpObjectFile(*Obj);
  if (Verify)
     Result = VerifyObjectFile(Obj);
  return Result; 
};

std::all_of(Objects.begin(), Objects.end(), Action)
clayborg updated this revision to Diff 97346.May 1 2017, 3:10 PM

Fixed llvm-dwarfdump.cpp to use std::all_of.

This revision was automatically updated to reflect the committed changes.