This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Add warning messages if disassembly + source for problematic inputs
ClosedPublic

Authored by mmpozulp on May 26 2019, 1:20 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

mmpozulp created this revision.May 26 2019, 1:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2019, 1:20 AM
grimar accepted this revision.May 27 2019, 3:25 AM

Generally looks good to me (with 2 nits), please wait for other opinions too.

llvm/tools/llvm-objdump/llvm-objdump.cpp
560 ↗(On Diff #201437)

I'd simplify:

if (MissingSources.insert(LineInfo.FileName).second)
  warn("failed to find source " + LineInfo.FileName);
596 ↗(On Diff #201437)

You do not need Twine( here.

This revision is now accepted and ready to land.May 27 2019, 3:25 AM
mmpozulp marked an inline comment as done.May 29 2019, 4:12 PM
mmpozulp added inline comments.
llvm/tools/llvm-objdump/llvm-objdump.cpp
596 ↗(On Diff #201437)

It appears that I have created a problem by adding the new warn(Twine) function. If I delete Twine( here, I get

llvm-objdump.cpp:594:61: error: call of overloaded 'warn(const char [48])' is ambiguous
       warn("failed to parse debug info which may be missing");
                                                             ^
llvm-objdump.cpp:375:6: note: candidate: 'void llvm::warn(llvm::StringRef)'
 void warn(StringRef Message) {
      ^~~~
llvm-objdump.cpp:380:6: note: candidate: 'void llvm::warn(llvm::Twine)'
 void warn(Twine Message) {

I needed warn(Twine) for calls like warn("a message " + VariableName) and warn(formatv("a message {0}", VariableName)) which create Twines.

Is there a way to change the warn() API to resolve the ambiguity? That way the compiler would not have to force users to pick between warn(Twine) and warn(StringRef) with a tedious Twine() or StringRef() constructor call.

mmpozulp updated this revision to Diff 202077.May 29 2019, 4:20 PM

Incorporate feedback from @grimar to simplify an if-statement.

grimar added inline comments.May 30 2019, 2:05 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
596 ↗(On Diff #201437)

I think we just need to change our void warn(StringRef Message) to void warn(const Twine &Message).

(note: Twine.h says: "Twines should only be used accepted as const references in arguments, when an API wishes to accept possibly-concatenated strings.")

jhenderson added inline comments.Jun 3 2019, 8:15 AM
llvm/test/tools/llvm-objdump/X86/source-interleave-invalid-source.test
11 ↗(On Diff #202077)

I'm concerned that there might be an interleaving issue here, since stderr is normally unbuffered. This could make the test flaky, since the output ordering is important. Perhaps it would be wise to redirect stderr to a file, and check the file contents in a separate FileCheck run (using --input-file)?

15 ↗(On Diff #202077)

I'd add a check for the file name in the message here. This can be done by the following:

# RUN: ... | FileCheck %s ... -DFILE=%t2.o
...
# WARN: ... in [[FILE]]

where FILE is an arbitrary variable name.

llvm/tools/llvm-objdump/llvm-objdump.cpp
526 ↗(On Diff #202077)

These new comments (and the two above) are missing trailing full stops.

591 ↗(On Diff #202077)

Delete this extra blank line.

592 ↗(On Diff #202077)

I see this was there before, but "<invalid>" seems like a magic string. Is there a constant it is stored in to avoid having "<invalid>" explicitly mentioned in multiple places?

jhenderson added inline comments.Jun 3 2019, 8:15 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
594 ↗(On Diff #202077)

Include the file name in this error message somewhere.

mmpozulp updated this revision to Diff 203944.Jun 10 2019, 5:29 PM

Incorporate excellent feedback from @grimar and @jhenderson. Thanks!

I have probably no more comments except 2 nits.

llvm/include/llvm/DebugInfo/DIContext.h
32 ↗(On Diff #203944)

Variable names should start from the upper case (LLVM coding style).

llvm/tools/llvm-objdump/llvm-objdump.cpp
376 ↗(On Diff #203944)

I guess you need to keep errs().flush().

grimar added inline comments.Jun 11 2019, 1:27 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
376 ↗(On Diff #203944)

I am not sure though. It seems should be better for debugging to keep it, i.e. I think this flush
allows to see a warning in the console output as soon it appears, what can probably be useful sometimes.
But I am not sure it is an important bit.

At the same time it seems that in error(Twine Message) above it is used before exit(1) to
flush the messages before application termination. But it is not used in void report_error(StringRef File, Twine Message) below.
It can be a bug probably.

jhenderson added inline comments.Jun 11 2019, 3:08 AM
llvm/include/llvm/DebugInfo/DIContext.h
32 ↗(On Diff #203944)

Just to add, the 'k' really isn't a useful part of the name. We don't use that style of variable naming.

Also, I don't think you need to define them in the header. You just need to add a declaration with extern.

llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
12 ↗(On Diff #203944)

Hmm... Not sure about this one. I think we need a different message in that case, since '<invalid>' isn't actually a file.

llvm/tools/llvm-objdump/llvm-objdump.cpp
525 ↗(On Diff #203944)

Unnecessary extra blank line

Hi @mmpozulp, just wondering if you are going to come back to this some time soon, or if you need any further assistance with this issue?

Hey @jhenderson, thanks for asking! I will have some questions for you once I look at the code again. I return from vacation on Wednesday. :)

mmpozulp updated this revision to Diff 207213.Jun 29 2019, 6:09 PM

Rebase on master

Taking off on a camping trip for the week. If anyone wants these features upstream'd before I get back please feel free to pick up where I left off, otherwise I'll be back at next week. :)

mmpozulp updated this revision to Diff 208878.Jul 9 2019, 9:28 PM

Rebase on master

mmpozulp updated this revision to Diff 208880.Jul 9 2019, 10:40 PM

Change global variable names, update warning message, delete extra newline.

mmpozulp marked 2 inline comments as done.Jul 9 2019, 11:01 PM
mmpozulp added inline comments.
llvm/include/llvm/DebugInfo/DIContext.h
32 ↗(On Diff #203944)

Where should I put the definition?

Also, I just realized that these use static constructors, which the LLVM Coding Standards prohibit. I could use the Scott Meyers singleton

static StringRef &DILineInfoBadString() {
  static StringRef S("<invalid>");
  return S;
}
static StringRef &Addr2LineBadString() {
  static StringRef S("??");
  return S;
}

but it forces references to change from DILineInfoBadString to DILineInfoBadString() and trips this compile warning for some TUs (e.g. lib/DebugInfo/DWARF/DWARFVerifier.cpp)

warning: ‘llvm::StringRef& llvm::Addr2LineBadString()’ defined but not used [-Wunused-function]

when I build with gcc 9.1.0. The warnings should go away when I move the definition out of the header.

llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
12 ↗(On Diff #203944)

I think failed to parse debug info is a bit better than failed to parse debug info from file "<invalid>". Here's what the user sees

[mike@gannon build]$ ./bin/llvm-objdump --source test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o

test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o:	file format ELF64-x86-64


Disassembly of section .text:

0000000000000000 foo:
./bin/llvm-objdump: warning: failed to parse debug info
       0: 55                           	pushq	%rbp
       1: 48 89 e5                     	movq	%rsp, %rbp
       4: 8b 05 00 00 00 00            	movl	(%rip), %eax
       a: 5d                           	popq	%rbp
       b: c3                           	retq
       c: 90                           	nop
       d: 90                           	nop
       e: 90                           	nop
       f: 90                           	nop

0000000000000010 main:
      10: 55                           	pushq	%rbp
      11: 48 89 e5                     	movq	%rsp, %rbp
      14: 53                           	pushq	%rbx
      15: 48 83 ec 18                  	subq	$24, %rsp
      19: c7 45 f4 00 00 00 00         	movl	$0, -12(%rbp)
      20: 48 c7 45 e8 00 00 00 00      	movq	$0, -24(%rbp)
      28: 8b 1d 00 00 00 00            	movl	(%rip), %ebx
      2e: e8 cd ff ff ff               	callq	-51 <foo>
      33: 01 d8                        	addl	%ebx, %eax
      35: 48 83 c4 18                  	addq	$24, %rsp
      39: 5b                           	popq	%rbx
      3a: 5d                           	popq	%rbp
      3b: c3                           	retq
jhenderson added inline comments.Jul 10 2019, 3:58 AM
llvm/include/llvm/DebugInfo/DIContext.h
32 ↗(On Diff #203944)

Perhaps they could be static members of DILineInfo. To avoid the static constructor issue, they could also be const char * const variables, since they can't change.

llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
12 ↗(On Diff #203944)

They'll only see this if standard output and standard error are printed to the same stream. I think you need something indicating the file in most cases, where possible.

Thinking about your previous version (reporting "invalid"), that sounds like something's wrong. Why isn't it able to print the object file name here? How could it be in an invalid state in that context? Sure, the debug info in the object file might be invalid, but that doesn't affect the file name of the object.

9 ↗(On Diff #208880)

Nit: put a space after >2, here and elsewhere.

mmpozulp updated this revision to Diff 209122.Jul 10 2019, 10:23 PM

Incorporate feedback from @jhenderson

mmpozulp marked an inline comment as done.Jul 10 2019, 10:35 PM
mmpozulp added inline comments.
llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
12 ↗(On Diff #203944)

Why isn't it able to print the object file name here?

In this test, we "failed to parse debug info" because it was stripped from the object file.

[mike@gannon build]$ ./bin/llvm-dwarfdump test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o
test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o:	file format ELF64-x86-64

.debug_info contents:

Thus, LineInfo.FileName == "<invalid>" for every instruction. Here is the output from dwarfdump on the unstripped file (that is, before running llvm-objcopy --strip-debug). The filename /home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c is present.

[mike@gannon build]$ ./bin/llvm-dwarfdump test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp.o
test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp.o:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000089 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000008d)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 4.0.0")
              DW_AT_language	(DW_LANG_C99)
              DW_AT_name	("source-interleave-x86_64.c")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs")
              DW_AT_GNU_pubnames	(true)
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x000000000000003c)

0x0000002a:   DW_TAG_variable
                DW_AT_name	("a")
                DW_AT_type	(0x0000003f "int")
                DW_AT_external	(true)
                DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                DW_AT_decl_line	(1)
                DW_AT_location	(DW_OP_addr 0x0)

0x0000003f:   DW_TAG_base_type
                DW_AT_name	("int")
                DW_AT_encoding	(DW_ATE_signed)
                DW_AT_byte_size	(0x04)

0x00000046:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000000)
                DW_AT_high_pc	(0x000000000000000c)
                DW_AT_frame_base	(DW_OP_reg6 RBP)
                DW_AT_name	("foo")
                DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                DW_AT_decl_line	(2)
                DW_AT_type	(0x0000003f "int")
                DW_AT_external	(true)

0x0000005f:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000010)
                DW_AT_high_pc	(0x000000000000003c)
                DW_AT_frame_base	(DW_OP_reg6 RBP)
                DW_AT_name	("main")
                DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                DW_AT_decl_line	(6)
                DW_AT_type	(0x0000003f "int")
                DW_AT_external	(true)

0x00000078:     DW_TAG_variable
                  DW_AT_location	(DW_OP_fbreg -24)
                  DW_AT_name	("b")
                  DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                  DW_AT_decl_line	(7)
                  DW_AT_type	(0x00000087 "int*")

0x00000086:     NULL

0x00000087:   DW_TAG_pointer_type
                DW_AT_type	(0x0000003f "int")

0x0000008c:   NULL

How do we recover the filename after stripping?

jhenderson added inline comments.Jul 11 2019, 1:40 AM
llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
12 ↗(On Diff #203944)

Isn't the name you want to report in this message the object file, not the source? That's the thing that's useful in this context (e.g. it could be "warning: unable to find debug information for object.o"). The problem here is in the object, not the source.

mmpozulp updated this revision to Diff 209419.Jul 12 2019, 12:43 AM

Emit object file name in warning message. Duh. Thanks, @jhenderson!

Cool, nearly there. Just one more suggestion.

llvm/tools/llvm-objdump/llvm-objdump.cpp
621 ↗(On Diff #209419)

It's probably worth sticking with the "failed to parse..." phrasing in case there was a genuine problem parsing it:
"failed to parse debug information for " + ObjectFilename.

Bonus points if it makes sense to use the infromation in the ExpectedLinkInfo error in this case.

mmpozulp updated this revision to Diff 209760.Jul 14 2019, 10:33 PM

Use original wording of "failed to parse..."

mmpozulp marked an inline comment as done.Jul 14 2019, 11:24 PM
mmpozulp added inline comments.
llvm/tools/llvm-objdump/llvm-objdump.cpp
621 ↗(On Diff #209419)

I ran it in lldb and observed that in this case ExpectedLineInfo has no error, so according to the programmer's manual all it can tell us is "success":

If an Expected<T> value is in success mode then the takeError() method will return a success value.

@jhenderson thanks so much for the feedback. Anything else?

jhenderson added inline comments.Jul 15 2019, 4:23 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
621 ↗(On Diff #209419)

I'm not quite sure I follow. Above, ExpectedLineInfo is checked to determine if there is an error, but we then throw it away without updating the LineInfo FileName (see lines 614-617). Presumably this code exists precisely because the debug data couldn't be parsed, and the Expected error in this case will contain the reason why. There may be cases where the line info is successfully fetched, but with no file information, which we should warn about without any additional context, as happens now. Otherwise, it would be good to provide the extra context. Something like the following:

std::string Msg;
if (!ExpectedLineInfo) {
  // pseudo code
  Msg = ExpectedLineInfo.error.message();
  consumeError(ExpectedLineInfo.takeError());
}
else
  LineInfo = *ExpectedLineInfo;

if (LineInfo.FileName == DILineInfo::BadString) {
  if(!WarnedNoDebugInfo) {
    std::string Warning = "failed to parse debug information for " + ObjectFilename;
    if (!Msg.empty())
      Warning += ": " + Msg;
    warn(Warning);

@jhenderson thanks for clarifying. I'll take a look and see what I can find.

@jhenderson, can you think of an input that would cause Symbolizer->symbolizeCode(*Obj, Address) to return an Error?

I went a few calls deep underneath Symbolizer->symbolizeCode(*Obj, Address) and it looks like LLVMSymbolizer and DWARFContext return a default-constructed DILineInfo when they fail to parse the debug info. Can we cause a failure that returns an Error instead? If so we can check whether the Error is actually an ErrorInfo subclass which is how custom errors are created according to error handling Detailed Description. A better implementation of SourcePrinter::printSourceLine() might call handleErrors() instead of consumeError(). Grepping the code for handleErrors and handleAllErrors provides some example uses. First I need some inputs to cause errors so I can see whether any ErrorInfo subclasses are returned and if they have useful context which we can incorporate into our warning. This is my first time wading into LLVM's error handling so please let me know if you have any reading suggestions. In addition to the doxygen docs for Error.h I also found a section on error handling in the programmer's manual.

@jhenderson, can you think of an input that would cause Symbolizer->symbolizeCode(*Obj, Address) to return an Error?

I went a few calls deep underneath Symbolizer->symbolizeCode(*Obj, Address) and it looks like LLVMSymbolizer and DWARFContext return a default-constructed DILineInfo when they fail to parse the debug info. Can we cause a failure that returns an Error instead? If so we can check whether the Error is actually an ErrorInfo subclass which is how custom errors are created according to error handling Detailed Description. A better implementation of SourcePrinter::printSourceLine() might call handleErrors() instead of consumeError(). Grepping the code for handleErrors and handleAllErrors provides some example uses. First I need some inputs to cause errors so I can see whether any ErrorInfo subclasses are returned and if they have useful context which we can incorporate into our warning. This is my first time wading into LLVM's error handling so please let me know if you have any reading suggestions. In addition to the doxygen docs for Error.h I also found a section on error handling in the programmer's manual.

I dug into the code myself, and there are a number of examples in the call tree where Expected/Error etc might be returned. For example, for a PPC ELF, when "SymbolizableObjectFile::create" is called, it tries to fetch the section name. If it fails to do so (e.g. because the sh_name field of the section header is invalid). There may well be many others too. I'm not sure how straightforward it would be to create this situation, but you could try using yaml2obj to create such an object for a test case (you can see an example of this error message in test/Object/invalid.test, looking for the "BROKEN-SECNAME" test case).

When it comes to handling the errors, you might want to look at logAllUnhandledErrors or toString in Error.h. The former can be used to write the error to a stream (which could be a string stream of some sort), whilst the latter converts the Error into a std::string, which you can then use.

It's worth noting that except for success values, ALL Error instances have an ErrorInfoBase member. The comments in the docs about ErrorInfo being what is used for custom errors is somewhat irrelevant to this case here, and simply means that you should inherit from ErrorInfo and not ErrorInfoBase when implementing custom error types. I'd therefore just assume that any non-success Error instancesdo have useful additional context and always use it.

Hopefully that gives you the information you need to go about my suggestion. I think in my pseudo code, you can just replace the line Msg = ExpectedLineInfo.error.message(); with Msg = toString(ExpectedLineInfo.takeError()). That would also avoid the need to call consumeError().

mmpozulp updated this revision to Diff 212042.Jul 26 2019, 8:05 PM

Use error message from Symbolizer->symbolizeCode() in "failed to parse debug information" warning

Thanks James. I was able to induce an error by modifying SymbolizableObjectFile::create to unconditionally return object_error::unexpected_eof which produced

./bin/llvm-objdump: warning: failed to parse debug information for tmp2.o: The end of the file was unexpectedly encountered
mmpozulp updated this revision to Diff 212064.Jul 27 2019, 10:39 AM

Run git-clang-format

This revision was automatically updated to reflect the committed changes.
mmpozulp reopened this revision.Jul 30 2019, 12:07 AM

Reverted after the buildbots informed me that I caused a test to fail
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/29488

This revision is now accepted and ready to land.Jul 30 2019, 12:07 AM

Reverted after the buildbots informed me that I caused a test to fail
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/29488

Are you sure this build failure was caused by your change and not just a flaky test? The latter seems more likely to me, looking at the test failure. It doesn't look like anything to do with what you've done.

mmpozulp updated this revision to Diff 213229.Aug 3 2019, 10:47 PM

Replacing svn diff with git diff.

This revision was automatically updated to reflect the committed changes.
This revision is now accepted and ready to land.Aug 5 2019, 1:54 AM
mmpozulp updated this revision to Diff 213292.Aug 5 2019, 2:00 AM

Add single quotes so [[FILE]] is now '[[FILE]]'.
@grimar can you see if this fixes the windows failures?

grimar added a comment.Aug 5 2019, 2:03 AM

Add single quotes so [[FILE]] is now '[[FILE]]'.
@grimar can you see if this fixes the windows failures?

Will answer very soon.

MaskRay added a subscriber: MaskRay.Aug 5 2019, 2:09 AM

Failing Tests (2):

LLVM :: tools/llvm-objdump/X86/source-interleave-invalid-source.test
LLVM :: tools/llvm-objdump/X86/source-interleave-missing-source.test

This has been reverted twice now... (I think you can relax the DFILE check instead of doing a full revert)

$ "-DFILE=C:\ps4-buildslave2\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\test\tools\llvm-objdump\X86/Inputs/source-interleave-x86_64.c"

warning: debug info line number 9999 exceeds the number of lines in C:/ps4-buildslave2/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/llvm.src/test/tools/llvm-objdump/X86/Inputs\source-interleave-x86_64.c

This looks like a path separator problem.

mmpozulp updated this revision to Diff 213306.Aug 5 2019, 3:23 AM

Abandon [[FILE]] use in tests.
@grimar please try this on your windows box.

If someone with a windows system wants to try
adding [[FILE]] back in these tests then go for it.
Getting it to work might be complicated. I'm
definitely not going to figure it out without my own
windows machine on which to test and iterate.

grimar added a comment.Aug 5 2019, 3:57 AM

Abandon [[FILE]] use in tests.
@grimar please try this on your windows box.

Tests from test/tools/llvm-objdump now pass for me.

jhenderson added inline comments.Aug 7 2019, 2:17 AM
llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
12 ↗(On Diff #213306)

Rather than {{.+}}, can this be {{.*}}2.o?

mmpozulp updated this revision to Diff 214280.Aug 8 2019, 7:15 PM

Use {{.*}}2.o rather than {{.+}}

Thanks, @grimar and @jhenderson. I feel ready for another re-commit. Think so too?

jhenderson accepted this revision.Aug 9 2019, 1:44 AM

Yup, LGTM.

@mmpozulp, can I confirm that this is now in? It looks like Phabricator hasn't auto-closed this issue.

This revision was automatically updated to reflect the committed changes.

@mmpozulp, can I confirm that this is now in? It looks like Phabricator hasn't auto-closed this issue.

It's in now.

@mmpozulp, can I confirm that this is now in? It looks like Phabricator hasn't auto-closed this issue.

It's in now.

Great, thanks!