Page MenuHomePhabricator

[llvm-objdump] Add warning messages if disassembly + source for problematic inputs

Authored by mmpozulp on May 26 2019, 1:20 AM.

Event Timeline

mmpozulp created this revision.May 26 2019, 1:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2019, 1:20 AM
grimar accepted this revision.May 27 2019, 3:25 AM

Generally looks good to me (with 2 nits), please wait for other opinions too.


I'd simplify:

if (MissingSources.insert(LineInfo.FileName).second)
  warn("failed to find source " + LineInfo.FileName);

You do not need Twine( here.

This revision is now accepted and ready to land.May 27 2019, 3:25 AM
mmpozulp marked an inline comment as done.May 29 2019, 4:12 PM
mmpozulp added inline comments.

It appears that I have created a problem by adding the new warn(Twine) function. If I delete Twine( here, I get

llvm-objdump.cpp:594:61: error: call of overloaded 'warn(const char [48])' is ambiguous
       warn("failed to parse debug info which may be missing");
llvm-objdump.cpp:375:6: note: candidate: 'void llvm::warn(llvm::StringRef)'
 void warn(StringRef Message) {
llvm-objdump.cpp:380:6: note: candidate: 'void llvm::warn(llvm::Twine)'
 void warn(Twine Message) {

I needed warn(Twine) for calls like warn("a message " + VariableName) and warn(formatv("a message {0}", VariableName)) which create Twines.

Is there a way to change the warn() API to resolve the ambiguity? That way the compiler would not have to force users to pick between warn(Twine) and warn(StringRef) with a tedious Twine() or StringRef() constructor call.

mmpozulp updated this revision to Diff 202077.May 29 2019, 4:20 PM

Incorporate feedback from @grimar to simplify an if-statement.

grimar added inline comments.May 30 2019, 2:05 AM

I think we just need to change our void warn(StringRef Message) to void warn(const Twine &Message).

(note: Twine.h says: "Twines should only be used accepted as const references in arguments, when an API wishes to accept possibly-concatenated strings.")

jhenderson added inline comments.Jun 3 2019, 8:15 AM

I'm concerned that there might be an interleaving issue here, since stderr is normally unbuffered. This could make the test flaky, since the output ordering is important. Perhaps it would be wise to redirect stderr to a file, and check the file contents in a separate FileCheck run (using --input-file)?


I'd add a check for the file name in the message here. This can be done by the following:

# RUN: ... | FileCheck %s ... -DFILE=%t2.o
# WARN: ... in [[FILE]]

where FILE is an arbitrary variable name.


These new comments (and the two above) are missing trailing full stops.


Delete this extra blank line.


I see this was there before, but "<invalid>" seems like a magic string. Is there a constant it is stored in to avoid having "<invalid>" explicitly mentioned in multiple places?


Include the file name in this error message somewhere.

mmpozulp updated this revision to Diff 203944.Jun 10 2019, 5:29 PM

Incorporate excellent feedback from @grimar and @jhenderson. Thanks!

I have probably no more comments except 2 nits.


Variable names should start from the upper case (LLVM coding style).


I guess you need to keep errs().flush().

grimar added inline comments.Jun 11 2019, 1:27 AM

I am not sure though. It seems should be better for debugging to keep it, i.e. I think this flush
allows to see a warning in the console output as soon it appears, what can probably be useful sometimes.
But I am not sure it is an important bit.

At the same time it seems that in error(Twine Message) above it is used before exit(1) to
flush the messages before application termination. But it is not used in void report_error(StringRef File, Twine Message) below.
It can be a bug probably.

jhenderson added inline comments.Jun 11 2019, 3:08 AM

Just to add, the 'k' really isn't a useful part of the name. We don't use that style of variable naming.

Also, I don't think you need to define them in the header. You just need to add a declaration with extern.


Hmm... Not sure about this one. I think we need a different message in that case, since '<invalid>' isn't actually a file.


Unnecessary extra blank line

Hi @mmpozulp, just wondering if you are going to come back to this some time soon, or if you need any further assistance with this issue?

Hey @jhenderson, thanks for asking! I will have some questions for you once I look at the code again. I return from vacation on Wednesday. :)

mmpozulp updated this revision to Diff 207213.Sat, Jun 29, 6:09 PM

Rebase on master

Taking off on a camping trip for the week. If anyone wants these features upstream'd before I get back please feel free to pick up where I left off, otherwise I'll be back at next week. :)

mmpozulp updated this revision to Diff 208878.Tue, Jul 9, 9:28 PM

Rebase on master

mmpozulp updated this revision to Diff 208880.Tue, Jul 9, 10:40 PM

Change global variable names, update warning message, delete extra newline.

mmpozulp marked 2 inline comments as done.Tue, Jul 9, 11:01 PM
mmpozulp added inline comments.

Where should I put the definition?

Also, I just realized that these use static constructors, which the LLVM Coding Standards prohibit. I could use the Scott Meyers singleton

static StringRef &DILineInfoBadString() {
  static StringRef S("<invalid>");
  return S;
static StringRef &Addr2LineBadString() {
  static StringRef S("??");
  return S;

but it forces references to change from DILineInfoBadString to DILineInfoBadString() and trips this compile warning for some TUs (e.g. lib/DebugInfo/DWARF/DWARFVerifier.cpp)

warning: ‘llvm::StringRef& llvm::Addr2LineBadString()’ defined but not used [-Wunused-function]

when I build with gcc 9.1.0. The warnings should go away when I move the definition out of the header.


I think failed to parse debug info is a bit better than failed to parse debug info from file "<invalid>". Here's what the user sees

[mike@gannon build]$ ./bin/llvm-objdump --source test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o

test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o:	file format ELF64-x86-64

Disassembly of section .text:

0000000000000000 foo:
./bin/llvm-objdump: warning: failed to parse debug info
       0: 55                           	pushq	%rbp
       1: 48 89 e5                     	movq	%rsp, %rbp
       4: 8b 05 00 00 00 00            	movl	(%rip), %eax
       a: 5d                           	popq	%rbp
       b: c3                           	retq
       c: 90                           	nop
       d: 90                           	nop
       e: 90                           	nop
       f: 90                           	nop

0000000000000010 main:
      10: 55                           	pushq	%rbp
      11: 48 89 e5                     	movq	%rsp, %rbp
      14: 53                           	pushq	%rbx
      15: 48 83 ec 18                  	subq	$24, %rsp
      19: c7 45 f4 00 00 00 00         	movl	$0, -12(%rbp)
      20: 48 c7 45 e8 00 00 00 00      	movq	$0, -24(%rbp)
      28: 8b 1d 00 00 00 00            	movl	(%rip), %ebx
      2e: e8 cd ff ff ff               	callq	-51 <foo>
      33: 01 d8                        	addl	%ebx, %eax
      35: 48 83 c4 18                  	addq	$24, %rsp
      39: 5b                           	popq	%rbx
      3a: 5d                           	popq	%rbp
      3b: c3                           	retq
jhenderson added inline comments.Wed, Jul 10, 3:58 AM

Perhaps they could be static members of DILineInfo. To avoid the static constructor issue, they could also be const char * const variables, since they can't change.


Nit: put a space after >2, here and elsewhere.


They'll only see this if standard output and standard error are printed to the same stream. I think you need something indicating the file in most cases, where possible.

Thinking about your previous version (reporting "invalid"), that sounds like something's wrong. Why isn't it able to print the object file name here? How could it be in an invalid state in that context? Sure, the debug info in the object file might be invalid, but that doesn't affect the file name of the object.

mmpozulp updated this revision to Diff 209122.Wed, Jul 10, 10:23 PM

Incorporate feedback from @jhenderson

mmpozulp marked an inline comment as done.Wed, Jul 10, 10:35 PM
mmpozulp added inline comments.

Why isn't it able to print the object file name here?

In this test, we "failed to parse debug info" because it was stripped from the object file.

[mike@gannon build]$ ./bin/llvm-dwarfdump test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o
test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp2.o:	file format ELF64-x86-64

.debug_info contents:

Thus, LineInfo.FileName == "<invalid>" for every instruction. Here is the output from dwarfdump on the unstripped file (that is, before running llvm-objcopy --strip-debug). The filename /home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c is present.

[mike@gannon build]$ ./bin/llvm-dwarfdump test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp.o
test/tools/llvm-objdump/X86/Output/source-interleave-no-debug-info.test.tmp.o:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000089 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000008d)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 4.0.0")
              DW_AT_language	(DW_LANG_C99)
              DW_AT_name	("source-interleave-x86_64.c")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs")
              DW_AT_GNU_pubnames	(true)
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x000000000000003c)

0x0000002a:   DW_TAG_variable
                DW_AT_name	("a")
                DW_AT_type	(0x0000003f "int")
                DW_AT_external	(true)
                DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                DW_AT_decl_line	(1)
                DW_AT_location	(DW_OP_addr 0x0)

0x0000003f:   DW_TAG_base_type
                DW_AT_name	("int")
                DW_AT_encoding	(DW_ATE_signed)
                DW_AT_byte_size	(0x04)

0x00000046:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000000)
                DW_AT_high_pc	(0x000000000000000c)
                DW_AT_frame_base	(DW_OP_reg6 RBP)
                DW_AT_name	("foo")
                DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                DW_AT_decl_line	(2)
                DW_AT_type	(0x0000003f "int")
                DW_AT_external	(true)

0x0000005f:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000010)
                DW_AT_high_pc	(0x000000000000003c)
                DW_AT_frame_base	(DW_OP_reg6 RBP)
                DW_AT_name	("main")
                DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                DW_AT_decl_line	(6)
                DW_AT_type	(0x0000003f "int")
                DW_AT_external	(true)

0x00000078:     DW_TAG_variable
                  DW_AT_location	(DW_OP_fbreg -24)
                  DW_AT_name	("b")
                  DW_AT_decl_file	("/home/mike/ws/llvm/llvm-project/llvm/test/tools/llvm-objdump/X86/Inputs/source-interleave-x86_64.c")
                  DW_AT_decl_line	(7)
                  DW_AT_type	(0x00000087 "int*")

0x00000086:     NULL

0x00000087:   DW_TAG_pointer_type
                DW_AT_type	(0x0000003f "int")

0x0000008c:   NULL

How do we recover the filename after stripping?

jhenderson added inline comments.Thu, Jul 11, 1:40 AM

Isn't the name you want to report in this message the object file, not the source? That's the thing that's useful in this context (e.g. it could be "warning: unable to find debug information for object.o"). The problem here is in the object, not the source.

mmpozulp updated this revision to Diff 209419.Fri, Jul 12, 12:43 AM

Emit object file name in warning message. Duh. Thanks, @jhenderson!

Cool, nearly there. Just one more suggestion.


It's probably worth sticking with the "failed to parse..." phrasing in case there was a genuine problem parsing it:
"failed to parse debug information for " + ObjectFilename.

Bonus points if it makes sense to use the infromation in the ExpectedLinkInfo error in this case.

mmpozulp updated this revision to Diff 209760.Sun, Jul 14, 10:33 PM

Use original wording of "failed to parse..."

mmpozulp marked an inline comment as done.Sun, Jul 14, 11:24 PM
mmpozulp added inline comments.

I ran it in lldb and observed that in this case ExpectedLineInfo has no error, so according to the programmer's manual all it can tell us is "success":

If an Expected<T> value is in success mode then the takeError() method will return a success value.

@jhenderson thanks so much for the feedback. Anything else?

jhenderson added inline comments.Mon, Jul 15, 4:23 AM

I'm not quite sure I follow. Above, ExpectedLineInfo is checked to determine if there is an error, but we then throw it away without updating the LineInfo FileName (see lines 614-617). Presumably this code exists precisely because the debug data couldn't be parsed, and the Expected error in this case will contain the reason why. There may be cases where the line info is successfully fetched, but with no file information, which we should warn about without any additional context, as happens now. Otherwise, it would be good to provide the extra context. Something like the following:

std::string Msg;
if (!ExpectedLineInfo) {
  // pseudo code
  Msg = ExpectedLineInfo.error.message();
  LineInfo = *ExpectedLineInfo;

if (LineInfo.FileName == DILineInfo::BadString) {
  if(!WarnedNoDebugInfo) {
    std::string Warning = "failed to parse debug information for " + ObjectFilename;
    if (!Msg.empty())
      Warning += ": " + Msg;