This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Display locations of variables alongside disassembly
ClosedPublic

Authored by ostannard on Nov 26 2019, 8:44 AM.

Details

Summary

This adds the -debug-vars option to llvm-objdump, which prints locations (currently just registers, memory and more complex expressions will be added later) of source-level variables alongside the disassembly based on DWARF info. A vertical line is printed for each live-range, with a label at the top giving the variable name and location, and the position and length of the line indicating the program counter range in which it is valid.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jhenderson added inline comments.Dec 5 2019, 3:30 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
577

Function names should start with a lower-case letter, i.e. prettyPrintDWARFOps. Applies elsewhere too.

671

Coulmns -> Columns

I'd also rephrase slightly: "The columns we are currently drawing."

686

Why are you throwing away the error and not report an error or warning?

Add a comment if there's a valid reason not to report a message.

690

Please don't use auto here. I don't know what the type is. Also, is this able to be a const &?

692–694

File a bug for this FIXME against the getLocations method.

714

Too much auto. Should it be const &?

735–736

What's the / 2 and * 2 below about?

752

Too many blank lines. Have you clang-formatted?

756

I'd pull ~0U into a constant and/or use std::numeric_limits::max().

771

Too much auto. Should it be const &?

786

Canonical LLVM style is to precalculate the end value rather than on every iteration. Also applies below.

1075

It may be worth providing a "style" argument to the option that is able to specify a drawing style i.e. "ascii" or "box" or something like that. The formatting strings could then be configured based on that.

1120

Delete extra blank line.

1660

Too much auto. What is the type here?

aprantl added inline comments.Dec 5 2019, 9:19 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
562

My intent is to have a short comment here that explains to a future reader of the code what this class is used for. Using doxygen syntax /// allows IDEs to format them more nicely and show them in code outlines / indexes etc.

ostannard updated this revision to Diff 232855.Dec 9 2019, 7:34 AM
ostannard marked 20 inline comments as done.

Made the suggested changes in the code. However, after rebasing the DWARF5 tests are failing, as the location lists are being parsed incorrectly. I'll update the tests once I've figured out that's changed there.

eugenis added inline comments.Dec 9 2019, 2:34 PM
llvm/tools/llvm-objdump/llvm-objdump.cpp
627

To render DW_AT_frame_base for inlined subroutines, you'd need to trace up to the top-level one, or remember it while constructing LiveVariable(s).

Remove unused parameters?

MaskRay added inline comments.Dec 9 2019, 11:01 PM
llvm/tools/llvm-objdump/llvm-objdump.cpp
1880

The destructor of formatted_raw_ostream is very slow.

We cannot declare formatted_raw_ostream in the loop body. The formatted_raw_ostream ctor resets the underlying stream (outs()) to unbuffered. This may introduce a 4x slowdown. I once had caused a slowdown here, which was fixed by D64969.

ostannard updated this revision to Diff 233388.Dec 11 2019, 9:21 AM
ostannard marked 3 inline comments as done.
  • Made suggested changes to tests. I've enabled strict whitespace checking just for the one test in which we check compatibility with other options which affect the layout of the output, because the disassembly uses tabs in a way which doesn't always display well in text editors.
  • getLocations now works correctly with -ffunction-sections, so removed the workaround
  • Removed unused parameters related to DW_AT_frame_base
  • Get endianness form the DWARFunit, no need to pass it around separately
  • Don't create/destroy formatted_raw_ostream as often

Minor request, but could you please remember to mark my comments as "done" once you've uploaded a patch to address them. It helps me with my follow-ups.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1106 ↗(On Diff #233388)

A comment explaining why we just throw away the error here woul be good.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
58

Ping? You have addressed this in this test?

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4.s
4–5

-no-show-raw-isn -> --no-show-raw-insn etc

15

You might want to consider another test case for --debug-vars-indent with the value equal to what it would be without the switch, testing against the same check-prefix. That will show what the default value of --debug-vars-indent is more clearly.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf5-sections.s
2

--debug-vars

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf5.s
2

--debug-vars

llvm/test/tools/llvm-objdump/PowerPC/debug-vars.s
2

--debug-vars

llvm/test/tools/llvm-symbolizer/frame-loclistx.s
1 ↗(On Diff #233388)

This test seems unrelated?

ostannard marked 30 inline comments as done.Dec 12 2019, 2:22 AM

Sorry about that, I'd been marking comments as "done" as I went, but forgot to actually click "submit". I also accidentally included D70756 in the patch.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
1106 ↗(On Diff #233388)

This change is from D70756, included in this patch by mistake.

llvm/lib/Support/FormattedStream.cpp
40

I think we always use UTF-8 (or 7-bit ASCII) internally, so no need to guard this.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
58

I'm using sed to replace this string in debug-vars-dwarf4.s, where I'm testing the interaction with source and line number interleaving. I didn't think it would be worth testing that in every file, because the interactions are all in the formatting of the output, which is the same in each file.

llvm/test/tools/llvm-symbolizer/frame-loclistx.s
1 ↗(On Diff #233388)

Another part of D70756 I included in the patch by mistake.

llvm/tools/llvm-objdump/llvm-objdump.cpp
692–694

This appears to have been fixed since I wrote this, so I've removed the workaround, and no ticket is needed.

735–736

I'll add a comment to explain.

ostannard updated this revision to Diff 233542.Dec 12 2019, 2:22 AM
ostannard marked 4 inline comments as done.
  • Remove D70756 from patch
  • Use -- style options in test comments

I think my only outstanding comment is about adding a style option to enable this to be used on terminals without unicode support (see also Michael Spencer's recent response on the RFC).

I haven't really looked at the details of the dwarf parsing or disassembly however. Others with more experience in those areas might be better placed to do so.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
58

Right, so just to clear things up for my own sanity: --debug-vars is not directly related to source interleaving in any way, right? It uses only the debug information in the object file, right?

aykevl added a subscriber: aykevl.Jan 8 2020, 5:39 AM
jdoerfert resigned from this revision.Jan 8 2020, 8:23 AM

GNU objdump will have this:

https://sourceware.org/ml/binutils/2020-01/msg00072.html RFC: Adding visualization of jumps to objdump's disassembly output

c6:	|  |     \----------> be 00 00 00 00       	mov    $0x0,%esi
cb:	|  |           /----> 48 8b 3d 00 00 00 00 	mov    0x0(%rip),%rdi        # d2 <main+0xd2>
d2:	|  |           |      31 c0                	xor    %eax,%eax
d4:	|  |           |  /-- e8 00 00 00 00       	callq  d9 <main+0xd9>
d9:	|  |           |  \-> bf 02 00 00 00       	mov    $0x2,%edi
de:	|  +-----------|----- e8 00 00 00 00       	callq  e3 <main+0xe3>
e3:	|  \-----------|----> 48 89 da             	mov    %rbx,%rdx
e6:	|              |      be 00 00 00 00       	mov    $0x0,%esi
eb:	|              \----- eb de                	jmp    cb <main+0xcb>
ed:	\-------------------> 48 8b 16             	mov    (%rsi),%rdx

(I'll try studying D70720 in the next few days.)

ostannard updated this revision to Diff 239817.Jan 23 2020, 2:17 AM
ostannard marked 2 inline comments as done.

Added ASCII output mode. It turns out that the unicode line drawing characters do work correctly on windows (at least windows 10, I don't have easy access to anything older to test with), I've left unicode as the default.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
58

Added a note to the comment at the top of the file.

jhenderson added inline comments.Jan 24 2020, 1:04 AM
llvm/docs/CommandGuide/llvm-objdump.rst
132–135

This isn't quite the interface I'd use. I'd actually recommend making ASCII the default, since it will work in all situations, rather than requiring Windows users to specify an extra switch. If there's a good reason not to, I'd change this to --debug-vars-ascii, so that it's the normal practice of "use a switch to enable a behaviour" i.e. --debug-vars-unicode is all that is needed (and there's no need to do the slightly weird syntax of --debug-vars-unicode=false).

A third option (possibly in conjunction with the third) would be to change the switch name to --debug-vars-style, and allow it to take an enum of ASCII/Unicode (or more specifically UTF-8 probably). However, I don't think it's likely that we'll ever need more than two modes, so this is probably overkill.

My personal preference would be the first of the option, as I am primarily Windows developer, as are all of the developers I work with/make tools for.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
58

I'm not seeing any changes in the comments since the previous version?

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4.s
91

Will this ever be "\\"? The SRC_COMPDIR sed pattern explicitly uses '/' now.

llvm/tools/llvm-objdump/llvm-objdump.cpp
611

This seems unrelated?

839

Have you considered something like the following approach instead of this function (which you have to call every time you need to add a single character)?

  1. Keep the enum, but assign the enum values all specific values from 0.
  2. Create two constant char arrays, one with the Unicode version, and the other with the ASCII version of the characters.
  3. Set some variable early (e.g. DbgVariableChars) to the value of one of the arrays, dependent on the style being used.
  4. Use the chars by referencing the DbgVariableChars array, indexed by the enum, i.e. something like DbgVariableChars[RangeStart].

I'm not sure if it's simpler in practice or not, but it is at least worth considering.

848

Is this code tested anywhere? I couldn't find it when searching.

ostannard marked 6 inline comments as done.Jan 30 2020, 3:28 AM
ostannard added inline comments.
llvm/docs/CommandGuide/llvm-objdump.rst
132–135

I've tested this on windows 10, and the unicode output displays correctly (in both cmd.exe and powershell). Is there some other version or configuration of windows in which this won't work? If ASCII is only needed for older versions of windows, I'd like to keep the default as unicode, because I find it much more legible than the ASCII version. There's also the option of making ASCII the default for windows and unicode elsewhere, but I'd rather avoid that if possible.

I do like the idea of switching the spelling to --debug-vars-ascii though.

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
58

Oops, missed a git-add.

llvm/tools/llvm-objdump/llvm-objdump.cpp
611

What do you mean? There are DWARF opcodes not handled by this function yet, we have to do something if we hit one, and printing the opcode will make adding the missing cases easier.

839

The unicode characters are each multiple bytes, so the indexes wouldn't match up between the two arrays. We could have to arrays of string literals, with one character in each literal, but I don't think that has any significant advantage over the switch statement.

848

This character only gets used when multiple live ranges start at the same assembly instruction. The example in M2 (more complex than the current tests) uses it twice - the first is a compiler bug, so I don't want to include it in a test, and the second requires printing multiple locations for parts of structs. I expect this to be tested properly when adding the struct location printing.

ostannard updated this revision to Diff 241393.Jan 30 2020, 3:29 AM
  • Rebase
  • Change option spelling to --debug-vars-ascii (keep default as unicode)

Would it be feasible to move the bulk of the implementation into lib/DebugInfo or some other more accessible place? I could imagine that this would be a neat feature for LLDB's disassemble command, for example.

ostannard updated this revision to Diff 242603.Feb 5 2020, 7:03 AM
  • Move DWARF expression printer into lib/DebugInfo/DWARF. I've left the LiveVariablePrinter class in llvm-objdump for now, because it's more tied into the rest of the disassembly loop (e.g. it needs to print after source line numbers). I'm not really sure what LLDB's requirements are here, but if we're going to move things to a library it might be worth moving more of the disassembly loop at the same time.
  • Move DWARF expression printer into lib/DebugInfo/DWARF. I've left the LiveVariablePrinter class in llvm-objdump for now, because it's more tied into the rest of the disassembly loop (e.g. it needs to print after source line numbers). I'm not really sure what LLDB's requirements are here, but if we're going to move things to a library it might be worth moving more of the disassembly loop at the same time.

Agreed, this doesn't need to happen all at once.

llvm/tools/llvm-objdump/llvm-objdump.cpp
666

Minor nit: It's encouraged to always use Doxygen (///) comments even in local definitions, because many IDEs can understand them and then dispaly online help and format the comments nicely.

ostannard updated this revision to Diff 242848.Feb 6 2020, 3:06 AM
  • Doxygen style comments
jhenderson added inline comments.Feb 7 2020, 3:12 AM
llvm/docs/CommandGuide/llvm-objdump.rst
130

I just tried this out locally, and on an already-wide window, a fairly basic expression wrapped with the default of 50 characters. Any particular reason you've used 50?

132–135

Thanks. Could you remove the "=<true|false>" bit from the description (you don't include it for --debug-vars after all...). Also replace "If true" with "If specified" and ". If false, use" with ", otherwise, use" and delete the "Defaults to false" sentence.

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
122

skip_bytes -> skipBytes to match LLVM coding standards.

Also, use uint64_t instead of uint32_t, sine that's what the type of Op.EndOffset is.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
343–344 ↗(On Diff #242848)

Would "in which case it will be implicitly dereferenced" make more sense?

Also I'd probably just get rid of the "which does not" bit.

405 ↗(On Diff #242848)

Any particular reason you haven't just inlined the whole function here?

llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4.s
7

Typo: diaplayed -> displayed

ostannard updated this revision to Diff 244371.Feb 13 2020, 2:40 AM
ostannard marked 6 inline comments as done.
  • Remove skip_bytes function, not needed until later patch
  • Reword comment in DWARFExpression
  • Reduce default indent to 40 columns
llvm/docs/CommandGuide/llvm-objdump.rst
130

I picked that to be wide enough to avoid colliding with the disassembled instructions, except in cases where they include long symbol names. However, maybe it is a bit too high. I'll reduce it to 40, which still avoids collisions with a few samples of (ARM and AArch64) code I've tried, and avoids wrapping on an 80-column terminal with the --no-show-raw-insn option. I'd recommend using this with --no-show-raw-insn, especially on narrow terminals, because the disassembly alone ends at around column 65-70 without it, not leaving enough room for the variable display.

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
122

I've removed this function, it's not needed until a later patch.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
405 ↗(On Diff #242848)

The outlined version will be used by DW_OP_fbreg and DW_OP_entry_value, which need to print a sub-range of the expression. I could inline it here, but would have to revert that soon.

jhenderson accepted this revision.Feb 14 2020, 2:54 AM

Looks good from my point of view, but probably warrants a second opinion.

This revision is now accepted and ready to land.Feb 14 2020, 2:54 AM
ostannard requested review of this revision.Feb 19 2020, 8:40 AM

Setting back to "Request Review" because this is still waiting for a second reviewer.

MaskRay removed a subscriber: MaskRay.

Some nits and style/comment/readability requests. Generally pretty happy with how it looks though. And thanks James for the numerous rounds of reviews already! :)

llvm/docs/CommandGuide/llvm-objdump.rst
132–135

FWIW I'd have probably gone with --debug-vars=<encoding> ?

echristo added inline comments.Feb 19 2020, 11:49 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
149

printCompact maybe?

llvm/include/llvm/Support/FormattedStream.h
110

Comment what ComputePosition is doing here?

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
356 ↗(On Diff #244371)

If you wouldn't mind commenting each of the cases with what's going on in them it'd be appreciated.

llvm/tools/llvm-objdump/llvm-objdump.cpp
597–599

Since this is going to get constructed with subsequent boolean arguments can we move them to enums/something? int, true, false, true isn't real obvious what it's constructing :)

714

Not a huge fan of boolean arguments. Is there a different factoring here that'll help?

jhenderson added inline comments.Feb 20 2020, 1:06 AM
llvm/docs/CommandGuide/llvm-objdump.rst
132–135

FWIW I'd have probably gone with --debug-vars=<encoding> ?

I like this idea. Saves the need for two switches, and makes things nice and explicit about what encoding to use.

Needs a rebase after D74507. --debug-vars seems to work with an ET_REL, but does it work with an ET_DYN/ET_DYN?

MaskRay added inline comments.Feb 23 2020, 5:04 PM
llvm/tools/llvm-objdump/llvm-objdump.cpp
563

//

Internal classes should not use Doxygen comments.

659

Ideally unsigned TabStop = NoShowRawInsn ? 16 : 40 in printInst should be extracted as a function.

Is --no-leading-addr taken into account?

706

addCompileUnit

Looks like this function can be merged with addFunction().

714

No need to use /// . These classes are internal.

840

Change OS << " " to print one space unconditionally can simplify code above.

865

Change OS << " " to OS << ' ' unconditionally can avoid " " above.

1829

Does this harm performance when --debug-vars is not used?

I once caused a regression (fixed by D64969).

ostannard marked 17 inline comments as done.Feb 26 2020, 10:49 AM

does it work with an ET_DYN/ET_DYN?

Not currently, I'll investigate and fix that as a separate patch.

llvm/tools/llvm-objdump/llvm-objdump.cpp
563

@aprantl previously asked for these to be changed to doxygen. I don't see anything explicit about this in the coding standard, but apparently IDEs can make use of these.

597–599

This is only ever constructed with these all set to false, so I can remove them from the constructor, and use the member names explicitly when modifying them.

659

--no-leading-addr isn't taken into account, because it doesn't affect the column the instruction starts at.

706

I don't see how, given that it calls addFunction in two places.

714

It's not easy to factor out, because CheckedVarIdxs is defined in the first half then used in the second. However, I can rename it to better describe what it does.

1829

If I've understood that discussion properly, the performance problem was caused by creating a formatted_raw_ostream for each instruction, because it flushes the underlying stream in the constructor. Here, i'm creating one per disassembled function, so the overhead will be much lower.

ostannard marked 5 inline comments as done.
  • Rebase
  • Style fixes
jhenderson added inline comments.Feb 27 2020, 2:20 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
342

I don't have a strong opinion on this, but perhaps DVDisabled etc would be more in keeping with LLVM style?

607–609

Could you avoid the constructor entirely and initialise the values inline?

ostannard updated this revision to Diff 246902.Feb 27 2020, 2:57 AM
ostannard marked 2 inline comments as done.
  • DV_Disabled -> DVDisabled
  • Remove unneeded constructor

Latest changes look okay to me, but please wait for @echristo and @MaskRay to be happy.

MaskRay added inline comments.Feb 28 2020, 1:25 PM
llvm/include/llvm/Support/FormattedStream.h
108–112

The function name self documents. The comment may be unnecessary.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
351 ↗(On Diff #246902)

There is certainly prior art. But if changing all SmallString<20> to SmallString<16> may decrease the code size?

llvm/lib/Support/FormattedStream.cpp
29–44

Ptr < End will be safer in -DLLVM_ENABLE_ASSERTIONS=off builds in case of a malformed UTF-8 code sequence.

33

The following might be slight simpler.

auto C = static_cast<unsigned char>(*Ptr);
if (C < 0x80) MultiByte = false;
else if (C >= 0b11110000) Ptr += 3;
else if (C >= 0b11100000) Ptr += 2;
else Ptr++;
45–51

This is incorrect due to East Asian Width.... http://www.unicode.org/reports/tr11/tr11-36.html

% cat a.cc
int main() {
  int 喵🙂 = 0;
  int 喵喵🙂 = 15;
  int 喵🙂喵🙂 = 24;
  喵🙂 = 28;
  喵喵🙂 = 32;
}
% clang -c a.cc
% llvm-objdump -S --no-show-raw-insn --debug-vars a.o

a.o:    file format elf64-x86-64


Disassembly of section .text:

0000000000000000 main:
; int main() {                                          ┠─ 喵🙂 = <unknown op DW_OP_fbreg (145)>
                                                         ┠─ 喵喵🙂 = <unknown op DW_OP_fbreg (145)> 
                                                          ┠─ 喵🙂喵🙂 = <unknown op DW_OP_fbreg (145)>
       0:       pushq   %rbp                                   
       1:       movq    %rsp, %rbp                             
       4:       xorl    %eax, %eax                             
;   int 喵🙂 = 0;                                              
       6:       movl    $0, -4(%rbp)                           
;   int 喵喵🙂 = 15;                                            
       d:       movl    $15, -8(%rbp)                          
;   int 喵🙂喵🙂 = 24;                                           
      14:       movl    $24, -12(%rbp)                         
;   喵🙂 = 28;                                                 
      1b:       movl    $28, -4(%rbp)                          
;   喵喵🙂 = 32;                                                
      22:       movl    $32, -8(%rbp)                          
; }                                                          
      29:       popq    %rbp                                   
      2a:       retq                                      
ostannard marked 6 inline comments as done.Mar 2 2020, 3:44 AM
ostannard added inline comments.
llvm/lib/Support/FormattedStream.cpp
45–51

There's some code in the Support library which does this, which also simplifies this function a bit. It handles east asian characters correctly (at least the ones in your example), but doesn't get the correct widths for emojis. Clang seems to handle emojis by printing a hex value which it knows the length of:

$ /work/llvm/build/bin/clang -fsyntax-only -c width.c 
width.c:1:15: error: use of undeclared identifier 'a'
int 喵<U+1F642> = a;
                  ^
1 error generated.

Fixing that is going to be a lot of work, and isn't related to this patch, so I'll just make it work for characters supported by the existing unicode support library for now.

ostannard updated this revision to Diff 247588.Mar 2 2020, 3:46 AM
ostannard marked an inline comment as done.
  • Align columns correctly after east asian wide characters (though not emojis)
  • Other style changes
jhenderson added inline comments.Mar 4 2020, 1:30 AM
llvm/test/tools/llvm-objdump/ARM/debug-vars-wide-chars.s
5–7 ↗(On Diff #247588)

'##' for comments.

10 ↗(On Diff #247588)

I think it will, but will this work on Windows (non-UTF8 environment)?

ostannard marked 3 inline comments as done.Mar 4 2020, 5:13 AM
ostannard added inline comments.
llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4.s
91

The SRC_COMPDIR sed script isn't used for the LINE-NUMS test (if it was we'd then need to regex out the whole path), so I've put this regex back.

llvm/test/tools/llvm-objdump/ARM/debug-vars-wide-chars.s
10 ↗(On Diff #247588)

It turns out that this doesn't work, at least in my windows 10 VM, so I'll disable this test for windows.

ostannard updated this revision to Diff 248152.Mar 4 2020, 5:14 AM
  • Disable wide-chars test on windows
  • Fix the line-numbers test, which was broken on windows (different dir separator)
jhenderson added inline comments.Mar 11 2020, 1:57 AM
llvm/test/tools/llvm-objdump/ARM/debug-vars-wide-chars.s
8 ↗(On Diff #248152)

Nit: windows -> Windows

It might be wise to change "han" -> "'喵'" as I don't know what the "han" character is.

12 ↗(On Diff #248152)

Nit: east asian -> East Asian

(I think)

ostannard marked 2 inline comments as done.Mar 11 2020, 3:11 AM
ostannard added inline comments.
llvm/test/tools/llvm-objdump/ARM/debug-vars-wide-chars.s
8 ↗(On Diff #248152)

I was using "han character" meaning a Chinese, Japanese or Korean character, but it seems like "Chinese character" is a more common term, so I'll switch to that. I'd rather not use the character itself in a comment explaining that it might not render correctly on some systems!

ostannard updated this revision to Diff 249572.Mar 11 2020, 3:12 AM
jhenderson accepted this revision.Mar 13 2020, 3:06 AM

Latest changes LGTM, but please wait for others to be happy too.

This revision is now accepted and ready to land.Mar 13 2020, 3:06 AM
echristo accepted this revision.Mar 13 2020, 8:58 AM

Couple of random comments, but feel free to fix and commit. I think anything else can be handled post-review.

Thanks!

-eric

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
358 ↗(On Diff #249572)

You assume MRI isn't null so probably just make it a reference.

408 ↗(On Diff #249572)

Since MRI is going to be a reference downstream probably want to make it a reference here.

MaskRay added inline comments.Mar 13 2020, 9:18 AM
llvm/test/tools/llvm-objdump/ARM/debug-vars-dwarf4-sections.s
14

After D75713, 00000000 <foo>:

You can now delete Disassembly of section .text.foo: and CHECK-EMPTY:

MaskRay accepted this revision.EditedMar 13 2020, 9:24 AM

In the git description, it may be worth mentioning that ET_EXEC/ET_DYN are not supported yet.

llvm/lib/Support/FormattedStream.cpp
55

Delete excess braces.

ostannard marked 3 inline comments as done.Mar 16 2020, 3:39 AM
This revision was automatically updated to reflect the committed changes.

I forgot to restrict the new tests to only run when the relevant disassembler is built, should be fixed by rG161f70eae6cd and rG2878c6693875.

The tests are also failing on some windows bots with unicode errors, so I've disabled them on windows with rGf4cb9c919e28. These tests were passing locally in a Windows 10 VM, I'm not sure what is different about the buildbots.

thakis added a subscriber: thakis.Mar 16 2020, 7:52 AM

I believe this breaks llvm/test/MC/ARM/lsl-zero.s on Windows.

The failure started here: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/14810 This build has many changes, but this is the only arm-related one as far as I can tell.

It's still happening on trunk: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/14816/steps/stage%201%20check/logs/stdio

Please take a look, and if it takes a while to investigate, please revert while you look.

My reply by email doesn't seem to have made it into phab, so repeating here (sorry for spamming everyone who did get the email):

I'm not sure how this patch could have caused that failure. This isn't really an arm-related patch, that's just the architecture I happened to write most of the tests using.

Do you have access to that buildbot to get the actual output from the llvm-mc command?

Reverted (together with a surprisingly long list of related changes) for now in 9e48422035813b79088dfa4637e31cd836332a54.

When you reland, please leave some time between commits, so that if one should break something, it's clearer which commit is the problem. Most bots cycle time-based, not commit-based, so things that land in quick succession tend to end up in the same build, which makes things confusing.

Sorry, I didn't see that email. The revert did get that bot green, so it was due to this change.

I don't have easy access to that bot, no.

(And sorry, I started the revert seconds before you must've replied, so I didn't see your reply until after I had committed.)

hoyFB added a subscriber: hoyFB.Mar 16 2020, 12:20 PM
hoyFB added inline comments.
llvm/include/llvm/Support/FormattedStream.h
108–112

Hello, looks like this change breaks the clang test: clang/test/Analysis/checker-plugins.c by changing the printing format

from:

example.MyChecker:ExampleOption                   (bool) This is an example checker opt. (default: false)

to:

example.MyChecker:ExampleOption                   (bool) This is an example checker opt. (default:
                                                                             false)

and the test fails at

// CHECK-CHECKER-OPTION-HELP: example.MyChecker:ExampleOption  (bool) This is an
// CHECK-CHECKER-OPTION-HELP-SAME: example checker opt. (default: false)

Can you please take a look? Thanks!

benlangmuir added inline comments.
llvm/lib/Support/FormattedStream.cpp
51

While it is probablye reasonable to expect only well-formed UTF-8 during the top-level call to write, this will fail for a buffered raw_ostream if you write a multi-byte sequence that happens to crosses the buffer end point. raw_ostream's buffering is byte-level, and won't prevent splitting a multibyte character. We saw this assertion failure in practice.

@thakis, thanks for reverting, sorry I wasn't more proactive about that yesterday.

@hoyFB, it looks like I've accidentally fixed a bug where the help text wasn't being wrapped when the stdout isn't a TTY. I don't see anything in the code to suggest that this is intended behaviour, so I'll update the test.

@benlangmuir, thanks for the pointer, it looks like we'll need to save the first few bytes of truncated UTF-8 sequences when this happens so that we can calculate the display width once we have the rest of the bytes.

I'll split the formatted_raw_ostream changes out into a separate patch, and add some unit tests for these edge cases. I've still got no idea what the lsl-zero.s failure is about, but splitting the patch up should at least narrow it down.

See D76291 for that split-put patch.

MaskRay reopened this revision.EditedJul 8 2020, 3:35 PM

I think this is still in a reverted state (9e48422035813b79088dfa4637e31cd836332a54 ). --debug-vars is useful and I'd like to see it come back. Adding @dblaikie for some opinions.

This revision is now accepted and ready to land.Jul 8 2020, 3:35 PM

This was blocked on D76291 for a while, but that is now committed, so I'll re-land this today.

This revision was automatically updated to reflect the committed changes.