This is an archive of the discontinued LLVM Phabricator instance.

Emit line 0 line information for interesting 'orphan' instructions
AbandonedPublic

Authored by probinson on Jan 25 2016, 7:02 PM.

Details

Summary

There will always be instructions with no line information attached.
One example is the LocalArea constants generated by FastISel, but it's
by far not the only one. Compiler generated code will have no line info
and might get inlined. SelectionDAG is full of nodes that won't have
any line information (because they could be shared).

Multiple attempts have been made over time to fix the FastISel issue,
but none have felt satisfactory enough to be commited. As the prolem
is much more general than just FastISel, this patch addresses it
at line table emission time rather than instruction selection time.

It also doesn't try to give accurate line information for these
instructions, but rather marks them with a 0 line information so that
the debugger can safely ignore them.

The implemented heuristic will add a line 0 information to every insn
missing a line information and either:

  • at the beginning of a basic-block
  • after a call
  • first instruction after a label

The first 2 conditions handle the FastISel case, while the later enforces
that any instruction referenced by anything (including debug info) has
line information.

The initial measurments on line table size show an impact between 5% and 15%
(the latter is for a control-flow heavy code eg. ASANified).

There was already an option to use line 0 information for all the instructions
with no debug locations. The commit history for this code show that it was
disabled because it caused issues with GDB and with the line table size. Note
that this code was unconditionaly emitting those line 0 informations and not
using a heuristic to find intersting instructions like this one.

Diff Detail

Event Timeline

friss updated this revision to Diff 45943.Jan 25 2016, 7:02 PM
friss retitled this revision from to Emit line 0 line information for interesting 'orphan' instructions.
friss updated this object.
friss added reviewers: aprantl, dblaikie, echristo, probinson.
friss added a subscriber: llvm-commits.
echristo edited edge metadata.Jan 25 2016, 7:04 PM

I like this a lot more.

-eric

friss added a comment.Jan 25 2016, 7:04 PM

I'd appreciate it if others could look at the size impact and check that GDB continues to work with that.
I've got no clue how to write a good test case for this...

Either Dave or I can run it through some tests. I've got ideas on
testcases, but will have to get to it when I'm back from vacation.

-eric

probinson edited edge metadata.Jan 26 2016, 10:09 AM

Smooth.
I've asked Wolfgang to run a size experiment.

Can we really have a case where BB will change but there's no label?

Post-call cases... not as important as top-of-block cases, I think. Also, harder to detect reliably. Local-value instructions might not be immediately following the call instruction, if the call returns a value that has to be moved somewhere else or there's other post-call state to restore?
I think it's not horrible if post-call local-value instructions end up implicitly attached to the call's source location.

lib/CodeGen/AsmPrinter/DwarfDebug.h
246

Maybe PrevBB and PrevWasCall, for consistency?

friss added a comment.Jan 26 2016, 3:15 PM

Smooth.
I've asked Wolfgang to run a size experiment.

Thanks!

Can we really have a case where BB will change but there's no label?

In my experiments, yes. Maybe it's about fall through-only BBs, or something more subtle in the MC semantics, but I definitely needed to add a specific case for entry in a new BB.

Post-call cases... not as important as top-of-block cases, I think. Also, harder to detect reliably. Local-value instructions might not be immediately following the call instruction, if the call returns a value that has to be moved somewhere else or there's other post-call state to restore
I think it's not horrible if post-call local-value instructions end up implicitly attached to the call's source location.

To be honest, this wasn't in my first version of the patch, but when I looked back at the other thread, I saw that it was addressing this issue too. I do not feel strongly about that, I definitely could go without that part of the patch.

Wolfgang tried this patch on a couple of code bases: pre-patch .debug_line sizes were about 900KB and pushing 16MB. In both cases, the patch caused just over 40% increase in .debug_line size. (Builds done with -O0.)
That seems excessive, and makes me even more interested in dialing back the set of cases.

Removing the LastWasCall check got the size increase down to about 18%.
Wolfgang thinks the PrevLabel check contributes very little. Did you have a particular example where you thought the PrevLabel (but not at top-of-block) case would matter to the debugging experience?

friss added a comment.Jan 26 2016, 6:13 PM

Removing the LastWasCall check got the size increase down to about 18%.
Wolfgang thinks the PrevLabel check contributes very little. Did you have a particular example where you thought the PrevLabel (but not at top-of-block) case would matter to the debugging experience?

Yes. The PrevLabel check is my main motivation for this. The case I'm looking at is C++ code compiled at O0 with some always_inline methods. One inline block happens to end on an instruction that has no line information. When the debugger (LLDB) steps out of the fake inline frame, it looks at the line table to find out where it is. The previous line entry can't cover this location as it was in another frame and the next line entry hasn't started yet. So the debugger stops in some weird state. The label check ensures that any instruction pointed to gets line information.

(It's pretty easy to trigger this behavior, I can share the test case. The generated IR is too big to turn it into a useful unit test though).

Our FastISel patch dealt with line info for post-call local-value instructions mainly as an artifact of the implementation, not because we had complaints or thought it was an important case. I think omitting that case is fine. The line info for the call instruction implicitly covers the following local-value instructions, which may not be pedantically "correct" but does not have the funny stepping behavior problems that the top-of-block case has.
Your description of the inlining case sounds like a reasonable justification for the PrevLabel case, and yes it would be interesting to see the example as an experiment for our debugger.

Did you happen to try an always_inline+nodebug case? Lots of the x86 intrinsics do this, for example.

Some additional thoughts on reducing the size cost in the encoded line table.
First, using a null scope looks like it will force the file number to 1. See the inline comment.
Second, I'd hope that the column number is irrelevant and so we shouldn't bother explicitly setting it to zero. That's probably something that would have to be handled where the line-table gets encoded, rather than here, and so can be deferred to a follow-up.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1043

Passing nullptr here looks like it will force the file number to 1. Often it is 1 anyway, but not necessarily, e.g. for an inline function definition in a header file. If it isn't already 1 then we'll waste space setting the file to 1 and back for no benefit. To save space in the encoded line table, we should pass Scope here.
This didn't really matter so much when UnknownLocations was the only way to get these directives, but now we're doing it "normally" so it matters more.

Second, I'd hope that the column number is irrelevant and so we shouldn't bother explicitly setting it to zero. That's probably something that would have to be handled where the line-table gets encoded, rather than here, and so can be deferred to a follow-up.

To quantify this, a quick experiment showed a line-0 directive occupied 5 bytes, of which 2 bytes were setting the column to zero. Not setting the column reduces the size increase by 40%.

To quantify this, a quick experiment showed a line-0 directive occupied 5 bytes, of which 2 bytes were setting the column to zero. Not setting the column reduces the size increase by 40%.

Good to know - does that include changing the file too? (On either side of the equation/data)
Though I expect some of the cost of these new line table entries includes changing it back to something else on the other side - in some cases I imagine it'll be changing it back to the same file/line, potentially. Though I suppose if we don't change the file, we'll save both the change, and the change back. Column probably changes anyway so it doesn't cost us anything on the change "back" perhaps.

Right, my accounting was incomplete, and my example used unrealistically small line numbers. Let's try again being a little more analytical.

First off, I'm happy to make the simplifying assumption that column changes more-or-less every statement, and that we should accrue only one column-change per line-0 directive. (Offhand I don't know how Clang decides to set the column numbers.) Setting the column is a "standard" opcode, so occupies 1 byte plus the ULEB operand., which in this case will be 1 byte, for a total of 2 bytes. This cost is going to accrue for every line-0 directive, unless we optimize it away.
Setting the file is also a standard opcode, again 1 byte plus the ULEB operand, which will usually be 1 byte (unless the file number > 127) so again assume 2 bytes. This will happen twice per line-0 directive in a scope outside actual file #1, which is less common but surely does happen.

Setting the line number is more 'fun.' We'll simplify this to assume the line number is outside the line-range for special opcodes, so setting it to 0 and back requires standard opcodes. This time it's a 1-byte opcode plus an SLEB operand, because you don't "set" the line, you "advance" the line. The max value you can encode as SLEB in 2 bytes is 8191, if I'm counting the bits correctly. If we further assume most line numbers are in the range 127..8191 then setting the line number to 0 and back will require 3 bytes each. In addition we'll need a 1-byte special opcode to advance the address and "emit" the line record.

So, assuming we fix the file-number problem (trivially, see my inline comment) and we're up past line 128, then a line-0 directive will cost:
2 bytes to set column=0
3 bytes to advance line to 0 [e.g. advance by -200, if we were at line 200]
1 byte special opcode to emit the record
3 bytes to advance line to 201 [or whatever]
and the rest of it would be like usual.

That adds up to 9 bytes per directive (usually), and if we optimize away the column number, we save 2 bytes, or 22% of the increase (not 40%, sigh). But, I think, still worth doing.
--paulr

From: David Blaikie [mailto:dblaikie@gmail.com]
Sent: Wednesday, January 27, 2016 10:09 AM
To: reviews+D16569+public+6623e660a2ba7fa7@reviews.llvm.org; Robinson, Paul
Cc: Frédéric Riss; Adrian Prantl; Eric Christopher; llvm-commits
Subject: Re: [PATCH] D16569: Emit line 0 line information for interesting 'orphan' instructions

probinson added inline comments.Jan 27 2016, 3:49 PM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1043

Heh. Scope would have to come from PrevInstLoc, if defined, of course;
const MDNode *Scope = PrevInstLoc ? PrevInstLoc.getScope() : nullptr;
but still easy enough.
Wolfgang reports that this reduces the size-increase from 40% to 29%, or if we have already taken out LastWasCall, from 18% to 15%. This is all at -O0.

@friss, did you want to proceed with this?

friss added a comment.Feb 23 2016, 2:44 PM

@friss, did you want to proceed with this?

Sorry, I'm swamped. I can try to update the patch according to your feedback in th coming days.. I think we didn't get anyone testing this on LLDB and GDB. I just asked Adrian to have a look at LLDB, but I would really appreciate some GDB testing too.

Ping. I just diagnosed something today that reminded me of this patch.

@friss, do you mind if I commandeer this patch? I recently implemented the idea locally and it's looking pretty positive.

friss abandoned this revision.Aug 19 2016, 4:38 PM

I don't mind at all. I'm so sorry I didn't get back to this. I still think it'd be good to get some debugger testsuites to run on it, but I guess we can also let the CI handle that.

It'll still be good to run the tests. My confidence in the CI for the debug side of things is fairly low these days.

probinson commandeered this revision.Aug 19 2016, 5:43 PM
probinson edited reviewers, added: friss; removed: probinson.

Taking this over from friss. It will be a little while before I can upload a patch as I am mostly out next week and for sure want to run this by the gdb suite. I've never tried lldb before but I'll give that a go too.