Page MenuHomePhabricator

[docs] Specify rules for updating debug locations

Authored by vsk on Jun 4 2020, 3:12 PM.



Restructure HowToUpdateDebugInfo.rst to specify rules for when
transformations should preserve, merge, or drop debug locations.

The goal is to have clear, well-justified rules that come with a few
examples and counter-examples, so that pass authors can pick the best
strategy for managing debug locations depending on the specific task at

I've tried to set down sensible rules here that mostly align with what
we already do in llvm today, and that take a diverse set of use cases
into account (interactive debugging, crash triage, SamplePGO).

Please *do* try to pick these rules apart and suggest clarifications or
improvements :).

Side note: Prior to 24660ea1, this document was structured as a long
list of very specific code transformations -- the idea being that we
would fill in what to do in each specific case. I chose to reorganize
the document as a list of actions to take because it drastically cuts
down on the amount of redundant exposition/explanation needed. I hope
that's fine...

Diff Detail

Event Timeline

vsk created this revision.Jun 4 2020, 3:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 4 2020, 3:12 PM
echristo accepted this revision.Jun 4 2020, 3:47 PM
echristo added a reviewer: echristo.
echristo added subscribers: cmtice, tbosch, echristo.

This is really awesome to write up! One inline request for some elaboration, but otherwise I'm happy with this. For the record I was contemplating asking for more specifics under when to drop or moving the bits for merging down... but I'm not entirely sure that's better, just different :)



Bit more clarification here?

This revision is now accepted and ready to land.Jun 4 2020, 3:47 PM
dblaikie added inline comments.Jun 4 2020, 4:05 PM

Is this kind of "must" language consistent with the rest of this document? In this case it's an aspirational, sort of "this is ideal/what we're striving for" rather than "if you don't do this it'll fail the verifier/break", etc?


Should this be "or" rather than "and"? If you merge two instructions even in the same BB, I think the goal is you should use a merged location, right?


Might be worth a separate header - not sure how consistently we do this or how problematic it is if this isn't done.

vsk marked an inline comment as done.Jun 4 2020, 4:52 PM
vsk added a subscriber: danielcdh.
vsk added inline comments.

This is aspirational -- would "should" be more appropriate?


I'm of two minds on this one. The applicable (hypothetical) example here is "(A * B) + C => llvm.fma.f32(A, B, C)".

If the part about "being in a different block" is kept, then the location of the "fma" instruction should be determined by location cascade. That location has a realistic shot at mapping back to some real (if somewhat arbitrary) source line.

If get rid of the "being in a different block" language, the location of the "fma" instruction should be artificial (line 0).

It seems like both options suit the interactive debugging and crash triage use cases. But maybe location cascade would be a better fit for SamplePGO, if that needs to find a "real" line number to increment the right basic block execution count? It'd be helpful for anyone familiar with SamplePGO to express some opinion (cc @danielcdh).


How about: "Converting an if-then-else CFG diamond into a select. Preserving the debug locations of speculated instructions can make it seem like a condition is true when it's not (or vice versa), which can lead to a confusing single-stepping experience. The rule for :ref:dropping locations<WhenToDropLocation> should apply here." ?


That's a good point, e.g. I don't think the ASan pass bothers with setting line 0 locations. I'll try to split this up.

dblaikie added inline comments.Jun 5 2020, 9:35 AM

"should" sounds right to me - but open to other folks preferences/ideas about how this should be framed


Not sure I follow the fma example.

Let me rephrase the words in the document to see if we're both understanding them the same way: "IF multiple instructions are merged and the merged location is in a different block than one of the inputs - you /should/ (must, whatever) merge" - the implication is then if you have multiple instructions that don't change blocks, the location should /not/ be merged?

That doesn't sound like what I'd expect - in the fma example, if the * and - were from the same basic block, I'd still expect to merge the location rather than dropping or zeroing the location. So that scope information was preserved (eg: if * and - came from the same inline function, then the fma should still be attributed to that inline function).

vsk added inline comments.Jun 5 2020, 12:39 PM

Re: "the implication is then if you have multiple instructions that don't change blocks, the location should /not/ be merged" -- yep, we're on the same page about the proposed rule. Your point about expecting the right inline scope on merged instructions is compelling though. I'll fix this up.

Part of my confusion here was over how exactly SamplePGO works: can it correctly attribute a sample of a line 0 location to the scope it points to? Maybe that's moot, though, since relying on location cascade to assign locations to merged instructions could be more misleading than helpful.

dblaikie added inline comments.Jun 8 2020, 12:18 PM

Part of my confusion here was over how exactly SamplePGO works:

I don't actually know how it works.

can it correctly attribute a sample of a line 0 location to the scope it points to?

In theory I think such samples could be used, but fairly likely they aren't used.

(you could imagine a sample based profiler could look at all source lines within a sampled basic block - and mark all those source lines as sampled)

Greatly enjoying the fact that the rationale behind each rule is explained, thanks for putting this together.


Do you see this rule as covering tail duplication? AFAIUI preserving the location when tail duplicating is correct, but involves multiple non-unique predecessors. (Or, I misunderstand unique here).

probinson added inline comments.

I believe that (at least how we use it in Sony) SPGO doesn't use a line-0 location, but snoops around looking for a real source location in the same block. Jeremy will poke a couple people who actually worked on it to try to get a more definitive answer tomorrow (they're in the UK).

rob.lougher added inline comments.

When sampling an executable, the "hits" are addresses which must be mapped back to source locations using debug line info (i.e. addr2line). Any transform that moves or merges instructions without updating the debug location can therefore affect the quality of the sample. The standard example is tail-merging (which in fact was the first optimisation we fixed).

Imagine we have an if-then-else, where the "then" and "else" blocks end with the same instruction(s). Tail-merge will common the instructions in the successor block.

For block-reordering, we want to know if one side of the if-then-else is executed much more than the other. But if tail-merge has retained the original debug locations in the merged tail we will get an inaccurate sample. If the "then" location is used, hits on the common instructions will be attributed to the "then" block which is wrong. Likewise if the "else" locations were used. The location of the tail is ambiguous.

To fix this we introduced the "getMergedLocation" call. The initial version was very simple. If the locations were the same, either can be used. But if the locations are different, we can't give a location and an empty ("unknown") location is used.

At some point, this "unknown" location ends up generating a DWARF line-0 record. Now this is the bit I didn't directly work on so I don't know where it occurs. My memory is that the work was done by Paul, and that an "unknown" location at the start of a basic block forced a line-0. This did exactly what we wanted, and we rewrote our initial patch to tail-merge that explicitly created a line-0 location.

rob.lougher added inline comments.Jun 9 2020, 7:22 AM

P.S. We did a presentation of this work at EuroLLVM 2017 ( which you can watch for more examples.

probinson added inline comments.Jun 9 2020, 7:45 AM

Rob, I think the most relevant question is: If there's a sample hit on an address that maps to line 0, what does SPGO do with it? Throw it away? Rummage around for a non-zero location?

rob.lougher added inline comments.Jun 9 2020, 10:31 AM

This is handled by AutoFDO (which reads a "raw" sample and converts it into a sample with lines relative to the start of the function). It's 4 years since I hacked that code so I wanted to check to make sure.

Anyway, the answer is that samples for addresses that map to line 0 are simply thrown away...

vsk marked 2 inline comments as done.Jun 10 2020, 11:37 AM
vsk added inline comments.

Thanks for catching this, I think it should cover tail duplication. I've updated the wording to: '... if its basic block is folded into a predecessor that branches unconditionally'.


@probinson @rob.lougher thanks for the context on SamplePGO. I'm going to propose switching the language here to recommend merging instruction locations, because the alternative of relying on location cascade doesn't seem like it's better for SamplePGO purposes.

vsk updated this revision to Diff 269924.Jun 10 2020, 11:39 AM

Incorporate review feedback. Mainly:

  • Recommend merging locations in block-local combines. Add a counter-example where this doesn't apply.
  • Recommend preserving locations in tail duplication.

Thanks! I added a few nits inline.


s/source lines/source locations/

That's also what they are called in Clang.


Debugging, crash logs, and SamplePGO accuracy, ...






Simple peephole optimizations that replace or expand an instruction, like


I would either reuse + and << or use add and shl in the previous line




I wonder if we should drop the Simple/Complex adjectives altogether?


I would be curious about an explanation what makes this example different from the fma example above.

vsk marked 8 inline comments as done.Jun 11 2020, 6:25 PM
vsk added inline comments.

Thanks for pressing on this, it's not terribly clear as-written. How about: "Block-local peepholes which delete redundant instructions, like `(sext (zext i8 %x to i16) to i32) => (zext i8 %x to i32). The inner zext` is modified but remains in its block, so the rule for preserving locations should apply."

vsk updated this revision to Diff 270276.Jun 11 2020, 6:25 PM

Address some review feedback.

aprantl added inline comments.Jun 12 2020, 10:41 AM

sounds good!

vsk added a comment.Jun 17 2020, 3:35 PM

I think I've addressed all the outstanding feedback. I'll plan to land this in 24h if there aren't any more comments. Thanks!

We recently ran into a case where it looks like the variable's location was correct but the containing scope was wrong. I guess we'll need a separate update for how to manage scopes.
Not an objection to this patch, just observing there are more cases to think about.

jmorse accepted this revision.Jun 18 2020, 10:11 AM

For the record, this all LGTM.

vsk added a comment.Jun 18 2020, 2:05 PM

We recently ran into a case where it looks like the variable's location was correct but the containing scope was wrong. I guess we'll need a separate update for how to manage scopes.
Not an objection to this patch, just observing there are more cases to think about.

Thanks for flagging the issue, it's not something I've spent much time thinking about yet. My plan is to try and write some guidelines for updating debug values next. If you (or anyone else) would like to write up some guidance about dealing with scopes, that would be great!

This revision was automatically updated to reflect the committed changes.