
[DebugInfo] LiveDebugValues: don't create transfer records for potentially invalid locations
Needs Review · Public

Authored by jmorse on Sep 12 2019, 7:43 AM.

Details

Summary

This patch is the counterpart to D67393, and aims to solve the second problem mentioned there: transfers which are found to be invalid after location propagation occurs should not be inserted. The solution is simple: just don't record transfers until the in-locations are all known to be correct, and then reprocess all instructions a final time, creating transfers.

This involves touching every instruction in the program again, which is unfortunate, but an alternate implementation I wrote adds ~150 lines of code to try and unwind chains of transfers, involving a lot of additional complexity. IMO, this is a reasonable trade-off. As a bonus though, because we don't generate a transfer every time we look at a block, we don't generate redundant identical DBG_VALUEs if we re-visit a block.

There's a fractional reduction in scope-bytes-covered with this patch (<0.1%), all of which must have been invalid.

(I'll have some asan-build-time measurements of this in a couple of hours).
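The two-phase scheme described in the summary can be sketched roughly as below. This is an illustrative toy, not the actual LiveDebugValues code: the names (`processBlock`, `runAnalysis`, `Transfer`) are invented, and a null `Transfers` pointer stands in for "propagate locations only, don't record transfers".

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the real data structures.
struct Transfer { int InsertAfterInst; int VarLocID; };

// Per-block processing; a null Transfers pointer means "propagate locations
// only, do not record transfers". Returns whether live-in/out sets changed.
bool processBlock(int Block, std::vector<Transfer> *Transfers) {
  bool Changed = false;
  // ... examine instructions, update live-in/live-out sets ...
  if (Transfers)
    Transfers->push_back({/*InsertAfterInst=*/0, /*VarLocID=*/Block});
  return Changed;
}

std::vector<Transfer> runAnalysis(int NumBlocks) {
  // Phase 1: iterate to a fixedpoint; recording transfers here would be
  // premature, because in-locations may still be invalidated later.
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int B = 0; B < NumBlocks; ++B)
      Changed |= processBlock(B, /*Transfers=*/nullptr);
  }
  // Phase 2: in-locations are now final; one last pass records transfers.
  std::vector<Transfer> Transfers;
  for (int B = 0; B < NumBlocks; ++B)
    processBlock(B, &Transfers);
  return Transfers;
}
```

The cost is one extra pass over every instruction, but no transfer can ever be recorded from a not-yet-final in-location.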

Diff Detail

Event Timeline

jmorse created this revision. · Sep 12 2019, 7:43 AM
Herald added a project: Restricted Project. · Sep 12 2019, 7:43 AM
jmorse marked an inline comment as done.
jmorse added inline comments.
test/DebugInfo/MIR/X86/live-debug-values-bad-transfer.mir
17–19

I forgot to say in the summary: the absence of this incorrect DBG_VALUE is checked by the --implicit-check-not given to FileCheck.

just don't record transfers until the in-locations are all known to be correct

I'm having trouble with this statement conceptually. Aren't in-locations coming in through join() always correct, just that we didn't reach a fixpoint yet? I'm assuming that at the beginning all in-locations are unknown and only when all incoming basic blocks have outgoing locations, join() adds new incoming locations to be processed. I guess this boils down to how we are not really modeling "unknown" currently?

lib/CodeGen/LiveDebugValues.cpp
488

When is Transfers null?

I'm assuming that at the beginning all in-locations are unknown and only when all incoming basic blocks have outgoing locations, join() adds new incoming locations to be processed.

Ah, this is where the implementation diverges: in fact, if all incoming blocks have the same outgoing location *or* an unknown one for a variable, join() will consider it a valid live-in location. I think it has to be this way because of loops -- for a location that is live throughout a loop, there's a circular dependency between the outgoing location for the back-edge and the live-in location for the loop header. Right now we assume the location is correct for any unknown blocks, and have to invalidate locations later if that doesn't turn out to be true. See for example D66599.

I guess this boils down to how we are not really modeling "unknown" currently?

It turns out to be even more exciting than that -- consider this IR [0] which has badly behaved, but completely legal, control flow. It's illustrated in this highly sophisticated diagram [1]. A single variable location (toast==0) is created in the entry block, and then invalidated with an undef dbg.value in loop3. Because every non-entry/non-exit block is reachable from every other block, the location-invalidation has to be propagated to them all. However, it takes at least one LiveDebugValues iteration for the invalidation to traverse a back-edge, so it takes two LiveDebugValues iterations for the invalidation to move up the CFG and reach loop1. In that time, all the "unknown" blocks have been processed, and as a result locations have to transition from "true" to "false", rather than "unknown" to "false". It might be possible to generate a situation where something transitions from "false" to "true" too, but I haven't looked closely at that yet.

I think the fact that this is happening means I've made LiveDebugValues depart from people's expectations; you wrote in the 2016 mailing list discussion [2]

the information at the nodes can only move in one direction in the lattice and can change at most once.

which isn't true any more because of D66599 removing locations. I suspect I've turned LiveDebugValues into a data-flow-like algorithm rather than a joining-lattices algorithm. Whether this is a good idea or not probably needs discussing; it's all been in aid of getting rid of invalid locations though.

[0] https://reviews.llvm.org/P8163
[1] https://imgur.com/a/YdE9kN7
[2] https://groups.google.com/forum/#!search/llvm-dev/llvm-dev/nlgxIfDQnhs/4zsGbuoTAAAJ

jmorse marked an inline comment as done. · Sep 13 2019, 7:43 AM
jmorse added inline comments.
lib/CodeGen/LiveDebugValues.cpp
488

I've made the main loop (line 1392 on this patch) pass null to signify it doesn't want transfers recording.

A few revisions ago LiveDebugValues was passing enum values "transferChanges", "dontTransferChanges" to indicate whether transfers should be examined, we could do something similar here to make it clearer?


Thanks for the thoughtful explanation. I don't think this is good. I re-read D66599 and it sounds like it is also operating on the pre-D66663 premise that the first iteration is special, to prime the values. That's really bad because it makes reasoning about the algorithm very hard (as evidenced by our conversation ;-) and also turns the algorithm from a bitvector problem into something much more expensive, all the way to the point that I'm not even sure it will always terminate. I think the reason why D66599 needs to remove values is because we implemented join() and the unknown value incorrectly.

Is there a fundamental reason why we couldn't model LiveDebugValues to monotonically move results up the lattice until a fixpoint is reached? Looking at your CFG in [1], if we (conceptually) initialized every VarLoc with unknown and implement join such that join(unknown, X) == unknown, we should get an initial set of out-locs for each basic block that is derived just from the info we have for sure *within* each basic block and gradually more info gets propagated until we may or may not have a join function that produces new in-locs for a basic block. My assumption is that new in-locs would never result in an out-loc flipping from true->false but only in one flipping from unknown->something.

Maybe this assumption is wrong, but if it isn't, we should try hard to get LiveDebugValues to behave that way.

jmorse added a subscriber: Orlando. · Sep 16 2019, 7:27 AM

Adrian wrote:

I don't think this is good.

Cool;

Is there a fundamental reason why we couldn't model LiveDebugValues to monotonically move results up the lattice until a fixpoint is reached? Looking at your CFG in [1], if we (conceptually) initialized every VarLoc with unknown and implement join such that join(unknown, X) == unknown, we should get an initial set of out-locs for each basic block that is derived just from the info we have for sure *within* each basic block and gradually more info gets propagated until we may or may not have a join function that produces new in-locs for a basic block. My assumption is that new in-locs would never result in an out-loc flipping from true->false but only in one flipping from unknown->something.

I think that idea works -- but it seems very close to the "Available Expressions" dataflow algorithm, which (I think) you're suggesting is too expensive, or at least undesirable, right? Just to justify why I think they're similar: my understanding of available-expressions is that it initialises all possible expressions (here VarLocs) to true, then iteratively sets them to false as operands are killed (here clobbered), reaching a fixedpoint of expression availability (here variable locations) from their operand definitions (DBG_VALUEs or transfers) to kills (clobbers).

I think we would have difficulty with the fact that, even if we only implicitly made all locations "unknown" to begin with, we would still end up tracking a large number of "false" locations. Imagine the example in [0,1] were written in MIR, and that the entry variable location were a register, which was clobbered in loop3 rather than being explicitly undef'd. There would be (AFAIUI) no way of telling that the register clobber in loop3 invalidated a variable location we cared about; we would have to track all register clobbers as "false" locations on the off-chance they clobbered a real variable location. (This would work, though: it would be well defined, and would definitely terminate.)

I think the root cause is the circular dependency of locations in loops that I suggested above; we have to either

  • Speculatively consider a location true (until it's invalidated), which is what I've wound up doing, or
  • Collect the total set of invalidated locations

to break that loop.

~

I pointed @Orlando at this to see what he thought, he suggested that I've effectively downgraded "true" locations into being "explicitly unknown" or "maybe" locations that can later be found to be false; and that a true location is just a "maybe" location that never gets invalidated. I'm going to take a shot at proving that the current situation can't turn "false" locations true, which would make me feel better.

~

Alas, I'm also out of the office for the next two weeks so can't reply much further. I'll be at the conference though, in case this is best left on the backburner until then.

NB: I haven't forgotten about this, in particular the concern that the implementation on master right now might not complete. This is next on my list of things to mangle.

(I don't think the branch date for llvm-10 has been announced, but it's probably next month).

jmorse added a comment. · Jan 7 2020, 4:43 AM

Hi,

I think I can narrow the conceptual difference between our interpretations down to this:

we should get an initial set of out-locs for each basic block that is derived
just from the info we have for sure *within* each basic block and gradually
more info gets propagated until we may or may not have a join function that
produces new in-locs for a basic block

Plus, you're saying "up" the lattice and I'm saying "down".

Reading through "Compilers, Principles, Techniques and Tools" (Aho, Lam, Sethi, Ullman) [0], there are actually two flavours of dataflow analyses (section 9.2):

  • Those where all lattice values are initialized to the bottom element of the lattice, then are incrementally moved up the lattice as more truths about the program become available.
  • Those where all lattice values are initialised to the _top_ element of the lattice, and are incrementally moved _down_ the lattice as more information about false-ness in the program becomes available.

Effectively the former assumes nothing and tries to build facts up from the program blocks, whereas the latter assumes everything is true and then removes those facts that can be proved false. I've been exclusively thinking in terms of the second one, and I think your terminology indicates you see LiveDebugValues as the first. (It doesn't help that I've continued to use the term "join", when I think (90%) in the second model it's actually "meet". I can never remember which way round they are).

This would explain the confusion over this patch and D66599: in the first kind of dataflow analysis, during every iteration of the analysis, live-in and live-out sets only take on a lattice element if it's definitely true. Whereas in the second kind, lattice elements aren't necessarily correct until the fixedpoint is reached, because they might yet move down the lattice. That's the essence of this patch (D67500): not trying to determine transfers until after the fixedpoint is reached.

Furthermore, it explains why I see a need for multiple transitions in lattice elements. In the original lattice (putting \bot at the bottom):

True     False
  \       / 
   \     /
    \   /
     \ /
     \bot

You would only ever move up the lattice to True or False once the fact was definitely established -- and in that case, you would only ever transition once. Whereas I think the lattice I'm effectively working with is:

\top
  |
True
  |
False

Where all values are initialized to \top, the "unknown" element (when a block is unvisited). Values decay to True when the block is first visited and the location _could_ be true, and then potentially decay to False if that assumption is later found to be invalid. (Coincidentally, False is also \bot here).

Which dataflow model is better then? The Aho/Lam/Sethi/Ullman book says the first computes a minimal solution, the second a maximal, and for variable locations I suspect we want a maximal solution. But in addition, the Cooper/Torczon "Engineering a Compiler" book (Section 9.3.6) states the second kind is an optimistic algorithm suited for dealing with loops. If you have a loop like this:

entry:
  dbg.value(%0, ...)
  br label %loop
loop:
  call void @nop()
  br i1 %cond, label %loop, label %exit
exit:
  ret void

Then the live-in lattice element of the loop block has a circular dependency with itself due to the loop backedge: it will be True if it is already True. With the second kind of dataflow analysis, we assume the live-in is True and that the backedge will confirm this later: or if the location is False across the backedge then the live-in will later reduce to False. I don't think the first kind of analysis can cope with this case, as you have to start with some kind of assumption or speculation that the live-in is True to later find out whether the backedge live-out is True. Hence, the first kind of analysis produces a minimal solution, the second maximal. This is also the problem that kicked off the 2016 discussion about missing locations [1], locations not being propagated into loops.

In summary:

  • I think we've been talking about two different flavours of dataflow analyses, happily I think I've found the words to describe the difference,
  • I believe we want to use an optimistic dataflow analysis (the second kind above) because it's necessary to propagate locations through loops,
  • This second kind of dataflow analysis requires a lattice transition from True to False because it works by starting with assuming everything is True [2], then removing anything that can be proved to be False.

(I speculated in another comment that I might have written some code that performs False -> True transitions; happily that turns out to not be correct).

[0] Which I'm led to believe is the dragon book, although the international edition doesn't have a dragon on the cover :/
[1] https://groups.google.com/d/msg/llvm-dev/nlgxIfDQnhs/c55kcncIAAAJ
[2] Technically I think the Unknown lattice element in my model is unnecessary; it's a hack to let us avoid initialising every variable location ahead of time. Didn't want to get bogged down with it in this analysis though.

jmorse added a comment. · Feb 4 2020, 9:03 AM

Ping (summary of situation in previous comment).

Thanks for the writeup, this is very helpful. I am indeed thinking of LiveDebugValues as the first kind of analysis.

One thing that we must define, too, is what true and false mean in the context of LiveDebugValues. My understanding is that for each DBG_VALUE d in the analyzed MIR function and each instruction i

  • true means: "it is correct to insert d at position i"
  • false means: "it would be wrong to insert d at position i"

In other words at each instruction i the set of DBG_VALUEs that are "true" are the set of "live" DBG_VALUEs.

As for your loop example: *My* mental model actually used a three-element lattice that is suspiciously close to your second one, but using the opposite direction:

\top (wrong to place DBG_VALUE = false in the above sense / DBG_VALUE is not in the set)
   |
true (DBG_VALUE may be placed / DBG_VALUE is in the set)
   |
\bot (not fully analyzed yet)

We'd initialize everything with \bot.
join is a commutative function that is
join(x, x) = x
join(x, \bot) = x
join(x, \top) = \top
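The join just described can be written down directly. This is only a sketch of the lattice from the discussion (the enum names are invented, and `True`/`Top`/`Bot` here mirror the true/\top/\bot elements above, not any actual LiveDebugValues type):

```cpp
#include <cassert>

// Three-element lattice from the discussion:
//   Bot  = "not fully analyzed yet" (\bot)
//   True = "DBG_VALUE may be placed / is in the set"
//   Top  = "wrong to place DBG_VALUE / not in the set" (\top)
enum class LatticeVal { Bot, True, Top };

// join as specified above: commutative, with
//   join(x, x) = x,  join(x, \bot) = x,  join(x, \top) = \top.
LatticeVal join(LatticeVal A, LatticeVal B) {
  if (A == B) return A;                    // join(x, x) = x
  if (A == LatticeVal::Bot) return B;      // join(\bot, x) = x
  if (B == LatticeVal::Bot) return A;      // join(x, \bot) = x
  return LatticeVal::Top;                  // remaining case {True, Top} -> Top
}
```

Because every rule either returns an argument or \top, repeated joins can only move a value in the one direction \bot -> true -> \top, which is the monotonicity property relied on below.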

After processing the first basic block once, we get:

entry: ; {\bot}
  dbg.value(%0, ...) ; {true}
  br label %loop
loop: ; {\bot}
  call void @nop() ; {\bot}
  br i1 %cond, label %loop, label %exit ; {\bot}
exit: ; {\bot}
  ret void ; {\bot}

processing the second block once, we get

entry: ; {\bot}
  dbg.value(%0, ...) ; {true}
  br label %loop
loop: ; {join(true (entry), \bot (backedge)) = true}
  call void @nop() ; {true}
  br i1 %cond, label %loop, label %exit ; {true}
exit: ; {\bot}
  ret void ; {\bot}

processing the second block twice, we get

loop: ; {join(true (entry), true (backedge)) = true}

and then true is propagated to the final block. Every time a value changes all successor blocks are added to the worklist. The join function ensures that values can only move in one direction, from \bot -> true -> \top.

I *think* I just described something isomorphic to your type (2), just using other names.
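The walkthrough above can be reproduced with a small worklist loop. This is a toy model of the entry -> loop -> exit example, not the real implementation: each block's live-in is the join over its predecessors' live-outs, entry's dbg.value generates the location, and successors are re-queued whenever a live-out changes.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// The three-element lattice and join from the discussion above.
enum class LatticeVal { Bot, True, Top };
enum Block { Entry = 0, Loop = 1, Exit = 2, NumBlocks = 3 };

LatticeVal join(LatticeVal A, LatticeVal B) {
  if (A == B) return A;
  if (A == LatticeVal::Bot) return B;
  if (B == LatticeVal::Bot) return A;
  return LatticeVal::Top;
}

// Computes live-in values for one DBG_VALUE in the CFG
// entry -> loop -> exit, with a loop -> loop backedge.
std::vector<LatticeVal> computeLiveIns() {
  std::vector<std::vector<int>> Preds = {{}, {Entry, Loop}, {Loop}};
  std::vector<std::vector<int>> Succs = {{Loop}, {Loop, Exit}, {}};
  std::vector<LatticeVal> LiveIn(NumBlocks, LatticeVal::Bot);
  std::vector<LatticeVal> LiveOut(NumBlocks, LatticeVal::Bot);

  std::queue<int> Worklist;
  Worklist.push(Entry);
  while (!Worklist.empty()) {
    int B = Worklist.front();
    Worklist.pop();
    // Live-in is the join over all predecessors' live-outs.
    LatticeVal In = LatticeVal::Bot;
    for (int P : Preds[B])
      In = join(In, LiveOut[P]);
    LiveIn[B] = In;
    // Transfer function: entry's dbg.value generates the location; the
    // other blocks pass their live-in straight through.
    LatticeVal Out = (B == Entry) ? LatticeVal::True : In;
    // Re-queue successors whenever a live-out changes.
    if (Out != LiveOut[B]) {
      LiveOut[B] = Out;
      for (int S : Succs[B])
        Worklist.push(S);
    }
  }
  return LiveIn;
}
```

On the first visit to the loop block the backedge still supplies \bot, so join(true, \bot) = true; the second visit confirms join(true, true) = true, exactly as in the walkthrough.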

aprantl added a subscriber: davide. · Feb 4 2020, 1:12 PM
jmorse added a comment. · Feb 6 2020, 9:19 AM

Adrian wrote:

My understanding is that for each DBG_VALUE d in the analyzed MIR function and each instruction i

  • true means: "it is correct to insert d at position i"
  • false means: "it would be wrong to insert d at position i"

In other words at each instruction i the set of DBG_VALUEs that are "true" are the set of "live" DBG_VALUEs.

I agree,

[...]

I *think* I just described something isomorphic to your type (2), just using other names.

Yup, sounds like we're in violent agreement then.

There are now only (!) two big differences between this model and what the LiveDebugValues code actually does. I think they're sound, and will explain below. If it sounds sound, I can bake this into file documentation for everyone's future reference.

1, The implicit "Unknown" value: We don't actually represent your \bot at all; instead, a location in a block is implicitly unknown until the block is explored. Practically, this means that the result of a join(...) can only ever be your true or \top. Given that we always explore in reverse post-order, there's always an explored path back to the entry block, where the value must begin as \top. That means there's always a predecessor that has a value \top or true: so we can never produce a \bot from join anyway.

2, We don't "know" in advance about locations that are discovered during location propagation. For example, when a block contains, after some control flow:

DBG_VALUE $rax, ...
$rbx = COPY $rax

And no other variable location was in $rbx, then during exploration we create a new variable location of "$rbx" without exploring its behaviour from the entry block. Implicitly, all predecessor blocks get initialized to \top value for that variable location. I think this is /OK/ but messy. Given that the variable location was not seen when exploring predecessor blocks, they would have had \top propagated into them anyway from the entry block.

Successor blocks could potentially be initialized to \top too if the new location is discovered after the first dataflow iteration, and that could artificially limit the range of the variable location. However, I think in practice that these locations are _always_ found on the first iteration: we only ever move "up" your lattice and have no unknown values after a block is explored; therefore the maximum number of locations will be "true" on the first iteration; and so any new location that could be found _must_ be found on the first iteration.
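The COPY case above can be modelled as a small transfer-function step. This is a deliberately simplified sketch with invented names: a map from variable to the register holding it, updated when a copy is seen. The real pass tracks far more state (spills, fragments, clobbers) and decides whether to keep, duplicate, or move the location; this sketch simply follows the copy.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: variable name -> register currently holding its value.
using VarLocMap = std::map<std::string, std::string>;

// On "DestReg = COPY SrcReg": any variable whose location is SrcReg is
// given the newly-discovered location DestReg. This is how a location like
// $rbx can appear mid-propagation, without having been tracked from the
// entry block.
void transferRegisterCopy(VarLocMap &Locs, const std::string &DestReg,
                          const std::string &SrcReg) {
  for (auto &KV : Locs)
    if (KV.second == SrcReg)
      KV.second = DestReg; // new location discovered during exploration
}
```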


That would be great, and if any of our functions can benefit from renaming to fit established nomenclature, we should do that, too.

1, The implicit "Unknown" value: We don't actually represent your \bot at all, instead a location in a block is implicitly unknown until the block is explored.

It's not represented in the *data* at all since we only have a set, it just falls out of the algorithm. Can you remind me why this works again? Is it because we visit each block locally once before calling any join() function?

Practically, this means that the result of a join(...) can only ever be your true or \top. Given that we always explore in reverse-post-order, there's always an explored path back to the entry block where the value must begin \top. That means there's always a predecessor that has a value \top or true: so we can never produce a \bot from join anyway.

2, We don't "know" in advance about locations that are discovered during location propagation. For example, when we have in a block, after some control flow:

DBG_VALUE $rax, ...
$rbx = COPY $rax

And no other variable location was in $rbx, then during exploration we create a new variable location of "$rbx" without exploring its behaviour from the entry block. Implicitly, all predecessor blocks get initialized to \top value for that variable location. I think this is /OK/ but messy. Given that the variable location was not seen when exploring predecessor blocks, they would have had \top propagated into them anyway from the entry block.

How is this different from discovering a DBG_VALUE the first time?

Successor blocks could potentially be initialized to \top too if the new location is discovered after the first dataflow iteration, and that could artificially limit the range of the variable location. However, I think in practice that these locations are _always_ found on the first iteration: we only ever move "up" your lattice and have no unknown values after a block is explored; therefore the maximum number of locations will be "true" on the first iteration; and so any new location that could be found _must_ be found on the first iteration.

It's not represented in the *data* at all since we only have a set, it just falls out of the algorithm. Can you remind me why this works again? Is it because we visit each block locally once before calling any join() function?

I think it's because at least one predecessor is guaranteed to have been visited first (except for the entry block), which ensures there is at least one non-unknown lattice value for join(...) to return. Inductively,

  • The entry block always starts with \top,
  • join(...) can only return Unknown (\bot) if all inputs are Unknown

And because we only visit a block when its predecessors have been visited, join() will never see a block where all its predecessors have Unknown values.

This might fall apart if the entry block is also a loop head, but I think the entry block always starting with \top might fix that.
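The inductive argument above rests on a property of reverse post-order that can be checked directly: every reachable non-entry block has at least one predecessor (its DFS tree parent, at minimum) appearing earlier in the RPO. This sketch, with invented helper names, computes an RPO and verifies the property; it is not the LLVM RPO machinery.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Computes a reverse post-order of the blocks reachable from Entry,
// given each block's successor list.
std::vector<int> reversePostOrder(const std::vector<std::vector<int>> &Succs,
                                  int Entry) {
  std::vector<int> PostOrder;
  std::vector<bool> Seen(Succs.size(), false);
  std::function<void(int)> DFS = [&](int N) {
    Seen[N] = true;
    for (int S : Succs[N])
      if (!Seen[S])
        DFS(S);
    PostOrder.push_back(N);
  };
  DFS(Entry);
  return {PostOrder.rbegin(), PostOrder.rend()};
}

// True if every non-entry block in RPO has some predecessor that was
// already visited -- i.e. join() never sees all-unknown inputs.
bool everyBlockHasVisitedPred(const std::vector<std::vector<int>> &Succs,
                              const std::vector<int> &RPO) {
  std::vector<bool> Visited(Succs.size(), false);
  for (int N : RPO) {
    if (N != RPO.front()) {
      bool HasVisitedPred = false;
      for (size_t P = 0; P < Succs.size(); ++P)
        for (int S : Succs[P])
          if (S == N && Visited[P])
            HasVisitedPred = true;
      if (!HasVisitedPred)
        return false;
    }
    Visited[N] = true;
  }
  return true;
}
```

The property holds even with backedges (the DFS tree parent always finishes later, so it precedes its children in RPO), which is why only the entry block needs special handling.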

[Locations being discovered during exploration]

How is this different from discovering a DBG_VALUE the first time?

I guess it isn't different really. For *any* variable location we might only start tracking it when it's discovered during exploration, instead of tracking it from the entry block. To rephrase the question: do we still correctly implement the dataflow analysis if we begin tracking variable locations after exploration starts, instead of tracking from the entry block? As explained above, I think we get correct values for previously explored blocks (i.e., \top), and I think we're not artificially limiting "new" locations to \top in later blocks, because they should all be found on the first iteration.