This is an archive of the discontinued LLVM Phabricator instance.

[LiveDebugValues] 2/4 Add instruction-referencing LiveDebugValues implementation
ClosedPublic

Authored by jmorse on Jul 2 2020, 6:54 AM.

Details

Summary

(Context: this thread http://lists.llvm.org/pipermail/llvm-dev/2020-June/142368.html )

This patch imports the instruction-referencing implementation of LiveDebugValues from the linked RFC, and adds it to the build system, but doesn't make it accessible at runtime yet -- that's the next patch.

The linked RFC contains the rationale behind this new implementation; the file comment provides a high level summary of the algorithm(s) behind it. This isn't an incremental change, so I'm not sure what the review criteria are, but at the very least I guess this needs to meet the code guidelines and be documented.

(This matches 1a20684b66339 on my public-github work, plus a few comment cleanups).

Diff Detail

Event Timeline

jmorse created this revision.Jul 2 2020, 6:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2020, 6:54 AM
jmorse added a comment.Jul 2 2020, 7:48 AM

NB: in D83054 I've added livedebugvalues_many_loop_heads.mir which contains a worked example (aka pathological input) that exercises the lattice-descent code for machine value numbers. It's definitely the ugliest part.

krisb added a subscriber: krisb.Jul 2 2020, 9:03 AM
vsk added a comment.Jul 6 2020, 11:52 AM

Thanks for this, Jeremy. It'll take me multiple passes to page all of this in. I hope to get to the core algorithm changes in my next review. In the interest of getting some feedback to you sooner rather than later, I've included some minor suggestions and questions inline.

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
241

You might find it convenient to use the new bitfield utilities from https://reviews.llvm.org/D82454.

245

I don't follow this caveat, could you rephrase this?

313

Wdyt of getting rid of these DenseMapInfo specializations? Having special reserved values complicates things a bit. If profiling demonstrates that std::map is a bottleneck, they could be added back.

321

Seems worthwhile to make this a proper class, with a constructor that accepts a MachineInstr and fills out the structure. I also find the name somewhat non-specific. Wdyt of "DbgValueProperties"?

360

Why does there need to be a LocIdx reserved for the case where the value in a register isn't tracked? It doesn't look like this is done for stack slots.

393

Could just be LocIDToLocIdx.assign(NumRegs, LocIdx(0))?

669

nit -- "Rec" is suggestive of "recurrence". Wdyt of naming this "DbgValue"?

674

Wdyt of grouping 'ID' and 'MO' in a union? This would make it clear that they cannot both be in use at the same time.

jmorse updated this revision to Diff 276039.Jul 7 2020, 6:44 AM

(Rebasing, only affects LiveDebugValues.h)

jmorse updated this revision to Diff 276099.Jul 7 2020, 9:18 AM
jmorse marked 11 inline comments as done.
  • Replace a std::pair with a DbgValueProperties class,
  • Replace some Densemaps of LocIdx with std::map, for initial cleanliness,
  • Rename ValueRec to DbgValue and make two exclusive fields a union,
  • Use an Optional<ValueIDNum> instead of in-band signalling when it's an invalid result,
  • Address additional assorted feedback.
jmorse added a comment.Jul 7 2020, 9:19 AM
In D83047#2133722, @vsk wrote:

Thanks for this, Jeremy. It'll take me multiple passes to page all of this in. I hope to get to the core algorithm changes in my next review. In the interest of getting some feedback to you sooner rather than later, I've included some minor suggestions and questions inline.

Thanks for taking the time!,

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
241

I'll read up on this for a further revision, I'm generally unfamiliar with the bitfield utilities as they are.

245

Hmm, I don't think I meant to upload that, now removed.

313

Sounds like a plan,

321

Replaced with a DbgValueProperties class. This flushes out several circumstances where the class was potentially being default-constructed on map-assignment, which it's probably good to suppress.

360

Ah, this is the curse of early optimisation: it's a shortcut so that LocIDToLocIdx can be an array with constant-time lookup to identify whether the corresponding register is tracked or not (signalled by the element being nonzero). This could easily be eliminated by making LocIDToLocIdx an associative array. It isn't necessary for stack slots as they're identified by an associative array elsewhere.

However, outside of MLocTracker, I've been using a zero LocIdx as shorthand for "this is an invalid/empty value/location", in the manner of Optional<> and None. It's not really tied to either registers or stack slots.

This isn't good practice, but there was a stage in prototyping where the machine value number live-in / live-outs of a block could be an empty or null value. I don't think this is the case any more (95% sure) , I'll spend some time polishing this "feature" out of the implementation.

669

Works for me, and I guess it drives home that this is the value of the variable.

674

Works for me too.

djtodoro added inline comments.Jul 13 2020, 4:15 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
133

From the RFC [0]:

The new implementation has enough information to solve this, as it tracks all
values in all machine locations. Unfortunately it doesn't make much difference
to the location statistics for clang RelWithDebInfo builds. An additional
benefit is that we would be better placed to produce saner locations lists [4].

Have you disabled the Debug Entry Values feature (in the original LideDebugValues) when comparing new and old implementations?
It adds a lot of extra inputs with location lists, so it may be interesting to see the comparison without it.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-June/142368.html

1649
jmorse marked 2 inline comments as done.Jul 14 2020, 10:09 AM

On the topic of entry values, I believe the value based tracking should make identifying entry values trivial -- all entry value numbers will be identifiable (as a ValueIDNum) with "BlockNo=0" and "InstNo="0", indicating a value defined before the first instruction of the first block. Ideally, the logic would be after machine-value-numbers and variable-values are propagated, when final locations are picked, and would look like this:

  • This variables value is an entry value (BlockNo=0,InstNo=0),
  • That value isn't available in any machine location right now
  • I will emit a DBG_VALUE containing a DW_OP_LLVM_entry_value expression.

Right now I've got some extra work on this (InstrRefBasedLDV) patch to be done, more on that shortly, and a patch series adding some initial DBG_INSTR_REF support to upload. I should be able to prod the entry-value situation after that.

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
133

I disabled entry-values for the comparison, as it wasn't going to be an apples-to-apples comparison otherwise.

I think there might be scope for more recovered variable locations once entry-value propagation is implemented as it's likely to be more precise. Right now it's just a hunch though.

1649

Entirely I hope!

On the topic of this implementation in general: the more I look at the vlocJoin method, the more I'm convinced it's incorrect. While it does produce an identical binary when stage2 building clang, I think it's only correctly dealing with the kind of control flow that C/C++ produces. If you expose it to weirder control flow, like these MIR files:

https://gist.github.com/jmorse/7e199c289b2a3c94f73a6f62b515b05f
https://gist.github.com/jmorse/1c9daca31cd5d6c0d4ba727a5bdfb016

It produces the wrong output (details in a comment on each gist). For the latter, the problem is caused by insufficient number of iterations of vlocDataflow, not sure about the former. One major problem is that by allowing "empty" variable values (represented by no entry in the live-in map), changes that should lead to a variable-value live-out change and thus trigger another dataflow iteration, instead terminates early.

I'm pretty confident that everything else in this pass implementation is correct, and that the variable-value situation can be fixed. If anyone's currently reviewing that portion of code though, it's probably best to stop as I'm going to mess with it further.

djtodoro added a comment.EditedJul 15 2020, 1:32 AM

On the topic of entry values, I believe the value based tracking should make identifying entry values trivial -- all entry value numbers will be identifiable (as a ValueIDNum) with "BlockNo=0" and "InstNo="0", indicating a value defined before the first instruction of the first block. Ideally, the logic would be after machine-value-numbers and variable-values are propagated, when final locations are picked, and would look like this:

  • This variables value is an entry value (BlockNo=0,InstNo=0),
  • That value isn't available in any machine location right now
  • I will emit a DBG_VALUE containing a DW_OP_LLVM_entry_value expression.

My initial impression went in that direction. The implementation of the callee side of the entry values feature (production of the DW_OP_entry_values) within GCC Value Tracking System was trivial as well. I think this new implementation of LLVM LDV will give us more freedom to use entry-values even for modified parameters, if we can express the modification in terms of its entry value (i.e. param = param + 2 ==> DW_OP_plus (DW_OP_entry_value (param), 2)).

Right now I've got some extra work on this (InstrRefBasedLDV) patch to be done, more on that shortly, and a patch series adding some initial DBG_INSTR_REF support to upload. I should be able to prod the entry-value situation after that.

vsk added inline comments.Jul 20 2020, 6:10 PM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
139

Hopefully there's no need for CoalescingBitVector.h :)

228

This "NUM_LOC_BITS" looks to be quite leaky -- I expect it'll save some pain to simply bite the bullet now and turn LocIdx into a real class / value type. Then all the asserts for 'Idx < (1<<24)' can be moved into the constructor.

243

Could you make {Inst,Block}No private, and introduce predicates like isPhi() instead of writing P.InstNo == 0?

360

That would be great - imo there's a significant long-term benefit to not having a representation for invalid state. (Other than Optional<> of course.)

431

In ValueIDNum::fromU64(Locs[ID]), where does the reset to InstNo = 0 happen? Maybe this would be clearer if we could write <someValueIDNum>.setIsPhi(true)?

449

UniqueVector does have a 'reset' method. Does it not work?

464

Is there some name more specific than "bumpRegister" that conveys "get a LocIdx for a register ID"?

464

Why does "Ref" have to be an in-out parameter? Is there some way to simply return a correct LocIdx?

473

const auto &MaskPair : reverse(Masks)?

534

Can the LocIdx for SP be initialized when the MLocTracker is initialized?

541

I see - so here, "kill the register" is represented by a def.

729

Could you leave a short blurb here explaining why it's sufficient to keep the last DbgValue for a DebugVariable?

872

Do we care to prefer a spill over a non-callee saved reg?

921

I find it difficult to parse It->second.first -- would you mind making LocAndProperties a struct, so it's clear that this is accessing a location?

1167

What makes it necessary to write this in terms of uint64_t, instead of ValueIDNum?

djtodoro added inline comments.Jul 21 2020, 2:56 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
279

we can delete this now?

djtodoro added inline comments.Jul 21 2020, 3:02 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
699

std::tie()?

and for some other operator==()s

jmorse updated this revision to Diff 280205.Jul 23 2020, 11:25 AM
jmorse marked 27 inline comments as done.

Thanks for all the input -- this update addresses more of the feedback so far, although I've left a few things for future update(s). As mentioned above, there's some sketchyness with vlocJoin, which I now have a fix for, which I'll hopefully fold into this revision today or tomorrow.

[Bah, comments didn't get submitted]

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
139

Removing this,

228

I've made this change; I've also moved the two maps indexed by LocIdx's in MLocTracker away from being plain vectors. They're now IndexedMaps, which are vectors underneath, but now have a tiny little bit more type safety about them.

There are now more places where we enumerate all locations by for-loop-and-counter over the contents of LocIdxToIDNum: I'd really enjoy having a locations() method to generate a range... however I'm not aware of C++ being able to describe integer ranges until C++20 :/

279

Sure -- this moves the "ActiveMLocs" container in TransferTracker back to being a std::map. As I think Vedant pointed out, if this turns out to be a performance thing, we can restore it some other time.

360

(I'm still planning on this, slowly)

431

Ugh; this was actually an out-of-date comment, now updated.

449

It's implemented within UniqueVector as Vector.resize(0, 0), which appears to assume that the mapped type can be initialized with a zero. I haven't put any thought into fixing it (yet) as it could be worked around.

464

I've revised this to better reflect what's really going on: this method really exists to handle tracking of a new register, and I've added a lookupOrTrackRegister method to abstract over just looking up a register.

534

Good idea; I've folded that into the constructor.

541

Yeah, my understanding of register masks is that they end the liveness of a register, which means that the registers value can't be relied upon. That means we have to over-write the value number with something -- originally I used the LocIdx(0) "empty value" indicator, but def'ing them works just as well.

872

(I've added this -- I don't have a useful setup for measuring the effects of this right now, alas).

1167

Nothing -- I think it just matched my mental model of "here's a table of value numbers" to be mangled. I'll switch to writing this as ValueIDNum, but in the interests of readable patches I'll do that in an additional revision.

jmorse updated this revision to Diff 280460.Jul 24 2020, 7:37 AM

Hi,

Here's the correction for vlocJoin. Note that I've added tests in D83054 as they're not runnable from this patch (ooof).

What the previous vlocJoin implementation failed to model was that we sometimes need to "propose" a variable value to then confirm it later. This is exactly what the VarLoc implementation is doing all the time [0], by allowing variable locations to be "true" until they're found to be "false", so that locations that are live-through a loop can be found. This doesn't happen for machine-value-numbers: we only work out what value-number a location has for that problem, while for variable-value joining we have to work out what the values are _and_ whether they're present at a common machine location to be a legitimate PHI.

I've addressed this by explicitly distinguishing "proposed" and "definite" values in the DbgValue "Kind" enumeration. Now:

  • A "proposed" value is a machine-value-number, that is the variable value at a loop head if you ignore the backedges,
  • A Def is any machine-value-number that is definitely the variable value in this live-in/live-out.

Proposed values are generated at loop heads whenever backedges don't agree on the live-in value. If we re-visit the loop head and find that the backedges now agree on the variable value (with proposed values), we've discovered a live-through value and it gets confirmed as a Def, which is propagated to all successors. If we never visit the loop head again though, the "proposed" value isn't considered a true variable location and is ignored during DBG_VALUE emission.

In terms of lattices, I think this splits each position into a proposed and definite element, but doesn't otherwise change the order of exploration. The test I've added (livedebugvalues_downgradefeedback_loop.mir in D83054) checks for this failure mode.

Two additional problems came up in the process:

  • The exploration order we've previously been using isn't actually reverse-post-order!
  • Moving down the lattice doesn't necessarily guarantee a loop head is re-visited later.

For the first item, take a look at the livedebugvalues_path_lengths.mir test I've added to D83054, and consider the machine value numbers. As the test comment explains, there's a long and short path from one block to another, which can lead to predecessors not having observed dominating changes that other predecessors have. This is fixed by, in the dataflow loops, adding immediate successors to the current worklist, and successors reached via backedges to the Pending worklist. This gives a stronger guarantee that all predecessors to a block will be visited, and will have observed changes to their own predecessors, before a block is visited.

The second item is demonstrated by livedebugvalues_insensitive_downgrade.mir in D83054, again considering machine value numbers. bb.1 generates a PHI value, which could be live-through the loop at bb.2, so we downgrade the live-in value at bb.2 to the PHI value from bb.1. bb.2 should be revisited to discover that the bb.1 PHI isn't live-through and a PHI should be generated at bb.2 -- however, the live-outs from both bb.2 and bb.3 do not change after the downgrade. This means bb.2 is never re-visited, and the PHI value at bb.2 is never discovered. I've fixed this by having any downgraded block booking itself to be re-visited on the next dataflow round.

There are a couple of final concerns with this implementation, firstly that there are now two ways of signalling the absence of a variable location. One is by not having a DbgValue present in the live-in/live-out map for the variable (a scenario caused by a DBG_VALUE $noreg in the transfer function), or, having vlocJoin fail to find a candidate value and create a "NoVal" DbgValue, which I've just added. I think I still need to come up with an explanation of how that fits into the lattice exploration.

Secondly, the implementation doesn't _explicitly_ store where it is in the lattice exploration for each block. I'm 95% convinced that exploring it in reverse-post-order means we "store" stat
e in the form of earlier-in-RPO block live-outs, which could do with formalising and writing down somewhere.

[0] https://github.com/llvm/llvm-project/blob/b7f6ecf0c7d4ac86ed4983311d0501e75c659e25/llvm/lib/CodeGen/LiveDebugValues.cpp#L69

jmorse updated this revision to Diff 280920.Jul 27 2020, 7:55 AM

Misery: unfortunately diff 280460 undid the contents of the previous revision (adding isPHI helpers etc). This latest update restores that work -- please diff between this update (280920) and 280205 to see the vlocJoin rework. Sorry for the extra bother.

jmorse updated this revision to Diff 283914.Aug 7 2020, 8:41 AM
jmorse marked 2 inline comments as done.

Hi,

Here's the latest revision; in this change I've suppressed use of the "zero LocIdx": all location indexes now mean the same thing. I've done this by scattering Optional around a variety of places. There's still one rough edge, which is that the IndexedMaps in MLocTracker want a reserved value to signify a mapping that doesn't exist -- I've added an "illegal" maximally-valued LocIdx for that, which is only ever generated inside the IndexedMap. This too could be eliminated by using a std::map instead, if we're aiming for complete clean-ness at this stage.

With no empty / LocIdx(0) value, DbgValue objects in the variable-value transfer functions now have an "Undef" kind, to signify when a value is explicitly erased. This DbgValue kind isn't used in the dataflow operations.

I've also removed various bits of code that generated special or empty ValueIDNums. There are currently three instances I can't get rid of:

  • Where the VarLoc LiveDebugValues previously lost information about what was being tracked, I'm overwriting values with a special ValueIDNum, "EmptyValue". Because we're tracking all values in all locations, erasing a value necessitates a special "this was erased" value unfortunately. This feature can disappear in the long run though, it's only useful for comparing the passes.
  • Similarly, when a register mask clobbers a register in a block, I'm generating a meaningless value number too (search for NotGeneratedNum).
  • Allocating the live-in/live-out ValueIDNum tables currently requires a default initializer; I might find a way to eliminate this in the future.

In addition, I've added proper constructors to DbgValue and DbgValueProperties. I've added an iterator class to MLocTracker and range-generation locations() method, to make it clearer when code is iterating over locations.

To my mind, the biggest outstanding issue is still the "NoVal" DbgValue Kind. Removing it doesn't lead to any incorrect variable locations, instead it leads to infinite loops in code like this:

bb1:
  DBG_VALUE $somereg
  JMP_1 %bb1

Where vlocJoin repeated switches between thinking "I can move down the lattice; let's assume the incoming variable value to bb1 is live-through" and "Nothing worked; there is no correct variable value incoming to bb1". I believe I just need a clean way of terminating exploration: and to document how this works.

In the meantime: to avoid burn-out from too much LiveDebugValues, I wrote up the initial set of patches that add DBG_INSTR_REF and the related LiveDebugValues code here [0], which will slowly make their way onto phab. On my plate right now:

  • Firm up the definition of this 'NoVal' DbgValue kind, so that the dataflow lattice comprehensively makes sense.
  • Upload the first tranche of patches at [0] onto phab,
  • Work on an entry-values implementation as that'll be interesting to compare against the VarLoc based version,
  • Upload other tranches of DBG_INSTR_REF work (PHI handling is the exciting part),
  • Polish everything.

[0] https://github.com/jmorse/llvm-project/commits/initial-dbg-instrs

davide added a subscriber: davide.Aug 10 2020, 1:18 PM
vsk added a comment.Aug 10 2020, 1:54 PM

Hi @jmorse, thanks for the updates on the DbgValue::{Proposed,NoVal} scheme.

I'm having trouble paging in enough of the DBG_INSTR_REF patch queue to continue giving meaningful review. I'm not sure the traditional review process is a great fit for this, since the changes are at a prototype stage. Perhaps we might be better served by landing things as they are, as a separate code path guarded behind a cl::opt? This would allow more people to test and contribute patches. Wdyt? (cc @aprantl @davide)

jmorse added a comment.EditedAug 10 2020, 2:45 PM
In D83047#2208102, @vsk wrote:

I'm having trouble paging in enough of the DBG_INSTR_REF patch queue to continue giving meaningful review. I'm not sure the traditional review process is a great fit for this, since the changes are at a prototype stage. Perhaps we might be better served by landing things as they are, as a separate code path guarded behind a cl::opt? This would allow more people to test and contribute patches. Wdyt? (cc @aprantl @davide)

I was a bit worried too; it is after all a massive code bomb, but I can't see a way of splitting it (InstrRefBasedLDV) into independently-useful parts, and it's at the root of all the other DBG_INSTR_REF work.

To state the obvious, it's a bad plan to have a pass that only one person understands. One option might be, assuming landing in a prototype behind-a-flag state, building up a suite of tests that take a specific path through the dataflow code and explain (in test comments) what they're doing and why. This is kind of what @TWeaver did with the tests at llvm/test/DebugInfo/MIR/X86/livedebugvalues_* to stimulate different paths through VarLoc LiveDebugValues, and could / would / should aid future understanding of what's going on.

Almost-worst-case scenario: the whole pass/file could be decomposed into three parts:

  • Transfer function building
  • Machine value-numbering dataflow
  • Variable value dataflow

With debug-like options to spit out the current state between each part, for which tests can be independently written. This would be fairly awkward, but worthwhile if it allows things to be understood.

Ninja-edit: and thanks for reviewing so far!

In D83047#2208102, @vsk wrote:

Hi @jmorse, thanks for the updates on the DbgValue::{Proposed,NoVal} scheme.

I'm having trouble paging in enough of the DBG_INSTR_REF patch queue to continue giving meaningful review. I'm not sure the traditional review process is a great fit for this, since the changes are at a prototype stage. Perhaps we might be better served by landing things as they are, as a separate code path guarded behind a cl::opt? This would allow more people to test and contribute patches. Wdyt? (cc @aprantl @davide)

Ping @aprantl regarding this ^^^.

To avoid vagueness: it'd be great if this could be landed as a prototype and then refined / built up from there.

aprantl accepted this revision.Aug 19 2020, 9:51 AM
In D83047#2208102, @vsk wrote:

Hi @jmorse, thanks for the updates on the DbgValue::{Proposed,NoVal} scheme.

I'm having trouble paging in enough of the DBG_INSTR_REF patch queue to continue giving meaningful review. I'm not sure the traditional review process is a great fit for this, since the changes are at a prototype stage. Perhaps we might be better served by landing things as they are, as a separate code path guarded behind a cl::opt? This would allow more people to test and contribute patches. Wdyt? (cc @aprantl @davide)

Ping @aprantl regarding this ^^^.

To avoid vagueness: it'd be great if this could be landed as a prototype and then refined / built up from there.

Since I'm also dealing with some cognitive overload at the moment, I'm pragmatically going to agree!

This revision is now accepted and ready to land.Aug 19 2020, 9:51 AM

Since I'm also dealing with some cognitive overload at the moment, I'm pragmatically going to agree!

Much appreciated; I'll try and find a good way of explaining what's going on here as part of the rest of the instruction-referencing work. (I'll land this during the weekend as I'm confident the bots will find _something_ wrong in this much C++).