Page MenuHomePhabricator

[DebugInfo] Process DBG_VALUE_LIST in LiveDebugValues
Needs ReviewPublic

Authored by StephenTozer on Jul 15 2020, 10:58 AM.

Details

Summary

Continuing the work discussed in the RFC[0], this patch implements DBG_VALUE_LIST handling in the LiveDebugValues pass.

This is without question the largest change to a single pass in terms of both line count and complexity, largely due to the complexity of the pass and the extent to which it is currently coupled with single machine location variables. Although the line count for this change is high, the logical changes to the patch derive largely from a few basic changes.

First and foremost, the VarLocSet+VarLocMap model has changed:

Previously, VarLocMap acted as a simple bidirectional map between machine locations and VarLocs. Each VarLoc would be assigned a single unique LocIndex by VarLocMap, which could be used to look up that VarLoc in the map. This coupled with VarLocSet, which was a set of LocIndexes optimized for range lookups (using the CoalescingBitVector class). Due to how LocIndex was structured, this allowed VarLocSet to be used to look up every LocIndex associated with a given machine location - either a specific register, or one of several shared buckets for spill locations, entry values, entry value backups, or constants (undef or immediate). VarLocSet also supports all other basic set operations, and was used to track all live VarLocs at a given point.

In order to support DBG_VALUE_LIST, a few changes have been made to this model. VarLocMap no longer provides a single unique LocIndex for a VarLoc, but a set of them corresponding to each machine location used; a single VarLoc will still never have multiple indices in a single bucket, and every LocIndex of a VarLoc must be inserted into any associated VarLocSet. Any one of these indices can be used by itself to look up the VarLoc they are assigned to. Within this index set, every VarLoc has an index in the "constant" bucket; the purpose of this is to provide a means of looking up every current VarLoc in VarLocSet.
VarLocSet is no longer strictly a "set of VarLocs", but a set of LocIndexes (this probably warrants a name change), as each VarLoc may have multiple LocIndexes inserted into the VarLocSet. Functionally, this means that VarLocSet is used slightly differently; the methods for looking up every VarLoc that uses a given bucket (a specific register, spill location, entry value, or entry value backup) is unchanged. When looking up every active VarLoc, instead of fetching the entire set we simply fetch every LocIndex with Location=0 (the "constant" bucket), as every VarLoc appears exactly once in there. While these changes make up a significant bulk of the line count, there are few resulting changes outside of the functions that directly operate on VarLocSets - the signatures of these functions are mostly unchanged.

The other major change is the VarLoc class itself. This mostly consists of cleaning up its interface to work with multiple machine locations, as well as separating the old "Kind" enum into MachineLocKind (for machine locations) and EntryValueKind (describing the VarLoc as a whole). The majority of these changes are simply using more specific functions to do things that didn't need them before, such as replacing if (VarLoc.getSpillLoc()) with if (VarLoc.containsSpillLocs()), or if (VarLoc.getReg() == RegNo) with if (VarLoc.usesReg(RegNo).

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html

Diff Detail

Event Timeline

StephenTozer created this revision.Jul 15 2020, 10:58 AM

Thanks for doing this!

Can you please update the top-level comment of the file, so we have the whole picture of the change?

llvm/lib/CodeGen/LiveDebugValues.cpp
263 ↗(On Diff #278233)

In stands for Indices here? I'd change that in order not to mix up terms with the In/Out...

267 ↗(On Diff #278233)

This should have more intuitive name, I guess...

308 ↗(On Diff #278233)

I don't understand why we need new Kind for the EntryValue here.

322 ↗(On Diff #278233)

why typedef here?

331 ↗(On Diff #278233)

Please add a comment here by describing the purpose of the structure.

1579 ↗(On Diff #278233)

This should be deleted?

1719 ↗(On Diff #278233)

We have declared this multiple times, therefore, can we declare it just once out of the methods?

llvm/test/DebugInfo/MIR/X86/dvl-livedebugvalues-clobber.mir
74

We don't need all the MF attributes. Most of them could be deleted (applies to all the tests).

StephenTozer marked 3 inline comments as done.Jul 17 2020, 7:49 AM
StephenTozer added inline comments.
llvm/lib/CodeGen/LiveDebugValues.cpp
263 ↗(On Diff #278233)

In this case, "In" just means "In", as in the VarLocs that are in a given range. I think there could be a better name for this as well as some of the other classes here; it might even be better to make them into full classes rather than just named sets/vectors. I'm not sure whether doing so would make it easier to understand what's going on, or just make it harder to compare to the previous behaviour.

308 ↗(On Diff #278233)

After completing the overall work I found the separation isn't strictly necessary, as entry values currently only ever consist of a single register. The logical distinction is that a complete VarLoc is or is not an EntryValue; a machine location doesn't have EntryValue-ness. Still: at the moment we lose no information by combining them, although if at any point we expand what is permissible for an entry value they should be separated.

322 ↗(On Diff #278233)

This union type is useful to have at the scope of VarLoc, since several VarLoc methods use variables with these values. The best alternative I think would be to move it into MachineLoc; regardless, having it as a typedef instead of an anonymous union is convenient in the functions below where it is used.

djtodoro added inline comments.Jul 20 2020, 1:13 AM
llvm/lib/CodeGen/LiveDebugValues.cpp
263 ↗(On Diff #278233)

OK. I thinks In/Out are well known already here, so I think we don't have to worry about that. We might want to update the comment above to remove any ambiguity.

308 ↗(On Diff #278233)

That makes sense then. There are targets where the entry value would be in multiple regs, but I am not sure we can face it with the current support. I am OK with the separation.

Remove unused data from tests, add/modify comments, move VarVec declaration up, fix error in duplicated-op-removal.

@StephenTozer Thanks for addressing the comments! Can you please share with us if you have any measurements with the respect to build time increase (e.g. on clang build itself) with this patch? I think that this patch is the only one from the stack that could give us potential performance issues.

  • Fixed a few minor errors, including a linux compile error
  • Rebased onto latest master, moving changes into VarLocBasedImpl.cpp.

Rebased onto recent master, no changes.

StephenTozer added a comment.EditedDec 8 2020, 9:48 AM

@StephenTozer Thanks for addressing the comments! Can you please share with us if you have any measurements with the respect to build time increase (e.g. on clang build itself) with this patch? I think that this patch is the only one from the stack that could give us potential performance issues.

I mentioned this in a comment on another review, but just to make sure the answer is available here: there doesn't appear to be a large compile time increase as a result of this change. I haven't yet measured LiveDebugValues specifically, but for the samples in the CTMark projects in the LLVM test suite the average compile time increase currently appears to be 0.5% at -O2 -g. At this point it's hard to say how much of this is down to LiveDebugValues, but it is probably the most significant pass in that regard.

Rebased onto recent master.