Here's a patch that adds the basic storage for non-instruction variable location information. This has the potential to be very exciting, but only in the future: this implementation is deliberately naive, implementing a base level of functionality. We're not looking to make debug-info super fast in this specific patch series.
It also has a "rough" surface area because there's one artefact that we haven't managed to polish out yet: the total order of debug records, i.e. the fact that runs of dbg.values in a block have a meaningful order and can affect each other. Eliminating that is tied in with future storage optimisations, so we haven't attempted it yet. There are only roughly five places in LLVM that really need to care about this (see future patches).
Hopefully the file-level comment in the added files and the other doc-comments are sufficient to piece together what's happening here: dbg.values are represented by DPValues, with their position stored in a DPMarker, which is then attached to an instruction. DPValues attempt to look as much like a DbgValueInst as possible, having most of the same accessors. The primary difference is that they're not an Instruction and don't exist in the Value hierarchy. DPMarkers exist as a connection to the Instruction that is the position of the DPValue in a block, and to provide various utility methods for shuffling DPValues around. These are exercised in the added DebugInfoTest.cpp unit test.
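To make the shape of that relationship concrete, here is a rough sketch using toy stand-ins. None of these types or members (ToyDPValue, ToyDPMarker, ToyInstruction, absorb, variablesBefore) are the real classes or methods from the patch; they are hypothetical names that only illustrate "records live in a marker, the marker hangs off an instruction".

```cpp
#include <list>
#include <string>
#include <vector>

// Toy stand-ins only; these are NOT the classes added by the patch.
struct ToyDPValue {
  std::string Variable; // source variable being described
  std::string Location; // name of the value that holds it
};

struct ToyInstruction;

// Holds the debug records positioned immediately before one instruction.
struct ToyDPMarker {
  ToyInstruction *Owner = nullptr;
  std::list<ToyDPValue> Records; // order is meaningful

  // Hypothetical utility: move every record from Other onto the end of
  // this marker, preserving their relative order.
  void absorb(ToyDPMarker &Other) {
    if (&Other == this)
      return;
    Records.splice(Records.end(), Other.Records);
  }
};

struct ToyInstruction {
  std::string Opcode;
  ToyDPMarker *Marker = nullptr; // null when there are no debug records
};

// Collect the variables described just before an instruction, in order.
std::vector<std::string> variablesBefore(const ToyInstruction &I) {
  std::vector<std::string> Out;
  if (I.Marker)
    for (const ToyDPValue &V : I.Marker->Records)
      Out.push_back(V.Variable);
  return Out;
}
```

The point of the sketch is only that the records are plain objects reached through the instruction, rather than instructions in the block's own list.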
Because a DPValue is neither metadata nor a Value, but still needs to refer to ValueAsMetadata objects wrapping instructions and other things in the Value hierarchy, we need to plumb DPValues into the metadata-lookup facilities. This is so that we can later query "give me all the DPValues referring to this Value", in the same way that you can for dbg.values. It's also needed so that when a Value gets RAUW'd, all the metadata-connected users get told about it: see the added unit test in LocalTest.cpp.
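As an illustration of the two queries described above ("who refers to this Value?" and "tell them about RAUW"), here is a minimal sketch with a toy side table. It is an assumption-laden model, not how LLVM's metadata tracking is implemented: ToyValue, ToyDebugRecord, ToyTracker and all of their methods are hypothetical.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-ins; not LLVM's Value, DPValue or metadata tracking.
struct ToyValue {
  int Id = 0;
};

struct ToyDebugRecord {
  ToyValue *Location = nullptr;
};

// Hypothetical side table: which debug records refer to which value.
class ToyTracker {
  std::unordered_map<ToyValue *, std::vector<ToyDebugRecord *>> Users;

public:
  // Point R at V and remember that it does so.
  void track(ToyDebugRecord &R, ToyValue &V) {
    R.Location = &V;
    Users[&V].push_back(&R);
  }

  // "Give me all the records referring to this value."
  std::vector<ToyDebugRecord *> usersOf(ToyValue &V) const {
    auto It = Users.find(&V);
    if (It == Users.end())
      return {};
    return It->second;
  }

  // Replace-all-uses-with: repoint every tracked record from Old to New.
  void replaceAllUsesWith(ToyValue &Old, ToyValue &New) {
    auto It = Users.find(&Old);
    if (It == Users.end())
      return;
    std::vector<ToyDebugRecord *> Moved = std::move(It->second);
    Users.erase(It);
    for (ToyDebugRecord *R : Moved) {
      R->Location = &New;
      Users[&New].push_back(R);
    }
  }
};
```

Without some registration step like track(), a record that sits outside the Value hierarchy would silently keep pointing at the old value after a replacement.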
This patch is probably the correct place to talk about two significant storage changes:
- Instructions will gain another pointer in their body, pointing at an optional DPMarker. This is a noteworthy cost because it affects non-debuginfo builds. I also think it's unavoidable: we need to store precise positions for debug-info and access them from an instruction quickly. This is a trade-off: more memory in normal builds contributes to faster -g builds, in the same way that there's a DebugLoc in class Instruction already.
- Blocks will gain a "TrailingDPValues" object, essentially a list of "dangling" DPValues for when blocks transiently don't have a terminator. Again, this inflates the size of a data structure; again, I don't think this is avoidable. Removing the terminator from a block is a legitimate operation, and we still require a location for debug-info in those circumstances.
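The trailing-records case can be sketched as follows. This is a toy model under stated assumptions, not the patch's implementation: ToyInst, ToyBlock, removeTerminator and appendTerminator are hypothetical names, and records are reduced to plain strings.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins; not LLVM's Instruction or BasicBlock.
struct ToyInst {
  std::string Opcode;
  bool IsTerminator = false;
  std::vector<std::string> RecordsBefore; // records positioned before this
};

struct ToyBlock {
  std::vector<ToyInst> Insts;
  std::vector<std::string> Trailing; // records with no instruction to attach to

  // Remove the terminator, parking its records at the end of the block
  // so that their position is not lost.
  void removeTerminator() {
    if (Insts.empty() || !Insts.back().IsTerminator)
      return;
    Trailing = std::move(Insts.back().RecordsBefore);
    Insts.pop_back();
  }

  // Append a new terminator and hand the parked records back to it.
  void appendTerminator(std::string Opcode) {
    ToyInst T;
    T.Opcode = std::move(Opcode);
    T.IsTerminator = true;
    T.RecordsBefore = std::move(Trailing);
    Trailing.clear();
    Insts.push_back(std::move(T));
  }
};
```

In this model the Trailing list is only non-empty in the window between the two calls, which is the "transiently no terminator" state the second bullet describes.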
Naming: note that the files added are named "DebugProgramInstruction", and other classes prefixed "DP". This is because we originally prototyped all of this as a "shadow" list in a block, effectively a "debug program" behind the real program. We've since moved on from that, but there are still a few names baked in, and I can't for the life of me come up with anything better right now. Suggestions welcome!
Looking back over this, we should be using a unique_ptr to store the pointers being added, but I'm a bit too burnt out to fix that right now; coming in a future revision!
How frequently do we expect TrailingDPValues to be used? Is it worth having this heap allocated instead to save some bits?