This patch stack implements the assignment tracking analysis. This patch contains the main body, but there are unfortunately a few more large code blobs to follow.
The problem and goal
Using the Assignment Tracking "model" it's not possible to determine a variable location just by looking at a debug intrinsic in isolation. Instructions without any metadata can change the location of a variable. The meaning of dbg.assign intrinsics changes depending on whether there are linked instructions, and where they are relative to those instructions. So we need to analyse the IR and convert the embedded information into a form that SelectionDAG can consume to produce debug variable locations in MIR.
The core of the solution is a dataflow analysis which, aiming to maximise the memory location coverage for variables, outputs a mapping of instruction positions to variable location definitions.
High level overview and API
AssignmentTrackingAnalysis is a pass that analyses IR to produce a mapping of instruction positions to variable location definitions. The results are encapsulated by the FunctionVarLocs class.
The pass is integrated with LLVM in this patch but the analysis is not used yet. A future patch updates SelectionDAG separately.
The results of the analysis are exposed via getResults, which returns a const FunctionVarLocs * with the following const methods:
  const VarLocInfo *single_locs_begin() const;
  const VarLocInfo *single_locs_end() const;
  const VarLocInfo *locs_begin(const Instruction *Before) const;
  const VarLocInfo *locs_end(const Instruction *Before) const;
  void print(raw_ostream &OS, const Function &Fn) const;
Debug intrinsics can be ignored after running the analysis. Instead, variable location definitions that occur between an instruction Inst and its predecessor (or block start) can be found by looping over the range:
locs_begin(Inst), locs_end(Inst)
Similarly, variables with a memory location that is valid for their lifetime can be iterated over using the range:
single_locs_begin(), single_locs_end()
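To make the two ranges concrete, here is a minimal sketch of how a consumer might walk them. It does not use the real LLVM classes: Instruction, VarLocInfo and FunctionVarLocs below are simplified stand-ins written for this example, the builder methods addSingleLoc and addLocsBefore are hypothetical, and the storage layout (one flat vector plus per-instruction index ranges) is an assumption. Only the begin/end accessors mirror the signatures listed above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins for the LLVM types; only the shape of the API matters.
struct Instruction { std::string Name; };
struct VarLocInfo { std::string Variable; std::string Location; };

class FunctionVarLocs {
  std::vector<VarLocInfo> Records; // Single-location records are stored first.
  unsigned SingleLocsEnd = 0;
  // Assumed layout: [first, second) index range into Records per instruction.
  std::unordered_map<const Instruction *, std::pair<unsigned, unsigned>> Ranges;

  std::pair<unsigned, unsigned> rangeBefore(const Instruction *I) const {
    auto It = Ranges.find(I);
    // Instructions with no recorded definitions get an empty range.
    return It == Ranges.end() ? std::make_pair(0u, 0u) : It->second;
  }

public:
  // Hypothetical builder methods, not part of the API listed above.
  void addSingleLoc(const VarLocInfo &Loc) {
    assert(Ranges.empty() && "single locations must be added first");
    Records.push_back(Loc);
    ++SingleLocsEnd;
  }
  void addLocsBefore(const Instruction *I, const std::vector<VarLocInfo> &Locs) {
    unsigned Start = static_cast<unsigned>(Records.size());
    Records.insert(Records.end(), Locs.begin(), Locs.end());
    Ranges[I] = std::make_pair(Start, static_cast<unsigned>(Records.size()));
  }

  const VarLocInfo *single_locs_begin() const { return Records.data(); }
  const VarLocInfo *single_locs_end() const {
    return Records.data() + SingleLocsEnd;
  }
  const VarLocInfo *locs_begin(const Instruction *I) const {
    return Records.data() + rangeBefore(I).first;
  }
  const VarLocInfo *locs_end(const Instruction *I) const {
    return Records.data() + rangeBefore(I).second;
  }
};

// Collect the names of variables whose location is (re)defined just before Inst.
std::vector<std::string> variablesDefinedBefore(const FunctionVarLocs &FVL,
                                                const Instruction *Inst) {
  std::vector<std::string> Names;
  for (const VarLocInfo *It = FVL.locs_begin(Inst), *End = FVL.locs_end(Inst);
       It != End; ++It)
    Names.push_back(It->Variable);
  return Names;
}
```

The point of the sketch is the consumer loop at the bottom: a client asks for the range attached to an instruction and never needs to inspect debug intrinsics itself.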
Dataflow high level details
The analysis itself is a standard fixed point dataflow algorithm that traverses the CFG using a worklist that is initialised with every block in reverse post order. It computes a result for each visited block that is used to compute the result of successor blocks. Each time the result changes for a block its successors are added to the worklist if not already present. The analysis terminates when the result of every block is stable. Care has been taken to ensure that the merging of information from predecessor blocks yields a result that changes monotonically.
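The worklist scheme described above can be sketched in isolation. This is a generic illustration rather than the patch's code: the Block struct, the runToFixedPoint name, the use of a bit-set union as the join, and the Gen transfer function are all assumptions made for the example. Block indices are assumed to already be reverse post order numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical CFG node. A block's index in the vector doubles as its RPO number.
struct Block {
  std::vector<int> Preds;
  std::vector<int> Succs;
  uint32_t Gen = 0; // Facts this block adds (illustrative transfer function).
};

// Returns the live-out fact set of every block once all results are stable.
std::vector<uint32_t> runToFixedPoint(const std::vector<Block> &CFG) {
  const int NumBlocks = static_cast<int>(CFG.size());
  std::vector<uint32_t> LiveOut(NumBlocks, 0);

  // Min-heap keyed on RPO number, initialised with every block.
  std::priority_queue<int, std::vector<int>, std::greater<int>> Worklist;
  std::vector<bool> OnWorklist(NumBlocks, true);
  for (int BB = 0; BB < NumBlocks; ++BB)
    Worklist.push(BB);

  while (!Worklist.empty()) {
    int BB = Worklist.top();
    Worklist.pop();
    OnWorklist[BB] = false;

    // Join: merge the live-outs of the predecessors into the working set.
    uint32_t LiveSet = 0;
    for (int Pred : CFG[BB].Preds)
      LiveSet |= LiveOut[Pred];
    // Process: apply the block's effect to the working set.
    LiveSet |= CFG[BB].Gen;

    if (LiveSet == LiveOut[BB])
      continue; // Result is stable, successors need no revisit.
    LiveOut[BB] = LiveSet;

    // The result changed: queue successors that are not already queued.
    for (int Succ : CFG[BB].Succs) {
      assert(Succ >= 0 && Succ < NumBlocks && "successor index out of range");
      if (!OnWorklist[Succ]) {
        OnWorklist[Succ] = true;
        Worklist.push(Succ);
      }
    }
  }
  return LiveOut;
}
```

Because set union only ever adds bits, each block's result can change a bounded number of times, which is the monotonicity property that guarantees termination in this toy version.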
For each block we track "live-in" (LiveIn) and "live-out" (LiveOut) results. The former represents the currently known input to a block, which is the merged (join) result of the live-outs of visited predecessors (empty for the entry block). The live-in set is copied to create a working set for the block (LiveSet). The working set is modified as each instruction in the block is processed (process). After processing the last instruction in the block, the working set is the live-out result for the block.
The "results" are BlockInfo objects. These encode assignments to memory and to variables, and track whether each variable's memory location is a good debug location for the variable or not. The actual variable location information (concrete implicit location value, or memory address) is stored off to the side in InsertBeforeMap, which is used after the dataflow is complete to build the instruction -> location definition mapping.
Patch tour
Here's a high-level call-graph that hopefully helps patch navigation.
+-runOnFunction
  +-analyzeFunction
    +-run
      +-process
      | +-processNonDbgInstruction
      | | +-processTaggedInstruction
      | | +-processUntaggedInstruction
      | |
      | +-processDbgInstruction
      |   +-processDbgAssign
      |   +-processDbgValue
      |
      +-join
        +-joinBlockInfo
          +-joinLocMap
          | +-joinKind
          |
          +-joinAssignmentMap
            +-joinAssignment
AssignmentTrackingLowering::run (just run above) is where the dataflow starts. Most of this function is dedicated to initializing helper structures and setting up the worklist traversal scaffolding. The important functions called from here are join and process.
It's probably easier to start with join, as that gives an understanding of the types involved, which in turn gives process more meaning. join is responsible for merging the live-outs of predecessors. See the doc comment at the forward declaration in the class definition. join calls other joinXYZ methods, and those call another set, which together merge every element of BlockInfo.
BlockInfo is made up of 3 maps.
  LocMap LiveLoc;
  AssignmentMap StackHomeValue;
  AssignmentMap DebugValue;
LiveLoc maps variables to LocKind, which describes the current kind of location for each variable.
StackHomeValue maps variables to the last Assignment to their stack slot (N.B. looking at this now, maybe it should be keyed by address rather than variable; this can come later as a refactor if necessary, as it will likely need changing with one of the TODO list items in D132220).
DebugValue maps variables to the last Assignment to the variable.
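As a rough illustration of how the three maps might be merged element by element, here is a self-contained sketch. It is not the patch's implementation: variables are keyed by name, the Assignment struct (a Known flag plus an ID) is a hypothetical stand-in, and the merge rule is an assumption, namely that values which agree in both predecessors are kept and anything else (a disagreement, or a variable present in only one map) collapses to a "no single answer" value.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class LocKind { Mem, Val, None };

// Hypothetical stand-in: Known == false means "no single known assignment".
struct Assignment {
  bool Known;
  unsigned ID;
  bool operator==(const Assignment &O) const {
    return Known == O.Known && ID == O.ID;
  }
};

using LocMap = std::map<std::string, LocKind>;
using AssignmentMap = std::map<std::string, Assignment>;

struct BlockInfo {
  LocMap LiveLoc;
  AssignmentMap StackHomeValue;
  AssignmentMap DebugValue;
};

// Assumed rule: keep a value only if both inputs agree on it, otherwise
// record Conflict. Variables present in just one input also get Conflict.
template <typename T>
std::map<std::string, T> joinMap(const std::map<std::string, T> &A,
                                 const std::map<std::string, T> &B,
                                 const T &Conflict) {
  std::map<std::string, T> Result;
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    bool Agree = It != B.end() && It->second == Entry.second;
    Result.emplace(Entry.first, Agree ? Entry.second : Conflict);
  }
  for (const auto &Entry : B)
    Result.emplace(Entry.first, Conflict); // No-op if the key already exists.
  // Every variable seen in either input has an entry in the result.
  assert(Result.size() >= A.size() && Result.size() >= B.size());
  return Result;
}

BlockInfo joinBlockInfo(const BlockInfo &A, const BlockInfo &B) {
  const Assignment Unknown = {false, 0};
  BlockInfo Join;
  Join.LiveLoc = joinMap(A.LiveLoc, B.LiveLoc, LocKind::None);
  Join.StackHomeValue = joinMap(A.StackHomeValue, B.StackHomeValue, Unknown);
  Join.DebugValue = joinMap(A.DebugValue, B.DebugValue, Unknown);
  return Join;
}
```

Under this rule the conflict value absorbs everything joined with it, so repeated joins can only move a variable towards "unknown", never back.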
process is where instructions in a block are analysed. The important functions here are addMemDef, addDbgDef, setLocKind, and emitDbgValue. All the leaf process functions call these, so I didn't add them to the call graph. A call to addMemDef states that a store with a given ID to a variable fragment's memory location has occurred. Similarly, addDbgDef states that an assignment with an ID to a fragment of a variable has occurred. When the variable's memory location assignment and the debug assignment "match", a variable location definition describing the memory location is emitted. Otherwise, an appropriate implicit location value is chosen. setLocKind sets whether the current variable location for the variable is Mem, Val or None, and emitDbgValue saves the location to InsertBeforeMap.
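The "match" idea can be shown with a deliberately simplified model. In this sketch, FragmentState, noteStore and noteDbgAssign are hypothetical names that fold the roles of addMemDef, addDbgDef and setLocKind into two helpers. The rule used here (the location kind is Mem only when the last store ID equals the last debug assignment ID) is an assumption for illustration, and the real process functions handle more cases.

```cpp
#include <cassert>

enum class LocKind { Mem, Val, None };

// Hypothetical per-fragment state. ID 0 means "no assignment seen yet".
struct FragmentState {
  unsigned StackHomeID = 0; // Last store to the fragment's memory location.
  unsigned DebugID = 0;     // Last debug assignment to the fragment.
  LocKind Kind = LocKind::None;
};

// Assumed rule: memory is a good location only while both IDs agree.
void updateKind(FragmentState &S) {
  bool Match = S.StackHomeID != 0 && S.StackHomeID == S.DebugID;
  S.Kind = Match ? LocKind::Mem : LocKind::Val;
}

// A store tagged with ID has written to the fragment's memory location.
void noteStore(FragmentState &S, unsigned ID) {
  assert(ID != 0 && "0 is reserved for 'no assignment'");
  S.StackHomeID = ID;
  updateKind(S);
}

// A debug assignment tagged with ID has been seen for the fragment.
void noteDbgAssign(FragmentState &S, unsigned ID) {
  assert(ID != 0 && "0 is reserved for 'no assignment'");
  S.DebugID = ID;
  updateKind(S);
}
```

For example, a debug assignment with ID 1 followed by its linked store with ID 1 ends in Mem, while a later store with ID 2 drops the fragment back to Val until a debug assignment with ID 2 is seen.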
The analysis tracks locations for each fragment of each variable that has a definition (/is used in a debug intrinsic). addMemDef, addDbgDef, and setLocKind apply their changes to all fragments contained fully within the one passed in. So, an assignment to bits [0, 64) of a variable is noted for bits [0, 32) too.
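The containment rule is easy to state in code. The sketch below assumes a fragment is described by a bit offset and a bit size; the Fragment struct and both function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical fragment description: bits [Offset, Offset + Size).
struct Fragment {
  unsigned OffsetInBits;
  unsigned SizeInBits;
};

// True if Inner lies fully within Outer.
bool containsFragment(const Fragment &Outer, const Fragment &Inner) {
  assert(Outer.SizeInBits > 0 && Inner.SizeInBits > 0 && "empty fragment");
  return Inner.OffsetInBits >= Outer.OffsetInBits &&
         Inner.OffsetInBits + Inner.SizeInBits <=
             Outer.OffsetInBits + Outer.SizeInBits;
}

// Select the tracked fragments that an assignment to Assigned also applies to.
std::vector<Fragment> fragmentsAffectedBy(const std::vector<Fragment> &Tracked,
                                          const Fragment &Assigned) {
  std::vector<Fragment> Affected;
  for (const Fragment &F : Tracked)
    if (containsFragment(Assigned, F))
      Affected.push_back(F);
  return Affected;
}
```

With tracked fragments [0, 64), [0, 32), [32, 64) and [48, 80), an assignment to [0, 64) selects the first three; [48, 80) only partially overlaps and is left alone by this rule.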
I'm aware this patch is large and that tour is not. Hopefully it gives reviewers a good starting point though. Please don't hesitate to ask questions!
Tests are coming in a separate patch.
meganit, s/LIB/INCLUDE/, or something