This patch extends salvaging of debuginfo in the Loop Strength Reduction (LSR)
pass by translating Scalar Evolution (SCEV) expressions into DIExpressions. The
method is as follows:
Caching step
Pre-LSR, examine the DVIs (dbg.value) in the loop and cache information on
those that are deemed salvageable (the location op has a SCEV that can be
translated to a DIExpression). The location op, the SCEV for the location op
and the DIExpr are stored in a record. Currently only old-style single location
op DVIs are salvageable, but the code is readily extensible to multiple ops
and generates multi-location DVIs if necessary during the salvage step.
Salvaging of variadic debug values is the likely next iteration of this
patch, if accepted.
Obtain the Loop Induction Variable (IV)
The Loop class provides two methods to get the IV. Sometimes these fail to
return anything (even in the presence of a loop.body block with a PHI node
labelled 'lsr.iv'). When these methods fail, pick a SCEVable PHI node from
the loop header and use this as the IV. A similar PHI-node selection is
made in the current LSR salvage method DbgGatherEqualValues().
Induction Variable to iteration count DIExpr translation
The SCEV for the IV/PHI node is translated to a DIExpr that results in the
current iteration count.
Salvage
The cached DVIs are examined. If the location has been marked as unavailable,
generate an expression that enables the value to be recovered. The
expression uses the iteration count and the location op's SCEV that was
cached to calculate the value.
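
The last two steps can be illustrated with a minimal standalone sketch. It is
not the code in this patch: it assumes both the IV and the lost value are
simple affine recurrences {start,+,stride} with constant start and stride, and
the Op, Elem, Expr, evaluate(), iterationCount() and salvage() names are
hypothetical stand-ins for DIExpression operators and the builder logic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for DIExpression operators.
enum class Op { Arg, Const, Plus, Minus, Mul, Div };

struct Elem {
  Op Kind;
  int64_t Operand; // Argument index for Arg, literal for Const, else unused.
};

using Expr = std::vector<Elem>;

// Evaluate a postfix expression against the given location operands.
int64_t evaluate(const Expr &E, const std::vector<int64_t> &Args) {
  std::vector<int64_t> Stack;
  for (const Elem &El : E) {
    if (El.Kind == Op::Arg) {
      Stack.push_back(Args.at(static_cast<std::size_t>(El.Operand)));
      continue;
    }
    if (El.Kind == Op::Const) {
      Stack.push_back(El.Operand);
      continue;
    }
    // Binary operator: the most recently pushed entry is the right operand.
    int64_t RHS = Stack.back();
    Stack.pop_back();
    int64_t LHS = Stack.back();
    Stack.pop_back();
    switch (El.Kind) {
    case Op::Plus:  Stack.push_back(LHS + RHS); break;
    case Op::Minus: Stack.push_back(LHS - RHS); break;
    case Op::Mul:   Stack.push_back(LHS * RHS); break;
    case Op::Div:   Stack.push_back(LHS / RHS); break;
    default: break;
    }
  }
  return Stack.back();
}

// IV = {Start,+,Stride}  =>  iteration count = (IV - Start) / Stride.
// Arg 0 is the surviving IV/PHI value.
Expr iterationCount(int64_t Start, int64_t Stride) {
  return {{Op::Arg, 0}, {Op::Const, Start}, {Op::Minus, 0},
          {Op::Const, Stride}, {Op::Div, 0}};
}

// Lost value = {Start,+,Stride}  =>  value = iteration count * Stride + Start.
Expr salvage(const Expr &IterCount, int64_t Start, int64_t Stride) {
  Expr E = IterCount;
  E.push_back({Op::Const, Stride});
  E.push_back({Op::Mul, 0});
  E.push_back({Op::Const, Start});
  E.push_back({Op::Plus, 0});
  return E;
}
```

For example, with an IV of {8,+,4} currently holding 20, iterationCount(8, 4)
evaluates to 3, and a lost value with SCEV {100,+,-1} is recovered as 97. The
salvaged expression is longer than a plain constant offset from the PHI, which
is the trade-off discussed below.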
I have stripped the existing salvaging as it is capable of recovering only a
subset of the source values recovered by this implementation. However, this more
general method relies on deriving the iteration count. So, in some cases a
longer DIExpression is produced compared to the existing method. If this is
deemed unsatisfactory there are two options:
Restore the previous functionality and select that method of salvaging when
possible. This is probably not as messy as it sounds, as I have used similar
methods for caching data of the salvageable DVIs. Simply restore some of the
functionality to DbgGatherSalvagableDVI() and prioritise rewriting DVIs using
constant offsets of a phi node value.
I expect this is very little work. An important detail is that the new method
caches salvageable DVI slightly earlier in ReduceLoopStrength() than the
existing method.
Have a function that finds no-ops in the generated DIExpression once the
iteration count and source value DIExpressions are combined. This will be more
complicated than the first option, but may be able to simplify a greater
number of cases.
This is not as simple, but I'm sure simplifying stack-based arithmetic is well
understood and I expect implementations exist.
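
As a rough illustration of the second option, the sketch below drops
"push constant, apply operator" pairs that cannot change the result (adding or
subtracting 0, multiplying or dividing by 1). It is a standalone toy under
stated assumptions, not part of this patch: Op, Elem, Expr, isNoOpPair() and
dropNoOps() are hypothetical names, and a real implementation would operate on
DIExpression elements and cover more patterns.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for DIExpression operators.
enum class Op { Arg, Const, Plus, Minus, Mul, Div };

struct Elem {
  Op Kind;
  int64_t Operand; // Argument index for Arg, literal for Const, else unused.
};

using Expr = std::vector<Elem>;

// True if "push C; apply K" leaves the value beneath C unchanged. In postfix
// order a constant directly before a binary operator is its right operand.
bool isNoOpPair(const Elem &C, const Elem &K) {
  if (C.Kind != Op::Const)
    return false;
  if (C.Operand == 0)
    return K.Kind == Op::Plus || K.Kind == Op::Minus;
  if (C.Operand == 1)
    return K.Kind == Op::Mul || K.Kind == Op::Div;
  return false;
}

// Single pass over the expression. Checking against the tail of the output
// means pairs exposed by an earlier removal are also caught.
Expr dropNoOps(const Expr &In) {
  Expr Out;
  for (const Elem &El : In) {
    if (!Out.empty() && isNoOpPair(Out.back(), El)) {
      Out.pop_back(); // Drop the constant and skip the operator.
      continue;
    }
    Out.push_back(El);
  }
  return Out;
}
```

An iteration count derived from an IV of {0,+,1} would contain a "minus 0" and
a "divide by 1", both of which this pass removes.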
There are implementation limitations. The following are not yet handled:
Support for top-level SCEVs for values that are SCEVCommutativeExpr. An affine
recurrence has the form {start,+,stride}, with the start applied at
initialisation and the stride applied per iteration. In LLVM this is known as
a SCEVAddRecExpr. However, some values cannot be fully expressed in this
form as they are dependent on multiple phis. In this case one or more
SCEVCommutativeExpr wrap the SCEVAddRecExpr, e.g.
(({start,+,stride} * %a * %b) + %c).
I did not encounter these expressions when examining values that had been
optimised out, so some functionality to translate these SCEVs has been omitted.
Recovery of DIExpressions that had multiple location ops before
LSR. Note that salvaged DVIs with multiple location ops can be generated. DVIs
with a single location op and a non-empty DIExpression are salvageable.
To implement salvaging of variadic DVIs the append(DIExpr*) function must be
altered to append referenced location ops to the SCEVDbgBuilder's Values
vector. If any of these are now 'undef', then generate an expression to
recover the value.
Translation of chain (non-affine) SCEVs. LLVM has a C++
implementation that evaluates a chain SCEV with the iteration number that
should prove useful.
Translation of nested SCEVAddRecs. These are SCEVs with a loop-variant start
value and are created by nested loops.
MinMax SCEVs have not been addressed. I intended to add a push() function for
these when one was encountered. However, not a single test in the lit
Transforms suite appeared to generate any.
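
For the chain (non-affine) SCEV limitation listed above, the value at a given
iteration has a closed form in binomial coefficients: {c0,+,c1,+,c2} at
iteration n is c0*C(n,0) + c1*C(n,1) + c2*C(n,2). The sketch below is a
standalone illustration of that formula with constant coefficients, not LLVM's
implementation; evaluateChainAtIteration() and stepChain() are hypothetical
names. A translation to DIExpression would need to emit the same arithmetic as
stack operations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Evaluate {C0,+,C1,+,...,+,Ck} at iteration N (N >= 0) as the sum of
// Ci * binomial(N, i).
int64_t evaluateChainAtIteration(const std::vector<int64_t> &Coeffs,
                                 int64_t N) {
  int64_t Result = 0;
  int64_t Binomial = 1; // binomial(N, 0)
  for (std::size_t I = 0; I < Coeffs.size(); ++I) {
    Result += Coeffs[I] * Binomial;
    // binomial(N, I + 1) = binomial(N, I) * (N - I) / (I + 1); the division
    // is exact because the product is binomial(N, I + 1) * (I + 1).
    Binomial = Binomial * (N - static_cast<int64_t>(I)) /
               static_cast<int64_t>(I + 1);
  }
  return Result;
}

// Reference: step the recurrence N times. On each iteration every term
// absorbs the pre-update value of the term to its right.
int64_t stepChain(std::vector<int64_t> Coeffs, int64_t N) {
  for (int64_t It = 0; It < N; ++It)
    for (std::size_t I = 0; I + 1 < Coeffs.size(); ++I)
      Coeffs[I] += Coeffs[I + 1];
  return Coeffs.empty() ? 0 : Coeffs[0];
}
```

For {0,+,1,+,2} the stepped values are 0, 1, 4, 9, and the closed form gives
the same result at each iteration without replaying the loop.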