Page MenuHomePhabricator

[DebugInfo][InstrRef] Take a short-cut when propagating variable values if there's only one value

Authored by jmorse on Jan 21 2022, 4:44 AM.



This is a performance patch that also partially fixes, by propagating variable values past out-of-scope blocks.

Key observation: when converting to SSA, if you only have a single variable definition, then the variable has that value in every dominated block, and no well defined value in all other blocks. Seeing how InstrRefBasedLDV does PHI-placement and value propagation, which is often slow, we can use this observation to simplify variable value propagation when there's only one variable definition.

Observe this CFG (replicated in the test below):

    /  |  \
   /   v   \
  /   bb1   \    <--- Single variable assignment happens here
 /   /   \   \
/   v     v   \
| bb2     bb3 |
|   \    /    |
|    v  v     |
|     bb4     |
 \   /   \    /
  v v     v  v
   bb5    bb6    <---  Dominance frontier is here

The general InstrRefBasedLDV algorithm would place PHIs at bb5 and bb6, on the dominance frontier of the variable assignment. We'd then do some value propagation to work out whether those PHIs could be eliminated, and finally decided that they could never be resolved to a variable location, and drop the variable location in those blocks. The speed optimisation is to, in scenarios like this where there are single assignments, to just propagate the assigned value into each dominated block, and no other blocks. That avoids PHI placement and elimination. This leads to a geomean CTMark speed improvement of 0.9% [0], horray.

However, this change isn't NFC, as it partially fixes . Observe the attached test: if we make bb2's lexical scope something out-of-scope for the variable being analysed, then VarLocBasedLDV doesn't propagate the variable location through it to bb4. I think VarLocBasedLDV does this because of performance reasons; with this change, we _do_ propagate the variable into bb4, meaning InstrRefBasedLDV diverges from VarLocBasedLDV a little.

I think this is a legitimate improvement in coverage: if there wasn't an out of scope block in the way, we would always propagate into dominated blocks like bb4 anyway. The fact that there's an out-of-scope block can only because of optimisations AFAIUI -- and I can't think of a functional reason why the intervening block might change the variable value in a different scope. If it was an artificial block (with no source locations), we already propagate through it anyway.

Testing: [0], plus I built clang-3.4 and observed some 5k more variable locations as a result of this, the benefits of the functional change. I also experimented with aborting placePHIsForSingleVarDefinition if it saw an out-of-scope block: doing so produces an identical clang 3.4 binary (modulo the usual differences).


Diff Detail

Event Timeline

jmorse created this revision.Jan 21 2022, 4:44 AM
jmorse requested review of this revision.Jan 21 2022, 4:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 21 2022, 4:44 AM
StephenTozer accepted this revision.Jan 28 2022, 6:02 AM

LGTM, nice performance improvement!

This revision is now accepted and ready to land.Jan 28 2022, 6:02 AM
This revision was landed with ongoing or failed builds.Jan 31 2022, 4:55 AM
This revision was automatically updated to reflect the committed changes.