Page MenuHomePhabricator

WIP: [IR/Verifier] Diagnose use-before-def scenarios for debug intrinsics
Needs ReviewPublic

Authored by vsk on Apr 25 2018, 7:46 PM.

Details

Summary

I'd like to use this WIP patch to start a discussion about use-before-def scenarios with debug info intrinsics.

One motivation for preventing use-before-def is to ensure correctness of sinking/hoisting code. For example, if we sink Inst from BB1 to BB2, the debug values for Inst should be moved as well. This is sometimes written as:

Begin = next(Inst.getIterator())
for I in range(Begin, BB.end()):
  if I.isDebugValue():
    break
  if I.getDebugVariable() == Inst:
    DbgValuesToSink.push_back(I)

If use-before-def is permitted, we might not sink all the relevant debug intrinsics. There may be debug uses of Inst before Begin. Also, once DbgValuesToSink is processed, there may be leftover debug uses of Inst in BB1 after the final value of I.

To avoid these issues, I propose checking for use-before-def in the IR and MIR verifiers, possibly gated by EXPENSIVE_CHECKS. The IR check catches several issues in a stage2-RelWithDebInfo build. I'd appreciate any feedback on whether or how to enable this check.

Edit: The verifier already has 'verifyDominatesUse'. This patch could be simplified by making use of that.

Diff Detail

Event Timeline

vsk created this revision.Apr 25 2018, 7:46 PM
vsk edited the summary of this revision. (Show Details)Apr 25 2018, 7:53 PM

This seems fine in principle. DO you have any idea how much churn this will cause in the existing code? Can you self-host clang without this assertion firing?

vsk added a comment.Apr 26 2018, 10:49 AM

This seems fine in principle. DO you have any idea how much churn this will cause in the existing code? Can you self-host clang without this assertion firing?

I'm not sure how much work it would take to enable this. The assertion fires almost immediately during a stage2 build of clang.

Thanks for this @vsk . This definitely makes sense from a correctness standpoint. I'm curious how frequent this case occurse; hopefully never?

vsk added a comment.Apr 26 2018, 11:21 AM

Thanks for this @vsk . This definitely makes sense from a correctness standpoint. I'm curious how frequent this case occurse; hopefully never?

The assertion fires 41,135 times during a stage2 -O3 -g build of clang. It will take some time to root-cause and address the issues. We might consider landing some version of this check gated by an off-by-default cl::opt, so that we can investigate in parallel.

I'd rather fix it beforehand. The idea of having a --even-stricter mode for the verifier makes me uneasy. I'd rather have it a binary result.

vsk added a subscriber: tyb0807.Apr 26 2018, 11:28 AM

I'm with Adrian on this one. Thanks for making this better!

bjope added a subscriber: bjope.May 8 2018, 1:30 PM
vsk added a comment.Aug 22 2018, 10:40 AM

Recently I addressed a dbg.value use-before-def bug in CodeGenPrepare (r340370). What I learned forced me to reconsider adding this verifier check.

CGP's strategy is to ignore dbg.values until as late as possible. Once it's done optimizing, it simply moves misplaced dbg.values s.t they are dominated by the instructions they reference (see CodeGenPrepare::placeDebugValues). This is a simple and correct (iiuc) approach. For maximal benefit, it would only need to be done twice: once before ISel (this is just CGP, so, done!), and again before LiveDebugVariables.

OTOH, if we were to add this verifier check, individual IR/MIR passes would gain some complexity. These passes would either need some equivalent of placeDebugValues or need to move dbg.values each time an Instruction is moved. While this would result in improved IR, it wouldn't improve debugging overall. In light of that I suggest abandoning this change.

Let me know what you think.

bjope added a comment.Aug 22 2018, 2:59 PM

A problem with CodeGenPrepare::placeDbgValues is that it can move debug intrinsics more or less blindly, for example without taking other debug intrinsics refering to the same variable into account.
If there is a use-before-def scenario it often is correct to move the using dbg.value below the def, but that depends on which transform that resulted in such a situation (I guess that often it might be due to an opt-pass not caring about debug intrinsics).
Right now CodeGenPrepare::placeDbgValues also hoists dbg.value instructions when the use already is after the def. If I remember correctly it can even move a dbg.value from one basic block to another.

We have TRs where we have noticed that debug-experience gets really weird due to CodeGenPrepare::placeDbgValues, because it has incorrectly moved dbg.value instructions. I've actually been aiming at removing that function completely.

(FWIW, the description in the beginning of CodeGenPrepare.cpp actually hints that the pass itself should be removed, that is however a different story and maybe that comment is out-dated, but I doubt that the original goal with that pass was that it should be a cleanup pass that tries to mend things that are broken elsewhere without having the knowledge about what the correct IR actually looked like)

I think that CodeGenPrepare::placeDbgValues exists for two different reasons:

  1. Bugs in IR passes introducing use-before-def scenarios (so if we remove the function we will lose some debug information that might (or might not) be valid today).
  2. Limitations in SelectionDAGBuilder, for example when it comes to properly handle "dangling" debug info between basic blocks, resulting in loss of debug information when the debug-use isn't in the same BB as def.

The verifier check can help us with (1).
I've got some ongoing patches trying to improve (2), but unfortunately I haven't had time to continue on that work since before summer.

One thing that has been nagging me is in which order to do things. Removing CodeGenPrepare::placeDbgValues would be the main goal. But without first doing fixes in SelectionDAG it gives quite different results (getting rid of some faulty debug-info, but also losing some valid debug-info). If I do not remove CodeGenPrepare::placeDbgValues first, then it is harder to motivate patches in SelectionDAG, since the problems are hidden by placeDbgValues (test cases must start after CodeGenPrepare). And maybe we need to solve (1) and adding the verifier check before removing placeDbgValues, because I guess noone has verified that SelectionDAG can handle debug-use-before-def in a safe way.

vsk added a comment.Aug 22 2018, 5:50 PM

A problem with CodeGenPrepare::placeDbgValues is that it can move debug intrinsics more or less blindly, for example without taking other debug intrinsics refering to the same variable into account.
If there is a use-before-def scenario it often is correct to move the using dbg.value below the def, but that depends on which transform that resulted in such a situation (I guess that often it might be due to an opt-pass not caring about debug intrinsics).
Right now CodeGenPrepare::placeDbgValues also hoists dbg.value instructions when the use already is after the def. If I remember correctly it can even move a dbg.value from one basic block to another.

Right, but we can fix that by querying a dominator tree when deciding whether/not to move a dbg.value. Wdyt?

We have TRs where we have noticed that debug-experience gets really weird due to CodeGenPrepare::placeDbgValues, because it has incorrectly moved dbg.value instructions.

I'd be happy to help take a look at these.

(FWIW, the description in the beginning of CodeGenPrepare.cpp actually hints that the pass itself should be removed, that is however a different story and maybe that comment is out-dated, but I doubt that the original goal with that pass was that it should be a cleanup pass that tries to mend things that are broken elsewhere without having the knowledge about what the correct IR actually looked like)

CGP does enough work that I'm skeptical of it ever going away, even if LLVM ever transitions to using GlobalISel exclusively. I agree that it's not very principled to have a 'fix broken IR' pass, but the alternatives seem problematic in two ways: added complexity in the rest of the compiler, & the compile-time cost of figuring out where to place dbg.values every time a pass moves an instruction.

I think that CodeGenPrepare::placeDbgValues exists for two different reasons:

  1. Bugs in IR passes introducing use-before-def scenarios (so if we remove the function we will lose some debug information that might (or might not) be valid today).

Yes, including bugs in CGP itself (see: r340370). If we remove placeDebugValues without some kind of replacement, variable availability will definitely regress.

  1. Limitations in SelectionDAGBuilder, for example when it comes to properly handle "dangling" debug info between basic blocks, resulting in loss of debug information when the debug-use isn't in the same BB as def.

Right, there's an interaction between placeDebugValues and SelectionDAG. As I understand it, SelectionDAG's NodeMap only maps Values to SDNodes one basic block at a time. We keep track of dangling dbg.values in two cases: 1) the Value a dbg.value refers to may have been emitted in a previously-visited block, and 2) there is dbg.value use-before-def.

In case (1), it's unlikely (ideally impossible?) that the Value will be emitted again, so (IIUC) the dangling debug value will never be resolved. (If the value simply requires constant re-materialization it's handled in a separate path.) We might consider addressing this case by cloning the dbg.value into the previously-visited block: doing this reorders variable updates, but a sufficient increase in variable availability might make it a reasonable tradeoff.

In case (2), unless I'm missing something, incidences of dbg.value use-before-def can be eliminated in a relatively cheap way with a smarter placeDebugValues.

The verifier check can help us with (1).
I've got some ongoing patches trying to improve (2), but unfortunately I haven't had time to continue on that work since before summer.

One thing that has been nagging me is in which order to do things. Removing CodeGenPrepare::placeDbgValues would be the main goal. But without first doing fixes in SelectionDAG it gives quite different results (getting rid of some faulty debug-info, but also losing some valid debug-info). If I do not remove CodeGenPrepare::placeDbgValues first, then it is harder to motivate patches in SelectionDAG, since the problems are hidden by placeDbgValues (test cases must start after CodeGenPrepare). And maybe we need to solve (1) and adding the verifier check before removing placeDbgValues, because I guess noone has verified that SelectionDAG can handle debug-use-before-def in a safe way.

vsk added a subscriber: probinson.Sep 17 2018, 2:32 PM