This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] PR40628: Don't salvage load operations
ClosedPublic

Authored by jmorse on Feb 8 2019, 10:02 AM.

Details

Summary

This patch disallows load insts from being salvaged. The reason why is PR40628 [0] -- LLVM is unable to determine where the validity of a debug expression containing DW_OP_deref ceases (i.e., where a store overwrites the intended data). It's also (I believe) unable to determine when hoisted/sunk memory instructions would invalidate a dbg.value. This has a large potential for misleading developers.

IMHO this isn't something that can be fixed without a lot of work, thus it's safer to turn off for now. If that's a contentious position though we can find a different way forwards. As mentioned in the PR, the effect on a clang-3.4 build is a 1.1% drop in variable location coverage and 1% drop in scope bytes coverage -- unfortunate, but IMHO not devastating. (I tend to put a lot more weight on avoiding inaccurate DebugInfo, as that diminishes trust in the accurate parts).

Summary of test changes:

  • GVN test: -debugify option was added to test that instructions were salvaged in https://reviews.llvm.org/rL325063 . There's another file with -debugify checks added in that commit, so there's still coverage despite me deleting it in this test.
  • DeadStoreElimination: tricky as memory operations are exactly what we're messing with. I've altered the test to have an intermediate addition that gets deleted by DSE as being redundant, and added a check that that's salvaged, which appears to have been the original aim of the test.
  • InstCombine: this specifically tests for salvaging loads, so must be disabled. I've left a negative test there to fire if this feature is ever re-enabled in a safer condition.

The new test case is just a regression test so that there's a PR available if this behaviour comes back in the future.

[0] https://bugs.llvm.org/show_bug.cgi?id=40628

Diff Detail

Event Timeline

jmorse created this revision.Feb 8 2019, 10:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 8 2019, 10:02 AM

This patch disallows load insts from being salvaged. The reason why is PR40628 [0] -- LLVM is unable to determine where the validity of a debug expression containing DW_OP_deref ceases (i.e., where a store overwrites the intended data).

Would it be feasible to inject a dbg_value(undef) at the end of the basic block or before the next store, whichever is earlier, instead?

jmorse added a comment.EditedFeb 11 2019, 4:07 AM

Hi,

Would it be feasible to inject a dbg_value(undef) at the end of the basic block or before the next store, whichever is earlier, instead?

That'd totally work, and could even be done quite late in compilation. However I think the greater issue is that with DW_OP_derefs present it's difficult to determine when code movement in optimisation passes affects dbg.values. Consider this example, compiling "-O2 -g -fno-inline -fno-unroll-loops" with clang@r353515:

static int qux[] = { 1, 2, 3, 4 };

int
foo(int baz, int *out)
{
  int sum = 0;
  for (int i = 0; i < 4; i++) {
    int extra = *out;
    sum += qux[i] * baz;
    sum %= 4;
    *out = sum;
  }
  return sum;
}

int
main(int argc, char **argv)
{
  int out = 12;
  return foo(argc, &out);
}

(The assign to 'extra' is contrived but eliminating loads is something LLVM does all the time I believe). In this code, LICM promotes '*out' to an SSA register for the body of the loop. However, before LICM the dbg.value for 'extra' has its load operand folded into a DW_OP_deref, and points at the variable in 'main'. Stepping through this program in gdb, 'extra' always reads as '12', rather than any of the values computed in the loop. Invalidating the dbg.value at the next memory store wouldn't help, as the location specified by the dbg.value never holds the value we're interested in.

My main point here is that not only would LICM need to be taught that moving the store to '*out' could invalidate some dbg.values, determining which dbg.values are affected would involve digging into DIExpressions looking for dereferences, potentially through GEPs that were folded in too, possibly even requiring alias analysis on any dbg.value with a deref. Which (as far as I'm aware) is a reasonably large amount of code complexity and computation. DeadStoreElimination would need to behave similarly (a dbg.value might try to point at the memory of a store that's later eliminated). I'm not incredibly familiar with all of LLVMs passes, but there could be more to be taught.

There might be some kind of half-way house like inserting dbg.value(undef...) to terminate the range of dbg.values containing DW_OP_deref, at the point where any such pass changes visible stores, but I believe this would still involve teaching new things to passes.

aprantl accepted this revision.Feb 11 2019, 9:21 AM

I think you convinced me. It doesn't make me happy that we have to drop this, but that's not your fault :-)

test/DebugInfo/Generic/pr40628.ll
13

Please add some kind of positive CHECK, too, otherwise this test would pass even if someone symlinked opt to /bin/true :-)

19

Is the tbaa data needed?

20

Could you get by with fewer distinct DILocations? i.e., would just attaching !23 everywhere work, too?

This revision is now accepted and ready to land.Feb 11 2019, 9:21 AM

I think you convinced me. It doesn't make me happy that we have to drop this, but that's not your fault :-)

lib/Transforms/Utils/Local.cpp
1704

Could you please add a comment here explaining why we don't salvage loads, so no-one comes back later with a great idea and adds it back in?

This revision was automatically updated to reflect the committed changes.

Folded last comments into commit, but phab doesn't show those comments inline because of path differences it seems:

  • Added comment warning against salvaging loads in salvageDebugInfoImpl
  • Removed TBBA metadata and un-necessary DILocations
  • New test case positively checks for an undef dbg.value corresponding to the optimised-out load.