This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Don't always extend variable locations when the reg location is unchanging
ClosedPublic

Authored by jmorse on May 15 2019, 5:10 AM.

Details

Summary

This patch removes the "ChangingRegs" facility in DbgEntityHistoryCalculator, as its overapproximate nature can produce incorrect variable locations (see [0]). The original intention (D3933) seems to have been to increase the number of scenarios where single-location variable records are used in the DWARF output.

The patch first kills off everything that calculates the ChangingRegs vector. Previously ChangingRegs spotted epilogues and marked the frame register as unchanging if it wasn't modified outside the epilogue. With this patch, we perform the same test by only clobbering the frame register if it is modified by instructions not marked FrameDestroy (line 256). The logic for clobbering regmasks becomes slightly more complex because we can now only iterate over RegsVars, and so need to preserve it's iteration order while scanning or regmask-clobbered registers.

The logic for terminating variable locations at the end of a basic block now becomes much more enjoyably simple: we just terminate them all.

Test changes: I need to state an assumption about correctness first. My understanding is that we deliberately don't terminate stack-based variable locations during frame destruction, because the debugger should understand that those become invalid as the frame is destroyed. (CC @dblaikie who's expressed an interest in expression simplicity in the past). This would certainly match what "gcc-7.4 test.c -g -O2 -fno-omit-frame-pointer -m32" does for the following, where a single location of DW_OP_fbreg +0 is used for the 'sandal' variable throughout the function:

int
main()
{
  int sandal;
  f(&sandal);
  return sandal;
}

(A location that is not valid in the prologue and epilogue).

With that in mind, here are what the test changes look like:

  • struct_by_value.ll doesn't get a single-location variable record any more because that single location gets clobbered (my understanding of aarch64 is w0 aliases x0 in the code below). IMO this makes sense: the location isn't on the stack, so the debugger must be told where the variable location ends.

    ldr w0, [x0]

    ret
  • inlined-argument.ll: Everything in this code except the final "ret" is from an inlined function, and is in multiple blocks. My understanding is that the single-location detection algorithm [1] doesn't actually support detecting this scenario: we just get it right because ChangingRegs causes the locations to spill over into the return block, covering the whole function. This should be fixed in a future patch rather than now, IMO.
  • pieces.ll: we now don't terminate a spilt stack location when the frame gets destroyed (by addl %rsp, $40), and so the location changes range. This doesn't generate a new single-location entry, alas.
  • dbg-addr.ll: This location should have been a single-location entry in the first place, it's not clear why ChangingRegs didn't catch it.
  • reference-argument.ll: This test contains a call inst, immediately followed by frame destruction. The new logic recognises that the (register) variable location should terminate after the call, which clobbers the register.
  • stack-value-piece.ll: A register location fragment gets clobbered when the return value is written to $xmm0, and so a new location without that fragment is created in this test.

To summarise, we now generate a few more single-location records for stack variables, and several circumstances where registers are clobbered in the final block of the function now get terminated at the correct location.

[0] https://reviews.llvm.org/D61600#1493393
[1] https://github.com/llvm/llvm-project/blob/c93f56d39e629b7bcd0f4657705264fcd7e47c0d/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1242

Diff Detail

Repository
rL LLVM

Event Timeline

jmorse created this revision.May 15 2019, 5:10 AM
bjope added a reviewer: bjope.May 15 2019, 5:30 AM
bjope added a subscriber: bjope.

Adding myself as reviewer to remember to look at this patch.

Downstream we do things differently for our OOT target. I tried to upstream that once-upon-a-time: https://reviews.llvm.org/D50638
But for some reason it never landed (I think I had some doubts about something).

We have also patched calculateDbgEntityHistory to step inside bundles. Not sure if we need that nowadays, but once upon a time we did not rely on that the def/use info on the BUNDLE instruction telling the full truth here (or maybe we allowed DEBUG_VALUE inside bundles... it is nothin upstream that prevents that afaik). So our downstream version here is slightly different compared to upstream for various historical reasons.

If we can get rid of the "ChangingRegs" stuff, then I guess we would get rid of some downstream hacks as well :-)

dstenb added a subscriber: dstenb.May 15 2019, 5:48 AM
dstenb added inline comments.
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
360–393 ↗(On Diff #199577)

(I saw the note about this not being for review right now; just pointing this out if this is productified later on.)

Now that we close all register-described values I think that this code can be simplified a bit (e.g. not calling clobberRegisterUses), by instead closing all values by iterating over LiveEntries.

Something like this (I have not even tested compiling it, so it is probably a bit broken):

// Clobber all live debug values (both those that are register-described and
// those that are described by other values).
for (auto &Pair : LiveEntries) {
  if (Pair.second.empty())
    continue;

  // Create a clobbering entry.
  EntryIndex ClobIdx = DbgValues.startClobber(Pair.first, MBB.back());

  // End all entries.
  for (EntryIndex Idx : Pair.second) {
    DbgValueHistoryMap::Entry &Ent = DbgValues.getEntry(Pair.first, Idx);
    assert(Ent.isDbgValue() && !Ent.isClosed());
    Ent.endEntry(ClobIdx);
  }
}

LiveEntries.erase();
RegVars.erase();

Or if there is something that I have overlooked that makes it required (or preferable) to use clobberRegisterUses for closing register-described values, we should be able to change it to something like this:

// Clobber registers at the end of BB.
for (auto I = RegVars.begin(), E = RegVars.end(); I != E;) {
  auto CurElem = I++; // CurElem can be erased below.
  clobberRegisterUses(RegVars, CurElem, DbgValues, LiveEntries,
                      MBB.back());
}

// Clobber the rest of the live debug values.
for (auto &Pair : LiveEntries) {
  if (Pair.second.empty())
    continue;

  // Create a clobbering entry.
  EntryIndex ClobIdx = DbgValues.startClobber(Pair.first, MBB.back());

  // End all entries.
  for (EntryIndex Idx : Pair.second) {
    DbgValueHistoryMap::Entry &Ent = DbgValues.getEntry(Pair.first, Idx);
    assert(Ent.isDbgValue() && !Ent.isClosed());
    Ent.endEntry(ClobIdx);
  }
}

LiveEntries.erase();
aprantl added inline comments.May 15 2019, 8:56 AM
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
226 ↗(On Diff #199577)

I appreciate the amount of red in this patch :-)

jmorse updated this revision to Diff 199786.May 16 2019, 4:15 AM

Update diff to more production-ised version, will edit summary and add reviewers shortly

jmorse retitled this revision from [WIP][DebugInfo] Don't always extend variable locations when the reg location is unchanging to [DebugInfo] Don't always extend variable locations when the reg location is unchanging.May 16 2019, 4:57 AM
jmorse edited the summary of this revision. (Show Details)
jmorse added reviewers: aprantl, dstenb, vsk, probinson.
jmorse added a subscriber: dblaikie.

My understanding is that we deliberately don't terminate stack-based variable locations during frame destruction, because the debugger should understand that those become invalid as the frame is destroyed.

CC'ing @jasonmolenda and @jingham to double-check that statement.

lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
265 ↗(On Diff #199786)

My intuition says that your typical function call will clobber more than 8 registers, but I don't know what a good number is either.

282 ↗(On Diff #199786)

For readability it might be a good idea to move the short cases to the top, with a continue, then we have more horizontal space for the meat of the function.

test/DebugInfo/AArch64/struct_by_value.ll
3 ↗(On Diff #199786)

Thanks! Feel free to land this ahead of time as an NFC commit.

bjope added inline comments.May 20 2019, 2:35 PM
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
257 ↗(On Diff #199786)

The old solution using getFirstEpilogueInst was a little bit weird so it is nice to get rid of that. However, if I remember correctly FrameDestroy isn't used by all in-tree targets (and this change is not fully backwards compatible for OOT targets either).

Maybe that shouldn't stop us. But just something that should be taken into consideration (maybe we should make sure in-tree targets are using FrameDestroy consistently and make it obvious in commit msg or something that we now rely on FrameDestroy here).
It could even be possible to swap into relying on FrameDestory in a separate patch.

jmorse updated this revision to Diff 200466.May 21 2019, 5:34 AM

Apply review comments (change a vector size, shuffle the order of some blocks).

jmorse marked 5 inline comments as done.May 21 2019, 5:50 AM
jmorse added inline comments.
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
257 ↗(On Diff #199786)

Ah, I read about this in the history, but didn't think about it too deeply. The interaction between OOT backends and the rest of LLVM is largely opaque to me, however IIUC, the impact of this behaviour change is that stack-locations will end prematurely in the epilogue, and they won't get as nice DWARF-expressions for such locations. Would this be sufficiently minor for it to go ahead?

I'll examine the in-tree backends to check that they're using FrameDestroy correctly.

Splitting this patch would likely require marking some tests XFail in the meantime, as many single-location values wouldn't be generated. Perhaps that's desirable, as OOT backends could then revert the FrameDestroy patch and have the correct tests marked XFail?

282 ↗(On Diff #199786)

I've shuffled the shorter cases to the top and added a continue; alas, this makes for a very different patch to the previous one.

For the RegsToClobber vector I've upped the size to 32 elements, on the principle that stack allocation is more or less free, and we're (broadly) a leaf function.

jmorse updated this revision to Diff 200939.May 23 2019, 6:57 AM
jmorse marked 2 inline comments as done.

With this update I've separated the FrameDestroy portion from the rest of the patch. The result is that debug-loc-offset.mir and pr19307.mir are marked XFAIL, as the locations they check no longer extend through the frame-destroy'd part of the produced code. I'll upload the remaining part of the patch in a moment, that adds in use of the frame-destroy flag and un-xfails those tests.

Looking through in-tree backends, only X86, AArch64 and RISC-V actually set FrameDestroy. Nevertheless I think FrameDestroy is still something we should make use of: if the assumption I stated above still holds (we deliberately let stack locations run through frame destruction), then we can only detect frame destruction if the backend tells us where it is. The failure-case is that variable locations end slightly earlier than necessary, and we might get fewer single-value variable locations. An acceptable trade-off IMHO.

I ran check-llvm with LLVM_DEFAULT_TARGET_TRIPLE="arm-none-eabi" to check that no generic tests on a non-frame-destroy backend get hit by this change.

Note that this will still leave inlined-argument.ll XFailed, as I don't believe our current implementation can soundly produce the correct output anyway.

bjope added a comment.May 24 2019, 1:38 AM

Is the intention to change the behavior for prologues as well here? The description mentions that we do a different analysis for the epilogues. But with this patch we no longer ignore register clobbering for "FrameSetup" (collectChangingRegs used to ignore FrameSetup code).

Maybe that is a good thing (just not obvious when reading the description that this patch also treats prologues differently).

lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
257 ↗(On Diff #199786)

Ah, I read about this in the history, but didn't think about it too deeply. The interaction between OOT backends and the rest of LLVM is largely opaque to me, however IIUC, the impact of this behaviour change is that stack-locations will end prematurely in the epilogue, and they won't get as nice DWARF-expressions for such locations. Would this be sufficiently minor for it to go ahead?

Ok, makes sense.

277 ↗(On Diff #200939)

Comment here is related to D62314 (that is a child). Maybe D62314 should be a parent instead (unless that complicates things even more for you)?

jmorse updated this revision to Diff 201204.May 24 2019, 5:45 AM

Bjorn wrote:

Is the intention to change the behavior for prologues as well here? The description mentions that we do a different analysis for the epilogues. But with this patch we no longer ignore register clobbering for "FrameSetup" (collectChangingRegs used to ignore FrameSetup code).

Maybe that is a good thing (just not obvious when reading the description that this patch also treats prologues differently).

Oooo, I'd missed that part, good spot. However when adding insensitivity-to-prologues into this patch, it causes a change in behaviour -- fpo-shrink-wrap.ll now extends a stack location through the whole of a prologue, where it previously terminated after the first 'push'. (The change manifests as one fewer temporary labels).

IMO this is an improvement in the output, and what ChangingRegs was _supposed_ to detect. It's not completely clear why it didn't before. AFAIUI, FrameSetup is more widely supported, and so it doesn't need to be rolled into the subsequent patch?

jmorse marked 2 inline comments as done.May 24 2019, 5:46 AM
jmorse added inline comments.
lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
277 ↗(On Diff #200939)

I'll roll the FrameDestroy comment into the child patch, it should be simpler (IMO).

jmorse marked an inline comment as done.Jun 3 2019, 7:34 AM

Ping on this -- AFAIUI people believe the change is good, and depending on FrameDestroy is deployed in a separate patch.

The assumption that stack variable locations run to the end of a function, even over the frame destruction, is seen in GCC here [0]. The location for "sandal" gets "DW_AT_location (DW_OP_breg5 EBP-12)" across the whole function, including the 'leave' inst and fiddling-with-esp. I think it's fair to say we're implementing existing expectations.

(This reverse-blocks D61600).

[0] https://godbolt.org/z/lLJg4c

aprantl accepted this revision.Jun 3 2019, 8:53 AM

The assumption that stack variable locations run to the end of a function, even over the frame destruction, is seen in GCC here [0].

I'm fine with implementing this behavior as long as we document it.

This revision is now accepted and ready to land.Jun 3 2019, 8:53 AM