This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Don't repeatedly created undef DBG_VALUEs during machine-sinking
AbandonedPublic

Authored by jmorse on Nov 25 2019, 9:21 AM.

Details

Summary

This patch addresses a performance regression reported in PR43855 [0] that appeared as a result of D58386 / rGf5e1b718a6. To recap: the MachineSink pass used to rely on DBG_VALUE insts always following the instruction that defines the corresponding VReg. D58386 collected all debug users as we walked through a block so that non-adjacent DBG_VALUEs could be sunk too.

It turns out that MachineSink will (often) move instructions to the first block that post-dominates the current block, and then try to sink further. This means if we have a lot of conditionals:

foo = bar + baz;
if (a)
  b++;
if (c)          // <---- here
  d++;
someuseof(foo);

When the computation of "foo" sinks to the bottom of the code, an extra spurious DBG_VALUE $noreg would be placed at the marker. Multiply this a few times, and in [0] there are some functions where an extra 25% DBG_VALUEs appear, all $noreg. This seemingly produced some pathological behaviour elsewhere.

To fix this, rather than immediately sinking DBG_VALUEs, they get recorded into a pass structure. When sinking is complete and instructions won't be sunk any further, new DBG_VALUEs are added, avoiding lots of intermediate DBG_VALUE $noregs being created.

[0] https://bugs.llvm.org/show_bug.cgi?id=43855

Diff Detail

Event Timeline

jmorse created this revision.Nov 25 2019, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 25 2019, 9:21 AM
aprantl accepted this revision.Dec 2 2019, 10:18 AM
aprantl added inline comments.
llvm/lib/CodeGen/MachineSink.cpp
135

SmallDenseMap here, maybe?

This revision is now accepted and ready to land.Dec 2 2019, 10:18 AM
bjope added inline comments.Dec 2 2019, 12:38 PM
llvm/lib/CodeGen/MachineSink.cpp
417

My only comment is that you might consider breaking out the new code added here to a function/method before committing.

For example

void MachineSinking::reinsertSunkDebugDefs(MachineFunction &MF) {
  ...
}

(always nice to keep the runOnMachineFunction relatively small, outlining the overall steps performed in the pass)

This revision was automatically updated to reflect the committed changes.
jmorse marked 2 inline comments as done.Dec 5 2019, 8:03 AM
jmorse added inline comments.
llvm/lib/CodeGen/MachineSink.cpp
417

Shoehorned that in, cheers!

uabelho added a subscriber: uabelho.Dec 9 2019, 5:13 AM

Hi,

I've started seeing a verifier complaint with this patch.
I don't have a full testcase to share since I get this on our out-of-tree
target but maybe you can make somethng out of this anyway.

In the input to MachineSink we have this in one MBB

%5:an32pairs = mv [...]
%6:an32_rn_pairs = COPY %5.hiAcc:an32pairs
%8:an32_0_7 = COPY %6:an32_rn_pairs
DBG_VALUE %8:an32_0_7 [...]

and then in the output we get this, sinked into another MBB

%6:an32_rn_pairs = COPY %5.hiAcc:an32pairs
%8:an32_0_7 = COPY %6:an32_rn_pairs
DBG_VALUE %8.hiAcc:an32_0_7 [...]

Then the verifier complains that there is no sub register hiAcc in the an32_0_7
register class, i.e. the "hiAcc" in the

DBG_VALUE %8.hiAcc:an32_0_7

shouldn't be there.

jmorse marked an inline comment as done.Dec 9 2019, 5:27 AM

Ouch, yeah, I can see exactly why that is: the sunk DBG_VALUE is copied from the old / "original" DBG_VALUE, which in the meantime has been copy-propagated. And through the copy propagation, it grows an extra piece of subregister information.

This should be straightforwards to fix, I'll have a patch up shortly, thanks for the report!

Ouch, yeah, I can see exactly why that is: the sunk DBG_VALUE is copied from the old / "original" DBG_VALUE, which in the meantime has been copy-propagated. And through the copy propagation, it grows an extra piece of subregister information.

This should be straightforwards to fix, I'll have a patch up shortly, thanks for the report!

Sounds good. Thanks!

Misery, not today, but I still think there should be an easy fix. Please do revert this patch though if it's causing difficulties.

(D71279 is a precursor, just getting the second patch up, but I continue to be stuck in the tar pit...)

D71283 Should (TM) fix it, but, the x86 test that I wrote and which seemed to replicate the issue, now no longer replicates the issue. Which is unfortunate.

Would you be able to give that patch (and its parent) a test to see whether it fixes the problem you're encountering? Storing the subregister separately should entirely fix this I believe, I'm just having problems replicating it on x86.

hans added a subscriber: hans.Dec 10 2019, 10:08 AM

We're seeing non-determinism in Chromium builds after this commit. You can reproduce using the pre-processed source from https://bugs.chromium.org/p/chromium/issues/detail?id=1032241#c32 and running this (giant, sorry) command:

$ for x in {1..25} ; do build.release/bin/clang -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-llvm-verifier -discard-value-names -mrelocation-model pic -pic-level 2 -mthread-model posix -fmerge-all-constants -mframe-pointer=all -relaxed-aliasing -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -target-feature +sse2 -dwarf-column-info -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -O2 -std=c++14 -fno-operator-names -ftrivial-auto-var-init=pattern -fno-rtti -fgnuc-version=4.2.1 -Qn -vectorize-loops -vectorize-slp -fcomplete-member-pointers /tmp/a.ii -o /tmp/foo_$x.o & done && wait && sha1sum /tmp/foo_*.o

This should print 25 identical sha1 hashes for the 25 object files, but some of them have different hashes.

I believe this might be due to iteration over one of the DenseMaps, which is in non-deterministic order.

hans added a comment.Dec 10 2019, 10:21 AM

I've reverted in 49da20ddb43 until it can be fixed. Please let me know if you need help with the reproducer.

uabelho added a comment.EditedDec 10 2019, 11:17 PM

D71283 Should (TM) fix it, but, the x86 test that I wrote and which seemed to replicate the issue, now no longer replicates the issue. Which is unfortunate.

Would you be able to give that patch (and its parent) a test to see whether it fixes the problem you're encountering? Storing the subregister separately should entirely fix this I believe, I'm just having problems replicating it on x86.

I tried out D71279 and D71283 individually and they both seem to fix the problem I saw.
I haven't done very wide testing, but the mir-reproducer I had passes. Nice!

I tried out D71279 and D71283 individually and they both seem to fix the problem I saw. I haven't done very wide testing, but the mir-reproducer I had passes. Nice!

Cool; I think the former might mask the latter, I'll land the first patch when this (D70676) is back in tree...

Hans wrote:

We're seeing non-determinism in Chromium builds after this commit

Thanks for the detailed reproducer (and strong shell-fu); I've replicated this locally now, cheers. Fascinatingly, the output difference seems to be a change in the layout of location lists, not their meaning. From llvm-dwarfdump -debug-loc:

             [0x0000000000000135, 0x000000000000014b): DW_OP_breg14 R14+400, DW_OP_stack_value

 0x00014038:
+            [0x0000000000000152, 0x0000000000000171): DW_OP_reg3 RBX
+
+0x0001406b:
             [0x0000000000000152, 0x00000000000001b1): DW_OP_reg3 RBX
             [0x0000000000000281, 0x00000000000002a0): DW_OP_reg3 RBX

-0x0001407e:
+0x000140b1:
             [0x0000000000000152, 0x00000000000001b1): DW_OP_reg3 RBX
             [0x0000000000000281, 0x00000000000002a0): DW_OP_reg3 RBX

-0x000140c4:
-            [0x0000000000000152, 0x0000000000000171): DW_OP_reg3 RBX
-
 0x000140f7:
             [0x0000000000000163, 0x0000000000000171): DW_OP_reg3 RBX

That's a location entry for RBX shifting down two location list entries. It's not clear to me why this is happening; I'll dig further.

jmorse updated this revision to Diff 233384.Dec 11 2019, 9:07 AM

Here's a revised patch using SetVector, although it's not at the location I was expecting. As it turns out, the llvm::sort of DebugVariables I was using was actively harmful -- the operator< method of DebugVariable compares pointers, which is fine for DenseMaps, but not for ensuring a consistent order.

It took a quick look at implementing a correct operator< for DebugVariable, but it looks like metadata in general doesn't support ordering.

Without objection I'll drop this in in a couple of days, as the differences are minor. Thanks @hans for the report!

Without objection I'll drop this in in a couple of days, as the differences are minor. Thanks @hans for the report!

Could you please clarify? Does this patch fix the non-determinism in the output? Non-deterministic output is something we strive to avoid even in the debug info.

bjope added a comment.Dec 12 2019, 2:06 AM

Don't fully understand. This revision is closed, but latest diff is showing a new diff (not the one closed/reverted, right?). If I understand it correctly revision should be re-opened for review again, right?

bjope added a comment.Dec 12 2019, 2:37 AM

LGTM!
(if removing the unused VarInstPair/InstsToSink)

llvm/lib/CodeGen/MachineSink.cpp
360

VarInstPair and InstsToSink are no longer used afaict.

jmorse updated this revision to Diff 233572.Dec 12 2019, 4:06 AM

Remove some now needless variables, switch another DenseMap to MapVector. Having fluffed the explanation before, I'll reopen this momentarily with a fuller explanation,

jmorse reopened this revision.Dec 12 2019, 4:20 AM

Too lazy on the earlier explanation there sorry. To elaborate: I thought I'd accounted for nondeterminism in the two loops in reinsertSunkDebugDefs, but both were a bit broken:

For the outer loop, because we're only ever placing DBG_VALUEs after pre-determined instructions like this:

inst1
<--- insert DBG_VALUE here
inst2
<--- insert DBG_VALUE here

I'd assumed that it didn't matter what order that happens in. However, on second thoughts if an instruction defines more than one vreg, groups of DBG_VALUEs associated with different vregs from an instruction could change order.

For the inner loop, I was moving elements into a vector and then sorting them, based on the "DebugVariable" classes operator<. However, it turns out that compares pointers, and more generally it seems metadata doesn't support being ordered.

For both of these problems I've now switched to MapVector and SetVector respectively; the insertion order is defined by MachineSinking iterating over basic blocks (which is stable AFAIUI), and MachineSinking::ProcessBlock walking through the block backwards, which is also stable.

In the output from hans' reproducer, packs of DBG_VALUEs were being re-ordered, but in a way that didn't change the meaning of any variable location. I don't know why that led to changes in the layout of location lists (again, the meaning was the same), but either way the re-ordering is bad.

This revision is now accepted and ready to land.Dec 12 2019, 4:20 AM
jmorse requested review of this revision.Dec 12 2019, 4:20 AM
aprantl added inline comments.Dec 12 2019, 9:13 AM
llvm/lib/CodeGen/MachineSink.cpp
140

Either

struct SunkDebugDef {
     /// Set of DBG_VALUEs to recreate.
     SinkingVarSet InstsToSink; 
   /// Location to place DBG_VALUEs.
     MachineInstr *MI;          
   };

or

struct SunkDebugDef {
     SinkingVarSet InstsToSink; ///< Set of DBG_VALUEs to recreate.
     MachineInstr *MI;          ///< Location to place DBG_VALUEs.
   };
212

I find the nomenclature very confusing.

In line 137 the comment makes it sound as if

%foo = some_insn
DBG_VALUE %foo, "x"
DBG_VALUE %foo, "y"

%foo is a DebugDef and the two DB_VALUEs are insts, but here it sounds more like the DBG_VALUEs are DebugDefs ... can this be unified?

jmorse updated this revision to Diff 236835.Jan 8 2020, 8:39 AM

Address review comments; will drop an explanation inline.

jmorse marked 4 inline comments as done.Jan 8 2020, 8:44 AM
jmorse added inline comments.
llvm/lib/CodeGen/MachineSink.cpp
140

Picked the former,

212

Yeah, that seems to have been bogus. I've switched the data structures to be "SunkVRegDef"s, which should be closer to the truth, it is the instruction that defines the vreg which is sinking. I've rewritten the comments around the data structures, and renamed this function: it's inserting DBG_VALUEs, and it's the VReg definitions that have sunk.

This means the data structures doesn't have "Debug" in their names, but it should be clear from the comments that they're purely for tracking debugging information.

aprantl accepted this revision.Jan 9 2020, 9:04 AM
aprantl added inline comments.
llvm/lib/CodeGen/MachineSink.cpp
1062

"Debug uses with no other DBG_VALUE for this variable in between can be sunk"?

This revision is now accepted and ready to land.Jan 9 2020, 9:04 AM
jmorse abandoned this revision.Jan 18 2023, 6:27 AM
jmorse marked 2 inline comments as done.

(Clearing out old revisions) -- it's been so long that this is unlikely to land. This was also written at the point where I was realising that to handle DBG_VALUE movement in machinesink, we basically need to re-compute SSA form each time.

Herald added a project: Restricted Project. · View Herald TranscriptJan 18 2023, 6:27 AM