Page MenuHomePhabricator

[LiveDebugValues] Speed up removeEntryValue, NFC
ClosedPublic

Authored by vsk on May 27 2020, 6:23 PM.

Details

Summary

Instead of iterating over all VarLoc IDs in removeEntryValue(), just
iterate over the interval reserved for entry value VarLocs. This changes
the iteration order, hence the test update -- otherwise this is NFC.

This appears to give an ~8.5x wall time speed-up for LiveDebugValues when
compiling sqlite3.c 3.30.1 with a Release clang (on my machine):

        ---User Time---   --System Time--   --User+System--   ---Wall Time--- --- Name ---
Before: 2.5402 ( 18.8%)   0.0050 (  0.4%)   2.5452 ( 17.3%)   2.5452 ( 17.3%) Live DEBUG_VALUE analysis
 After: 0.2364 (  2.1%)   0.0034 (  0.3%)   0.2399 (  2.0%)   0.2398 (  2.0%) Live DEBUG_VALUE analysis

The change in removeEntryValue() is the only one that appears to affect
wall time, but for consistency (and to resolve a pending TODO), I made
the analogous changes for iterating over SpillLocKind VarLocs.

Diff Detail

Event Timeline

vsk created this revision.May 27 2020, 6:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 27 2020, 6:23 PM
vsk updated this revision to Diff 266751.May 27 2020, 11:24 PM

I've generalize the patch a bit:

  • Assign a reserved location for SpillLocKind VarLocs.
  • Share the reserved location between all EntryValue*Kind VarLocs.
  • Get rid of for (uint64_t ID : OpenRanges.getVarLocs()). We should always iterate over a subset of open VarLocs.
vsk edited the summary of this revision. (Show Details)May 27 2020, 11:26 PM
vsk updated this revision to Diff 266758.May 28 2020, 12:14 AM

Simplify the diff.

Great!

llvm/lib/CodeGen/LiveDebugValues.cpp
929

I'm not sure we want to remove this line. The backup locations are EntryValueBackupKind and EntryValueCopyBackupKind only; the EntryValueKind is real location created from one of these two.

nikic added a comment.May 28 2020, 4:49 AM

Hm, for some reason I'm not seeing any instruction count differences with this patch (or at least the latest version), in either -O0 -g or -O3 -g configuration.

The implementation here looks good, and should lead to some kind of speedup. I'll leave approving to @djtodoro as he knows the entry value mechanisms better.

llvm/unittests/ADT/CoalescingBitVectorTest.cpp
506

IMHO, we should also test a Start >= End case, as that's a condition in half_open_range.

aprantl added inline comments.May 28 2020, 10:34 AM
llvm/include/llvm/ADT/CoalescingBitVector.h
363

It's weird that this empty range violates the assertion on line 360, but I think that is okay.

Is *StartIt >= End even possible, given the assertion?

llvm/lib/CodeGen/LiveDebugValues.cpp
148

what does the k stand for?

1385
if (VL.Loc.SpillLocation != *Loc)
  // Comment explains why no pair is inserted...
  continue;
LLVM_DEBUG(dbgs() << "Restoring Register " ...
vsk marked 5 inline comments as done.May 28 2020, 12:49 PM

Hm, for some reason I'm not seeing any instruction count differences with this patch (or at least the latest version), in either -O0 -g or -O3 -g configuration.

@nikic I can reproduce the problem using wall time alone. I did 5 CTMark runs at {-O0 -g, -O3 -g} x {prepatch, postpatch} (20 runs total, at -j1, no other stabilization), but found no significant geomean compile time change prepatch vs. postpatch. However, I do see a large speedup when compiling a separate copy of sqlite3.

It looks like CTMark ships the sqlite3 3.5.7 amalgamation (85 KLOC, generated in 2008), whereas I've been testing with the 3.30.1 amalgamation (225 KLOC, generated in 2019).

So, I think this change is worthwhile, but maybe it's not sufficient.

llvm/include/llvm/ADT/CoalescingBitVector.h
363

Thanks for the careful review here. I've added simplified test cases that show why the *StartIt >= End check is necessary, and why we have to return an empty range in that case.

The situation is: say you have the bitvector {1}, and you ask for the half-open range [0, 1). Keep in mind that find(X) returns a lower bound (the first bit set at, or after, X). So *find(Start) = *find(0) = 1, which means *StartIt == End. Since we want a half-open range, we don't want to include 1 in it, hence the empty range.

Similarly, if you have the bitvector {5} and you ask for the half-open range [3, 4), *find(Start) = *find(3) = 5, which means *StartIt > End and we want to return the empty range again.

llvm/lib/CodeGen/LiveDebugValues.cpp
148

"Constant" -- it's somewhat idiomatic in that it's used a fair amount in llvm, but not consistently. I'm not attached to this. Happy to go with your preferred spelling.

929

It looks like VL.getEntryValueCopyBackupReg() returns undef when VL.Kind isn't the right kind. So it seems safe to remove the VL.isEntryBackupLoc() check?

llvm/unittests/ADT/CoalescingBitVectorTest.cpp
506

Ah, are you referring to the same case Adrian described? If so, I've just added a test to cover that.

vsk updated this revision to Diff 266998.May 28 2020, 12:49 PM

Address review feedback.

vsk edited the summary of this revision. (Show Details)May 28 2020, 12:53 PM
djtodoro added inline comments.May 29 2020, 3:11 AM
llvm/lib/CodeGen/LiveDebugValues.cpp
509

Should we make a different kEntryValBackupLocation (and something like getBackupEntryValuesVarLocs())?

The backup locations (EntryValueBackupKind and EntryValueCopyBackupKind) are invisible/backup/secondary up to the point we lost the primary location of a variable; and then we turn the backup location into entry value location (EntryValueKind) which is (further on) used as primary location.

929

Oh, I see. It is safe! Thanks.
Eventually, if we get rid of the [1] we won't need this part at all. The EntryValueCopyBackupKind was used to indicate the variable's value was not changed with the new DBG_VALUE (for the same variable), but it was moved around due to the code from [1].

[1] https://github.com/llvm/llvm-project/blob/446082b99f0bd9786d24edd3cd598dc4e5f64371/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp#L590

990

Based on my previous comments, I think we need only getBackupEntryValuesVarLocs() here.

1430

Here we would need only getBackupEntryValuesVarLocs().

aprantl added inline comments.May 29 2020, 11:39 AM
llvm/lib/CodeGen/LiveDebugValues.cpp
493

Side-note: does this get more readable if we introduce a typedef for uint32_t here, such a VarLocIndex?

1385

did this get marked as done by accident?

vsk marked 3 inline comments as done and an inline comment as not done.May 29 2020, 11:55 AM
vsk added a subscriber: tbosch.
vsk added inline comments.
llvm/lib/CodeGen/LiveDebugValues.cpp
493

That may look a little too similar to "LocIndex". Wdyt of LocIndex::uint32_location_t and LocIndex::uint32_index_t?

509

Good point, this should now be fixed.

929

I see, thanks for the pointer (cc @tbosch, this is related to the discussion from D80819).

1385

Sorry about that, somehow I missed that you were asking to convert this to an early exit. Should be fixed soon.

vsk updated this revision to Diff 267319.May 29 2020, 11:56 AM
vsk removed a subscriber: tbosch.

Address review feedback.

aprantl added inline comments.May 29 2020, 2:41 PM
llvm/lib/CodeGen/LiveDebugValues.cpp
493

sure!

djtodoro accepted this revision.Jun 1 2020, 12:20 AM

LGTM, thanks!

This revision is now accepted and ready to land.Jun 1 2020, 12:20 AM
This revision was automatically updated to reflect the committed changes.