This is an archive of the discontinued LLVM Phabricator instance.

PR26055: Speed up LiveDebugValues::transferDebugValue()
ClosedPublic

Authored by aprantl on May 25 2016, 11:42 AM.

Details

Reviewers
friss
dberlin
Summary

This patch builds upon http://reviews.llvm.org/D20178 and speeds up LiveDebugValues::transferDebugValue() by adding an index that maps each DebugVariable to its open VarLoc.

The transferDebugValue() function needs to close all open ranges for a given DebugVariable. Iterating over the set bits of OpenRanges is prohibitively slow in practice. I experimented with using the sorted map of VarLocs in the UniqueVector to iterate only over the range of VarLocs with a given DebugVariable, but the binary search turned out to be even more expensive than just iterating over the set bits in OpenRanges.
Instead, this patch exploits the fact that there can only be one open location for each DebugVariable and redundantly stores this location in a DenseMap.

Benchmarks:
bench1: The ASAN -O3 example I've been testing with.
bench2: The bitcode from PR26055 at -O0.

Baseline (D20178):
bench1: user 2m43.600s
bench2: user 1m0.948s

This patch:
bench1: user 1m6.725s
bench2: user 0m21.452s

Diff Detail

Event Timeline

aprantl updated this revision to Diff 58466.May 25 2016, 11:42 AM
aprantl retitled this revision from to PR26055: Speed up LiveDebugValues::transferDebugValue().
aprantl updated this object.
aprantl added reviewers: dberlin, friss.
aprantl set the repository for this revision to rL LLVM.
aprantl added subscribers: zaks.anna, llvm-commits, wolfgangp.

Given this is a time/space tradeoff, it might be worth having the space
numbers as well just to demonstrate that it's a good tradeoff?

Given this is a time/space tradeoff, it might be worth having the space
numbers as well just to demonstrate that it's a good tradeoff?

This adds one DenseMap entry (16 bytes index + 4 bytes payload) per local (inlined) variable in a single function. In my large benchmark with the >4000 DBG_VALUEs this was below the noise threshold. Peak memory usage seems to fluctuate by ~ 4MB on repeat runs on my machine.

dberlin added inline comments.May 26 2016, 11:31 AM
lib/CodeGen/LiveDebugValues.cpp
294

Would you mind simply abstracting this into a openrange class (or something), so that the code to maintain the set/hash happens under the covers and this is just insert/find/erase?

(If this is really difficult or something, feel free to explain why and we can punt :P)

aprantl updated this revision to Diff 58698.May 26 2016, 2:37 PM
aprantl removed rL LLVM as the repository for this revision.

Address review comments: Add an OpenRangesSet class that hides the dual nature of VarLocList behind a set-like interface.

aprantl marked an inline comment as done.May 26 2016, 2:38 PM
dberlin accepted this revision.May 26 2016, 2:39 PM
dberlin edited edge metadata.

Looks good to me. Thanks so much for working on this.

This revision is now accepted and ready to land.May 26 2016, 2:39 PM
aprantl closed this revision.May 26 2016, 2:49 PM

Thanks! Landed as r270923.