This is an archive of the discontinued LLVM Phabricator instance.

tsan: optimize DenseSlabAlloc
Closed, Public

Authored by dvyukov on Jul 18 2022, 5:20 AM.

Details

Summary

If lots of threads do lots of malloc/free and they overflow the
per-pthread DenseSlabAlloc cache, it causes lots of contention:

31.97%  race.old  race.old            [.] __sanitizer::StaticSpinMutex::LockSlow
17.61%  race.old  race.old            [.] __tsan_read4
10.77%  race.old  race.old            [.] __tsan::SlotLock

Optimize DenseSlabAlloc to use a lock-free stack of batches of nodes.
This way we don't take any locks in steady state at all and do only
1 push/pop per Refill/Drain.
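
As a rough illustration of the idea (not the code from this revision), a
lock-free stack of batches can be sketched as below. All names here
(Node, BatchFreeList, PushBatch, PopBatch) are hypothetical, and the
sketch assumes 32-bit node indices with index 0 reserved as "empty" and
a 32-bit counter packed into the same 64-bit word to guard against ABA.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; names and layout are assumptions, not TSan's code.
struct Node {
  std::atomic<uint32_t> next{0};  // next batch on the shared stack
  uint32_t batch = 0;             // next node inside the same batch
};

class BatchFreeList {
 public:
  // Index 0 is reserved to mean "empty", so allocate n + 1 slots.
  explicit BatchFreeList(uint32_t n) : nodes_(n + 1) {}

  // Drain side: link `count` node indices into one batch privately,
  // then publish the whole batch with a single CAS (one push).
  void PushBatch(const uint32_t* idx, size_t count) {
    uint32_t head = 0;
    for (size_t i = 0; i < count; i++) {
      nodes_[idx[i]].batch = head;
      head = idx[i];
    }
    if (!head) return;  // nothing to publish
    uint64_t cmp = top_.load(std::memory_order_acquire);
    uint64_t xchg;
    do {
      nodes_[head].next.store(static_cast<uint32_t>(cmp),
                              std::memory_order_relaxed);
      // Low 32 bits: new head index. High 32 bits: counter, bumped on
      // every push so a stale pop cannot succeed after head is reused.
      xchg = head | ((cmp & kCounterMask) + kCounterInc);
    } while (!top_.compare_exchange_weak(cmp, xchg,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire));
  }

  // Refill side: detach a whole batch with a single CAS (one pop),
  // then unpack it without touching shared state again.
  // Appends the popped indices to `out` and returns how many were added.
  size_t PopBatch(std::vector<uint32_t>& out) {
    uint64_t cmp = top_.load(std::memory_order_acquire);
    uint32_t head;
    uint64_t xchg;
    do {
      head = static_cast<uint32_t>(cmp);
      if (!head) return 0;  // stack is empty
      xchg = nodes_[head].next.load(std::memory_order_relaxed) |
             (cmp & kCounterMask);
    } while (!top_.compare_exchange_weak(cmp, xchg,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire));
    size_t n = 0;
    for (uint32_t i = head; i; i = nodes_[i].batch) {
      out.push_back(i);
      n++;
    }
    return n;
  }

 private:
  static constexpr uint64_t kCounterInc = 1ull << 32;
  static constexpr uint64_t kCounterMask = ~(kCounterInc - 1);
  std::vector<Node> nodes_;
  std::atomic<uint64_t> top_{0};
};
```

In this sketch a thread that overflows its local cache would call
PushBatch once with a group of indices, and a thread that runs dry would
call PopBatch once, so the shared word is touched once per batch rather
than once per node.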

Effect on the added benchmark (columns: elapsed seconds, user CPU
seconds, system CPU seconds, max RSS in KB):

$ TIME="%e %U %S %M" time ./test.old 36 5 2000000
34.51 978.22 175.67 5833592
32.53 891.73 167.03 5790036
36.17 1005.54 201.24 5802828
36.94 1004.76 226.58 5803188

$ TIME="%e %U %S %M" time ./test.new 36 5 2000000
26.44 720.99 13.45 5750704
25.92 721.98 13.58 5767764
26.33 725.15 13.41 5777936
25.93 713.49 13.41 5791796

Event Timeline

dvyukov created this revision. Jul 18 2022, 5:20 AM
Herald added a project: Restricted Project. Jul 18 2022, 5:20 AM
Herald added a subscriber: Enna1.
dvyukov requested review of this revision. Jul 18 2022, 5:20 AM
Herald added a project: Restricted Project. Jul 18 2022, 5:20 AM
Herald added a subscriber: Restricted Project.
melver accepted this revision. Jul 19 2022, 5:02 AM
This revision is now accepted and ready to land. Jul 19 2022, 5:02 AM
This revision was automatically updated to reflect the committed changes.