This is an archive of the discontinued LLVM Phabricator instance.

Whenever reasonable, merge ASAN quarantine batches to save memory.
ClosedPublic

Authored by alekseyshl on Dec 22 2016, 3:13 PM.

Details

Summary

There are cases when thread local quarantine drains almost empty
quarantine batches into the global quarantine. The current approach leaves
them almost empty, which might create a huge memory overhead (each batch
is 4K/8K, depends on bitness).

Diff Detail

Repository
rL LLVM

Event Timeline

alekseyshl updated this revision to Diff 82374.Dec 22 2016, 3:13 PM
alekseyshl retitled this revision from to Whenever reasonable, merge ASAN quarantine batches to save memory..
alekseyshl updated this object.
alekseyshl added a reviewer: eugenis.
alekseyshl added a subscriber: llvm-commits.
eugenis accepted this revision.Dec 22 2016, 3:53 PM
eugenis added a reviewer: kcc.
eugenis edited edge metadata.

LGTM

lib/sanitizer_common/sanitizer_quarantine.h
150 ↗(On Diff #82374)

nullptr

166 ↗(On Diff #82374)

DequeueBatch unnecessarily updates from_cache size. Could you use from_cache->list_.front() instead?

This revision is now accepted and ready to land.Dec 22 2016, 3:53 PM
alekseyshl updated this revision to Diff 82381.Dec 22 2016, 4:14 PM
alekseyshl marked 2 inline comments as done.
alekseyshl edited edge metadata.
  • Address comments.

LGTM, but please wait for a review from Kostya

kcc edited edge metadata.Dec 27 2016, 12:22 PM

Is there any noticeable gain from this change other than in the case where quarantine_size_mb is 0?
I don't want to introduce this extra complexity unless there is a real gain.
And quarantine_size_mb=0 should simply bypass the quarantine (if we care about this corner case)

As I understand, small-but-not-zero local quarantine cache size would trigger the bad behavior without this change.

alekseyshl updated this revision to Diff 82572.Dec 27 2016, 7:05 PM
alekseyshl edited edge metadata.
  • Address comments.
  • Add a test to verify "one allocation per thread" scenario.

Not just the small thread local quarantine size can create a lot of almost empty quarantine blocks, it can also happen joining a lot of threads which happen to make only a few allocations during their lifetime.

kcc added inline comments.Dec 28 2016, 2:22 PM
lib/sanitizer_common/sanitizer_quarantine.h
151 ↗(On Diff #82572)

is it possible to make a separate function for this part of code and then unit test it?
(Something like bool Merge(QuarantineBatch *into, QuarantineBatch *from))

test/asan/TestCases/Linux/thread_local_quarantine_size_kb.cc
1 ↗(On Diff #82572)

I would suggest to leave this test (mostly?) unchanged and create a new one.

alekseyshl updated this revision to Diff 82705.Dec 29 2016, 4:21 PM
  • Address comments.
  • Add a test to verify "one allocation per thread" scenario.
  • Add unit test for SanitizerQuarantine batch transfers.
alekseyshl marked 2 inline comments as done.Dec 29 2016, 4:21 PM

Done

alekseyshl updated this revision to Diff 82709.Dec 29 2016, 4:55 PM
  • Add dropped out line back.
alekseyshl updated this revision to Diff 84915.Jan 18 2017, 5:20 PM
  • Address comments.
  • Add a test to verify "one allocation per thread" scenario.
  • Add unit test for SanitizerQuarantine batch transfers.
  • Add dropped out line back.
  • Merge quarantine blocks on Recycle, not on Put.

PTAL

Doing even minimal work under cache spin lock on Put proven to be too slow in certain conditions (many threads, many allocations). The spin lock contention hits slow path (__sanitizer::internal_sched_yield()) and context switch results in dramatically slower execution (up to 20x slowdown observed in tests).

eugenis added inline comments.Jan 18 2017, 6:05 PM
lib/sanitizer_common/sanitizer_quarantine.h
38 ↗(On Diff #84915)

would it make the code simpler if we removed sizeof(QuarantineBatch) from here?

125 ↗(On Diff #84915)

s/options/limits

alekseyshl marked an inline comment as done.Jan 19 2017, 9:52 AM
alekseyshl added inline comments.
lib/sanitizer_common/sanitizer_quarantine.h
38 ↗(On Diff #84915)

I guess, you mean to do not account for the quarantine batch size here and calculate the overhead every time we check whether the quarantine limit is reached? It will simplify the code a little bit, but we will have to perform this calculation on every Put() call or, if we still going to roll it into size_, EnqueueBatch, DequeueBatch and MergeBatches will have to be aware of it (not that great too).

alekseyshl updated this revision to Diff 84986.Jan 19 2017, 9:52 AM
  • Message text changes

LGTM

lib/sanitizer_common/sanitizer_quarantine.h
38 ↗(On Diff #84915)

OK. Does not really matter.

This revision was automatically updated to reflect the committed changes.