This is an archive of the discontinued LLVM Phabricator instance.

[scudo][standalone] Implement TSD registry disabling
ClosedPublic

Authored by cryptoad on Dec 19 2019, 10:37 AM.

Details

Summary

In order to implement malloc_{enable|disable} we were just disabling
(or really locking) the Primary and the Secondary. That meant that
allocations could still be serviced from the TSD as long as the cache
wouldn't have to be filled from the Primary.

This wasn't working out for Android tests, so this change implements
registry disabling (eg: locking) so that getTSDAndLock doesn't
return a TSD if the allocator is disabled. This also means that the
Primary doesn't have to be disabled in this situation.

For the Shared Registry, we loop through all the TSDs and lock them.
For the Exclusive Registry, we add a Disabled boolean to the Registry
that forces getTSDAndLock to use the Fallback TSD instead of the
thread local one. Disabling the Registry is then done by locking the
Fallback TSD and setting the boolean in question (I don't think this
needed an atomic variable but I might be wrong).

I clang-formatted the whole thing as usual hence the couple of extra
whiteline changes in this CL.

Diff Detail

Event Timeline

cryptoad created this revision.Dec 19 2019, 10:37 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 19 2019, 10:37 AM
Herald added subscribers: Restricted Project, jfb. · View Herald Transcript
cryptoad updated this revision to Diff 234752.Dec 19 2019, 10:44 AM

Extra whiteline.

eugenis added inline comments.Dec 19 2019, 11:22 AM
compiler-rt/lib/scudo/standalone/tsd_exclusive.h
51–52

is not this a data race?

cryptoad marked an inline comment as done.Dec 19 2019, 11:26 AM
cryptoad added inline comments.
compiler-rt/lib/scudo/standalone/tsd_exclusive.h
51–52

I think it's OK for allocations to succeed until disable() returns, which I assume is the racey part you are referring to?

cryptoad marked an inline comment as not done.Dec 19 2019, 11:26 AM
eugenis added inline comments.Dec 19 2019, 11:36 AM
compiler-rt/lib/scudo/standalone/tsd_exclusive.h
51–52

If the read of Disabled is not an acquire-read, the thread may not observe the change for a very long time. If you want any guarantee there, both the reading and writing threads need to do some kind of synchronize-y thing.

This sounds like a hard problem if we don't want to sacrifice performance of the fast path.

cryptoad added inline comments.Dec 19 2019, 11:41 AM
compiler-rt/lib/scudo/standalone/tsd_exclusive.h
51–52

Right. I'll update this to use an atomic and run some quick tests.

cryptoad updated this revision to Diff 234763.EditedDec 19 2019, 11:56 AM

Use an atomic_u8 for Disabled in the Exclusive Registry.
This should ensure that changes to the variable are visible to the
threads in a somewhat timely manner.

A run of rpc2-benchmark with this new change doesn't show noticeable
performance degradation.

EDIT: doesn't, not does :(

eugenis added inline comments.Dec 19 2019, 12:09 PM
compiler-rt/lib/scudo/standalone/tsd_shared.h
83

I'm also concerned about the shared tsd code path. Is locking the tsd enough to ensure that no allocations will happen in the future? It looks like threads hold the tsd lock for as short a time as possible, and do the rest of the work such as updating the chuck state later without synchronization. This work may not be finished by the time malloc_disable returns. Could malloc_iterate be confused if it sees the allocator in that state?

cryptoad added inline comments.Dec 19 2019, 12:20 PM
compiler-rt/lib/scudo/standalone/tsd_shared.h
83

You are correct, a thread could still be updating a block by the time disable returns.

Hopefully the window is small enough for the amount of potential confusion to be low. Given that this is mostly used in the context of libmemunreachable, which is mostly for debugging purposes, I think a best-effort approach as opposed to a catch-all that might have stronger performance implication is acceptable.

To be fair there might be something better that I haven't thought about as well.

cryptoad added inline comments.Dec 19 2019, 12:22 PM
compiler-rt/lib/scudo/standalone/tsd_shared.h
83

PS: I can add a TODO stating so.

eugenis accepted this revision.Dec 19 2019, 12:49 PM

LGTM

compiler-rt/lib/scudo/standalone/tsd_shared.h
83

I think this can be fixed properly with one or several strategically placed atomic stores in the block update code such that malloc_iterate can always recognize the state the block is in. We've made changes like this in asan & lsan some time ago, mostly in response to bad reports of racy use-after-frees.

This stuff is really hard to reason about. I think this change is OK as a first step, but please leave a TODO.

This revision is now accepted and ready to land.Dec 19 2019, 12:49 PM
cryptoad updated this revision to Diff 234783.Dec 19 2019, 2:12 PM

Add a comment for disable() with a TODO to improve the behavior.

cryptoad marked 7 inline comments as done.Dec 19 2019, 3:25 PM
This revision was automatically updated to reflect the committed changes.