This is an archive of the discontinued LLVM Phabricator instance.

[scudo][standalone] Merge Spin & Blocking mutex into a Hybrid one
ClosedPublic

Authored by cryptoad on Jul 8 2019, 11:52 AM.

Details

Summary

We ran into a problem on Fuchsia where yielding threads would never
be deboosted, ultimately resulting in several threads spinning on the
same TSD, and no possibility for another thread to be scheduled,
dead-locking the process.

While this was fixed in Zircon, this lead to discussions about if
spinning without a break condition was a good decision, and settled on
a new hybrid model that would spin for a while then block.

Currently we are using a number of iterations for spinning that is
mostly arbitrary (based on sanitizer_common values), but this can
be tuned in the future.

Since we are touching common.h, we also use this change as a vehicle
for an Android optimization (the page size is fixed in Bionic, so use
a fixed value too).

Diff Detail

Repository
rL LLVM

Event Timeline

cryptoad created this revision.Jul 8 2019, 11:52 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 8 2019, 11:52 AM
Herald added subscribers: Restricted Project, jfb, delcypher, srhines. · View Herald Transcript
hctim added inline comments.Jul 8 2019, 12:58 PM
lib/scudo/standalone/common.h
122 ↗(On Diff #208471)

Could we potentially add an android-specific test to make sure this assumption doesn't change?

Also, could this potentially be a compile-time resolution? getPageSizeCached() is called for every allocation to the secondary allocator and seems like an unneccessary branch that can be eliminated.

lib/scudo/standalone/mutex.h
21 ↗(On Diff #208471)

Instead of memset-init here, can we simply have a constexpr constructor to guarantee zero-initialisation? Would potentially prevent forgotten init() in future.

27 ↗(On Diff #208471)

Why?

29 ↗(On Diff #208471)

Could we have some sort of class-level constexpr here instead of just hardcoding the number 10?

30 ↗(On Diff #208471)

Same as above.

40 ↗(On Diff #208471)

Why is this a uptr[1]? It looks like the underlying atomic_compare_exchange_strong supports any atomic type, including char.

From my reading of the spec, the underlying type for an unscoped enum is implementation defined. If you want a concrete type, you could always use enum State : char { ... }. If you're worried about alignment, you could always use the uptr as the backing type or alignas. I feel like State Storage; would be much more easy to understand here. You could always also use enum class to get a guaranteed integer-backed enum.

cryptoad marked 3 inline comments as done.Jul 8 2019, 1:33 PM
cryptoad added inline comments.
lib/scudo/standalone/common.h
122 ↗(On Diff #208471)

Good point, I will add a test.
This will be optimized by the compiler, as SCUDO_ANDROID is a compile time constant.
From previous sanitizer_common reviews, this construct is preferred to an #if as it is less cumbersome, and more maintainable.

lib/scudo/standalone/mutex.h
21 ↗(On Diff #208471)

The benefit here is to not do unnecessary zeroing of memory when we are dealing with TLS or static variable that are already zero'd out.
That allows us to only have to use init in tests and for stack variables.
I'd rather keep it that way since it's conformant with other similar code constructs.

27 ↗(On Diff #208471)

We do not need speed here, a tighter function is preferable.
From g3 & Android disassembly, the compiler unrolls that leading to a giant chunk that isn't ideal.
I tried to figure out a way that would work for all compilers, but loop optimization (or disabling of) is fairly specific afaict.
I settled for this.

40 ↗(On Diff #208471)

This is variable that stores the underlying platform-specific data.
For Linux, we want a u32 (atomic), but for Fuchsia it's a sync_mutex_t, which should be opaque to us (Fuchsia doesn't use the State enum).
A single uptr is enough to store either (with CHECKs in their respective implementation to verify size and alignment), but using an array would make it easier to add other platforms.

cryptoad updated this revision to Diff 208498.Jul 8 2019, 1:46 PM
  • Adding constexpr for the number of yields & trys in the mutex class;
  • Adding a test to ensure that scudo::getPageSizeCached is consitent with getpagesize;
cryptoad updated this revision to Diff 208712.Jul 9 2019, 9:10 AM

Move the mutex state enum into linux.cc since it's only used there.

hctim added inline comments.Jul 9 2019, 10:50 AM
lib/scudo/standalone/mutex.h
21 ↗(On Diff #208471)

Constant initialization should generally be done at compile-time for statics, and non-IE TLS will have to zero-init either in the c'tor or init() anyway, right?

Currently, it's quite hard to keep track of whether the mutex is initialised or not. For example, SizeClassInfo::Mutex is only safe as it's used in Allocator::Primary which is (as far as I can tell) statically allocated. There shouldn't be a perf overhead for this, and it would make it a strong guarantee that the mutex is always initialised.

Up to you though - I personally find it very hard to keep track of what needs to be initialised. Also, if there is ever a change that makes the class unusable if zero-initialised, the author may mistakenly assume init() is called every time.

27 ↗(On Diff #208471)

Sounds good. Could you add this as a comment for future readers?

40 ↗(On Diff #208471)

For GWP-ASan, we decided that ifdefs around the member and the header include were okay, which is also what jemalloc does. I was initially very hesitent to use opaque platform-specific bytes here, as I didn't know the internals of zircon's sync_mutex_t. Looks like it basically typedefs down to an int through zx_futex_t, which means it doesn't violate the standard.

On this note, why in the linux implementation was the choice made to use SYS_futex instead of pthread_mutex_t?

cryptoad updated this revision to Diff 208774.Jul 9 2019, 12:14 PM
cryptoad marked 3 inline comments as done.

Addressing review comments:

  • commenting the use of the #pragma nounroll
  • using the platform specific types in mutex.h
cryptoad marked 5 inline comments as done.Jul 10 2019, 9:10 AM
cryptoad added inline comments.
lib/scudo/standalone/mutex.h
40 ↗(On Diff #208471)

Replaced the opaque uptr with platform specific types.
Discussed offline the reason for not using pthread_mutex_t, to make a long story short: performance on some platforms.

cryptoad marked 5 inline comments as done.Jul 10 2019, 9:17 AM
cryptoad added inline comments.
lib/scudo/standalone/mutex.h
21 ↗(On Diff #208471)

I'd like to keep it that way. I might revisit that at some point.

hctim accepted this revision.Jul 10 2019, 11:16 AM

LGTM.

This revision is now accepted and ready to land.Jul 10 2019, 11:16 AM
This revision was automatically updated to reflect the committed changes.
cryptoad marked an inline comment as done.