This is an archive of the discontinued LLVM Phabricator instance.

[scudo][standalone] Introduce the Secondary allocator
ClosedPublic

Authored by cryptoad on Apr 16 2019, 11:04 AM.

Details

Summary

The Secondary allocator wraps the platform allocation primitives. It is
meant to be used for larger sizes that the Primary can't fullfill, as
it will be slower, and sizes are multiple of the system page size.

This also changes some of the existing code, notably the opaque
platform data being passed to the platform specific functions: we can
shave a couple of syscalls on Fuchsia by storing additional data (this
addresses a TODO).

Event Timeline

cryptoad created this revision.Apr 16 2019, 11:04 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 16 2019, 11:04 AM
Herald added subscribers: llvm-commits, Restricted Project, jfb and 2 others. · View Herald Transcript
cryptoad updated this revision to Diff 195427.Apr 16 2019, 12:03 PM

Adding a death test & one for getBlockSize.

vitalybuka added inline comments.Apr 16 2019, 1:50 PM
lib/scudo/standalone/secondary.h
166

Why not to typedef OpaquePlatformData to something meaningful on each platform?
template is also fine, but I don't expect you will need multiple instantiations of MapAllocator
So maybe just define PlatformData on each platform?

cryptoad added inline comments.Apr 16 2019, 3:28 PM
lib/scudo/standalone/secondary.h
166

This is a valid point.
Initially I didn't want the secondary to depend on platform specific structures, but I can look into it.

cryptoad updated this revision to Diff 195576.Apr 17 2019, 8:04 AM

Following Vitaly's advice, we define the PlatformData structure
in a platform specific way, in the respective header files.

vitalybuka added inline comments.Apr 17 2019, 11:30 AM
lib/scudo/standalone/fuchsia.h
20

Now I am thinking that name PlatformData in top namespace is not very helpful.
MmapPlatformData?

lib/scudo/standalone/linux.h
20

Why this can't be empty?

lib/scudo/standalone/secondary.h
66

It's NOINLINE so can go into cpp file?

186

Why not just lock, unlock?

190

Wouldn't be better if we lock inside of iterateOverBlocks

cryptoad marked an inline comment as done.Apr 17 2019, 11:44 AM
cryptoad added inline comments.
lib/scudo/standalone/linux.h
20

Mostly because:
(C11, 6.7.2.1 Structure and union specifiers p8) "If the struct-declaration-list does not contain any named members, either directly or via an anonymous structure or anonymous union, the behavior is undefined."
So I figured I'd put a single byte field.

lib/scudo/standalone/secondary.h
66

Good point, I kept a template model in my head but there is no point, I will split it into a CC (I will rename them all to CPP once we land it all).

186

This is mostly due to the name of the Android APIs: malloc_enable & malloc_disable.
I carried the name over to the allocator's functions in Combined, Primary & Secondary (the top level disable effectively disables Primary & Secondary).
I can rename them.

190

See the comment above. Due to how the APIs are used in Android (disable/iterate/enable), I mirrored the behavior here.

morehouse added inline comments.Apr 17 2019, 11:57 AM
lib/scudo/standalone/fuchsia.h
25

Can we simplify to struct PlatformData { ... }?

lib/scudo/standalone/linux.h
21

Can we simplify to struct PlatformData { ... }?

lib/scudo/standalone/secondary.h
64

"allows us to unmap" seems more readable.

76

We're already on the slow path, and we just did a system call, so this path optimization is negligible. Same for below.

98

When is NewMapEnd < MapEnd?

104

So we potentially shrink the committed region for large alignments but did not allocate extra beforehand. Does this mean the actual region size may be less than the requested size?

vitalybuka added inline comments.Apr 17 2019, 11:57 AM
lib/scudo/standalone/linux.h
20

This is C rule.
it's allowed in C++

lib/scudo/standalone/secondary.h
186

Confusing thing is that this is going not just disable, but block all users.

190

Do Android blocks users as well?

cryptoad marked 14 inline comments as done.Apr 17 2019, 12:32 PM
cryptoad added inline comments.
lib/scudo/standalone/secondary.h
98

In order to be able to have an aligned address, we allocate Size+Alignment from the Combined (not checked-in yet), which matches the Primary behavior (and current sanitizer_common code) as explained in the comment.

At this point there is a chance that the address we get, with adjustments made for headers and whatnot, is already aligned to the requested alignment. This means we have about alignment bytes extra that can be unmap, at the end of the mapping. In that situation, we have NewMapEnd < MapEnd.
Another scenario being our aligned mapping being in the middle of the mapped area, in this situation we can also end up with spurious bytes towards the end that can be unmapped.

104

We technically allocated extra from the Combined (see function main comment).

190

Checked with cferris@: the expected behavior for disable is to pause any further allocation/deallocation request until enable is called.

cryptoad updated this revision to Diff 195620.Apr 17 2019, 12:36 PM

This splits the secondary into a H & CC file since we don't need to be
header only.
Simplify some code constructs as suggested by Matt, remove some
UNLIKELY.
Rename PlatformData to MapPlatformData.

morehouse added inline comments.Apr 22 2019, 12:22 PM
lib/scudo/standalone/secondary.cc
18 ↗(On Diff #195620)

Can we make this comment more specific? It might be helpful to mention that allocate may return anywhere from Size - AlignmentHint to Size bytes of memory.

19 ↗(On Diff #195620)

Would it be simpler to do the extra allocation and alignment here instead?

37 ↗(On Diff #195620)

What does "user pointer" mean?

For "users" of this function the comment seems false. CommitBase will always be page-aligned, but at the end of this function we return CommitBase + HeaderSize, which may not be page-aligned.

65 ↗(On Diff #195620)

Can this be simplified to H->Data = Data?

80 ↗(On Diff #195620)

Should this be StatMapped += MapEnd - MapBase?

cryptoad marked 3 inline comments as done.Apr 22 2019, 1:28 PM
cryptoad added inline comments.
lib/scudo/standalone/secondary.cc
19 ↗(On Diff #195620)

I went back and forth on how to deal with this and I have to say it's been annoying.
Technically the secondary doesn't (shouldn't?) know about the header size of the combined.
For N bytes aligned on A, the allocation done by the secondary includes a largeblock header, but also the block header. While this could be solved by passing the extra header bytes to the secondary, it would make it depart significantly from the primary: the primary returns a block that straddles the allocation boundary and doesn't care about anything else.
By building it that way, we have a very similar approach between Primary & Secondary: the combined rounds up the size and request the alloc. The real problem (it's still debatable if it's an actual problem) would be on 32-bit when requesting secondary allocations with large alignments. The unmapping becomes necessary due to the sparsity of the address space. For 64-bit, this is a non-issue.\
The resulting function might appear hackish in the end, but it ends up fitting decently in the combined.

37 ↗(On Diff #195620)

Here user pointer means what is returned to the user (I guess as opposition to the "combined" pointer which is the one we return here to the combined).
The combined pointer will be CommitBase+HeaderSize, which will be rounded up to the next alignment boundary.
Since the alignment is greater than or equal to a page, this means that the user pointer will be (at least) page aligned.

cryptoad updated this revision to Diff 196129.Apr 22 2019, 1:30 PM

Adding some comments, changing a couple of code constructs based on
Matt's feedback.

morehouse added inline comments.Apr 22 2019, 4:23 PM
lib/scudo/standalone/secondary.cc
45 ↗(On Diff #196129)

The explanation in parens makes more sense right after the term "user pointer".

19 ↗(On Diff #195620)

I don't especially like the current setup since it is hard to understand, but if it's actually cleaner this way I guess we'll have to bite the bullet.

I do think some high-level design comments would be helpful to understand how the primary/secondary/combined components interact with each other. Not sure where the best place is for that, maybe the combined allocator?

cryptoad marked 2 inline comments as done.Apr 23 2019, 7:51 AM
cryptoad added inline comments.
lib/scudo/standalone/secondary.cc
19 ↗(On Diff #195620)

The Combined will likely be the best place.

cryptoad updated this revision to Diff 196250.Apr 23 2019, 7:52 AM

Addressing Matt's comment.

This revision is now accepted and ready to land.Apr 23 2019, 10:38 AM
This revision was automatically updated to reflect the committed changes.