This is an archive of the discontinued LLVM Phabricator instance.

Differential D20084

[sanitizer] Initial implementation of a Hardened Allocator
Closed, Public

Authored by cryptoad on May 9 2016, 3:13 PM.

Details

Reviewers

kcc
vitalybuka
glider
krasin
dvyukov
eugenis
pcc
aizatsky

Commits

rG712fc9803a4d: [sanitizer] Initial implementation of a Hardened Allocator
rCRT271968: [sanitizer] Initial implementation of a Hardened Allocator
rL271968: [sanitizer] Initial implementation of a Hardened Allocator

Summary

This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
The following were implemented:

additional consistency checks on the allocation function parameters and on the heap chunks;
use of checksum protected chunk header, to detect corruption;
randomness to the allocator base;
delayed freelist (quarantine), to mitigate use after free and overall determinism.

Additional mitigations are in the works.
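
To illustrate the delayed freelist idea from the summary, here is a minimal sketch of a byte-budgeted quarantine. It is not the code from the patch: the `Quarantine` class, its `put` method and the use of `std::free` as the backend are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <deque>
#include <utility>

// Hypothetical delayed freelist: freed chunks are parked here and only
// handed back to the backend (std::free in this sketch) once the total
// number of parked bytes exceeds a budget.
class Quarantine {
public:
  explicit Quarantine(std::size_t Budget) : MaxBytes(Budget) {}

  // Park a freed chunk, then release the oldest chunks while over budget.
  // Returns how many chunks were actually released to the backend.
  std::size_t put(void *Ptr, std::size_t Size) {
    Chunks.emplace_back(Ptr, Size);
    Bytes += Size;
    std::size_t Released = 0;
    while (Bytes > MaxBytes && !Chunks.empty()) {
      std::pair<void *, std::size_t> Oldest = Chunks.front();
      Chunks.pop_front();
      assert(Bytes >= Oldest.second);
      Bytes -= Oldest.second;
      std::free(Oldest.first);
      ++Released;
    }
    return Released;
  }

  std::size_t bytes() const { return Bytes; }
  std::size_t size() const { return Chunks.size(); }

  // Anything still parked is released when the quarantine goes away.
  ~Quarantine() {
    for (auto &Chunk : Chunks)
      std::free(Chunk.first);
  }

private:
  std::deque<std::pair<void *, std::size_t>> Chunks;
  std::size_t Bytes = 0;
  std::size_t MaxBytes;
};
```

With a 100-byte budget, parking two 60-byte chunks releases the first one on the second call, so a stale pointer to the second chunk is not immediately reusable by a new allocation.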

Event Timeline

vitalybuka added inline comments. May 11 2016, 12:05 PM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
54	enum ChunkState : u8 { ChunkAvailable = 0, };
81	I see why you need 128bit for structure. Why do you need u128 type for 2 bit members?
310	could be removed?
407	no Die?
544	maybe remove temp variable?
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
58	QuarantineSizeMb;
71	const uptr AllocatorSpace =
73	static is not needed
84	scudoMalloc(uptr Size, AllocType AllocType)

kcc added inline comments. May 11 2016, 1:51 PM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
81	I actually like it this way. The intent is more clear.

Resolving the issues raised in the new batch of comments.
Started renaming the variables, functions, etc, to be compliant with the LLVM coding standards.

docs/HardenedAllocator.rst
89	I would have to check that.
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
407	Good catch!

glider added inline comments. May 12 2016, 5:30 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
14	Please either elaborate what other "various security improvements" are there, or remove that phrase.
116	I would've started with a handwritten crc32 implementation and bothered with hardware support only if it's performance-critical (I don't think it is).
162	Shouldn't the success memory order be __ATOMIC_RELEASE?
176	Please remind me whether Scudo is going to be used together with any of the sanitizers. If yes, the destructor magic probably won't work as intended, because other tools also play with it.
263	Are we going to target 32-bit systems? This is gonna overflow uptr on x86.
279	Although SSE 4.2 may be quite common at Google, I don't think it's a good idea to bail out if it's unsupported. Note TestCPUFeature() doesn't work on AMD processors yet.
436	So remove it, maybe?
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
2	I don't insist much, but I think either the library name should be "scudo" instead of "hardened_allocator", or the names of the files under hardened_allocator/ should start with "hardened_allocator_"
projects/compiler-rt/lib/hardened_allocator/scudo_flags.inc
30	Feature request: filling the chunk context with a nonzero byte.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
2	Do the build configs prevent this file from being built on ARM?
33	Do you really need the subleaf parameter now?
40	I believe requiring certain Intel CPU models in order for the allocator to work isn't a good idea.
58	Note this is thread-unsafe. Not sure if that matters here, but still.
85	Why not use a bool here?
87	Usually a "k" prefix denotes a constant. If you're going to change it, that's just a regular variable named "has_rd_rand" (or something like that)
94	Please move this call inside the loop below.
105	Is this problem that severe that we want to abort? Note that we don't abort if the CPU doesn't support rdrand.
110	My gut feeling is that XORing rdtsc and the time since epoch is actually reducing the entropy, not increasing it. Any idea if that's true? Also, do we really need a dependency on std::chrono?
projects/compiler-rt/lib/hardened_allocator/scudo_utils.h
51	GetSeed is unused.
59	These "a", "b", "c" comments don't help :( Please remove them.
projects/compiler-rt/test/hardened_allocator/alignment.cc
14	`alignment` is unused.
projects/compiler-rt/test/hardened_allocator/init.cc
1 ↗	(On Diff #56965)	Do you really need this test?
projects/compiler-rt/test/hardened_allocator/malloc.cc
2	Each test must have comments that describe its purpose.
projects/compiler-rt/test/hardened_allocator/mismatch.cc
3	FYI you can use --check-prefix to write more test-specific CHECK directives.
27	Nit: spare newline
projects/compiler-rt/test/hardened_allocator/quarantine.cc
16	You should probably add nullptr checks to other allocations in other tests.
projects/compiler-rt/test/hardened_allocator/realloc.cc
30	Nit: a comment on a separate line must end with a period.
projects/compiler-rt/test/hardened_allocator/sizes.cc
20	s/fulfill/allocate?
54	Where does this line come from? I don't see the allocator printing it anywhere.

$0.05 regarding the cryptographic security: can someone please clarify the goal of hardening the allocator?
If the intention is to use it for e.g. ASan in production, the randomness should be at least not worse than that of regular allocators.

glider added inline comments. May 12 2016, 8:23 AM

projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
110	As a data point, I ran RdTSC() and std::chrono::high_resolution_clock::now().time_since_epoch().count() 200278017 times. The number of unique values of both variables was exactly 200278017, while the number of unique XOR values was only 200205416, i.e. there were 0.036% collisions.
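
The kind of measurement described above can be sketched as follows. The helper names `countUnique` and `xorPairs` are made up for this example, and the inputs are whatever samples the caller collected. The sketch only shows that two sequences of unique values can produce colliding XOR values; it does not measure entropy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Number of distinct values in a sample.
std::size_t countUnique(const std::vector<std::uint64_t> &Values) {
  return std::unordered_set<std::uint64_t>(Values.begin(), Values.end()).size();
}

// Element-wise XOR of two equally long samples.
std::vector<std::uint64_t> xorPairs(const std::vector<std::uint64_t> &A,
                                    const std::vector<std::uint64_t> &B) {
  assert(A.size() == B.size());
  std::vector<std::uint64_t> Out;
  Out.reserve(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out.push_back(A[I] ^ B[I]);
  return Out;
}
```

For example, the samples {1, 2, 3} and {2, 1, 0} each contain three unique values, yet every pair XORs to 3, so the combined sample has a single unique value.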

filcab added a subscriber: filcab. May 12 2016, 9:50 AM

filcab added inline comments.

docs/HardenedAllocator.rst
48	I think this parenthesis should be somewhere else. The crc32 instruction is an actual requirement of the allocator right now. P.S: Alternatively, remove the "if ..." and say it's a requirement, like you do in the next paragraph.
111	If we're making a "hardened allocator", shouldn't the default be to zero out all chunks? (of course, performance would suffer a lot. I'm just curious if there are other reasons)
projects/compiler-rt/cmake/config-ix.cmake
195	I know safestack, cfi, and ESan don't follow this, but we can probably put the HARDENED_ALLOCATOR stuff in alphabetical order with the other stuff above.
projects/compiler-rt/lib/CMakeLists.txt
58	Same here.
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
33	I would remove the TODO. If we're in a debugger, Abort() might trigger the debugger and stop the program execution. exit() will simply close the program. We've added the abort in ASan, back in the day (D12332), due to this. It's the default for ASan on OS X and on the PS4.
49	It's very likely a smaller price to pay than to have to use atomic updates + spin on update collisions.
61	I could do without the `u128` typedef. You're only using it for the `PackedHeader` typedef.
94	I'd suggest: COMPILER_CHECK(sizeof(UnpackedHeader) == sizeof(PackedHeader)); Since it makes it explicit we want those two to be the same and that it's not a coincidence that we're expecting both to "happen to be" the same size as `u128`. The `sizeof(PackedHeader)` isn't needed, since it's a typedef for `u128` (even after my proposed change, it won't be needed).
96	Remove the `static` here, since you're not adding it to other const-qualified variables when it isn't needed.
118	No need to do this. Much better to just handle it on the CMake side (which you're already doing).
124	"... 16 least significant bits of the header of the first 8 bytes..."
135	Why are you not using the C++11 atomics?
243	Why?
263	If this ends up being ported, it's a simple matter of using the `FIRST_32_SECOND_64` macro.
279	Source files are compiled assuming that feature is available. We'll have to add a fallback checksum (plus change build and this check) to address this comment. I would be ok with keeping the SSE4.2 requirement until we get a non-zero amount of requests/bug reports. P.S: http://store.steampowered.com/hwsurvey (first result for "hardware survey". It's clearly biased, but I'd guess developer CPUs are also biased to be more recent/powerful than an average computer) puts SSE4.2 adoption at ~80%.
335	Probably best to zero the whole thing (`needed_size - ChunkHeaderSize`)?
374	Please push the negation through.
450	I'm guessing you only need to zero the additional contents to account for possible overflows that might have happened. Otherwise: `ZeroContents` = true -> they're already zeroed `ZeroContents` = false -> No need to zero.
474	If we have the `ZeroContents`, no need to zero again.
475	Should we zero the whole block, instead of just the size we were asked? (Hardening it a tiny bit against overflows)
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
42	Why the empty comment trailing the `while (false)`?
projects/compiler-rt/lib/hardened_allocator/scudo_malloc_linux.cc
2	`scudo_interceptors.cc` (`.cpp` in the future, but do what Vitaly suggested and change the names only after approval to ease code review)
16	Don't add these to the whole file. I'm ok with protecting Linux/glibc-specific functions like `pvalloc`, etc. Those will need it anyway if this gets ported somewhere. No need to protect the `malloc`/`free` interceptors with `SANITIZER_LINUX`, though.
projects/compiler-rt/lib/hardened_allocator/scudo_rtl.cc
51	Should this be an `#error`?
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
60	else UNIMPLEMENTED(); (or something similar)
90	Does gcc actually do anything with this? If not, then just delete it. AFAICT, clang doesn't care unless you have an asm attribute to tie it to a specific register.
99	Nit: Why not a simple `int`? Closer to the "usual idiom" in C++.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.h
27	`static_assert(sizeof(Dest) == sizeof(Source), "Sizes are not equal!");`
projects/compiler-rt/test/hardened_allocator/double-free.cc
29	Add a `posix_memalign` version. We have a special case in `free` for it.
projects/compiler-rt/test/hardened_allocator/mismatch.cc
22	You should add, at least, the memalign -> something_other_than_free case, since it's a special case.
projects/compiler-rt/test/hardened_allocator/quarantine.cc
16	`if (p)`
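
The two size checks suggested in this batch (header sizes in scudo_allocator.cc, and `sizeof(Dest) == sizeof(Source)` in scudo_utils.h) can be sketched together. This is an illustration only: it uses a 64-bit header for portability where the patch uses 128 bits, and the field widths, `makeHeader` and `bitCast` are assumptions, not the patch's definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 64-bit header layout; field names and widths are made up.
struct UnpackedHeader {
  std::uint64_t Checksum : 16;
  std::uint64_t State : 2;
  std::uint64_t AllocType : 2;
  std::uint64_t Offset : 20;
  std::uint64_t Salt : 24;
};
using PackedHeader = std::uint64_t;

// Makes the intent explicit: both views of the header must be the same
// size, and that is not a coincidence.
static_assert(sizeof(UnpackedHeader) == sizeof(PackedHeader),
              "Header sizes must match.");

// Reinterpret the bytes of Source as Dest, refusing mismatched sizes at
// compile time.
template <class Dest, class Source> Dest bitCast(const Source &Src) {
  static_assert(sizeof(Dest) == sizeof(Source), "Sizes are not equal!");
  Dest Result;
  std::memcpy(&Result, &Src, sizeof(Result));
  return Result;
}

// Build a header with the given state and offset, all other fields zero.
UnpackedHeader makeHeader(unsigned State, std::uint32_t Offset) {
  assert(State < 4 && Offset < (1u << 20));
  UnpackedHeader Header{};
  Header.State = State;
  Header.Offset = Offset;
  return Header;
}
```

Round-tripping a header through `bitCast<PackedHeader>` and back preserves every field, which is what allows the packed form to be stored and compared as a single integer.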

BTW we've been discussing the issue with the random seed (and the header cookies) being reused upon fork() today.
If you've a service that forks in response to every client request, it can be exploited by brute-forcing the CRC of a single object (which remains the same upon fork())

Thus two questions arise:

shouldn't we increase the size of the header's crc32 to, um, 32 bits?
is it possible to re-initialize the seed and the cookie upon fork() (a dummy solution is to iterate over the heap and fix all headers, but maybe there's something more elegant?)

glider added inline comments. May 12 2016, 10:18 AM

projects/compiler-rt/lib/hardened_allocator/scudo_malloc_linux.cc
16	Note that for other systems (e.g. OSX) it may be incorrect to intercept malloc/free. Therefore it should be ok to keep those in a Linux/FreeBSD-specific file and keep SANITIZER_LINUX \| SANITIZER_FREEBSD for the whole file.
76	It's better to add comments (e.g. " // SANITIZER_LINUX" here) to #endif directives, especially when the code doesn't fit on a single screen.

dvyukov added inline comments. May 12 2016, 10:25 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
88	Don't we need only 20 bits here?
163	Looks pointless.
188	This will leak memory. Destructors run in FIFO order, so later-created user dtors can run after you. Plus pthread frees thread stack and pthread_specific regions after running pthread_specific dtors.
196	Why initGlobal is not called from ScudoInitInternal? If you expect that malloc can come before ScudoInitInternal, then you also need to call ScudoInitInternal from initThread. Otherwise it won't work anyway.
242	Why is this commented out? Looks cleaner.
284	Do we want to sanity check options.QuarantineSizeMb << 20? What if it overflows?
285	Make this tunable as well. If my program has 10000 threads, 1MB per thread is a lot.
294	s/alignment/malloc alignment/ So that the user can get at least some clue when she sees this on the console.
326	It seems to me that we don't actually need with_offset and all the associated if's. You can just always store (chunk_beg - alloc_beg) >> MinAlignmentLog into header.offset and always subtract it from user_beg.
327	There is a very tricky, implicit relation between MinAlignment, MaxAlignment and number of bits in offset. If any of these change in the future, we can get a nice attack vector due to offset overflow. Check that MaxAlignment/MinAlignment fits into offset during init.
382	I wonder if delete_size can be 0 and it does not mean that delete_size is not passed, it is just legally zero. What is passed in for delete of an array with 0 elements?
projects/compiler-rt/lib/hardened_allocator/scudo_flags.cc
57	You use a convoluted way to express 64 that requires remembering powers of two, and then spell 64 in comments. Why not just say "64"?
projects/compiler-rt/lib/hardened_allocator/scudo_flags.inc
19	s/-1/64/ then the default will be visible in help as well. User can't tune this value if she does not have a single reference point. If I know that the default is 64, then I can set it to 32 or 128. If I don't know the default, what am I supposed to do?
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
110	This is used to initialize global cookie. I would use /dev/random. Or there must be 16 bytes of good randomness in auxv.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.h
45	This is not used. Remove.
47	This is not used. Remove.
51	This is not used. Remove.
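
Two of the checks requested above, that the offset field is wide enough for MaxAlignment/MinAlignment and that the `QuarantineSizeMb << 20` conversion cannot overflow, could look roughly like this. The constants are placeholders chosen for the example (16-byte minimum, 16 MiB maximum, 20-bit offset), and the sketch assumes the stored offset is always strictly smaller than MaxAlignment / MinAlignment.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Placeholder values for illustration; not taken from the patch.
constexpr unsigned MinAlignmentLog = 4;  // 16-byte minimum alignment
constexpr unsigned MaxAlignmentLog = 24; // 16 MiB maximum alignment
constexpr unsigned OffsetBits = 20;      // width of the header offset field

// The offset is stored in units of MinAlignment, so it needs at most
// (MaxLog - MinLog) bits under the assumption stated above.
constexpr bool offsetFits(unsigned MinLog, unsigned MaxLog, unsigned Bits) {
  return MaxLog >= MinLog && (MaxLog - MinLog) <= Bits;
}

static_assert(offsetFits(MinAlignmentLog, MaxAlignmentLog, OffsetBits),
              "Offset field is too narrow for MaxAlignment / MinAlignment.");

// Convert megabytes to bytes, reporting overflow instead of wrapping.
bool megabytesToBytes(std::uint64_t Mb, std::uint64_t *Bytes) {
  assert(Bytes != nullptr);
  if (Mb > (std::numeric_limits<std::uint64_t>::max() >> 20))
    return false;
  *Bytes = Mb << 20;
  return true;
}
```

Tying the three constants together in one compile-time check means a later change to any of them fails the build instead of silently overflowing the offset.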

dvyukov added inline comments. May 12 2016, 10:25 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
65	It's better to comment right on the fields rather than duplicate them here. The comment has good chances of getting outdated. It's also harder to find the relevant part of the comment for a particular field. E.g.: u8 state : 2; // available, allocated, or quarantined comments like 'salt' on 'salt' field are excessive, drop them.
105	I am missing the relation between the requirement to not load header second time and making the function static. Why not: void AllocBeg(UnpackedHeader header) ?
127	Add a bold comment to checksum field that Checksum expects it to be low 16 bits. And maybe add some debug check here.
129	Won't it do to initialize crc to cookie as: u64 crc = _mm_crc32_u64(cookie, reinterpret_cast<uptr>(this)); ?
136	Why does this need to be acquire? Please comment.
154	Why release?
162	Why acquire?
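
As an illustration of the checksum comments above (checksum in the low 16 bits, CRC seeded with the cookie and the chunk address), here is a portable sketch. It uses a bitwise software CRC32-C step in place of `_mm_crc32_u64` so it runs without SSE 4.2. The 64-bit header and the names `seal` and `verify` are assumptions for this example, not the patch's interface.

```cpp
#include <cassert>
#include <cstdint>

// The checksum is expected to live in the low 16 bits of the header.
constexpr std::uint64_t ChecksumMask = 0xFFFFu;

// Bitwise CRC32-C (reflected polynomial 0x82F63B78) over one 64-bit word,
// used here as a software stand-in for the hardware instruction.
std::uint32_t crc32cU64(std::uint32_t Crc, std::uint64_t Data) {
  for (int Byte = 0; Byte < 8; ++Byte) {
    Crc ^= static_cast<std::uint8_t>(Data >> (8 * Byte));
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0x82F63B78u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Seed the CRC with the cookie, mix in the chunk address, then the header
// contents with the checksum bits cleared. Keep the low 16 bits.
std::uint16_t headerChecksum(std::uint32_t Cookie, std::uintptr_t ChunkAddr,
                             std::uint64_t HeaderWithoutChecksum) {
  assert((HeaderWithoutChecksum & ChecksumMask) == 0);
  std::uint32_t Crc = crc32cU64(Cookie, ChunkAddr);
  Crc = crc32cU64(Crc, HeaderWithoutChecksum);
  return static_cast<std::uint16_t>(Crc);
}

// Store the checksum into the low 16 bits of the header.
std::uint64_t seal(std::uint32_t Cookie, std::uintptr_t ChunkAddr,
                   std::uint64_t Header) {
  Header &= ~ChecksumMask;
  return Header | headerChecksum(Cookie, ChunkAddr, Header);
}

// Recompute the checksum and compare it with the stored one.
bool verify(std::uint32_t Cookie, std::uintptr_t ChunkAddr,
            std::uint64_t Header) {
  return (Header & ChecksumMask) ==
         headerChecksum(Cookie, ChunkAddr, Header & ~ChecksumMask);
}
```

Because the chunk address is part of the input, a header copied to a different address fails verification unless its checksum happens to collide, which with 16 bits is one chance in 65536 per guess.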

glider added inline comments. May 12 2016, 10:50 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
279	Well, IIUC right now the implementation just aborts for AMD processors, which are among those ~80%.

kcc added inline comments. May 12 2016, 12:15 PM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
116	I am pretty sure it is performance critical. If this gets used on older or non-x86 systems we can add other implementation later.
176	AFAICT, scudo will not be combinable with any of the sanitizers other than ubsan.
263	no 32-bit for now (or ever?)
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
2	I initially proposed to name the dir hardened_allocator to make it more self-descriptive. But if others don't mind having a dir named "scudo", let the author decide.

filcab added inline comments. May 12 2016, 12:52 PM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
279	I was just talking about the SSE4.2 part. Sorry, I was going to comment on the CPUID thing but forgot. Doing it now.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
65	Same on AMD (source: https://support.amd.com/TechDocs/25481.pdf): " CPUID Fn0000_0001_ECX Feature Identifiers ... 20 SSE42: SSE4.2 instruction support. "
67	Doesn't exist on AMD. Since SSE4.2 is required for this, you'll need to implement SSE4.2 detection for other CPU brands. RDRAND seems to exist only on Intel CPUs and there's a fallback path, so having it only on Intel doesn't seem like a problem.
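
A vendor-independent SSE 4.2 check along the lines discussed here could be sketched as below, relying on CPUID leaf 1 reporting SSE 4.2 in ECX bit 20 as quoted above. It uses `__get_cpuid` from the GCC/Clang `<cpuid.h>` header. The function names are made up, and on non-x86 targets the sketch simply reports the feature as absent.

```cpp
#include <cstdint>
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAS_CPUID 1
#else
#define SKETCH_HAS_CPUID 0
#endif

// CPUID leaf 1, ECX bit 20: SSE 4.2 instruction support.
constexpr std::uint32_t SSE42Bit = 1u << 20;

constexpr bool ecxHasSSE42(std::uint32_t Ecx) {
  return (Ecx & SSE42Bit) != 0;
}

// Query the running CPU without looking at the vendor string.
bool cpuHasSSE42() {
#if SKETCH_HAS_CPUID
  unsigned Eax = 0, Ebx = 0, Ecx = 0, Edx = 0;
  if (!__get_cpuid(1, &Eax, &Ebx, &Ecx, &Edx))
    return false; // Leaf 1 is not available.
  return ecxHasSSE42(Ecx);
#else
  return false; // Not an x86 build in this sketch.
#endif
}
```

Keeping the bit test separate from the CPUID query makes the decoding logic checkable on any machine, whatever the host CPU reports.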

Addressed some of the issues raised during the review.
Additional renaming done to comply with the LLVM coding standard.

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
88	This is indeed the case. I figured I would align that one to a multiple of 8 bits as we have some space in the second half of the 128-bit integer. I am not opposed to shortening to the actual needed bit size if you feel strongly about it.
135	Using std::atomic<unsigned __int128>?
243	Sorry this was a remainder of a debugging session. I switched it back to the original plan which was to use the thread_local QuarantineCache.
263	No plan to support 32-bit as of yet, but yes we will use FIRST_32_SECOND_64 if we do.
335	I guess we have several choices here: (1) move it up and zero the whole thing prior to the header being stored; (2) leave it here and zero the whole thing post header, which will have to account for the offset (needed_size - (chunk_beg - alloc_beg)). I think the first option would be better.
projects/compiler-rt/lib/hardened_allocator/scudo_rtl.cc
51	I figured the additional initialization techniques used by ASan et al. could be added later on. Hence the #error for now.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
33	I figured it could be useful if a feature such as RDSEED was needed.
40	My point of view when writing this was that I had to be as competitive as possible with other allocators, so that the benefit of the additional checks would not be offset by a dramatic decrease in performance. In the initial stages, it was determined that using the CPU-backed CRC32 instead of a BSD checksum gave a performance gain of about 10% in pure allocation benchmarks, so I went that way. I am not opposed to doing something purely in software, but I'd rather start this way and then expand it to be more portable.
85	I wanted to use a 3 state variable: uninitialized, true, false. Hence the -1, 0, 1.
99	I tried to be consistent using the Sanitizer types. But I see your point.
110	Is your suggestion to get rid of the epoch component?
projects/compiler-rt/test/hardened_allocator/sizes.cc
54	This is from sanitizer_allocator.cc, that handles this condition.

dvyukov added inline comments. May 13 2016, 12:55 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
89	Does it improve generated code? Leaving it as 24 is OK in that case, but it needs to be explained in comments. The width of that field has a crucial implicit relation with Min/MaxAlignment. When I double-checked the width, I found that it's not what I would expect it to be. Which means that either I am missing something else important here, or there is a bug, or things got out of sync. This uncertainty is very unpleasant and takes time for anybody reading the code.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
111	Yes, all that is predictable. /dev/random is meant specifically for such cases, it uses various sources of entropy to create strongly random numbers. On second thought, just remove all rdtsc/cpuid/rdrand trickery and read from /dev/random. If rdrand is present, kernel will use it.

filcab added inline comments. May 13 2016, 7:55 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
136	Yeah, should be nicer. Unless it's a problem due to something I'm not thinking of, then I'd rather have more standard constructs (even though there's no guarantee of an `__int128` type, let alone an `std::atomic<__int128>`, AFAICT libstdc++ and libc++ will have those implemented).

This update addresses another batch of comments raised during the review.
Among the notable changes:

after discussion with dvyukov, the memory order for the atomic operation has been changed to relaxed;
the 'with_offset' field in the header is going away, and the offset field now always stores the distance between the backend allocation and the chunk.
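
For illustration, a relaxed compare-exchange on a packed header might look like the sketch below. It uses a 64-bit header with the state in the low 2 bits, and the `markQuarantined` helper is an assumption for this example rather than the patch's code. The exchange fails if the chunk is not currently allocated or if another thread changed the header first, which is how a racing or repeated free would show up.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical chunk states kept in the low 2 bits of a 64-bit header.
enum ChunkState : std::uint64_t {
  ChunkAvailable = 0,
  ChunkAllocated = 1,
  ChunkQuarantine = 2
};
constexpr std::uint64_t StateMask = 0x3;

// Try to move a chunk from Allocated to Quarantine. Returns false if the
// chunk was not allocated, or if the header changed under us.
bool markQuarantined(std::atomic<std::uint64_t> &Header) {
  std::uint64_t Old = Header.load(std::memory_order_relaxed);
  if ((Old & StateMask) != ChunkAllocated)
    return false;
  std::uint64_t New = (Old & ~StateMask) | ChunkQuarantine;
  // Only the header word itself is being updated atomically; no other
  // memory accesses are ordered against it in this sketch.
  return Header.compare_exchange_strong(Old, New, std::memory_order_relaxed,
                                        std::memory_order_relaxed);
}
```

Calling `markQuarantined` twice on the same header succeeds the first time and fails the second, which is the point where an allocator would report a double free.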

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
189	For this, I used the same technique that is used in ASan's PlatformTSDDtor, as it seems to be the most viable one. I am not sure what alternative would work here.
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
43	This is actually a straight copy from sanitizer_internal_defs.h, just replacing CheckFailed. So I left it as is. This will go away when I redo the CHECK_IMPL logic to follow kcc@'s suggestion to implement templated failure functions.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
111	I gave it a try and I have had a lot of issues with /dev/random on my system. Between the fact that it's blocking, and that sometimes it won't return the amount of bytes requested, the tests have been failing inconsistently. Tests with /dev/urandom worked better though. I am going to dig further into that.

filcab added inline comments. May 16 2016, 10:13 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
135	Did `std::atomic<unsigned __int128>` not work/was too slow?
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
111	Using `/dev/urandom` should be what you need, yes. Did you still have problems with urandom, btw?

cryptoad added inline comments. May 16 2016, 10:37 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
135	It's in the works :)
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
111	/dev/urandom appeared to work fine.

dvyukov added inline comments. May 17 2016, 1:19 AM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
189	If a free comes after we drained local cache, asan uses a global cache. Grep for "fallback" in asan_allocator.cc. Tsan now uses the same. It sucks. But I don't see how to do better. We need to detect when a thread is actually finished, but it's tricky to do with pthread_join API.
projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
111	/dev/urandom is not what you need. It trades security for performance. I.e. instead of blocking it will just give you predictable randomness. Which kind of defeats the whole purpose of a security allocator. /dev/random blocks when it does not have enough entropy. But there is not much you can do if you do need the entropy. If it returns fewer bytes, read again. That's how it works with all read calls.
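
Whichever device ends up being used, the "read again on a short read" advice can be sketched as a small POSIX helper. `readFully` is a hypothetical name; it works on any file descriptor and retries on short reads and on EINTR.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Read exactly Size bytes from Fd into Buffer. Returns false on a read
// error other than EINTR, or if end-of-file arrives before Size bytes.
bool readFully(int Fd, void *Buffer, std::size_t Size) {
  assert(Buffer != nullptr || Size == 0);
  char *Out = static_cast<char *>(Buffer);
  std::size_t Done = 0;
  while (Done < Size) {
    ssize_t N = read(Fd, Out + Done, Size - Done);
    if (N < 0) {
      if (errno == EINTR)
        continue; // Interrupted by a signal: retry.
      return false;
    }
    if (N == 0)
      return false; // End of file before the request was satisfied.
    Done += static_cast<std::size_t>(N);
  }
  return true;
}
```

A caller seeding a cookie would open the chosen device (for example with `open("/dev/urandom", O_RDONLY)` from `<fcntl.h>`), pass the descriptor to `readFully`, and treat a `false` return as fatal.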

filcab added inline comments. May 17 2016, 6:02 AM

projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
111	People like Daniel Bernstein and Thomas Ptacek (and others) tend to disagree and say that /dev/urandom is what we need: http://blog.cr.yp.to/20140205-entropy.html http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/ I'm no expert in this, so I tend to rely on people who work on this kind of thing.

This diff addresses another batch of comments from the review, as well as some renaming to converge towards LLVM coding standard compliance.
Among the notable changes:

a fallback mechanism has been added to service allocations and deallocations post thread tear-down, as per dvyukov's guidance;
the initialization has been moved around to not depend on .preinit_array; a test has been added to make sure that preinit allocations are serviced successfully;
std::atomic is used in place of the GCC builtins; the compiled code is identical.

There are still some comments left to address, notably regarding the source of randomness.

kubamracek added a subscriber: kubamracek. May 22 2016, 11:55 AM

With this diff ends the renaming process, so unless I missed or misunderstood something, this should be compliant with the LLVM coding standards.
Additionally, I migrated the thread local PRNG initialization to use /dev/urandom for seeding purposes. This does not appear to have significantly impacted performance.

One of the outstanding items on my list was to have a look at the CHECK logic to be able to have everything fast fail without callbacks if something went wrong.
kcc's suggestion was to change the CHECKs in the Allocator to make them call a templated failure function (or virtual) (http://reviews.llvm.org/D20084#425327).
I've realized since then that we also need that to be true for the Quarantine, and potentially any abstracted Sanitizer function (the per platform file access functions come to mind). Basically we really want CheckFailed to be ours anywhere in the project, so that when compiling the hardened allocator, none of the with-callbacks versions will be compiled in.

I'd welcome some feedback as to how to do that as cleanly as possible, thanks!

Yea...
So one possible way is to re-define __sanitizer::CheckFailed.
It should be relatively easy if we are never going to allow mixing scudo and the sanitizers.
There is no way to mix scudo and asan/tsan/msan anyway, because they have conflicting allocators.
You may mix scudo and ubsan, since ubsan does not have an allocator, but the only sane way to have ubsan
in prod is to use it in trapping mode, which does not have a run-time. So we are good here too.

So, try this:

move __sanitizer::CheckFailed into a separate file.
make sure it is used by *san
make sure it is not used by scudo
define your own __sanitizer::CheckFailed

I have not tested it, so something may go wrong, you'll need to experiment...

kcc added inline comments. May 27 2016, 3:19 PM

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
30	So, you should not need this any more, let's remove it now. I'll make one more pass afterwards (Monday-ish)

In D20084#441391, @cryptoad wrote:

One of the outstanding items on my list was to have a look at the CHECK logic to be able to have everything fast fail without callbacks if something went wrong.

Why can't we just set Die as CheckFailedCallback?

dvyukov added inline comments. May 28 2016, 11:46 PM

projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
100	urandom is not secure and can allow to guess the cookie in a local setuid binary.

In D20084#443276, @dvyukov wrote:

In D20084#441391, @cryptoad wrote:

One of the outstanding items on my list was to have a look at the CHECK logic to be able to have everything fast fail without callbacks if something went wrong.

Why can't we just set Die as CheckFailedCallback?

What I am trying to prevent here is the use of callbacks at all. They would be an interesting target for an attacker as they would be writable function pointers that could be triggered on demand on heap corruption.

projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
100	So on this matter it seems that the general agreement is that urandom on modern Linux system is secure and can be used for cryptographic purposes. Even the more recent getrandom system call uses urandom by default, with the following entry in the man page: "Unless you are doing long-term key generation (and perhaps not even then), you probably shouldn't be using GRND_RANDOM. The cryptographic algorithms used for /dev/urandom are quite conservative, and so should be sufficient for all purposes." /dev/random performs poorly in my tests, often blocking the allocator.

In D20084#443951, @cryptoad wrote:

In D20084#443276, @dvyukov wrote:

In D20084#441391, @cryptoad wrote:

One of the outstanding items on my list was to have a look at the CHECK logic to be able to have everything fast fail without callbacks if something went wrong.

Why can't we just set Die as CheckFailedCallback?

What I am trying to prevent here is the use of callbacks at all. They would be an interesting target for an attacker as they would be writable function pointers that could be triggered on demand on heap corruption.

But that would mean that an attacker broke ASLR and can write arbitrary values at necessary memory locations. Does it still make sense to defend in such case?

projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc
100	Okay. You know better.

In D20084#444017, @dvyukov wrote:

But that would mean that an attacker broke ASLR and can write arbitrary values at necessary memory locations. Does it still make sense to defend in such case?

That is correct. I think it is still worth it to not take the chance.
Previous work on other heaps has leveraged such features, given the same assumptions (for example the commit function pointer in the Windows Heap https://www.blackhat.com/presentations/bh-usa-09/MCDONALD/BHUSA09-McDonald-WindowsHeap-PAPER.pdf).
I think it's particularly important to make sure that the failure path fails fast and ideally without the possibility of interruption (like __fastfail http://www.alex-ionescu.com/?p=69).

We now replace the Sanitizer's termination functions so that no callbacks can be called in CheckFailed and Die.

Mostly LG.
Please address a few remaining nits and wait until tomorrow for more comments.
If no significant comments, rename the dir to scudo and I will land it.

Note: I did not review this code from a security perspective in detail because

not an expert
there are known security weaknesses in the backend allocator (we will need to handle them separately)
it'll be easier to do further security assessment once the code is committed.

docs/HardenedAllocator.rst
90	Did you?
95	Give an example instead of referring to "usual ASan syntax". Scudo users don't have to be asan experts.
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
64	align the comment block
110	I suggest to replace all cases of if (!cond) { Printf() Die() } With if (!cond) DieWithMessage(); This is using the Printf from sanitizer_common, right? It might be worth replacing it with your own, simpler one. If you agree, just leave a TODO near DieWithMessage and address it later.
279	s/late/later
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h
19	is currently only supported? or "supports x86_64"?
50	Does the following block of code have to be in this header? Why not in .cc?

Addressing some comments raised in the review, notably:

new dieWithMessage function wrapping a Printf+Die functionality - still currently using Sanitizer's VSNPrintf which will be changed later;
updated the documentation;
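
A rough sketch of what a Printf+Die wrapper of this kind can look like is shown below. It is not the patch's `dieWithMessage`: the buffer size, the `formatFatal` helpers and the use of `write` plus `std::abort` are assumptions for this example. `std::abort` skips atexit handlers, though a SIGABRT handler could still run, so a real implementation might prefer a trap instruction.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <unistd.h>

// Format into a caller-provided buffer. Returns the number of characters
// stored, not counting the terminating NUL (output is truncated to fit).
std::size_t vformatFatal(char *Buffer, std::size_t Size, const char *Format,
                         va_list Args) {
  assert(Format != nullptr);
  if (Size == 0)
    return 0;
  int N = std::vsnprintf(Buffer, Size, Format, Args);
  if (N < 0)
    return 0;
  return static_cast<std::size_t>(N) < Size ? static_cast<std::size_t>(N)
                                            : Size - 1;
}

// Variadic convenience wrapper around vformatFatal.
std::size_t formatFatal(char *Buffer, std::size_t Size, const char *Format,
                        ...) {
  va_list Args;
  va_start(Args, Format);
  std::size_t Length = vformatFatal(Buffer, Size, Format, Args);
  va_end(Args);
  return Length;
}

// Print the message on stderr and terminate, without going through any
// registered callback or heap allocation.
[[noreturn]] void dieWithMessage(const char *Format, ...) {
  char Buffer[256];
  va_list Args;
  va_start(Args, Format);
  std::size_t Length = vformatFatal(Buffer, sizeof(Buffer), Format, Args);
  va_end(Args);
  ssize_t Ignored = write(STDERR_FILENO, Buffer, Length);
  (void)Ignored;
  std::abort();
}
```

With such a helper, the `if (!cond) { Printf(); Die(); }` pattern from the review collapses to a single `if (!cond) dieWithMessage("...");` call at each check site.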

docs/HardenedAllocator.rst
90	I removed the part about the preinit_array as I do not use that anymore. Whatever LIT is using requires the whole-archive flag, if using gcc to link the static library against a project, it doesn't.
95	I didn't realize that I hadn't updated the options names below as well. Also added ThreadLocalQuarantineSizeKb.
projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc
110	There is also a PrintfAndReportCallback callback that I just noticed. I will have to address that later as well.

LGTM.
I think it's as good as we can make it via code review.
Let's make it better by incremental changes.

Now, please rename the directories to lib/scudo and test/scudo, upload the updated patch and let me land it.
Probably also rename HardenedAllocator.rst to ScudoHardenedAllocator.rst or some such (up to you)

This revision is now accepted and ready to land. Jun 3 2016, 2:48 PM

This patch renames the directories and files to fit the scheme suggested during the review, and the LLVM practices:

hardened_allocator is now scudo everywhere;
documentation is now in ScudoHardenedAllocator.rst;
all .cc files are now .cpp;
additionally scudo_malloc_linux.cc is now scudo_interceptors.cpp;
build files have been updated accordingly, as well as all references to the previous naming scheme (the library and checks rules are now 'scudo' and 'check-scudo').

I am getting these when trying 'ninja check-scudo'
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/atomic:266: undefined reference to `__sync_val_compare_and_swap_16'
Any suggestions?

BTW, do we make sure that check-scudo is not run if the current machine does not support proper SSE?

Most likely we're missing -latomic (IIRC).
I wouldn't expect to have to link with that when using the std::atomic, but
that might be it.

Filipe

In D20084#449057, @kcc wrote:

I am getting these when trying 'ninja check-scudo'
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/atomic:266: undefined reference to `__sync_val_compare_and_swap_16'
Any suggestions?

Regarding @filcab's comment, -latomic is in the cflags in the lit.cfg, not sure why this is not working for you.

BTW, do we make sure that check-scudo is not run if the current machine does not support proper SSE?

We do check for SSE 4.2 in the init of the Allocator via CHECK(testCPUFeature(SSE4_2))

In D20084#449057, @kcc wrote:

I am getting these when trying 'ninja check-scudo'
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/atomic:266: undefined reference to `__sync_val_compare_and_swap_16'
Any suggestions?

I also use -mcx16 in g3 BUILD, that might be it.

We do check for SSE 4.2 in the init of the Allocator via CHECK(testCPUFeature(SSE4_2))

That's not enough.
This is a run-time check and so if someone runs "check-all" on a machine that does not support SSE4_2
they *will* run check-scudo and get a test failure.
Instead, we need to ensure that check-scudo is not executed as part of check-all when there is no proper HW support

And of course, fix check-scudo on my machine [ :) ] so that I can test it before committing.

Updated Scudo LIT CMakeLists.txt to only add check-scudo on Linux x64 machines with SSE4.2.

Closed by commit rL271968: [sanitizer] Initial implementation of a Hardened Allocator (authored by kcc). Jun 6 2016, 6:27 PM

This revision was automatically updated to reflect the committed changes.

Thanks again for contributing this code, let's now make the allocator even more hardened!

Revision Contents

Path	Size
docs/
HardenedAllocator.rst	111 lines
projects/
compiler-rt/
cmake/
config-ix.cmake	14 lines
lib/
CMakeLists.txt	4 lines
hardened_allocator/
	32 lines
	98 lines
	585 lines
	33 lines
	81 lines
	35 lines
scudo_malloc_linux.cc	75 lines
	69 lines
	51 lines
	58 lines
	121 lines
test/
CMakeLists.txt	3 lines
hardened_allocator/
	21 lines
	25 lines
	49 lines
	39 lines
	7 lines
	27 lines
	42 lines
	41 lines
	38 lines
	43 lines
	69 lines
	40 lines
	61 lines

Diff 57362

docs/HardenedAllocator.rst

				========================
				Scudo Hardened Allocator
				========================

				.. contents::
				:local:
				:depth: 1

				Introduction
				============
				The Scudo Hardened Allocator is a user-mode allocator based on LLVM Sanitizer's
				CombinedAllocator, which aims at providing additional mitigations against heap
				based vulnerabilities, while retaining good performance.

				kcc (inline, Done): [not a native English speaker here] performances or performance?
				The name "Scudo" has been retained from the initial implementation (Escudo
				meaning Shield in Spanish and Portuguese).
				kcc (inline, Done): Sweet!

				Design
				======
				Chunk Header
				------------
				Every chunk of heap memory will be preceded by a chunk header. This has two
				purposes, the first one being to store various information about the chunk,
				the second one being to detect potential heap overflows. In order to achieve
				this, the header will be checksummed, involving the pointer to the chunk itself
				and a global secret. Any corruption of the header will be detected when said
				header is accessed, and the process terminated.

				The following information is stored in the header:

				- the 16-bit checksum;
				- the user requested size for that chunk, which is necessary for reallocation
				purposes;
				- the state of the chunk (available, allocated or quarantined);
				- the allocation type (malloc, new, new[] or memalign), to detect potential
				mismatches in the allocation APIs used;
				- whether or not the chunk is offset (i.e., if the chunk beginning is different
				than the backend allocation beginning, which is most often the case with some
				aligned allocations);
				- the associated offset;
				- a 16-bit salt.

				On x64, which is currently the only architecture supported, the header fits
				within 16 bytes, which works nicely with the minimum alignment requirements.

				The checksum is computed as a CRC32 (if the associated CPU instructions are
				available) of the chunk pointer itself, and the 16 bytes of header with the
				checksum field zeroed out. The result is then xored with a global secret.
				filcab (inline, Done): I think this parenthesis should be somewhere else. The crc32 instruction is an actual requirement of the allocator right now. P.S.: Alternatively, remove the "if ..." and say it's a requirement, like you do in the next paragraph.
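
For illustration only, and not part of the patch: the checksum described above can be sketched in portable C++. The sketch follows the order of operations used by Checksum() in scudo_allocator.cc in this diff (the cookie seeds the CRC, rather than being xored in afterwards). The function `crc32cU64` is a hypothetical bitwise software stand-in for the SSE4.2 `_mm_crc32_u64` intrinsic, written so the example runs without hardware support; `computeChecksum` is likewise a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software stand-in for the hardware CRC32 step: feeds the
// eight bytes of Value, least significant first, into a bit-reflected
// CRC-32C (Castagnoli polynomial) with no initial or final inversion.
uint64_t crc32cU64(uint64_t Crc, uint64_t Value) {
  uint32_t C = static_cast<uint32_t>(Crc);
  for (int Byte = 0; Byte < 8; ++Byte) {
    C ^= static_cast<uint8_t>(Value >> (8 * Byte));
    for (int Bit = 0; Bit < 8; ++Bit)
      C = (C >> 1) ^ (0x82F63B78u & (0u - (C & 1u)));
  }
  return C;
}

// Sketch of the header checksum: the CRC is seeded with the global cookie,
// then fed the chunk address and both 64-bit halves of the header. The low
// 16 bits of the first half hold the checksum itself, so they are masked
// out before being fed to the CRC.
uint16_t computeChecksum(uint64_t Cookie, uintptr_t ChunkPtr,
                         const uint64_t Header[2]) {
  uint64_t Crc = crc32cU64(Cookie, static_cast<uint64_t>(ChunkPtr));
  Crc = crc32cU64(Crc, Header[0] & 0xffffffffffff0000ULL);
  Crc = crc32cU64(Crc, Header[1]);
  return static_cast<uint16_t>(Crc);
}
```

Because the chunk address is part of the input, a header copied verbatim to another chunk would generally not validate there, and because the cookie is secret, an attacker who overwrites a header cannot easily recompute a matching checksum.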

				The header is atomically loaded and stored to prevent races (this requires
				platform support such as the cmpxchg16b instruction). This is important as two
				consecutive chunks could belong to different threads. We also want to avoid
				any type of double fetches of information located in the header, and use local
				copies of the header for this purpose.

				kcc (inline, Done): stack copies? I would call them local copies, because there is a good chance that they are not on the stack but in a register.
				Delayed Freelist
				-----------------
				A delayed freelist allows us to not return a chunk directly to the backend, but
				to keep it aside for a while. Once a criterion is met, the delayed freelist is
				emptied, and the quarantined chunks are returned to the backend. This helps
				mitigate use-after-free vulnerabilities by reducing the determinism of the
				kcc (inline, Done): helps mitigate?
				allocation and deallocation patterns.
				kcc (inline, Done): s/to/by ?

				This feature is using the Sanitizer's Quarantine as its base, and the amount of
				memory that it can hold is configurable by the user (see the Options section
				below).
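
For illustration only, and not part of the patch: the delayed freelist idea can be sketched as a size-bounded FIFO. This is a deliberately simplified model; it does not reproduce the Sanitizer Quarantine that the patch builds on, and the class name `ToyQuarantine` and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical toy quarantine: freed chunks are held until the total number
// of held bytes exceeds a limit, at which point the oldest ones are released.
class ToyQuarantine {
public:
  explicit ToyQuarantine(size_t Limit) : MaxBytes(Limit) {}

  // Puts a freed chunk aside. Returns the chunks (possibly none) that must
  // now be handed back to the backend allocator.
  std::vector<void *> put(void *Ptr, size_t Size) {
    Chunks.emplace_back(Ptr, Size);
    HeldBytes += Size;
    std::vector<void *> Released;
    while (HeldBytes > MaxBytes && !Chunks.empty()) {
      Released.push_back(Chunks.front().first);
      HeldBytes -= Chunks.front().second;
      Chunks.pop_front();
    }
    return Released;
  }

  size_t heldBytes() const { return HeldBytes; }

private:
  size_t MaxBytes;
  size_t HeldBytes = 0;
  std::deque<std::pair<void *, size_t>> Chunks;
};
```

The point of the delay is that a chunk freed by the program is not immediately available for reuse, so a dangling pointer is less likely to alias a fresh allocation that an attacker controls.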

				Randomness
				----------
				It is important for the allocator to not make use of fixed addresses. We use
				the dynamic base option for the SizeClassAllocator, allowing us to benefit
				from the randomness of mmap.

				Usage
				=====

				Library
				-------
				The allocator static library can be built from the LLVM build tree thanks to
				the "hardened_allocator" CMake rule. The associated tests can be exercised
				thanks to the "check-hardened_allocator" CMake rule.
				kcc (inline, Done): did you mean hardened_allocator?

				Linking the static library to your project will likely require the use of the
				"whole-archive" linker flag (or equivalent) as we make use of the
				.preinit_array section to initialize the allocator. Additional linker flags can
				kcc (inline, Done): "make use"?
				be required depending on your project.

				Your linked binary should now make use of the Scudo allocation and deallocation
				functions.
				kcc (inline, Not Done): You mean, dynamic linker? Static linker (such as e.g. lld) can safely use it, right?
				cryptoad (author, inline, Not Done): I would have to check that.

				kcc (inline, Not Done): Did you?
				cryptoad (author, inline, Not Done): I removed the part about the preinit_array as I do not use that anymore. Whatever LIT is using requires the whole-archive flag, if using gcc to link the static library against a project, it doesn't.
				Options
				-------
				Several aspects of the allocator can be configured through environment options,
				following the usual ASan options syntax, through the variable SCUDO_OPTIONS.

				kcc (inline, Done): Give an example instead of referring to "usual ASan syntax". Scudo users don't have to be asan experts.
				cryptoad (author, inline, Not Done): I didn't realize that I hadn't updated the options names below as well. Also added ThreadLocalQuarantineSizeKb.
				The following options are available:

				- quarantine_size_mb (integer, defaults to -1): the size (in Mb) of quarantine
				used to delay the actual deallocation of chunks. Lower value may reduce
				kcc (inline, Done): What do negative values mean?
				memory usage but decrease the effectiveness of the mitigation; a negative
				value will fall back to a default of 64Mb;

				- alloc_dealloc_mismatch (boolean, defaults to true): whether or not we report
				errors on malloc/delete, new/free, new/delete[], etc;

				- new_delete_size_mismatch (boolean, defaults to true): whether or not we
				report errors on mismatch between size of new and delete;

				- zero_chunk_contents (boolean, defaults to false): whether or not we zero
				chunk contents on allocation and deallocation.
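
For illustration only, and not part of the patch: an options string in the sanitizer style is a list of `name=value` pairs, for example `quarantine_size_mb=64:zero_chunk_contents=true`. The sketch below parses such a string into a map. It assumes `:`, `,` and space as separators and does not use the sanitizer runtime's own flag parser; `parseOptions` is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical parser for a "name=value:name=value" options string.
// Items without '=' (including empty items) are silently skipped.
std::map<std::string, std::string> parseOptions(const std::string &Text) {
  std::map<std::string, std::string> Result;
  std::string::size_type Pos = 0;
  while (Pos < Text.size()) {
    std::string::size_type End = Text.find_first_of(":, ", Pos);
    if (End == std::string::npos)
      End = Text.size();
    const std::string Item = Text.substr(Pos, End - Pos);
    const std::string::size_type Eq = Item.find('=');
    if (Eq != std::string::npos)
      Result[Item.substr(0, Eq)] = Item.substr(Eq + 1);
    Pos = End + 1;
  }
  return Result;
}
```

With the environment variable named in the text above, a user would set something like `SCUDO_OPTIONS=quarantine_size_mb=64:zero_chunk_contents=true` before starting the program.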

				filcab (inline, Not Done): If we're making a "hardened allocator", shouldn't the default be to zero out all chunks? (of course, performance would suffer a lot. I'm just curious if there are other reasons)

projects/compiler-rt/cmake/config-ix.cmake

[186 lines not shown]
 set(ALL_PROFILE_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64} ${PPC64}
 ${MIPS32} ${MIPS64})
 set(ALL_TSAN_SUPPORTED_ARCH ${X86_64} ${MIPS64} ${ARM64} ${PPC64})
 set(ALL_UBSAN_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64}
 ${MIPS32} ${MIPS64} ${PPC64} ${S390X})
 set(ALL_SAFESTACK_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM64} ${MIPS32} ${MIPS64})
 set(ALL_CFI_SUPPORTED_ARCH ${X86} ${X86_64} ${MIPS64})
 set(ALL_ESAN_SUPPORTED_ARCH ${X86_64})
+set(ALL_HARDENED_ALLOCATOR_SUPPORTED_ARCH ${X86_64})
filcab (inline, Not Done): I know safestack, cfi, and ESan don't follow this, but we can probably put the HARDENED_ALLOCATOR stuff in alphabetical order with the other stuff above.

 if(APPLE)
 include(CompilerRTDarwinUtils)

 find_darwin_sdk_dir(DARWIN_osx_SYSROOT macosx)
 find_darwin_sdk_dir(DARWIN_iossim_SYSROOT iphonesimulator)
 find_darwin_sdk_dir(DARWIN_ios_SYSROOT iphoneos)
 find_darwin_sdk_dir(DARWIN_watchossim_SYSROOT watchsimulator)
[170 lines not shown]
 list_intersect(SAFESTACK_SUPPORTED_ARCH
 ALL_SAFESTACK_SUPPORTED_ARCH
 SANITIZER_COMMON_SUPPORTED_ARCH)
 list_intersect(CFI_SUPPORTED_ARCH
 ALL_CFI_SUPPORTED_ARCH
 SANITIZER_COMMON_SUPPORTED_ARCH)
 list_intersect(ESAN_SUPPORTED_ARCH
 ALL_ESAN_SUPPORTED_ARCH
 SANITIZER_COMMON_SUPPORTED_ARCH)
+list_intersect(HARDENED_ALLOCATOR_SUPPORTED_ARCH
+ALL_HARDENED_ALLOCATOR_SUPPORTED_ARCH
+SANITIZER_COMMON_SUPPORTED_ARCH)
 else()
 # Architectures supported by compiler-rt libraries.
 filter_available_targets(SANITIZER_COMMON_SUPPORTED_ARCH
 ${ALL_SANITIZER_COMMON_SUPPORTED_ARCH})
 # LSan and UBSan common files should be available on all architectures
 # supported by other sanitizers (even if they build into dummy object files).
 filter_available_targets(LSAN_COMMON_SUPPORTED_ARCH
 ${SANITIZER_COMMON_SUPPORTED_ARCH})
 filter_available_targets(UBSAN_COMMON_SUPPORTED_ARCH
 ${SANITIZER_COMMON_SUPPORTED_ARCH})
 filter_available_targets(ASAN_SUPPORTED_ARCH ${ALL_ASAN_SUPPORTED_ARCH})
 filter_available_targets(DFSAN_SUPPORTED_ARCH ${ALL_DFSAN_SUPPORTED_ARCH})
 filter_available_targets(LSAN_SUPPORTED_ARCH ${ALL_LSAN_SUPPORTED_ARCH})
 filter_available_targets(MSAN_SUPPORTED_ARCH ${ALL_MSAN_SUPPORTED_ARCH})
 filter_available_targets(PROFILE_SUPPORTED_ARCH ${ALL_PROFILE_SUPPORTED_ARCH})
 filter_available_targets(TSAN_SUPPORTED_ARCH ${ALL_TSAN_SUPPORTED_ARCH})
 filter_available_targets(UBSAN_SUPPORTED_ARCH ${ALL_UBSAN_SUPPORTED_ARCH})
 filter_available_targets(SAFESTACK_SUPPORTED_ARCH
 ${ALL_SAFESTACK_SUPPORTED_ARCH})
 filter_available_targets(CFI_SUPPORTED_ARCH ${ALL_CFI_SUPPORTED_ARCH})
 filter_available_targets(ESAN_SUPPORTED_ARCH ${ALL_ESAN_SUPPORTED_ARCH})
+filter_available_targets(HARDENED_ALLOCATOR_SUPPORTED_ARCH
+${ALL_HARDENED_ALLOCATOR_SUPPORTED_ARCH})
 endif()

 if (MSVC)
 # See if the DIA SDK is available and usable.
 set(MSVC_DIA_SDK_DIR "$ENV{VSINSTALLDIR}DIA SDK")
 if (IS_DIRECTORY ${MSVC_DIA_SDK_DIR})
 set(CAN_SYMBOLIZE 1)
 else()
[98 lines not shown]
 endif()

 if (COMPILER_RT_HAS_SANITIZER_COMMON AND ESAN_SUPPORTED_ARCH AND
 OS_NAME MATCHES "Linux")
 set(COMPILER_RT_HAS_ESAN TRUE)
 else()
 set(COMPILER_RT_HAS_ESAN FALSE)
 endif()

+if (COMPILER_RT_HAS_SANITIZER_COMMON AND HARDENED_ALLOCATOR_SUPPORTED_ARCH AND
+OS_NAME MATCHES "Linux")
+set(COMPILER_RT_HAS_HARDENED_ALLOCATOR TRUE)
+else()
+set(COMPILER_RT_HAS_HARDENED_ALLOCATOR FALSE)
+endif()

projects/compiler-rt/lib/CMakeLists.txt

[46 lines not shown]
 if(COMPILER_RT_BUILD_SANITIZERS)

 if(COMPILER_RT_HAS_CFI)
 add_subdirectory(cfi)
 endif()

 if(COMPILER_RT_HAS_ESAN)
 add_subdirectory(esan)
 endif()

+if(COMPILER_RT_HAS_HARDENED_ALLOCATOR)
+add_subdirectory(hardened_allocator)
+endif()
filcab (inline, Not Done): Same here.
 endif()

projects/compiler-rt/lib/hardened_allocator/CMakeLists.txt

				add_custom_target(hardened_allocator)

				include_directories(..)

				set(HARDENED_ALLOCATOR_CFLAGS ${SANITIZER_COMMON_CFLAGS})
				append_rtti_flag(OFF HARDENED_ALLOCATOR_CFLAGS)
				list(APPEND HARDENED_ALLOCATOR_CFLAGS -msse4.2)

				set(HARDENED_ALLOCATOR_SOURCES
				scudo_allocator.cc
				scudo_flags.cc
				scudo_malloc_linux.cc
				scudo_new_delete.cc
				scudo_rtl.cc
				scudo_utils.cc)

				if(COMPILER_RT_HAS_HARDENED_ALLOCATOR)
				foreach(arch ${HARDENED_ALLOCATOR_SUPPORTED_ARCH})
				add_compiler_rt_runtime(clang_rt.hardened_allocator
				STATIC
				ARCHS ${arch}
				SOURCES ${HARDENED_ALLOCATOR_SOURCES}
				$<TARGET_OBJECTS:RTInterception.${arch}>
				$<TARGET_OBJECTS:RTSanitizerCommon.${arch}>
				$<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>
				CFLAGS ${HARDENED_ALLOCATOR_CFLAGS}
				PARENT_TARGET hardened_allocator)
				endforeach()
				endif()

				add_dependencies(compiler-rt hardened_allocator)

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.h

				//===-- scudo_allocator.h ---------------------------------------*- C++ -*-===//
				//
				glider (inline, Not Done): I don't insist much, but I think either the library name should be "scudo" instead of "hardened_allocator", or the names of the files under hardened_allocator/ should start with "hardened_allocator_".
				kcc (inline, Not Done): I initially proposed to name the dir hardened_allocator to make it more self-descriptive. But if others don't mind having a dir named "scudo", let the author decide.
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Header for scudo_allocator.cc.
				///
				//===----------------------------------------------------------------------===//

				#ifndef SCUDO_ALLOCATOR_H_
				#define SCUDO_ALLOCATOR_H_

				#ifndef __x86_64__
				# error "The Scudo hardened allocator currently only supports x86_64."
				#endif
				kcc (inline, Done): is currently only supported? or "supports x86_64"?

				#include "scudo_flags.h"

				#include "sanitizer_common/sanitizer_allocator.h"

				namespace __scudo {

				// We have to redefine CHECK_IMPL, as the __sanitizer one involves calling a
				// CheckFailedCallback function, which could be abused by a potential attacker.
				#ifdef CHECK_IMPL
				#undef CHECK_IMPL
				kcc (inline, Done): So, you should not need this any more, let's remove it now. I'll make one more pass afterwards (Monday-ish).
				#endif

				#define CHECK_IMPL(c1, op, c2) \
				do { \
				__sanitizer::u64 v1 = (u64)(c1); \
				__sanitizer::u64 v2 = (u64)(c2); \
				if (UNLIKELY(!(v1 op v2))) \
				__scudo::CheckFailed(__FILE__, __LINE__, \
				"(" #c1 ") " #op " (" #c2 ")", v1, v2); \
				} while (false) \
				/**/

				filcab (inline, Not Done): Why the empty comment trailing the `while (false)`?
				// We will also use our own CheckFailed and Die functions, once again to avoid
				cryptoad (author, inline, Not Done): This is actually a straight copy from sanitizer_internal_defs.h, just replacing CheckFailed. So I left it as is. This will go away when I redo the CHECK_IMPL logic to follow kcc@'s suggestion to implement templated failure functions.
				// the __sanitizer ones that have callbacks.
				void NORETURN
				CheckFailed(const char *file, int line, const char *cond, u64 v1, u64 v2);
				void NORETURN Die();

				enum AllocType : u8 {
				FromMalloc = 0, // Memory block came from malloc, realloc, calloc, etc.
				kcc (inline, Done): Does the following block of code have to be in this header? Why not in .cc?
				FromNew = 1, // Memory block came from operator new.
				FromNewArray = 2, // Memory block came from operator new [].
				FromMemalign = 3, // Memory block came from memalign, posix_memalign, etc.
				};

				struct AllocatorOptions {
				u32 QuarantineSizeMb;
				u32 ThreadLocalQuarantineSizeKb;
				vitalybuka (inline, Done): QuarantineSizeMb;
				bool MayReturnNull;
				bool DeallocationTypeMismatch;
				bool DeleteSizeMismatch;
				bool ZeroContents;

				void SetFrom(const Flags *f, const CommonFlags *cf);
				void CopyTo(Flags *f, CommonFlags *cf) const;
				};

				void InitializeAllocator(const AllocatorOptions &options);
				void DrainQuarantine();

				const uptr AllocatorSpace = ~0ULL;
				vitalybuka (inline, Done): const uptr AllocatorSpace =
				const uptr AllocatorSize = 0x10000000000ULL;
				const uptr MinAlignmentLog = 4; // 16 bytes for x64
				vitalybuka (inline, Done): static is not needed
				const uptr MaxAlignmentLog = 24;

				typedef DefaultSizeClassMap SizeClassMap;
				typedef SizeClassAllocator64<AllocatorSpace, AllocatorSize, 0, SizeClassMap>
				PrimaryAllocator;
				typedef SizeClassAllocatorLocalCache<PrimaryAllocator> AllocatorCache;
				typedef LargeMmapAllocator<> SecondaryAllocator;
				typedef CombinedAllocator<PrimaryAllocator, AllocatorCache, SecondaryAllocator>
				ScudoAllocator;

				void *scudoMalloc(uptr Size, AllocType Type);
				vitalybuka (inline, Done): scudoMalloc(uptr Size, AllocType AllocType)
				void scudoFree(void *Ptr, AllocType Type);
				void scudoSizedFree(void *Ptr, uptr Size, AllocType Type);
				void *scudoRealloc(void *Ptr, uptr Size);
				void *scudoCalloc(uptr NMemB, uptr Size);
				void *scudoMemalign(uptr Alignment, uptr Size);
				void *scudoValloc(uptr Size);
				void *scudoPvalloc(uptr Size);
				int scudoPosixMemalign(void **MemPtr, uptr Alignment, uptr Size);
				void *scudoAlignedAlloc(uptr Alignment, uptr Size);
				uptr scudoMallocUsableSize(void *Ptr);

				} // namespace __scudo

				#endif // SCUDO_ALLOCATOR_H_
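
For illustration only, and not part of the patch: the AllocType value passed to scudoMalloc and scudoFree in the header above is what allows allocation/deallocation mismatches to be detected. A minimal sketch of such a check follows. The function name `isDeallocationTypeMismatch` is hypothetical, and the exception for memalign'd chunks released through free() is an assumption of this sketch, not something the header states.

```cpp
#include <cassert>
#include <cstdint>

// Same values as the AllocType enum in the header above.
enum AllocType : uint8_t {
  FromMalloc = 0,
  FromNew = 1,
  FromNewArray = 2,
  FromMemalign = 3,
};

// Hypothetical check: compares the type recorded in the chunk header at
// allocation time with the type implied by the deallocation function.
bool isDeallocationTypeMismatch(AllocType AllocatedWith, AllocType FreedWith) {
  // Assumption: memalign'd memory is legitimately released with free(),
  // so that pairing is not treated as a mismatch.
  if (AllocatedWith == FromMemalign && FreedWith == FromMalloc)
    return false;
  return AllocatedWith != FreedWith;
}
```

In this model, `new` paired with `free()` or `new[]` paired with `delete` is reported, while `malloc` paired with `free()` is not.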

projects/compiler-rt/lib/hardened_allocator/scudo_allocator.cc

				//===-- scudo_allocator.cc --------------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Scudo Hardened Allocator implementation.
				/// It uses the sanitizer_common allocator as a base and aims at mitigating
				/// heap corruption vulnerabilities. It provides a checksum-guarded chunk
				/// header, a delayed free list, and additional sanity checks.
				///
				glider (inline, Done): Please either elaborate what other "various security improvements" are there, or remove that phrase.
				//===----------------------------------------------------------------------===//

				#include "scudo_allocator.h"
				#include "scudo_utils.h"

				#include "sanitizer_common/sanitizer_allocator_interface.h"
				#include "sanitizer_common/sanitizer_quarantine.h"

				#include <limits.h>
				#include <pthread.h>
				#include <smmintrin.h>

				#include <cstring>

				namespace __scudo {

				void NORETURN Die() {
				if (common_flags()->abort_on_error)
				Abort();
				filcab (inline, Done): I would remove the TODO. If we're in a debugger, Abort() might trigger the debugger and stop the program execution. exit() will simply close the program. We've added the abort in ASan, back in the day (D12332), due to this. It's the default for ASan on OS X and on the PS4.
				internal__exit(common_flags()->exitcode);
				}

				void NORETURN CheckFailed(const char *file, int line, const char *cond,
				u64 v1, u64 v2) {
				// FIXME: currently using sanitizer's Printf. We might want to use
				// something less complex to avoid potential issues.
				Printf("CHECK failed: %s:%d %s (%lld, %lld)\n", file, line, cond, v1, v2);
				Die();
				}

				static ScudoAllocator &getAllocator();

				static thread_local Xorshift128Plus Prng;
				// Global static cookie, initialized at start-up.
				static u64 Cookie;
				filcab (inline, Done): It's very likely a smaller price to pay than to have to use atomic updates + spin on update collisions.

				enum ChunkState : u8 {
				ChunkAvailable = 0,
				ChunkAllocated = 1,
				ChunkQuarantine = 2
				vitalybuka (inline, Done): enum ChunkState : u8 { ChunkAvailable = 0, };
				};

				typedef unsigned __int128 PackedHeader;

				// Our header requires 128-bit of storage on x64 (the only platform supported
				// as of now), which fits nicely with the alignment requirements.
				// Having the offset saves us from using functions such as GetBlockBegin, that
				filcab (inline, Done): I could do without the `u128` typedef. You're only using it for the `PackedHeader` typedef.
				// is fairly costly. Our first implementation used the MetaData as well, which
				// offers the advantage of being stored away from the chunk itself, but
				// accessing it was costly as well.
				kcc (inline, Not Done): align the comment block
				// The header will be atomically loaded and stored using the 16-byte primitives
				dvyukov (inline, Done): It's better to comment right on the fields rather than duplicate them here. The comment has good chances of getting outdated. It's also harder to find the relevant part of the comment for a particular field. E.g.: `u8 state : 2; // available, allocated, or quarantined`. Comments like 'salt' on the 'salt' field are excessive, drop them.
				// offered by the platform (likely requires cmpxchg16b support).
				struct UnpackedHeader {
				// 1st 8 bytes
				u16 checksum : 16;
				u64 requested_size : 40; // Needed for reallocation purposes.
				u8 state : 2; // available, allocated, or quarantined
				u8 alloc_type : 2; // malloc, new, new[], or memalign
				u8 unused_0_ : 4;
				// 2nd 8 bytes
				u64 offset : 20; // Offset from the beginning of the backend
				// allocation to the beginning chunk itself, in
				// multiples of MinAlignment. See comment about its
				// maximum value and test in Initialize.
				u64 unused_1_ : 28;
				u16 salt : 16;
				};
				vitalybuka (inline, Done): I see why you need 128bit for structure. Why do you need u128 type for 2 bit members?
				kcc (inline, Not Done): I actually like it this way. The intent is more clear.

				COMPILER_CHECK(sizeof(UnpackedHeader) == sizeof(PackedHeader));
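
For illustration only, and not part of the patch: the size check above can be reproduced with a standalone struct. The sketch below uses `uint64_t` for every bit-field (the patch mixes u8, u16 and u64) so that the two 64-bit halves pack predictably on common compilers, and it shows how the stored offset maps a chunk address back to the start of the backend allocation. `ToyHeader` and `allocBeg` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// 16-byte minimum alignment, matching MinAlignmentLog in scudo_allocator.h.
constexpr uintptr_t MinAlignmentLog = 4;

// Hypothetical stand-in for UnpackedHeader with the same field widths.
struct ToyHeader {
  // First 8 bytes.
  uint64_t Checksum : 16;
  uint64_t RequestedSize : 40;
  uint64_t State : 2;
  uint64_t AllocType : 2;
  uint64_t Unused0 : 4;
  // Second 8 bytes.
  uint64_t Offset : 20; // In multiples of the minimum alignment.
  uint64_t Unused1 : 28;
  uint64_t Salt : 16;
};

// Compile-time equivalent of the COMPILER_CHECK above: the header must fit
// exactly in 128 bits so it can be loaded and stored as a single unit.
static_assert(sizeof(ToyHeader) == 16, "header must pack into 128 bits");

// Start of the backend allocation, computed from a local copy of the header
// so the offset is read only once.
uintptr_t allocBeg(uintptr_t ChunkAddr, const ToyHeader &Header) {
  return ChunkAddr -
         (static_cast<uintptr_t>(Header.Offset) << MinAlignmentLog);
}
```

With a 20-bit offset counted in 16-byte units, the largest distance that can be encoded is (2^20 - 1) * 16 bytes, just under 16 MiB (2^24 bytes), which is the relationship between the field width and MaxAlignmentLog = 24 that the inline comments below discuss.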

				const uptr ChunkHeaderSize = sizeof(PackedHeader);

				struct ScudoChunk : UnpackedHeader {
				// We can't use the offset member of the chunk itself, as we would double
				dvyukov (inline, Done): Don't we need only 20 bits here?
				cryptoad (author, inline, Done): This is indeed the case. I figured I would align that one to a multiple of 8 bits as we have some space in the second half of the 128-bit integer. I am not opposed to shortening to the actual needed bit size if you feel strongly about it.
				// fetch it without any warranty that it wouldn't have been tampered. To
				dvyukov (inline, Done): Does it improve generated code? Leaving it as 24 is OK in that case, but it needs to be explained in comments. Width of that field has crucial implicit relation with Min/MaxAlignment. When I double-checked the width, I found that it's not what I would expect it to be. Which means that either I am missing something else important here, or there is a bug, or things get out of sync. This uncertainty is very unpleasant and takes time for anybody reading the code.
				// prevent this, we work with a local copy of the header.
				void *AllocBeg(UnpackedHeader *Header) {
				return reinterpret_cast<void *>(
				reinterpret_cast<uptr>(this) - (Header->offset << MinAlignmentLog));
				}
				filcab (inline, Done): I'd suggest: COMPILER_CHECK(sizeof(UnpackedHeader) == sizeof(PackedHeader)); Since it makes it explicit we want those two to be the same and that it's not a coincidence that we're expecting both to "happen to be" the same size as `u128`. The `sizeof(PackedHeader)` isn't needed, since it's a typedef for `u128` (even after my proposed change, it won't be needed).

				// CRC32 checksum of the Chunk pointer and its ChunkHeader.
				filcab: Remove the `static` here, since you're not adding it to other const-qualified variables when it isn't needed.
				// It currently uses the Intel Nehalem SSE4.2 crc32 64-bit instruction.
				u16 Checksum(UnpackedHeader *Header) const {
				u64 HeaderHolder[2];
				memcpy(HeaderHolder, Header, sizeof(HeaderHolder));
				u64 Crc = _mm_crc32_u64(Cookie, reinterpret_cast<uptr>(this));
				// This is somewhat of a shortcut. The checksum is stored in the 16 least
				// significant bits of the first 8 bytes of the header, hence zero-ing
				// those bits out. It would be more valid to zero the checksum field of the
				// UnpackedHeader, but would require holding an additional copy of it.
				dvyukov: I am missing the relation between the requirement to not load the header a second time and making the function static. Why not: `void *AllocBeg(UnpackedHeader *header)`?
				Crc = _mm_crc32_u64(Crc, HeaderHolder[0] & 0xffffffffffff0000ULL);
				Crc = _mm_crc32_u64(Crc, HeaderHolder[1]);
				return static_cast<u16>(Crc);
				}
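The masking trick in `Checksum` can be tried without SSE4.2 hardware. The following is a minimal sketch, assuming a bitwise software CRC32-C as a portable stand-in for `_mm_crc32_u64`; the helper names `crc32cU64` and `headerChecksum` are hypothetical and not part of the patch, and the sketch does not claim to produce the same values as the intrinsic.

```cpp
#include <cstdint>

// Software CRC32-C step over one 64-bit word, least significant byte first.
// Hypothetical portable stand-in for the SSE4.2 intrinsic used in the patch.
static uint32_t crc32cU64(uint32_t Crc, uint64_t Data) {
  for (int Byte = 0; Byte < 8; ++Byte) {
    Crc ^= static_cast<uint8_t>(Data >> (8 * Byte));
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0x82F63B78u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Checksums the chunk address and the two header words, ignoring the 16 low
// bits of the first word, where the checksum itself is assumed to be stored.
static uint16_t headerChecksum(uint32_t Cookie, uint64_t ChunkAddress,
                               const uint64_t Header[2]) {
  uint32_t Crc = crc32cU64(Cookie, ChunkAddress);
  Crc = crc32cU64(Crc, Header[0] & 0xffffffffffff0000ULL);
  Crc = crc32cU64(Crc, Header[1]);
  return static_cast<uint16_t>(Crc);
}
```

Because the low 16 bits are masked before hashing, writing the checksum into the header does not change the value that a later verification recomputes.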

				kcc: I suggest to replace all cases of `if (!cond) { Printf() Die() }` with `if (!cond) DieWithMessage();`. This is using the Printf from sanitizer_common, right? It might be worth replacing it with your own, simpler one. If you agree, just leave a TODO near DieWithMessage and address it later.
				cryptoad: There is also a PrintfAndReportCallback callback that I just noticed. I will have to address that later as well.
				// Loads and unpacks the header, verifying the checksum in the process.
				void loadHeader(UnpackedHeader *unpacked_header) const {
				PackedHeader packed_header;
				__atomic_load(reinterpret_cast<const PackedHeader *>(this), &packed_header,
				__ATOMIC_RELAXED);
				*unpacked_header = bit_cast<UnpackedHeader>(packed_header);
				glider: I would've started with a handwritten crc32 implementation and bothered with hardware support only if it's performance-critical (I don't think it is).
				kcc: I am pretty sure it is performance critical. If this gets used on older or non-x86 systems we can add other implementations later.
				if (unpacked_header->checksum != Checksum(unpacked_header)) {
				Printf("ERROR: corrupted chunk header at address %p\n", this);
				filcab: No need to do this. Much better to just handle it on the CMake side (which you're already doing).
				Die();
				}
				}

				// Packs and stores the header, computing the checksum in the process.
				void storeHeader(UnpackedHeader *new_unpacked_header) {
				filcab: "... 16 least significant bits of the header of the first 8 bytes..."
				new_unpacked_header->checksum = Checksum(new_unpacked_header);
				PackedHeader new_packed_header =
				bit_cast<PackedHeader>(*new_unpacked_header);
				dvyukov: Add a bold comment to the checksum field that Checksum expects it to be the low 16 bits. And maybe add some debug check here.
				__atomic_store(reinterpret_cast<PackedHeader *>(this), &new_packed_header,
				__ATOMIC_RELAXED);
				dvyukov: Won't it do to initialize crc to cookie as: `u64 crc = _mm_crc32_u64(cookie, reinterpret_cast<uptr>(this));`?
				}

				// Packs and stores the header, computing the checksum in the process. We
				// compare the current header with the expected provided one to ensure that
				// we are not being raced by a corruption occurring in another thread.
				void compareExchangeHeader(UnpackedHeader *new_unpacked_header,
				filcab: Why are you not using the C++11 atomics?
				cryptoad: Using std::atomic<unsigned __int128>?
				filcab: Did `std::atomic<unsigned __int128>` not work/was too slow?
				cryptoad: It's in the works :)
				UnpackedHeader *old_unpacked_header) {
				dvyukov: Why does this need to be acquire? Please comment.
				filcab: Yeah, should be nicer. Unless it's a problem due to something I'm not thinking of, then I'd rather have more standard constructs (even though there's no guarantee of an `__int128` type, let alone an `std::atomic<__int128>`, AFAICT libstdc++ and libc++ will have those implemented).
				new_unpacked_header->checksum = Checksum(new_unpacked_header);
				PackedHeader new_packed_header =
				bit_cast<PackedHeader>(*new_unpacked_header);
				PackedHeader old_packed_header =
				bit_cast<PackedHeader>(*old_unpacked_header);
				if (!__atomic_compare_exchange(reinterpret_cast<PackedHeader *>(this),
				&old_packed_header,
				&new_packed_header,
				false,
				__ATOMIC_RELAXED,
				__ATOMIC_RELAXED)) {
				Printf("ERROR: race on chunk header at address %p\n", this);
				Die();
				}
				}
				};
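The race detection in `compareExchangeHeader` relies on a compare-and-swap that only succeeds if the header still holds the value loaded earlier. A minimal sketch of that idea with C++11 atomics follows, assuming a 64-bit word in place of the 128-bit packed header; `transitionHeader` and the state constants are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical chunk states packed into a 64-bit word for the sketch.
enum : uint64_t { StateAllocated = 1, StateQuarantine = 2 };

// Attempts the Expected -> Desired transition. A false return means the header
// no longer held Expected, e.g. a concurrent free or a corruption.
static bool transitionHeader(std::atomic<uint64_t> &Header, uint64_t Expected,
                             uint64_t Desired) {
  return Header.compare_exchange_strong(Expected, Desired,
                                        std::memory_order_relaxed,
                                        std::memory_order_relaxed);
}
```

With this shape, a second attempt to move the same chunk from allocated to quarantined fails, which is the condition the patch turns into a fatal "race on chunk header" error.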

				static pthread_once_t GlobalInited = PTHREAD_ONCE_INIT;
				dvyukov: Why release?
				static thread_local bool ThreadInited;
				static pthread_key_t pkey;
				static thread_local AllocatorCache cache;

				static void teardownThread(void *p) {
				uptr v = reinterpret_cast<uptr>(p);
				// The glibc POSIX thread-local-storage deallocation routine calls user
				// provided destructors in a loop of PTHREAD_DESTRUCTOR_ITERATIONS.
				glider: Shouldn't the success memory order be __ATOMIC_RELEASE?
				dvyukov: Why acquire?
				// We want to be called last since other destructors might call free and the
				dvyukov: Looks pointless.
				// like, so we wait until PTHREAD_DESTRUCTOR_ITERATIONS before draining the
				// quarantine and swallowing the cache.
				if (v < PTHREAD_DESTRUCTOR_ITERATIONS) {
				pthread_setspecific(pkey, reinterpret_cast<void *>(v + 1));
				return;
				}
				DrainQuarantine();
				getAllocator().DestroyCache(&cache);
				}
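The re-arming destructor used by `teardownThread` can be observed in isolation. The sketch below assumes a POSIX threads implementation that repeats destructor rounds up to `PTHREAD_DESTRUCTOR_ITERATIONS`; the counters and the `runOneThread` helper are hypothetical and exist only to make the behaviour visible.

```cpp
#include <pthread.h>
#include <atomic>
#include <climits>
#include <cstdint>

static pthread_key_t Key;
static std::atomic<int> DestructorCalls{0};
static std::atomic<int> FinalCleanups{0};

// Re-arms itself until the counter reaches PTHREAD_DESTRUCTOR_ITERATIONS, then
// performs the (hypothetical) final cleanup.
static void teardown(void *Value) {
  uintptr_t Round = reinterpret_cast<uintptr_t>(Value);
  ++DestructorCalls;
  if (Round < static_cast<uintptr_t>(PTHREAD_DESTRUCTOR_ITERATIONS)) {
    pthread_setspecific(Key, reinterpret_cast<void *>(Round + 1));
    return;
  }
  ++FinalCleanups;
}

static void *threadBody(void *) {
  // Start the counter at 1, as initThread does in the patch.
  pthread_setspecific(Key, reinterpret_cast<void *>(1));
  return nullptr;
}

// Runs one thread to completion and returns how many final cleanups happened.
static int runOneThread() {
  pthread_key_create(&Key, teardown);
  pthread_t Thread;
  pthread_create(&Thread, nullptr, threadBody, nullptr);
  pthread_join(Thread, nullptr);
  return FinalCleanups.load();
}
```

As the review points out, this only delays the cleanup to the last destructor round; a `free` issued by another destructor in that same round can still arrive after the cache has been destroyed.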

				static void initGlobal() {
				pthread_key_create(&pkey, teardownThread);
				}
				glider: Please remind me whether Scudo is going to be used together with any of the sanitizers. If yes, the destructor magic probably won't work as intended, because other tools also play with it.
				kcc: AFAICT, scudo will not be combinable with any of the sanitizers other than ubsan.

				static void NOINLINE initThread() {
				pthread_once(&GlobalInited, initGlobal);
				pthread_setspecific(pkey, reinterpret_cast<void *>(1));
				getAllocator().InitCache(&cache);
				ThreadInited = true;
				}

				struct QuarantineCallback {
				explicit QuarantineCallback(AllocatorCache *cache)
				: cache_(cache) {}

				dvyukov: This will leak memory. Destructors run in FIFO order, so later-created user dtors can run after you. Plus pthread frees the thread stack and pthread_specific regions after running pthread_specific dtors.
				// Chunk recycling function, returns a quarantined chunk to the backend.
				cryptoad: For this, I used the same technique that is used in ASan's PlatformTSDDtor, as it seems to be the most viable one. I am not sure what alternative would work here.
				dvyukov: If a free comes after we drained local cache, asan uses a global cache. Grep for "fallback" in asan_allocator.cc. Tsan now uses the same. It sucks. But I don't see how to do better. We need to detect when a thread is actually finished, but it's tricky to do with pthread_join API.
				void Recycle(ScudoChunk *chunk) {
				UnpackedHeader header;
				chunk->loadHeader(&header);
				if (header.state != ChunkQuarantine) {
				Printf("ERROR: invalid chunk state when recycling address %p\n",
				chunk);
				Die();
				dvyukov: Why is initGlobal not called from ScudoInitInternal? If you expect that malloc can come before ScudoInitInternal, then you also need to call ScudoInitInternal from initThread. Otherwise it won't work anyway.
				}
				void *ptr = chunk->AllocBeg(&header);
				getAllocator().Deallocate(cache_, ptr);
				}

				/// Internal quarantine allocation and deallocation functions.
				void *Allocate(uptr size) {
				// The internal quarantine memory cannot be protected by us. But the only
				// structures allocated are QuarantineBatch, that are 8KB for x64. So we
				// will use mmap for those, and given that Deallocate doesn't pass a size
				// in, we enforce the size of the allocation to be sizeof(QuarantineBatch).
				// TODO(kostyak): switching to mmap greatly impacts performance, we have
				// to find another solution
				// CHECK_EQ(size, sizeof(QuarantineBatch));
				// return MmapOrDie(size, "QuarantineBatch");
				return getAllocator().Allocate(cache_, size, 1, false);
				}

				void Deallocate(void *ptr) {
				// UnmapOrDie(ptr, sizeof(QuarantineBatch));
				getAllocator().Deallocate(cache_, ptr);
				}

				AllocatorCache *cache_;
				};

				typedef Quarantine<QuarantineCallback, ScudoChunk> ScudoQuarantine;
				typedef ScudoQuarantine::Cache QuarantineCache;
				static thread_local QuarantineCache quarantine_cache;

				void AllocatorOptions::SetFrom(const Flags *f, const CommonFlags *cf) {
				MayReturnNull = cf->allocator_may_return_null;
				QuarantineSizeMb = f->QuarantineSizeMb;
				ThreadLocalQuarantineSizeKb = f->ThreadLocalQuarantineSizeKb;
				DeallocationTypeMismatch = f->DeallocationTypeMismatch;
				DeleteSizeMismatch = f->DeleteSizeMismatch;
				ZeroContents = f->ZeroContents;
				}

				void AllocatorOptions::CopyTo(Flags *f, CommonFlags *cf) const {
				cf->allocator_may_return_null = MayReturnNull;
				f->QuarantineSizeMb = QuarantineSizeMb;
				f->ThreadLocalQuarantineSizeKb = ThreadLocalQuarantineSizeKb;
				f->DeallocationTypeMismatch = DeallocationTypeMismatch;
				f->DeleteSizeMismatch = DeleteSizeMismatch;
				f->ZeroContents = ZeroContents;
				dvyukov: Why is this commented out? Looks cleaner.
				}
				filcab: Why?
				cryptoad: Sorry, this was a remnant of a debugging session. I switched it back to the original plan, which was to use the thread_local QuarantineCache.

				struct Allocator {
				static const uptr MaxAllowedMallocSize = 1ULL << 40;
				static const uptr MinAlignment = 1 << MinAlignmentLog;
				static const uptr MaxAlignment = 1 << MaxAlignmentLog; // 16 MB

				ScudoAllocator allocator;
				ScudoQuarantine quarantine;

				bool DeallocationTypeMismatch;
				bool ZeroContents;
				bool DeleteSizeMismatch;

				explicit Allocator(LinkerInitialized)
				: quarantine(LINKER_INITIALIZED) {}

				void Initialize(const AllocatorOptions &options) {
				// Currently SSE 4.2 support is required. This might change later.
				CHECK(testCPUFeature(SSE4_2)); // for crc32

				glider: Are we going to target 32-bit systems? This is gonna overflow uptr on x86.
				filcab: If this ends up being ported, it's a simple matter of using the `FIRST_32_SECOND_64` macro.
				cryptoad: No plan to support 32-bit as of yet, but yes we will use FIRST_32_SECOND_64 if we do.
				kcc: no 32-bit for now (or ever?)
				// Verify that the header offset field can hold the maximum offset. In the
				// worst case scenario, the backend allocation is already aligned on
				// MaxAlignment, so in order to store the header and still be aligned, we
				// add an extra MaxAlignment. As a result, the offset from the beginning of
				// the backend allocation to the chunk will be MaxAlignment -
				// ChunkHeaderSize.
				UnpackedHeader Header = {};
				uptr MaximumOffset = (MaxAlignment - ChunkHeaderSize) >> MinAlignmentLog;
				Header.offset = MaximumOffset;
				if (Header.offset != MaximumOffset) {
				Printf("ERROR: the maximum possible offset doesn't fit in the header\n");
				Die();
				}

				DeallocationTypeMismatch = options.DeallocationTypeMismatch;
				DeleteSizeMismatch = options.DeleteSizeMismatch;
				glider: Although SSE 4.2 may be quite common at Google, I don't think it's a good idea to bail out if it's unsupported. Note TestCPUFeature() doesn't work on AMD processors yet.
				filcab: Source files are compiled assuming that feature is available. We'll have to add a fallback checksum (plus change build and this check) to address this comment. I would be ok with keeping the SSE4.2 requirement until we get a non-zero amount of requests/bug reports. P.S: http://store.steampowered.com/hwsurvey (first result for "hardware survey". It's clearly biased, but I'd guess developer CPUs are also biased to be more recent/powerful than an average computer) puts SSE4.2 adoption at ~80%.
				glider: Well, IIUC right now the implementation just aborts for AMD processors, which are among those ~80%.
				filcab: I was just talking about the SSE4.2 part. Sorry, I was going to comment on the CPUID thing but forgot. Doing it now.
				kcc: s/late/later
				ZeroContents = options.ZeroContents;
				allocator.Init(options.MayReturnNull);
				quarantine.Init(static_cast<uptr>(options.QuarantineSizeMb) << 20,
				static_cast<uptr>(
				options.ThreadLocalQuarantineSizeKb) << 10);
				dvyukov: Do we want to sanity check `options.QuarantineSizeMb << 20`? What if it overflows?
				Cookie = Prng.Next();
				dvyukov: Make this tunable as well. If my program has 10000 threads, 1MB per thread is a lot.
				}
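The offset-width check in `Initialize` and the "only 20 bits" remark from the review can be worked through numerically. The sketch below assumes a 16-byte minimum alignment and a 16-byte header (the 128-bit packed header), together with the 16 MB maximum alignment noted in the code; the function names are hypothetical.

```cpp
#include <cstdint>

// Assumed constants: 16-byte minimum alignment, 16 MB maximum alignment, and a
// 16-byte (128-bit) chunk header.
constexpr uint64_t MinAlignmentLog = 4;
constexpr uint64_t MaxAlignmentLog = 24;
constexpr uint64_t ChunkHeaderSize = 16;

// Largest value the offset field must hold, in units of the minimum alignment.
constexpr uint64_t maximumOffset() {
  return ((1ULL << MaxAlignmentLog) - ChunkHeaderSize) >> MinAlignmentLog;
}

// True if Value is representable in an unsigned field of Bits bits.
constexpr bool fitsInBits(uint64_t Value, unsigned Bits) {
  return Bits >= 64 || (Value >> Bits) == 0;
}

static_assert(fitsInBits(maximumOffset(), 20), "offset needs at most 20 bits");
```

Under these assumptions the maximum offset is (2^24 - 16) / 16 = 2^20 - 1, so 20 bits suffice and a 24-bit field leaves headroom. Expressing the relation as a compile-time check keeps it from drifting silently if any of the three constants change.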

				// Allocates a chunk.
				void *Allocate(uptr size, uptr alignment, AllocType alloc_type) {
				if (UNLIKELY(!ThreadInited))
				initThread();
				if (!IsPowerOfTwo(alignment)) {
				Printf("ERROR: malloc alignment is not a power of 2\n");
				Die();
				dvyukov: s/alignment/malloc alignment/ So that the user can get at least some clue when she sees this on the console.
				}
				if (alignment > MaxAlignment)
				return allocator.ReturnNullOrDie();
				if (alignment < MinAlignment)
				alignment = MinAlignment;
				if (size == 0)
				size = 1;
				if (size >= MaxAllowedMallocSize)
				return allocator.ReturnNullOrDie();
				uptr rounded_size = RoundUpTo(size, MinAlignment);
				uptr extra_bytes = ChunkHeaderSize;
				if (alignment > MinAlignment)
				extra_bytes += alignment;
				uptr needed_size = rounded_size + extra_bytes;
				if (needed_size >= MaxAllowedMallocSize)
				return allocator.ReturnNullOrDie();
				vitalybuka: could be removed?
				void *ptr = allocator.Allocate(&cache, needed_size, MinAlignment);
				if (!ptr)
				return allocator.ReturnNullOrDie();

				uptr alloc_beg = reinterpret_cast<uptr>(ptr);
				uptr chunk_beg = alloc_beg + ChunkHeaderSize;
				if (!IsAligned(chunk_beg, alignment))
				chunk_beg = RoundUpTo(chunk_beg, alignment);
				CHECK_LE(chunk_beg + size, alloc_beg + needed_size);
				ScudoChunk *chunk =
				reinterpret_cast<ScudoChunk *>(chunk_beg - ChunkHeaderSize);
				UnpackedHeader header = {};
				header.state = ChunkAllocated;
				header.offset = (chunk_beg - ChunkHeaderSize - alloc_beg)
				>> MinAlignmentLog;
				header.alloc_type = alloc_type;
				dvyukov: It seems to me that we don't actually need with_offset and all the associated if's. You can just always store `(chunk_beg - alloc_beg) >> MinAlignmentLog` into header.offset and always subtract it from user_beg.
				header.requested_size = size;
				dvyukov: There is a very tricky, implicit relation between MinAlignment, MaxAlignment and the number of bits in offset. If any of these change in the future, we can get a nice attack vector due to offset overflow. Check that MaxAlignment/MinAlignment fits into offset during init.
				header.salt = static_cast<u16>(Prng.Next());
				chunk->storeHeader(&header);
				void *user_ptr = reinterpret_cast<void *>(chunk_beg);
				if (ZeroContents && allocator.FromPrimary(ptr))
				memset(user_ptr, 0, size);
				// TODO(kostyak): hooks sound like a terrible idea security wise but might
				// be needed for things to work properly?
				// if (&__sanitizer_malloc_hook) __sanitizer_malloc_hook(user_ptr, size);
				filcab: Probably best to zero the whole thing (`needed_size - ChunkHeaderSize`)?
				cryptoad: I guess we have several choices here:
				- move it up and zero the whole thing prior to the header being stored
				- leave it here and zero the whole thing post header, which will have to account for the offset (needed_size - (chunk_beg - alloc_beg))
				I think the first option would be better.
				return user_ptr;
				}
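The pointer arithmetic in `Allocate`, and its inverse in `AllocBeg`, can be checked on plain integers. This is a minimal sketch, assuming a 16-byte header and a 16-byte minimum alignment; `placeChunk` and `recoverAllocBeg` are hypothetical helpers that mirror the computation on addresses only.

```cpp
#include <cstdint>

constexpr uint64_t MinAlignmentLog = 4;   // assumed: 16-byte minimum alignment
constexpr uint64_t ChunkHeaderSize = 16;  // assumed: 128-bit header

struct Layout {
  uint64_t UserBeg;  // address returned to the caller
  uint64_t Offset;   // value stored in header.offset, in MinAlignment units
};

// Boundary must be a power of two.
static uint64_t roundUpTo(uint64_t Value, uint64_t Boundary) {
  return (Value + Boundary - 1) & ~(Boundary - 1);
}

// Places the user pointer at the first suitably aligned address that leaves
// room for the header, and records how far the header is from AllocBeg.
static Layout placeChunk(uint64_t AllocBeg, uint64_t Alignment) {
  uint64_t UserBeg = roundUpTo(AllocBeg + ChunkHeaderSize, Alignment);
  uint64_t HeaderBeg = UserBeg - ChunkHeaderSize;
  return {UserBeg, (HeaderBeg - AllocBeg) >> MinAlignmentLog};
}

// Inverse operation: recovers the backend pointer from the user pointer.
static uint64_t recoverAllocBeg(uint64_t UserBeg, uint64_t Offset) {
  return UserBeg - ChunkHeaderSize - (Offset << MinAlignmentLog);
}
```

For a backend block at 0x1000 and a 64-byte alignment, the user pointer lands at 0x1040 and the stored offset is 3, which leads back to 0x1000 on deallocation.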

				// Deallocates a Chunk, which means adding it to the delayed free list (or
				// Quarantine).
				void Deallocate(void *user_ptr, uptr delete_size, AllocType alloc_type) {
				if (UNLIKELY(!ThreadInited))
				initThread();
				// TODO(kostyak): see hook comment above
				// if (&__sanitizer_free_hook) __sanitizer_free_hook(user_ptr);
				if (!user_ptr)
				return;
				uptr chunk_beg = reinterpret_cast<uptr>(user_ptr);
				if (!IsAligned(chunk_beg, MinAlignment)) {
				Printf("ERROR: attempted to deallocate a chunk not properly aligned at "
				"address %p\n", user_ptr);
				Die();
				}
				ScudoChunk *chunk =
				reinterpret_cast<ScudoChunk *>(chunk_beg - ChunkHeaderSize);
				UnpackedHeader old_header;
				chunk->loadHeader(&old_header);
				if (old_header.state != ChunkAllocated) {
				Printf("ERROR: invalid chunk state when deallocating address %p\n",
				chunk);
				Die();
				}
				UnpackedHeader new_header = old_header;
				new_header.state = ChunkQuarantine;
				chunk->compareExchangeHeader(&new_header, &old_header);
				if (DeallocationTypeMismatch) {
				// The deallocation type has to match the allocation one
				if (new_header.alloc_type != alloc_type) {
				// With the exception of memalign'd Chunks, that can still be free'd
				if (new_header.alloc_type != FromMemalign || alloc_type != FromMalloc) {
				Printf("ERROR: allocation type mismatch on address %p\n", chunk);
				Die();
				}
				}
				filcab: Please push the negation through.
				}
				uptr size = new_header.requested_size;
				if (DeleteSizeMismatch) {
				if (delete_size && delete_size != size) {
				Printf("ERROR: invalid sized delete on chunk at address %p\n", chunk);
				Die();
				}
				}
				dvyukov: I wonder if delete_size can be 0 without meaning that delete_size was not passed, i.e. it is just legally zero. What is passed in for delete of an array with 0 elements?
				quarantine.Put(&quarantine_cache, QuarantineCallback(&cache), chunk, size);
				}
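The allocation-type check in `Deallocate` reads more easily once the negation is pushed through, as requested in the review. The sketch below assumes a reduced `AllocType` enumeration; only `FromMalloc` and `FromMemalign` appear in the code above, `FromNew` is added for illustration, and `isTypeMismatch` is a hypothetical helper.

```cpp
// Hypothetical subset of allocation types; only FromMalloc and FromMemalign
// appear in the code above.
enum AllocType { FromMalloc, FromNew, FromMemalign };

// Returns true when a chunk is being released by a function that does not
// match the one that allocated it.
static bool isTypeMismatch(AllocType AllocatedWith, AllocType FreedWith) {
  if (AllocatedWith == FreedWith)
    return false;
  // memalign'd chunks may legitimately be released with free().
  if (AllocatedWith == FromMemalign && FreedWith == FromMalloc)
    return false;
  return true;
}
```

This is logically the same condition as the nested `!=` and `||` tests in the patch, written as two positive exceptions followed by a default.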

				// Returns the actual usable size of a chunk. Since this requires loading the
				// header, we will return it in the second parameter, as it can be required
				// by the caller to perform additional processing.
				uptr UsableSize(const void *ptr, UnpackedHeader *header) {
				if (UNLIKELY(!ThreadInited))
				initThread();
				if (!ptr)
				return 0;
				uptr chunk_beg = reinterpret_cast<uptr>(ptr);
				ScudoChunk *chunk =
				reinterpret_cast<ScudoChunk *>(chunk_beg - ChunkHeaderSize);
				chunk->loadHeader(header);
				// Getting the usable size of a chunk only makes sense if it's allocated.
				if (header->state != ChunkAllocated) {
				Printf("ERROR: attempted to size a non-allocated chunk at address %p\n",
				chunk);
				Die();
				}
				uptr size = allocator.GetActuallyAllocatedSize(chunk->AllocBeg(header));
				// UsableSize works as malloc_usable_size, which is also what (AFAIU)
				// tcmalloc's MallocExtension::GetAllocatedSize aims at providing. This
				// means we will return the size of the chunk from the user beginning to
				vitalybuka: no Die?
				cryptoad: Good catch!
				// the end of the 'user' allocation, hence us subtracting the header size
				// and the offset from the size.
				if (size == 0)
				return size;
				return size - ChunkHeaderSize - (header->offset << MinAlignmentLog);
				}

				// Helper function that doesn't care about the header.
				uptr UsableSize(const void *Ptr) {
				UnpackedHeader Header;
				return UsableSize(Ptr, &Header);
				}

				// Reallocates a chunk. We can save on a new allocation if the new requested
				// size still fits in the chunk.
				void *Reallocate(void *old_ptr, uptr new_size) {
				if (UNLIKELY(!ThreadInited))
				initThread();
				UnpackedHeader old_header;
				uptr usable_size = UsableSize(old_ptr, &old_header);
				uptr chunk_beg = reinterpret_cast<uptr>(old_ptr);
				ScudoChunk *chunk =
				reinterpret_cast<ScudoChunk *>(chunk_beg - ChunkHeaderSize);
				if (old_header.alloc_type != FromMalloc) {
				Printf("ERROR: invalid chunk type when reallocating address %p\n",
				chunk);
				Die();
				}
				UnpackedHeader new_header = old_header;
				glider: So remove it, maybe?
				// The new size still fits in the current chunk.
				if (new_size <= usable_size) {
				// TODO(kostyak): zero the additional contents
				new_header.requested_size = new_size;
				chunk->compareExchangeHeader(&new_header, &old_header);
				return old_ptr;
				}
				// Otherwise, we have to allocate a new chunk and copy the contents of the
				// old one.
				void *new_ptr = Allocate(new_size, MinAlignment, FromMalloc);
				if (new_ptr) {
				uptr old_size = old_header.requested_size;
				memcpy(new_ptr, old_ptr, Min(new_size, old_size));
				new_header.state = ChunkQuarantine;
				filcab: I'm guessing you only need to zero the additional contents to account for possible overflows that might have happened. Otherwise:
				- `ZeroContents` = true -> they're already zeroed
				- `ZeroContents` = false -> no need to zero.
				chunk->compareExchangeHeader(&new_header, &old_header);
				quarantine.Put(&quarantine_cache, QuarantineCallback(&cache), chunk,
				old_size);
				}
				return new_ptr;
				}

				void *Calloc(uptr NMemB, uptr Size) {
				uptr Total = NMemB * Size;
				if (Size != 0 && Total / Size != NMemB) // Overflow check
				return allocator.ReturnNullOrDie();
				void *Ptr = Allocate(Total, MinAlignment, FromMalloc);
				// If ZeroContents, the content of the chunk has already been zero'd out.
				if (!ZeroContents && Ptr && allocator.FromPrimary(Ptr))
				memset(Ptr, 0, UsableSize(Ptr));
				return Ptr;
				}
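The overflow test at the top of `Calloc` is the usual divide-back check for unsigned multiplication. A minimal sketch, using `size_t` in place of `uptr` and a hypothetical `checkedMultiply` helper:

```cpp
#include <cstddef>

// Stores NMemB * Size in *Total and returns true, or returns false if the
// product does not fit in size_t.
static bool checkedMultiply(size_t NMemB, size_t Size, size_t *Total) {
  size_t Product = NMemB * Size;  // unsigned arithmetic wraps, no UB
  if (Size != 0 && Product / Size != NMemB)
    return false;
  *Total = Product;
  return true;
}
```

The `Size != 0` guard avoids a division by zero; a zero-sized request yields a product of zero, which `Allocate` then bumps to one byte.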

				void DrainQuarantine() {
				quarantine.Drain(&quarantine_cache, QuarantineCallback(&cache));
				}
				};

				static Allocator instance(LINKER_INITIALIZED);
				filcab: If we have the `ZeroContents`, no need to zero again.

				filcab: Should we zero the whole block, instead of just the size we were asked? (Hardening it a tiny bit against overflows)
				static ScudoAllocator &getAllocator() {
				return instance.allocator;
				}

				void InitializeAllocator(const AllocatorOptions &options) {
				instance.Initialize(options);
				}

				void DrainQuarantine() {
				instance.DrainQuarantine();
				}

				void *scudoMalloc(uptr Size, AllocType Type) {
				return instance.Allocate(Size, Allocator::MinAlignment, Type);
				}

				void scudoFree(void *Ptr, AllocType Type) {
				instance.Deallocate(Ptr, 0, Type);
				}

				void scudoSizedFree(void *Ptr, uptr Size, AllocType Type) {
				instance.Deallocate(Ptr, Size, Type);
				}

				void *scudoRealloc(void *Ptr, uptr Size) {
				if (!Ptr)
				return instance.Allocate(Size, Allocator::MinAlignment, FromMalloc);
				if (Size == 0) {
				instance.Deallocate(Ptr, 0, FromMalloc);
				return nullptr;
				}
				return instance.Reallocate(Ptr, Size);
				}

				void *scudoCalloc(uptr NMemB, uptr Size) {
				return instance.Calloc(NMemB, Size);
				}
				vitalybuka: code sometimes uses if(p), sometimes if (p == nullptr). I'd recommend if(p) and if(!p) everywhere for consistency.

				void *scudoValloc(uptr Size) {
				return instance.Allocate(Size, GetPageSizeCached(), FromMemalign);
				}

				void *scudoMemalign(uptr Alignment, uptr Size) {
				return instance.Allocate(Size, Alignment, FromMemalign);
				}

				void *scudoPvalloc(uptr Size) {
				uptr PageSize = GetPageSizeCached();
				Size = RoundUpTo(Size, PageSize);
				if (Size == 0) {
				// pvalloc(0) should allocate one page.
				Size = PageSize;
				}
				return instance.Allocate(Size, PageSize, FromMemalign);
				}
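The size adjustment in `scudoPvalloc` is a round-up to the page size with a special case for zero. A small sketch, assuming the page size is a power of two; `pvallocSize` is a hypothetical name:

```cpp
#include <cstdint>

// Rounds Size up to a multiple of PageSize (which must be a power of two).
// A request of zero bytes still reserves one full page.
static uint64_t pvallocSize(uint64_t Size, uint64_t PageSize) {
  uint64_t Rounded = (Size + PageSize - 1) & ~(PageSize - 1);
  return Rounded == 0 ? PageSize : Rounded;
}
```

With 4096-byte pages, requests of 0, 1 and 4096 bytes all map to one page, and 4097 bytes maps to two.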

				int scudoPosixMemalign(void **MemPtr, uptr Alignment, uptr Size) {
				*MemPtr = instance.Allocate(Size, Alignment, FromMemalign);
				return 0;
				}

				void *scudoAlignedAlloc(uptr Alignment, uptr Size) {
				// size must be a multiple of the alignment. To avoid a division, we first
				// make sure that alignment is a power of 2.
				CHECK(IsPowerOfTwo(Alignment));
				CHECK_EQ((Size & (Alignment - 1)), 0);
				return instance.Allocate(Size, Alignment, FromMalloc);
				}
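The two checks in `scudoAlignedAlloc` avoid a division by exploiting the power-of-two alignment. The sketch below spells them out as a predicate; both helper names are hypothetical, and this `isPowerOfTwo` rejects zero, which may differ from the sanitizer_common `IsPowerOfTwo` used in the patch.

```cpp
#include <cstdint>

// Hypothetical helper: true for 1, 2, 4, 8, ... and false for zero.
static bool isPowerOfTwo(uint64_t X) { return X != 0 && (X & (X - 1)) == 0; }

// aligned_alloc requires Size to be a multiple of Alignment. For a
// power-of-two Alignment, that is a mask test instead of a modulo.
static bool isValidAlignedAllocRequest(uint64_t Alignment, uint64_t Size) {
  return isPowerOfTwo(Alignment) && (Size & (Alignment - 1)) == 0;
}
```

The patch treats a failed check as fatal via `CHECK`; a predicate like this one would also allow returning an error to the caller instead.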

				vitalybuka: maybe remove temp variable?
				uptr scudoMallocUsableSize(void *Ptr) {
				return instance.UsableSize(Ptr);
				}

				} // namespace __scudo

				using namespace __scudo;

				// MallocExtension helper functions

				uptr __sanitizer_get_current_allocated_bytes() {
				uptr stats[AllocatorStatCount];
				getAllocator().GetStats(stats);
				return stats[AllocatorStatAllocated];
				}

				uptr __sanitizer_get_heap_size() {
				uptr stats[AllocatorStatCount];
				getAllocator().GetStats(stats);
				return stats[AllocatorStatMapped];
				}

				uptr __sanitizer_get_free_bytes() {
				return 1;
				}

				uptr __sanitizer_get_unmapped_bytes() {
				return 1;
				}

				uptr __sanitizer_get_estimated_allocated_size(uptr size) {
				return size;
				}

				int __sanitizer_get_ownership(const void *p) {
				return instance.UsableSize(p) != 0;
				}

				uptr __sanitizer_get_allocated_size(const void *p) {
				return instance.UsableSize(p);
				}

projects/compiler-rt/lib/hardened_allocator/scudo_flags.h

				//===-- scudo_flags.h -------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Header for scudo_flags.cc.
				///
				//===----------------------------------------------------------------------===//

				#ifndef SCUDO_FLAGS_H_
				#define SCUDO_FLAGS_H_

				namespace __scudo {

				struct Flags {
				#define SCUDO_FLAG(Type, Name, DefaultValue, Description) Type Name;
				#include "scudo_flags.inc"
				#undef SCUDO_FLAG

				void SetDefaults();
				};

				Flags *flags();

				vitalybuka: I see call only from init, so maybe this does not need to be inline and `extern Flags scudo_flags_dont_use_directly;` can be moved into cc
				void InitializeFlags();

				} // namespace __scudo

				#endif // SCUDO_FLAGS_H_

projects/compiler-rt/lib/hardened_allocator/scudo_flags.cc

				//===-- scudo_flags.cc ------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Hardened Allocator flag parsing logic.
				///
				//===----------------------------------------------------------------------===//

				#include "scudo_flags.h"

				#include "sanitizer_common/sanitizer_flags.h"
				#include "sanitizer_common/sanitizer_flag_parser.h"

				namespace __scudo {

				Flags scudo_flags_dont_use_directly; // use via flags().

				void Flags::SetDefaults() {
				#define SCUDO_FLAG(Type, Name, DefaultValue, Description) Name = DefaultValue;
				#include "scudo_flags.inc"
				#undef SCUDO_FLAG
				}

				static void RegisterScudoFlags(FlagParser *parser, Flags *f) {
				#define SCUDO_FLAG(Type, Name, DefaultValue, Description) \
				RegisterFlag(parser, #Name, Description, &f->Name);
				#include "scudo_flags.inc"
				#undef SCUDO_FLAG
				}

				void InitializeFlags() {
				SetCommonFlagsDefaults();
				{
				CommonFlags cf;
				cf.CopyFrom(*common_flags());
				cf.exitcode = 1;
				OverrideCommonFlags(cf);
				}
				Flags *f = flags();
				f->SetDefaults();

				FlagParser scudo_parser;
				RegisterScudoFlags(&scudo_parser, f);
				RegisterCommonFlags(&scudo_parser);

				scudo_parser.ParseString(GetEnv("SCUDO_OPTIONS"));

				InitializeCommonFlags();

				// Sanity checks and default settings for the Quarantine parameters.

				if (f->QuarantineSizeMb < 0) {
				dvyukov (Done): You use a convoluted way to express 64 that requires remembering powers of two, and then spell out 64 in comments. Why not just say "64"?
				const int DefaultQuarantineSizeMb = 64;
				f->QuarantineSizeMb = DefaultQuarantineSizeMb;
				}
				// We enforce an upper limit for the quarantine size of 4Gb.
				if (f->QuarantineSizeMb > (4 * 1024)) {
				Printf("ERROR: the quarantine size is too large\n");
				Die();
				}
				if (f->ThreadLocalQuarantineSizeKb < 0) {
				const int DefaultThreadLocalQuarantineSizeKb = 1024;
				f->ThreadLocalQuarantineSizeKb = DefaultThreadLocalQuarantineSizeKb;
				}
				// And an upper limit of 128Mb for the thread quarantine cache.
				if (f->ThreadLocalQuarantineSizeKb > (128 * 1024)) {
				Printf("ERROR: the per thread quarantine cache size is too large\n");
				Die();
				}
				}

				Flags *flags() {
				return &scudo_flags_dont_use_directly;
				}

				} // namespace __scudo
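
The flag plumbing above works by including scudo_flags.inc several times, each time with a different definition of SCUDO_FLAG. A minimal single-file sketch of that pattern follows. DEMO_FLAGS, DemoFlags and demoPrintHelp are hypothetical names, and the list-macro form is an assumption that stands in for the separate .inc file; it is not the code from this diff.

```cpp
#include <cstdio>

// Hypothetical flag list standing in for scudo_flags.inc. The flag names and
// defaults mirror the diff; the descriptions are shortened for the example.
#define DEMO_FLAGS(FLAG)                                               \
  FLAG(int, QuarantineSizeMb, 64, "Size (in Mb) of the quarantine")    \
  FLAG(int, ThreadLocalQuarantineSizeKb, 1024, "Per-thread cache size") \
  FLAG(bool, ZeroContents, false, "Zero chunk contents")

struct DemoFlags {
  // Expansion 1: declare one member per flag.
#define DEMO_DECLARE(Type, Name, Default, Desc) Type Name;
  DEMO_FLAGS(DEMO_DECLARE)
#undef DEMO_DECLARE

  // Expansion 2: assign every member its default value.
  void SetDefaults() {
#define DEMO_ASSIGN(Type, Name, Default, Desc) Name = Default;
    DEMO_FLAGS(DEMO_ASSIGN)
#undef DEMO_ASSIGN
  }
};

// Expansion 3: print one help line per flag, including its default, and
// return the number of flags printed.
inline int demoPrintHelp() {
  int Count = 0;
#define DEMO_PRINT(Type, Name, Default, Desc)                          \
  std::printf("%s (default %s): %s\n", #Name, #Default, Desc);         \
  ++Count;
  DEMO_FLAGS(DEMO_PRINT)
#undef DEMO_PRINT
  return Count;
}
```

Because the default lives in the flag list itself, the help output can show it, which is the point dvyukov raises about replacing the -1 sentinel with 64.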

projects/compiler-rt/lib/hardened_allocator/scudo_flags.inc

				//===-- scudo_flags.inc ------------------------------------------ C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Hardened Allocator runtime flags.
				///
				//===----------------------------------------------------------------------===//

				#ifndef SCUDO_FLAG
				# error "Define SCUDO_FLAG prior to including this file!"
				#endif

				SCUDO_FLAG(int, QuarantineSizeMb, 64,
				"Size (in Mb) of quarantine used to delay the actual deallocation "
				dvyukov (Done): s/-1/64/, then the default will be visible in help as well. A user can't tune this value if she does not have a single reference point. If I know that the default is 64, then I can set it to 32 or 128. If I don't know the default, what am I supposed to do?
				"of chunks. Lower value may reduce memory usage but decrease the "
				"effectiveness of the mitigation.")

				SCUDO_FLAG(int, ThreadLocalQuarantineSizeKb, 1024,
				"Size (in Kb) of per-thread cache used to offload the global "
				"quarantine. Lower value may reduce memory usage but might increase "
				"the contention on the global quarantine.")

				SCUDO_FLAG(bool, DeallocationTypeMismatch, true,
				"Report errors on malloc/delete, new/free, new/delete[], etc.")

				glider (Not Done): Feature request: filling the chunk context with a nonzero byte.
				SCUDO_FLAG(bool, DeleteSizeMismatch, true,
				"Report errors on mismatch between size of new and delete.")

				SCUDO_FLAG(bool, ZeroContents, false,
				"Zero chunk contents on allocation and deallocation.")

projects/compiler-rt/lib/hardened_allocator/scudo_malloc_linux.cc

				//===-- scudo_malloc_linux.cc ------------------------------------ C++ --===//
				//
				filcab (Not Done): `scudo_interceptors.cc` (`.cpp` in the future, but do what Vitaly suggested and change the names only after approval to ease code review).
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Linux specific malloc interception functions.
				///
				//===----------------------------------------------------------------------===//

				#include "sanitizer_common/sanitizer_platform.h"
				#if SANITIZER_LINUX

				filcab (Not Done): Don't add these to the whole file. I'm ok with protecting Linux/glibc-specific functions like `pvalloc`, etc. Those will need it anyway if this gets ported somewhere. No need to protect the `malloc`/`free` interceptors with `SANITIZER_LINUX`, though.
				glider (Not Done): Note that for other systems (e.g. OSX) it may be incorrect to intercept malloc/free. Therefore it should be ok to keep those in a Linux/FreeBSD-specific file and keep SANITIZER_LINUX | SANITIZER_FREEBSD for the whole file.
				#include "scudo_allocator.h"

				#include "interception/interception.h"

				using namespace __scudo;

				INTERCEPTOR(void, free, void *ptr) {
				scudoFree(ptr, FromMalloc);
				}

				INTERCEPTOR(void, cfree, void *ptr) {
				scudoFree(ptr, FromMalloc);
				}

				INTERCEPTOR(void*, malloc, uptr size) {
				return scudoMalloc(size, FromMalloc);
				}

				INTERCEPTOR(void*, realloc, void *ptr, uptr size) {
				return scudoRealloc(ptr, size);
				}

				INTERCEPTOR(void*, calloc, uptr nmemb, uptr size) {
				return scudoCalloc(nmemb, size);
				}

				INTERCEPTOR(void*, valloc, uptr size) {
				return scudoValloc(size);
				}

				INTERCEPTOR(void*, memalign, uptr alignment, uptr size) {
				return scudoMemalign(alignment, size);
				}

				INTERCEPTOR(void*, __libc_memalign, uptr alignment, uptr size) {
				return scudoMemalign(alignment, size);
				}

				INTERCEPTOR(void*, pvalloc, uptr size) {
				return scudoPvalloc(size);
				}

				INTERCEPTOR(void*, aligned_alloc, uptr alignment, uptr size) {
				return scudoAlignedAlloc(alignment, size);
				}

				INTERCEPTOR(int, posix_memalign, void **memptr, uptr alignment, uptr size) {
				return scudoPosixMemalign(memptr, alignment, size);
				}

				INTERCEPTOR(uptr, malloc_usable_size, void *ptr) {
				return scudoMallocUsableSize(ptr);
				}

				INTERCEPTOR(int, mallopt, int cmd, int value) {
				return -1;
				}

				#endif // SANITIZER_LINUX
				glider (Done): It's better to add comments (e.g. " // SANITIZER_LINUX" here) to #endif directives, especially when the code doesn't fit on a single screen.

projects/compiler-rt/lib/hardened_allocator/scudo_new_delete.cc

				//===-- scudo_new_delete.cc -------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Interceptors for operators new and delete.
				///
				//===----------------------------------------------------------------------===//

				#include "scudo_allocator.h"

				#include "interception/interception.h"

				#include <cstddef>

				using namespace __scudo;

				#define CXX_OPERATOR_ATTRIBUTE INTERCEPTOR_ATTRIBUTE

				// Fake std::nothrow_t to avoid including <new>.
				namespace std {
				struct nothrow_t {};
				} // namespace std

				CXX_OPERATOR_ATTRIBUTE
				void *operator new(size_t size) {
				return scudoMalloc(size, FromNew);
				}
				CXX_OPERATOR_ATTRIBUTE
				void *operator new[](size_t size) {
				return scudoMalloc(size, FromNewArray);
				}
				CXX_OPERATOR_ATTRIBUTE
				void *operator new(size_t size, std::nothrow_t const&) {
				return scudoMalloc(size, FromNew);
				}
				CXX_OPERATOR_ATTRIBUTE
				void *operator new[](size_t size, std::nothrow_t const&) {
				return scudoMalloc(size, FromNewArray);
				}

				CXX_OPERATOR_ATTRIBUTE
				void operator delete(void *ptr) NOEXCEPT {
				return scudoFree(ptr, FromNew);
				}
				CXX_OPERATOR_ATTRIBUTE
				void operator delete[](void *ptr) NOEXCEPT {
				return scudoFree(ptr, FromNewArray);
				}
				CXX_OPERATOR_ATTRIBUTE
				void operator delete(void *ptr, std::nothrow_t const&) NOEXCEPT {
				return scudoFree(ptr, FromNew);
				}
				CXX_OPERATOR_ATTRIBUTE
				void operator delete[](void *ptr, std::nothrow_t const&) NOEXCEPT {
				return scudoFree(ptr, FromNewArray);
				}
				CXX_OPERATOR_ATTRIBUTE
				void operator delete(void *ptr, size_t size) NOEXCEPT {
				scudoSizedFree(ptr, size, FromNew);
				}
				CXX_OPERATOR_ATTRIBUTE
				void operator delete[](void *ptr, size_t size) NOEXCEPT {
				scudoSizedFree(ptr, size, FromNewArray);
				}
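
The operators above pass an allocation type (FromNew, FromNewArray) to the allocator so that the DeallocationTypeMismatch and DeleteSizeMismatch flags can be enforced at free time. The sketch below shows one way such checks could work, assuming a small header stored in front of the user memory. DemoHeader, demoAllocate and demoDeallocate are hypothetical, and returning a result code instead of reporting an error and dying is a simplification for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

enum DemoAllocType : uint8_t { DemoFromMalloc, DemoFromNew, DemoFromNewArray };
enum DemoFreeResult { DemoOk, DemoTypeMismatch, DemoSizeMismatch };

// Hypothetical header placed in front of the user memory. alignas(16) keeps
// the user pointer 16-byte aligned when the backing pointer is.
struct alignas(16) DemoHeader {
  size_t Size;
  DemoAllocType Type;
};

// Allocates Size bytes and records how they were requested.
inline void *demoAllocate(size_t Size, DemoAllocType Type) {
  void *Base = std::malloc(sizeof(DemoHeader) + Size);
  if (!Base)
    return nullptr;
  DemoHeader *Header = static_cast<DemoHeader *>(Base);
  Header->Size = Size;
  Header->Type = Type;
  return Header + 1; // User memory starts right after the header.
}

// Frees Ptr only if the deallocation matches the allocation. A DeleteSize of
// 0 means "no size supplied" (plain delete or free).
inline DemoFreeResult demoDeallocate(void *Ptr, DemoAllocType Type,
                                     size_t DeleteSize = 0) {
  DemoHeader *Header = static_cast<DemoHeader *>(Ptr) - 1;
  if (Header->Type != Type)
    return DemoTypeMismatch; // e.g. new followed by free
  if (DeleteSize != 0 && Header->Size != DeleteSize)
    return DemoSizeMismatch; // sized delete with the wrong size
  std::free(Header);
  return DemoOk;
}
```

With this layout, `demoDeallocate(P, DemoFromMalloc)` on a pointer obtained with `DemoFromNew` is rejected before any memory is released.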

projects/compiler-rt/lib/hardened_allocator/scudo_rtl.cc

				//===-- scudo_rtl.cc --------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Main file for the Hardened Allocator runtime library.
				///
				//===----------------------------------------------------------------------===//

				#include "scudo_allocator.h"

				namespace __scudo {

				bool scudo_inited;
				bool scudo_init_is_running;

				static void ScudoInitInternal() {
				if (LIKELY(scudo_inited))
				return;
				SanitizerToolName = "Scudo";
				CHECK(!scudo_init_is_running && "Scudo init calls itself!");
				scudo_init_is_running = true;

				InitializeFlags();

				AllocatorOptions allocator_options;
				allocator_options.SetFrom(flags(), common_flags());
				InitializeAllocator(allocator_options);

				scudo_inited = true;
				scudo_init_is_running = false;
				}

				} // namespace __scudo

				using namespace __scudo;

				void __scudo_init() {
				ScudoInitInternal();
				}

				#if SANITIZER_CAN_USE_PREINIT_ARRAY
				__attribute__((section(".preinit_array"), used))
				void (*__local_scudo_preinit)(void) = __scudo_init;
				#else
				#error "Can't use .preinit_array"
				#endif
				filcab (Done): Should this be an `#error`?
				cryptoad (Author, Done): I figured the additional initialization techniques used by ASan et al. could be added later on. Hence the #error for now.

projects/compiler-rt/lib/hardened_allocator/scudo_utils.h

				//===-- scudo_utils.h -------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Header for scudo_utils.cc.
				///
				//===----------------------------------------------------------------------===//

				#ifndef SCUDO_UTILS_H_
				#define SCUDO_UTILS_H_

				#include <string.h>

				#include "sanitizer_common/sanitizer_common.h"

				namespace __scudo {

				template <class Dest, class Source>
				inline Dest bit_cast(const Source& source) {
				static_assert(sizeof(Dest) == sizeof(Source), "Sizes are not equal!");
				Dest dest;
				memcpy(&dest, &source, sizeof(dest));
				filcab (Done): `static_assert(sizeof(Dest) == sizeof(Source), "Sized are not equal!");`
				return dest;
				}

				enum CPUFeature {
				SSE4_2 = 0,
				RDRAND = 1,
				ENUM_CPUFEATURE_MAX
				};
				bool testCPUFeature(CPUFeature feature);

				// Tiny PRNG based on https://en.wikipedia.org/wiki/Xorshift#xorshift.2B
				// The state (128 bits) will be stored in thread local storage.
				struct Xorshift128Plus {
				public:
				Xorshift128Plus();
				u64 Next() {
				u64 x = state_0_;
				const u64 y = state_1_;
				dvyukov (Done): This is not used. Remove.
				state_0_ = y;
				x ^= x << 23;
				dvyukov (Done): This is not used. Remove.
				state_1_ = x ^ y ^ (x >> 17) ^ (y >> 26);
				return state_1_ + y;
				}
				private:
				glider (Done): GetSeed is unused.
				dvyukov (Done): This is not used. Remove.
				u64 state_0_;
				u64 state_1_;
				};

				} // namespace __scudo

				#endif // SCUDO_UTILS_H_
				glider (Done): These "a", "b", "c" comments don't help :( Please remove them.
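
To make the Next() step in Xorshift128Plus easier to follow, here is a standalone sketch of the same update with an explicit seed instead of the RdRand64-based constructor. DemoXorshift128Plus is a hypothetical name; the shift constants (23, 17, 26) are the ones used in the header above. The two seed words should not both be zero, since the generator would then only ever return zero.

```cpp
#include <cstdint>

// Standalone xorshift128+ with caller-provided seed words.
struct DemoXorshift128Plus {
  uint64_t State0;
  uint64_t State1;

  DemoXorshift128Plus(uint64_t Seed0, uint64_t Seed1)
      : State0(Seed0), State1(Seed1) {}

  uint64_t Next() {
    uint64_t X = State0;
    const uint64_t Y = State1;
    State0 = Y;                             // Old second word becomes first.
    X ^= X << 23;                           // Scramble the old first word.
    State1 = X ^ Y ^ (X >> 17) ^ (Y >> 26); // New second word.
    return State1 + Y;                      // Output is the sum of both.
  }
};
```

Seeding with (1, 2), the first call yields 0x800045: X becomes 0x800001 after the left shift, the new second word is 0x800001 ^ 0x2 ^ 0x40 = 0x800043, and adding Y = 2 gives the output.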

projects/compiler-rt/lib/hardened_allocator/scudo_utils.cc

				//===-- scudo_utils.cc ------------------------------------------- C++ --===//
				//
				glider (Not Done): Do the build configs prevent this file from being built on ARM?
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// Platform specific utility functions.
				///
				//===----------------------------------------------------------------------===//

				#include "scudo_utils.h"

				#include <cstring>
				#include <chrono> // for std::chrono::high_resolution_clock
				#include <functional> // for std::hash
				#include <thread> // for std::this_thread

				namespace __scudo {

				typedef struct {
				u32 eax;
				u32 ebx;
				u32 ecx;
				u32 edx;
				} CPUIDInfo;

				static void getCPUID(CPUIDInfo *info, u32 leaf, u32 subleaf)
				{
				asm volatile("cpuid"
				: "=a" (info->eax), "=b" (info->ebx), "=c" (info->ecx), "=d" (info->edx)
				glider (Not Done): Do you really need the subleaf parameter now?
				cryptoad (Author, Not Done): I figured it could be useful if a feature such as RDSEED was needed.
				: "a" (leaf), "c" (subleaf)
				);
				}

				// Returns true if the CPU is a "GenuineIntel" or "AuthenticAMD".
				static bool isSupportedCPU()
				{
				glider (Not Done): I believe requiring certain Intel CPU models in order for the allocator to work isn't a good idea.
				cryptoad (Author, Not Done): My point of view when writing this was that I had to be as competitive as possible with other allocators, so that the benefit of the additional checks would not be offset by a dramatic decrease in performance. In the initial stages, it was determined that the CPU-backed CRC32 gave a performance gain of about 10% over a BSD checksum in pure allocation benchmarks, so I went that way. I am not opposed to doing something purely in software, but I'd rather start this way and then expand it to be more portable.
				CPUIDInfo Info;

				getCPUID(&Info, 0, 0);
				if (memcmp(reinterpret_cast<char *>(&Info.ebx), "Genu", 4) == 0 &&
				memcmp(reinterpret_cast<char *>(&Info.edx), "ineI", 4) == 0 &&
				memcmp(reinterpret_cast<char *>(&Info.ecx), "ntel", 4) == 0) {
				return true;
				}
				if (memcmp(reinterpret_cast<char *>(&Info.ebx), "Auth", 4) == 0 &&
				memcmp(reinterpret_cast<char *>(&Info.edx), "enti", 4) == 0 &&
				memcmp(reinterpret_cast<char *>(&Info.ecx), "cAMD", 4) == 0) {
				return true;
				}
				return false;
				}

				bool testCPUFeature(CPUFeature feature)
				{
				glider (Not Done): Note this is thread-unsafe. Not sure if that matters here, but still.
				static bool InfoInitialized = false;
				static CPUIDInfo kCPUInfo = {};
				filcab (Done): else UNIMPLEMENTED(); (or something similar)

				if (InfoInitialized == false) {
				if (isSupportedCPU() == true)
				getCPUID(&kCPUInfo, 1, 0);
				else
				filcab (Done): Same on AMD (source: https://support.amd.com/TechDocs/25481.pdf): "CPUID Fn0000_0001_ECX Feature Identifiers ... 20 SSE42: SSE4.2 instruction support."
				UNIMPLEMENTED();
				InfoInitialized = true;
				filcab (Done): Doesn't exist on AMD. Since SSE4.2 is required for this, you'll need to implement SSE4.2 detection for other CPU brands. RDRAND seems to exist only on Intel CPUs and there's a fallback path, so having it only on Intel doesn't seem like a problem.
				}
				switch (feature) {
				case SSE4_2:
				return ((kCPUInfo.ecx >> 20) & 0x1) != 0;
				case RDRAND:
				return ((kCPUInfo.ecx >> 30) & 0x1) != 0;
				default:
				break;
				}
				return false;
				}
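
testCPUFeature() boils down to reading ECX from CPUID leaf 1 and testing one bit (20 for SSE4.2, 30 for RDRAND). The sketch below separates the pure bit test from the CPUID query so the former can be checked on any machine. It assumes the GCC/Clang `<cpuid.h>` helper `__get_cpuid`; the demo* names and constants are hypothetical, and on non-x86 targets the query simply reports no features.

```cpp
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> // GCC/Clang helper header providing __get_cpuid.
#endif

constexpr unsigned kDemoSSE42Bit = 20;  // CPUID.1:ECX bit 20
constexpr unsigned kDemoRDRANDBit = 30; // CPUID.1:ECX bit 30

// Pure helper: is the given bit set in the register value?
constexpr bool demoTestBit(uint32_t Register, unsigned Bit) {
  return ((Register >> Bit) & 0x1u) != 0;
}

// Returns ECX of CPUID leaf 1, or 0 when the leaf cannot be queried.
inline uint32_t demoLeaf1ECX() {
#if defined(__x86_64__) || defined(__i386__)
  unsigned Eax = 0, Ebx = 0, Ecx = 0, Edx = 0;
  if (__get_cpuid(1, &Eax, &Ebx, &Ecx, &Edx))
    return Ecx;
#endif
  return 0;
}

inline bool demoHasSSE42() {
  return demoTestBit(demoLeaf1ECX(), kDemoSSE42Bit);
}

inline bool demoHasRDRAND() {
  return demoTestBit(demoLeaf1ECX(), kDemoRDRANDBit);
}
```

Since the SSE4.2 bit sits at the same position on AMD parts, as filcab notes, a check written this way does not need a vendor-string comparison first.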

				static u64 getRdTSC() {
				// Clang: __builtin_readcyclecounter
				u64 low, high;
				__asm__ volatile("rdtsc" : "=a" (low), "=d" (high));
				return (high << 32) | low;
				}
				glider (Done): Why not use a bool here?
				cryptoad (Author, Done): I wanted to use a 3-state variable: uninitialized, true, false. Hence the -1, 0, 1.

				// RdRand64 will call rdrand if the feature is available for the CPU, otherwise
				glider (Done): Usually a "k" prefix denotes a constant. If you're going to change it, that's just a regular variable named "has_rd_rand" (or something like that).
				// it will use a XOR of the cycle counter, the high resolution clock and the
				// thread ID hash.
				static u64 RdRand64() {
				filcab (Done): Does gcc actually do anything with this? If not, then just delete it. AFAICT, clang doesn't care unless you have an asm attribute to tie it to a specific register.
				static s8 HasRdRand = -1;
				if (HasRdRand == -1) {
				HasRdRand = testCPUFeature(RDRAND);
				}
				glider (Done): Please move this call inside the loop below.
				if (HasRdRand == 1) {
				u64 rnd;
				u8 carry;

				// Normally we need only one execution, but if the first attempt failed,
				filcab (Done): Nit: Why not a simple `int`? Closer to the "usual idiom" in C++.
				cryptoad (Author, Done): I tried to be consistent using the Sanitizer types. But I see your point.
				// we fall back to retries.
				dvyukov (Done): urandom is not secure and can allow an attacker to guess the cookie in a local setuid binary.
				cryptoad (Author, Done): On this matter, the general agreement seems to be that urandom on a modern Linux system is secure and can be used for cryptographic purposes. Even the more recent getrandom system call uses urandom by default, with the following entry in the man page: "Unless you are doing long-term key generation (and perhaps not even then), you probably shouldn't be using GRND_RANDOM. The cryptographic algorithms used for /dev/urandom are quite conservative, and so should be sufficient for all purposes." /dev/random performs poorly in my tests, often blocking the allocator.
				dvyukov (Done): Okay. You know better.
				for (int c = 10; c != 0; --c) {
				asm volatile("rdrand %0; setc %1": "=r" (rnd), "=qm" (carry));
				if (carry != 0)
				return rnd; // Success
				}
				glider (Done): Is this problem so severe that we want to abort? Note that we don't abort if the CPU doesn't support rdrand.

				// All attempts failed.
				Printf("WARNING: RDRAND failed. Falling back.\n");
				}
				std::hash<std::thread::id> hasher;
				glider (Done): My gut feeling is that XORing rdtsc and the time since epoch is actually reducing the entropy, not increasing it. Any idea if that's true? Also, do we really need a dependency on std::chrono?
				glider (Done): As a data point, I ran RdTSC() and std::chrono::high_resolution_clock::now().time_since_epoch().count() 200278017 times. The number of unique values of both variables was exactly 200278017, while the number of unique XOR values was only 200205416, i.e. there were 0.036% collisions.
				cryptoad (Author, Done): Is your suggestion to get rid of the epoch component?
				dvyukov (Done): This is used to initialize the global cookie. I would use /dev/random. Or there must be 16 bytes of good randomness in auxv.
				return getRdTSC() ^ hasher(std::this_thread::get_id()) ^
				dvyukov (Done): Yes, all that is predictable. /dev/random is meant specifically for such cases; it uses various sources of entropy to create strongly random numbers. On second thought, just remove all the rdtsc/cpuid/rdrand trickery and read from /dev/random. If rdrand is present, the kernel will use it.
				cryptoad (Author, Done): I gave it a try and I have had a lot of issues with /dev/random on my system. Between the fact that it's blocking, and that sometimes it won't return the number of bytes requested, the tests have been failing inconsistently. Tests with /dev/urandom worked better though. I am going to dig further into that.
				filcab (Done): Using `/dev/urandom` should be what you need, yes. Did you still have problems with urandom, btw?
				cryptoad (Author, Done): /dev/urandom appeared to work fine.
				dvyukov (Done): /dev/urandom is not what you need. It trades security for performance, i.e. instead of blocking it will just give you predictable randomness, which kind of defeats the whole purpose of a security allocator. /dev/random blocks when it does not have enough entropy. But there is not much you can do if you do need the entropy. If it returns fewer bytes, read again. That's how it works with all read calls.
				filcab (Done): People like Daniel Bernstein and Thomas Ptacek (and others) tend to disagree and say that /dev/urandom is what we need: http://blog.cr.yp.to/20140205-entropy.html http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/ I'm no expert in this, so I tend to rely on people who work on this kind of thing.
				std::chrono::high_resolution_clock::now().time_since_epoch().count();
				}
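
The thread above converges on reading the seed from /dev/urandom, and dvyukov points out that a short read just means reading again. A minimal POSIX sketch of such a read loop is shown below. demoFillRandom is a hypothetical helper, not the function that was eventually committed, and it reports failure to the caller instead of aborting.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Fills Buffer with Length bytes from /dev/urandom. Returns true only if all
// bytes were obtained; short reads and EINTR are retried.
inline bool demoFillRandom(void *Buffer, size_t Length) {
  const int Fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
  if (Fd < 0)
    return false;
  uint8_t *Out = static_cast<uint8_t *>(Buffer);
  size_t Done = 0;
  while (Done < Length) {
    const ssize_t N = read(Fd, Out + Done, Length - Done);
    if (N > 0) {
      Done += static_cast<size_t>(N); // Possibly a short read: keep going.
      continue;
    }
    if (N < 0 && errno == EINTR)
      continue; // Interrupted by a signal: retry.
    break;      // End of file or a hard error.
  }
  close(Fd);
  return Done == Length;
}
```

A caller would use it as `u64 Seed[2]; demoFillRandom(Seed, sizeof(Seed));` and fall back to another source, or report an error, when it returns false.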

				// Default constructor for Xorshift128Plus seeds the state with RdRand64
				Xorshift128Plus::Xorshift128Plus() {
				state_0_ = RdRand64();
				state_1_ = RdRand64();
				}

				} // namespace __scudo

projects/compiler-rt/test/CMakeLists.txt

if(COMPILER_RT_HAS_UBSAN)
add_subdirectory(cfi)
endif()
if(COMPILER_RT_HAS_SAFESTACK)
add_subdirectory(safestack)
endif()
if(COMPILER_RT_HAS_ESAN)
add_subdirectory(esan)
endif()
if(COMPILER_RT_HAS_HARDENED_ALLOCATOR)
add_subdirectory(hardened_allocator)
endif()
endif()

if(COMPILER_RT_STANDALONE_BUILD)
# Now that we've traversed all the directories and know all the lit testsuites,
# introduce a rule to run to run all of them.
get_property(LLVM_LIT_TESTSUITES GLOBAL PROPERTY LLVM_LIT_TESTSUITES)
get_property(LLVM_LIT_DEPENDS GLOBAL PROPERTY LLVM_LIT_DEPENDS)
add_lit_target(check-all
"Running all regression tests"
${LLVM_LIT_TESTSUITES}
DEPENDS ${LLVM_LIT_DEPENDS})
endif()

projects/compiler-rt/test/hardened_allocator/CMakeLists.txt

				set(HARDENED_ALLOCATOR_LIT_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
				set(HARDENED_ALLOCATOR_LIT_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR})


				set(HARDENED_ALLOCATOR_TEST_DEPS ${SANITIZER_COMMON_LIT_TEST_DEPS})
				if(NOT COMPILER_RT_STANDALONE_BUILD)
				list(APPEND HARDENED_ALLOCATOR_TEST_DEPS hardened_allocator)
				endif()

				configure_lit_site_cfg(
				${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.in
				${CMAKE_CURRENT_BINARY_DIR}/lit.site.cfg
				)

				add_lit_testsuite(check-hardened_allocator
				"Running the Hardened Allocator tests"
				${CMAKE_CURRENT_BINARY_DIR}
				DEPENDS ${HARDENED_ALLOCATOR_TEST_DEPS})
				set_target_properties(check-hardened_allocator PROPERTIES FOLDER
				"Hardened Allocator tests")

projects/compiler-rt/test/hardened_allocator/alignment.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: not %run %t pointers 2>&1 | FileCheck %s

				// Tests that a non-16-byte aligned pointer will trigger the associated error
				// on deallocation.

				#include <assert.h>
				#include <malloc.h>
				#include <stdint.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				glider (Done): `alignment` is unused.
				assert(argc == 2);
				if (!strcmp(argv[1], "pointers")) {
				void *p = malloc(1U << 16);
				if (!p)
				return 1;
				free(reinterpret_cast<void *>(reinterpret_cast<uintptr_t>(p) | 8));
				}
				return 0;
				}

				// CHECK: ERROR: attempted to deallocate a chunk not properly aligned
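
The test above ORs 8 into a pointer returned by malloc so that it is no longer 16-byte aligned. The check that catches this can be sketched as a mask test on the address. demoIsAligned and kDemoMinAlignment are hypothetical names, and the 16-byte minimum is taken from the test's own comment rather than from the allocator source.

```cpp
#include <cstdint>

// Assumed minimum chunk alignment, per the comment in alignment.cc.
constexpr uintptr_t kDemoMinAlignment = 16;

// Returns true if Ptr is a multiple of Alignment, which must be a power of
// two for the mask trick to be valid.
inline bool demoIsAligned(const void *Ptr,
                          uintptr_t Alignment = kDemoMinAlignment) {
  return (reinterpret_cast<uintptr_t>(Ptr) & (Alignment - 1)) == 0;
}
```

For a 16-byte aligned address the low four bits are zero, so setting bit 3 (the `| 8` in the test) always makes this check fail.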

projects/compiler-rt/test/hardened_allocator/double-free.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: not %run %t malloc 2>&1 | FileCheck %s
				// RUN: not %run %t new 2>&1 | FileCheck %s
				// RUN: not %run %t newarray 2>&1 | FileCheck %s
				// RUN: not %run %t memalign 2>&1 | FileCheck %s

				// Tests double-free error on pointers allocated with different allocation
				// functions.

				#include <assert.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				assert(argc == 2);
				if (!strcmp(argv[1], "malloc")) {
				void *p = malloc(sizeof(int));
				if (!p)
				return 1;
				free(p);
				free(p);
				}
				if (!strcmp(argv[1], "new")) {
				int *p = new int;
				if (!p)
				return 1;
				delete p;
				delete p;
				filcab (Done): Add a `posix_memalign` version. We have a special case in `free` for it.
				}
				if (!strcmp(argv[1], "newarray")) {
				int *p = new int[8];
				if (!p)
				return 1;
				delete[] p;
				delete[] p;
				}
				if (!strcmp(argv[1], "memalign")) {
				void *p = nullptr;
				posix_memalign(&p, 0x100, sizeof(int));
				if (!p)
				return 1;
				free(p);
				free(p);
				}
				return 0;
				}

				// CHECK: ERROR: invalid chunk state when deallocating address
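
The expected message refers to an invalid chunk state, which suggests that each chunk carries a small state field that is checked on deallocation. The sketch below illustrates that idea under stated assumptions: the three state names and the DemoChunk type are hypothetical, and a real allocator would report the error and terminate rather than return false.

```cpp
#include <cstdint>

// Hypothetical per-chunk states: free, handed out, or waiting in quarantine.
enum DemoChunkState : uint8_t {
  DemoChunkAvailable,
  DemoChunkAllocated,
  DemoChunkQuarantine
};

struct DemoChunk {
  DemoChunkState State = DemoChunkAvailable;
};

// Marks the chunk as handed out to the program.
inline void demoMarkAllocated(DemoChunk &Chunk) {
  Chunk.State = DemoChunkAllocated;
}

// Moves an allocated chunk to quarantine. Returns false when the chunk is in
// any other state, which is what a second free of the same pointer looks like.
inline bool demoFreeChunk(DemoChunk &Chunk) {
  if (Chunk.State != DemoChunkAllocated)
    return false;
  Chunk.State = DemoChunkQuarantine;
  return true;
}
```

Because the first free leaves the chunk in the quarantine state rather than recycling it immediately, the second free in each test case above would be observed with a state other than "allocated".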

projects/compiler-rt/test/hardened_allocator/lit.cfg

				# -- Python --

				import os

				# Setup config name.
				config.name = 'Hardened Allocator'

				# Setup source root.
				config.test_source_root = os.path.dirname(__file__)

				# Path to the static library
				base_lib = os.path.join(config.compiler_rt_libdir,
				"libclang_rt.hardened_allocator-%s.a" % config.target_arch)
				whole_archive = "-Wl,-whole-archive %s -Wl,-no-whole-archive " % base_lib

				# Test suffixes.
				config.suffixes = ['.c', '.cc', '.cpp', '.m', '.mm', '.ll', '.test']

				# C flags.
				c_flags = ["-std=c++11",
				"-lstdc++",
				"-ldl",
				"-lrt",
				"-pthread",
				"-latomic", #for __atomic_load_16, __atomic_store_16, __atomic_compare_exchange_16
				"-fPIE",
				"-pie",
				"-O0"]

				def build_invocation(compile_flags):
				    return " " + " ".join([config.clang] + compile_flags) + " "

				# Add clang substitutions.
				config.substitutions.append( ("%clang_scudo ",
				build_invocation(c_flags) + whole_archive ) )

				# Hardened Allocator tests are currently supported on Linux only.
				if config.host_os not in ['Linux']:
				    config.unsupported = True

projects/compiler-rt/test/hardened_allocator/lit.site.cfg.in

				@LIT_SITE_CFG_IN_HEADER@

				# Load common config for all compiler-rt lit tests.
				lit_config.load_config(config, "@COMPILER_RT_BINARY_DIR@/test/lit.common.configured")

				# Load tool-specific config that would do the real work.
				lit_config.load_config(config, "@HARDENED_ALLOCATOR_LIT_SOURCE_DIR@/lit.cfg")

projects/compiler-rt/test/hardened_allocator/malloc.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: %run %t 2>&1
				glider (Done): Each test must have comments that describe its purpose.

				// Tests that a regular workflow of allocation, memory fill and free works as
				// intended. Also tests that a zero-sized allocation succeeds.

				#include <malloc.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				void *p;
				size_t size = 1U << 8;

				p = malloc(size);
				if (!p)
				return 1;
				memset(p, 'A', size);
				free(p);
				p = malloc(0);
				if (!p)
				return 1;
				free(p);

				return 0;
				}

projects/compiler-rt/test/hardened_allocator/memalign.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: %run %t valid 2>&1
				// RUN: not %run %t invalid 2>&1 | FileCheck %s

				// Tests that the various aligned allocation functions work as intended. Also
				// tests for the condition where the alignment is not a power of 2.

				#include <assert.h>
				#include <malloc.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				void *p;
				size_t alignment = 1U << 12;
				size_t size = alignment;

				assert(argc == 2);
				if (!strcmp(argv[1], "valid")) {
				p = memalign(alignment, size);
				if (!p)
				return 1;
				free(p);
				p = nullptr;
				posix_memalign(&p, alignment, size);
				if (!p)
				return 1;
				free(p);
				p = aligned_alloc(alignment, size);
				if (!p)
				return 1;
				free(p);
				}
				if (!strcmp(argv[1], "invalid")) {
				p = memalign(alignment - 1, size);
				free(p);
				}
				return 0;
				}

				// CHECK: ERROR: malloc alignment is not a power of 2
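
The "invalid" case above passes `alignment - 1` so that the allocator rejects a non-power-of-2 alignment. As a minimal sketch of how such a check can be written (the helper name `IsPowerOfTwo` is hypothetical and is not taken from the patch):

```cpp
#include <cstddef>

// Hypothetical helper for illustration; not the patch's own code.
// Returns true when `alignment` is a nonzero power of two.
constexpr bool IsPowerOfTwo(std::size_t alignment) {
  // A power of two has exactly one bit set, so clearing the lowest set bit
  // with x & (x - 1) must leave zero.
  return alignment != 0 && (alignment & (alignment - 1)) == 0;
}
```

With the values used in the test, `1U << 12` passes this check while `(1U << 12) - 1` does not.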

projects/compiler-rt/test/hardened_allocator/mismatch.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: SCUDO_OPTIONS=DeallocationTypeMismatch=1 not %run %t mallocdel 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=DeallocationTypeMismatch=0 %run %t mallocdel 2>&1
				glider (Not Done): FYI you can use --check-prefix to write more test-specific CHECK directives.
				// RUN: SCUDO_OPTIONS=DeallocationTypeMismatch=1 not %run %t newfree 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=DeallocationTypeMismatch=0 %run %t newfree 2>&1
				// RUN: SCUDO_OPTIONS=DeallocationTypeMismatch=1 not %run %t memaligndel 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=DeallocationTypeMismatch=0 %run %t memaligndel 2>&1

				// Tests that type mismatches between allocation and deallocation functions are
				// caught when the related option is set.

				#include <assert.h>
				#include <stdlib.h>
				#include <string.h>
				#include <malloc.h>

				int main(int argc, char **argv)
				{
				assert(argc == 2);
				if (!strcmp(argv[1], "mallocdel")) {
				int *p = (int *)malloc(16);
				if (!p)
				filcab (Done): You should add, at least, the memalign -> something_other_than_free case, since it's a special case.
				return 1;
				delete p;
				}
				if (!strcmp(argv[1], "newfree")) {
				int *p = new int;
				glider (Done): Nit: spare newline
				if (!p)
				return 1;
				free((void *)p);
				}
				if (!strcmp(argv[1], "memaligndel")) {
				int *p = (int *)memalign(0x10, 0x10);
				if (!p)
				return 1;
				delete p;
				}
				return 0;
				}

				// CHECK: ERROR: allocation type mismatch on address
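
The three cases above (malloc then delete, new then free, memalign then delete) all depend on the allocator remembering how a chunk was allocated. A minimal sketch of that comparison follows; the enum, the function name, and the rule that memalign'd chunks are accepted by `free` are assumptions for illustration, based only on the review remark that `free` has a special case for them.

```cpp
#include <cstdint>

// Hypothetical names for illustration; the patch's own types may differ.
enum class AllocType : std::uint8_t { Malloc, New, NewArray, Memalign };

// Returns true when a chunk recorded as `allocated` may be released by a
// deallocation function that expects `expected`.
bool DeallocationTypeMatches(AllocType allocated, AllocType expected,
                             bool check_enabled) {
  // With the option off (DeallocationTypeMismatch=0), nothing is rejected.
  if (!check_enabled) return true;
  if (allocated == expected) return true;
  // Assumption: memalign'd chunks are released with free(), so they are
  // accepted wherever a malloc'd chunk would be.
  return allocated == AllocType::Memalign && expected == AllocType::Malloc;
}
```

Under these assumptions, the `mallocdel` and `memaligndel` cases are rejected only when the option is enabled, which matches the paired RUN lines.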

projects/compiler-rt/test/hardened_allocator/overflow.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: not %run %t malloc 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=QuarantineSizeMb=1 not %run %t quarantine 2>&1 | FileCheck %s

				// Tests that header corruption of an allocated or quarantined chunk is caught.

				#include <assert.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				assert(argc == 2);
				if (!strcmp(argv[1], "malloc")) {
				// Simulate a header corruption of an allocated chunk (1-bit)
				void *p = malloc(1U << 4);
				if (!p)
				return 1;
				((char *)p)[-1] ^= 1;
				free(p);
				}
				if (!strcmp(argv[1], "quarantine")) {
				void *p = malloc(1U << 4);
				if (!p)
				return 1;
				free(p);
				// Simulate a header corruption of a quarantined chunk
				((char *)p)[-2] ^= 1;
				// Trigger the quarantine recycle
				for (int i = 0; i < 0x100; i++) {
				p = malloc(1U << 16);
				free(p);
				}
				}
				return 0;
				}

				// CHECK: ERROR: corrupted chunk header at address
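
Both cases above flip a single bit just before the user pointer and expect the checksum-protected header to expose it. The sketch below illustrates the idea only: the header layout, the field names, the `cookie` parameter, and the use of 32-bit FNV-1a as the checksum are all assumptions, not what the patch implements.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte header layout, for illustration only.
struct ChunkHeader {
  std::uint32_t checksum;
  std::uint16_t requested_size;
  std::uint8_t state;
  std::uint8_t alloc_type;
};

// 32-bit FNV-1a over the header bytes, seeded with a per-process cookie.
// This is a stand-in for whatever checksum the allocator really uses.
std::uint32_t ComputeChecksum(const ChunkHeader &header, std::uint32_t cookie) {
  ChunkHeader copy = header;
  copy.checksum = 0;  // The checksum field does not cover itself.
  unsigned char bytes[sizeof(copy)];
  std::memcpy(bytes, &copy, sizeof(copy));
  std::uint32_t hash = 2166136261u ^ cookie;
  for (unsigned char b : bytes) {
    hash ^= b;
    hash *= 16777619u;
  }
  return hash;
}

// Returns true when the stored checksum matches the header contents.
bool IsHeaderValid(const ChunkHeader &header, std::uint32_t cookie) {
  return header.checksum == ComputeChecksum(header, cookie);
}
```

A header is sealed by storing `ComputeChecksum(header, cookie)` into `header.checksum`; any later single-byte change to the other fields makes `IsHeaderValid` return false.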

projects/compiler-rt/test/hardened_allocator/quarantine.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: SCUDO_OPTIONS=QuarantineSizeMb=1 %run %t 2>&1

				// Tests that the quarantine prevents a chunk from being reused right away.
				// Also tests that a chunk will eventually become available again for
				// allocation when the recycling criteria has been met.

				#include <malloc.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				void *p, *old_p;
				size_t size = 1U << 16;

				glider (Done): You should probably add nullptr checks to other allocations in other tests.
				filcab (Done): `if (p)`
				// The delayed freelist will prevent a chunk from being available right away
				p = malloc(size);
				if (!p)
				return 1;
				old_p = p;
				free(p);
				p = malloc(size);
				if (!p)
				return 1;
				if (old_p == p)
				return 1;
				free(p);

				// Eventually the chunk should become available again
				bool found = false;
				for (int i = 0; i < 0x100 && found == false; i++) {
				p = malloc(size);
				if (!p)
				return 1;
				found = (p == old_p);
				free(p);
				}
				if (found == false)
				return 1;

				return 0;
				}
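
The test above relies on two properties of the delayed freelist: a freed chunk is not handed out again immediately, and it is recycled once enough bytes have been freed after it. A minimal sketch of a byte-budgeted FIFO quarantine with those properties follows; the class and its interface are hypothetical and far simpler than the sanitizer quarantine the patch builds on.

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical sketch of a delayed freelist; not the patch's implementation.
class Quarantine {
 public:
  explicit Quarantine(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Holds on to a freed chunk. Returns the chunks (oldest first) that had to
  // be evicted to stay within the byte budget; only those may be reused.
  std::vector<void *> Put(void *chunk, std::size_t size) {
    pending_.emplace_back(chunk, size);
    bytes_ += size;
    std::vector<void *> recycled;
    while (bytes_ > max_bytes_) {
      recycled.push_back(pending_.front().first);
      bytes_ -= pending_.front().second;
      pending_.pop_front();
    }
    return recycled;
  }

 private:
  std::size_t max_bytes_;
  std::size_t bytes_ = 0;
  std::deque<std::pair<void *, std::size_t>> pending_;
};
```

In this sketch a chunk becomes available again only after later frees push the total past `max_bytes`, which is the behaviour the loop in the test waits for.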

projects/compiler-rt/test/hardened_allocator/realloc.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: %run %t pointers 2>&1
				// RUN: %run %t contents 2>&1
				// RUN: not %run %t memalign 2>&1 | FileCheck %s

				// Tests that our reallocation function returns the same pointer when the
				// requested size can fit into the previously allocated chunk. Also tests that
				// a new chunk is returned if the size is greater, and that the contents of the
				// chunk are left unchanged.
				// As a final test, make sure that a chunk allocated by memalign cannot be
				// reallocated.

				#include <assert.h>
				#include <malloc.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				void *p, *old_p;
				size_t size = 32;

				assert(argc == 2);
				if (!strcmp(argv[1], "pointers")) {
				old_p = p = realloc(nullptr, size);
				if (!p)
				return 1;
				size = malloc_usable_size(p);
				// Our realloc implementation will return the same pointer if the size
				// requested is lower or equal to the usable size of the associated chunk.
				p = realloc(p, size - 1);
				glider (Done): Nit: a comment on a separate line must end with a period.
				if (p != old_p)
				return 1;
				p = realloc(p, size);
				if (p != old_p)
				return 1;
				// And a new one if the size is greater.
				p = realloc(p, size + 1);
				if (p == old_p)
				return 1;
				// A size of 0 will free the chunk and return nullptr.
				p = realloc(p, 0);
				if (p)
				return 1;
				old_p = nullptr;
				}
				if (!strcmp(argv[1], "contents")) {
				p = realloc(nullptr, size);
				if (!p)
				return 1;
				for (int i = 0; i < size; i++)
				reinterpret_cast<char *>(p)[i] = 'A';
				p = realloc(p, size + 1);
				// The contents of the reallocated chunk must match the original one.
				for (int i = 0; i < size; i++)
				if (reinterpret_cast<char *>(p)[i] != 'A')
				return 1;
				}
				if (!strcmp(argv[1], "memalign")) {
				// A chunk coming from memalign cannot be reallocated.
				p = memalign(16, size);
				if (!p)
				return 1;
				p = realloc(p, size);
				free(p);
				}
				return 0;
				}

				// CHECK: ERROR: invalid chunk type when reallocating address
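
The comments in the test describe four outcomes for `realloc`: allocate when the pointer is null, free when the size is 0, reuse the chunk when the size fits, and move otherwise, with memalign'd chunks rejected. The sketch below restates that decision as a small function. The names are hypothetical, and the order in which the memalign and zero-size checks run is an assumption.

```cpp
#include <cstddef>

// Hypothetical classification of a realloc request, for illustration only.
enum class ReallocAction {
  AllocateNew,        // realloc(nullptr, n) behaves like malloc(n).
  ReportError,        // Chunk came from memalign and cannot be reallocated.
  FreeAndReturnNull,  // realloc(p, 0) frees the chunk.
  ReuseInPlace,       // New size fits in the chunk's usable size.
  MoveToNewChunk      // New size is larger; contents must be copied.
};

ReallocAction ClassifyRealloc(bool has_old_chunk, bool from_memalign,
                              std::size_t usable_size, std::size_t new_size) {
  if (!has_old_chunk) return ReallocAction::AllocateNew;
  // Assumption: the chunk type is checked before the size is looked at.
  if (from_memalign) return ReallocAction::ReportError;
  if (new_size == 0) return ReallocAction::FreeAndReturnNull;
  if (new_size <= usable_size) return ReallocAction::ReuseInPlace;
  return ReallocAction::MoveToNewChunk;
}
```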

projects/compiler-rt/test/hardened_allocator/sized-delete.cc

				// RUN: %clang_scudo -fsized-deallocation %s -o %t
				// RUN: SCUDO_OPTIONS=DeleteSizeMismatch=1 %run %t gooddel 2>&1
				// RUN: SCUDO_OPTIONS=DeleteSizeMismatch=1 not %run %t baddel 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=DeleteSizeMismatch=0 %run %t baddel 2>&1
				// RUN: SCUDO_OPTIONS=DeleteSizeMismatch=1 %run %t gooddelarr 2>&1
				// RUN: SCUDO_OPTIONS=DeleteSizeMismatch=1 not %run %t baddelarr 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=DeleteSizeMismatch=0 %run %t baddelarr 2>&1

				// Ensures that the sized delete operator errors out when the appropriate
				// option is passed and the sizes do not match between allocation and
				// deallocation functions.

				#include <new>
				#include <assert.h>
				#include <stdlib.h>
				#include <string.h>

				int main(int argc, char **argv)
				{
				assert(argc == 2);
				if (!strcmp(argv[1], "gooddel")) {
				long long *p = new long long;
				operator delete(p, sizeof(long long));
				}
				if (!strcmp(argv[1], "baddel")) {
				long long *p = new long long;
				operator delete(p, 2);
				}
				if (!strcmp(argv[1], "gooddelarr")) {
				char *p = new char[64];
				operator delete[](p, 64);
				}
				if (!strcmp(argv[1], "baddelarr")) {
				char *p = new char[63];
				operator delete[](p, 64);
				}
				return 0;
				}

				// CHECK: ERROR: invalid sized delete on chunk at address

projects/compiler-rt/test/hardened_allocator/sizes.cc

				// RUN: %clang_scudo %s -o %t
				// RUN: SCUDO_OPTIONS=allocator_may_return_null=0 not %run %t malloc 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=allocator_may_return_null=1 %run %t malloc 2>&1
				// RUN: SCUDO_OPTIONS=allocator_may_return_null=0 not %run %t calloc 2>&1 | FileCheck %s
				// RUN: SCUDO_OPTIONS=allocator_may_return_null=1 %run %t calloc 2>&1
				// RUN: %run %t usable 2>&1

				// Tests for various edge cases related to sizes, notably the maximum size the
				// allocator can allocate. Tests that an integer overflow in the parameters of
				// calloc is caught.

				#include <assert.h>
				#include <malloc.h>
				#include <stdlib.h>
				#include <string.h>

				#include <limits>

				int main(int argc, char **argv)
				{
				glider (Done): s/fulfill/allocate?
				assert(argc == 2);
				if (!strcmp(argv[1], "malloc")) {
				// Currently the maximum size the allocator can allocate is 1ULL<<40 bytes.
				size_t size = std::numeric_limits<size_t>::max();
				void *p = malloc(size);
				if (p)
				return 1;
				size = (1ULL << 40) - 16;
				p = malloc(size);
				if (p)
				return 1;
				}
				if (!strcmp(argv[1], "calloc")) {
				// Trigger an overflow in calloc.
				size_t size = std::numeric_limits<size_t>::max();
				void *p = calloc((size / 0x1000) + 1, 0x1000);
				if (p)
				return 1;
				}
				if (!strcmp(argv[1], "usable")) {
				// Playing with the actual usable size of a chunk.
				void *p = malloc(1007);
				if (!p)
				return 1;
				size_t size = malloc_usable_size(p);
				if (size < 1007)
				return 1;
				memset(p, 'A', size);
				p = realloc(p, 2014);
				if (!p)
				return 1;
				size = malloc_usable_size(p);
				if (size < 2014)
				return 1;
				glider (Done): Where does this line come from? I don't see the allocator printing it anywhere.
				cryptoad (Author, Done): This is from sanitizer_allocator.cc, which handles this condition.
				memset(p, 'B', size);
				free(p);
				}
				return 0;
				}

				// CHECK: allocator is terminating the process
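
The "calloc" case above picks a count and an element size whose product does not fit in `size_t`. A minimal sketch of a check that catches this before multiplying is shown below; the helper name is hypothetical, and the check the patch relies on lives in the sanitizer allocator code and may be written differently.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper for illustration; not taken from the patch.
// Returns true when count * size cannot be represented in size_t.
bool CallocWouldOverflow(std::size_t count, std::size_t size) {
  if (size == 0) return false;  // A zero-sized product never overflows.
  // Dividing instead of multiplying avoids performing the overflow itself.
  return count > std::numeric_limits<std::size_t>::max() / size;
}
```

With the arguments used in the test, `(max / 0x1000) + 1` elements of `0x1000` bytes, this returns true.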

Revision Contents

Diff 57362