This is an archive of the discontinued LLVM Phabricator instance.

[scudo] Bypass Quarantine if its size is set to 0
ClosedPublic

Authored by cryptoad on Apr 20 2017, 2:41 PM.

Details

Summary

In the current state of things, the deallocation path puts a chunk in the
Quarantine whether it's enabled or not (size of 0). When the Quarantine is
disabled, this results in the header being loaded (and checked) twice, and
stored (and checksummed) once, in deallocate and Recycle.

This change introduces a quarantineOrDeallocateChunk function that has a
fast path to deallocation if the Quarantine is disabled. Even though this is
not the preferred configuration security-wise, this change saves a sizeable
amount of processing for that particular situation (which could be adopted by
low memory devices). Additionally this simplifies a bit deallocate and
reallocate.

Event Timeline

cryptoad created this revision.Apr 20 2017, 2:41 PM
dvyukov accepted this revision.Apr 21 2017, 1:46 AM

If we actually plan to use such configuration, does it make sense to check header for corruption when we reallocate a block (in allocate)? That will give us at least some windows for UAF detection.

Also, we place header before user block, but buffer overruns are more common than underruns. Does it make sense to also check end of block for corruption (either add another checksum at the end, or move header to the end of block).

This revision is now accepted and ready to land.Apr 21 2017, 1:46 AM

If we actually plan to use such configuration, does it make sense to check header for corruption when we reallocate a block (in allocate)? That will give us at least some windows for UAF detection.

I thought about this at some point and I think it wasn't reasonably achievable for a few reasons.
IIRC the main issue was that when a chunk is returned to the backend, we basically lose control over it's header.
If it's in the quarantine-batch size-class or transfer-batch size-class it could be overwritten by whatever is at the start of those structures.
The first allocation of a chunk being all 0 is also problematic, it could be a special case, but then wouldn't be distinguishable from an all 0 overwrite.

Also, we place header before user block, but buffer overruns are more common than underruns. Does it make sense to also check end of block for corruption (either add another checksum at the end, or move header to the end of block).

With the current implementation, we rely on the fact that what comes after chunk will also have a header (if in use), which will be checked in due time.
This is not 100% true due to batches, and is a weakness that is to be addressed.
The initial reasoning was that it was computationally cheaper to get the header before the chunk rather than behind (simple subtraction vs getting the size of the chunk each time).

Having an additional checksum or marker at the end of the chunk is a possibility.
I think it would be better as an option, since it would come with an additional performance hit, which at this point in time I am trying to avoid.

which will be checked in due time.

It's not checked until the object is freed, which may not happen at all. Attackers are good at laying out objects in the required order, so they place something long-living afterwards it won't be freed.
But it's up to you.

It's not checked until the object is freed, which may not happen at all. Attackers are good at laying out objects in the required order, so they place something long-living afterwards it won't be freed.

I am in agreement with you.
I can also see that happening for both chunk A and B, whether the checksum is at the end of A or be beginning on B, eg: overflowing A into B without A or B getting freed for a long time.
I think this is a tough problem to solve with an allocator only, and hopefully the randomness of the chunks layout will help making harder to have B after A (though not impossible).

cryptoad closed this revision.Apr 21 2017, 11:23 AM