This is an archive of the discontinued LLVM Phabricator instance.

[scudo] Early exit from the case can't do page release.
ClosedPublic

Authored by Chia-hungDuan on Mar 17 2023, 10:48 AM.

Details

Summary

There are heuristics to avoid marking blocks if there's little chance
to release pages. So far, those logics only exist in block-marking
section and we didn't leverage the results of those logics. For example,
in a round of releaseToOS try, we know it's still 128 KB away from the
release threshold. In the next round of releaseToOS, we can early exit
if the number of pushed bytes is smaller than 128 KB without looping
each memory group. This CL adds this heuristic and has reduced amount of
time in checking the status of each memory group.

This CL only applies this heuristic on SizeClassAllocator64.
SizeClassAllocator32 has a smaller region/group size and has little
impact on the default value.

Diff Detail

Event Timeline

Chia-hungDuan created this revision.Mar 17 2023, 10:48 AM
Chia-hungDuan requested review of this revision.Mar 17 2023, 10:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2023, 10:48 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript
Chia-hungDuan retitled this revision from [scudo] Early exit the case can't do page release. to [scudo] Early exit from the case can't do page release..Mar 17 2023, 10:50 AM
Chia-hungDuan added a reviewer: cferris.

With this CL, the filtered try-marking-blocks is reduced ~1000 times (6000 vs 5000) in com.google.android.googlequicksearchbox:interactor on Android

cferris requested changes to this revision.Mar 17 2023, 2:53 PM

Small comment nits.

You gathered data for the one process, but is there any other data worth gathering?

I did see some extra work going on in system_server, but I was using my test account so it might have a different set of apps that triggers a different allocation pattern. Let me know if you want me to gather anything on this device.

compiler-rt/lib/scudo/standalone/primary64.h
125

Releasing small blocks

126

page releases

This revision now requires changes to proceed.Mar 17 2023, 2:53 PM
Chia-hungDuan marked 2 inline comments as done.

Address review comment and fix a bug in keep the invariant that the TryReleaseThreshold.
It should be the minimum distance to the next page release. So if it has reached the high
density, then the next threshold will be the SmallerBlockReleasePageDelta (which we use for
mitigating thrashing)

Small comment nits.

You gathered data for the one process, but is there any other data worth gathering?

I did see some extra work going on in system_server, but I was using my test account so it might have a different set of apps that triggers a different allocation pattern. Let me know if you want me to gather anything on this device.

From my case, there are two processes fall into that condition very frequently. (the followings are copied from bug)

com.google.android.as
com.google.android.googlequicksearchbox:interactor

In general, this is supposed to be a quick check without visiting each memory group, it's conservative. Which means we are supposed to not miss any real release chance we had before but get exempted from visiting group. According to the data, each visiting group without marking blocks may take 309.6(ns) ~ 14732.2(ns). It's not a long operation but it may still be worth of saving.

Besides, we are more aggressive than before (the Region->AllocatedUser / 16U), we have more chances to release pages. Same cases I've observed are the two processes above, I see they release 500 KB ~ 1MB 30s after the device boot.

I was expecting system_server should be reporting most of this case. Maybe you can try this patch on your device and see if there's any different?

Chia-hungDuan edited the summary of this revision. (Show Details)Mar 17 2023, 3:48 PM
Chia-hungDuan edited the summary of this revision. (Show Details)

Slightly adjust the heuristic.
The new threshold should also consider PushedBytesDelta because we don't reset it after a failed page release

The evaluation is from the same scenario as above - reboot the phone and wait for 30s, then collect all the events of calling releaseToOS.

From the log, the new threshold is larger than the original one (AllocatedUser / 16U) in most of cases and therefore avoid more unsuccessful mark-block-attempt. When the threshold is smaller than the original one, I see some chances that we can release some pages there.

Add some number from system_server

Before:

Threshold: AllocatedUser / 16 U
scudo   : -- Average Operation Time -- -- Name (# of Calls) --
scudo   :           1850.9(ns)            releaseToOSMaybe (30907)
scudo   :            359.5(ns)              TryMarkingBlocks (28092)
scudo   :          44311.0(ns)              releaseFreeMemoryToOS (1)

After:

Threshold: Region->TryReleaseThreshold
scudo   : -- Average Operation Time -- -- Name (# of Calls) --
scudo   :           1056.2(ns)            releaseToOSMaybe (30857)
scudo   :            127.8(ns)              TryMarkingBlocks (22342)
scudo   :          15992.0(ns)              releaseFreeMemoryToOS (1)

# of releaseToOSMaybe - # of TryMarkingBlocks = # of cases which is early exit.

At before, (30907 − 28092) ÷ 30907 ~= 9.1% cases are excluded from trying of marking blocks
With this CL, (30857 − 22342)÷30857 ~= 27.5% cases are excluded from trying of marking blocks

Even with some improvement, the rate is still lower than I thought. Will try to dig more opportunity.

cferris accepted this revision.Mar 21 2023, 6:12 PM

LGTM

This revision is now accepted and ready to land.Mar 21 2023, 6:12 PM