This is an archive of the discontinued LLVM Phabricator instance.

[scudo] Adding a public Scudo interface
ClosedPublic

Authored by cryptoad on Dec 12 2017, 1:01 PM.

Details

Summary

The first and only function to start with allows to set the soft or hard RSS
limit at runtime. Add associated tests.

Diff Detail

Event Timeline

cryptoad created this revision.Dec 12 2017, 1:01 PM
Herald added subscribers: Restricted Project, mgorny. · View Herald TranscriptDec 12 2017, 1:01 PM

I should add here that this is required for programs wanting to set their RSS limit based on computations done at runtime due to hardware constraints or other factors.
From a security perspective, this could have an impact if called with parameters controlled by a malicious user, or if set too low by a program (memory exhaustion exploitation is on the rise).

To address your security concern, maybe have a define to compile Scudo without public API?

cryptoad updated this revision to Diff 126778.Dec 13 2017, 10:06 AM

Added some words about security in the interface comment.
Added a define to allow the disabling of the function.

alekseyshl accepted this revision.Dec 13 2017, 10:15 AM

Maybe add a test case for !SCUDO_CAN_USE_PUBLIC_INTERFACE?

This revision is now accepted and ready to land.Dec 13 2017, 10:15 AM
cryptoad updated this revision to Diff 126782.Dec 13 2017, 10:21 AM

Adding a TODO to add tests for the various defines.
Currently the tests we run leverage a static library built with a default
configuration. We should add gtests like the other sanitizers to ensure all
the compile time configuration options work as expected.

This revision was automatically updated to reflect the committed changes.

The added test has been disabled on armhf in rCRT320665 because it fails a bot: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/2952/steps/ninja%20check%202/logs/FAIL%3A%20Scudo-armhf%3A%3A%20interface.cpp

I see the exact same error on x86_64 CentOS 7, did you find out what the problem is? I'm happy to assist if you have problems to reproduce the problem...

The added test has been disabled on armhf in rCRT320665 because it fails a bot: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/2952/steps/ninja%20check%202/logs/FAIL%3A%20Scudo-armhf%3A%3A%20interface.cpp

I see the exact same error on x86_64 CentOS 7, did you find out what the problem is? I'm happy to assist if you have problems to reproduce the problem...

I haven't been able to repro the issue. If you have a configuration or some insight as to what the failure is, please let me know!

The added test has been disabled on armhf in rCRT320665 because it fails a bot: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/2952/steps/ninja%20check%202/logs/FAIL%3A%20Scudo-armhf%3A%3A%20interface.cpp

I see the exact same error on x86_64 CentOS 7, did you find out what the problem is? I'm happy to assist if you have problems to reproduce the problem...

I haven't been able to repro the issue. If you have a configuration or some insight as to what the failure is, please let me know!

Any normal build on CentOS 7 failed reliably. I've submitted D41649 which solves the problem for me, I'll see if this also fixes the failure on ARM...

Side note: Do I understand this correctly that scudo will only get active on an intercepted malloc and exceeded limits without a malloc have no effect? That's a huge restriction, probably most applications in the HPC world first allocate all structures and only start touching the memory afterwards...

Any normal build on CentOS 7 failed reliably. I've submitted D41649 which solves the problem for me, I'll see if this also fixes the failure on ARM...

Cool thanks! Will have look.

Side note: Do I understand this correctly that scudo will only get active on an intercepted malloc and exceeded limits without a malloc have no effect? That's a huge restriction, probably most applications in the HPC world first allocate all structures and only start touching the memory afterwards...

I am not sure I understand correctly the question. So I will attempt an answer, and if it doesn't apply, let me know.

If compiled with Scudo, all allocation functions are handled by the allocator (malloc, realloc, calloc, memalign, etc).
The soft/hard RSS limit feature comes from a need from some applications to enforce an upper limit to the memory used.
This is mostly needed due to the fact that we reserve a large portion of the address space, and as such, setting an rlimit is not viable.
The RSS limit comes as an opportunistic, best-effort, check to allow for some type of upper bound check.
It is not enabled by default (eg: 0), and it is expansive (a few syscalls to read from /proc).
If an application uses directly mmap (or any OS backed memory allocation function), we would only catch a limit "overflow" on the next use of a heap allocation function.
Also if we allocate/free some memory backed by the Secondary allocator quickly (before the check interval elapses), we wouldn't catch it as well.

I hope this answers your question.

Another point:
For ASan, an identical check exists as a background thread. We decided not to do that due to some requirements on Android.
It could potentially come back as a background thread check on supported platforms if that would address the concerns you are raising.

I am not sure I understand correctly the question. So I will attempt an answer, and if it doesn't apply, let me know.

If compiled with Scudo, all allocation functions are handled by the allocator (malloc, realloc, calloc, memalign, etc).
The soft/hard RSS limit feature comes from a need from some applications to enforce an upper limit to the memory used.
This is mostly needed due to the fact that we reserve a large portion of the address space, and as such, setting an rlimit is not viable.
The RSS limit comes as an opportunistic, best-effort, check to allow for some type of upper bound check.
It is not enabled by default (eg: 0), and it is expansive (a few syscalls to read from /proc).
If an application uses directly mmap (or any OS backed memory allocation function), we would only catch a limit "overflow" on the next use of a heap allocation function.
Also if we allocate/free some memory backed by the Secondary allocator quickly (before the check interval elapses), we wouldn't catch it as well.

I hope this answers your question.

Yes, that confirms my understanding that the following logic wouldn't be catched:

  1. Call malloc and allocate all arrays that will be needed.
  2. Initialize the arrays - this commits the memory to count as RSS.
  3. (Computation on arrays)
  4. free the arrays.

Another point:
For ASan, an identical check exists as a background thread. We decided not to do that due to some requirements on Android.
It could potentially come back as a background thread check on supported platforms if that would address the concerns you are raising.

This could work although I'm only trying to understand what this sanitizer is able to detect.