This is an archive of the discontinued LLVM Phabricator instance.

[libc++][hardening] Categorize more assertions.
ClosedPublic

Authored by var-const on Jul 20 2023, 11:16 AM.

Details

Reviewers
ldionne
Group Reviewers
Restricted Project
Commits
rG4a652e4a996f: [libc++][hardening] Categorize more assertions.

Diff Detail

Event Timeline

var-const created this revision. Jul 20 2023, 11:16 AM
Herald added a project: Restricted Project. · View Herald Transcript · Jul 20 2023, 11:16 AM
var-const requested review of this revision. Jul 20 2023, 11:16 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne added inline comments.
libcxx/include/__config
281

It would be great to rebase this on top of D155866.

287

Here we could add a short comment rationalizing the choice: nullptr conditions are generally meant to protect against dereferencing a null pointer, which typically results in a crash.

libcxx/include/__hash_table
1112

It would be nice to make sure that all the non-internal assertions we add are tested. In some cases this will mean removing XFAIL on an existing test, in other cases we'll need to add some tests but it should be easy.

libcxx/include/__string/char_traits.h
145–146

Maybe _LIBCPP_ASSERT_NON_OVERLAPPING_RANGES? Both ranges can technically be valid ranges yet still overlap. Also, the result of the algorithm will be incorrect if the ranges overlap, but I don't think this can directly cause memory unsafety, so I don't think we want this in the hardened mode. Using another category would allow excluding it.
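
A sketch of what such a check could amount to (illustrative only: the category name above is a suggestion, and copy_nonoverlapping is a hypothetical stand-in, not libc++ code):

cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sketch: require the source range [s2, s2 + n) to be disjoint from the
// destination range [s1, s1 + n) before doing a memcpy-style copy.
char* copy_nonoverlapping(char* s1, const char* s2, std::size_t n) {
  assert((s2 + n <= s1 || s1 + n <= s2) && "copy: ranges overlap");
  return static_cast<char*>(std::memcpy(s1, s2, n));
}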

libcxx/include/string
956 (On Diff #542605)

I think a lot of these NON_NULL checks are actually VALID_RANGE checks. http://eel.is/c++draft/iterators#iterator.requirements.general-11

Here we seem to be using __s != nullptr as a way to say __s is dereferenceable. Do we want to move these checks to ASSERT_VALID_RANGE? Do we want these checks in hardened mode? I'm not sure; I think we should probably review this with a security specialist.

963 (On Diff #542605)

Separate from the ASSERT_VALID_RANGE question, do we want to introduce helper functions to help check whether a range is valid?

var-const marked 2 inline comments as done.
  • Partially address feedback;
  • remove the non-null assertions from the patch;
  • rebase.
libcxx/include/string
956 (On Diff #542605)

It's a great question and a topic for discussion, IMO. I think reviewing this with a security expert would be great.

I'd prefer to avoid blocking the patch on this, though, given that the branch cutoff date is coming up, and a large part of the patch seems to be more straightforward than the questions related to non-null checks. I'd remove the NON_NULL part from this patch and do it in a follow-up instead; does that sound reasonable to you?

963 (On Diff #542605)

Yes, this is something I'd really love to do. I'd do this in a separate patch, though, if that's okay.
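
For illustration, a minimal sketch of such a helper (hypothetical; the eventual libc++ helper may be named and specified differently), using non-nullness as the cheaply checkable proxy for validity discussed above:

cpp
#include <cstddef>

// Hypothetical helper: treat [first, first + size) as valid when the range
// is empty or its pointer is non-null. This checks only the inexpensive
// part of the requirement, not full dereferenceability.
template <class T>
constexpr bool is_valid_range(const T* first, std::size_t size) {
  return size == 0 || first != nullptr;
}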

ldionne accepted this revision. Jul 21 2023, 10:44 AM

LGTM with tests!

libcxx/include/__string/char_traits.h
241

Do we want to change the message to mention overlapping ranges? Something to consider as we add tests.

This revision is now accepted and ready to land. Jul 21 2023, 10:44 AM

I really like this hardening work. I think it would be good to have a bit more discussion and tweaking after LLVM 17 has branched. I would love it if a security expert could have a look at it. I expect they have a better view on what the root causes of security issues are.

libcxx/include/__hash_table
1112

+1

libcxx/include/__tree
171

As mentioned in D155906, I really would like to have these kinds of cheap internal sanity checks enabled in _LIBCPP_ENABLE_HARDENED_MODE. They are not user errors, but they still cause potential issues for users, including users in safety-critical places. If there's not enough time to get that done before LLVM 17, I'm happy to defer it to LLVM 18.

libcxx/include/string
956 (On Diff #542605)

+1 for having a security expert have a look at this. I would like to have this check in hardened mode. It is cheap to check, and users of libc++ may do the "wrong thing". But not a blocker for me.

I really like this hardening work. I think it would be good to have a bit more discussion and tweaking after LLVM 17 has branched. I would love it if a security expert could have a look at it. I expect they have a better view on what the root causes of security issues are.

FWIW, we're planning to review everything with security experts in the coming weeks. The idea for now is to be conservative and only enable the checks that we are certain need to be added in the hardened mode.

var-const added inline comments. Jul 21 2023, 3:27 PM
libcxx/include/__tree
171

Thanks for starting this discussion!

My current thinking has been to keep the hardened mode as lightweight as possible. I have a big concern about the performance penalty blocking adoption; also, while I'm not 100% sure this can happen, my ideal would be for the hardened mode to become the default at some point (of course, users would still be able to opt out and use the unchecked mode). Because of that, so far I have been generally erring on the side of performance and using the following criteria for enabling an assertion:

  1. Is the failure directly triggered by user input?
  2. Is the failure directly exploitable?
  3. Is the failure condition relatively cheap to check?

Personally, I feel #2 is the most important distinction. Almost any error, including logic errors that don't trigger UB, can lead to very incorrect results and also contribute to a security vulnerability. I envision the hardened mode as only preventing errors that create a security vulnerability directly (e.g. an out-of-bounds read) -- the kind of errors where we can confidently say "if your program has one of those, it is vulnerable, regardless of any other context" (and thus we can justify the extra penalty for checking those).
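
As a concrete illustration of criterion #2 (a sketch, not actual libc++ code), a bounds check guarding element access is the kind of check that clears this bar:

cpp
#include <cassert>
#include <cstddef>

// Illustrative only: an out-of-bounds read is directly exploitable, so a
// hardened-mode-style check guards every element access.
template <class T, std::size_t N>
struct checked_buffer {
  T data[N];
  T& operator[](std::size_t i) {
    assert(i < N && "out-of-bounds access");
    return data[i];
  }
};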

From that perspective, it follows that checks for null pointers should not be a part of the hardened mode (FWIW, it was somewhat surprising to me, or at least I found it counterintuitive). From what I know (and I'd be happy to be corrected on this if I'm missing something), dereferencing a null pointer on any relatively modern OS leads to a crash and is not directly exploitable (it might contribute to a more complicated exploit arising from context, but so can many other types of failures).

The upside I'm hoping we get from limiting the number of checks we perform is wider adoption. The intention behind hardened mode is not that it provides complete safety or entirely eliminates UB -- rather, it makes your program measurably safer at a "negligible" cost; it relatively cheaply protects against the most catastrophic stuff. I'm hoping something like the 80/20 principle applies here (meaning that just a fraction of the cost that would be necessary to protect against all UB is enough to prevent 80% of exploitable vulnerabilities). The "perfect" hardened mode (aspirational, not realistic) is the mode where if you were to enable it, you would be guaranteed that an attacker could not use your program to achieve arbitrary code execution, and it's cheap enough to use in production.

We currently have the hardened mode and the debug mode. While I'd prefer to keep this as simple as possible and stick with the 2 modes, I'd be open to exploring adding a 3rd mode in-between those two that would contain more checks than the hardened mode while still aiming at some reasonable level of performance (which for the debug mode is almost a non-goal).

Tagging @ldionne as well in case he has a different perspective.

I really like this hardening work. I think it would be good to have a bit more discussion and tweaking after LLVM 17 has branched. I would love it if a security expert could have a look at it. I expect they have a better view on what the root causes of security issues are.

FWIW, we're planning to review everything with security experts in the coming weeks.

Great to hear, thanks.

libcxx/include/__tree
171

I'm not sure whether dereferencing a null pointer will always lead to a crash on a modern OS. I can imagine some cases where that might be an issue. Embedded systems may use less well-known OSes. Are kernel drivers allowed to modify address 0 or other low addresses? (This assumes a nullptr is mapped to address 0, which is not required by the Standard.)

Maybe we should ask the security experts. They can probably tell us how often dereferencing a nullptr leads to real-world issues.

Even when it's not a security issue, I think it would be good to test for a nullptr where one is not allowed. Dereferencing a nullptr is known to be quite a large issue in both C and C++. I'm also open to a third mode. Maybe we should measure what the overhead of adding null checks is. When we annotate the check with a likelihood attribute telling the compiler to expect a valid pointer, the branch predictor will almost always guess correctly, so the overhead should be low. The nullptr path will be more expensive, but in that case performance is no longer relevant.
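
A minimal sketch of that pattern (assuming the C++20 likelihood attributes; the exact annotation libc++ would use may differ):

cpp
#include <cstdlib>

// The null check is annotated as unlikely, so the compiler lays out the
// non-null fast path favorably; a correctly predicted branch stays cheap.
void store(int* p, int value) {
  if (p == nullptr) [[unlikely]] {
    std::abort(); // failure path; performance is irrelevant here
  }
  *p = value;
}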

I think we can move forward this way. Once we put everything in the right category, we can measure the overhead of every category. That way we have numbers for the performance overhead and can make an informed decision.

I think it would be good to discuss this on Discord/Discourse to get input from more people in the LLVM community. Again, I'm fine with doing that for LLVM 18 and keeping the current approach in LLVM 17.

var-const added inline comments. Jul 24 2023, 2:55 PM
libcxx/include/__hash_table
1112

As discussed offline -- I will do this in a follow-up, simply due to the release timing. I did a quick survey and counted 83 untested assertions (among the ones currently categorized), so unfortunately it will take a while.

libcxx/include/__tree
171

Maybe we should ask the security experts. They can probably tell us how often dereferencing a nullptr leads to real-world issues.

Absolutely. I think these are all great questions to discuss.

I'm not sure whether dereferencing a null pointer will always lead to a crash on a modern OS.

That's my understanding but I would absolutely love a domain expert to confirm.

Embedded systems may use less well-known OSes. Are kernel drivers allowed to modify address 0 or other low addresses? (This assumes a nullptr is mapped to address 0, which is not required by the Standard.)

IMO, the hardened mode should only contain checks that apply to almost every user; it shouldn't make users of widely-used systems pay a performance penalty for the sake of more niche OSes. I'd much rather think of a way to let vendors of these kinds of systems enable additional null pointer checks on top of the hardened mode, whether via a third mode or some finer-grained approach. My overarching concern is users avoiding the hardened mode due to performance concerns, whether real or perceived; I'm afraid that enabling too many checks might backfire and result in less security due to a lack of use.

Maybe we should measure what the overhead of adding null checks is.

+1.

When we annotate the check with a likelihood attribute telling the compiler to expect a valid pointer, the branch predictor will almost always guess correctly, so the overhead should be low.

There was recently a Discourse thread about how the assume attribute can prevent certain optimizations, so I'm not sure we can always rely on that.

I think we can move forward this way. Once we put everything in the right category, we can measure the overhead of every category. That way we have numbers for the performance overhead and can make an informed decision.

I think it would be good to discuss this on Discord/Discourse to get input from more people in the LLVM community. Again, I'm fine with doing that for LLVM 18 and keeping the current approach in LLVM 17.

Agreed -- once the release is done, I'd like to get some performance data and start a wider discussion.

This revision was automatically updated to reflect the committed changes.
AdvenamTacet added inline comments. Aug 1 2023, 12:57 PM
libcxx/include/__tree
171

In Linux userspace programs, a null pointer dereference will result in a SIGSEGV signal sent to the process if the program didn't allocate memory at address zero. Nowadays, this is not possible due to the vm.mmap_min_addr runtime kernel parameter, which is usually set to 65536 (although there are probably systems that set it to 4096 [0]). However, a root user can change it with sysctl -w vm.mmap_min_addr=0, and then programs can allocate memory at address zero using the mmap syscall:

cpp
#include <sys/mman.h>
#include <unistd.h> // sysconf()

// Maps one page at address 0; only succeeds when vm.mmap_min_addr is 0.
mmap(0, sysconf(_SC_PAGE_SIZE), PROT_READ | PROT_WRITE,
     MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);

A program can also register a SIGSEGV handler, which could perhaps be an attack vector in the case of null pointer dereferences, although this is rather unlikely as the handler would have to be very specific.
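
For illustration (a minimal sketch, not a demonstrated exploit), this is the kind of handler registration that would have to be present for a null dereference to do anything other than terminate the process:

cpp
#include <signal.h>
#include <unistd.h>

// Install a SIGSEGV handler; only async-signal-safe calls are legal inside.
extern "C" void on_segv(int) {
  const char msg[] = "caught SIGSEGV\n";
  write(STDERR_FILENO, msg, sizeof(msg) - 1);
  _exit(1);
}

int main() {
  struct sigaction sa {};
  sa.sa_handler = on_segv;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, nullptr);
  volatile int* p = nullptr;
  return *p; // null dereference: the handler runs instead of the default crash
}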

However, a null pointer dereference can be used to launch a denial-of-service (DoS) attack.

On the Linux kernel side, null pointer dereferences were exploitable in the past when both the vm.mmap_min_addr and SMEP/SMAP [1] mitigations (or equivalents on e.g. ARM) were not there. In such a case, a null pointer dereference in the kernel could actually fetch the data from a userspace program that invoked a syscall.

Null pointer dereferences in the kernel may also end up as a "kernel oops", and it recently turned out that this behavior was exploitable, as shown by the Google Project Zero team [2]. However, this bug is already fixed in the upstream kernel repository. It is also worth mentioning that libc++ is not used in the kernel code.

Given all of the above, I believe that paying the overhead of null pointer checks in the hardened mode is not necessary, at least not on Linux.

[0] https://grep.app/search?q=mmap_min_addr%3D
[1] https://wiki.osdev.org/Supervisor_Memory_Protection
[2] https://googleprojectzero.blogspot.com/2023/01/exploiting-null-dereferences-in-linux.html

PS: Is there a Discourse thread where we can post this for long-term visibility and further discussion?