This is an archive of the discontinued LLVM Phabricator instance.

[libc++][hardening][NFC] Introduce `_LIBCPP_ASSERT_UNCATEGORIZED`.
ClosedPublic

Authored by var-const on Jun 26 2023, 2:00 PM.

Details

Summary

Replace most uses of _LIBCPP_ASSERT with
_LIBCPP_ASSERT_UNCATEGORIZED.

This is done as a prerequisite to introducing hardened mode to libc++.
The idea is to make enabling assertions an opt-in with (somewhat)
fine-grained controls over which categories of assertions are enabled.
The vast majority of assertions are currently uncategorized; the new
macro will allow turning on _LIBCPP_ASSERT (the underlying mechanism
for all kinds of assertions) without enabling all the uncategorized
assertions (in the future; this patch preserves the current behavior).

Diff Detail

Event Timeline

var-const created this revision.Jun 26 2023, 2:00 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: wenlei. · View Herald Transcript
var-const requested review of this revision.Jun 26 2023, 2:00 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript

Are there still instances of raw _LIBCPP_ASSERT?

This basically LGTM but I'd like to know the answer to this before we merge (also, CI).

var-const added a comment.EditedJun 27 2023, 9:19 AM

Are there still instances of raw _LIBCPP_ASSERT?

This basically LGTM but I'd like to know the answer to this before we merge (also, CI).

Yes, there are some:

  1. In debug.cpp -- I haven't touched those, hoping that https://reviews.llvm.org/D153672 makes them go away and to avoid a merge conflict (although it would be trivial to solve, to be fair).
  2. In test/support/test.support/test_check_assertion.pass.cpp.
  3. In tests under test/libcxx/assertions/.

For the tests, my reasoning was that they are testing the assertions machinery and thus should use the fundamental assertion mechanism directly. Let me know what you think!

Mordante accepted this revision as: Mordante.Jun 27 2023, 10:12 AM
Mordante added a subscriber: Mordante.

Is there already a plan which categories you want to make?
No objection to this approach, LGTM leaving the final approval to @ldionne.

Note: this patch is a prerequisite for https://reviews.llvm.org/D151637 (which is a draft but helps see the state towards which this patch is working).

Is there already a plan which categories you want to make?
No objection to this approach, LGTM leaving the final approval to @ldionne.

There are a few specific categories I have planned for short- to mid-term, but not a full plan for everything. The perfect (probably infeasible) state is if the standard library would check all the function preconditions, thus preventing all undefined behavior arising from violated preconditions. I don't think we could list out all the possible categories without going through preconditions of every single library function, which seems overkill at this point -- I think it would be better to focus on the most widely used functions and classes first.

I previously went through span, vector, string_view and the functions in <algorithm> and came up with:

  • "valid range" -- checks that the given input range is valid (the end is reachable from the beginning);
  • "valid container access" -- essentially checks that the index given to a container is valid. This also affects functions like vector::back() (which could be seen as vector[vector.size() - 1]. The important thing for this grouping is that validation can be performed using just the information within the container, i.e. without storing anything extra in the iterator;
  • "non-foreign iterator" -- checks that an iterator given to a container does indeed refer to the same container object;
  • "sorted, etc. input" (ad hoc name but captures the intention) -- for algorithms that expect the input to be sorted/partitioned/heapified, checks that the input indeed satisfies that criteria.

I'm sure going through other classes and functions would yield more categories, though I would also reuse these where possible. E.g. I plan to put checking that an optional or a function is not empty before accessing the stored data into the "valid container access" category (since those classes can be seen as containers holding 0 or 1 element).

ldionne accepted this revision.Jun 27 2023, 2:13 PM

LGTM w/ green CI (seems like you need to format a couple of headers).

[...]

For the tests, my reasoning was that they are testing the assertions machinery and thus should use the fundamental assertion mechanism directly. Let me know what you think!

Ok, that sounds great!

Is there already a plan which categories you want to make?

And users would enable _LIBCPP_ENABLE_HARDENED_MODE or _LIBCPP_ENABLE_DEBUG_MODE (names not final), but wouldn't actually control the exact assertion categories. Each of _LIBCPP_ENABLE_HARDENED_MODE and _LIBCPP_ENABLE_DEBUG_MODE would basically be a cherry-pick of various categories of assertions. That's the current plan at least.

This revision is now accepted and ready to land.Jun 27 2023, 2:13 PM
var-const updated this revision to Diff 535190.Jun 27 2023, 4:46 PM

Fix format.

Is there already a plan which categories you want to make?
No objection to this approach, LGTM leaving the final approval to @ldionne.

There are a few specific categories I have planned for short- to mid-term, but not a full plan for everything. The perfect (probably infeasible) state is if the standard library would check all the function preconditions, thus preventing all undefined behavior arising from violated preconditions. I don't think we could list out all the possible categories without going through preconditions of every single library function, which seems overkill at this point -- I think it would be better to focus on the most widely used functions and classes first.

I previously went through span, vector, string_view and the functions in <algorithm> and came up with:

  • "valid range" -- checks that the given input range is valid (the end is reachable from the beginning);
  • "valid container access" -- essentially checks that the index given to a container is valid. This also affects functions like vector::back() (which could be seen as vector[vector.size() - 1]. The important thing for this grouping is that validation can be performed using just the information within the container, i.e. without storing anything extra in the iterator;
  • "non-foreign iterator" -- checks that an iterator given to a container does indeed refer to the same container object;
  • "sorted, etc. input" (ad hoc name but captures the intention) -- for algorithms that expect the input to be sorted/partitioned/heapified, checks that the input indeed satisfies that criteria.

I'm sure going through other classes and functions would yield more categories, though I would also reuse these where possible. E.g. I plan to put checking that an optional or a function is not empty before accessing the stored data into the "valid container access" category (since those classes can be seen as containers holding 0 or 1 element).

Thanks for the info!