This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Add strict weak ordering checks to sorting algorithms
ClosedPublic

Authored by danlark on May 10 2023, 5:59 AM.

Details

Summary

This is the implementation of the first proposal of strict weak ordering checks described in https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217

This targets the most vulnerable algorithms like std::sort

Diff Detail

Event Timeline

danlark created this revision.May 10 2023, 5:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2023, 5:59 AM
danlark requested review of this revision.May 10 2023, 5:59 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
danlark updated this revision to Diff 520997.May 10 2023, 8:02 AM

Fix includes

danlark updated this revision to Diff 521598.May 12 2023, 3:31 AM

Fix incorrect tests and ADL lookups

danlark updated this revision to Diff 521604.May 12 2023, 4:47 AM

Merge all changes I had about includes and tests

danlark updated this revision to Diff 521635.May 12 2023, 6:48 AM

Follow the standard library lint

philnik added inline comments.May 12 2023, 9:00 AM
libcxx/include/__debug_utils/strict_weak_ordering_check.h
15

IMO we shouldn't include stuff based on whether we are in debug mode or not.

29

If you never use the _AlgPolicy there is no point in passing it.

33
50

Could we print the indices/types if they can be printed instead of just saying "you are wrong"?

libcxx/include/module.modulemap.in
1117–1119
libcxx/test/libcxx/algorithms/alg.sorting/assert.sort.invalid_comparator.pass.cpp
172

Could you use a struct which violates the condition instead? floating-point types specifically are almost trivial to diagnose at compile-time. A struct which uses a float should be hard enough to diagnose (at least from the library).

libcxx/test/std/algorithms/alg.sorting/alg.heap.operations/sort.heap/complexity.pass.cpp
72

Could this be extended by a constant instead?

libcxx/test/std/algorithms/alg.sorting/alg.heap.operations/sort.heap/ranges_sort_heap.pass.cpp
210

Why is this changed?

266

Could this be extended by a constant instead?

libcxx/utils/data/ignore_format.txt
308

Please clang-format the header.

danlark updated this revision to Diff 522079.May 15 2023, 1:23 AM
danlark marked 8 inline comments as done.

Address comments

libcxx/include/__debug_utils/strict_weak_ordering_check.h
15

This is consistent with __debug_utils/randomize_range. I believe the intention is not to expose more headers than needed for compile times

50

This information might be incorrect. As said in the RFC, quadratic algorithm can show if there is a problem, not the triple that violates the condition

In this case we figured out a pair in the range that we think consist of equal elements to be comparable, say, it can be with the epsilon float sorting (2 elements are considered equal if their diff is no more than eps)

eps/2, eps, 2eps

We will detect a pair eps/2, 2eps and we will show it to the user. It does not explain much on what happened.

libcxx/test/std/algorithms/alg.sorting/alg.heap.operations/sort.heap/ranges_sort_heap.pass.cpp
210

Because 2, 1, 3 is not a max heap, calling sort_heap is incorrect in this case

danlark updated this revision to Diff 522092.May 15 2023, 2:19 AM

Remove arg policy

philnik added inline comments.May 15 2023, 9:55 AM
libcxx/include/__debug_utils/strict_weak_ordering_check.h
15

I can't remember why I did it that way, but the compile times for these headers are mot likely not relevant. This is also not the style in the rest of the code base.

libcxx/utils/data/ignore_format.txt
308

When the header is formatted this change shouldn't be necessary.

danlark updated this revision to Diff 522512.May 16 2023, 2:41 AM

Fix comments

danlark marked 2 inline comments as done.May 16 2023, 2:42 AM
philnik accepted this revision.May 26 2023, 3:42 PM

LGTM

This revision is now accepted and ready to land.May 26 2023, 3:42 PM
danlark added a comment.EditedMay 30 2023, 5:44 AM

I don't have commit rights, can you help me submit the change?

Danila Kutenin
kutdanila@yandex.ru

I don't have commit rights, can you help me submit the change?

Danila Kutenin
kutdanila@yandex.ru

I tried to land this patch on your behalf, but there are some merge conflicts. Can you rebase the patch on main?

libcxx/test/libcxx/private_headers.verify.cpp
345

This file is removed upstream, can you revert this hunk before rebasing?

llvm/utils/gn/secondary/libcxx/include/BUILD.gn
381

FYI this is done automatically.

danlark updated this revision to Diff 528216.Jun 4 2023, 7:36 AM

Rebase to head

I don't have commit rights, can you help me submit the change?

Danila Kutenin
kutdanila@yandex.ru

I tried to land this patch on your behalf, but there are some merge conflicts. Can you rebase the patch on main?

All done, thanks!

I don't have commit rights, can you help me submit the change?

Danila Kutenin
kutdanila@yandex.ru

I tried to land this patch on your behalf, but there are some merge conflicts. Can you rebase the patch on main?

All done, thanks!

Thanks! I'll wait for the current CI run to finish and land the patch when it's green.

This revision was landed with ongoing or failed builds.Jun 4 2023, 10:27 AM
This revision was automatically updated to reflect the committed changes.

The build failure is a test that is a bit flaky. Landed on your behalf.

ldionne added inline comments.Jul 17 2023, 8:22 AM
libcxx/include/__algorithm/sort.h
703

Sorry for being so late to the party, but I think there's a problem here. We should decide whether we want to use these added facilities or __comp_ref_type to catch these kinds of issues. Right now, we have a weird situation where both can potentially be used at the same time. This means that we have the following mechanisms in place and enabled independently:

  1. The assertions I added that ensure that we would not dereference iterators out-of-bounds inside std::sort. Those are arguably hardening assertions, not only debug mode assertions.
  2. The existing __comp_ref_type mechanism which aims to catch invalid strict weak orders at the predicate level.
  3. The new mechanism introduced by this patch.

Is there a reason why we'd want to have these three severely overlapping mechanisms (and especially within the same algorithm)?

ldionne added inline comments.
libcxx/include/__algorithm/sort.h
703
philnik added inline comments.Jul 17 2023, 9:00 AM
libcxx/include/__algorithm/sort.h
703

I think they are complementary. All of them have their strengths and weaknesses.

  • The assertions are relatively light weight, but miss a lot of the strict weak ordering issues. As you said yourself, these can be considered hardening asserts.
  • The __comp_ref_type catches a lot more issues, but can't check across multiple pairs.
  • This new check is very expensive, making it only suitable for a subrange and only in debug mode, but it catches a lot of the strict weak ordering issues.

I think the assertions aren't useful to combine with __comp_ref_type and the new checks, but also don't hurt. Having both for larger ranges makes sense, since the __comp_ref_type can check the whole range.

ldionne added inline comments.Jul 17 2023, 2:53 PM
libcxx/include/__algorithm/sort.h
703

Ok, that's something I could go with. However, in that case we need to reinstate the invalid_comparator.pass.cpp test to do what it did before. It was a hardening test that ensures that we don't go OOB inside these algorithms when the comparator is wrong. The goal was to ensure that we don't do that bad thing even when the debug mode is disabled. After this patch, however, that test became a test for the newly introduced strict weak order checks, which only make sense in debug mode.

@danlark Do you think you could pick that up?

danlark added inline comments.Jul 18 2023, 3:27 AM
libcxx/include/__algorithm/sort.h
703

I think we still check for what you ask in the comment:

Line 87:

TEST_LIBCPP_ASSERT_FAILURE(std::sort(copy.begin(), copy.end(), checked_predicate), "Would read out of bounds");

The test checks that out of bound reads check fires first. Then if it is not able to identify anything, the strict weak ordering check asserts.