Also test all the range algorithms to verify the support.
Details
- Reviewers
philnik huixie90 ldionne jdoerfert - Group Reviewers
Restricted Project - Commits
- rGa7c3379cf9a4: [libc++][ranges] Make range algorithms support proxy iterators
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
libcxx/include/__algorithm/copy_backward.h | ||
---|---|---|
31 | @philnik I think this was a bug; it wasn't caught because copy_backward tests never try the case where the sentinel isn't the same type as the iterator. This fix is enough to make robust_against_proxy_iterators pass; however, if I change test_iterators in copy_backward test to define class Sent as = sentinel_wrapper<In>, I get more breakages. Could you please look into this? | |
libcxx/include/__algorithm/inplace_merge.h | ||
2 | These changes are simple but very intrusive. The vast majority is replacing unqualified calls to swap with _IterOps<_AlgPolicy>::iter_swap, and calls to *i1 = std::move(*i2) with _IterOps<_AlgPolicy>::__iter_move(i2). | |
libcxx/include/__algorithm/iterator_operations.h | ||
50 | Should this be moved to a different file? Or perhaps this file should be renamed from iterator_operations.h to something more generic? | |
libcxx/include/__algorithm/ranges_fill.h | ||
33 | @philnik I think this was a bug -- please let me know if I'm missing something, or if you would prefer a different fix. | |
libcxx/include/__algorithm/sort.h | ||
226 | This is one case where the difference between ranges and classic versions isn't confined to iterator operations (but stable_sort would require a similar call to rotate). | |
libcxx/src/algorithm.cpp | ||
16 | @ldionne This gives me some concern. Do these changes look reasonable to you? If yes, should we also add specializations for _RangeAlgPolicy? | |
libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp | ||
152 | Uninitialized memory algorithms don't work with ProxyIterator because it doesn't satisfy nothrow-input-iterator that requires the result of dereferencing the iterator to be the same as iter_reference_t. I'm not sure if it's possible to write a different valid proxy iterator that would satisfy that constraint but still require special handling in the implementation. @huixie90 What do you think? |
libcxx/include/__algorithm/copy_backward.h | ||
---|---|---|
31 | Since D128864 fixes this properly I would prefer if you revert the changes here and I enable the additional tests over there instead. | |
libcxx/include/__algorithm/inplace_merge.h | ||
62 | Why are you adding an alias here? It doesn't look like much of a benefit for two calls. Ditto for a lot of other places. | |
libcxx/include/__algorithm/iterator_operations.h | ||
50 | I'm already not very happy that we pull all the iterator operations in. We should avoid pulling whole algorithms in. Instead, I think the better option would be to rewrite min_element to support ranges like that: template <class _Iter, class _Sent, class _Proj, class _Comp> _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_AFTER_CXX11 _Iter __min_element(_Iter __first, _Sent __last, _Comp& __comp, _Proj& __proj) { if (__first == __last) return __first; _Iter __i = __first; while (++__i != __last) if (std::__invoke(__comp, std::__invoke(__proj, *__i), std::__invoke(__proj, *__first))) __first = __i; return __first; } template <class _ForwardIterator, class _Compare> _LIBCPP_NODISCARD_EXT inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_AFTER_CXX11 _ForwardIterator min_element(_ForwardIterator __first, _ForwardIterator __last, _Compare __comp) { static_assert(__is_callable<_Compare, decltype(*__first), decltype(*__first)>::value, "The comparator has to be callable"); auto __proj = __identity(); return std::__min_element(__first, __last, __comp, __proj); } and update libcxx/algorithms/callable.verify.cpp. | |
libcxx/include/__algorithm/sort.h | ||
142 | Shouldn't this be is_same<decltype(*declval<_Iter>()), _Tp&>::value? Otherwise this would disable the optimization for (almost) all iterators, especially raw pointers. Also, could you qualify declval? While it's not strictly necessary it makes error messages a lot better. | |
226 | As I already said above, I think the better alternative is to make __min_element work with both classic and ranges iterators. | |
libcxx/src/algorithm.cpp | ||
16 | I'm quite certain that this is an ABI break, since template parameters are mangled. I think you have to revert these changes and add a __sort_policy or something like that as a dispatch function. Regarding _RangeAlgPolicy I think we should wait a few releases and try things out in the unstable ABI before effectively killing the possibility of changing the ABI in any way (assuming we want to do that at all). | |
libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp | ||
1 | IIUC some types of problems are only obvious when looking at the output of the algorithms, so I think it's not enough to ensure that the algorithms can be instantiated. |
libcxx/include/__algorithm/inplace_merge.h | ||
---|---|---|
68 | should this call std::ranges::move for ranges policy? or Algs<_AlgPolicy>::move if that exists? | |
185 | I guess we want to __rotate<AlgPolicy> when it exists , perhaps add a todo here? | |
202 | while you are here, perhaps consider remove these stale comments? | |
libcxx/include/__algorithm/make_heap.h | ||
28 | we have two different patterns for these __ functions.
The problem is that when these two patterns co-exists, let's say __a takes two different types, and __b takes the same type. If __a needs to call __b, it could fail because __b expecting the same type. I think we should make all algorithms consistent. | |
libcxx/include/__algorithm/sort.h | ||
142–166 | I think you don't need to guard the is_same<decltype(*declval<_Iter>()), _Tp>::value The algorithm is correct for proxy iterator, because of the correct use of value_type. It'd only be would be wrong if it is using auto Proxy Iterator needs to define value_type in the correct way to make these swap patterns work. Normally these swap patterns need to be combined with iter_move (the triple-move swap), but here we are dealing with is_arithmetic types so moves are just copies | |
337–340 | nit: here we know that __i2 is a pointer so std::move(*__it2) is actually ok? (to be confirmed) | |
560–596 | hmm | |
libcxx/src/algorithm.cpp | ||
18–52 | Can we introduce some new names like __sort_with_al_policy, which is the common sharable code between classic and ranges, and we keep __sort only for classic algorithm which calls the __sort_with_al_policy with classic policy so that these symbols are stable and have the same behaviour | |
libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp | ||
14 | It is great to have a test for iter_swap. But this test won't cover the iter_move. If an algorithm doesn't use iter_move but use std::move(*it), it would still compile (for copyable types) because value_type x = std::move(*it) would probably ends with copies. For algorithms that do move stuff, we should probably test them with proxy iterator of move only types so that it will catch incorrect use of std::move | |
28–45 | these tests are great and they can catch some problems immediately. and thanks for adding them. However, since these tests don't actually verify the results of the algorithms, I think we should add additional tests in individual tests in the case-by-case manner. For example, for sorting, we can test given a vector<int>, we can create a ProxyRange, sort the ProxyRange and verify the original vector is sorted | |
152 | The nothrow-xxx-iterator explicitly requires std::same_as<std::remove_cvref_t<std::iter_reference_t<I>>, std::iter_value_t<I>>, which basically means value_type& == reference, which rules out all the proxy iterators. It makes sense for writing to uninitialized memory because we do need the address of *it to be something that we can write to (not the proxies). So for most uninitialized algorithms, they are designed not to work with proxies. So we can't test them. However, there are two exceptions. for uninitialized_copy and uninitialized_move, the inputs and outputs are separate. So in theory the inputs can be ProxyRange and the outputs can be a simple array which does model nothrow-xxx-iterator | |
libcxx/test/support/test_iterators.h | ||
881–882 | hmm. the two overloads above should already cover this. see the test or did I missing something? | |
895–896 | thanks for adding this. |
Address feedback:
- in sort.h, attach _AlgPolicy to the _Comp template argument for the functions where adding a new template parameter would break the ABI;
- run the test;
- other fixes.
Also fix the CI.
Rebase on main.
libcxx/include/__algorithm/copy_backward.h | ||
---|---|---|
31 | Sure, reverted this (also commented out copy_backward in the robust test file and added a TODO). | |
libcxx/include/__algorithm/inplace_merge.h | ||
62 | Removed from here and elsewhere. If there are other places you'd like to change, please let me know. | |
68 | Very good point, I thought about this. Added a TODO for now. | |
202 | I'd rather not add unrelated changes to this patch -- it's already quite large. | |
libcxx/include/__algorithm/iterator_operations.h | ||
50 | Discussed offline. I slightly prefer the current approach, but not enough to spend too much time discussing this. Went with the approach you suggested. | |
libcxx/include/__algorithm/make_heap.h | ||
28 | This should be caught by testing, so I don't think it should cause any runtime issues, but it can definitely be confusing when reading the code. However, sometimes it reflects the nature of the algorithm -- for some algorithms it is essential that __last is a random-access or bidirectional iterator, and for those it makes sense to reflect that in the signature. For others, __last is only ever used to serve as a guard, so they can trivially support sentinels. Perhaps using different names rather than __last having two meanings would make this a little more readable? | |
libcxx/include/__algorithm/sort.h | ||
142 | Thanks a lot for catching this, it's a very unfortunate typo. I removed this condition altogether thanks to @huixie90's analysis below. | |
142–166 | Thanks a lot for doing this analysis! This really makes things much simpler. I added a brief comment to the function as well. | |
337–340 | This is a very good observation, thanks! | |
560–596 | This is reverted now due to @philnik's comment (it breaks the ABI). | |
libcxx/src/algorithm.cpp | ||
16 | Thanks a lot for calling out the ABI break, I was concerned about that and it makes sense. There's no choice but to revert this change to the template instantiation and come up with a different approach. For the different approach, I would strongly prefer to avoid:
We discussed this offline, and you suggested (IIUC -- feel free to correct my summary) essentially to rename the functions that are being explicitly instantiated, then add one-liner functions with the old instantiated names that just forward arguments to the renamed functions. IIUC, inlining would make sure both the forwarding function _and_ the actual implementation function end up in the dylib, even though only the forwarding function is being instantiated. This sounds like a good approach, but I'm _really_ concerned about fiddling with the ABI so close to the branch cut. It would also take me a while to verify that this works correctly (i.e., no or negligible binary bloat, implementation functions are actually in the dylib). With that in mind, I decided to go with a different approach. It might be suboptimal long-term, but it's easy to implement and shouldn't affect the ABI at all. I want to land this patch ASAP because the intrusive changes to sort and heap algorithms affect a few of the unimplemented ranges patches. As long as this patch makes things better and doesn't prevent us from using a different approach later, I think it should be good to land. What I did was to keep the same number of template arguments in __sort, __sort5, and __insertion_sort_incomplete (the ones being explicitly instantiated) and attach the policy tag to the comparator instead. The way it works is:
It's certainly complicated and not pretty, but I think it solves the proxy iterator problem, shouldn't cause any performance regressions and leaves the door open for changing this in the future. What do you think? | |
18–52 | So far I went with wrapping the policy tag, which seems to be the least verbose and least intrusive change (at the cost of some complexity). See one of my comments above in a thread started by @philnik for a bit more context. Please let me know what you think. | |
libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp | ||
1 | I have somewhat mixed feelings about this. In short, I feel that the Pareto principle applies and this test would catch, say, 80% of the issues for 20% of the effort -- I expect that compilation issues are a lot more likely than subtle incorrect behavior. I would be against adding specific test cases for proxy iterators across all algorithms. However, I think we have algorithms where ProxyIterator is simply added as one more iterator type to instantiate a test_iterators function, which I think is great and should offer sufficient testing. But even with that, I still feel this test file has value because it's compact. It's easy to maintain and extend (for example, adding another proxy iterator type to this file is way easier than going across all the ~100 algorithm test files). It's also way easier to verify that every relevant algorithm is tested in this file rather than going through all of our algorithm tests. I think those upsides outweigh the incomplete coverage. I changed this to be a runtime test, but I don't want to block this patch on adding checks for the output -- I think it improves our test coverage even without those checks which we can add in a follow-up. | |
14 | This is a great point and actually something I thought about -- we should probably test _all_ customization points. I'd prefer to do this in a follow-up, though -- I think this patch is already quite big and it makes the coverage strictly better even if it doesn't achieve one hundred percent coverage. Does that sound reasonable to you? To be clear, I plan to do the follow-up immediately after landing this. I want to land this patch ASAP because the changes to sort.h and friends are going to cause a lot of merge conflicts for the not-yet-implemented algorithms (e.g. inplace_merge). By the way, do you think we should try to add all checks to the same ProxyIterator class or create a separate test iterator (and sentinel and range) for each customization point? | |
28–45 | Please see my other comment above -- in short, I think it would be great to add ProxyIterator as one of the iterator types that instantiate test_iterators across our algorithms. I wouldn't add specific test cases, though, because I don't think those can be maintained well across all the ~100 algorithms. In either case, this is something that I feel should be done in a follow-up. | |
152 | Thanks a lot for looking into this! It's very helpful. It seems that the following constraint: requires constructible_from<iter_value_t<_OutputIterator>, iter_reference_t<_InputIterator>> also prevents proxy iterators from being used with uninitialized_{copy,move}? | |
libcxx/test/support/test_iterators.h | ||
881–882 | I don't want to spend too much time investigating this just because of the deadline coming soon (although it's a pretty interesting issue). The test fails when the assigned type is const (which currently isn't tested). My (unconfirmed) suspicion is that the implicitly-generated default operator= is a better match when the type on the right side is const Proxy&. Choosing between a template and a non-template function, overload resolution prefers a non-template function if both are an equally good match; const Proxy& is an exact match for the signature of the default operator=. Confusingly, however, this works for the move assignment operator, even though the same logic about it being a better match seems to apply. cppreference says that "A deleted implicitly-declared move assignment operator is ignored by overload resolution". Presumably this doesn't apply to a deleted implicitly-declared _copy_ assignment operator, which would explain the difference in behavior. |
libcxx/include/__algorithm/copy_backward.h | ||
---|---|---|
31 | Now that I'm running this test, I'm also getting an error from move_backward. Hopefully the other patch will fix those as well; commented it out for now. |
LGTM for now with the min_element change and green CI. I think there should be a few follow-up PRs to clean some things up though.
libcxx/include/__algorithm/iterator_operations.h | ||
---|---|---|
50 | Am I missing something, or did you not update this part yet? | |
libcxx/include/__algorithm/sort.h | ||
34–80 | I'll try to make a follow-up PR to remove this, but as I said I'm Ok with it for now. | |
55 | __enable_if_t? | |
73 | __enable_if_t? |
Changes to min_element + other feedback.
libcxx/include/__algorithm/iterator_operations.h | ||
---|---|---|
50 | Sorry, looks like I messed up merging. Should be good now, thanks for checking. |
LGTM with green ci
we could follow up with more testing for iter_move
libcxx/include/__algorithm/make_heap.h | ||
---|---|---|
28 | it won't trigger runtime issues , just compile time issues. It is not a matter of whether the Opts::next is called on the caller or here. I am not sure what is the best approach, I prefer to making all __ algorithms to take an iterator and a sentinel of different types (obviously it would still work if you pass two iterators of same type), so that both classic and ranges algorithms can call it and won't worry about calling opts::next. It is the __ algorithm to decide whether to call opts::next or not. But in any case, we could revisit later if we are getting confused | |
libcxx/include/__algorithm/nth_element.h | ||
76 | question: does this algorithm have the "robust against copy comparator test"? right now the comparator is copied several times but I guess this is fine | |
libcxx/include/__algorithm/pop_heap.h | ||
39 | should this call iter_move? | |
libcxx/include/__algorithm/sort.h | ||
44 | I am slightly concerned about the reference member because if we are going to copy the comparator around we have risk of having dangling reference. but I guess the trick is that we never copy the comparators? | |
54 | > > in the end to support c++03? | |
71 | where does the int come from? How can a policy be an int? | |
libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp | ||
14 | not all algorithms support move only types. and I think it is sufficient to use Proxy<MoveOnly> for the test. | |
28–45 | we don't need to test all 100 algorithms as most algorithms won't swap elements. I do agree that this test covers 80%+ problems. I just feel more confident if we actually checking that the result is correct. I think the "sorting" algorithms are worth checking the result | |
152 | hmm. that is strange. std::array a{1,2,3}; ProxyRange inputs{a}; and output is std::array<Proxy<int>, 3> out; iter_value_t<_OutputIterator> should be Proxy<int> and iter_reference_t<_InputIterator> should be Proxy<int&> | |
libcxx/test/support/test_iterators.h | ||
881–882 | hmm. I suspect the tests are relying on the implicit conversion from T to Proxy<T>. (my original assignment explicitly check if rhs is Proxy and your new one allows implicit conversions. The class wasn't designed for Proxy<T> = T but if that makes the test easier it would be fine. I will take a look later and it shouldn't block your patch. |
Fix the CI, address feedback.
libcxx/include/__algorithm/nth_element.h | ||
---|---|---|
76 | Yes, it is verified by the robust... test. _Compare is either a reference type, or, in debug mode, a small wrapper class that contains a reference member to the actual comparator (this is why _Compare is used explicitly to instantiate those functions -- deduction would deduce a non-reference type). Thus in either case, what's being passed around is a reference. | |
libcxx/include/__algorithm/pop_heap.h | ||
39 | Thanks for catching this, I missed it somehow. | |
libcxx/include/__algorithm/sort.h | ||
44 | It's a bit complicated, but there is always a value at the top of the call stack (in std::sort or ranges::sort), and all the functions down the call stack pass around a reference to the original value. _WrapAlgPolicy follows the same approach. | |
71 | The expression is meaningless -- I just need to refer to _Wrapped::_AlgPolicy in a SFINAE context so that this specialization is preferred when _Wrapped is a _WrapAlgPolicy. | |
libcxx/test/support/test_iterators.h | ||
881–882 | Thanks! |
libcxx/include/__algorithm/sort.h | ||
---|---|---|
34 | I would also mention the ABI stability in the comments | |
71 | I was a bit confused and thought you are trying to distinguish between classic policy and range policy. Now I understand that it is for both and distinguish them from the unwrapped ones. Why can't you just do template <class... _Ts> struct _UnwrapAlgPolicy<_WrapAlgPolicy <Ts...>, void>{ // code }; (we can also remove the second template argument void from here and the primary template) |
libcxx/test/std/algorithms/ranges_robust_against_proxy_iterators.compile.pass.cpp | ||
---|---|---|
152 | _OutputIterator would have to be a nothrow-forward-iterator, so IIUC its value_type cannot be Proxy. |
Thanks for the patch! I have a few comments that would ideally require a post-commit change, but apart from that everything LGTM.
libcxx/include/__algorithm/iterator_operations.h | ||
---|---|---|
64 | Let's not fix this immediately, but we should use __advance, __next, etc. for all these names. I understand the reasoning that those are "reserved names" because they are used elsewhere in the library, however this is playing with fire. For example, std::next was introduced in C++11. Hence, users could technically #define next(...) in C++03 and use <algorithm> in a valid program. Right now, this would break since we are using the next name. To avoid having to even think about this, I would just throw __ in front of anything that's internal, period. It's less pretty, but it's simpler and more robust. | |
libcxx/include/__algorithm/ranges_fill.h | ||
33 | Can we add a test specifically for this bug? | |
libcxx/include/__algorithm/sort.h | ||
123 | _LIBCPP_HIDE_FROM_ABI? | |
545 | Here is another approach that would maintain ABI compatibility, but in which nobody writing new code would actually use the explicitly instantiated functions in the dylib. I think it would be reasonable to investigate making this simplification after the LLVM 15 release is done (we should compile significant code bases and try to see if we notice regressions, and if not, we could do this *and* include the __sort instantiations only in the stable ABI). // // In this header: // template <class _AlgPolicy, class _RandomAccessIterator, class _Comp> inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_AFTER_CXX17 void __sort_impl(_RandomAccessIterator __first, _RandomAccessIterator __last, _Comp& __comp) { std::__debug_randomize_range<_AlgPolicy>(__first, __last); using _Comp_ref = typename __comp_ref_type<_Comp>::type; if (__libcpp_is_constant_evaluated()) { std::__partial_sort<_AlgPolicy, _Comp_ref>(__first, __last, __last, _Comp_ref(__comp)); } else { typedef typename iterator_traits<_RandomAccessIterator>::difference_type difference_type; difference_type __depth_limit = 2 * __log2i(__last - __first); std::__introsort<_AlgPolicy, _Compare>(__first, __last, __comp, __depth_limit); } } // // In algorithm.cpp: // // Only used for STL classic algorithms. We keep this name for ABI compatibility because we have // been explicitly instantiating `__sort` functions in the shared library. template <class _Compare, class _RandomAccessIterator> void __sort(_RandomAccessIterator __first, _RandomAccessIterator __last, _Compare __comp) { std::__sort_impl<_ClassicAlgPolicy>(std::move(__first), std::move(__last), std::move(__comp)); } // All the explicit instantiations [...] |
Heads up: we see breakages that root-cause to this commit. I don't yet know how exactly it is related though. Investigating further.
Thanks for the heads-up. Can you post more information about the breakage (ideally the compiler error, assuming it is a compiler error) while you're investigating?
I'm still struggling to get an isolated reproducer, but now I think the crash is happening in v8 code around this line: https://source.chromium.org/chromium/chromium/src/+/master:v8/src/objects/objects.cc;l=6212;drc=cc136c19595fe0a1cde294d8df6cb1b61023012b (not sure about this code online being identical to what I'm looking at, but it seems quite similar at least in the function I pointed at).
This is the relevant part of the stack trace:
#0 0x0000555559d547bd in void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) () #1 0x0000555559d544b5 in void std::__u::__introsort<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, std::__u::iterator_traits<v8::internal::AtomicSlot>::difference_type) () #2 0x0000555559d363a3 in v8::internal::BaseNameDictionary<v8::internal::GlobalDictionary, v8::internal::GlobalDictionaryShape>::IterationIndices(v8::internal::Isolate*, v8::internal::Handle<v8::internal::GlobalDictionary>) () #3 0x0000555559bfce13 in v8::internal::Genesis::TransferNamedProperties(v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::JSObject>) ()
And the crashing process is a nodejs binary.
We're also hitting this in Chromium. https://bugs.chromium.org/p/chromium/issues/detail?id=1346012 has instructions for reproing in chromium, and a bigger stack. We'll try to reduce this more.
https://bugs.chromium.org/p/chromium/issues/detail?id=1346012 contains:
# Fatal error in ../../v8/src/objects/dictionary-inl.h, line 222 # Debug check failed: KeyAt(cage_base, entry).IsPropertyCell(cage_base).
What could cause this debug check to fail? A double-move? Something else? That information would be extremely useful in figuring out whether the issue is in libc++ or not.
We think we understand the issue, and this should fix it: https://reviews.llvm.org/D130197
The larger point I was making is that sometimes, we make changes in libc++ and that exposes problems in user code. If that were the case and Chromium's code was incorrect, there's nothing we could do on our end. I don't think this applies here, however -- I think we found the issue in libc++.
The stack trace I posted earlier contains introsort and insertion_sort_3, which were affected by this patch. @eaeltsin is working on reproducing this with d8, which should be more compact than using nodejs.
@eaeltsin Can you try applying D130197 to see if that fixes the issue? We'll commit it nonetheless because it fixes issues with proxy iterators, but it would be good to confirm that it indeed fixes your issue so we can stop looking.
Unfortunately not, this doesn't fix the issue I'm seeing.
I'm not yet ready with the reproducer.
What I'm doing is building d8 with disabled pointers compression and optimizations, and running it - it fails on init, no js test needed.
Sample stack trace - looks like the code reads from a bad address :
* thread #1, name = 'd8', stop reason = signal SIGSEGV: invalid address (fault address: 0xf) * frame #0: 0x00005555579e68b9 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::TaggedField<v8::internal::Smi, 16>::load(cage_base=<unavailable>, host=HeapObject @ rdi, offset=0) at tagged-field-inl.h:67:20 frame #1: 0x00005555579e68b9 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::PropertyCell::property_details_raw(this=<unavailable>, cage_base=<unavailable>) const at property-cell-inl.h:25:1 frame #2: 0x00005555579e68b9 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::PropertyCell::property_details_raw(this=<unavailable>) const at property-cell-inl.h:25:1 frame #3: 0x00005555579e68b9 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::PropertyCell::property_details(this=<unavailable>) const at property-cell-inl.h:32:36 frame #4: 0x00005555579e68b9 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::PropertyDetails v8::internal::GlobalDictionaryShape::DetailsAt<v8::internal::GlobalDictionary>(dict=GlobalDictionary @ rbx, entry=<unavailable>) at dictionary-inl.h:338:29 frame #5: 0x00005555579e68a8 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::Dictionary<v8::internal::GlobalDictionary, v8::internal::GlobalDictionaryShape>::DetailsAt(this=0x00007fffffffcf10, entry=<unavailable>) at dictionary-inl.h:71:10 frame #6: 0x00005555579e68a8 d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(v8::internal::AtomicSlot, v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&) [inlined] v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>::operator(this=0x00007fffffffcf10, a=<unavailable>, b=<unavailable>)(unsigned long, unsigned long) at dictionary.h:376:14 frame #7: 0x00005555579e688f d8`void std::__u::__insertion_sort_3<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(__first=AtomicSlot @ r15, __last=AtomicSlot @ r14, __comp=0x00007fffffffcf10) at sort.h:320:9 frame #8: 0x00005555579e67e6 d8`void std::__u::__introsort<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(__first=AtomicSlot @ r15, __last=AtomicSlot @ 0x00007fffffffceb8, __comp=0x00007fffffffcf10, __depth=9) at sort.h:453:7 [artificial] frame #9: 0x00005555579e6588 d8`void std::__u::__introsort<std::__u::_ClassicAlgPolicy, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(__first=AtomicSlot @ r15, __last=AtomicSlot @ 0x00007fffffffceb8, __comp=0x00007fffffffcf10, __depth=9) at sort.h:590:7 frame #10: 0x0000555557a0bc8b d8`v8::internal::BaseNameDictionary<v8::internal::GlobalDictionary, v8::internal::GlobalDictionaryShape>::IterationIndices(v8::internal::Isolate*, v8::internal::Handle<v8::internal::GlobalDictionary>) [inlined] void std::__u::__sort<v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>&, v8::internal::AtomicSlot>(__first=<unavailable>, __last=<unavailable>, __wrapped_comp=0x00007fffffffcf10) at sort.h:627:3 frame #11: 0x0000555557a0bc5d d8`v8::internal::BaseNameDictionary<v8::internal::GlobalDictionary, v8::internal::GlobalDictionaryShape>::IterationIndices(v8::internal::Isolate*, v8::internal::Handle<v8::internal::GlobalDictionary>) at sort.h:687:5 frame #12: 0x0000555557a0bc5d d8`v8::internal::BaseNameDictionary<v8::internal::GlobalDictionary, v8::internal::GlobalDictionaryShape>::IterationIndices(v8::internal::Isolate*, v8::internal::Handle<v8::internal::GlobalDictionary>) [inlined] void std::__u::sort[abi:v15000]<v8::internal::AtomicSlot, v8::internal::EnumIndexComparator<v8::internal::GlobalDictionary>>(__first=<unavailable>, __last=<unavailable>, __comp=EnumIndexComparator<v8::internal::GlobalDictionary> @ 0x00007fffffffcf10) at sort.h:694:3 frame #13: 0x0000555557a0bc5d d8`v8::internal::BaseNameDictionary<v8::internal::GlobalDictionary, v8::internal::GlobalDictionaryShape>::IterationIndices(isolate=0x0000068f3fc40000, dictionary=<unavailable>) at objects.cc:6291:5 frame #14: 0x000055555783465c d8`v8::internal::Genesis::TransferNamedProperties(this=0x00007fffffffd140, from=Handle<v8::internal::JSObject> @ 0x00007fffffffcf60, to=Handle<v8::internal::JSObject> @ 0x00007fffffffcfe8) at bootstrapper.cc:6240:9 frame #15: 0x0000555557833ff9 d8`v8::internal::Genesis::HookUpGlobalObject(this=0x00007fffffffd140, global_object=<unavailable>) at bootstrapper.cc:1434:3 ...
Can you please try again with https://reviews.llvm.org/D130212? It looks like the first patch didn't fully fix the issue.
Excellent, thanks a lot for checking and reporting back! The credit for the second patch goes to @huixie90.
And the final confirmation: our full test run didn't find any more issues related to this commit. Thanks again for the prompt fix!
Sad news: there's another problem caused by this commit (and not fixed by https://reviews.llvm.org/D130212), which wasn't detected initially. The problem is some sort of a data corruption happening in a version of MySQL. I think, the version we see this in should be close to this one: https://fossies.org/dox/mysql-8.0.29/varlen__sort_8h_source.html. A standalone repro is being worked on, but maybe the original code gives enough information to work with.
Thank you for the heads-up. Unfortunately, the source doesn't really give us enough information to act upon. While the repro is in the works, do you happen to have any additional information about the issue, like a stack trace?
This code seems to be calling to sort with some custom iterators, which leads to the corruption of the data being sorted: https://github.com/mysql/mysql-server/blob/8.0/sql/handler.cc#L6779
I think, this version is closer to where I see the problem: https://github.com/mysql/mysql-server/blob/a1bf3cfcecc85d3ae20828d9f78a710f3277ab40/sql/handler.cc#L6704
We would need a repro case. varlen_element's move semantics look very suspicious, but it's hard to say whether this can actually cause the issue you're seeing.
Some more information about the problem. The stack trace looks like follows. The problem manifests as an ASan out of memory error because the move constructor in varlen_sort.h:59 is getting incorrect data and it's trying to allocate an absurdly big amount of memory.
==7713==ERROR: AddressSanitizer: out of memory: allocator is trying to allocate 0xc5600000e4e bytes #0 0x55deed3caedd in operator new[](unsigned long) third_party/llvm/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:98:3 #1 0x55deed6e733c in varlen_element MYSQLSRCHOME/include/varlen_sort.h:59:17 #2 0x55deed6e733c in bool std::__u::__insertion_sort_incomplete<void varlen_sort<unsigned char, DsMrr_impl::dsmrr_fill_buffer()::$_3>(unsigned char*, unsigned char*, unsigned long, DsMrr_impl::dsmrr_fill_buffer()::$_3)::'lambda'(varlen_element const&, varlen_element const&)&, varlen_iterator>(DsMrr_impl::dsmrr_fill_buffer()::$_3, DsMrr_impl::dsmrr_fill_buffer()::$_3, unsigned char) third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__algorithm/sort.h:373:18 #3 0x55deed6e4fcd in void std::__u::__introsort<std::__u::_ClassicAlgPolicy, void varlen_sort<unsigned char, DsMrr_impl::dsmrr_fill_buffer()::$_3>(unsigned char*, unsigned char*, unsigned long, DsMrr_impl::dsmrr_fill_buffer()::$_3)::'lambda'(varlen_element const&, varlen_element const&)&, varlen_iterator>(varlen_iterator, varlen_iterator, DsMrr_impl::dsmrr_fill_buffer()::$_3, std::__u::iterator_traits<varlen_iterator>::difference_type) third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__algorithm/sort.h:575:19 #4 0x55deed6db70b in __sort<(lambda at MYSQLSRCHOME/include/varlen_sort.h:199:13) &, varlen_iterator> third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__algorithm/sort.h:627:3 #5 0x55deed6db70b in __sort_impl<std::__u::_ClassicAlgPolicy, varlen_iterator, (lambda at MYSQLSRCHOME/include/varlen_sort.h:199:13)> third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__algorithm/sort.h:687:5 #6 0x55deed6db70b in sort<varlen_iterator, (lambda at MYSQLSRCHOME/include/varlen_sort.h:199:13)> third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__algorithm/sort.h:694:3 #7 0x55deed6db70b in varlen_sort<unsigned char, (lambda at MYSQLSRCHOME/sql/handler.cc:6738:7)> MYSQLSRCHOME/include/varlen_sort.h:197:3 #8 0x55deed6db70b in DsMrr_impl::dsmrr_fill_buffer() MYSQLSRCHOME/sql/handler.cc:6736:3 #9 0x55deed6da9c8 in DsMrr_impl::dsmrr_init(RANGE_SEQ_IF*, void*, unsigned int, unsigned int, HANDLER_BUFFER*) MYSQLSRCHOME/sql/handler.cc:6617:17 #10 0x55deedf21bb6 in MultiRangeRowIterator::Init() MYSQLSRCHOME/sql/bka_iterator.cc:368:18 #11 0x55deedf21100 in BKAIterator::ReadOuterRows() MYSQLSRCHOME/sql/bka_iterator.cc:228:22 #12 0x55deedf2174a in BKAIterator::Read() MYSQLSRCHOME/sql/bka_iterator.cc:274:19 #13 0x55deede3ed6b in Query_expression::ExecuteIteratorQuery(THD*) MYSQLSRCHOME/sql/sql_union.cc:1231:36 #14 0x55deede3f4e4 in Query_expression::execute(THD*) MYSQLSRCHOME/sql/sql_union.cc:1284:10 #15 0x55deedd5a4f6 in Sql_cmd_dml::execute_inner(THD*) MYSQLSRCHOME/sql/sql_select.cc:776:15 #16 0x55deedd59381 in Sql_cmd_dml::execute(THD*) MYSQLSRCHOME/sql/sql_select.cc:574:7 #17 0x55deedcb2863 in mysql_execute_command(THD*, bool) MYSQLSRCHOME/sql/sql_parse.cc:4438:29 #18 0x55deedcae283 in dispatch_sql_command(THD*, Parser_state*) MYSQLSRCHOME/sql/sql_parse.cc:5037:19 #19 0x55deedcaac9c in dispatch_command(THD*, COM_DATA const*, enum_server_command) MYSQLSRCHOME/sql/sql_parse.cc:1863:7 #20 0x55deedcacd8b in do_command(THD*) MYSQLSRCHOME/sql/sql_parse.cc:1342:18 #21 0x55deed3fe4f3 in handle_connection(void*) MYSQLSRCHOME/sql/conn_handler/connection_handler_per_thread.cc:310:13 #22 0x55def2434649 in pfs_spawn_thread(void*) MYSQLSRCHOME/storage/perfschema/pfs.cc:2905:3 #23 0x7f6af0285b54 in start_thread (/usr/grte/v5/lib64/libpthread.so.0+0xbb54) (BuildId: 64752de50ebd1a108f4b3f8d0d7e1a13)
sort.h:373 does
value_type __t(_Ops::__iter_move(__i));
When reaching this line the value of __i is correct. However, after the call to __iter_move, the varlen_element move constructor (source) gets incorrect values.
I've managed to get the test to pass by replacing occurrences of _Ops::__iter_move(iter) back to _VSTD::move(*iter) and I don't see anything particularly wrong with the affected mysql code so I suspect there's some remaining problem in __iter_move that hasn't been solved with the follow-up patches so far.
Hi,
Thank you very much for posting the analysis. I think the problem is this line
https://fossies.org/dox/mysql-8.0.29/varlen__sort_8h_source.html#l00187
It sets the iteartor_traits to be iterator_traits<varlen_element *>. The iterator_traits<varlen_element *>::reference is varlen_element& (note the reference here). However, its operator* does not return varlen_element&, but a prvalue varlen_element
https://fossies.org/dox/mysql-8.0.29/varlen__sort_8h_source.html#l00090
So this is technically UB because its iteartor_traits lied.
Note that we relied on the iterator_traits::reference to be correct to dispatch the iter_move
https://github.com/llvm/llvm-project/blob/78015047b22dd64f7c647ab7ce04d42e761e7b8f/libcxx/include/__algorithm/iterator_operations.h#L81
We could in theory dispatch on the decltype(*it), but this is somehow challenging in C++03 mode where decltype does not exist.
So in conclusion, it is mysql code that has UB. But it might still be better for us to keep the UB working.
@ldionne @var-const ,
what do you think?
Here is the repro we're discussing in the most recent comments: https://godbolt.org/z/xPhY4eE7a
It's quite long, so as an alternative to the godbolt link I'll post as an attachment instead of pasting inline:
@alexfh @jgorbe @rupprecht Thank you for the additional details! I believe the analysis by @huixie90 is spot on and have prepared a patch based on it: https://reviews.llvm.org/D130538
Can you try out the patch to see if it resolves the issue you're seeing? It does fix the repro case for me.
Thanks! Unfortunately, it just changes the failure that we now see from a crash to a weird runtime failure; incorrectly sorted results, I assume.
I have an alternative change to fix the UB and return varlen_element & instead of varlen_element (isn't that the eventual fix we should do anyway?), and initial tests seem to be passing now. I'm not sure what D130538 is missing.
@rupprecht Thank you for trying the fix. You can't return varlen_element& because the function creates a temporary varlen_element on the fly. The fix is to remove the template specialization of std::iterator_traits<varlen_iterator> , which incorrectly defines the traits. instead , define some member typedef:
using value_type = varlen_element; using reference = varlen_element; using difference_type = ptrdiff_t; using iterator_category = std::random_access_iterator_tag; using pointer = void; // this one is also wrong because of wrong operator->
note (pointer type and operator-> also seem to be wrong but we don't use -> in our algorithms.
https://godbolt.org/z/T8cxffb3o
Also the fact that value_type being the same as reference is a bit suspicious as most proxy iterators have different types. but it might be fine since the code worked before.
Could you please send us a reproducer where the runtime behaviour is wrong? It might be another UB somewhere else which exposed by this patch. We don't want to blindly break user code even if it is UB so it would be good to see a reproducer.
(Please could you send the expected output of the reproducer as well because I don't fully understand what the class is supposed to do atm)
Thanks everyone for chiming in on this issue. I am speaking to @var-const right now who summarized the issues to me. It looks like there is some UB in MySQL right now because their funky iterator type is lying about iterator_traits<It>::reference, and that's really the crux of the problem IMO. That needs to be fixed, and it's not something that libc++ can accommodate by making the code work.
However, it also seems like it's something that we can easily catch at compile-time instead, hence my suggestion to add a static_assert in D130538. It will still break Chromium via its MySQL dependency, however the error will clearly point out how it is broken and how to fix it.
@rupprecht If you are able to get us a reproduction of the failure on top of D130538 (without the static_assert), we'd be interested in taking a look to understand whether there's something more subtle at play. However, our current understanding is that D130538 + static_assert will work in all cases (i.e. either it will work as intended, or it will issue a compile-time diagnostic). If you see additional issues after we move forward with D130538, please let us know via the usual means. This has been really useful in finding subtle problems early and we definitely want to address such issues with the highest priority.
Emphatic +1 to this, fixing the UB is better than hacking up libc++, and a static assert would be amazing here. I'd much rather handle a clear build breakage than spend hours/days hunting down the source of a weird runtime error.
@rupprecht If you are able to get us a reproduction of the failure on top of D130538 (without the static_assert), we'd be interested in taking a look to understand whether there's something more subtle at play. However, our current understanding is that D130538 + static_assert will work in all cases (i.e. either it will work as intended, or it will issue a compile-time diagnostic). If you see additional issues after we move forward with D130538, please let us know via the usual means. This has been really useful in finding subtle problems early and we definitely want to address such issues with the highest priority.
I am attempting to do this, although it's a little more challenging now that the failure is moved later -- it was relatively easy to provide the previous repro because the test runner crashed at the moment something went wrong; producing wrong results without crashing requires more specific knowledge of mysql internals.
Sorry about the delay. I have a slightly evil reproducer, modified from my original one, that shows the incorrect sort result if we take the static_assert out of D130538 so we can keep using a buggy iterator: https://godbolt.org/z/xfhseoaW7 (also attached)
pre-sort varlen_element[4] = 2436 varlen_element[5] = 2438 varlen_element[6] = 2439 varlen_element[7] = 2437 varlen_element[8] = 2440 post-sort varlen_element[4] = 2436 varlen_element[5] = 2438 <-- should be a 2437 here varlen_element[6] = 2439 varlen_element[7] = 2439 <-- oops, 2439 is here twice varlen_element[8] = 2440
I don't know if there's anything interesting you want to look at here -- we should expect nothing less than incorrect results by writing a UB iterator. The iterator is fixed now, and the static_cast in D130538 catches the buggy version of the iterator.
In fact it looks like the assert has found a few more iterators that are either getting away with UB or have bad test coverage, so we're definitely glad the assert is there now.
@rupprecht Thanks a lot for the reproducer! The issue here is the varlen_element's heuristics for creating an independent copy of the underlying data. Currently, it's done in the move constructor, on the assumption that insertion sort inside std::sort would necessarily call the move constructor. However, this assumption is no longer true.
Previously, the temporary element would be created like this (simplified):
value_type temp = std::move(*iter);
*iter returns a prvalue varlen_element, which std::move then casts to an rvalue reference, which results in the move constructor being invoked.
After the rewrite, the call to std::move(*iter) is replaced by an internal function __iter_move which emulates C++20's iter_move. __iter_move notices that *iter returns a temporary and thus also returns a value; attempting to cast it to an rvalue reference would result in a dangling reference:
// simplified template <class _Iter, enable_if_t<IterDerefReturnsPrvalue<_Iter>> typename _Iter::value_type __iter_move(_Iter& __iter) { return *__iter; }
Due to mandatory copy elision, the move constructor is no longer invoked:
value_type temp = __iter_move(iter);
Instead, the result of calling *iter is stored right in temp, with no copying or moving involved.
Since there was no call to the move constructor, now two varlen_elements are shallow copies of each other, sharing a pointer to the same underlying data. Thus, the value of the temporary changes when the value of the original iterator from which it was created changes; that's how the repro case ends up with two 2439 elements.
clang-format not found in user’s local PATH; not linting file.