This is an archive of the discontinued LLVM Phabricator instance.

Reapply "[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`"
ClosedPublic

Authored by var-const on Dec 2 2022, 3:26 PM.

Details

Summary

This reverts commit a6e1080b87db8fbe0e1afadd96af5a3c0bd5e279.

Fix the conditions when the memmove optimization can be applied and refactor them out into a reusable type trait, fix and significantly expand the tests.

Diff Detail

Event Timeline

var-const created this revision.Dec 2 2022, 3:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 2 2022, 3:26 PM
var-const updated this revision to Diff 479762.Dec 2 2022, 3:33 PM

Apply the fix.

var-const edited the summary of this revision. (Show Details)Dec 2 2022, 3:33 PM
var-const updated this revision to Diff 480328.Dec 5 2022, 9:19 PM

Expand testing.

var-const published this revision for review.Dec 5 2022, 9:20 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne added inline comments.
libcxx/include/__algorithm/copy_move_common.h
38

I would make this a proper boolean trait instead. It may make it easier to reuse elsewhere outside of SFINAE contexts.

39–45

After our discussion about is_trivially_copyable and is_trivially_<operation>, I think I would change this to:

(is_same<_In, _Out>::value && is_trivially_copyable<_In>::value) ||
(
  sizeof(_In) == sizeof(_Out) &&
  is_integral<_In>::value && is_integral<_Out>::value &&
  !(is_same<remove-cv(_In), bool>::value || is_same<remove-cv(_Out), bool>::value)
)

I read this as "first, the obviously correct case", and then "special casing for integral types that we know have the same value representation".

And then we could rename this to __type_traits/is_always_bitcastable? It's ugly but I think it conveys the idea of what we're trying to do here, which is essentially tell types for which bit_cast<To>(From) is *always* valid and never UB. This trait should also be documented (via comment).

libcxx/test/libcxx/algorithms/alg.modifying.operations/copy_move_nontrivial.pass.cpp
127

Why not this? That will make the testing below more complicated for bool but I think it makes sense.

ldionne added inline comments.Dec 6 2022, 12:48 PM
libcxx/test/libcxx/algorithms/alg.modifying.operations/copy_move_nontrivial.pass.cpp
152–153

I would do something like this?

static constexpr T array[N] = {...};
return array + i;

Another option would be to use a file-scope variable template.

philnik added a subscriber: philnik.Dec 6 2022, 3:20 PM
philnik added inline comments.
libcxx/include/__algorithm/copy_move_common.h
52
var-const updated this revision to Diff 480804.Dec 7 2022, 1:12 AM
var-const marked 4 inline comments as done.

Address feedback, further expand testing.

libcxx/include/__algorithm/copy_move_common.h
52

Hmm, I couldn't immediately see how that issue is related, could you please double-check the link? If the link is correct, then I'm missing some context, sorry.

var-const added inline comments.Dec 7 2022, 1:27 AM
libcxx/include/__algorithm/copy_move_common.h
42

I think the volatile condition is only relevant for memmove and friends -- bit_cast is happy to convert between volatile and non-volatile, and __are_always_bitcastable should probably follow suit.

libcxx/include/__type_traits/are_always_bitcastable.h
25 ↗(On Diff #480804)

This is my attempt to formalize the trait. Some things that I think are worth calling out:

  • the way this is formulated is symmetrical -- not only T has to be bit-castable to U, but also U to T. Without this requirement, I wasn't sure how to express that the object size needs to be the same (if U is larger, then U can represent all values of T, so having this as just a one-way relationship didn't seem to work);
  • related to the previous point, there is no precedent (that I found) for a type trait with a "plural" name (i.e., prefixed with are and not is). I chose the plural prefix to stress that this is a symmetric relationship between types;
  • I chose to discard cv-qualifiers (IIUC, they don't affect bit_cast). memmove and friends don't work with volatile types, so this trait isn't fully sufficient to check whether memmove can be called;
  • please double-check my reasoning. Part of the reason I tried to go through every possible case is to make it easier to notice if any of my assumptions are wrong.
33 ↗(On Diff #480804)

Do we still need this in new code?

33 ↗(On Diff #480804)

Do we need to inherit from integral_constant for internal traits?

45–47 ↗(On Diff #480804)

This is the closest I could get to a "formal" definition of what this trait is trying to do. Please let me know what you think.

67 ↗(On Diff #480804)

From testing, this currently isn't true -- somehow e.g. int[1] is considered bit-castable to itself.

68 ↗(On Diff #480804)

(or so I think)

libcxx/test/libcxx/algorithms/alg.modifying.operations/copy_move_trivial.pass.cpp
238

Please let me know if you think this test is overkill (or alternatively, if you'd like to test more types -- I'm operating on the assumption that if it works for one signed/unsigned pair, it works for all of them).

libcxx/test/libcxx/type_traits/are_always_bitcastable.compile.pass.cpp
13 ↗(On Diff #480804)

Please let me know if you think these tests are overkill.

16 ↗(On Diff #480804)

I think it's better to avoid including this file from <type_traits> -- users shouldn't pay the extra compilation penalty for our internal stuff (even if that penalty is very small).

205 ↗(On Diff #480804)

This is the behavior that we want, right?

211 ↗(On Diff #480804)

I found it surprising, but arrays are currently considered bit-convertible. What do you think?

var-const updated this revision to Diff 480812.Dec 7 2022, 1:27 AM

Fix the CI.

philnik added inline comments.Dec 7 2022, 2:43 AM
libcxx/include/__algorithm/copy_move_common.h
52

Not sure why the LLVM redirect is broken on this, here is the github link I meant: https://github.com/llvm/llvm-project/issues/28901

ldionne accepted this revision.Dec 8 2022, 12:09 PM
ldionne added a subscriber: jfb.

I'll have another look but this LGTM w/ comments applied. Thanks a lot for all the work on this patch, the rabbit hole is much deeper than I initially thought.

libcxx/include/__algorithm/copy_move_common.h
42

Do we have a test with an _In and/or an _Out that is volatile?

libcxx/include/__type_traits/are_always_bitcastable.h
33 ↗(On Diff #480804)

_LIBCPP_TEMPLATE_VIS isn't needed, not because it's new code, but because this gives default visibility to a struct that doesn't require any visibility. There's nothing to give visibility to.

Do we need to inherit from integral_constant for internal traits?

No.

45–47 ↗(On Diff #480804)

I think I like this.

1 ↗(On Diff #480812)

@jfb Just tagging you here in case you're curious to take a look. Your pair of eyes on this file would certainly be a bonus.

27 ↗(On Diff #480812)

I would also add a mention of http://eel.is/c++draft/basic.types.general#3 which blesses what we're doing here with trivially copyable types.

59–61 ↗(On Diff #480812)
libcxx/test/libcxx/algorithms/alg.modifying.operations/copy_move_trivial.pass.cpp
16–17

If it's just a warning, you could use // ADDITIONAL_COMPILE_FLAGS(gcc): -Wno-whatever.

libcxx/test/libcxx/type_traits/are_always_bitcastable.compile.pass.cpp
62–65 ↗(On Diff #480812)

Then you get the same benefit for eliminating repetition, but IMO this is less arbitrary. And for the "same types" case, we're just instantiating the same function twice (well once really because compiler memoization).

67–70 ↗(On Diff #480812)

IMO this doesn't pull its weight.

16 ↗(On Diff #480804)

I would argue that's a "libc++ policy" decision and it's contrary to what we had agreed on when we started granularizing headers. IIRC, we had decided that a header like <foo> should include everything inside <__foo/*.h>, regardless of whether it's public API or not. That being said, that simply doesn't work -- I've run into cases where including everything inside <__foo/*.h> leads to circular dependencies, so this is something we'll likely have to revisit. The status quo is also that we are not systematically including all of <__foo/*.h>. So I'm fine with what you've done.

205 ↗(On Diff #480804)

I think so, because references are not objects.

211 ↗(On Diff #480804)

I think that for the purpose of this trait, we should probably also enforce that the two types are std::is_trivially_assignable_v<To, From>, and that would handle the array case. WDYT?

Perhaps this means that the name and purpose of the trait must change a tiny bit, maybe something like __copy_assignment_is_bitcast? Basically anything that will hint at the fact that this does a bit more than checking whether the two types are bit-castable, since we're also taking into account other unrelated language rules like t = u being valid. We should make sure to have a test that would catch it if our std::copy suddenly started to accept things that are not assignable to each other just because we are using std::memcpy under the hood.

This revision is now accepted and ready to land.Dec 8 2022, 12:09 PM
var-const updated this revision to Diff 482743.Dec 14 2022, 1:12 AM
var-const marked 11 inline comments as done.

Address feedback.

var-const updated this revision to Diff 482744.Dec 14 2022, 1:13 AM

Link to the GitHub issue about volatile pointers.

The title of the commit could be changed to "Reapply" instead of "Revert Revert" -- that's always a bit confusing IMO.

ldionne added inline comments.Dec 14 2022, 11:08 AM
libcxx/include/__type_traits/is_assignment_equivalent_to_bitcast.h
37–80 ↗(On Diff #482744)
// This is the trait for the core type property, separate from syntactic requirements
template <class _From, class _To> // those are cv-unqualified types always
struct __is_always_bitcastable {
  static constexpr bool value = 
     // basic case, only one type, check that it's trivially copyable
     (is_same< _From, _To >::value && is_trivially_copyable<_From>::value) ||
     (
       sizeof(_From) == sizeof(_To) &&
       is_integral<_From>::value &&
       is_integral<_To>::value &&
       !is_same<_To, bool>::value
     );
};

// Then inside std::copy, do:
enable_if_t<
  is_assignable<_From, _To>::value && __is_always_bitcastable<_From, _To>::value
>
It copy(...) {
  // use memcpy, which we know is valid since __is_always_bitcastable is true
}

That should also make the tests for this trait less weird, since now it will be natural that array types are always bitcastable, per the definition of trivially-copyable.

libcxx/test/libcxx/type_traits/is_assignment_equivalent_to_bitcast.compile.pass.cpp
70–75 ↗(On Diff #482744)

Let's inline this into check below? It's only used once.

var-const retitled this revision from Revert "Revert "[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`"" to Reapply "[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`".Dec 17 2022, 1:55 PM
var-const edited the summary of this revision. (Show Details)
var-const updated this revision to Diff 483787.Dec 17 2022, 4:06 PM
var-const marked an inline comment as done.

Address feedback

var-const marked 2 inline comments as done.Dec 17 2022, 4:08 PM
var-const added inline comments.
libcxx/include/__algorithm/copy_move_common.h
42

Yes (the very last test case in copy_move_nontrivial.pass.cpp).

52

Yes, it fixes the compilation error when passing pointers to volatile. Thanks for finding the issue!

libcxx/include/__type_traits/are_always_bitcastable.h
27 ↗(On Diff #480812)

Slightly rephrased as

In other words, `From` and `To` have the same value representation and the set of values of `From` is a subset of the set of values of `To`.

(I think that having the same object representation can be implied, but the bit about having the set of values is important because e.g. a boolean and its corresponding integer type are said to have the same value representation)

I'm not sure about adding that particular section. It explicitly blesses the simple case (copying between objects of the same type), but this trait is primarily for conversions.

libcxx/include/__type_traits/is_assignment_equivalent_to_bitcast.h
37–80 ↗(On Diff #482744)

Done. Please check the new rationale for (rejecting) pointers. Arrays are accepted now like we discussed.

libcxx/test/libcxx/algorithms/alg.modifying.operations/copy_move_trivial.pass.cpp
16–17

I'm no longer getting that when running the buildbot locally. Perhaps the warning doesn't cover the case when the ambiguating function is a template. I'll wait for CI to be sure.

libcxx/test/libcxx/type_traits/are_always_bitcastable.compile.pass.cpp
16 ↗(On Diff #480804)

Thanks for explaining the additional context!

211 ↗(On Diff #480804)

I went with __is_assignment_equivalent_to_bitcast which is quite a mouthful -- happy to improve the name.

We might split this into two traits (one that doesn't check for is_trivially_assignable but checks everything else, and the other is a conjunction of the first trait and is_trivially_assignable), but I'd wait until we have a use case for that (assuming it's a good idea in the first place).

var-const marked 2 inline comments as done.Dec 17 2022, 5:12 PM
var-const added inline comments.
libcxx/test/libcxx/algorithms/alg.modifying.operations/copy_move_trivial.pass.cpp
16–17

Hmm, the CI still gets the error, and apparently it's a bona fide error, not just a warning. Unfortunately, I had to mark GCC as unsupported again.

var-const updated this revision to Diff 483791.Dec 17 2022, 5:13 PM

Mark GCC as unsupported to fix the CI.

var-const added inline comments.Dec 17 2022, 5:42 PM
libcxx/include/valarray
4935

@ldionne I'm getting

$ "diff" "-w" "/llvm/libcxx/test/libcxx/transitive_includes/cxx2b.csv" "/llvm/build/generic-cxx2b/test/libcxx/Output/transitive_includes.sh.cpp.dir/t.tmp/transitive_includes.csv"
# command output:
*** /llvm/libcxx/test/libcxx/transitive_includes/cxx2b.csv
--- /llvm/build/generic-cxx2b/test/libcxx/Output/transitive_includes.sh.cpp.dir/t.tmp/transitive_includes.csv
***************
*** 722,728 ****
  valarraycmath
  valarraycstddef
  valarraycstdlib
- valarraycstring
  valarrayinitializer_list
  valarraylimits
  valarraynew
--- 722,727 ----

error: command failed with exit status: 1

What's the reason this is only defined in C++20 and below?

philnik added inline comments.Dec 18 2022, 4:52 AM
libcxx/include/valarray
4935

We've decided to remove all transitive includes from C++2b, since it's not considered stable. You can just remove the valarray cstring line from Cxx2b.csv.

var-const updated this revision to Diff 487630.Jan 9 2023, 6:06 PM

Fix CI and rebase.

Try to fix the GCC crashes on the CI.

var-const updated this revision to Diff 488844.Jan 12 2023, 7:07 PM

Fix the CI.

This revision was landed with ongoing or failed builds.Jan 13 2023, 4:57 PM
This revision was automatically updated to reflect the committed changes.

Heads-up that we're seeing crashes due to the recommit of this change.
msan reports "use-of-uninitialized-value" in multiple places, where input is clearly initialized.

Heads-up that we're seeing crashes due to the recommit of this change.
msan reports "use-of-uninitialized-value" in multiple places, where input is clearly initialized.

Thank you for the heads-up. Can you provide more details?

Heads-up that we're seeing crashes due to the recommit of this change.
msan reports "use-of-uninitialized-value" in multiple places, where input is clearly initialized.

Thank you for the heads-up. Can you provide more details?

This is going to be extremely difficult to back out since we've made changes on top of that. Please work with us to understand this issue and fix it forward ASAP. In particular, we will want to make sure it lands in LLVM 16 as early as possible.

Thanks!

Not a breakage report, but just a comment in case it wasn't expected/intended: it looks like this increases the conformance requirement for custom iterators passed to std::copy and friends. From a couple tests, it seems like gcc/libstdc++ already has stricter requirements than libc++, so the conformance requirement is being matched between libc++ and libstdc++ now. As an example:

struct MyIterator {
    using iterator_category = std::forward_iterator_tag;
    using value_type = float;
    // These two lines are now also needed
    // using pointer = float*;
    // using reference = const float&;
    using difference_type = std::ptrdiff_t;
    value_type operator*();
    MyIterator& operator++();
    friend bool operator==(const MyIterator&, const MyIterator&);
    friend bool operator!=(const MyIterator&, const MyIterator&);
};

void func() {
    std::vector<float> dest;
    std::copy(MyIterator(), MyIterator(), std::back_inserter(dest));
}

https://godbolt.org/z/jTdbT4xcP

It seems pretty clear the change here is correct, this is just an FYI of the impact & in case anyone else bumps into this.

alexfh added a subscriber: alexfh.Jan 26 2023, 7:33 PM

We've found the cause of the problems @asbirlea reported above: violation of aliasing rules. Something similar to this snippet: https://gcc.godbolt.org/z/3Y9GEMW4e

Thus, again, the change is correct, but it's worth mentioning in the release notes that more aggressive optimizations in std::copy and friends may expose a lot more UB around std::vector and not only.

@alexfh @asbirlea @rupprecht Thank you for calling this out. Added notes about potential breakages of existing code to the release notes in https://github.com/llvm/llvm-project/commit/abda3e523ade7377b03a25dc2b6192dbe855c567

libcxx/include/valarray