This is an archive of the discontinued LLVM Phabricator instance.

[libc++] <span>, like <string_view>, has no use for debug iterators.
ClosedPublic

Authored by Quuxplusone on Apr 21 2021, 3:43 PM.

Details

Summary
A span has no idea what container (if any) "owns" its iterators, nor
under what circumstances they might become invalidated.

This fixes the following libcxx tests under `-D_LIBCPP_DEBUG=1`:
- libcxx/test/std/containers/views/span.cons/deduct.pass.cpp
- libcxx/test/std/containers/views/span.iterators/begin.pass.cpp
- libcxx/test/std/containers/views/span.iterators/end.pass.cpp
- libcxx/test/std/containers/views/span.iterators/rbegin.pass.cpp
- libcxx/test/std/containers/views/span.iterators/rend.pass.cpp
- libcxx/test/std/containers/views/span.sub/first.pass.cpp
- libcxx/test/std/containers/views/span.sub/last.pass.cpp
- libcxx/test/std/containers/views/span.sub/subspan.pass.cpp

It does occur to me that we could probably come up with a strategy for string_view and span to do debug iterators, where you could form an association of a string_view object with a C object, and the string_view's iterators would be sort of "transitively" associated with that C; and this association would be formed whenever you (1) converted a string directly to a string_view or (2) constructed a string_view from a pair of C::iterator (for C=string or vector). However, this idea is clearly beyond the scope of this PR, and I'm not aware that anyone is asking for it to exist, either.

Diff Detail

Event Timeline

Quuxplusone requested review of this revision.Apr 21 2021, 3:43 PM
Quuxplusone created this revision.
Herald added a reviewer: Restricted Project. · View Herald TranscriptApr 21 2021, 3:43 PM
miscco added a subscriber: miscco.Apr 21 2021, 11:18 PM

Just to be sure, are there ABI concerns for span?

If somebody has some interfaces that take a span pointer (yeah its stupid but...) we might get into some trouble here

libcxx/include/span
203

@miscco asks "Are there ABI concerns for span?" Well, in order to break, they'd have to be passing a span<T>::iterator across an ABI boundary. (Just passing a const span<T>& would be fine.) And span is new in C++20, which we don't claim to implement yet. So I think now is the perfect time to fix it. :)

Anyway, we kinda don't have a choice, right? :) The existing code simply fails to compile in _LIBCPP_DEBUG=1 mode, and I think we need it to compile.

Poke buildkite. Ping @ldionne for approval, and @mclow.lists for adding debug iterators to <span> originally.

Due to the ABI implications I defer the approval to one of the full maintainers.

libcxx/include/span
203

We've been shipping <span> since libc++ 7.0 so it has been available for several years. (Not sure when we enabled the feature-test macro.) Maybe we should let the type depend on the _LIBCPP_ABI_VERSION version.

Sorry, been sitting on this patch for a while. I looked into the archives and found a discussion back in August 2019 between @EricWF, @mclow.lists and I about fixing that very issue, but we never actually did it. At that time I was okay with breaking the span ABI, but that was almost 3 years ago. The other thing that was in the balance at that time was being able to implement constexpr operations, but I think we solved that problem with _LIBCPP_CONSTEXPR_IF_NODEBUG.

I agree with you in principle and I'm not sure why we have debug iterators in span in the first place. On the other hand, this is an ABI break on a feature that we've been shipping. It's a widely used feature, too, so it'll impact users. Let's try to quantify how bad it would be. The issues I can spot are:

  1. If someone has span::iterator as part of a function signature, it'll be a linker error since the mangling will change (from __wrap_iter to T*).
  2. This changes the triviality of span::iterator, which is good for performance but also changes how it is handled by the calling convention. So if it gets returned (or passed), it'll go into registers instead of on the stack. That's a pretty major change that would only get detected at runtime if someone's returning those from a function.
  3. Otherwise, the layout of __wrap_iter<T*> and just T* are the same, so I don't foresee any issues with someone holding a span::iterator as a class member or something like that.

This is a tough call, really. To be conservative, I'd say "we can't do that", but I agree things are pretty badly broken if we don't do it.

It's a widely used feature, too, so it'll impact users.

Rather: it has the potential to impact users. I shouldn't make claims as if I knew for sure. We really don't have many metrics for making these sorts of calls.

Sorry, been sitting on this patch for a while. I looked into the archives and found a discussion back in August 2019 between @EricWF, @mclow.lists and I about fixing that very issue, but we never actually did it. At that time I was okay with breaking the span ABI, but that was almost 3 years ago. The other thing that was in the balance at that time was being able to implement constexpr operations, but I think we solved that problem with _LIBCPP_CONSTEXPR_IF_NODEBUG.

I agree with you in principle and I'm not sure why we have debug iterators in span in the first place. On the other hand, this is an ABI break on a feature that we've been shipping. It's a widely used feature, too, so it [has the potential to] impact users. Let's try to quantify how bad it would be.

Okay. But remember, through all of this, we know that nobody is using span::iterator in _LIBCPP_DEBUG mode, because that doesn't compile. The ABI breakage we're worrying about is ABI breakage specifically in the regular non-debug mode.

  1. If someone has span::iterator as part of a function signature, it'll be a linker error since the mangling will change (from __wrap_iter to T*).

Agreed. Would you prefer to make span use __wrap_iter conditionally? I.e.,

#if _LIBCPP_DEBUG == 2
    using iterator = pointer;
#else
    using iterator = __wrap_iter<pointer>;  // to maintain our ABI
#endif

(Again, we know it's safe to change the behavior in _LIBCPP_DEBUG=1 mode, because there is no behavior there today — it simply doesn't compile.)

  1. This changes the triviality of span::iterator, which is good for performance but also changes how it is handled by the calling convention.

No, __wrap_iter<T*> is already trivially copyable today (outside of debug mode).
So I don't think the calling convention would change at all. Example: https://godbolt.org/z/do8o4Ez6z

  1. Otherwise, the layout of __wrap_iter<T*> and just T* are the same, so I don't foresee any issues with someone holding a span::iterator as a class member or something like that.

Agreed.

@ldionne, I still think we should just break ABI — stylistically nobody should ever be passing span::iterator across an ABI boundary period, which means nobody's writing functions taking span::iterator, which means nobody cares about the mangling of those functions' names. (That is, we're not breaking the ABI of span; just of its iterators.) Also, better to break early (before claiming C++20 support) than late; if we don't break ABI now I think we're just kicking the tech-debt can down the road to where it will be even harder to break ABI in the future. However, I do think the solution I put in response to #1 would be OK if you really wanted to kick that can.

Agreed. Would you prefer to make span use __wrap_iter conditionally? I.e.,

#if _LIBCPP_DEBUG == 2
    using iterator = pointer;
#else
    using iterator = __wrap_iter<pointer>;  // to maintain our ABI
#endif

If we go this route I think we should consider to add a new ABI flag in our experimental ABI and use that, something like

#if _LIBCPP_DEBUG == 2 || defined(_LIBCPP_ABI_SPAN_POINTER_ITERATOR)
    using iterator = pointer;
#else
    using iterator = __wrap_iter<pointer>;  // to maintain our ABI
#endif
ldionne requested changes to this revision.Apr 28 2021, 2:47 PM

Okay. But remember, through all of this, we know that nobody is using span::iterator in _LIBCPP_DEBUG mode, because that doesn't compile. The ABI breakage we're worrying about is ABI breakage specifically in the regular non-debug mode.

Yes, I know, that's what I'm talking about too.

  1. If someone has span::iterator as part of a function signature, it'll be a linker error since the mangling will change (from __wrap_iter to T*).

Agreed. Would you prefer to make span use __wrap_iter conditionally? I.e.,

Yes, going down that route is definitely a possibility.

  1. This changes the triviality of span::iterator, which is good for performance but also changes how it is handled by the calling convention.

No, __wrap_iter<T*> is already trivially copyable today (outside of debug mode).
So I don't think the calling convention would change at all. Example: https://godbolt.org/z/do8o4Ez6z

Oh, I missed the #ifdef around the copy constructor. This is actually great news.

@ldionne, I still think we should just break ABI — stylistically nobody should ever be passing span::iterator across an ABI boundary period,

You are, very obviously, not a vendor :-). Libc++ has historically had a policy of zero tolerance for ABI breaks. This is serious. We can't break our users "because they should not have been writing code in some way", especially since that judgement is mostly subjective. Concretely, people write lots of code, and what you think might never happen actually does happen, and sometimes even for good reasons.

I've actually been trying to relax that policy one little step at a time, for example with std::pointer_safety. You'll say it's trivial and I agree, however it's still a step in the right direction. Now, the issue at hand with span::iterator is a lot more serious IMO, and we can't take it lightly (apologies if I'm mis-perceiving how you take the situation).

Also, better to break early (before claiming C++20 support) than late;

Where do we *not* claim C++20 support? This is an entirely theoretical argument. We support std::span the day we ship it in our headers without some sort of explicit unavailable attribute (or something along those lines). That very day, which was about 3 years ago, we decided that we'd support std::span in a stable way pretty much forever.

if we don't break ABI now I think we're just kicking the tech-debt can down the road to where it will be even harder to break ABI in the future. However, I do think the solution I put in response to #1 would be OK if you really wanted to kick that can.

I think @Mordante's solution is the better solution here. I suggest this:

  1. We conditionally make std::span::iterator be T* when (1) debug mode is enabled, or (2) an ABI macro is defined.
  2. That ABI macro is DISABLED by default, so that we're using __wrap_iter all the time by default (i.e. no ABI break).
  3. We then switch to enabling the ABI macro by default along with a release note that explains the ABI break. We add a note to contact the libcxx list if they ever turn that macro on, because otherwise that option will be removed at [some release].

I want to do (3) in a separate step so that we can test it on its own (e.g. build all of Apple/Google/others with the change). It will also make it easier to revert if needed, and to isolate the change in case it causes an issue. Then, if nobody says anything and there is no breakage in the wild, we can remove the macro altogether in a few releases, assuming nobody ever actually needed to define that macro to bring back the old behavior.

It's more complicated than just making the change, but it's the only reasonable way to proceed IMO. That way, we fix the issue in the short term, we have no tech debt in the long term, and we don't break users leaving them helpless.

Also, the only reason while I'm OK with making the change at all is because it'll be a linker error. If it were a runtime error (e.g. if the triviality had changed), I don't think we could ever go to step (3), and that would always stay a feature enabled only in the unstable ABI.

Are you OK with that direction? If so, please implement step (1) and (2) here, and we can have (3) be a separate review.

This revision now requires changes to proceed.Apr 28 2021, 2:47 PM

+1 for @ldionne's approach.

Added under _LIBCPP_ABI_SPAN_POINTER_ITERATORS.
(We're still doing pointers unconditionally when _LIBCPP_DEBUG==2, because otherwise things don't compile; and we always want things to be able to compile in debug mode. Right?)

oops, update again

ldionne accepted this revision.Apr 30 2021, 2:41 PM

LGTM with _LIBCPP_DEBUG_LEVEL instead.

libcxx/include/span
203

_LIBCPP_DEBUG is never 2, see libcxx/include/__config:893:

// _LIBCPP_DEBUG potential values:
//  - undefined: No assertions. This is the default.
//  - 0:         Basic assertions
//  - 1:         Basic assertions + iterator validity checks.
#if !defined(_LIBCPP_DEBUG)
# define _LIBCPP_DEBUG_LEVEL 0
#elif _LIBCPP_DEBUG == 0
# define _LIBCPP_DEBUG_LEVEL 1
#elif _LIBCPP_DEBUG == 1
# define _LIBCPP_DEBUG_LEVEL 2
#else
# error Supported values for _LIBCPP_DEBUG are 0 and 1
#endif

Yes, it is terribly confusing. It was even worse before I documented it (lol). I think you want to use _LIBCPP_DEBUG_LEVEL == 2 instead.

This revision is now accepted and ready to land.Apr 30 2021, 2:41 PM

Rebase; fix _LIBCPP_DEBUG_LEVEL issue; remove XFAILs from the now-tested tests. Let's see if the new buildbot step works!