This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Remove raw call to debug handler from __char_traits_length_checked
ClosedPublic

Authored by ldionne on Mar 8 2022, 10:11 AM.

Details

Summary

It seems to me that _LIBCPP_ASSERT is sufficient here -- although CI might
tell me why I'm wrong.

Diff Detail

Event Timeline

ldionne created this revision.Mar 8 2022, 10:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2022, 10:11 AM
ldionne requested review of this revision.Mar 8 2022, 10:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2022, 10:11 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Quuxplusone accepted this revision.Mar 8 2022, 10:42 AM
Quuxplusone added a subscriber: Quuxplusone.

LGTM if CI is green!

This helper is actually called in only one place, and it strikes me that it might be vastly more useful to name it

template <class _Tp>
_LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR
inline _Tp *__assert_not_null(_Tp *__p) _NOEXCEPT {
  _LIBCPP_ASSERT(__p != nullptr, "pointer must not be null");
  return __p;
}

as in

_LIBCPP_CONSTEXPR _LIBCPP_INLINE_VISIBILITY
basic_string_view(const _CharT* __s)
    : __data(_VSTD::__assert_not_null(__s)), __size(_Traits::size(__s)) {}

Also, if you don't generalize it, consider moving it closer to its only call-site, in <string_view>?

This revision is now accepted and ready to land.Mar 8 2022, 10:42 AM
ldionne updated this revision to Diff 413942.Mar 8 2022, 2:52 PM

Apply Arthur's excellent suggestion.

EricWF requested changes to this revision.Mar 8 2022, 4:45 PM
EricWF added a subscriber: EricWF.

Please don't change this. Google uses this hook to change the behavior of string_view internally. We're working to get rid of the patch, but this helper function was specifically put in place to allow for the patch.

This revision now requires changes to proceed.Mar 8 2022, 4:45 PM
ldionne accepted this revision as: Restricted Project.Mar 9 2022, 6:06 AM

Please don't change this. Google uses this hook to change the behavior of string_view internally. We're working to get rid of the patch, but this helper function was specifically put in place to allow for the patch.

I said something along those lines on Discord, but I'll repeat it here. It is unreasonable for upstream to halt progress based on downstream hacks that are invisible to upstream (remember the _VSTD:: change?). Libc++ is used by hundreds of organizations, we literally couldn't do much at all if we always waited for folks to remove their hacks before making a change. In practice, what you should do is simply revert this patch internally until you've managed to remove the workaround -- it's fairly customary for that to happen, at least here at Apple. I wouldn't push for this change if it were just a cleanup, however it is necessary as part of the larger effort to enable assertions and clean up the debug mode (see D121123 & friends).

Also, it is quite possible that the string_view hook you have is something that could be more generally useful -- is there any reason not to simply upstream it, or upstream a form of it that makes your downstream diff easier to maintain? I'm not looking to create trouble for you if that can be avoided.

I'll wait to hear back before shipping this.

Please don't change this. Google uses this hook to change the behavior of string_view internally. We're working to get rid of the patch, but this helper function was specifically put in place to allow for the patch.

@EricWF, sounds like Louis and I had two different interpretations of your request. :) Do you mean "Google wants the name and signature of __char_traits_length_unchecked<_Traits> to remain unchanged, please don't change it to __assert_not_null"? Or do you mean "Google wants the implementation of __char_traits_length_unchecked<_Traits> to remain unchanged, please don't change its body to use _LIBCPP_ASSERT"? I assumed you meant only the former; @ldionne's reply suggests that maybe you meant the latter. (IMVHO the latter is unreasonable, whereas the former is understandable but I agree Google should work with us to find a path forward and/or document what they need, because the current name-and-signature of __char_traits_length_unchecked is pretty weird.)

Please don't change this. Google uses this hook to change the behavior of string_view internally. We're working to get rid of the patch, but this helper function was specifically put in place to allow for the patch.

@EricWF, sounds like Louis and I had two different interpretations of your request. :) Do you mean "Google wants the name and signature of __char_traits_length_unchecked<_Traits> to remain unchanged, please don't change it to __assert_not_null"? Or do you mean "Google wants the implementation of __char_traits_length_unchecked<_Traits> to remain unchanged, please don't change its body to use _LIBCPP_ASSERT"? I assumed you meant only the former; @ldionne's reply suggests that maybe you meant the latter. (IMVHO the latter is unreasonable, whereas the former is understandable but I agree Google should work with us to find a path forward and/or document what they need, because the current name-and-signature of __char_traits_length_unchecked is pretty weird.)

Agreed -- if only the name and signature must not change, we should be able to do this instead without breaking Google (a simple merge conflict resolution should be sufficient):

template <class _Traits>
_LIBCPP_INLINE_VISIBILITY
_LIBCPP_CONSTEXPR
inline size_t __char_traits_length_checked(const typename _Traits::char_type* __s) _NOEXCEPT {
  return _LIBCPP_ASSERT(__s != nullptr, "null pointer pass to non-null argument of char_traits<...>::length"), _Traits::length(__s);
}

It doesn't change what I've said above -- we don't want to be "held hostage" to invisible downstream code, but this could definitely be a way to unblock the assertions work without making @EricWF 's life unnecessarily difficult. It would still be nice to understand what's the internal hook and whether there's a better way to properly support that in upstream.

ldionne updated this revision to Diff 414131.Mar 9 2022, 9:34 AM

Address concerns after discussing offline with Eric. This should provide an acceptable balance between making progress and not unduly breaking a vendor until they've managed to remove their downstream debt.

We would very much be open to upstreaming a version of change in order to reduce the burden to both the project and Google. We are working diligently at the moment to remove the patch internally entirely. It's the primary focus of my work at the moment. So anything we upstream would be temporary on the order of 6 months to a year. Let me know what you think is appropriate here.

Otherwise, this changed looks fine to us. If you wouldn't mind giving me until this afternoon so I can negotiate landing the required changes at Google. The changes are simple but require some finesse to ensure they land at the right time

libcxx/include/string_view
244

If you use the ternary operator, you can make it a single statement without having to worry about custom char types that hijack ADL for operator,

Which is why it was written out that way initially.

Non built-in character types are allowed in string view, right?

EricWF accepted this revision.Mar 10 2022, 6:44 AM
This revision is now accepted and ready to land.Mar 10 2022, 6:44 AM
libcxx/include/string_view
244

I'm not thrilled by return _LIBCPP_ASSERT... either, but I do agree with Louis that I'd rather see _LIBCPP_ASSERT than the raw call to __libcpp_debug_function.
To fix the operator, issue, it's cheap and easy to replace the comma with , void(), ; I think we should do that.

We would very much be open to upstreaming a version of change in order to reduce the burden to both the project and Google. We are working diligently at the moment to remove the patch internally entirely. It's the primary focus of my work at the moment. So anything we upstream would be temporary on the order of 6 months to a year. Let me know what you think is appropriate here.

If we can upstream something like a temporary escape hatch to change the behavior of std::string_view so it accepts nullptr, I think it's reasonable, and we can remove it in the future once you've managed to migrate off of it. My understanding is that the diff is rather small and non viral.

Otherwise, this changed looks fine to us. If you wouldn't mind giving me until this afternoon so I can negotiate landing the required changes at Google. The changes are simple but require some finesse to ensure they land at the right time

Yes, no worries. I'll ping you on Discord this afternoon to check if it's okay to land this.

libcxx/include/string_view
244

You mean like __s ? _Traits::length(__s) : _LIBCPP_ASSERT(false, "message")? Yes, I think that would work. However, _LIBCPP_ASSERT is an expression casted to (void), so I don't think it's possible to hijack operator, as currently written -- the statement expands roughly to ((void)assertion-expr), _Traits::length(__s).

Non built-in character types are allowed in string view, right?

Yes, I definitely believe that's the case.

Yes, no worries. I'll ping you on Discord this afternoon to check if it's okay to land this.

libcxx/include/string_view
244

That makes the type of the expression void, no?

EricWF added inline comments.Mar 10 2022, 7:44 AM
libcxx/include/string_view
244

That makes the type of the expression void, no?

To answer my own question: Only if it's done to the last operand.

I thought we poisoned operator, in the test suite. If we still do, we should ensure there is a test that would have caught the uncasted operator, usage..

libcxx/include/string_view
244

Right, I meant

return _LIBCPP_ASSERT(__s != nullptr, "null pointer passed to non-null argument of char_traits<...>::length"), void(), _Traits::length(__s);

@ldionne's reply indicates that it's not possible to regression-test this, because the type of _LIBCPP_ASSERT(...) already happens to be void. (So the , void() does nothing. But I still think it might be a good idea, so the maintainer doesn't have to go look up the type of _LIBCPP_ASSERT.)

I thought we poisoned operator, in the test suite.

The test iterators do poison it for themselves alone. But (at the moment) it's not possible for us to provide a completely generic top-level hijacker template<class T, class U> void operator,(T, U); because some of the test files use the built-in operator, in for-loops and stuff. We'd first have to remove all for (...; ++i, ++j) from the test files, and then we could turn on a generic operator, if we wanted.

EricWF added inline comments.Mar 10 2022, 1:03 PM
libcxx/include/string_view
244

Or we could put the operator, overload in namespace std. Then we only have to fix cases where ADL would kick in within the tests.

I could have sworn I did that previously. Like it lived in nasty_macro.h.

libcxx/include/string_view
244

Then we only have to fix cases where ADL would kick in

Apparently a catch-all operator,(T,U) still doesn't affect when both sides are int; but it turns out that a lot of those ++i, ++j have i as some container iterator type.

After digging into that, I've got some local diffs inserting , void(), in test code. I might turn that into a PR in a day or two if nobody beats me to it.

I could have sworn I did that previously.

(FYI, nasty_macro.h is now libcxx/test/libcxx/nasty_macros.compile.pass.cpp.) Nope. I mean, it does ring a bell vaguely for me too, but I see no trace in the codebase, so I assume I must be thinking of the test iterators. struct nasty_mutex does overload its own operator,, too.