This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Remove generic char_traits implementation
ClosedPublic

Authored by philnik on Aug 3 2023, 5:26 PM.

Details

Reviewers
Mordante
Group Reviewers
Restricted Project
Commits
rGe30a148b098d: [libc++] Remove generic char_traits implementation
Summary

This has been deprecated and should be removed now.

Diff Detail

Event Timeline

philnik created this revision.Aug 3 2023, 5:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2023, 5:26 PM
philnik requested review of this revision.Aug 3 2023, 5:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2023, 5:26 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Mordante requested changes to this revision.Aug 7 2023, 9:14 AM
Mordante added a subscriber: Mordante.

I've not looked closely at the patch, please update the Release notes for LLVM-18.

This revision now requires changes to proceed.Aug 7 2023, 9:14 AM
philnik updated this revision to Diff 548327.Aug 8 2023, 1:17 PM

Address comments

libcxx/test/std/utilities/format/format.formattable/concept.formattable.compile.pass.cpp
416

@Mordante Does this seem fine to you? I don't think it makes sense to test int here, since from the looks of it this is about testing character types.

Mordante accepted this revision.Aug 9 2023, 9:51 AM

LGTM modulo some nits.

libcxx/docs/ReleaseNotes/18.rst
53–54
56
libcxx/test/std/utilities/format/format.formattable/concept.formattable.compile.pass.cpp
416

Yes I'm fine to remove it.

This revision is now accepted and ready to land.Aug 9 2023, 9:51 AM
This revision was landed with ongoing or failed builds.Aug 9 2023, 3:55 PM
This revision was automatically updated to reflect the committed changes.
philnik marked 3 inline comments as done.
gulfem added a subscriber: gulfem.Aug 9 2023, 6:45 PM

It seems like this change broke building LLDB:

In file included from /home/martin/code/llvm-project/lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp:17:
In file included from /home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/thread:92:
In file included from /home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/__thread/formatter.h:18:
In file included from /home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/__format/formatter_integral.h:21:
In file included from /home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/__format/formatter_output.h:22:
In file included from /home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/__format/parser_std_format_spec.h:39:
/home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/string:723:46: error: implicit instantiation of undefined template 'std::char_traits<unsigned char>'
  723 |     static_assert(( is_same<_CharT, typename traits_type::char_type>::value), 
      |                                              ^
/home/martin/code/llvm-project/lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp:634:34: note: in instantiation of template class 'std::basic_string<unsigned char>' requested here
  634 |       std::basic_string<uint8_t> zeros(reg_info.byte_size, '\0');
      |                                  ^
/home/martin/clang-nightly/aarch64-w64-mingw32/include/c++/v1/__fwd/string.h:23:29: note: template is declared here
   23 | struct _LIBCPP_TEMPLATE_VIS char_traits;
      |                             ^

In this case, it seems like the fix is quite trivial:

diff --git a/lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp b/lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
index 4efc454967a1..f9d95fc5d2a6 100644
--- a/lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
@@ -631,7 +631,7 @@ static void WriteRegisterValueInHexFixedWidth(
   } else {
     // Zero-out any unreadable values.
     if (reg_info.byte_size > 0) {
-      std::basic_string<uint8_t> zeros(reg_info.byte_size, '\0');
+      std::vector<uint8_t> zeros(reg_info.byte_size, '\0');
       AppendHexValue(response, zeros.data(), zeros.size(), false);
     }
   }
libcxx/include/__string/char_traits.h
79

FWIW, I've never seen this deprecation message in earlier builds of lldb - I rechecked old build logs, and this wasn't ever printed.

In addition to breaking LLDB's build itself, this change also broke the following test: https://github.com/llvm/llvm-project/blob/main/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/string_view/main.cpp that uses std::basic_string<unsigned char> - is it not allowed?

Error when building test subject.

[...]

Build Command Output:
[...]
In file included from [...]/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/string_view/main.cpp:2:
[...]/include/c++/v1/string:723:46: error: implicit instantiation of undefined template 'std::char_traits<unsigned char>'
  723 |     static_assert(( is_same<_CharT, typename traits_type::char_type>::value),
      |                                              ^
[...]/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/string_view/main.cpp:95:36: note: in instantiation of template class 'std::basic_string<unsigned char>' requested here
   95 |   std::basic_string<unsigned char> uchar_source(10, 'a');
      |                                    ^
[...]/include/c++/v1/__fwd/string.h:23:29: note: template is declared here
   23 | struct _LIBCPP_TEMPLATE_VIS char_traits;
      |                             ^

In addition to breaking LLDB's build itself, this change also broke the following test: https://github.com/llvm/llvm-project/blob/main/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/string_view/main.cpp that uses std::basic_string<unsigned char> - is it not allowed?

Error when building test subject.

[...]

Build Command Output:
[...]
In file included from [...]/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/string_view/main.cpp:2:
[...]/include/c++/v1/string:723:46: error: implicit instantiation of undefined template 'std::char_traits<unsigned char>'
  723 |     static_assert(( is_same<_CharT, typename traits_type::char_type>::value),
      |                                              ^
[...]/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/string_view/main.cpp:95:36: note: in instantiation of template class 'std::basic_string<unsigned char>' requested here
   95 |   std::basic_string<unsigned char> uchar_source(10, 'a');
      |                                    ^
[...]/include/c++/v1/__fwd/string.h:23:29: note: template is declared here
   23 | struct _LIBCPP_TEMPLATE_VIS char_traits;
      |                             ^

AFAIK it's not.

libcxx/include/__string/char_traits.h
79

Do you have -Wno-deprecated or something like that in the config? I don't know what we can do other than deprecating the things we want to remove and mentioning them in the release notes.

gribozavr2 added inline comments.Aug 10 2023, 8:53 AM
libcxx/include/__string/char_traits.h
79

This specialization is not used directly in LLDB's code. Thus the primary source location for the deprecation warning would be in another libc++ headers. Clang silences warnings in system headers.

I'll be honest, this change is kind of annoying. The deprecation itself was added in clang 17, which is yet to be released. IOW, most code bases that are using the deprecated char_traits haven't even had the chance to see those deprecation warnings. And now they hard-fail to build with clang trunk...

Also, uses of e.g. basic_string<unsigned char> do not show a deprecation warning with clang 17 at all, because the trigger ends up being in <string>, which is a system header (as mentioned in previous comments).

Oh and this one is fun too: error: implicit instantiation of undefined template 'std::char_traits<const char>'

I'll be honest, this change is kind of annoying. The deprecation itself was added in clang 17, which is yet to be released. IOW, most code bases that are using the deprecated char_traits haven't even had the chance to see those deprecation warnings. And now they hard-fail to build with clang trunk...

This has been deprecated since LLVM 16: https://releases.llvm.org/16.0.0/projects/libcxx/docs/ReleaseNotes.html#id6.

Also, uses of e.g. basic_string<unsigned char> do not show a deprecation warning with clang 17 at all, because the trigger ends up being in <string>, which is a system header (as mentioned in previous comments).

That's not great, but we can't do much about that. It would be great if clang could be improved, but we can't wait a decade to wait for deprecated features to be removed.

Oh and this one is fun too: error: implicit instantiation of undefined template 'std::char_traits<const char>'

That seems like a good thing to me, since it's almost definitely erroneous. You can't correctly implement things like assign for a const char.

thakis added a subscriber: thakis.Sep 12 2023, 2:53 PM

Late to the party, but this is somewhat inconvenient for us too. We also didn't see any warnings about this, for the reason @glandium mentions.

Generally for removals, it'd be cool if they could be put behind an ifdef for a release, so that users can do:

  1. Update libc++
  2. Run into these issues, add the define
  3. Update code

instead of having to do 3 before 1.

We've not update libc++ in months now since we've gone through a series of "update code" steps, and every time we were ready, something new like this had landed.

FWIW after speaking with @thakis offline, I do think we did "everything right", in the following sense. We added the deprecated attribute to the class and left it there across one full release (roughly 1 year). This should leave enough time for people to notice the warning when updating libc++, turn it off temporarily with -Wno-deprecated as they fix their code, and then re-enable the warning. I think this is the right model we want to give folks.

However, the major caveat that makes this whole thing a fiasco is that nobody ever saw the deprecation warning, because Clang doesn't show those from system headers (TIL). This is terrible and it means that no amount of good intentions on our side can overcome the fact that from the users perspective, there was basically 0 lead time until the removal.

We need to fix this. Another thing we could work on improving is the ability to silence just specific deprecation warnings in Clang, for example something like -Wno-deprecated=char_traits would be really nice. Until we have a reasonable path to remove these deprecated entities, I don't think we should be removing them (after all, we would never remove them without any lead time and this is exactly what we did here for all intents and purposes).

I am reintroducing the deprecated class here https://github.com/llvm/llvm-project/pull/66153.

Don't get me wrong, I really want to remove this class. However, I care more about not giving a bad experience to users, and we'll need to figure out how to handle deprecation attributes sanely in order to do that.

CC @aaron.ballman since we'll probably need to work with Clang to make this happen.

However, the major caveat that makes this whole thing a fiasco is that nobody ever saw the deprecation warning, because Clang doesn't show those from system headers (TIL).

So when someone creates a basic_string<int>, both basic_string and char_traits<int> being defined in headers, users do not get any warning despite the template specialization
originally being produced in user code?
At least, I am assuming this is what is happening.
It's something clang should fix, definitively. Although i can see that there are a few different options. Do we have an issue for that?

rnk added subscribers: cjdb, rnk.Sep 13 2023, 9:33 AM

FWIW after speaking with @thakis offline, I do think we did "everything right", in the following sense. We added the deprecated attribute to the class and left it there across one full release (roughly 1 year). This should leave enough time for people to notice the warning when updating libc++, turn it off temporarily with -Wno-deprecated as they fix their code, and then re-enable the warning. I think this is the right model we want to give folks.

I think this gets to some recent Clang discussions about adding library diagnostics. After you fix the system header problem, the next problem is that we have one giant -Wdeprecated-declarations flag, and in a sufficiently large codebase, it only takes a few deprecation attributes in code you don't control before you have to disable this warning globally. We have users who build with -Wno-deprecated-declarations, and it's not reasonable to say "oh, you turned off the warnings, so it's your own fault you missed the deprecation period". We need a solution for finer grained deprecation warnings, -Wdeprecated=libcxx, -Wdeprecated=char_traits, or a better spelling.

@cjdb , is there a github issue that folks can watch to get updates on this proposal?

I think this gets to some recent Clang discussions about adding library diagnostics. After you fix the system header problem, the next problem is that we have one giant -Wdeprecated-declarations flag, and in a sufficiently large codebase, it only takes a few deprecation attributes in ...

That's a great point. We talked a bit at this too. It doesn't necessarily need compiler support though, at least for libc++: What MS STL does iirc is to have one define per deprecation annotation to disable that particular deprecation annotation. In that model, when updating libc++ you'll see all new deprecation diags when updating, then disable them with those macros, and then fix the deprecated things async after updating. Then you'll remove your macros to make sure you don't backslide. That way you get scalable deprecation rollouts without clang changes. It "just" needs libc++ to use this model 🙂

The challenges here is that char_traits is both declared and instantiated in a system header.
We need a logic to somehow realize that the instantiation of char_traits was ultimately caused by user code.

Ideally, an instantiation of char_traits in a system header that is not caused by a user in some way would not produce warning.
So we need to walk back the stack of instantiation to see where things originated from and probably compare the types too

Consider (purely for demonstration purpose);

1/ A scenario caused by a user-requested specialization that we want to diagnose;

//libc++:
template<typename T, typename Trait = char_trait<T>> 
basic_string {};
// usercode
template basic_string<int>; // user messed up

2/ A scenario that the user is not responsible for at all, which is why we silence warnings in system headers

//libc++:
using __do_what_i_say_not_what_i_do  = char_trait<int>> ;

// user code
// user can't do on anything about warnings

A scenario caused by a user-requested specialization that is still unactionable for the user

//libc++:
template<typename T, typename Trait = char_trait<double>>
basic_string {};
// usercode
template basic_string<int>; // user can't act on a warning here

We need to allow 1 and not 2 & not 3 either preferably.

Exposing the diagnostics engine to code won't help clang refine the definition of system header.

thakis added inline comments.Sep 18 2023, 7:46 AM
libcxx/include/__string/char_traits.h
107

In https://github.com/mitchdowd/jnipp/pull/40 someone brought up ABI implications of this change. Most, but not all, methods of this class are marked _LIBCPP_HIDE_FROM_ABI. For example, this one isn't. Since these methods are all inline methods, it's probably fine, but does anyone know why some methods here have _LIBCPP_HIDE_FROM_ABI but others don't?

philnik added inline comments.Sep 18 2023, 7:58 AM
libcxx/include/__string/char_traits.h
107

_LIBCPP_INLINE_VISIBILITY is a deprecated alias for _LIBCPP_HIDE_FROM_ABI, so it is in fact marked _LIBCPP_HIDE_FROM_ABI. If any are missing that's an oversight that should be fixed.

thakis added inline comments.Sep 18 2023, 8:57 AM
libcxx/include/__string/char_traits.h
107

Thanks for explaining! I made https://github.com/llvm/llvm-project/pull/66661 so you don't have to explain this again in the future.

See https://github.com/llvm/llvm-project/pull/72694 for the patch we'll land in the LLVM 19 time frame.

libcxx/include/__string/char_traits.h