This is an archive of the discontinued LLVM Phabricator instance.

Split _LIBCPP_STRING_EXTERN_TEMPLATE_LIST up into a V1 and UNSTABLE version.
ClosedPublic

Authored by mvels on Feb 19 2020, 1:35 PM.

Details

Summary

This change splits the _LIBCPP_STRING_EXTERN_TEMPLATE_LIST up into a _LIBCPP_STRING_V1_EXTERN_TEMPLATE_LIST containing the stable ABI, and a _LIBCPP_STRING_UNSTABLE_EXTERN_TEMPLATE_LIST containing the unstable ABI.

The purpose is to explicitly define and maintain the two lists, where the unstable ABI allows for ABI breaking changes for purposes such as optimization while offering a strong guarantee that any change inside the unstable ABI does not affect the stable ABI.

As per the comment in the __string header, we do still allow etries to be added to the stable ABI list as the c++ versions and corresponding c++ std API changes.

Merge branch 'master' into unstable-abi-list

Event Timeline

mvels created this revision.Feb 19 2020, 1:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2020, 1:35 PM
mvels updated this revision to Diff 245527.Feb 19 2020, 2:43 PM

Inline basic_string::erase for fastpath where __n == npos

This change checks for the case where people want to erase a string to the end, i.e., __n == npos, and inlines the call if so.

This also demonstrates keeping the ABI intact for V1, but inlining the erase() method for unstable.

mvels updated this revision to Diff 245530.Feb 19 2020, 2:47 PM

Fix external template list

mvels updated this revision to Diff 245532.Feb 19 2020, 2:48 PM

Fix Phabricator (arc tool is not very bright)

My personal preference is to not split the lists, my rational is:

  • It encourages unstable to deviate from stable more. I would like us to keep the code and instantiations between the two as similar as possible. The more they differ, the more it is to maintain and the more likely it is that one contains bugs or starts to rot.
  • It's useful to see the difference between stable and unstable at a glance. Having two lists makes it much harder.

For reference, here's my attempt at a similar thing:
https://reviews.llvm.org/D73807

libcxx/include/__string
76

We should add a design doc for all @mvels work on std::string and its instantiations.

See http://libcxx.llvm.org/docs/#design-documents for a list of current design docs, so you can see how libc++ does them.
They can be found in the source tree under libcxx/docs/

EricWF added inline comments.Feb 19 2020, 2:55 PM
libcxx/include/__string
76

Note: I don't want the doc to hold up these changes specifically, but we should work towards adding one.

My personal preference is to not split the lists, my rational is:

  • It encourages unstable to deviate from stable more. I would like us to keep the code and instantiations between the two as similar as possible. The more they differ, the more it is to maintain and the more likely it is that one contains bugs or starts to rot.

I actually think that acknowledging that they are different instantiation lists makes everything clearer and easier to maintain. We don't "pretend" that they are close when they are really two different things. As we embrace changes in the unstable version of the ABI, I don't see why we'd strive to keep it close to the stable ABI. We should simply setup testers for both ABIs, since both seem to be in use.

  • It's useful to see the difference between stable and unstable at a glance. Having two lists makes it much harder.

Do you really expect to need to see that difference often? If so, we could sort the lists and then diff would show you the differences easily.

My primary concern is that I find it harder to maintain a list strung together via macro expansions than a manually laid-out list like in this patch. It also allows us to have an invariant, which is that the stable list should never change (except with extreme care).

ldionne accepted this revision.Feb 20 2020, 6:48 AM

I personally like this change so I'm going to approve it, but let's wait to see if we can get @EricWF 's approval too. If not, we'll have to find another way.

This revision is now accepted and ready to land.Feb 20 2020, 6:48 AM
mvels added a comment.Feb 20 2020, 8:06 PM

Thanks for the Feedback Eric, personally I think this version or your alternative are close, and both are based on the same (solid) principle of 'make the ABI explicit'

I do like (and prefer) Luis preference of explicitly specifying (decoupling) the V1 and UNSTABLE template list.
While somewhat more verbose, it's cleaner, and easy to validate 'unstable' changes vs 'extend new c++ API in stable'

I also think we may have to consider a future where we may want to roll unstable into a new V2 stable version (and new V3 / unstable).
Having the tables separate makes this imo a cleaner cut. (And I had an interesting discussion with Titus this morning how we should actually consider any future Vn versions as a new llvm-project rather than introducing versions, alas, that's crazy talk).

Anyway, I don't think we are strongly married into one or the other, if we don't like it for whatever reason, the changes are mostly re-arranging the explicit template macros in __string and a handful of lines in string / string.cpp, likely less than the heavy lifting you already did for making the external templates explicit (again, thanks for that!) . From out chat I understand that you are ok with this version if there is a majority preference, let's meet up when you are back next week.

I am working on two documents, one outlining the motivation, consequences and options for tweaking the unstable ABI, and a separate doc summarizing the intended string optimizations (some in detail and general principles). Hopefully I have something worth sharing tomorrow, thanks!

mvels closed this revision.Feb 20 2020, 8:56 PM
EricWF added inline comments.Feb 23 2020, 4:21 PM
libcxx/include/string
4341–4347

We always introduce a named macro for each ABI break rather than gating them generically.

Please fix this in upstream.

libcxx/src/string.cpp
23–29

We always introduce a named macro for each ABI break rather than gating them generically.

Please fix this in upstream.

My personal preference is to not split the lists, my rational is:

  • It encourages unstable to deviate from stable more. I would like us to keep the code and instantiations between the two as similar as possible. The more they differ, the more it is to maintain and the more likely it is that one contains bugs or starts to rot.

I actually think that acknowledging that they are different instantiation lists makes everything clearer and easier to maintain. We don't "pretend" that they are close when they are really two different things. As we embrace changes in the unstable version of the ABI, I don't see why we'd strive to keep it close to the stable ABI. We should simply setup testers for both ABIs, since both seem to be in use.

Ideally we'll keep the unstable and stable ABI lists as close as possible, with the goal of moving all unstable instantiations into stable.

  • It's useful to see the difference between stable and unstable at a glance. Having two lists makes it much harder.

Do you really expect to need to see that difference often? If so, we could sort the lists and then diff would show you the differences easily.

My primary concern is that I find it harder to maintain a list strung together via macro expansions than a manually laid-out list like in this patch. It also allows us to have an invariant, which is that the stable list should never change (except with extreme care).

My personal preference is to not split the lists, my rational is:

  • It encourages unstable to deviate from stable more. I would like us to keep the code and instantiations between the two as similar as possible. The more they differ, the more it is to maintain and the more likely it is that one contains bugs or starts to rot.

I actually think that acknowledging that they are different instantiation lists makes everything clearer and easier to maintain. We don't "pretend" that they are close when they are really two different things. As we embrace changes in the unstable version of the ABI, I don't see why we'd strive to keep it close to the stable ABI. We should simply setup testers for both ABIs, since both seem to be in use.

Ideally we'll keep the unstable and stable ABI lists as close as possible, with the goal of moving all unstable instantiations into stable.

Are you saying that we'd want to put explicit instantiations used by the unstable ABI into the .so in the stable ABI, too? I don't understand why we'd do that?