This is an archive of the discontinued LLVM Phabricator instance.

[libc++][NFC] Inline the string constructors
ClosedPublic

Authored by philnik on Jun 17 2022, 11:40 AM.

Details

Reviewers
ldionne
EricWF
Group Reviewers
Restricted Project
Commits
rG87abc5013ac0: [libc++][NFC] Inline the string constructors
Summary

This removes a lot of boilerplate code.

Diff Detail

Event Timeline

philnik created this revision.Jun 17 2022, 11:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 11:40 AM
philnik requested review of this revision.Jun 17 2022, 11:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 11:40 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
EricWF requested changes to this revision.Jun 20 2022, 2:13 PM
EricWF added a subscriber: EricWF.

I don't think this change is something we should do.

First, this has a lot of functional changes. Like allowing the compiler to inline these functions, which we've explicitly externally instantiated and were consciously put out of line for that reason. This will result in a large code size increase.
And it's not something I think we should do to our users. It largely negates the ABI exports libc++ intentional created.

Second, IMHO, the boilerplate is fine. It's not boilerplate we can get wrong, since the compiler will enforce correctness. I don' think this boilerplate is significant enough to warrant the code size changes.

This revision now requires changes to proceed.Jun 20 2022, 2:13 PM

First, this has a lot of functional changes.

Have I ever claimed there aren't any functional changes?

Like allowing the compiler to inline these functions, which we've explicitly externally instantiated and were consciously put out of line for that reason.

I'm aware that there are some functions we've consciously not marked inline, but I don't know which ones they are (assuming there are any which aren't explicitly declared in extern_template_lists.h). For example basic_string(const CharT*, const Allocator&) isn't marked inline, but also isn't in the exported functions list. It looks to me a lot like we've just forgotten to mark it inline. If we do that consciously I'd argue that we should write it inline and mark it _LIBCPP_NOINLINE or something like it. That makes it obvious that we didn't just forget to mark a function inline.

This will result in a large code size increase.
And it's not something I think we should do to our users. It largely negates the ABI exports libc++ intentional created.

Most of the functions I've inlined were already marked inline, so for those this is purely NFC.

Second, IMHO, the boilerplate is fine. It's not boilerplate we can get wrong, since the compiler will enforce correctness. I don' think this boilerplate is significant enough to warrant the code size changes.

The compiler also knows that std::basic_string<char, char_traits<char>, my_fancy_allocator<char>>::iterator is correct. That doesn't mean I should use it instead of auto. I should, in fact, use auto to make it easier for myself and others to read it.

Second, IMHO, the boilerplate is fine. It's not boilerplate we can get wrong, since the compiler will enforce correctness. I don' think this boilerplate is significant enough to warrant the code size changes.

The compiler also knows that std::basic_string<char, char_traits<char>, my_fancy_allocator<char>>::iterator is correct. That doesn't mean I should use it instead of auto. I should, in fact, use auto to make it easier for myself and others to read it.

Fewer letters != more readable. auto contains fewer characters, but also contains zero information. The more readable code is that which spells out the return type. std::basic_string<Char, Traits, Alloc>::iterator is verbose, but it's also perfectly readable.
You're in the standard library. You can read templates just fine. We optimize for correctness, not ease of readability. And while those two are related, it's important to remember which one we're actually optimizing for.

And what happens if we accidentally change the return type of the function because we changed the function body, like returning something convertible to the iterator type rather than the iterator?

Perhaps, in this case, requiring the compiler to instantiate the function in order to determine its return type isn't a problem in your example, but it is in a bunch of other cases.
We should be using auto very sparingly. In fact, we should only ever use it when the standard mandates.

Further, I don't think moving the function bodies does anything other than make the class declaration less readable. Which is bad because ideally we should be able to visually inspect the class declaration to see if it matches the standard (std::string is a long way off from that ATM, but that's not license to make it worse).

@ldionne are you convinced this change improves readability? If so, can you say why?
@ldionne What are your thoughts on using auto return types in libc++ when the standard doesn't mandate it?

And what happens if we accidentally change the return type of the function because we changed the function body, like returning something convertible to the iterator type rather than the iterator?

I wasn't talking about function return types. I had more something in mind where you have to auto it = first;, where it's obviously clear what the type is. I think you might be correct that we shouldn't use auto as a return type on standard-mandated functions, but I think that's debatable. Anyways, this patch isn't changing any return types to auto, to there is no point in discussing it here further.

Further, I don't think moving the function bodies does anything other than make the class declaration less readable. Which is bad because ideally we should be able to visually inspect the class declaration to see if it matches the standard (std::string is a long way off from that ATM, but that's not license to make it worse).

We obviously disagree here. I see no reason to have template <class _CharT, class _Traits, class _Allocator> hundreds of times, _LIBCPP_CONSTEXPR_AFTER_CXX17 two times as often as needed or basic_string<_CharT, _Traits, _Allocator> instead of basic_string in the file without any real use.

@ldionne are you convinced this change improves readability? If so, can you say why?

Yes, I personally do find that it does improve readability. I've always been put off by the fact that we define functions out-of-line, and have to repeat all kinds of crazy declarations (and often complicated SFINAE). To me that just doesn't make much sense, given that our API is dictated by the Standard and documented in our top-of-header synopsis. So repeating the declarations one more time in the class declaration does not add a lot of value, in my opinion. Of course in the end this is kind of a matter of taste.

@ldionne What are your thoughts on using auto return types in libc++ when the standard doesn't mandate it?

I don't think that would be a good idea, but this patch doesn't seem to introduce this so I think we're good.


I think the real potential concern with this patch is indeed turning functions that were not marked as inline into inline functions, since that could have an impact on codegen. I suggest we make this patch a NFC by only moving functions that were already marked as inline into the class. After that, we should be left with only out-of-line functions outside of the class. We won't know which ones were intentionally left out-of-line and which ones were accidental, but we can take a stab and consider each function individually, moving them into the class when we think it's the intent. That can be done in followups.

philnik updated this revision to Diff 455946.Aug 26 2022, 9:38 AM
  • Only inline constructors already marked inline
philnik retitled this revision from [libc++] Inline the string constructors to [libc++][NFC] Inline the string constructors.Aug 26 2022, 9:38 AM
ldionne accepted this revision.Sep 2 2022, 8:13 AM
This revision was not accepted when it landed; it landed in state Needs Review.Sep 17 2022, 9:00 AM
This revision was automatically updated to reflect the committed changes.