This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Set correct size at the end of growing std::string
ClosedPublic

Authored by AdvenamTacet on Apr 19 2023, 1:56 AM.

Details

Summary

This commit deprecates std::basic_string::__grow_by, which is part of ABIv1. The function is replaced by std::basic_string:__grow_by_without_replace, which is not part of ABI.

  • The original function __grow_by is deprecated because it does not set the string size, therefore it may not update the size when the size is changed, and it may also not set the size at all when the string was short initially. This leads to unpredictable size value. It is not removed or changed to avoid breaking the ABI.
  • The commit adds _LIBCPP_HIDE_FROM_ABI guarded by _LIBCPP_ABI_VERSION >= 2 to __grow_by. This allows the function to be used in the dylib in ABIv1 without raising the [abi:v170000] error and removes it from future ABIs. _LIBCPP_HIDE_FROM_ABI_AFTER_V1 cannot be used.
  • Additionally, __grow_by has been removed from _LIBCPP_STRING_UNSTABLE_EXTERN_TEMPLATE_LIST in libcxx/include/__string/extern_template_lists.h.

This bugfix is necessary to implement string ASan annotations, because it mitigates the problems encountered in D132769.

Diff Detail

Event Timeline

AdvenamTacet created this revision.Apr 19 2023, 1:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 19 2023, 1:56 AM
AdvenamTacet requested review of this revision.Apr 19 2023, 1:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 19 2023, 1:56 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

Is there a test that can catch this issue?

philnik requested changes to this revision.Apr 20 2023, 7:05 AM

I don't think we can do this change. This changes the postconditions of a function in the dylib, which is an ABI break.

This revision now requires changes to proceed.Apr 20 2023, 7:05 AM

Is there a test that can catch this issue?

@ldionne __grow_by is private and every public function using it increases the size and eventually sets a new value. So I'm not sure how to create a test for it.

I realized that it's incorrect when I tried to use .size() value to annotate container and it failed.

This changes the postconditions of a function in the dylib, which is an ABI break.

@philnik I would argue that it's a bug and should be fixed.

https://github.com/llvm/llvm-project/blob/75196f8e72be3f18c5a831e23f385c4ae3eb62b5/libcxx/include/string#L703-L720

Let's look at layout (ASL as example, but the same reasoning works with the second layout):

struct __long
{
    pointer   __data_;
    size_type __size_;
    size_type __cap_ : sizeof(size_type) * CHAR_BIT - 1;
    size_type __is_long_ : 1;
};

struct __short
{
    value_type __data_[__min_cap];
    unsigned char __padding_[sizeof(value_type) - 1];
    unsigned char __size_ : 7;
    unsigned char __is_long_ : 1;
};

If string is short, its size is in its last byte, rest of the object is its content.
Then __grow_by is called and string is going to became long. The function sets data pointer and capacity (bytes at the beginning and the end, but middle bytes (size) are unchanged).

__set_long_pointer(__p);
__set_long_cap(__allocation.count);

It means that size is set to "random value" (old content) as previously it was strings content (eg. it could be 0x4974277341537472 - `"It'sAStr" - and it will be its size now).

I see no value in that kind of behavior and I think it may be changed.


But let me know if it's desired/cannot be changed, I will modify D132769 so __grow_by unpoisons memory for future elements already, then a call to .size() is not necessary.

Is there a test that can catch this issue?

@ldionne __grow_by is private and every public function using it increases the size and eventually sets a new value. So I'm not sure how to create a test for it.

I realized that it's incorrect when I tried to use .size() value to annotate container and it failed.

This changes the postconditions of a function in the dylib, which is an ABI break.

@philnik I would argue that it's a bug and should be fixed.

https://github.com/llvm/llvm-project/blob/75196f8e72be3f18c5a831e23f385c4ae3eb62b5/libcxx/include/string#L703-L720

Let's look at layout (ASL as example, but the same reasoning works with the second layout):

struct __long
{
    pointer   __data_;
    size_type __size_;
    size_type __cap_ : sizeof(size_type) * CHAR_BIT - 1;
    size_type __is_long_ : 1;
};

struct __short
{
    value_type __data_[__min_cap];
    unsigned char __padding_[sizeof(value_type) - 1];
    unsigned char __size_ : 7;
    unsigned char __is_long_ : 1;
};

If string is short, its size is in its last byte, rest of the object is its content.
Then __grow_by is called and string is going to became long. The function sets data pointer and capacity (bytes at the beginning and the end, but middle bytes (size) are unchanged).

__set_long_pointer(__p);
__set_long_cap(__allocation.count);

It means that size is set to "random value" (old content) as previously it was strings content (eg. it could be 0x4974277341537472 - `"It'sAStr" - and it will be its size now).

I see no value in that kind of behavior and I think it may be changed.


But let me know if it's desired/cannot be changed, I will modify D132769 so __grow_by unpoisons memory for future elements already, then a call to .size() is not necessary.

I don't disagree that it's a bug, but if we started to rely on __grow_by setting the size anywhere (or the compiler removes redundant stores), it would result in broken code and unhappy users. Maybe we could make a new function that calls __grow_by which has the correct postconditions? We could then remove __grow_by from ABIv2 and add the new function. That way we have the better interface without and ABI break.

AdvenamTacet added a comment.EditedApr 21 2023, 2:39 AM

Maybe we could make a new function that calls __grow_by which has the correct postconditions?

Sounds good to me. As __grow_by_and_replace exists, maybe a new function may be named __grow_by_without_replace?


While I was reading it, I realized that my patch suggestion is also incorrect. It should be __set_long_size(__old_sz - __n_del + __n_add), I will appreciate if someone confirms it.

This update:

  • adds a new function __grow_by_without_replace,
  • replaces calls to __grow_by with the new function,
  • adds new names to ABI (is it correct?).

Should I change something?

philnik added inline comments.May 17 2023, 3:47 PM
libcxx/include/__string/extern_template_lists.h
31

This is unfortunately also not possible (at least without some policy changes). Fortunately it's also not necessary for this patch.

80–81

Let's remove this when we add __grow_by_without_replace.

libcxx/include/string
1921–1930

We should mark this as _LIBCPP_DEPRECATED_("use __grow_by_without_replace") to avoid calling it accidentally.

1931

This should be something like _LIBCPP_HIDE_FROM_ABI_V1 or similar.

2383

Please clang-format!

2389

The template arguments shouldn't be necessary.

AdvenamTacet added inline comments.May 17 2023, 9:17 PM
libcxx/include/__string/extern_template_lists.h
31

I don't understand.

philnik added inline comments.May 17 2023, 9:27 PM
libcxx/include/__string/extern_template_lists.h
31

Adding a new symbol to the dylib requires having availability annotations to aid in back-porting to older systems. If we add a new symbol in string, essentially all users would be required to target the latest OS when they use std::string in their programs, which isn't really an option.

AdvenamTacet marked 7 inline comments as done.

This update fixes issues mentioned in the code review:

  • adds _LIBCPP_DEPRECATED_("use __grow_by_without_replace"),
  • adds _LIBCPP_HIDE_FROM_ABI_AFTER_V1,
  • changes deprecated call to __grow_by in __grow_by_without_replace with a copy-paste code of the function.
  • in libcxx/include/__string/extern_template_lists.h removes references to __grow_by and __grow_by_without_replace,
  • clang-format, I hope this time I used the correct config!

I hope I understood all comments. I'm in the middle of testing right now.

libcxx/include/__string/extern_template_lists.h
31

Ok, I removed all references to __grow_by_without_replace.

80–81

Removed __grow_by from the file.

libcxx/include/string
1921–1930

Cannot call deprecated function, so I had to copy-paste the code.

1931

I used _LIBCPP_HIDE_FROM_ABI_AFTER_V1.

To the best of my knowledge, that implementation is correct.

  • The same value as in __grow_by_and_replace.
  • D132769 is working with that patch.
  • I tested it with more than unit tests.

The patch seems to be ready to land.

ldionne added inline comments.Jun 12 2023, 12:21 PM
libcxx/include/__string/extern_template_lists.h
30–31

We can't remove this function from the explicit instantiation list, since that would be an ABI break.

libcxx/include/string
1932

I would be tempted to make this _LIBCPP_HIDE_FROM_ABI and not include it in the explicit instantiation list. So far, our explicit instantiation lists have only made our lives really hard with std::string (e.g. getting in the way of fixing bugs).

2351–2356

Please document the bug that this function has and why we're keeping it around.

AdvenamTacet marked 2 inline comments as done.

As suggested, this update:

  • reverts changes in libcxx/include/__string/extern_template_lists.h,
  • in a comment, describes the bug and explains why the function is preserved,
  • adds _LIBCPP_HIDE_FROM_ABI to __grow_by_without_replace (),
  • removes changes in libcxx/lib/abi.

I hope, the patch may land after that change.

AdvenamTacet marked an inline comment as done.Jun 13 2023, 6:14 PM
philnik added inline comments.Jun 14 2023, 9:21 AM
libcxx/include/string
1921–1930

You can use _LIBCPP_SUPPRESS_DEPRECATED_PUSH and _LIBCPP_SUPPRESS_DEPRECATED_POP to ignore the diagnostic. IMO that's also what we should do to avoid code bloat.

1932

I don't really see the problem of adding this to the ABIv2 list. It's not like we will freeze ABIv2 any time soon, and we probably want to reassess what we export from the dylib and what we don't before doing that anyways.

2348

It's not like we would care about the old behavior otherwise.

AdvenamTacet marked an inline comment as done.

This update improves a comment.

AdvenamTacet added inline comments.Jun 14 2023, 11:39 AM
libcxx/include/string
1921–1930

Thx! I tried it, but __grow_by doesn't work because of _LIBCPP_HIDE_FROM_ABI_AFTER_V1:

undefined reference to `std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__grow_by[abi:v170000](unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

Is there one more work-around?

It works when I remove _LIBCPP_HIDE_FROM_ABI_AFTER_V1, but I believe it's worse solution than a few more copy-pasted lines.
Let me know what you think!

1932

If you decide that I should revert the revert, let me know, I don't have an opinion about it. But also I don't see any real advantage of adding this function to ABI.

AdvenamTacet edited the summary of this revision. (Show Details)Jun 30 2023, 6:00 PM
ldionne accepted this revision.Jul 5 2023, 6:08 AM

LGTM with a suggestion to rephrase a comment.

libcxx/include/string
2346–2348

Just rewording, IMO this is easier to read. WDYT?

AdvenamTacet marked an inline comment as done.
AdvenamTacet edited the summary of this revision. (Show Details)

This update:

  • applies comment change suggested by @ldionne (thank you!),
  • resolves conflicts.
AdvenamTacet marked an inline comment as not done.Jul 5 2023, 4:54 PM
AdvenamTacet added inline comments.
libcxx/include/string
1921–1930

Hey @philnik, does it look ok for you? If yes, please, accept the review. If you want me to work on something, please, let me know how I should improve it.

2346–2348

Looks good, thank you! I applied that change.

A fix for rebase.

Removing std::__debug_db_invalidate_all(this); from __grow_by_without_replace as it was removed form __grow_by.

philnik added inline comments.Jul 10 2023, 12:57 PM
libcxx/include/string
1921–1930

Where does that happen? The inline version should always be emitted, so there should never be a linker error with [abi:v170000]. IMO code duplication isn't the right thing here, since the code is quite complex and will get out of sync in the future.

To test the issue on CI, this update does:

  • rebase,
  • add _LIBCPP_SUPPRESS_DEPRECATED_PUSH and _LIBCPP_SUPPRESS_DEPRECATED_POP to not duplicate code.

I hope I did it correctly.

AdvenamTacet added inline comments.Jul 10 2023, 8:45 PM
libcxx/include/string
1921–1930

Hey @philnik, I recreated this error on CI. Let me know if I used _LIBCPP_SUPPRESS_DEPRECATED_POP correctly and if you know how to fix it. I agree that code duplication is far from ideal, but I don't know how to avoid it, keeping everything other requested. Possibly I'm just incorrectly using a macro.

philnik added inline comments.Jul 11 2023, 2:17 PM
libcxx/include/string
1921–1930

Ah, I understand the problem now. I think we can just do

#if _LIBCPP_ABI_VERSION >= 2 // We want to use the function in the dylib in ABIv1
  _LIBCPP_HIDE_FROM_ABI
#endif

That should solve the problem.

Testing a potential fix.

This patch reverts previous attempt and is fixing the preoblem.

The solution is like suggested, using _LIBCPP_ABI_VERSION but around _LIBCPP_HIDE_FROM_ABI_AFTER_V1. Locally works.

philnik accepted this revision.Jul 12 2023, 10:22 PM

LGTM with green CI % nit.

libcxx/include/string
1923

This is already guarded; no need to make it more confusing.

This revision is now accepted and ready to land.Jul 12 2023, 10:22 PM
AdvenamTacet marked 4 inline comments as done.

This update:

  • adds a comment explaining why we guard _LIBCPP_HIDE_FROM_ABI,
  • changes guarded _LIBCPP_HIDE_FROM_ABI_AFTER_V1 to _LIBCPP_HIDE_FROM_ABI.

I will land the patch, if CI stays green.

AdvenamTacet edited the summary of this revision. (Show Details)Jul 13 2023, 8:10 PM

This update removes __grow_by from _LIBCPP_STRING_UNSTABLE_EXTERN_TEMPLATE_LIST in libcxx/include/__string/extern_template_lists.h.

Hey @philnik, I made additional self-review and realized that the commit does not remove the function from _LIBCPP_STRING_UNSTABLE_EXTERN_TEMPLATE_LIST, so I fix it in this update. Unfortunatelly, _LIBCPP_HIDE_FROM_ABI_AFTER_V1 still does not work.

AdvenamTacet added inline comments.Jul 13 2023, 8:29 PM
libcxx/include/string
1921–1930

Thank you for suggestions! It works!

But fact that we cannot use _LIBCPP_HIDE_FROM_ABI_AFTER_V1 made me ask a question, if we are doing everything correctly, which led to realization that I don't remove __grow_by function from ABIv2. I fixed it by removing it from _LIBCPP_STRING_UNSTABLE_EXTERN_TEMPLATE_LIST.

Unfortunately, _LIBCPP_HIDE_FROM_ABI_AFTER_V1 still does not work.
Now I believe everything it correct and ready to land. I will land it Tomorrow.