This is an archive of the discontinued LLVM Phabricator instance.

A possible direction for fixing https://llvm.org/bugs/show_bug.cgi?id=25973.
ClosedPublic

Authored by mclow.lists on Jan 4 2016, 11:50 AM.

Details

Summary

libc++'s basic_string class assumes that iterator operations on the iterators that are passed to it don't throw. This is wrong, and means that we don't meet the strong exception guarantees in the standard.

A general solution is to do all the work in a temporary string, and then merge/swap the result into the destination in the end.
However, this is wasteful (extra allocations and copying) when the iterator operations can't throw.

Here I introduce some scaffolding to support detecting these iterators.
I took an old name from <algorithm> that was no longer used: __libcpp_is_trivial_iterator as the basis.
A trivial iterator is a pointer, or one of libc++'s wrapper around another iterator: move_iterator, reverse_iterator and __wrap_iterator (when the wrapped iterator is trivial). These are assumed to never throw on indirection, next/prev, comparison and assignment.

Then I add a string-specific type trait: __libcpp_string_gets_noexcept_iterator, which is either a trivial iterator or something that has noexcept increment, dereference and comparison. When I get a non-input iterator where __libcpp_string_gets_noexcept_iterator<Iter>::value is true, I do the algorithm in-place.

Note that this diff is just for assign; there are several other cases to deal with in <string>.

I'm looking for feedback on the general direction.

Diff Detail

Event Timeline

mclow.lists retitled this revision from to A possible direction for fixing https://llvm.org/bugs/show_bug.cgi?id=25973..
mclow.lists updated this object.
mclow.lists added a subscriber: cfe-commits.
mclow.lists added inline comments.Jan 4 2016, 11:53 AM
include/string
1211

This is not quite right yet. I need to check for assignment, and the two commented out lines attempt to check for increment.

However, the first one fails when the iterator type is a pointer, and the second just fails.

1556

Drive-by improvement. I got tired of all the __is_input_iterator <_InputIterator>::value && !__is_forward_iterator<_InputIterator>::value, so I added __is_exactly_input_iterator

2522

This is just test scaffolding so that I can be sure that the correct overload is getting called.
Will be removed before checkin.

2538

Same as #2522

Howard suggested a fix for the "is incremental" problem in the type trait. I also added assignable.

mclow.lists marked an inline comment as done.Jan 4 2016, 12:03 PM
tcanens added a subscriber: tcanens.Jan 4 2016, 2:32 PM
tcanens added inline comments.
include/string
2523

This default constructs the allocator, but the allocator may not be default constructible.

mclow.lists updated this revision to Diff 44220.Jan 7 2016, 9:02 AM

Fixed a problem in the __is_exactly_input_iterator trait where it would fail to compile if <_Iter> was not an iterator.

Added the cases for append, insert, and replace.

I have updated the tests (not included in this diff)

mclow.lists marked an inline comment as done.Jan 7 2016, 9:04 AM
tcanens added inline comments.Jan 7 2016, 9:17 AM
include/string
2677–2678

If an exception is thrown after a push_back() causes reallocation, existing iterators/pointers/references would have been invalidated, and the catch block can't do anything about it.

It looks like a temporary string is also needed here.

I fixed the but that Tim pointed out, and added the tests I've written.

I think this is ready to land.

tcanens added inline comments.Jan 7 2016, 12:15 PM
include/string
2809–2811

This may cause infinite recursion if the allocator uses a fancy pointer (so that __libcpp_string_gets_noexcept_iterator returns false).

Perhaps return insert(__pos, __temp.data(), __temp.data()+__temp.size());?

tcanens added inline comments.Jan 7 2016, 12:20 PM
include/string
2673

Likewise - should probably be append(__temp.data(), __temp.size());.

Updated to address Tim's comments.

mclow.lists marked 2 inline comments as done.Jan 11 2016, 7:10 AM
mclow.lists added inline comments.
include/string
2677–2678

Dang, you're right. Thanks.

Ping? I'd like to land this before the branch for release tomorrow.

rsmith edited edge metadata.Jan 12 2016, 11:36 AM

Would this be expected to degrade performance for people building with -fno-exceptions who use iterators with non-noexcept operations (such as LLVM and Clang)?

I think the best way to handle Richard's question is to specialize __libcpp_string_gets_noexcept_iterator to always return true when exceptions are turned off.

rsmith added inline comments.Jan 12 2016, 12:06 PM
include/iterator
442

Is it permitted for people to derive their own tag types from input_iterator_tag and use them here? Perhaps "not (derived from) forward_iterator_tag" would be more robust?

mclow.lists edited edge metadata.

Made all iterators "noexcept" when building with exceptions turned off.

mclow.lists added inline comments.Jan 12 2016, 5:46 PM
include/iterator
442

Sigh. Yes, boost::input_output_iterator_tag (and probably others) inherit from std:: input_iterator_tag

Last change, I hope. I added some more tests to make sure that string was correctly identifying throwing vs. non-throwing iterators, and I discovered that I had an #ifdef backwards. Fixing that revealed a bug in __libcpp_string_gets_noexcept_iterator_impl, in that it would cause a compile error if you give it an output iterator.

Addressed Richard's comment about things that derive from std::input_iterator_tag, (since Boost does this), even though section 24.4.3/1 seems to say that you must use one of the predefined iterator category tags.

mclow.lists marked 3 inline comments as done.Jan 12 2016, 7:05 PM
EricWF edited edge metadata.Jan 13 2016, 8:26 AM

Almost LGTM. Could you re-upload with more context quickly? (ex git diff -U999)

include/iterator
1419

Should this trait handle const and volatile like most other traits do?

mclow.lists added inline comments.Jan 13 2016, 10:47 AM
include/iterator
1419

I don't think so; a const iterator cannot be incremented or decremented, so it's pretty useless. Note that pointers to const (i.e, __libcpp_is_trivial_iterator<move_iterator<const char *>>::value) is already true.

EricWF accepted this revision.Jan 13 2016, 12:31 PM
EricWF edited edge metadata.

Sorry, this LGTM. You should push it to 3.8 as well.

include/iterator
1419

It doesn't need to so long as "_Iter" is taken by value as it is in <string>.

This revision is now accepted and ready to land.Jan 13 2016, 12:31 PM
rsmith accepted this revision.Jan 13 2016, 1:04 PM
rsmith edited edge metadata.
mclow.lists closed this revision.Jan 25 2016, 8:15 AM

This was landed as r257682