This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Don't assume that string_view::const_iterator is a raw pointer
ClosedPublic

Authored by ldionne on Nov 28 2022, 5:05 AM.

Details

Summary

Our implementation of std::format assumed that string_view's iterators
were raw pointers in various places. If we want to introduce a checked
iterator in debug mode, that won't be true anymore. This patch removes
that assumption.

Diff Detail

Event Timeline

ldionne created this revision.Nov 28 2022, 5:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 28 2022, 5:05 AM
ldionne requested review of this revision.Nov 28 2022, 5:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 28 2022, 5:05 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

@Mordante Please let me know if you like this approach. In a few places, we have the choice of blindly taking _Iterator or explicitly stating basic_string_view::const_iterator -- let me know what you prefer and I'll change it.

ldionne updated this revision to Diff 478225.Nov 28 2022, 6:36 AM

Fix some CI issues

Mordante accepted this revision.Nov 28 2022, 11:14 AM

Nice catch, I thought the iterators were required to be character pointers, but they indeed are not. So I feel this it the better way to do it. (Unless we want to "unwrap our iterators" in the top-level format functions.)

@Mordante Please let me know if you like this approach. In a few places, we have the choice of blindly taking _Iterator or explicitly stating basic_string_view::const_iterator -- let me know what you prefer and I'll change it.

In general I like the approach, but I left some comments. I really think we should constrain the iterator types since the code makes assumptions on which operations are allowed.

Otherwise LGTM!

libcxx/include/__format/format_string.h
33

The current name is misleading after the change.

The original name was based on <charconv> which uses char pointers. When switching to iterators I prefer to use a more generic name.

104

I think it would be good to add consistent constrains to _Iterator. This line requires a random access iterator.
In some places you used contiguous_iterator, so I think it would be good to do that consistently.
(All code is C++20 so nothing we can use concepts.)

libcxx/include/__format/formatter_output.h
266–267

In combination with the next suggestion.

268–270

Wouldn't this work?

281

Likewise.

This revision is now accepted and ready to land.Nov 28 2022, 11:14 AM
ldionne marked 2 inline comments as done.Dec 5 2022, 6:54 AM
ldionne added inline comments.
libcxx/include/__format/formatter_output.h
268–270

Yes it would work, but don't you feel like it's more readable to introduce a separate name for the type?

ldionne updated this revision to Diff 480094.Dec 5 2022, 6:54 AM

Apply most comments.

Mordante added inline comments.Dec 14 2022, 10:51 AM
libcxx/include/__format/formatter_output.h
268–270

My main concern it that by giving the type a name, it's intended to be configurable by the caller, which isn't intended.
Maybe lifting this requirement to a requires instead?

ldionne marked 4 inline comments as done.Jan 27 2023, 1:41 PM
ldionne added inline comments.
libcxx/include/__format/formatter_output.h
291

I stumbled upon this, but this is never tested with non-char* iterators, it seems. Same for a few functions below. It is probably worth investigating to add some test coverage.

ldionne updated this revision to Diff 492895.Jan 27 2023, 1:42 PM

Address comments.

CI issues seem unrelated, shipping since @Mordante was happy with a prior version and I addressed the remaining comments.

This revision was landed with ongoing or failed builds.Jan 30 2023, 7:20 AM
This revision was automatically updated to reflect the committed changes.