This is an archive of the discontinued LLVM Phabricator instance.

[libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB.
Closed Public

Authored by Quuxplusone on Apr 30 2021, 4:55 PM.

Details

Summary
The range of char pointers [data, data+size] is a valid closed range,
but the range [begin, end) is valid only half-open.

Event Timeline

Quuxplusone requested review of this revision. Apr 30 2021, 4:55 PM
Quuxplusone created this revision.
Herald added 1 blocking reviewer(s): Restricted Project. Apr 30 2021, 4:55 PM

poke buildkite

krisb accepted this revision. May 3 2021, 11:11 AM

LGTM, but it may be worth adding a comment, since this really looks like a valid case until you realize that dereferencing end() is UB.

ldionne accepted this revision. May 4 2021, 6:59 AM

Explanation of the change for others who might not have understood it (Arthur explained it to me offline cause I didn't get it):

In3 is an iterator equal to In2.begin(). We're using overload (5) from https://en.cppreference.com/w/cpp/filesystem/path/path , which expects "a pointer or an input iterator to a null-terminated character/wide character sequence". In2.data() fits that description, because it's a raw char*; but In2.begin() doesn't, because it's an iterator to a character sequence that is not null-terminated. Getting to the null terminator requires iterating "off the end" of the valid range and dereferencing In2.begin() + In2.size(), which is In2.end(), which is not dereferenceable.

So, with debug iterators, that "off the end" iteration fails a _LIBCPP_ASSERT; but even without debug iterators, technically, it's library UB for the test case to be trying to do it.
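To make the distinction concrete, here is a minimal sketch of the two valid ways to build a path from a std::string, plus the invalid one left as a comment. The helper names from_null_terminated_pointer and from_iterator_pair are hypothetical and are not taken from the test being fixed; the sketch only assumes the std::filesystem::path constructors described on cppreference.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: data() points at a null-terminated array, so the
// closed range [data(), data() + size()] may be read, terminator included.
fs::path from_null_terminated_pointer(const std::string& s) {
    assert(*(s.data() + s.size()) == '\0');  // reading the terminator is fine
    return fs::path(s.data());               // single-source constructor
}

// Hypothetical helper: an iterator pair delimits the half-open range
// [begin(), end()), so end() is only compared against, never dereferenced.
fs::path from_iterator_pair(const std::string& s) {
    return fs::path(s.begin(), s.end());
}

// Not valid, and therefore left as a comment on purpose:
//   fs::path p(s.begin());
// The single-source constructor would have to advance the iterator to
// end() and dereference it while looking for the terminator.
```

Both helpers should produce the same path for the same string. The commented-out form is the pattern this revision removes from the test.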

This revision is now accepted and ready to land. May 4 2021, 6:59 AM