This is an archive of the discontinued LLVM Phabricator instance.

Differential D47973

[ADT] Change the behavior of `StringRef::rsplit()` when the separator is not found.
Abandoned, Public

Authored by MTC on Jun 8 2018, 10:07 PM.

Details

Reviewers

zturner
vsk
xbolva00

Summary

StringRef::split() searches from the front to the back for the first occurrence of the separator, so when the separator is not found, returning <*this, ""> makes sense. StringRef::rsplit() searches from the back to the front, also for the "first" occurrence of the separator it meets, so when the separator is not found, returning <"", *this> makes more sense.
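The difference between the existing and the proposed behavior can be sketched without LLVM headers. The block below is only an illustration built on std::string_view; the names split_sketch, rsplit_current and rsplit_proposed are hypothetical stand-ins and are not part of StringRef. The two rsplit variants differ only in the separator-not-found branch, which is the single line this revision changes.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

using Pair = std::pair<std::string_view, std::string_view>;

// Hypothetical stand-in for StringRef::split(): splits at the first separator.
inline Pair split_sketch(std::string_view S, std::string_view Sep) {
  std::size_t Idx = S.find(Sep);
  if (Idx == std::string_view::npos)
    return Pair(S, std::string_view()); // not found: <*this, "">
  return Pair(S.substr(0, Idx), S.substr(Idx + Sep.size()));
}

// Hypothetical stand-in for the existing StringRef::rsplit().
inline Pair rsplit_current(std::string_view S, std::string_view Sep) {
  std::size_t Idx = S.rfind(Sep);
  if (Idx == std::string_view::npos)
    return Pair(S, std::string_view()); // not found: <*this, "">
  return Pair(S.substr(0, Idx), S.substr(Idx + Sep.size()));
}

// Hypothetical stand-in for rsplit() as proposed in this revision.
inline Pair rsplit_proposed(std::string_view S, std::string_view Sep) {
  std::size_t Idx = S.rfind(Sep);
  if (Idx == std::string_view::npos)
    return Pair(std::string_view(), S); // not found: <"", *this>
  return Pair(S.substr(0, Idx), S.substr(Idx + Sep.size()));
}

// Small self-check mirroring the "hello" cases from the unit test.
inline void check_rsplit_sketch() {
  assert(rsplit_current("hello", "X") == Pair("hello", ""));
  assert(rsplit_proposed("hello", "X") == Pair("", "hello"));
  // When the separator is present, both variants agree.
  assert(rsplit_current("hello", "l") == rsplit_proposed("hello", "l"));
}
```

When the separator is present, both variants return the same pair; only the not-found case moves the whole string from .first to .second.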

Diff Detail

Repository

rL LLVM

Build Status

Buildable 19127
Build 19127: arc lint + arc unit

Event Timeline

MTC created this revision. Jun 8 2018, 10:07 PM

Thanks for the patch. The StringRef change itself looks fine, but please make sure to update the callsites in-tree (e.g. in llvm, clang, lldb, etc.).

Makes sense, thanks

This revision is now accepted and ready to land. Jun 9 2018, 5:19 AM

In D47973#1127309, @vsk wrote:

Thanks for the patch. The StringRef change itself looks fine, but please make sure to update the callsites in-tree (e.g. in llvm, clang, lldb, etc.).

Thanks for the reminder, vsk! Sorry for my negligence: I assumed the test cases covered these corner cases, so once the tests passed I thought there was no potential problem.

After I re-examined some of the usages of StringRef::rsplit(), I think I need to abandon this patch.

There are three reasons for this:

1. In some usages, a string needs to be rsplit() on several separators in turn. With the current behavior, after splitting on the first separator we can always use sample_string.rsplit(separator).first for the next split stage. With this patch, we would have to use sample_string.rsplit(separator).first if the separator is in sample_string, and sample_string.rsplit(separator).second if it is not. The code is not clean!
2. Changing the behavior of an existing library is dangerous, especially since it is difficult for me to work out the logic of every StringRef::rsplit() usage.
3. It should match rsplit() in other programming languages, e.g. Python. rsplit() in Python splits a string from the right at the specified separator and returns a list of strings, and it guarantees that list[0] always holds a value even if the separator isn't in the string. So even though StringRef::rsplit() returns a pair instead of a list, we should guarantee that pair.first always has a value for subsequent use.
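The first reason can be made concrete with a small sketch. It assumes std::string_view stand-ins for the two behaviors; rsplit_current, rsplit_proposed, stem_current, stem_proposed_naive, stem_proposed and the "name.ext@version" input shape are all hypothetical and chosen only for illustration. With the existing behavior, chaining on .first works whether or not each separator is present; with the proposed behavior, the caller has to check for the separator before choosing a half.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

using Pair = std::pair<std::string_view, std::string_view>;

// Hypothetical stand-in for the existing rsplit(): not found gives <S, "">.
inline Pair rsplit_current(std::string_view S, std::string_view Sep) {
  std::size_t Idx = S.rfind(Sep);
  if (Idx == std::string_view::npos)
    return Pair(S, std::string_view());
  return Pair(S.substr(0, Idx), S.substr(Idx + Sep.size()));
}

// Hypothetical stand-in for the proposed rsplit(): not found gives <"", S>.
inline Pair rsplit_proposed(std::string_view S, std::string_view Sep) {
  std::size_t Idx = S.rfind(Sep);
  if (Idx == std::string_view::npos)
    return Pair(std::string_view(), S);
  return Pair(S.substr(0, Idx), S.substr(Idx + Sep.size()));
}

// Existing behavior: strip an optional "@version", then an optional ".ext",
// always continuing with .first.
inline std::string_view stem_current(std::string_view S) {
  return rsplit_current(rsplit_current(S, "@").first, ".").first;
}

// Proposed behavior, chained the same way: a missing separator empties .first.
inline std::string_view stem_proposed_naive(std::string_view S) {
  return rsplit_proposed(rsplit_proposed(S, "@").first, ".").first;
}

// Proposed behavior done correctly: each stage must pick .first or .second
// depending on whether the separator was found.
inline std::string_view stem_proposed(std::string_view S) {
  for (std::string_view Sep : {std::string_view("@"), std::string_view(".")}) {
    Pair P = rsplit_proposed(S, Sep);
    S = (S.rfind(Sep) != std::string_view::npos) ? P.first : P.second;
  }
  return S;
}

// Self-check on an input that has no "@" separator.
inline void check_chained_rsplit() {
  assert(stem_current("libfoo.so") == "libfoo");
  assert(stem_proposed_naive("libfoo.so").empty());
  assert(stem_proposed("libfoo.so") == "libfoo");
}
```

The extra presence check in stem_proposed is the branching the comment above objects to.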

Sorry for not thinking this patch through!

What do you think, @xbolva00, @vsk?

MTC abandoned this revision. Jun 12 2018, 5:53 AM

Revision Contents

Path                              Size
include/llvm/ADT/StringRef.h      14 lines
unittests/ADT/StringRefTest.cpp   4 lines

Diff 150606

include/llvm/ADT/StringRef.h

 ...
   std::pair<StringRef, StringRef> split(StringRef Separator) const {
 ...
       return std::make_pair(*this, StringRef());
     return std::make_pair(slice(0, Idx), slice(Idx + Separator.size(), npos));
   }

   /// Split into two substrings around the last occurrence of a separator
   /// string.
   ///
   /// If \p Separator is in the string, then the result is a pair (LHS, RHS)
-  /// such that (*this == LHS + Separator + RHS) is true and RHS is
-  /// minimal. If \p Separator is not in the string, then the result is a
-  /// pair (LHS, RHS) where (*this == LHS) and (RHS == "").
+  /// such that (*this == LHS + Separator + RHS) is true and RHS is minimal.
+  /// If \p Separator is not in the string, then the result is a pair (LHS,
+  /// RHS) where (LHS == "") and (*this == RHS).
   ///
   /// \param Separator - The string to split on.
   /// \return - The split substrings.
   LLVM_NODISCARD
   std::pair<StringRef, StringRef> rsplit(StringRef Separator) const {
     size_t Idx = rfind(Separator);
     if (Idx == npos)
-      return std::make_pair(*this, StringRef());
+      return std::make_pair(StringRef(), *this);
     return std::make_pair(slice(0, Idx), slice(Idx + Separator.size(), npos));
   }

   /// Split into substrings around the occurrences of a separator string.
   ///
   /// Each substring is stored in \p A. If \p MaxSplit is >= 0, at most
   /// \p MaxSplit splits are done and consequently <= \p MaxSplit + 1
   /// elements are added to A.
 ...
   public:
 ...
   /// \param KeepEmpty - True if empty substring should be added.
   void split(SmallVectorImpl<StringRef> &A, char Separator, int MaxSplit = -1,
              bool KeepEmpty = true) const;

   /// Split into two substrings around the last occurrence of a separator
   /// character.
   ///
   /// If \p Separator is in the string, then the result is a pair (LHS, RHS)
-  /// such that (*this == LHS + Separator + RHS) is true and RHS is
-  /// minimal. If \p Separator is not in the string, then the result is a
-  /// pair (LHS, RHS) where (*this == LHS) and (RHS == "").
+  /// such that (*this == LHS + Separator + RHS) is true and RHS is minimal.
+  /// If \p Separator is not in the string, then the result is a pair (LHS,
+  /// RHS) where (LHS == "") and (*this == RHS).
   ///
   /// \param Separator - The character to split on.
   /// \return - The split substrings.
   LLVM_NODISCARD
   std::pair<StringRef, StringRef> rsplit(char Separator) const {
     return rsplit(StringRef(&Separator, 1));
   }

 ...

unittests/ADT/StringRefTest.cpp

 ...
   EXPECT_EQ(std::make_pair(StringRef("h"), StringRef("llo")),
             Str.split('e'));
   EXPECT_EQ(std::make_pair(StringRef(""), StringRef("ello")),
             Str.split('h'));
   EXPECT_EQ(std::make_pair(StringRef("he"), StringRef("lo")),
             Str.split('l'));
   EXPECT_EQ(std::make_pair(StringRef("hell"), StringRef("")),
             Str.split('o'));

-  EXPECT_EQ(std::make_pair(StringRef("hello"), StringRef("")),
+  EXPECT_EQ(std::make_pair(StringRef(""), StringRef("hello")),
             Str.rsplit('X'));
   EXPECT_EQ(std::make_pair(StringRef("h"), StringRef("llo")),
             Str.rsplit('e'));
   EXPECT_EQ(std::make_pair(StringRef(""), StringRef("ello")),
             Str.rsplit('h'));
   EXPECT_EQ(std::make_pair(StringRef("hel"), StringRef("o")),
             Str.rsplit('l'));
   EXPECT_EQ(std::make_pair(StringRef("hell"), StringRef("")),
             Str.rsplit('o'));

   EXPECT_EQ(std::make_pair(StringRef("he"), StringRef("o")),
             Str.rsplit("ll"));
   EXPECT_EQ(std::make_pair(StringRef(""), StringRef("ello")),
             Str.rsplit("h"));
   EXPECT_EQ(std::make_pair(StringRef("hell"), StringRef("")),
             Str.rsplit("o"));
-  EXPECT_EQ(std::make_pair(StringRef("hello"), StringRef("")),
+  EXPECT_EQ(std::make_pair(StringRef(""), StringRef("hello")),
             Str.rsplit("::"));
   EXPECT_EQ(std::make_pair(StringRef("hel"), StringRef("o")),
             Str.rsplit("l"));
 }

 TEST(StringRefTest, Split2) {
   SmallVector<StringRef, 5> parts;
   SmallVector<StringRef, 5> expected;
 ...