This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Add `StringRef::rsplit(StringRef Separator)`.
ClosedPublic

Authored by MTC on May 26 2018, 1:38 AM.

Details

Summary

Add StringRef::rsplit(StringRef Separator) to achieve the function of getting the tail substring according to the separator. A typical usage is to get data in std::basic_string::data.

Diff Detail

Repository
rL LLVM

Event Timeline

MTC created this revision.May 26 2018, 1:38 AM
xbolva00 accepted this revision.Jun 3 2018, 9:38 AM
xbolva00 added a subscriber: xbolva00.

lg

This revision is now accepted and ready to land.Jun 3 2018, 9:38 AM
vsk requested changes to this revision.Jun 3 2018, 5:28 PM
vsk added a subscriber: vsk.

Hi @MTC, thanks for the patch.

With this patch, the logic for split would be duplicated 3 times now in StringRef.h. I think it'd be nice to reuse some code by moving the core logic into a private method of StringRef, say, splitAtIndex(size_t). Then, each (r)split(char | StringRef) can use the common helper.

Relatedly, an rsplit(char) overload seems to be missing here.

This revision now requires changes to proceed.Jun 3 2018, 5:28 PM

rsplit(char) already exists. That said, another way to reuse code would be
to have rsplit(char) call the StringRef version

MTC updated this revision to Diff 150335.Jun 7 2018, 8:15 AM

This update consists of two parts

  • Make rsplit(char) reuse rsplit(StringRef).
  • Make split(char) reuse split(StringRef).

Do I need to divide the patch into two parts? One is to add rsplit(StringRef), the other is to refactor rsplit(char) and split(char).

MTC added a comment.EditedJun 7 2018, 8:18 AM
In D47406#1120355, @vsk wrote:

Hi @MTC, thanks for the patch.

With this patch, the logic for split would be duplicated 3 times now in StringRef.h. I think it'd be nice to reuse some code by moving the core logic into a private method of StringRef, say, splitAtIndex(size_t). Then, each (r)split(char | StringRef) can use the common helper.

Relatedly, an rsplit(char) overload seems to be missing here.

First of all, sorry for the long delay!

Thanks for your review, @vsk! I do agree with you that there is need some work to remove the duplicates code. However I think zturner's approach is more neat, what do you think?

MTC added a comment.Jun 7 2018, 8:24 AM

rsplit(char) already exists. That said, another way to reuse code would be
to have rsplit(char) call the StringRef version

Thanks for your advice, @zturner! I have refactored split(char) and rsplit(char) to use StringRef version. I think the code is more clean now.

zturner added inline comments.Jun 7 2018, 8:34 AM
include/llvm/ADT/StringRef.h
755 ↗(On Diff #150335)

I can't decide if it would make more sense to have the "not found" case return a pair (LHS, RHS) where (LHS == "") && (*this == RHS)? What do you think?

vsk accepted this revision.Jun 7 2018, 1:31 PM
In D47406#1125104, @MTC wrote:

rsplit(char) already exists. That said, another way to reuse code would be
to have rsplit(char) call the StringRef version

Thanks for your advice, @zturner! I have refactored split(char) and rsplit(char) to use StringRef version. I think the code is more clean now.

LGTM. Please wait to see if Zachary has any further feedback. Thanks!

include/llvm/ADT/StringRef.h
755 ↗(On Diff #150335)

I think this makes sense, but would make more sense as a follow-up commit. I don't think it needs to be a prerequisite for adding an rsplit overload.

This revision is now accepted and ready to land.Jun 7 2018, 1:31 PM
zturner accepted this revision.Jun 7 2018, 1:36 PM
MTC added inline comments.Jun 7 2018, 7:35 PM
include/llvm/ADT/StringRef.h
755 ↗(On Diff #150335)

Yeah, I also think this is more reasonable.

  1. split() find the first occurrence, perform a lookup from start to end. Given "hello" and separator 'w', the return will be "hello" + "".
  2. rsplit() find the last occurrence, perfrom a lookup from end to start. Given "hello" and separator'w', the return should be "" + "hello".

If you all think this is more reasonable, I will give another patch to implement this. After all, current patch is for adding rsplit(StringRef). What do you think?

This revision was automatically updated to reflect the committed changes.