This is an archive of the discontinued LLVM Phabricator instance.

[llvm] [ADT] Update llvm::Split() per Pavel Labath's suggestions
ClosedPublic

Authored by mgorny on Sep 27 2021, 5:14 AM.

Details

Summary

Optimize the iterator comparison logic to compare Current.data() pointers. Use std::tie for assignments from std::pair. Replace the custom class with a function returning iterator_range.

Diff Detail

Event Timeline

mgorny created this revision.Sep 27 2021, 5:14 AM
mgorny requested review of this revision.Sep 27 2021, 5:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 27 2021, 5:14 AM
labath added inline comments.Sep 27 2021, 5:23 AM
llvm/include/llvm/ADT/StringExtras.h
556–559

this should also be a lower-case split now that it's a function.

(Believe it or not, this is the reason I was first drawn to this -- it's uncommon to see a class used like this because, even in the cases where you do have a separate class, you usually create a function wrapper for it. The original reason for this is pragmatic (use of template argument deduction), but the practice is so ubiquitous that a deviation from it stands out.)

mgorny marked an inline comment as done.Sep 27 2021, 6:16 AM
mgorny added inline comments.
llvm/include/llvm/ADT/StringExtras.h
556–559

Yeah, that makes sense. In fact, I originally named the class lowercase for this exact reason.

mgorny updated this revision to Diff 375238.Sep 27 2021, 6:17 AM
mgorny marked an inline comment as done.

Make the function lowercase. I had to modify IR/DataLayout.cpp to call its own static split() via ::split().

joerg added a subscriber: joerg.Sep 28 2021, 1:11 PM

Why are all the changes from separator character to separator string necessary or desirable?

Why are all the changes from separator character to separator string necessary or desirable?

Because otherwise we'd have to resort to hacks to extend the separator's lifetime.

labath accepted this revision.Sep 29 2021, 4:52 AM

Looks good to me, but I'd give @joerg a chance to respond first.

Why are all the changes from separator character to separator string necessary or desirable?

There's no good place to store the separator character. The functional version of split is implemented by creating a StringRef(&the_char, 1), but that won't work in the iterator version. Keeping it doesn't seem to be worth the trouble, as it does not have any performance benefits, and doesn't make the callsites more complicated.

This revision is now accepted and ready to land.Sep 29 2021, 4:52 AM
joerg added a comment.Oct 6 2021, 6:46 PM

Can't you wrap iterator_range<T> and possibly even support Twines like that? That is, don't extend the life time of the iterators, but store it in the range?

Can't you wrap iterator_range<T> and possibly even support Twines like that? That is, don't extend the life time of the iterators, but store it in the range?

I can but @labath didn't want the extra code.

labath added a comment.Oct 7 2021, 1:31 AM

Can't you wrap iterator_range<T> and possibly even support Twines like that? That is, don't extend the life time of the iterators, but store it in the range?

I can but @labath didn't want the extra code.

I am not sure that it is worth it, but I am not completely opposed to the idea either. Symmetry with StringRef::split also counts for something..

So how should I proceed?

Do you have commit access? If not, please read the llvm developer policy and follow the instructions, thanks!

Do you have commit access? If not, please read the llvm developer policy and follow the instructions, thanks!

That's not the problem. The problem is that @labath and @joerg seem not to agree on how it should be done, and I don't know whether to hack support for char delimiter to preserve the old syntax as @joerg wanted, or to keep it as is (i.e. without support for char delimiter) as @labath originally suggested.

I'm fine with either option, and it doesn't seem to me like @joerg has a particularly strong more opinion either, so maybe you can cast the deciding vote?

I consider it more important to fix the equality operator and the name capitalization (in that order).

mgorny updated this revision to Diff 381187.Oct 21 2021, 3:12 AM
mgorny edited the summary of this revision. (Show Details)

Restored support for char Separator — after all, it isn't hard and the cost is negligible.

mgorny updated this revision to Diff 381189.Oct 21 2021, 3:20 AM

Update LLDB call sites again.

mgorny updated this revision to Diff 381209.Oct 21 2021, 4:14 AM

Add explicit test for use in a for loop.

mgorny planned changes to this revision.Oct 21 2021, 4:55 AM

Something's broken.

mgorny updated this revision to Diff 381226.Oct 21 2021, 5:28 AM

Ok, this turned out to be a little bit harder than I anticipated — in particular I needed to add a copy constructor and assignment operator.

@labath, do you think it's fine or should I revert to non-char version?

This revision is now accepted and ready to land.Oct 21 2021, 5:28 AM
labath accepted this revision.Oct 22 2021, 1:42 AM

It makes things a bit more complicated, which then makes the case for dropping char slightly stronger, but I still don't think it's worth making a fuss of it.

llvm/unittests/ADT/StringExtrasTest.cpp
317

maybe EXPECT_THAT(Result, testing::ElementsAre("foo", "bar", "", "baz"));

mgorny marked an inline comment as done.Oct 22 2021, 2:33 AM
This revision was landed with ongoing or failed builds.Oct 22 2021, 3:28 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2021, 3:28 AM