This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Introduce llvm::to_vector_of to allow creation of SmallVector<T> from range of items convertible to type T
ClosedPublic

Authored by yurai007 on Jul 14 2022, 8:33 AM.

Diff Detail

Event Timeline

yurai007 created this revision.Jul 14 2022, 8:33 AM
yurai007 requested review of this revision.Jul 14 2022, 8:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2022, 8:33 AM
yurai007 added inline comments.Jul 14 2022, 8:44 AM
llvm/include/llvm/ADT/SmallVector.h
1290

TODO: add more tests to make sure there is no surprises coming from overloading rules

llvm/unittests/ADT/SmallVectorTest.cpp
1145

That's basically copy of STLExtrasTest.ToVector test body after removing enumerate() calls. I feel it make sense to have it here since to_vector belongs to SmallVector.

nikic added inline comments.Jul 14 2022, 9:24 AM
llvm/include/llvm/ADT/SmallVector.h
1285–1286

Why do we need to explicitly pass InlineSize<R>::value? Isn't that the default anyway?

llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
814 ↗(On Diff #444678)

llvm prefix unnecessary

nikic added a comment.Jul 14 2022, 9:28 AM

By the way, I think it would be worthwhile to add an ArrayRef constructor to SmallVector, or else drop the existing iterator_range ctor. I've wanted to switch an iterator_range to ArrayRef a few times in the past, and always gave up due to the SmallVector fallout. We should support both or neither.

By the way, I think it would be worthwhile to add an ArrayRef constructor to SmallVector, or else drop the existing iterator_range ctor. I've wanted to switch an iterator_range to ArrayRef a few times in the past, and always gave up due to the SmallVector fallout. We should support both or neither.

“Neither” makes some sense to me, since to_vector and to_vector_of are clear and support any range naturally. @dblaikie, any thoughts?

By the way, I think it would be worthwhile to add an ArrayRef constructor to SmallVector, or else drop the existing iterator_range ctor. I've wanted to switch an iterator_range to ArrayRef a few times in the past, and always gave up due to the SmallVector fallout. We should support both or neither.

“Neither” makes some sense to me, since to_vector and to_vector_of are clear and support any range naturally. @dblaikie, any thoughts?

Yeah, I'd tend to lean that way - if std::vector doesn't have generic range-based construction, I'm happy with SmallVector not having it either.

llvm/include/llvm/ADT/SmallVector.h
1295–1298

Rather than an overload, can the Size parameter have a default value in the other definition and this overload can be removed?

yurai007 updated this revision to Diff 444978.Jul 15 2022, 7:25 AM
yurai007 retitled this revision from [WIP][NFC] Introduce llvm::to_vector_of to allow creation of SmallVector<T> from range of items convertible to type T to [NFC] Introduce llvm::to_vector_of to allow creation of SmallVector<T> from range of items convertible to type T.
yurai007 edited the summary of this revision. (Show Details)

Remove WIP tag since overall direction seem to be fine and now all tests are passing (at least locally). Address some comments.

yurai007 marked 2 inline comments as done.Jul 15 2022, 7:27 AM
yurai007 added inline comments.
llvm/include/llvm/ADT/SmallVector.h
1285–1286

Yes, it's indeed redundant! Interesting, I didn't even bother thinking about removing it since it was present on to_vector before my change. And given that's about widely used SmallVector header I didn't expect that it could have anything superfluous.

yurai007 marked an inline comment as done.Jul 15 2022, 7:28 AM
dexonsmith added inline comments.Jul 15 2022, 8:02 AM
llvm/include/llvm/ADT/SmallVector.h
1285–1286

Maybe it predates SmallVector itself having a default? Or maybe it has some hard-to-predict impact on compile time? You might be able to figure that out via a git-blame and/or a look at the original review thread, or you can ask the author that added it (if it was me, I don’t remember :)). Most likely it can just be skipped though.

By the way, I think it would be worthwhile to add an ArrayRef constructor to SmallVector, or else drop the existing iterator_range ctor. I've wanted to switch an iterator_range to ArrayRef a few times in the past, and always gave up due to the SmallVector fallout. We should support both or neither.

“Neither” makes some sense to me, since to_vector and to_vector_of are clear and support any range naturally. @dblaikie, any thoughts?

Yeah, I'd tend to lean that way - if std::vector doesn't have generic range-based construction, I'm happy with SmallVector not having it either.

All right, I will prepare separate change for this.

yurai007 added inline comments.Jul 18 2022, 12:30 AM
llvm/include/llvm/ADT/SmallVector.h
1295–1298

The thing is we need to compute default Size from R which is Size's successor on template parameter list. We can reorder list and it would work as long as Size is default:

 template <typename Out, typename R, unsigned Size = CalculateSmallVectorDefaultInlinedElements<
             ValueTypeFromRangeType<R>>::value>
  SmallVector<Out, Size> to_vector_of(R &&Range) {
  return {std::begin(Range), std::end(Range)};
}

However when Size is non-default both preceding types Out, R need to be passed explicitly during instantiation. If I understand it correctly, that was reason for having 2 definitions of to_vector function - for similar reason they couldn't be merged into one. It would be much easier if Size depended on Out instead but except some special scenarios when it's allowed (pointer types), in general I don't think we can rely on this.

yurai007 updated this revision to Diff 445446.Jul 18 2022, 3:53 AM

Replace to_vector_of usage with to_vector where possible, improve names in ToVectorOf test case. All comments should be addressed now.

yurai007 updated this revision to Diff 446778.Jul 22 2022, 4:06 AM

Some cosmetic changes in tests.

Might be worth submitting this separately/without updating all the callers - so that if a caller has to be reverted for some reason we don't have to revert the functionality and testing, etc, itself? (callers could probably done in reasonable chunks - 10s, maybe 50 at a time? huge patches are a bit awkward (again, in terms of bisecting/reverting/etc) but doing one patch for each caller's probably excessive too - somewhere in between)

llvm/include/llvm/ADT/SmallVector.h
1285–1286

Patch might be easier to read if these were removed in a separate patch, if they can be separate? (looks like they should be/are unrelated to the addition of to_vector_of?

llvm/unittests/ADT/SmallVectorTest.cpp
1145

We shouldn't duplicate tests - if to_vector is in SmallVector.h, it should be tested in SmallVectorTest.cpp, not in STLExtrasTest.cpp - so it should probably be moved here, rather than copied - and that could be done in a preliminary/separate commit?

1169–1171

Generally prefer non-member overloads where possible to allow for equal conversions for LHS and RHS. In this case it could be a friend to make it easy to write inline

1174–1184

This seems a bit too specific? Emulating a particular LLVM data type like MDOperand - could this be named/designed more generally/abstractly so as not to be overloaded with the specifics of some LLVM detail?

would it be possible to specialize cast<> for this?

would it be possible to specialize cast<> for this?

Seems like a stretch for the concept/definition of cast<> though I recall there are some novel implementations of cast for other types - so I'm open to an argument to the contrary, that this conversion would be consistent with others implemented via cast<>.

I think cast<> comes with the implication that the conversion is zero-cost (in non-assert builds), which is not the case here. to_vector() is an expensive O(n) operation.

yurai007 added inline comments.Jul 29 2022, 3:35 AM
llvm/include/llvm/ADT/SmallVector.h
1285–1286

You are right, I can extract it to separate patch.

llvm/unittests/ADT/SmallVectorTest.cpp
1145

We shouldn't duplicate tests - if to_vector is in SmallVector.h, it should be tested in SmallVectorTest.cpp, not in STLExtrasTest.cpp - so it should probably be moved here, rather than copied - and that could be done in a preliminary/separate commit?

I would agree except the fact that STLExtrasTest.ToVector is not exactly 1-1 copy because it has enumerate() call (STLExtras.h) and I believe it tests both to_vector and enumerate functions. There are following options I see right now:

  1. We can cut and paste STLExtrasTest.ToVector here without any modification (well, except test case name), but it requires extra include for enumerate(). That enumerate is not really needed for purpose of testing to_vector function.
  2. We can simply remove STLExtrasTest.ToVector and leave SmallVectorTest.ToVector as it is but I'm a bit paranoid about removing unit test covering enumerate since it smells like regression.
  3. We can leave STLExtrasTest.ToVector as it is, and copy it to SmallVectorTest.ToVector with enumerate (because it's irrelevant) removal. That's chosen option.
1169–1171

Ok. I will fix it.

1174–1184

I thought there would be some value in referring to MDOperand/Metadata classes since that was background for this patch: https://reviews.llvm.org/D129565#3648665 But I get your point - what matters here is just property of MDOperand being convertible to Metadata. I can use 'From' and 'To' names as more appropriate in expressing conversion.
Short information about original, motivational use case can be simply put in 'Summary'/commit message.

Might be worth submitting this separately/without updating all the callers - so that if a caller has to be reverted for some reason we don't have to revert the functionality and testing, etc, itself? (callers could probably done in reasonable chunks - 10s, maybe 50 at a time? huge patches are a bit awkward (again, in terms of bisecting/reverting/etc) but doing one patch for each caller's probably excessive too - somewhere in between)

Ok. I'm gonna split it in such way.

yurai007 updated this revision to Diff 448624.Jul 29 2022, 7:11 AM

Addressed comments.

yurai007 marked 2 inline comments as done.Jul 29 2022, 7:13 AM
dblaikie added inline comments.Jul 29 2022, 12:25 PM
llvm/unittests/ADT/SmallVectorTest.cpp
877–880

This could be a constexpr variable template, maybe? & then the EXPECT_TRUEs testing this could instead be static_assert rather than runtime tests?

1174–1184

Yep, that's the thinking!

Maybe we can even remove the fact that one side of this is a pointer, and the other a value? Have both sides be values (eg: SmallVector<From> -> SmallVector<To>)? Shouldn't be any need for so many operations on From or To either (eg: the operator *, operator From * (I'd expect the only conversion should be the To(From) ctor), get(), etc) - probably From and To can be structs, maybe with a single int member that gets copied on conversion, so that can be checked on the other side of the conversion?)

1187–1189

Oh, looks like this is backwards, maybe? I'd expect the source type to be To and the destination type to be From, so more like to_vector_of<To>(SomeFromVector)?

1188–1227

I guess this is how the other to_vectors were already tested? But if you're up for it in a pre or post commit, might be nice to simplify this - like I'm not sure it's necessary to test all these variants. The implementation is generic - it tests anything you can do begin(x) and end(x) on, so we probably only need to test one such type? But if we really want to test 4, perhaps we can avoid rewriting the code and use a type expansion in gunit?)

yurai007 updated this revision to Diff 449212.Aug 2 2022, 1:09 AM

Address all recent comments.

yurai007 marked 4 inline comments as done.Aug 2 2022, 1:09 AM
yurai007 added inline comments.Aug 2 2022, 1:14 AM
llvm/unittests/ADT/SmallVectorTest.cpp
1188–1227

I agree it's not necessary so eventually decided to remove ArrayRef variants.

yurai007 added inline comments.Aug 2 2022, 1:32 AM
llvm/unittests/ADT/SmallVectorTest.cpp
1150

For now message argument is required, soon after https://reviews.llvm.org/D130689 could be removed. Anyway if needed may be done in separate NFC.

yurai007 updated this revision to Diff 451460.Aug 10 2022, 7:51 AM

Rebase and adjust to C++17 (static_asserts)

dblaikie accepted this revision.Aug 10 2022, 10:41 AM

Looks good

This revision is now accepted and ready to land.Aug 10 2022, 10:41 AM

Looks good

@nikic @dblaikie @dexonsmith: Great. Thank you all for the review!

This revision was landed with ongoing or failed builds.Aug 12 2022, 6:22 AM
This revision was automatically updated to reflect the committed changes.