This is an archive of the discontinued LLVM Phabricator instance.

Improve the genericity of `llvm::enumerate()`.
ClosedPublic

Authored by zturner on Mar 10 2017, 4:54 PM.

Details

Summary

This all started when I tried to store the results of an enumerate range into a vector and then sort according to the value. In doing so, I found out that enumerate was insufficiently generic, as it did not support iterator_facade_base and so you couldn't even call std::distance() on two of its iterators.

Ideally, one should not have to write this code:

std::vector<std::pair<size_t, std::string>> Items;
for (auto X : enumerate(Range))
  Items.push_back(std::make_pair(X.Index, X.Value));

just to get something they can sort. It would be nice if you could write one line like std::vector<std::pair<size_t, std::string>>(enumerate(X).begin(), enumerate(X).end());

But the value type of an enumerate isn't even a pair, so you don't know how to specify it (if that was what you wanted). We'd like to make use of type deduction.

To make all this work I re-wrote enumerate to support iterator_facade_base, and I add an additional helper function called to_vector which you can now call as auto Results = to_vector<N>(enumerate(Range)); (replacing enumerate with whatever adapter you want), and it will return you a SmallVector<T, N> where T is the value type of the original range.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Mar 10 2017, 4:54 PM
dblaikie accepted this revision.Mar 11 2017, 10:46 AM
dblaikie added inline comments.
llvm/include/llvm/ADT/STLExtras.h
894–898 ↗(On Diff #91436)

Perhaps not relevant now - but eventually this might want to be generalized to arbitrary range conversion (conversion to a std::set? etc... )

1040 ↗(On Diff #91436)

Usually op overloads should be non-members whenever possible (ensures implicit conversions happen symmetrically for left and right operands) - though perhaps that isn't possible with the iterator facade helper?

1045 ↗(On Diff #91436)

This doesn't look valid - is it? (I think R is a type, not a variable, so shouldn't be valid in these expressions)

1048 ↗(On Diff #91436)

Missing return? This should've warned/broken the build/crashed-at-O0, etc? Was there some missing test coverage?

1062 ↗(On Diff #91436)

If the index of the end is going to be -1, it could be used to quickly catch a few mistakes with iterators - assert(index != -1) in operator++, for example?

1065–1067 ↗(On Diff #91436)

Once there's a generic utility function for all ranges as you've provided with to_vector - is this member still useful/necessary? (indeed this function seems to be untested - the test case exercises the generic to_vector instead)

This revision is now accepted and ready to land.Mar 11 2017, 10:46 AM
bryant added inline comments.Mar 11 2017, 11:03 AM
llvm/include/llvm/ADT/STLExtras.h
1024 ↗(On Diff #91436)

can iterator_adaptor_base be used for this? it looks like its default template params fit.

Will address the rest of the comments in a later followup.

llvm/include/llvm/ADT/STLExtras.h
1062 ↗(On Diff #91436)

-1 is sufficient to be end, but not necessary. For example, when ++'ing a valid iterator, you don't know if it's end without having the end of the backing container. So in that case index will just be whatever. Such an assert would still be fine, but it would only guarantee that you're not incrementing the result of std::end(enumerate(R)), and it wouldn't trigger when you increment an iterator past it's actual end.

1065–1067 ↗(On Diff #91436)

This was from some early attempts that is not needed anymore. I meant to remove it, but forgot. Thanks for catching it.

zturner updated this revision to Diff 91484.Mar 11 2017, 4:31 PM

A few improvements as suggested by blaikie@

zturner added inline comments.Mar 11 2017, 4:34 PM
llvm/include/llvm/ADT/STLExtras.h
1024 ↗(On Diff #91436)

I was not able to make this work well. iterator_adapter_base stores a copy of the iterator While I also want to store a copy of the iterator, I also need to store an index. And I need both to be part of the iterator's value_type. So if I want to use iterator_adapter_base, It becomes wasteful because I store a copy of the iterator twice. Once in iterator_adapter_base and again in the result member variable.

chandlerc added inline comments.Mar 12 2017, 10:49 PM
llvm/include/llvm/ADT/STLExtras.h
891–897 ↗(On Diff #91484)

Why not a range constructor for SmallVector?

This revision was automatically updated to reflect the committed changes.