This is an archive of the discontinued LLVM Phabricator instance.

[llvm] Make Sequence reverse-iterable
ClosedPublic

Authored by gchatelet on May 18 2021, 3:11 AM.

Details

Summary

This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().

Diff Detail

Event Timeline

gchatelet created this revision.May 18 2021, 3:11 AM
gchatelet requested review of this revision.May 18 2021, 3:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2021, 3:11 AM
courbet added inline comments.May 18 2021, 5:31 AM
llvm/include/llvm/ADT/Sequence.h
84

why is this no longer a random access iterator ?

85

Let's keep the forwarnding reference ctor for when T is expensive:

template <typename U, typename Enabler = decltype(ValueT(std::declval<U>()))>
  value_sequence_iterator(U &&Value) : Value(std::forward<U>(Value)) {}
139

Does this need to be public ?

139

should this be value_range ?

157
template <typename U, typename Enabler = decltype(ValueT(std::declval<U>()))>
value_sequence(U &&Begin, U&& end);

or even:

template <typename U1,, typename U2, typename Enabler = decltype(ValueT(std::declval<U1>())),typename Enabler = decltype(ValueT(std::declval<U2>()))>
value_sequence(U1 &&Begin, U2&& end);
gchatelet marked 3 inline comments as done.May 20 2021, 8:06 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
84

forward_iterator ought to be enough for everyone : )
More seriously I first thought it would suffice and it does for all of llvm, except MLIR. It does something funky with its metadata and assumes that the underlying iterator is random access. I've implemented the random access iterator trait but it makes the code more complex.

157

This introduces some resolution issues and I'm not sure this brings much.
Iterators must be lightweight and copyable by definition.

edit: This one can actually be solved by the second version with two typename template parameters. It still feels a bit overkill to me but let me know it you think it's worth it.

In file included from /redacted/git/llvm-project/llvm/lib/Analysis/LoopCacheAnalysis.cpp:30:
/redacted/git/llvm-project/llvm/include/llvm/ADT/Sequence.h:86:41: error: no matching constructor for initialization of 'llvm::value_range<unsigned int>::const_iterator' (aka 'value_range_iterator<unsigned int, true>')
  const_iterator begin() const { return {Begin}; }
                                        ^~~~~~~
/redacted/git/llvm-project/llvm/lib/Analysis/LoopCacheAnalysis.cpp:169:20: note: in instantiation of member function 'llvm::value_range<unsigned int>::begin' requested here
  for (auto SubNum : seq<unsigned>(0, NumSubscripts - 1)) {
                   ^
/redacted/git/llvm-project/llvm/include/llvm/ADT/Sequence.h:30:3: note: candidate constructor not viable: no known conversion from 'llvm::value_range<unsigned int>::value_type' (aka 'const unsigned int') to 'const llvm::detail::value_range_iterator<unsigned int, true>' for 1st argument
  value_range_iterator(const value_range_iterator &) = default;
  ^
/redacted/git/llvm-project/llvm/include/llvm/ADT/Sequence.h:33:3: note: candidate template ignored: substitution failure [with U = const unsigned int &]: use of undeclared identifier 'ValueT'
  value_range_iterator(U &&Value) : Value(std::forward<U>(Value)) {}
  ^
/redacted/git/llvm-project/llvm/include/llvm/ADT/Sequence.h:28:3: note: candidate constructor not viable: requires 0 arguments, but 1 was provided
  value_range_iterator() = default;
  ^
gchatelet updated this revision to Diff 346745.May 20 2021, 8:08 AM
gchatelet marked an inline comment as done.
  • Address comments
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
gchatelet added inline comments.May 20 2021, 8:09 AM
llvm/include/llvm/ADT/SmallVector.h
1 ↗(On Diff #346745)

This one is not to be submitted but helps make tests D102760 is under review.

courbet added inline comments.May 20 2021, 11:02 AM
llvm/include/llvm/ADT/Sequence.h
29

style (here and below)

140

why const ? This makes value_range not movable.

157

Given that ValueT is a template parameter it is generally a good practice to not assume anything about the cost of copying.
We might have a cheaply movable but expensive to copy ValueT for example. A user might want to move the bounds into the value range.

(And BTW given that the iterator holds a member variable of type ValueT, the iterator is not necessarily cheap to copy, so we might as well try to avoid the copies).

that being said I only see seq currently being used with integers, so your call.

178

std::move(Begin), std::move(End) if you go with the forwarding reference ctors

mehdi_amini added inline comments.May 20 2021, 4:17 PM
llvm/include/llvm/ADT/Sequence.h
46

Why all the upper case for the method here?

gchatelet updated this revision to Diff 347006.May 21 2021, 6:22 AM
gchatelet marked 7 inline comments as done.
  • Addressing comments
llvm/include/llvm/ADT/Sequence.h
140

Indeed

courbet added inline comments.May 21 2021, 7:30 AM
llvm/include/llvm/ADT/Sequence.h
87

This is formally OK given that the return value T verifies __Referenceable<T>, but is there any reason not to return a const T& ?
(Again, this is mostly a theoretical argument, feel free to ignore.)

gchatelet updated this revision to Diff 347027.May 21 2021, 7:39 AM
  • remove unneeded container traits, simplify implementation
gchatelet added inline comments.May 21 2021, 8:51 AM
llvm/include/llvm/ADT/Sequence.h
87

Yes there is a reason.
A typical range or container holds all of its values and iterators refer to them, so it makes sense to return a reference because the value outlives the iterator.
In this particular case the iterator contains the value which might lead to dangling references.

In fact this is why I had to provide the reverse iterators in the first place. With the original Sequence code, the std::reverse_iterator's pre-increment operator generated from Sequence::iterator behaves like a post-increment operator and copies the iterator. Pseudo code:

reverse_iterator operator++(int) {
  reverse_iterator Tmp = *this;
  ++(*this);
  return Tmp;
}

In the following code, value ends up being a dangling reference.

auto sequence = std::reverse(llvm::Seq(0, 10));
const auto& value = *++sequence.begin();

Using a value type instead of a reference type should trigger an error from the compiler in such cases.

courbet accepted this revision.May 24 2021, 10:51 PM
courbet added inline comments.
llvm/include/llvm/ADT/Sequence.h
87

Makes sense, thanks for the explanation.

This revision is now accepted and ready to land.May 24 2021, 10:51 PM
gchatelet updated this revision to Diff 350257.Jun 7 2021, 5:38 AM
  • Remove dependency in SmallVector, use begin/end elsewhere
gchatelet updated this revision to Diff 350258.Jun 7 2021, 5:40 AM
  • Remove dependency in SmallVector, use begin/end elsewhere

@courbet do you mind having another look?
I reverted the changes in SmallVector as stated in D102760 and used begin/end instead.

gchatelet marked 2 inline comments as done.Jun 7 2021, 5:45 AM
gchatelet updated this revision to Diff 350262.Jun 7 2021, 6:08 AM
  • PDLToPDLInterp modification can now be simplified
Quuxplusone added inline comments.
llvm/include/llvm/ADT/Sequence.h
25–27

Lines 26 and 28, please avoid std::iterator because it is deprecated in C++17 and I assume we anticipate compiling LLVM with -std=c++17 in the near future. Just write out the 5 public member typedefs by hand. (Actually you already write out difference_type, so there's only 4 typedefs left. ;))

42

Naming nit: When I saw Forward in an iterator context, I assumed you were talking about forward iterators versus bidirectional iterators, and was going to say "technically, bidirectional iterators are also forward iterators." But on second glance I see that by IsForward you really mean IsNotReversed. I suggest s/IsForward/!IsReversed/g and s/Backward/Reverse/g.

Orthogonally, I didn't realize until operator[] that by value_range you meant not just "operator* returns by value," but actually iota_range; and I wonder if that would be a good renaming.

56–58

I suspect this should be just

explicit value_range_iterator(T Value) : Value(Value) {}

All constructors (except copy and move) should always be explicit; and I don't think the templatey stuff is serving a purpose here.

157
explicit value_range(value_type Begin, value_type End)
    : Begin(Begin), End(End) {}

Based on @courbet's comment, I also suggest adding static_assert(std::is_integral<ValueT>::value, ""); into this code, as a speed bump for anyone who tries to use it with non-integral types (because that's not currently a supported nor tested use-case, and they ought to think about it before plowing ahead blindly).

gchatelet updated this revision to Diff 350264.Jun 7 2021, 6:21 AM
  • Revert "PDLToPDLInterp modification can now be simplified"
gchatelet updated this revision to Diff 350323.Jun 7 2021, 9:18 AM
gchatelet marked 4 inline comments as done.
  • Address comments

@Quuxplusone how about renaming seq into iota then?
In a second NFC patch possibly to keep it simple.

We may also provide an overloaded version that implicitly starts at 0 (e.g. iota(5) -> 0, 1, 2, 3, 4).
More than half of the call sites use 0 explicitly as a first parameter.

llvm/include/llvm/ADT/Sequence.h
157

SGTM I have a use case that is not a primitive scalar value but a thin wrapper around it. I'll update the static_assert accordingly then.

Quuxplusone added inline comments.Jun 7 2021, 1:35 PM
llvm/include/llvm/ADT/Sequence.h
42

@gchatelet wrote:

how about renaming seq into iota then? In a second NFC patch possibly to keep it simple.

FWIW, I would oppose that. I mean, iota is established C++ jargon — that's why I used it above when I needed one word to explain my realization — but it's still jargon; I think seq captures the intent pretty nicely. It's like the shell utility [seq](https://en.wikipedia.org/wiki/Seq_(Unix)).

We may also provide an overloaded version that implicitly starts at 0 (e.g. iota(5) -> 0, 1, 2, 3, 4). More than half of the call sites use 0 explicitly as a first parameter.

I would extremely strongly oppose a one-argument overload, if the name were iota, because std::views::iota(5) is actually 5, 6, 7, .... (This is a huge pain point and I think there's a proposal to retroactively kill off the one-argument iota from the Standard.)

If the name is kept as seq, I would mildly oppose a one-argument overload.

If the name is changed to e.g. range or xrange (a la Python), then the one-argument overload might make sense — but it would still be unnecessary, so probably provide some-cost-and-no-benefit, so probably not worth doing.

gchatelet marked an inline comment as done.Jun 8 2021, 6:04 AM

Ok so let's leave it as is. I'll address the remaining clang-tidy shortly and submit after that.

gchatelet updated this revision to Diff 350593.Jun 8 2021, 6:06 AM
  • Rebase and address tests clang-tidy.
This revision was landed with ongoing or failed builds.Jun 8 2021, 6:19 AM
This revision was automatically updated to reflect the committed changes.

Reverted ; seems like gcc5 didn't like some constructs in this patch, see details here: https://buildkite.com/mlir/mlir-core/builds/14326#2cda617f-aeb8-49d2-833c-b5a734cba86a ; let me know if you can't reproduce.

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
321

Why did we lose this direct initialization pattern? Can we adjust something to keep this?

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
321

FWIW, I don't think we want to keep this. A lazy llvm::seq object physically-is something very different from a SmallVector<int64_t>; I think it's very good that the new code separates them. It might be even better to just use the std::iota algorithm to initialize the vector's contents:

SmallVector<int64_t> transposition(linalgOp.getNumLoops());
std::iota(transposition.begin(), transposition.end(), 0);
std::swap(transposition.back(), transposition[indexOp.dim()]);

If we were to keep it a one-liner, I would argue in favor of a named method on seq:

SmallVector<int64_t> transposition =
    llvm::seq<int64_t>(0, linalgOp.getNumLoops()).asSmallVector();
std::swap(transposition.back(), transposition[indexOp.dim()]);
mehdi_amini added inline comments.Jun 8 2021, 10:54 AM
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
321

A lazy llvm::seq object physically-is something very different from a SmallVector<int64_t>

Yes of course, but I don't see the point you're making here? I'm getting a lazy-range thing and I want to materialize it in a vector. We wouldn't want an implicit conversion, but that's not a reason to disallow the explicit materialization.

The question is what is the most concise and readable code for this (one-liner if possible), an llvm::iota may be palatable (to avoid begin/end forced by the std:: version), but I still find it adding a cognitive load: it actually requires to initialize a vector of zeroes before overwriting it with iota.

Probably not enough of a common pattern to optimize for though...