This is an archive of the discontinued LLVM Phabricator instance.

[llvm] Add enum iteration to Sequence
ClosedPublic

Authored by gchatelet on Jun 8 2021, 8:30 AM.

Details

Summary

This patch allows iterating typed enum via the ADT/Sequence utility.

Diff Detail

Event Timeline

gchatelet created this revision.Jun 8 2021, 8:30 AM
gchatelet requested review of this revision.Jun 8 2021, 8:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2021, 8:30 AM
Quuxplusone requested changes to this revision.Jun 8 2021, 9:24 AM
Quuxplusone added inline comments.
llvm/include/llvm/ADT/Sequence.h
210–223

I don't think it makes sense to force the caller to decide between seq and enum_seq spellings. If I were implementing this facility, I'd just give iota_range a second template type parameter:

template<class V, class U>
struct iota_range_iterator {
    using value_type = V;
    V operator*() const { return static_cast<V>(value_); }
    ~~~
private:
    U value_;
};

and then here instantiate iota_range<EnumType, std::underlying_type_t<EnumType>>.
On line 250 below, you make an iota_range_iterator<IterableEnum<ValueE>>, whose operator* gives back values of type IterableEnum<ValueE> instead of ValueE — that's going to be really confusing to the caller. I wouldn't do that.

template <class T, std::enable_if_t<std::is_integral<T>::value, int> = 0>
auto seq(T Begin, T End) {
  return iota_range_iterator<T, T>(Begin, End);
}

template <class T, std::enable_if_t<std::is_enum<T>::value, int> = 0>
auto seq(T Begin, T End) {
  using U = std::underlying_type_t<T>;
  return iota_range_iterator<T, U>(U(Begin), U(End));
}

Notice you don't need std::move because these types are integral.

llvm/include/llvm/CodeGen/ValueTypes.h
43 ↗(On Diff #350610)

Get rid of IterableEnum as I propose above, and then you can get rid of this new overload. Ditto throughout.

This revision now requires changes to proceed.Jun 8 2021, 9:24 AM
gchatelet marked an inline comment as done.Jun 9 2021, 3:36 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
210–223

Thx for helping me taking a step back.
I originally wanted to separate concerns (iteration and typing) but the fact that IterableEnum<T> in not T is problematic indeed. The abstraction is leaking.

Now conflating the two orthogonal aspect make the implementation a bit more complex but I think it's worth it.

There is an additional design point that you may have missed. seq has the C++ iterator semantic, i.e. the end value is exclusive.
When iterating enums we want end to be included.
That's a problem with typed enums as user can't write seq(Enum::First, Enum::Last + 1). We are not allowed to do arithmetic on typed enums (and TBH it's cumbersome to write).

I've fixed this by adding another method seq_inclusive but I'm not too happy about the naming. Suggestions welcome.
Adding a different method does not compose well with overloading seq based on is_integral/is_enum so I moved the type resolution inside iota_range directly. Let me know what you think.

gchatelet updated this revision to Diff 350843.Jun 9 2021, 3:37 AM
  • Trying a different approach
Quuxplusone added inline comments.Jun 9 2021, 7:40 AM
llvm/include/llvm/ADT/Sequence.h
210–223

There is an additional design point [...] When iterating enums we want end to be included.

Yuck. I had missed that. seq_inclusive seems like a decent solution to me. If you really wanted to be explicit everywhere, you could have a llvm::halfopen_seq and an llvm::closed_seq... but I assume that's a bad idea because the half-open version is used all over the place and we like the shortness of its name?
Another option would be a "fluent interface" via member function:

for (int i : llvm::seq<int>(0, 10)) { /* loop from 0 to 9 */ }
for (int i : llvm::seq<int>(1, 10).inclusive()) { /* loop from 1 to 10 */ }

Watch out, though: the trivial implementation of .inclusive() would lead to integer overflow in llvm::seq<int>(INT_MIN, INT_MAX).inclusive() and so you'd have to think about what you wanted to do there: make it work, at the cost of complicated implementation? make it runtime-assert-fail? just ignore the potential overflow? find a simple implementation I'm not thinking of off the top of my head? ;)

I definitely agree that we do not want seq<E>(x, y) to iterate over one more element than seq<underlying_type_t<E>>(x, y); the enum-ness of the type parameter should affect only the enum-ness of the value_type, definitely not the number of things iterated over.

Btw, some enumeration types do have "one-past-the-end" values, as in

enum E { Apple, Orange, Banana, MaxFruit = Banana+1 };
for (E fruit : llvm::halfopen_seq<E>(Apple, MaxFruit)) { /* is correct */ }

and of course all integer types do have "closed" range forms as well:

for (int i : llvm::closed_seq<int>(INT_MIN, INT_MAX)) { /* is correct */ }
for (int i : llvm::seq<int>(INT_MIN, INT_MAX).inclusive()) { /* is correct */ }
gchatelet updated this revision to Diff 351136.Jun 10 2021, 5:03 AM
  • rebase and use seq_inclusive in a few places
gchatelet marked 2 inline comments as done.Jun 10 2021, 5:14 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
210–223

There is an additional design point [...] When iterating enums we want end to be included.

Yuck. I had missed that. seq_inclusive seems like a decent solution to me. If you really wanted to be explicit everywhere, you could have a llvm::halfopen_seq and an llvm::closed_seq... but I assume that's a bad idea because the half-open version is used all over the place and we like the shortness of its name?

I count about 150 call sites most of them are exclusive so yeah, I'd prefer to keep the most used version short.

Another option would be a "fluent interface" via member function:

for (int i : llvm::seq<int>(0, 10)) { /* loop from 0 to 9 */ }
for (int i : llvm::seq<int>(1, 10).inclusive()) { /* loop from 1 to 10 */ }

I usually like fluent interfaces but it has to be omnipotent (setting a value). The way I implemented it would lead to bugs if client calls llvm::seq<int>(1, 10).inclusive().inclusive().
There are ways to prevent it (adding more state to iota_range) but I feel like handling it as a separate function works quite well.

Watch out, though: the trivial implementation of .inclusive() would lead to integer overflow in llvm::seq<int>(INT_MIN, INT_MAX).inclusive() and so you'd have to think about what you wanted to do there: make it work, at the cost of complicated implementation? make it runtime-assert-fail? just ignore the potential overflow? find a simple implementation I'm not thinking of off the top of my head? ;)

I'll go with runtime assert.

I definitely agree that we do not want seq<E>(x, y) to iterate over one more element than seq<underlying_type_t<E>>(x, y); the enum-ness of the type parameter should affect only the enum-ness of the value_type, definitely not the number of things iterated over.

Yeah they ought to be orthogonal.

Btw, some enumeration types do have "one-past-the-end" values, as in

enum E { Apple, Orange, Banana, MaxFruit = Banana+1 };
for (E fruit : llvm::halfopen_seq<E>(Apple, MaxFruit)) { /* is correct */ }

and of course all integer types do have "closed" range forms as well:

for (int i : llvm::closed_seq<int>(INT_MIN, INT_MAX)) { /* is correct */ }
for (int i : llvm::seq<int>(INT_MIN, INT_MAX).inclusive()) { /* is correct */ }

Let me add more tests to make sure these cases are handled.

llvm/include/llvm/ADT/Sequence.h
123–129

llvm::detail::StorageType is too generic of a name to land-grab, IMHO. This can/should be an implementation detail of iota_range_iterator — maybe it doesn't even need to be a template parameter.

template<class T>
struct iota_range_iterator {
    using storage_type = typename std::conditional_t<std::is_enum<T>::value,
        type_identity<T>, std::underlying_type<T>>::type;

Technically, instantiating underlying_type<T> for non-enum T was UB before C++20.
In practice I'd hope it works on all implementations we care about.
I don't know where to get a C++14-friendly type_identity but hopefully you can figure it out.
https://en.cppreference.com/w/cpp/types/type_identity

llvm/include/llvm/ADT/Sequence.h
195

I've just noticed that this is unidiomatic: on normal iterators, --begin() would be UB. (std::reverse_iterator always stores one-past-where-it's-"pointing"-to, for this reason.) I guess it's OK here, but if you've got the time, maybe think about some way to avoid the hiccup here.

219

Phab somehow ate my comment here.
This should be just return seq(Begin, static_cast<T>(End + 1));
and then in the constructor of iota_range, please assert(Begin <= End) just as a sanity check.

gchatelet marked 3 inline comments as done.Jun 14 2021, 1:17 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
123–129

llvm::detail::StorageType is too generic of a name to land-grab, IMHO. This can/should be an implementation detail of iota_range_iterator — maybe it doesn't even need to be a template parameter.
Technically, instantiating underlying_type<T> for non-enum T was UB before C++20.
In practice I'd hope it works on all implementations we care about.

Unfortunately no, std::conditional must evaluated both types, and std::underlying_type<T> does not provide the type member when T is not an enum. The standard seems to specify this use case explicitly.
https://en.cppreference.com/w/cpp/types/underlying_type

I had to provide type resolution via template specialization.
Because of how nested template specialization works, I can't make StorageType part of iota_range_iterator either so it has to live outside of iota_range_iterator.

I tried many different solutions but I couldn't make it much better.

I renamed StorageType to iota_range_iterator_storage_type so at least it's less susceptible to accidental hijacking.

I don't know where to get a C++14-friendly type_identity but hopefully you can figure it out.
https://en.cppreference.com/w/cpp/types/type_identity

type_identity doesn't help in that case.

219

Phab somehow ate my comment here.
This should be just return seq(Begin, static_cast<T>(End + 1));

Because End may be a scoped enum End + 1 is not always a valid statement.
Trying to do the arithmetic here will yield the same problems that IterableEnum was supposed to solve.
The best we could do here would be something along those lines:

return seq(Begin, static_cast<T>(static_cast<detail::StorageType<T>::type>(End) + 1));

Which, IMHO, is redoing most of the work that has been moved to iota_range_iterator (hence the use of iterators to bump the values)

gchatelet updated this revision to Diff 351793.Jun 14 2021, 1:22 AM
gchatelet marked 2 inline comments as done.
  • rename StorageType to iota_range_iterator_storage_type
llvm/include/llvm/ADT/Sequence.h
123–129

I'd mixed up the arguments to conditional_t. Try this:
https://godbolt.org/z/q9fxznja7

gchatelet updated this revision to Diff 351874.Jun 14 2021, 7:49 AM
gchatelet marked an inline comment as done.
  • Use std::conditional instead of custom specialization
llvm/include/llvm/ADT/Sequence.h
123–129

oh... my apologies. I should have checked as well. I had convinced myself that the error was std::conditional_t needing to resolve both types ...

So I had to provide a C++14 type_identity but it works.

gchatelet updated this revision to Diff 352459.Jun 16 2021, 9:10 AM
gchatelet marked an inline comment as done.
-rebase and address comments

Please take another look @Quuxplusone .

llvm/include/llvm/ADT/Sequence.h
195

I've added a note as I couldn't find a better implementation.

llvm/include/llvm/ADT/Sequence.h
211–222

I would say "from Begin to End, exclusive of End" — or more colloquially but clearer IMO, "from Begin up to but not including End."

llvm/include/llvm/ADT/Sequence.h
219

Yeah, I guess since End + 1 doesn't work for scoped enums, I'm willing to live with *++result.end().

Or... Maybe iota_range should just take three arguments? Does this work?

explicit iota_range_iterator(T Value, bool plus1)
    : Value(static_cast<U>(Value) + plus1) {}

explicit iota_range(T Begin, T End, bool inclusive)
    : Begin(Begin, false), End(End, inclusive) {
  assert(Begin <= End);
}

// seq
return iota_range<T>(Begin, End, false);

// seq_inclusive
return iota_range<T>(Begin, End, true);
llvm/unittests/ADT/SequenceTest.cpp
77–84

I'd also test reverse(seq_inclusive(min, min)).
You shouldn't need the <T> anywhere on lines 72-84.

llvm/unittests/IR/ConstantRangeTest.cpp
1554–1556

You shouldn't need the explicit template argument here.

gchatelet updated this revision to Diff 352655.Jun 17 2021, 2:47 AM
gchatelet marked 4 inline comments as done.
  • Address comments
llvm/include/llvm/ADT/Sequence.h
219

Begin and End are not iterators but values (BTW I renamed them to avoid confusion), this is needed because of rbegin()/rend() must be constructed from values.

Now this can be handled by adding an additional member but then it gets really confusing when creating the reverse iterators:

value_type Begin;
value_type End;
bool Inclusive;

explicit iota_range(T Begin, T End, bool Inclusive)
    : Begin(Begin), End(End), Inclusive(Inclusive) {
  assert(Begin <= End);
}

auto begin() const { return const_iterator(Begin, false); }
auto end() const { return const_iterator(End, Inclusive); }

auto rbegin() const { return const_reverse_iterator(*--end(), false); }
auto rend() const { return const_reverse_iterator(*--begin(), false); } // <- false here is confusing
llvm/unittests/ADT/SequenceTest.cpp
77–84

Unfortunately I do because + 1 and - 1 trigger integer promotion which makes both arguments of different types.
I can introduce variables though.

gchatelet added inline comments.Jun 17 2021, 2:52 AM
llvm/include/llvm/ADT/Sequence.h
182–223

@Quuxplusone when using seq_inclusive with End == std::numeric_limits<T>::max() iteration works fine but iterator comparison is broken.
There are two possibilities here:

  1. Leave it as-is and document the broken behavior while allowing inclusive End being INT_MAX
  2. assert when using seq_inclusive and End == std::numeric_limits<T>::max(), effectively disabling iteration up to - and including - INT_MAX.

The current implementation is for 1. but I think 2. is more reasonable.
What do you think?

llvm/include/llvm/ADT/Sequence.h
188–197

Hmm. I suggest

{
  assert(Begin <= End);
  // This assertion forbids iota_range(0, UINT_MAX, true)
  // which would otherwise be treated as an empty range,
  // and iota_range(1, UINT_MAX, true) which would otherwise
  // report that s.begin() > s.end().
  // iota_range(INT_MIN, INT_MAX, true) will have already caused
  // undefined overflow, but this assertion will likely catch it too.
  if (Inclusive)
    assert(End < PastEndValue);
}

FWIW, I like your renaming of Begin to BeginValue etc.

llvm/unittests/ADT/SequenceTest.cpp
77–84

Ah, I see. (FWIW, I'd bikeshed the names to minp1 and maxm1, but whatever.)

gchatelet updated this revision to Diff 352981.Jun 18 2021, 5:36 AM
  • rebase + prevent overflow of PastEndValue + tests
gchatelet marked 2 inline comments as done.Jun 18 2021, 5:38 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
188–197

Hmm. I suggest

{
  assert(Begin <= End);
  // This assertion forbids iota_range(0, UINT_MAX, true)
  // which would otherwise be treated as an empty range,
  // and iota_range(1, UINT_MAX, true) which would otherwise
  // report that s.begin() > s.end().
  // iota_range(INT_MIN, INT_MAX, true) will have already caused
  // undefined overflow, but this assertion will likely catch it too.
  if (Inclusive)
    assert(End < PastEndValue);
}

The assert didn't trigger in the case of iota_range(INT_MIN, INT_MAX, true) so I used numeric_limits.
Tests are passing.

llvm/include/llvm/ADT/Sequence.h
191–192

I believe teeechnically this is still UB, because if the enumeration has no specified underlying type, and its enumerator values are e.g. enum E { A=1, B=2, C=3 }, then it's UB to evaluate E(4). (It basically works like a bit-field: you can evaluate E(mask) for any mask that doesn't use bits beyond the highest enumerator's bits.) But in practice I think this is fine.
For readability, I would still prefer to see it with an if:

if (Inclusive)
  assert(End != T(std::numeric_limits<typename iterator::underlying_type>::max()));

Although now you've got me confused again about why we're storing BeginValue and PastEndValue as value_type (i.e. E) instead of underlying_type. How does your size() method work when value_type is an enum type? I think data "at rest" should always be stored in the underlying integer type; the only time it should ever be converted to value_type (i.e. E) is when the iterator is being dereferenced.

gchatelet updated this revision to Diff 353791.Jun 22 2021, 2:27 PM
gchatelet marked an inline comment as done.
  • Address comments
  • Fix clang-tidy warning
llvm/include/llvm/ADT/Sequence.h
191–192

I believe teeechnically this is still UB, because if the enumeration has no specified underlying type, and its enumerator values are e.g. enum E { A=1, B=2, C=3 }, then it's UB to evaluate E(4). (It basically works like a bit-field: you can evaluate E(mask) for any mask that doesn't use bits beyond the highest enumerator's bits.) But in practice I think this is fine.

Thx! So I'm now moving all the logic in the underlying type world.
Technically End + 1 is implementation defined for signed types but it's guarded by the assertion so it's a logical error before being implementation defined.

Although now you've got me confused again about why we're storing BeginValue and PastEndValue as value_type (i.e. E) instead of underlying_type. How does your size() method work when value_type is an enum type? I think data "at rest" should always be stored in the underlying integer type; the only time it should ever be converted to value_type (i.e. E) is when the iterator is being dereferenced.

Thx a lot for challenging me on this one. You are correct that we can use underlying_type instead of value_type directly into iota_range.
I've moved things around: I think it's much better now. Let me know if you think this can be improved further.

gchatelet updated this revision to Diff 353793.Jun 22 2021, 2:28 PM
gchatelet marked an inline comment as done.
  • Fix another clang-tidy warning
Quuxplusone accepted this revision.Jun 22 2021, 2:42 PM

I probably shouldn't be the only person to accept this PR, but it looks fine to me at this point.

llvm/include/llvm/ADT/Sequence.h
28

Nits: Please add a newline before struct; and consider renaming/reordering the parameters so that T comes before U. I think that should be easy and mechanical (if tedious), because neither T nor U is deduced nor defaulted.

195

Note that BeginValue-1 might be UB; e.g. seq(INT_MIN, 0).rend(). But I think it's fine to ignore this corner case.

218

"End is not allowed to be"... what?

llvm/unittests/ADT/SequenceTest.cpp
54

Nit: I don't see why you bother defining LAST when you could just say E below.

This revision is now accepted and ready to land.Jun 22 2021, 2:42 PM
gchatelet updated this revision to Diff 353885.Jun 23 2021, 1:45 AM
gchatelet marked 3 inline comments as done.
  • Trying a different approach
  • rebase and use seq_inclusive in a few places
  • rename StorageType to iota_range_iterator_storage_type
  • Use std::conditional instead of custom specialization
  • Address comments
  • Address comments
  • Prevent overflow of PastEndValue + tests
  • Address comments
  • Fix clang-tidy warning
  • Fix another clang-tidy warning
  • Fix UB in reverse iteration with signed type
gchatelet marked an inline comment as done.Jun 23 2021, 1:46 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
28

Nits: Please add a newline before struct;

clang-format puts it back to the same line :-/

and consider renaming/reordering the parameters so that T comes before U. I think that should be easy and mechanical (if tedious), because neither T nor U is deduced nor defaulted.

Done.

195

Note that BeginValue-1 might be UB; e.g. seq(INT_MIN, 0).rend(). But I think it's fine to ignore this corner case.

Fixed it.

courbet added inline comments.Jun 23 2021, 2:12 AM
llvm/include/llvm/ADT/Sequence.h
159

So you can't iota_range and enum with uint8_t underlying type and 256 values ? That seems a bit artificial.
Maybe we could always use intmax_t (resp. uintmax_t for unsigned underlying types) as raw_type ?

courbet added inline comments.Jun 23 2021, 2:31 AM
llvm/include/llvm/ADT/Sequence.h
220

Maybe also note that all enum values are generated, even those that do not appear in the enumerator list ?

gchatelet updated this revision to Diff 353930.Jun 23 2021, 4:38 AM
gchatelet marked 2 inline comments as done.
  • Added warning on validity of enum range iteration.
  • Use uintmax_t/intmax_t types for storage.
gchatelet marked an inline comment as done.Jun 23 2021, 4:39 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
159

So you can't iota_range and enum with uint8_t underlying type and 256 values ? That seems a bit artificial.
Maybe we could always use intmax_t (resp. uintmax_t for unsigned underlying types) as raw_type ?

Fair enough. I'm slightly concerned about integer promotion/conversion rules but it should be safe (famous last words).

gchatelet updated this revision to Diff 353936.Jun 23 2021, 5:36 AM
gchatelet marked an inline comment as done.
  • rebase and squash commits
courbet accepted this revision.Jun 23 2021, 6:59 AM

@Quuxplusone are you still OK with the last changes? If so I think I can submit.

llvm/include/llvm/ADT/Sequence.h
159

Well, now I'm wondering why we don't just use uintptr_t all the time. Why does the storage type ever have to be signed intptr_t?
I think the rationale has to do with ensuring that r.begin() < r.end(); but maybe that just indicates that Op::difference is computing its return value via the wrong expression.
But anyway, I continue to be uninterested in blocking this at this point. :)

Please do verify that the test failures in the latest run aren't your fault (e.g. by rebasing and letting the tests run again) before committing.

213–215

Maybe "seq will generate each consecutive value, even if no enumerator with that value exists."

gchatelet updated this revision to Diff 354239.Jun 24 2021, 6:42 AM
  • Updated enum documentation
gchatelet marked an inline comment as done.Jun 24 2021, 6:42 AM
gchatelet marked an inline comment as done.Jun 24 2021, 6:51 AM
gchatelet added inline comments.
llvm/include/llvm/ADT/Sequence.h
159

Well, now I'm wondering why we don't just use uintptr_t all the time. Why does the storage type ever have to be signed intptr_t?

So there are enums that explicitly use negative values (e.g. FloatingPointMode.h:44) so I think it's reasonable to allow negative values.
Also, I think it's a valid use case to use seq with negative values as well (e.g. seq<int>(-10, 0)).

I think the rationale has to do with ensuring that r.begin() < r.end(); but maybe that just indicates that Op::difference is computing its return value via the wrong expression.
But anyway, I continue to be uninterested in blocking this at this point. :)

Thx a lot for the time you spent on the review. Highly appreciate it.

Please do verify that the test failures in the latest run aren't your fault (e.g. by rebasing and letting the tests run again) before committing.

Yes.

llvm/include/llvm/ADT/Sequence.h
159

For the record, I was talking about the storage type (raw_type) being unsigned, not the value type (value_type). I certainly agree that the value_type needs to be able to be signed or unsigned, enum or integral. But the reason you ever need anything besides uintptr_t as the raw_type is much subtler and harder to see.

gchatelet marked an inline comment as done.

This is currently blocked on failing tests see https://reviews.llvm.org/D83351#2857797 for reference.

gchatelet updated this revision to Diff 356499.Jul 5 2021, 7:04 AM
  • Fix iteration over AttributeList
Quuxplusone added inline comments.
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87–91 ↗(On Diff #356499)

LGTM, but per @lebedev.ri 's comment in D83351, I think this would benefit from a follow-up patch that turns index_begin and index_end into signed ints.
Style nit: Personally I'd rather see "unsigned(-1)" and "-1, 0, 1...", than "unsigned(~0)" and "~0, 0, 1...".

llvm/unittests/ADT/SequenceTest.cpp
118–119

Pre-existing consistency issue: I see all of the following in llvm/unittests/, and they can't all be right:

#if !defined(NDEBUG) && GTEST_HAS_DEATH_TEST  // 1
#if defined(GTEST_HAS_DEATH_TEST) && !defined(NDEBUG)  // 2
#ifndef NDEBUG  // 3

and as a two-liner:

#ifdef GTEST_HAS_DEATH_TEST
#ifndef NDEBUG  // 4

In a vacuum, I'd recommend (2) above, because once the #if condition mentions DEATH, then your comment on line 118 becomes redundant and can be removed.

lebedev.ri added inline comments.Jul 5 2021, 7:23 AM
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87–91 ↗(On Diff #356499)

This change NACK.
It should probably be something like

for (unsigned SetIdx : seq<int>(AL.index_begin(), AL.index_end())) {

or

for (unsigned SetIdx : seq((int)AL.index_begin(), (int)AL.index_end())) {
Quuxplusone added inline comments.Jul 5 2021, 7:42 AM
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87–91 ↗(On Diff #356499)

I approve of seq((int)AL.index_begin(), (int)AL.index_end()) but not seq<int>(AL.index_begin(), AL.index_end()) — I don't want people writing explicit template arguments all over the place.
Regardless, this would still benefit from a followup patch to actually change the type of index_begin so that we don't have to cast it on every use.

gchatelet updated this revision to Diff 356507.Jul 5 2021, 7:46 AM
gchatelet marked 2 inline comments as done.
  • Fix DEATH_TEST guard
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87–91 ↗(On Diff #356499)

Ideally, AttributeList should be changed to preserve iterator semantic, this was the intention of one of the previous author (Read summary of D32811).
Unfortunately it seems a bit more work than just renaming the enum.

I went for the very explicit version because using int involves two casts:

  • two from unsigned to signed which are implementation defined and thus non standard.
  • one from signed to unsigned which is well defined and uses modulo 2 arithmetic.

See Integer conversions in https://en.cppreference.com/w/c/language/conversion

I think it's important to make sure the code is crystal clear and the double cast makes it non trivial to reason about.

We can:

  1. Use for (unsigned SetIdx : seq<int>(AL.index_begin(), AL.index_end())) although I think it's implementation defined and a fix for something that is broken elsewhere.
  2. Write the loop explicitly explaining why we have to do it this way, allegedly not fixing the real problem either.
  3. Invest extra time to fix the issue in AttributeList so it respects begin<=end.

WDYT?

FYI I'm fixing AttributeList indexing in D105485.

gchatelet updated this revision to Diff 358297.Jul 13 2021, 9:21 AM
  • rebase and use the same iterating scheme as in D105485.
This revision was landed with ongoing or failed builds.Jul 13 2021, 9:22 AM
This revision was automatically updated to reflect the committed changes.
gchatelet added inline comments.Jul 13 2021, 9:22 AM
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87–91 ↗(On Diff #356499)

Since I'm blocked on D105485, I'll submit this one using the idiomatic way of iterating through AttributeList. This is consistent with what I did in D105485 as well.

 % git grep \.index_begin\(
include/llvm/IR/Attributes.h:764:  unsigned index_begin() const { return AttributeList::FunctionIndex; }
lib/Bitcode/Writer/BitcodeWriter.cpp:836:    for (unsigned i = AL.index_begin(), e = AL.index_end(); i != e; ++i) {
lib/Bitcode/Writer/ValueEnumerator.cpp:1039:  for (unsigned i = PAL.index_begin(), e = PAL.index_end(); i != e; ++i) {
lib/IR/Attributes.cpp:1550:  for (unsigned i = index_begin(), e = index_end(); i != e; ++i) {
lib/Transforms/Utils/FunctionComparator.cpp:113:  for (unsigned i = L.index_begin(), e = L.index_end(); i != e; ++i) {

This breaks check-llvm on arm macs: http://45.33.8.238/macm1/13855/step_11.txt

Thx for reporting. I'll revert the patch asap.