This is an archive of the discontinued LLVM Phabricator instance.

[Polly] Implement an iterator for isl maps, basic_maps, sets, basic_sets
ClosedPublic

Authored by philip.pfaffe on Jun 13 2018, 10:36 AM.

Details

Summary

Provide an iterator to simplify iteration over some isl collections.
Since these types do not natively support iteration, they have to be converted
to an list first by the caller, but can then be used in a ranged for loop:

isl::set S;
for (auto SubSet : S.get_basic_set_list ()) {
  // ...
}

Event Timeline

philip.pfaffe created this revision.Jun 13 2018, 10:36 AM

For reference: This is an alternative to D47604

[suggestion] Add unit tests for iterating over isl sets.

include/polly/Support/ISLTools.h
28

Why not const List &?

50

Should this hold a reference to the list? That is, the underlying object is not freed until all iterators are destroyed.

90

[suggestion] Make it take const isl::map_list &M, so we safe a reference counting cycle.

[serious] Without having looked further into it, does the map_iterator store a pointer to M, which does goes out-of-scope after leaving this function?

philip.pfaffe marked an inline comment as done.Jun 13 2018, 2:35 PM
philip.pfaffe added inline comments.
include/polly/Support/ISLTools.h
28

I guess technically it could be const, since in isl everything is by-value.

50

It needs a reference to do the iteration.

90

You're right, this should take a reference to the list.

philip.pfaffe marked an inline comment as done.
Add a unittest and address comments

Looks like I forgot to add pollydev as a subscriber. Shall I resubmit the patch?

Thanks for the test case

Looks like I forgot to add pollydev as a subscriber. Shall I resubmit the patch?

Don't think it's necessary.

include/polly/Support/ISLTools.h
50

What I mean is that the iterator could increase isl's ref counter to ensure that it is not free'd as long as one iterator exists. That is, is the following code valid?

basic_set_iterator I, E;
{
  auto List = S.get_basic_set_list();
  I = begin(List); E = end(List);
} // List out-of-scope here
for (; I != E; ++I) { ... }
unittests/Support/ISLTools.cpp
13

Ctx no free'd?

Thank you Philip. This clearly looks a lot better than my patch. From my perspective this looks good after Michael's comments are addressed. Thanks a lot.

philip.pfaffe marked an inline comment as done.
Free the context in the unittest.
philip.pfaffe marked an inline comment as done.Jun 18 2018, 3:24 AM
philip.pfaffe added inline comments.
include/polly/Support/ISLTools.h
50

Well, if you destroy the container you obtained iterators for and then use the iterators, that's on you ;)

Meinersbur added inline comments.Jun 18 2018, 9:47 AM
include/polly/Support/ISLTools.h
50

Using reference counting would allow us to not invalidate the iterators.

It's not just destroying the container, but also modifying it while iterating. Isl objects are reference counted, so the underlying object might be shared by other references as well.

At least it should be documented when iterators are invalidated.

unittests/Support/ISLTools.cpp
24

[suggestion] I quite like using RAII to free the isl_ctx as done in IslTest.cpp:

std::unique_ptr<isl_ctx, decltype(&isl_ctx_free)> Ctx(isl_ctx_alloc(), &isl_ctx_free);
philip.pfaffe added inline comments.Jun 19 2018, 2:15 AM
include/polly/Support/ISLTools.h
50

If I capture the List by value, how do I compare iterators? I'll have to compare the lists by value then, and that's quite expensive.

I can certainly document iterator invalidation, but it behaves just as you'd expect: destruction or modification of the list invalidate.

unittests/Support/ISLTools.cpp
24

Why doesn't the C++ wrapper do this already?

RAII management of the isl_ctx

philip.pfaffe marked an inline comment as done.Jun 19 2018, 3:54 AM
Meinersbur added inline comments.Jun 19 2018, 3:35 PM
include/polly/Support/ISLTools.h
50

Comparison of iterators from different containers is undefined (except: value-initialized iterators). You can also compare their pointers: List.get() == O.get(). However, since isl list objects use copy-on-write as an implementation detail, it pointer-equality of two list references (as in: two variables of lists that may or may not contain the same elements) was never well-defined.

IMHO the interesting question is whether the additional definedness/safety (accidentally invalidating an iterator is one of my least favorite kinds of bugs) is worth the potential overhead of reference counter increment/decrement.

unittests/Support/ISLTools.cpp
24

We could not decide what the right approach is. Freeing the isl_ctx would mean that isl::ctx owns the reference. But in most cases, you are not owning it: E.g. if returned by a get_ctx function to do something else.

philip.pfaffe added inline comments.Jun 20 2018, 1:00 AM
include/polly/Support/ISLTools.h
50

I'll buy the comparison point, but I'm not comfortable relying on copy semantics of isl lists:

  • the iterator is a template. I can instantiate it for std::vector.
  • it'd mean depending on internal implementation details of isl lists.
  • iterator use-after-invalidation is a bug everywhere in the code, and is easily checkable. Valgrind will detect this immediately.
philip.pfaffe added inline comments.Jun 20 2018, 4:01 AM
unittests/Support/ISLTools.cpp
24

Then why not just return the C object? Or decide ownership based on how the object is constructed?

bollu added inline comments.Jun 22 2018, 4:52 AM
include/polly/Support/ISLTools.h
50

So, I now understand what you mean when you said that the copy-on-write is part of the implementation, not the interface.

If I understood correctly, you mean that:

  1. If we use const ListT List, and then ISL decides to use std::vector (at which point, ISL's interface changes, since it has abandoned COW semantics), our code will need to change.
  1. However, if we use const ListT* List, then if ISL decides to use std::vector, out code will not need to change.

Hence, 2 is better than 1 since it makes less assumptions about ISL's semantics, and leads to less code churn.

Please correct me if I am mis-representing your position.

Is this implementation morally correct _today_?

While I agree with this position from a software engineering perspective, I disagree with it when it comes to the semantics of your code.

By holding a const ListT *List, with respect to ISL's _current_ interface, you are holding a pointer-to-pointer (so, something like a weak reference). Hence, I feel that this implementation violates ISL's _current_ interface, to prevent code churn in the _future_ (which is unlikely, BTW) that ISL changes its COW semantics.

Keeping a const ListT List is in line with ISL's _current_ interface of COW semantics, and is thus the correct choice.

UB with respect to iterators if ISL decides to clone objects

Another argument put forth was that if ISL changes the implementation of COW to actually clone the underlying object, then something like:

isl_list l1 = ...;
auto begin1 = l1.begin();

isl_list l2 = l1;
auto end2 = l2.end();

// UB, since we are comparing iterators across different objects,
// since ISL cloned L2 into L1.
if (begin1 != end2) {
    ...
}

I don't believe this triggers UB according to iterator semantics.

  1. Definition of input iterator

An input iterator is one that satisfies the iterator laws (de-referncible, existence of ++, along with a == operator.

By this definition, yes the example given would be UB, since I'm performing == between two iterators in different domains. However, there are two important caveats that make this argument incorrect:

a. [EqualityComparable == is an equivalence relation](http://eel.is/c++draft/utility.arg.requirements#tab:equalitycomparable)

is an equivalence relation, that is, it has the following properties:

...

b. Domain of an iterator can change over time

term the domain of == is used in the ordinary mathematical sense to denote the set of values over which == is (required to be) defined. This set can change over time.

So, _as the iterator implementor_, I can say that when

l2 = l1;

occurs, then __the domain of == on begin1 expands to contain all valid iterators of l2, and vice versa for `end2.

This way, begin1 == end2 is perfectly well defined and the compiler cannot optimise this away.

Practically, is const ListT List nicer or const ListT* List

From a usability perspective, assuming that ISL does not change its COW semantics (and therefore we pay no code churn), I much prefer const ListT List, since not only does it respect the current interface semantics, but it also makes debugging easier as Michael pointed out.

Hence, I vote for const ListT List :)

philip.pfaffe added inline comments.Jun 22 2018, 5:09 AM
include/polly/Support/ISLTools.h
50

So I'm not going to make this a value instead of a reference:

While the moral issue is indisputable per se, this is purely a software engineering issue, and morals don't count if they don't help. We can debate whether COW semantics is public or private implementation, but the fact is that with a List value I'd depend on an implementation detail that I do not have to depend on. Every SE book and lecture on the planet agrees that's a bad idea.

Having an iterator own a copy of the container it iterates is just insane. Nobody can read the code without stumbling over this, and nobody can understand it without in-depth knowledge of isl. A reference is absolutely clear in the intent and behaviour of this implementation.

I disagree with Micheal's argument for debuggability. In the current implementation of the iterator and isl, the only possible kind of invalidation is if the container goes out of scope, because it is immutable. That's a dangling reference error and is easily debuggable.

A follow-up to last week's phone call:

C++ compilers cannot optimize iterators based on what the equality operator's domain is because the compiler has no understanding of what a iterator is. It is a library concept.

Previous drafts for C++ contained axioms, which could be used by the compiler for optimization (I think they were mostly meant for correctness checking, not optimization, as concepts were also mostly about improving compiler error messages), if such axiom was added to the iterator concept. This is speculative, but I'd assume such a axiom could at most be added to iterator implementations, not iterators in general. Most iterators don't even store a reference/pointer/etc to the container object they are iterating over.

The most recent draft for concepts does not contain axiom or mention that these could be used for compiler optimizations.

unittests/Support/ISLTools.cpp
24

@grosser used this for Polly's isl-noexceptions.h since get_ctx() had to return something. The isl version of it never generated something using isl_ctx as parameter or return value. I don't know why he chose to not use isl_ctx directly, but I had no objections.

bollu added a comment.Jun 26 2018, 6:26 AM

@Meinersbur even assuming that such a concept exists (since one could argue such a concept does exist for a range-based for loop to work), I still don't think it allows the compiler to optimise away iterators based on the interpretation of the spec I outlined (== is an equivalence relation whose domain is mutable).

bollu accepted this revision.Jun 27 2018, 12:31 PM

Sorry, I meant to accept this last week. I accept this as an acceptable "software engineering" compromise, while I still hold the philosophical reservations I have mentioned.

This revision is now accepted and ready to land.Jun 27 2018, 12:31 PM

@Meinersbur If you are opposed to merging this, please request changes.

Committing is ok, I merely disagree with @philip.pfaffe's arguments, not with the patch itself.

This revision was automatically updated to reflect the committed changes.