Page MenuHomePhabricator

Introduce subtraction for FlatAffineConstraints
ClosedPublic

Authored by arjunp on Sep 2 2020, 6:46 PM.

Details

Summary

Subtraction is a foundational arithmetic operation that is often used when computing, for example, data transfer sets or cache hits. Since the result of subtraction need not be a convex polytope, a new class PresburgerSet is introduced to represent unions of convex polytopes.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Thanks for reviewing this!

mlir/include/mlir/Analysis/Presburger/Set.h
69 ↗(On Diff #290412)

Fixed the comment. fac is modified by the algorithm internally but gets restored to its original state by the time it returns.

mlir/lib/Analysis/Presburger/CMakeLists.txt
2

Merged MLIRPresburger into MLIRLoopAnalysis to avoid a cyclic dependency between them.

mlir/lib/Analysis/Presburger/Set.cpp
33 ↗(On Diff #290412)

For private access, would it be cleaner than accessing the member variable directly?

The build failure appears to be unrelated. From the log:

ERROR   exception: Cmd('git') failed due to: exit code(1)
  cmdline: git fetch --all
  stdout: 'Fetching origin
Fetching upstream'
  stderr: 'git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
 
Please make sure you have the correct access rights
and the repository exists.
error: Could not fetch origin'
ftynse added inline comments.Sep 10 2020, 10:39 AM
mlir/include/mlir/Analysis/Presburger/Set.h
64 ↗(On Diff #290984)

Why is this a static function but insertsect/union are not?

arjunp added inline comments.Sep 10 2020, 6:30 PM
mlir/include/mlir/Analysis/Presburger/Set.h
64 ↗(On Diff #290984)

The implementation details are such that making intersect/union static functions that return the resultant set rather than be member functions modifying an object results in unnecessary copies.

On the other hand, the implementation of complement is such that the result set is constructed from scratch anyway. If the caller wants both the original set as well as the complement, a static function allows them to obtain that without any extra copies. If it was only a non-static member function, they would have to create a copy of the original set and then call complement on the copy, incurring an additional copy. The same is true of subtract, actually, so that could be made static too, if this reasoning makes sense.

ftynse added inline comments.Sep 11 2020, 9:24 AM
mlir/include/mlir/Analysis/Presburger/Set.h
23 ↗(On Diff #290984)

Are there any intended guarantees of the "canonical" form of the set? E.g., it appears to me that the contained FAC's can overlap, which may or may not be a good idea depending on how operations will be handled.

64 ↗(On Diff #290984)

Leaking implementation details to the API is very unfortunate. This means that if the implementation changes, so will the API, and it also makes it harder for uses to understand when they function they are calling is static and when it is not. In the case of already-complex abstraction such as Presburger sets, it adds even more complexity that could have been avoided.

Could you rather have something like a (non-static) getComplementSet() function that returns the complement set and indicates in its name that it produces a new value?

Have you considered having a facade class that doesn't store actual data so is cheap to copy, and use shared_ptr / uniquer / copy-on-write behind the scenes?

NRVO is a thing in modern compilers, does the copy actually happen when you return?

70 ↗(On Diff #290984)

Why is this taking a mutable reference to fac?

mlir/lib/Analysis/Presburger/CMakeLists.txt
1

Why this change?

The general policy is to have one library per directory, and accordingly a cmake file.

mlir/lib/Analysis/Presburger/Set.cpp
111 ↗(On Diff #290984)

reserve space in the vector before populating it in the loop

123 ↗(On Diff #290984)

same here

129 ↗(On Diff #290984)

Notation nit: in set theory, - is symmetric difference, e.g. (A \ B) u (B \ A), one-sided difference is usually denoted as \

132 ↗(On Diff #290984)

Nit: ^ for intersection, v for union ?

134 ↗(On Diff #290984)

What is the "basic set" ?

136 ↗(On Diff #290984)

What is s_ij, a single constraint?

145 ↗(On Diff #290984)

Do you ignore the fact that constraints are redundant, or the constraints themselves?

166 ↗(On Diff #290984)

SmallBitVector?

167 ↗(On Diff #290984)

Do not recompute the upper bound on each iteration

173 ↗(On Diff #290984)

Non-trivial lambdas deserve a comment, similarly to top-level functions

230–234 ↗(On Diff #290984)

Why is this even necessary?

238 ↗(On Diff #290984)

This wouldn't need a comment if you had FlatAffineConstraints::universe(), or used PresubrgerSet::universe(0 instead.

256 ↗(On Diff #290984)

copy-pasta

bondhugula added inline comments.Sep 13 2020, 10:59 PM
mlir/include/mlir/Analysis/Presburger/Set.h
64 ↗(On Diff #290984)

For something returning a new value, we could use a unique_ptr return value, but that would lead to a heap allocation (in some ways defeating the purpose of using a SmallVector and contiguous storage inside a FAC)? To avoid heap allocations, we could instead use a trailing output argument (passed by reference), and this will both avoid heap allocations and copies for sure - but this makes call site code not as clean and "composing". And finally returning the FAC by value would lead to copies if "return value optimization" doesn't happen (could be sensitive to the control flow of the implementation?), and one may want to not rely on it since performance is important here. Avoid heap allocations (esp. for small ones which are actually the common case) and copies is key here.

What would be the right choices? @ftynse @arjunp

ftynse added inline comments.Sep 14 2020, 9:29 AM
mlir/include/mlir/Analysis/Presburger/Set.h
30 ↗(On Diff #290984)

This will make a useless copy of fac on call, and enable implicit copy-construction. Not sure the latter is a good idea.

64 ↗(On Diff #290984)

IMO, unique_ptr is a bad idea unless there is some _uniqueness_ semantics, either in ownership or in object existence. Here it does not seem to be the case, and it will lead to an even more awkward to use API.

I am slightly concerned that all this can be classified as premature optimization. Especially since we are looking into exponential-complexity algorithms, is the execution time really dominated by copies or allocations? At what point we actually stop benefiting from inline stack storage because we allocate anyway?

For this specific case, I just don't see any good reason to have static functions (other than makeUniverse, which is a factory function). Even more, I would actually prefer having only an explicit cast from FAC to PresburgerSet and only support subtraction between to instances of the latter class. This makes it clear, when reading the code, that it starts to use a more expressive and expensive abstraction.

I would strongly recommend thinking holistically about trade-offs in efficiency and usability, and coming up with a proposition of how the end result should look like and hopefully some design/implementation principles. So far it seems to be "avoid copy at all cost", and if this is the approach that is going to be taken, I would like to see a convincing rationale to support it.

Regarding the API, I would also recommend return *this from all mutating function so that they become chainable: PresburgerSet(fac).subtract(otherSet).insersect(moreSet).complement() reads quite fluently. As opposed to auto x =PresburgerSet::subtract(fac, otherSet); x.intersect(moreSet); PresburgerSet::complement(x); that is currently required.

70 ↗(On Diff #290984)

Haven't seen the comment below. It's a bad smell for the API, again. As the user, I would have no way of knowing if the operand is reverted back to the original state or not, so I would have copied it just to be safe. And since the argument is not const, future changes make stop reverting it back to the original state and break all uses which expect that in subtle ways.

arjunp updated this revision to Diff 291968.Sep 15 2020, 10:40 AM
arjunp marked 14 inline comments as done.

Changed the API to return the result of set operations. Union, intersect, subtract, complement are now non-static, const member functions that return the result of the operation. This also allows chaining. Also addressed other comments and fixed an issue with the intersect tests.

arjunp added inline comments.Sep 15 2020, 3:20 PM
mlir/include/mlir/Analysis/Presburger/Set.h
23 ↗(On Diff #290984)

No guarantees are intended. I can't think of any algorithmic simplifications or optimizations that could be made to the implementations of any of the operations by taking advantage of a no-overlap invariant.

64 ↗(On Diff #290984)

I've changed the API to make subtract and complement non-static. I realized that I was slightly mistaken earlier -- intersect is also like subtract and complement in that it constructs the result from scratch, so returning the value seems favourable. Now unionSet, intersect, subtract and complement are all non-static, const, and return the result of the operation. This also allows chaining.

Also, only subtraction between sets is possible now and the FAC -> PresburgerSet cast has been made explicit.

70 ↗(On Diff #290984)

It's just passed by value now. Also, only subtraction between PresburgerSets is possible now, so this function has been made private.

mlir/lib/Analysis/Presburger/CMakeLists.txt
1

FlatAffineConstraints depends on Simplex, while Set depends on FlatAffineConstraints, resulting in a cyclic dependency if MLIRPresburger is kept separate.

mlir/lib/Analysis/Presburger/Set.cpp
230–234 ↗(On Diff #290984)

Removed. (It was previously necessary since rvalues couldn't be passed to the non-const lvalue overload. Since that overload no longer takes a reference, this is no longer needed.)

bondhugula added inline comments.Sep 15 2020, 3:45 PM
mlir/include/mlir/Analysis/Presburger/Set.h
64 ↗(On Diff #290984)

Especially since we are looking into exponential-complexity algorithms, is the

These are not of exponential complexity --- most of the things using these are just quadratic/cubic and many linear. The only one exponential is also for all practical purposes of polynomial time here (Simplex). And more importantly, both n (the input size whether dimensionality of constraint) are small here - from a handful to few tens at most. In the common case, in fact, they are actually quite small. I understand things depend on use cases. But I think this is a good time to select the right data structures. Otherwise, I agree with all the points made - I don't think we should go for "avoid all copies" at this stage, but also not go in the wrong direction.

I strongly feel copies and stack allocations to be quite important if the focus remains on making sure the input sizes and use cases *by design* are all small - at least in the upstream passes/utilities, which has an impact on how passes/utilities are designed to make use of this API.

arjunp updated this revision to Diff 292257.Sep 16 2020, 9:46 AM

Try to rebase to the new LLVM HEAD.

arjunp updated this revision to Diff 292258.Sep 16 2020, 9:50 AM

Trying again to rebase.

bondhugula requested changes to this revision.Sep 17 2020, 1:31 AM
bondhugula added inline comments.
mlir/include/mlir/Analysis/Presburger/Set.h
49–50 ↗(On Diff #292258)

The comment can be rephrased. Something like Perform a union of fac` with this FlatAffineConstraints`?

50 ↗(On Diff #292258)

Should it be addFlat.. or unionFlat...? Add can sometimes be misleading - appending constraints will lead to an intersect for a single system.

mlir/include/mlir/Analysis/Presburger/Simplex.h
172–173

Likewise.

mlir/lib/Analysis/AffineStructures.cpp
1059

evaluate an affine expression ?

1075–1079

Use the i = 0, e = ...; i < e form.

mlir/lib/Analysis/Presburger/Set.cpp
254 ↗(On Diff #292258)

It is odd that a PresburgerSet(..) will return an empty set while FlatAffineConstraints(.., ...) will return a universal set. Could we make things consistent? Could you just do PresburgetSet::getEmptySet(.., ...)? It's confusing otherwise.

294 ↗(On Diff #292258)

const

304–308 ↗(On Diff #292258)

Could you improve this slightly for debugging purposes? You could use interleaveComma while printing the list enclosed in [ -- so that we easily know where the constraint sets start and end in a debug output.

mlir/unittests/Analysis/Presburger/SetTest.cpp
4–8 ↗(On Diff #292258)

Please add a comment here and/or below to say what this file is testing.

16–52 ↗(On Diff #292258)

All of these methods are missing doc comments. We definitely need them at least for the first one.

495–506 ↗(On Diff #292258)

Sorry, I can't immediately tell locally what these mean. I wasn't expecting points in the 2 to 10 range for x and y appear here, but there are five of them. testComplementAtPoints and all other test* methods are missing doc comments. Please add.

This revision now requires changes to proceed.Sep 17 2020, 1:31 AM
ftynse added inline comments.Sep 17 2020, 4:35 AM
mlir/include/mlir/Analysis/Presburger/Set.h
23 ↗(On Diff #290984)

Then it is worth documenting.

64 ↗(On Diff #290984)

Now unionSet, intersect, subtract and complement are all non-static, const, and return the result of the operation. This also allows chaining.

Great, now they are all consistent! Let's push this one step further and try to think if the "const member function acting on a const reference and returning a new object" applies to other operations we might want to have on PresburgerSet or not. It could also be surprising to get some operations that mutate the object they are called on and others that produce new objects.

Also, only subtraction between sets is possible now and the FAC -> PresburgerSet cast has been made explicit.

Thanks!

These are not of exponential complexity --- most of the things using these are just quadratic/cubic and many linear.

I meant simplex indeed. Also, as we start wrapping it into operations such as subtraction and building higher-level operations on top of those, it will become increasingly less clear which operations are more likely to be computationally expensive.

But I think this is a good time to select the right data structures.

You know that I am very open to doing one thing that works and reconsidering it later if needed. There are costs of such reconsideration, but they don't look terribly high at this point. If we make a properly isolated API that does not leak implementation details, this cost of change will remain manageable. But this shouldn't mean we shouldn't carefully choose the right data structures in the first place.

I strongly feel copies and stack allocations to be quite important if the focus remains on making sure the input sizes and use cases *by design* are all small - at least in the upstream passes/utilities, which has an impact on how passes/utilities are designed to make use of this API.

A data-driven approach could work well here. If we collect some performance measurements of how the implementation behaves for different kinds of inputs, we can quantify what small actually means and what are the costs of not keeping cases small.

mlir/lib/Analysis/Presburger/CMakeLists.txt
1

Then some files should move... For example, we can have lib/Analysis/PresburgerSet.cpp, keep Simplex where it is (optionally rename MLIRPresburger to MLIRSimplex) and keep the dependency of MLIRLoopAnalysis on MLIRSimplex.

Or we can go the other way around, which we should arguably do anyway, and factor the entire affine analysis out into a separate directory under Analysis.

mlir/lib/Analysis/Presburger/Set.cpp
254 ↗(On Diff #292258)

Speaking about which, is there a reason why we need public default constructors? If not, I'd just go with factory functions for both empty and universe sets (in a separate patch, this patch should be consistent with what is already present in the codebase)

arjunp updated this revision to Diff 292545.Sep 17 2020, 9:32 AM
arjunp marked 26 inline comments as done.

Address comments.

mlir/include/mlir/Analysis/Presburger/Set.h
50 ↗(On Diff #292258)

Changed to unionFACInPlace to disambiguate from the union operation that returns the result rather than mutating the set and added some documentation to clarify this.

I also added an analogous unionSetInPlace and used it in subtract. If we want to provide the user the option to save copies in case they want to union in-place, this can be left public. Otherwise we can make it private.

mlir/lib/Analysis/Presburger/CMakeLists.txt
1–4

Moved Set.cpp to lib/Analysis/PresburgerSet.cpp.

mlir/lib/Analysis/Presburger/Set.cpp
254 ↗(On Diff #292258)

Changed to PresburgetSet::getEmptySet(.., ...).

304–308 ↗(On Diff #292258)

It looks somewhat awkward printing the FACs with interleaved commas... each FAC takes up multiple lines, and the header of each FAC dump (something like Constraints (1 dims, 0 symbols, 0 locals), (1 constraints)) is much more prominent than the comma is. With commas it looks something like this

4 FlatAffineConstraints: [
Constraints (1 dims, 0 symbols, 0 locals), (1 constraints)
(None  const)
-1 -5 >= 0

,
Constraints (1 dims, 0 symbols, 0 locals), (1 constraints)
(None  const)
1 -3 = 0

,
Constraints (1 dims, 0 symbols, 0 locals), (1 constraints)
(None  const)
1 -4 = 0

,
Constraints (1 dims, 0 symbols, 0 locals), (1 constraints)
(None  const)
1 -5 = 0

]

Are the commas adding any value here?

mlir/unittests/Analysis/Presburger/SetTest.cpp
4–8 ↗(On Diff #292258)

Does this work?

495–506 ↗(On Diff #292258)

This is not a list of points that are supposed to be present; it is a list of points to validate correctness at. iIt checks that these points are present in the complement iff they are not present in the original set.

bondhugula accepted this revision.Sep 17 2020, 10:19 AM

@ftynse comments from the previous round sound good to me too. Looks good to commit from my side once those are addressed.

mlir/lib/Analysis/Presburger/Set.cpp
304–308 ↗(On Diff #292258)

They aren't really adding any value. I just meant they could be improved in some way. Feel free to do as you see fit. Thanks.

This revision is now accepted and ready to land.Sep 17 2020, 10:19 AM
arjunp updated this revision to Diff 292620.Sep 17 2020, 2:25 PM

clang-format.

arjunp updated this revision to Diff 292628.Sep 17 2020, 2:48 PM
arjunp marked an inline comment as done.

Changed auto to explicit types in PresburgerSetTest.

arjunp added inline comments.Sep 17 2020, 2:50 PM
mlir/lib/Analysis/Presburger/CMakeLists.txt
1

Moved to PresburgerSet.cpp.

mlir/lib/Analysis/Presburger/Set.cpp
254 ↗(On Diff #292258)

(also made the default constructor private)

arjunp updated this revision to Diff 292637.Sep 17 2020, 3:19 PM

clang-format again. Fix a typo in PresburgerSetTest (makeFACFromIneq -> makeFACFromIneqs)

I hope all the issues have been addressed. @ftynse, do you have any further concerns?

Gentle reminder :) @ftynse, did you have any further concerns?

ftynse accepted this revision.Wed, Oct 7, 6:40 AM

Gentle reminder :) @ftynse, did you have any further concerns?

Nope, sorry, once a commit is approved by someone, it disappears from the attention set for everybody else...

arjunp added a comment.Wed, Oct 7, 7:54 AM

Great! I don't have the permissions yet -- can one of you land the patch? :)

This revision was automatically updated to reflect the committed changes.