Subtraction is a foundational arithmetic operation that is often used when computing, for example, data transfer sets or cache hits. Since the result of subtraction need not be a convex polytope, a new class PresburgerSet is introduced to represent unions of convex polytopes.
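To see why subtraction forces a union representation, here is a toy 1-D analogue (invented for illustration; not the MLIR API): a "set" is a union of closed integer intervals, mirroring how PresburgerSet holds a union of FlatAffineConstraints, and subtracting one interval from another can leave two disjoint pieces.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Toy 1-D analogue: a "set" is a union of closed integer intervals.
using Interval = std::pair<int, int>; // [lo, hi], inclusive
using IntervalUnion = std::vector<Interval>;

// Subtract [b.first, b.second] from [a.first, a.second]. The result may
// need *two* intervals -- the reason a single convex piece cannot
// represent a set difference in general.
IntervalUnion subtract(Interval a, Interval b) {
  IntervalUnion result;
  if (b.first > a.first) // part of a strictly left of b survives
    result.push_back({a.first, std::min(a.second, b.first - 1)});
  if (b.second < a.second) // part of a strictly right of b survives
    result.push_back({std::max(a.first, b.second + 1), a.second});
  return result;
}
```

For example, subtracting [3, 5] from [0, 10] yields the two pieces [0, 2] and [6, 10], so the result type must be a union.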
Thanks for reviewing this!
Fixed the comment. fac is modified by the algorithm internally but gets restored to its original state by the time it returns.
Merged MLIRPresburger into MLIRLoopAnalysis to avoid a cyclic dependency between them.
For private access, wouldn't it be cleaner to access the member variable directly?
The build failure appears to be unrelated. From the log:
ERROR exception: Cmd('git') failed due to: exit code(1) cmdline: git fetch --all stdout: 'Fetching origin Fetching upstream' stderr: 'firstname.lastname@example.org: Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. error: Could not fetch origin'
The implementation details are such that making intersect/union static functions that return the resultant set, rather than member functions modifying an object, results in unnecessary copies.
On the other hand, the implementation of complement is such that the result set is constructed from scratch anyway. If the caller wants both the original set as well as the complement, a static function allows them to obtain that without any extra copies. If it was only a non-static member function, they would have to create a copy of the original set and then call complement on the copy, incurring an additional copy. The same is true of subtract, actually, so that could be made static too, if this reasoning makes sense.
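The copy-count argument above can be made concrete with a copy-counting stand-in class (all names here are invented for the sketch, not the actual patch API): with only a mutating member function, keeping the original requires one caller-side copy, while a value-returning function built from scratch requires none.

```cpp
// Copy-counting stand-in for PresburgerSet (hypothetical names).
struct CountedSet {
  static inline int copies = 0;
  int data = 0;
  CountedSet() = default;
  CountedSet(const CountedSet &other) : data(other.data) { ++copies; }
  CountedSet(CountedSet &&) = default;

  // Mutating member function: to keep the original, the caller must
  // copy first, then complement the copy.
  void complementInPlace() { data = ~data; }

  // Value-returning function: the result is built from scratch, so the
  // caller keeps the original and gets the complement with no extra copy.
  CountedSet complement() const {
    CountedSet result;
    result.data = ~data;
    return result; // elided or moved; never copied
  }
};
```

With `complementInPlace`, getting both sets costs one copy; with `complement`, it costs zero, which is the trade-off being discussed.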
Are there any intended guarantees of the "canonical" form of the set? E.g., it appears to me that the contained FACs can overlap, which may or may not be a good idea depending on how operations will be handled.
Leaking implementation details to the API is very unfortunate. This means that if the implementation changes, so will the API, and it also makes it harder for users to understand when the function they are calling is static and when it is not. In the case of an already-complex abstraction such as Presburger sets, it adds even more complexity that could have been avoided.
Could you rather have something like a (non-static) getComplementSet() function that returns the complement set and indicates in its name that it produces a new value?
Have you considered having a facade class that doesn't store actual data so is cheap to copy, and use shared_ptr / uniquer / copy-on-write behind the scenes?
NRVO is a thing in modern compilers, does the copy actually happen when you return?
Why is this taking a mutable reference to fac?
Why this change?
The general policy is to have one library per directory, and accordingly a cmake file.
reserve space in the vector before populating it in the loop
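The suggestion above is the standard `reserve`-before-populate pattern; a minimal sketch (function name invented):

```cpp
#include <vector>

// Reserve the final size up front so the loop does not trigger
// repeated reallocations as the vector grows.
std::vector<int> buildSquares(int n) {
  std::vector<int> out;
  out.reserve(n); // one allocation instead of O(log n) regrowths
  for (int i = 0; i < n; ++i)
    out.push_back(i * i);
  return out;
}
```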
Notation nit: in set theory, - is symmetric difference, i.e. (A \ B) u (B \ A); one-sided difference is usually denoted as \
Nit: ^ for intersection, v for union ?
What is the "basic set" ?
What is s_ij, a single constraint?
Do you ignore the fact that constraints are redundant, or the constraints themselves?
Do not recompute the upper bound on each iteration
Non-trivial lambdas deserve a comment, similarly to top-level functions
Why is this even necessary?
This wouldn't need a comment if you had FlatAffineConstraints::universe(), or used PresburgerSet::universe() instead.
For something returning a new value, we could use a unique_ptr return value, but that would lead to a heap allocation (in some ways defeating the purpose of using a SmallVector and contiguous storage inside a FAC). To avoid heap allocations, we could instead use a trailing output argument (passed by reference); this avoids both heap allocations and copies for sure, but it makes call-site code less clean and composable. Finally, returning the FAC by value would lead to copies if "return value optimization" doesn't happen (which could be sensitive to the control flow of the implementation), and one may not want to rely on it since performance is important here. Avoiding heap allocations (especially small ones, which are actually the common case) and copies is key here.
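The two call shapes being weighed can be sketched side by side with a toy stand-in type (`std::set<int>` standing in for FlatAffineConstraints; function names are hypothetical):

```cpp
#include <set>

using Set = std::set<int>; // stand-in for FlatAffineConstraints

// Output-argument style: avoids copies for sure, but the call site
// does not compose.
void unionInto(const Set &a, const Set &b, Set &result) {
  result = a;
  result.insert(b.begin(), b.end());
}

// Return-by-value style: composes at the call site; with RVO/move
// semantics the returned temporary is not copied.
Set unionOf(const Set &a, const Set &b) {
  Set result = a;
  result.insert(b.begin(), b.end());
  return result;
}
```

The second style allows nesting like `unionOf(a, unionOf(b, c))`, which the output-argument style cannot express without naming intermediates.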
This will make a useless copy of fac on call, and enable implicit copy-construction. Not sure the latter is a good idea.
IMO, unique_ptr is a bad idea unless there is some _uniqueness_ semantics, either in ownership or in object existence. Here it does not seem to be the case, and it will lead to an even more awkward to use API.
I am slightly concerned that all this can be classified as premature optimization. Especially since we are looking into exponential-complexity algorithms, is the execution time really dominated by copies or allocations? At what point do we actually stop benefiting from inline stack storage because we allocate anyway?
For this specific case, I just don't see any good reason to have static functions (other than makeUniverse, which is a factory function). Even more, I would actually prefer having only an explicit cast from FAC to PresburgerSet and only supporting subtraction between two instances of the latter class. This makes it clear, when reading the code, that it starts to use a more expressive and expensive abstraction.
I would strongly recommend thinking holistically about trade-offs in efficiency and usability, and coming up with a proposition of what the end result should look like and, hopefully, some design/implementation principles. So far it seems to be "avoid copies at all cost", and if this is the approach that is going to be taken, I would like to see a convincing rationale to support it.
Regarding the API, I would also recommend returning *this from all mutating functions so that they become chainable: PresburgerSet(fac).subtract(otherSet).intersect(moreSet).complement() reads quite fluently, as opposed to auto x = PresburgerSet::subtract(fac, otherSet); x.intersect(moreSet); PresburgerSet::complement(x); which is currently required.
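The `return *this` chaining pattern suggested here looks roughly like the following (a toy bitmask stands in for the set; all names are invented for the sketch):

```cpp
// Chainable mutators: each mutating operation returns *this, so calls
// can be strung together fluently on one object.
struct ChainSet {
  unsigned bits = 0;
  ChainSet &unionWith(const ChainSet &o) {
    bits |= o.bits;
    return *this;
  }
  ChainSet &intersectWith(const ChainSet &o) {
    bits &= o.bits;
    return *this;
  }
  ChainSet &complement() {
    bits = ~bits;
    return *this;
  }
};
```

A call site then reads as a single expression, e.g. `s.unionWith(a).intersectWith(b)`, instead of a sequence of statements with named temporaries.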
I hadn't seen the comment below. It's a bad smell for the API, again. As the user, I would have no way of knowing whether the operand is reverted back to the original state or not, so I would have copied it just to be safe. And since the argument is not const, future changes may stop reverting it back to the original state and break all uses which expect that in subtle ways.
Changed the API to return the result of set operations. Union, intersect, subtract, complement are now non-static, const member functions that return the result of the operation. This also allows chaining. Also addressed other comments and fixed an issue with the intersect tests.
No guarantees are intended. I can't think of any algorithmic simplifications or optimizations that could be made to the implementations of any of the operations by taking advantage of a no-overlap invariant.
I've changed the API to make subtract and complement non-static. I realized that I was slightly mistaken earlier -- intersect is also like subtract and complement in that it constructs the result from scratch, so returning the value seems favourable. Now unionSet, intersect, subtract and complement are all non-static, const, and return the result of the operation. This also allows chaining.
Also, only subtraction between sets is possible now and the FAC -> PresburgerSet cast has been made explicit.
It's just passed by value now. Also, only subtraction between PresburgerSets is possible now, so this function has been made private.
FlatAffineConstraints depends on Simplex, while Set depends on FlatAffineConstraints, resulting in a cyclic dependency if MLIRPresburger is kept separate.
Removed. (It was previously necessary since rvalues couldn't be passed to the non-const lvalue overload. Since that overload no longer takes a reference, this is no longer needed.)
These are not of exponential complexity --- most of the things using these are just quadratic/cubic, and many are linear. The only exponential one (Simplex) is also, for all practical purposes, polynomial-time here. And more importantly, both the number of constraints and the dimensionality are small here - from a handful to a few tens at most. In the common case, in fact, they are actually quite small. I understand things depend on use cases, but I think this is a good time to select the right data structures. Otherwise, I agree with all the points made - I don't think we should go for "avoid all copies" at this stage, but we also shouldn't go in the wrong direction.
I strongly feel copies and stack allocations are quite important if the focus remains on making sure the input sizes and use cases *by design* are all small - at least in the upstream passes/utilities, which has an impact on how passes/utilities are designed to make use of this API.
The comment can be rephrased. Something like: Perform a union of `fac` with this `FlatAffineConstraints`?
Should it be addFlat.. or unionFlat...? Add can sometimes be misleading - appending constraints will lead to an intersect for a single system.
evaluate an affine expression ?
Use the i = 0, e = ...; i < e form.
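The loop form being asked for here hoists the bound into `e` so it is computed once and is clearly not expected to change during the loop (example function invented for illustration):

```cpp
#include <vector>

// The "i = 0, e = ...; i < e" idiom: the upper bound is evaluated a
// single time instead of on every iteration.
int sumAll(const std::vector<int> &v) {
  int sum = 0;
  for (unsigned i = 0, e = v.size(); i < e; ++i)
    sum += v[i];
  return sum;
}
```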
It is odd that a PresburgerSet(..) will return an empty set while FlatAffineConstraints(.., ...) will return a universal set. Could we make things consistent? Could you just do PresburgerSet::getEmptySet(.., ...)? It's confusing otherwise.
Could you improve this slightly for debugging purposes? You could use interleaveComma while printing the list enclosed in [ -- so that we easily know where the constraint sets start and end in a debug output.
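For reference, llvm::interleaveComma (from llvm/ADT/STLExtras.h) prints container elements separated by ", "; the bracketed debug-output shape being suggested looks roughly like this standalone equivalent (written without LLVM so it is self-contained):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Minimal standalone analogue of printing elements with
// llvm::interleaveComma inside [ ... ] delimiters, so the start and end
// of the list are easy to spot in debug output.
std::string printBracketed(const std::vector<std::string> &elems) {
  std::ostringstream os;
  os << "[";
  for (size_t i = 0, e = elems.size(); i < e; ++i)
    os << (i == 0 ? "" : ", ") << elems[i];
  os << "]";
  return os.str();
}
```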
Please add a comment here and/or below to say what this file is testing.
All of these methods are missing doc comments. We definitely need them at least for the first one.
Sorry, I can't immediately tell locally what these mean. I wasn't expecting points in the 2 to 10 range for x and y to appear here, but there are five of them. testComplementAtPoints and all other test* methods are missing doc comments. Please add.
Then it is worth documenting.
Great, now they are all consistent! Let's push this one step further and try to think if the "const member function acting on a const reference and returning a new object" applies to other operations we might want to have on PresburgerSet or not. It could also be surprising to get some operations that mutate the object they are called on and others that produce new objects.
I meant simplex indeed. Also, as we start wrapping it into operations such as subtraction and building higher-level operations on top of those, it will become increasingly less clear which operations are more likely to be computationally expensive.
You know that I am very open to doing one thing that works and reconsidering it later if needed. There are costs of such reconsideration, but they don't look terribly high at this point. If we make a properly isolated API that does not leak implementation details, this cost of change will remain manageable. But this shouldn't mean we shouldn't carefully choose the right data structures in the first place.
A data-driven approach could work well here. If we collect some performance measurements of how the implementation behaves for different kinds of inputs, we can quantify what small actually means and what are the costs of not keeping cases small.
Then some files should move... For example, we can have lib/Analysis/PresburgerSet.cpp, keep Simplex where it is (optionally rename MLIRPresburger to MLIRSimplex) and keep the dependency of MLIRLoopAnalysis on MLIRSimplex.
Or we can go the other way around, which we should arguably do anyway, and factor the entire affine analysis out into a separate directory under Analysis.
Speaking of which, is there a reason why we need public default constructors? If not, I'd just go with factory functions for both empty and universe sets (in a separate patch; this patch should be consistent with what is already present in the codebase).
Changed to unionFACInPlace to disambiguate from the union operation that returns the result rather than mutating the set and added some documentation to clarify this.
I also added an analogous unionSetInPlace and used it in subtract. If we want to provide the user the option to save copies in case they want to union in-place, this can be left public. Otherwise we can make it private.
Moved Set.cpp to lib/Analysis/PresburgerSet.cpp.
Changed to PresburgerSet::getEmptySet(.., ...).
It looks somewhat awkward printing the FACs with interleaved commas... each FAC takes up multiple lines, and the header of each FAC dump (something like Constraints (1 dims, 0 symbols, 0 locals), (1 constraints)) is much more prominent than the comma is. With commas it looks something like this
4 FlatAffineConstraints: [ Constraints (1 dims, 0 symbols, 0 locals), (1 constraints) (None const) -1 -5 >= 0 , Constraints (1 dims, 0 symbols, 0 locals), (1 constraints) (None const) 1 -3 = 0 , Constraints (1 dims, 0 symbols, 0 locals), (1 constraints) (None const) 1 -4 = 0 , Constraints (1 dims, 0 symbols, 0 locals), (1 constraints) (None const) 1 -5 = 0 ]
Are the commas adding any value here?
Does this work?
This is not a list of points that are supposed to be present; it is a list of points at which to validate correctness. It checks that these points are present in the complement iff they are not present in the original set.
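The validation idea described here can be sketched on the toy 1-D level (interval unions stand in for PresburgerSet; names are invented): for each sample point, the complement must contain it exactly when the original set does not.

```cpp
#include <utility>
#include <vector>

using IntervalUnion = std::vector<std::pair<int, int>>;

// True iff p lies in some closed interval of the union.
bool contains(const IntervalUnion &set, int p) {
  for (auto [lo, hi] : set)
    if (lo <= p && p <= hi)
      return true;
  return false;
}

// The complement-at-points check: every sample point must be in the
// complement iff it is not in the original set.
bool checkComplementAtPoints(const IntervalUnion &set,
                             const IntervalUnion &comp,
                             const std::vector<int> &points) {
  for (int p : points)
    if (contains(comp, p) == contains(set, p))
      return false;
  return true;
}
```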