This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Add integer range inference analysis
ClosedPublic

Authored by krzysz00 on Apr 19 2022, 11:00 AM.

Details

Summary

This commit defines a dataflow analysis for integer ranges, which
uses a newly-added InferIntRangeInterface to compute the lower and
upper bounds on the results of an operation from the bounds on the
arguments. The range inference is a flow-insensitive dataflow analysis
that can be used to simplify code, such as by statically identifying
bounds checks that cannot fail in order to eliminate them.

The InferIntRangeInterface has one method, inferResultRanges(), which
takes a vector of inferred ranges for each argument to an op
implementing the interface and a callback allowing the implementation
to define the ranges for each result. These ranges are stored as
ConstantIntRanges, which hold the lower and upper bounds for a
value. Bounds are tracked separately for the signed and unsigned
interpretations of a value, which ensures that the impact of
arithmetic overflows is correctly tracked during the analysis.

The commit also adds a -test-int-range-inference pass to test the
analysis until it is integrated into SCCP or otherwise exposed.

Finally, this commit fixes some bugs relating to the handling of
region iteration arguments and terminators in the data flow analysis
framework.

Depends on D124020

Depends on D124021

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rriddle added inline comments.Apr 25 2022, 4:35 PM
mlir/lib/Dialect/Arithmetic/Analysis/IntRangeAnalysis.cpp
47–48
59–60

Can you add more newlines into these functions to better separate the functionality?

94–95
119–125

Can you split the functor into a separate variable? That would likely format this better.

120–124

?

138–141

Same comments here as above.

@rriddle To clarify, what types of tests do you think would be appropriate? Unit tests of the lattice implementation?

Secondly, what would the more generic implementation look like? That is, where's the missing generality?

Since it might help, I'll restate the context that this current set of patches is a general solution to.

If we take

func @example() -> index {
    // Is gpu.block_id, but I use our wrapper for clarity and because, unlike with gpu.block_id, we can know this is <= int32_max
    %0 = miopen.block_id
    %1 = affine.apply affine_map<(d0) -> (d0 floordiv 9 + ((d0 mod 9) floordiv 3) * 3 + (d0 mod 3) * 9)>(%0)
    return %1 : index
}

and run it through -lower-affine -canonicalize, we get

func @example() -> index {
    %c9 = arith.constant 9 : index
    %c0 = arith.constant 0 : index
    %c3 = arith.constant 3 : index
    %c-1 = arith.constant -1 : index
    %0 = miopen.block_id
    %1 = arith.cmpi slt, %0, %c0 : index
    %2 = arith.subi %c-1, %0 : index
    %3 = arith.select %1, %2, %0 : index
    %4 = arith.divsi %3, %c9 : index
    %5 = arith.subi %c-1, %4 : index
    %6 = arith.select %1, %5, %4 : index
    %7 = arith.remsi %0, %c9 : index
    %8 = arith.cmpi slt, %7, %c0 : index
    %9 = arith.addi %7, %c9 : index
    %10 = arith.select %8, %9, %7 : index
    %11 = arith.cmpi slt, %10, %c0 : index
    %12 = arith.subi %c-1, %10 : index                                                      %13 = arith.select %11, %12, %10 : index
    %14 = arith.divsi %13, %c3 : index
    %15 = arith.subi %c-1, %14 : index
    %16 = arith.select %11, %15, %14 : index
    %17 = arith.muli %16, %c3 : index
    %18 = arith.addi %6, %17 : index
    %19 = arith.remsi %0, %c3 : index
    %20 = arith.cmpi slt, %19, %c0 : index
    %21 = arith.addi %19, %c3 : index
    %22 = arith.select %20, %21, %19 : index
    %23 = arith.muli %22, %c9 : index
    %24 = arith.addi %18, %23 : index
    return %24 : index
}

Note how this code has a bunch of conditional logic for handling the case of the arguments to mod or division being negative when we can infer, from the fact that the map is being run on a non-negative value, that those conditions are statically false. (Note that it won't always make sense to drop those checks - we might have put a d0 - 1 somewhere in there to handle padding, for example)

Ideally (and -fold-inferred-constants -canonicalize -unsigned-when-equivalent will get us there) we want this simplified to

func @example() -> index {
    %c3 = arith.constant 3 : index
    %c9 = arith.constant 9 : index
    %0 = miopen.block_id
    %1 = arith.divui %0, %c9 : index
    %2 = arith.remui %0, %c9 : index
    %3 = arith.divui %2, %c3 : index
    %4 = arith.muli %3, %c3 : index
    %5 = arith.addi %0, %4 : index
    %6 = arith.remsi %0, %c3 : index
    %7 = arith.muli %6, %c9 : index
    %8 = arith.addi %5, %7 : index
    return %8 : index
}

This would make the IR we send to LLVM cleaner and enable certain optimizations (for example, when we switched some hardcoded remsi ..., %c64 to remui ..., %c64 in our generated code, LLVM could swap to bitwise arithmetic instead of actually emitting remainder instructions).

mlir/include/mlir/Dialect/Arithmetic/Analysis/IntRangeAnalysis.h
11

There's nothing stopping this from being somewhere more general, but I think that, because the passes that use all this are in arith, as are most of the implementors, having it in arith makes sense from a namespacing angle.

25

This is in a header because it'll be used by multiple passes - see https://reviews.llvm.org/D124024

mlir/include/mlir/Dialect/Arithmetic/IR/InferIntRangeInterface.h
30

I initially had that simplified model but found it hard to work with, especially since, unlike in LLVM, we don't have things like nuw/nsw.

Do you think it'd make sense for us to use this ConstantRange? I'll note its semantics are unclear, and I can't quite work out how they stuck both signed and unsigned ranges in there.

I could flip these IntegerAttrs to Optional<APInt>s - that's probably a better type.

mlir/include/mlir/Dialect/Arithmetic/IR/InferIntRangeInterface.td
45

What kind of callback? And I agree that this is missing a way to handle region arguments that aren't things like iter_args(), hence the changes to the data flow framework so I could add visitNonControlFlowArguments() in the analysis pass.

Do you see cases where we'd want to do interesting things with regions that aren't LoopLikeInterfaces or if/br-like?

mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.h
30

Should that patch be also disjoint from https://reviews.llvm.org/D124022 ?

Just wanted to follow up on my questions about next steps

Mogball added inline comments.Apr 28 2022, 9:17 AM
mlir/include/mlir/Dialect/Arithmetic/Analysis/IntRangeAnalysis.h
11

I don't have a strong opinion on the namespace other than "it should eventually not be in arith" but I don't think the pass and analysis should be located under mlir/Dialect/Arithmetic, since that's tied too closely to the dialect

mlir/include/mlir/Dialect/Arithmetic/IR/InferIntRangeInterface.h
76–83

These functions are more of an implementation detail of integer range analysis which I don't think should even be exposed publicly.

mlir/lib/Dialect/Arithmetic/Transforms/FoldInferredConstants.cpp
41

There's nothing about this pass that is specific to the arithmetic dialect.

krzysz00 marked an inline comment as not done.Apr 28 2022, 11:06 AM
krzysz00 added inline comments.
mlir/include/mlir/Dialect/Arithmetic/Analysis/IntRangeAnalysis.h
11

On the other hand, ConstantOp is under Arithmetic ...

If y'all're happy with this all going into mlir/Analysis and mlir/Transforms and the main MLIR namespace, that's fine by me.

I'm just coming from the perspective where the passes and analyses are mostly about arith, and so mlir/Dialect/Arithmetic/ is a logical place to put them

mlir/include/mlir/Dialect/Arithmetic/IR/InferIntRangeInterface.h
76–83

Well, because of how data flow analysis work, these functions have to be methods on the struct.

And because there will be multiple passes that use this analysis, the struct in question has to be declared in a header.

And I'm not aware of how to put a private header together - I'd bo open to IntRangeAttrs being hidden from the public API, but I've got no idea how to do that.

Mogball added inline comments.Apr 28 2022, 2:52 PM
mlir/include/mlir/Dialect/Arithmetic/Analysis/IntRangeAnalysis.h
11

Yeah that's fine with me. SCCP is under mlir/Transforms as well

@rriddle Looking to follow up for more clarity about

  • What sort of increased generality you're wanting to see in this patch
  • What a test for the lattice functionality should look like (unit test? adding some sort of test pass?)
  • What the callback-based version of InferIntRangeInterface you alluded to might look like
krzysz00 updated this revision to Diff 426780.May 3 2022, 11:00 AM

Move from integer attrs to Optional<APInt>

So to keep the ball rolling on this one, I will pretend to be river and answer your questions :P

  1. Just move things out of the arithmetic dialect and probably drop the arith namespace (mlir/Transforms). The analysis itself could go inside the source file like SCCPAnalysis.
  2. Add 1 or 2 ops to the test dialect that, when combined with control-flow from scf or cf, cover all the "interesting" cases, i.e. integers ranges being overdefined, defined (and a range), and defined (collapsed to a constant).
  3. The callback would be something like
void getIntegerRanges(ArrayRef<IntRangeAttrs> operandRanges, function_ref<void(Value, IntRangeAttrs)> setValueRange);

Which an implementing op could call on its results and any region arguments. This way you can implement visitNonControlFlowArguments using it too.

Basically +1 to everything Jeff said. (It also seems like a lot of the comments are still unresolved?)

mlir/include/mlir/Dialect/Arithmetic/IR/InferIntRangeInterface.h
30

How does LLVM implement utilize ConstantRange then? Ideally I think we should try to avoid having two parallel ranges, given that now we have to implement things operating on both.

mlir/lib/Dialect/Arithmetic/Analysis/IntRangeAnalysis.cpp
30

Unresolved? Prefer using namespace blah; over namespace blah { in cpp files.

59–60

Unresolved?

94–95

Unresolved?

mlir/lib/Dialect/Arithmetic/IR/InferIntRangeInterface.cpp
16

Same comments about namespaces here.

mlir/lib/Dialect/Arithmetic/Transforms/FoldInferredConstants.cpp
25

Same comments here about static functions, and namespaces.

krzysz00 updated this revision to Diff 428161.May 9 2022, 12:28 PM
krzysz00 edited the summary of this revision. (Show Details)

Move things around to address some review comments

krzysz00 marked 13 inline comments as done.May 9 2022, 12:54 PM

Ok, so, to update the actions I've taken

  1. I moved InferIntRangeInterface into Interfaces/
  2. I moved IntRangeAnalysis into Analysis/. Having re-read y'all's comments on the topic, the claim I'm seeing is that, instead of directly exposing the fact that the integer range analysis is a dataflow analysis, there should be a wrapper around it that provides a more limited API. That makes sense, and I'll get on that next.
  3. I left -fold-inferred-constants under arith because
    • It specifically creates arith.constant ops
    • It's not clear how it could be generalized past that
    • It looks like we're not supposed to make MLIRTransforms depend on any dialect in particular

If y'all have notes on how to make fold-inferred-constants dialect-independent, I'm all ears.

I haven't broken -fold-inferred-constants into its own commit yet, but I can get on that.

I'm also fixing to add some unit tests for IntRangeAttrs later.

krzysz00 updated this revision to Diff 428185.May 9 2022, 1:12 PM

Expose fewer details of IntRangeAnalysis

krzysz00 updated this revision to Diff 428189.May 9 2022, 1:26 PM

Fix building

SCCP will defer to Dialect::materializeConstant. You could do the same with your pass.

Mostly LGTM, just missing some tests

mlir/include/mlir/Interfaces/InferIntRangeInterface.h
18 ↗(On Diff #428189)

Put the includes together

30 ↗(On Diff #428189)

These should be placed at the bottom of the class

33 ↗(On Diff #428189)

Brief comments on the default constructor and other constructors?

mlir/lib/Analysis/IntRangeAnalysis.cpp
24 ↗(On Diff #428189)

Typo

25 ↗(On Diff #428189)
40 ↗(On Diff #428189)

Operand lattices should never be uninitialized when an operation is visited.

106 ↗(On Diff #428189)

ForwardDataFlowAnalysis will assert if any result lattices are uninitialized, so this check is not necessary

117 ↗(On Diff #428189)

Please expand the name to result

125 ↗(On Diff #428189)

Please expand mb

169 ↗(On Diff #428189)

Operand lattices should never be uninitialized when an operation is visited.

172 ↗(On Diff #428189)
191 ↗(On Diff #428189)

This line is a no-op because markPessimisticFixpoint will overwrite the optimistic value

219 ↗(On Diff #428189)

If you need to "undo" the correction, why not take it out of the helper function and perform it here (with an inverted condition)

mlir/lib/Dialect/Arithmetic/Transforms/FoldInferredConstants.cpp
30

I would call the dialect constant materializer, either the dialect of the defining op or the parent op if it's a block argument, then the pass would be totally dialect agnostic. You could also use a folder instance to unique constants as you go

rriddle added inline comments.May 10 2022, 10:15 AM
mlir/include/mlir/Analysis/IntRangeAnalysis.h
67 ↗(On Diff #428189)

Can you use a std::unique_ptr here (i.e. use the pImpl technique) and move IntRangeAnalysisImpl to the cpp file? We shouldn't be exposing that at all within the header file.

mlir/include/mlir/Interfaces/InferIntRangeInterface.h
18 ↗(On Diff #428189)

+1. Also system includes should come last.

krzysz00 updated this revision to Diff 428506.May 10 2022, 2:56 PM
krzysz00 marked 12 inline comments as done.

Address comments on code; tests will be added tomorrow-ish

Thanks for going over all this closely, y'all!

Mogball added inline comments.May 11 2022, 4:08 AM
mlir/lib/Analysis/IntRangeAnalysis.cpp
236 ↗(On Diff #428506)

Add braces because if the comments

mlir/lib/Transforms/FoldInferredConstants.cpp
18 ↗(On Diff #428506)

MLIR is not entirely consistent with the namespace end comment but the preferred way is as such

mlir/include/mlir/Analysis/IntRangeAnalysis.h
25 ↗(On Diff #428506)
42 ↗(On Diff #428506)
mlir/include/mlir/Interfaces/InferIntRangeInterface.h
100 ↗(On Diff #428506)
mlir/lib/Analysis/IntRangeAnalysis.cpp
57–59 ↗(On Diff #428506)
mlir/lib/Transforms/FoldInferredConstants.cpp
100 ↗(On Diff #428506)
krzysz00 updated this revision to Diff 428801.May 11 2022, 4:15 PM
krzysz00 marked 2 inline comments as done.

Add tests, fix typos

Mogball accepted this revision.May 11 2022, 8:35 PM

LGTM!

Could you add one test op that sets the ranges of its block arguments and test that it works correctly?

mlir/lib/Analysis/IntRangeAnalysis.cpp
24–25 ↗(On Diff #428801)

This class can just go in the anonymous namespace

105 ↗(On Diff #428801)

The operand lattices should never be uninitialized.

148 ↗(On Diff #428801)

braces

169 ↗(On Diff #428801)

braces

222 ↗(On Diff #428801)

braces

234 ↗(On Diff #428801)

if any of the branches have braces, all of them should

mlir/test/lib/Dialect/Test/TestDialect.cpp
1411 ↗(On Diff #428801)
mlir/test/lib/Dialect/Test/TestOps.td
2883–2886 ↗(On Diff #428801)
2907–2911 ↗(On Diff #428801)
mlir/unittests/Interfaces/InferIntRangeInterfaceTest.cpp
1–2 ↗(On Diff #428801)

what happened here lol

rriddle requested changes to this revision.May 11 2022, 8:39 PM

Please wait for my approval before landing. I'd like to go over this more.

This revision now requires changes to proceed.May 11 2022, 8:39 PM
rriddle added inline comments.May 11 2022, 8:41 PM
mlir/include/mlir/Dialect/Arithmetic/IR/InferIntRangeInterface.h
30

Unresolved?

krzysz00 updated this revision to Diff 428996.May 12 2022, 10:27 AM
krzysz00 marked 7 inline comments as done.

Fix braces, other such - add unit test for branches.

( @rriddle To bring the signed/unsigned/ConstantRange discussion out of inline comments)

I've got a few reasons I'd say it's reasonable to track the signed and unsigned interpretations of values separately, in contrast with LLVM's approach.

  1. The LLVM approach appears to rely on having a bunch of flow-sensitive reasoning that generates things like nsw in order to make their analysis more precise. We don't have, and probably don't want to recreate, all that infrastructure, especially with how extensible MLIR is. So, we need another avenue to get analysis precision.
  2. Having signed/unsigned bounds like this makes it easier to reason about what the range inference implementations are doing, especially when dealing with mixtures of signed and unsigned ops (like divui or remsi). Trying to precisely get bounds out of mixed code like that is tricky.
  3. In most cases, we don't have double computation - the three ops that really have to do duplicate reasoning about the signed and unsigned result are addition, subtraction, and multiplication. Just about everything else can call fromSigned or fromUnsigned after computing one set of bounds to copy the bounds over if that's a valid thing to do.
  4. Initial attempts at a one-range system were a confusing mess and I'm not looking to fight through making sure that's implemented correctly.
mlir/lib/Analysis/IntRangeAnalysis.cpp
24–25 ↗(On Diff #428801)

Doesn't it need to have a defined namespace so it can be referenced in a header?

mlir/unittests/Interfaces/InferIntRangeInterfaceTest.cpp
1–2 ↗(On Diff #428801)

clang-format

rriddle requested changes to this revision.May 12 2022, 10:23 PM

( @rriddle To bring the signed/unsigned/ConstantRange discussion out of inline comments)

I've got a few reasons I'd say it's reasonable to track the signed and unsigned interpretations of values separately, in contrast with LLVM's approach.

  1. The LLVM approach appears to rely on having a bunch of flow-sensitive reasoning that generates things like nsw in order to make their analysis more precise. We don't have, and probably don't want to recreate, all that infrastructure, especially with how extensible MLIR is. So, we need another avenue to get analysis precision.
  2. Having signed/unsigned bounds like this makes it easier to reason about what the range inference implementations are doing, especially when dealing with mixtures of signed and unsigned ops (like divui or remsi). Trying to precisely get bounds out of mixed code like that is tricky.
  3. In most cases, we don't have double computation - the three ops that really have to do duplicate reasoning about the signed and unsigned result are addition, subtraction, and multiplication. Just about everything else can call fromSigned or fromUnsigned after computing one set of bounds to copy the bounds over if that's a valid thing to do.
  4. Initial attempts at a one-range system were a confusing mess and I'm not looking to fight through making sure that's implemented correctly.

I think this is underestimating some of the effects resulting from having both signed and unsigned:

  • It seems like in the implementation of the interfaces for the arithmetic operations you are ignoring known information

E.g. if a signed operation has inputs that have known unsigned ranges, why shouldn't we try to preserve and utilize that information when possible?

  • The duplication is not just about adding the interface to an op, but also when using the analysis

Bouncing off of the previous point, if a user doesn't want to ignore known information they will inevitably have to check both anyways.

  • You aren't going to be able to get away from nsw and related flags

The arithmetic operations don't have those at this point, but that doesn't mean they won't grow them moving forward (the current modeling is not all encompassing). We also will want this analysis to support cases like the LLVM dialect, which do encode that information already.

I have some serious concerns about using multiple ranges, and I'd like to fully understand why a single range ended up being "a confusing mess". Especially given that LLVM has utilized it successfully, and hasn't revolted/reverted into a multi-range system.

This revision now requires changes to proceed.May 12 2022, 10:23 PM

It looks like, at the very least, I misread the comments on ConstantRange. Note that, for example, computeConstantRange at https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Analysis/ValueTracking.h#L548 has a forSigned parameter, and that the scalar evolution code has an enum requiring you to specify if you're computing the range for signed or unsigned values and corresponding utility functions getUnsignedRange() and getSignedRange().

This means my point 4, about things being a huge mess, doesn't apply to LLVM - what I initially tried to do is to use the same pair of bounds for both signed and unsigned values without any context about what I wanted, which I don't think is something that can be done precisely ... which is why LLVM doesn't do it either.

And to your points about ignoring information - that really shouldn't happen. The fromSigned and fromUnsigned constructors (which is what will be used when a particular operation has a signedness in its interpretation) will copy the bounds to the other range if possible. For example, if I say fromSigned(0, 15), I'll get the ranges unsigned = [0, 15], signed = [0, 15], but if I say fromSigned(-1, 1), I'll get unsigned = [-inf, +inf], signed = [-1, 1], because -1 <= x <= 1 doesn't correspond to a contiguous set of bit patterns from the unsigned perspective.

So, if you have a signed operation, it will use information about unsigned ranges implicitly - that's where fromUnsigned comes in. (In fact, if the unsigned view of that input has an upper bound <= int_max(type), the signed operation will know that argument is non-negative).

And so, if we used one range, we'd need to do what LLVM does and thread whether we want to know about each Value from a signed or unsigned point of view through the entire analysis. It seems more reasonable to me to just keep two ranges instead.

@rriddle I hope the above comment makes some sense

rriddle added inline comments.May 17 2022, 2:10 AM
mlir/include/mlir/Analysis/IntRangeAnalysis.h
30–31 ↗(On Diff #428996)
33–35 ↗(On Diff #428996)

Analyses are expected to be constructed from the operation, please change this to a constructor.

36 ↗(On Diff #428996)
mlir/include/mlir/Interfaces/InferIntRangeInterface.h
22 ↗(On Diff #428996)

This doesn't look necessary.

28 ↗(On Diff #428996)

The name of this is a bit weird, given that there aren't any attribute. Can we use ConstantIntRange instead?

32 ↗(On Diff #428996)

Please add newlines between methods, applies throughout the commit.

39 ↗(On Diff #428996)

Please drop trivial braces.

45 ↗(On Diff #428996)

Where are these tuples being built? Can we drop this constructor?

71 ↗(On Diff #428996)

Can we just pass in the values directly? Why go through tuple?

Same for the other method.

98 ↗(On Diff #428996)

Why do we need to use optional here? Do we need the default constructor? If this is just for the analysis, we could wrap this class with an internal one that isn't outwardly exposed.

101 ↗(On Diff #428996)
mlir/lib/Analysis/IntRangeAnalysis.cpp
53 ↗(On Diff #428996)

Can you complete this comment?

mlir/test/lib/Dialect/Test/TestOps.td
2889–2891 ↗(On Diff #428996)
2912–2914 ↗(On Diff #428996)
2926–2928 ↗(On Diff #428996)

Thanks for the comments on the range representation and catching the pointless complexity introduced by Optional<>

mlir/include/mlir/Interfaces/InferIntRangeInterface.h
45 ↗(On Diff #428996)

These tuples arise from methods in the implementation logic for arith that return various [min, max] ranges as a tuple. They're useful ergonomic constructors, IMO.

71 ↗(On Diff #428996)

Utility methods that use tuples to return multiple values.

98 ↗(On Diff #428996)

Having thought about it, we don't _need_ optional - ConstantRange down in LLVM doesn't need it. The case of unbounded ranges can be handled by [0, uint_max(width)] / [int_min(width), int_max(width)], respectively, and we could use width 0 APInts to handle things that aren't integers so we have a meaningful null value.

That would all, now that I think about it, simplify things substantially in many parts of the implementation at the cost of a few minor ergonomic annoyances. I'll make it happen.

krzysz00 updated this revision to Diff 430446.May 18 2022, 10:38 AM
krzysz00 marked 8 inline comments as done.

Make the ints in ranges not optional, since that's not needed.

(submitting comments)

@rriddle Do you have any other comments?

Nice, this is getting really close. Great work.

mlir/include/mlir/Analysis/IntRangeAnalysis.h
1 ↗(On Diff #430446)

This looks a bit off.

18–20 ↗(On Diff #430446)

Are all of these necessary?

31–35 ↗(On Diff #430446)
mlir/include/mlir/Interfaces/InferIntRangeInterface.h
21 ↗(On Diff #430446)

attributes? This comment looks out of date.

51 ↗(On Diff #430446)
61 ↗(On Diff #430446)
63–65 ↗(On Diff #430446)

Is this only for the pessimistic value state stuff? Could we just have a wrapper value for the analysis, and keep ConstantIntRanges clean from this? notAnInteger feels quite weird as an interface for a ConstantIntRange.

77–78 ↗(On Diff #430446)

With above, can we please drop the tuple API until we need it?

93–95 ↗(On Diff #430446)

Related to the comment above, if we used a wrapper class in the analysis we wouldn't need this to be public exposed.

107 ↗(On Diff #430446)

nit: Can you hoist this near the constructors?

116 ↗(On Diff #430446)

Can you document this?

45 ↗(On Diff #428996)

Can we remove these until we need them then, I'm still not convinced yet of the value. The value may be more clear when there are uses, in which case they can be readded.

mlir/include/mlir/Transforms/Passes.h
54 ↗(On Diff #430446)
mlir/lib/Transforms/FoldInferredConstants.cpp
1–2 ↗(On Diff #430446)

This is a bit off.

88–97 ↗(On Diff #430446)

SCCP in LLVM represents a range within the lattice, is that something we could do here (i.e. merge the necessary things into SCCP)? It would remove the need to have a duplicate pass with similar goals to SCCP.

krzysz00 updated this revision to Diff 431743.May 24 2022, 12:02 PM
krzysz00 marked 19 inline comments as done.

Address review comments, move join/pessimistic stuff into wrapper class

Thanks, and I'm glad to hear this is all moving in a good direction

mlir/include/mlir/Interfaces/InferIntRangeInterface.h
63–65 ↗(On Diff #430446)

This becomes important for arith.constant and any other op we can think of whose return type might or might not be an integer. I think it'll improve code clarity if we keep this around as a helper method.

mlir/lib/Transforms/FoldInferredConstants.cpp
88–97 ↗(On Diff #430446)

I agree that we might be able to merge this and SCCP, but I'd rather do that in a future commit once we've roped in the SCCP folks on here to make sure we've handled their use cases (ex - do they handle vector constants? We don't)

(and I did move the tuple methods over to the commit that defines InferIntRangeInterface for arithmetic)

rriddle added inline comments.May 27 2022, 12:11 AM
mlir/include/mlir/Interfaces/InferIntRangeInterface.h
63–65 ↗(On Diff #430446)

If the constant isn't an integer, can we just bail out? It seems like something to be handled outside of this class. For the interface I assume we would either:

  • Not invoke the callback function
  • Change the callback function to take an Optional<> and invoke it with None

What happens in the situations where an operation has one result that can have ranges inferred and one that doesn't? I assume we just wouldn't provide information for the result we can't infer, or am I wrong here? The case where constant doesn't return an int should fall into that same case (this isn't an int, so I'm not giving you anything).

mlir/lib/Transforms/FoldInferredConstants.cpp
88–97 ↗(On Diff #430446)

Can we move this to being a test pass then? The premise of this pass is identical to SCCP, and I'd rather work to merge the necessary functionality into there than to rely on having a separate pass.

krzysz00 updated this revision to Diff 433115.May 31 2022, 9:41 AM
krzysz00 retitled this revision from [mlir][Arith] Add integer range inference analysis to [mlir] Add integer range inference analysis.
krzysz00 edited the summary of this revision. (Show Details)

Integrate fold-inferred-constants into SCCP

krzysz00 updated this revision to Diff 433146.May 31 2022, 11:13 AM
krzysz00 edited the summary of this revision. (Show Details)

Fix bug in dataflow analysis uncovered while testing arithmetic range implementations.

As to SCCP, I've gone and moved the invocation of the integer range analysis into SCCP. I couldn't merge it with the existing SCCP analysis due to their differences in how they detect loop-variant variables (the range-less SCCP was more conservative about it), but I did at least put both analyses in one pass.

I hope this change satisfies the objections you had.

(I also ran into a bug in how data flow analysis was handling terminators for stuff like scf.condition - this has now been fixed here.)

mlir/include/mlir/Interfaces/InferIntRangeInterface.h
63–65 ↗(On Diff #430446)

I've fixed the analysis so that failure to call the callback gives you [min(type), max(type)] and therefore removed notAnInteger(). Good call on how to clean all this up!

I will admit I'm not entirely happy with the SCCP integration - if you see any ways I could tighten it up, please do let me know.

I will admit I'm not entirely happy with the SCCP integration - if you see any ways I could tighten it up, please do let me know.

Ah sorry, didn't mean to imply you needed to do that in this patch. Can you split the SCCP out into a separate patch? Using a test pass for now to test the analysis? I'd like to get the base stuff landed, and I don't think that needs to be anchored on figuring out SCCP evolution.

I do want this functionality downstream for non-test reasons, though.

If it makes more sense, I can make -fold-inferred-constants a test pass in
the MLIR codebase and then temporarily pull it out of test/ in our fork
when I'm backporting this patch series. Would that work?

Or, if you don't think the SCCP evolution will be too contentious, I can
just split that into it's own patch and add it to the series - another few
days to make sure we get this right aren't the end of the world.

I do want this functionality downstream for non-test reasons, though.

If it makes more sense, I can make -fold-inferred-constants a test pass in
the MLIR codebase and then temporarily pull it out of test/ in our fork
when I'm backporting this patch series. Would that work?

Or, if you don't think the SCCP evolution will be too contentious, I can
just split that into it's own patch and add it to the series - another few
days to make sure we get this right aren't the end of the world.

Moving to test upstream, and then having a non-test pass downstream for now
would likely be the easiest. Given that you'll still be able to benefit from the work
you've done regardless of what we do with the SCCP integration.

krzysz00 updated this revision to Diff 433828.Jun 2 2022, 11:54 AM
krzysz00 edited the summary of this revision. (Show Details)

Move inference-based folding to -test-int-range-inference

krzysz00 updated this revision to Diff 433834.Jun 2 2022, 12:05 PM

Rebase to more recent main, clean up remaining SCCP change

rriddle accepted this revision.Jun 2 2022, 12:05 PM

Thanks! I really appreciate the patience during the review. This is really great work.

mlir/test/lib/Transforms/TestIntRangeInference.cpp
8 ↗(On Diff #433828)

nit: Please drop usernames from TODO messages.

This revision is now accepted and ready to land.Jun 2 2022, 12:05 PM
krzysz00 updated this revision to Diff 433846.Jun 2 2022, 12:43 PM

Adress final nit

River, Jeff, thank you so much for your thorough feedback and review!
This code wouldn't be as good as it is without y'all.

This revision was automatically updated to reflect the committed changes.

I revert, this broke the bot: https://lab.llvm.org/buildbot/#/builders/61/builds/27314

(you should have got an email)

I did get an email but the email only mentioned the warning, not the error.

I worked out the problem and put together a fix that I'll send in tomorrow.

Mogball added inline comments.Jun 7 2022, 1:20 PM
mlir/lib/Analysis/DataFlowAnalysis.cpp
579 ↗(On Diff #433861)

https://github.com/llvm/llvm-project/issues/55873 was introduced by this code block here. I'm investigating it at the moment