This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Add NumberOfExecutions analysis + update RegionBranchOpInterface interface to query number of region invocations
ClosedPublic

Authored by ezhulenev on Nov 6 2020, 3:02 AM.

Diff Detail

Event Timeline

ezhulenev created this revision.Nov 6 2020, 3:02 AM
ezhulenev requested review of this revision.Nov 6 2020, 3:02 AM
ezhulenev edited the summary of this revision. (Show Details)Nov 6 2020, 3:04 AM
ezhulenev added reviewers: silvas, mehdi_amini, ftynse.
ezhulenev edited the summary of this revision. (Show Details)Nov 6 2020, 3:08 AM
ezhulenev updated this revision to Diff 303402.Nov 6 2020, 4:11 AM

Follow block parents to compute number of block executions + compute number of executions for ForOp with constant bounds

ezhulenev edited the summary of this revision. (Show Details)Nov 6 2020, 4:12 AM
ftynse added a subscriber: wsmoses.Nov 6 2020, 4:17 AM
ezhulenev retitled this revision from [mlir] Add NumberOfExecutions analysis + RegionInvocationsOpInterface to controlflow interfaces to [mlir] Add NumberOfExecutions analysis + update RegionBranchOpInterface interface to query number of region invocations.Nov 6 2020, 4:23 AM
ftynse added a comment.Nov 6 2020, 4:47 AM

Very nice!

I have a bunch of nits and several generalization suggestions.

mlir/include/mlir/Analysis/NumberOfExecutions.h
38

Nit: explicit?

69

There doesn't seem to be an accessor for this field, so IMO it can be dropped (unless there is an anticipated need for it). At which point, we can also consider using BlockNumberOfExecutionsInfo = Optional<int64_t>, which can be nicely constructed from an int64_t or from llvm::None.

mlir/include/mlir/Dialect/Async/IR/AsyncOps.td
86–88

It should be possible to do DeclareOpInterfaceMethods<RegionBranchOpInterface, ["getNumRegionInvocations"]> in the def and avoid this explicit declaration.

mlir/include/mlir/Interfaces/ControlFlowInterfaces.td
135

Nit: I'd be explicit and say something like "Populates countPerRegion with the number..."

137

Nit: could we have a named constant instead of -1?

149

Nit: countPerRegion.resize(numRegions, -1) avoids the loop

mlir/lib/Analysis/NumberOfExecutions.cpp
45–46

I wonder if this can be generalized to work on all constant-like operations. Currently, we only have a ConstantLike trait, so we cannot get the constant value as attribute without knowing the concrete op type. Maybe we can turn ConstantLike into an interface with value being an interface method. Out of scope for this commit probably.

53

Could we rather query the op? Hardcoding FuncOp makes this analysis non-applicable to custom function types (we have some, e.g., in LLVM dialect and in Flang).

58

Let's also be future-proof and guard against graph regions (https://mlir.llvm.org/docs/LangRef/#graph-regions). They don't have blocks (yet), but if they do, the logic of following block successors below will not necessarily apply. An assertion could do.

66

Nit: we usually don't declare function stack variables as const in MLIR

71

Nit: I would have just used auto and kept the original lambda type here, it's not like we are passing it around.

mlir/lib/Dialect/SCF/SCF.cpp
303–306

Could we rather extend the one we already have in MathExtras.h to support negative RHS?

ezhulenev updated this revision to Diff 303415.Nov 6 2020, 5:29 AM
ezhulenev marked 9 inline comments as done.

Address PR comments

mlir/include/mlir/Analysis/NumberOfExecutions.h
69

I added a custom non-default constructible type to catch accidental default constructions of None instead of a failure. And maybe more advanced analysis can add some other fields in the future.

Added this to class documentation for the next reader and removed unused Block*.

mlir/include/mlir/Dialect/Async/IR/AsyncOps.td
86–88

Cool, learning something new every day :)

mlir/lib/Analysis/NumberOfExecutions.cpp
45–46

Yeah, that would be useful also in the context of TF graphs with tf.Const ops, but I did not find any relevant examples and chose the easiest path.

53

It guess in can just implement RegionBranchOpInterface, I decided to special case it because it's "built in" and not defined in any of the dialect .td files. I expect all other FuncOpLike things to implement RegionBranchOpInterface.

71

Unfortunately lambdas do not support recursion without ugly tricks.

ftynse accepted this revision.Nov 6 2020, 6:25 AM

LGTM. Please wait for one more reviewer to double-check the logic according to the forum discussion.

mlir/include/mlir/Analysis/NumberOfExecutions.h
69

Ack. (Phabricator doesn't let anybody other than the author to mark comments as done).

mlir/lib/Analysis/NumberOfExecutions.cpp
53

ODS is not mandatory to implement an interface, ops defined in C++ can derive Traits<InterfaceName::Trait> and get the same functionality.

I'm fine with your current implementation given the explanation provided. We should look into having FuncOp implement the interface separately and consult with @rriddle about future plans regarding it being built-in.

This revision is now accepted and ready to land.Nov 6 2020, 6:25 AM
ezhulenev marked 2 inline comments as done.Nov 6 2020, 8:43 AM
ezhulenev added inline comments.
mlir/lib/Analysis/NumberOfExecutions.cpp
53

Yeah, I know, I just don't know is it "ok" to add it to built in type :) Assuming FuncOp is as fundamental as Region I think it's ok to have a special case, but I'd like to hear opinion from someone with a better judgement.

mehdi_amini added inline comments.Nov 6 2020, 8:56 AM
mlir/lib/Analysis/NumberOfExecutions.cpp
53

FuncOp isn't as fundamental as Region, I rather treat FuncOp as any other op.

For example LLVM, SPIRV, GPU dialects aren't using FuncOp.

ezhulenev added inline comments.Nov 6 2020, 9:10 AM
mlir/lib/Analysis/NumberOfExecutions.cpp
53

I've looked into adding RegionBranchOpInterface to FuncOp`, but it feels weird because the FuncOp does not execute the region, only the CallOp does.

For example: getSuccessorEntryOperands: "Returns the operands of this operation used as the entry arguments", but the FuncOp is OpTrait::ZeroOperands.

Maybe add a special trait, something like OpTrait::InvokesRegionsOnce, an additional piece of information just like OpTrait::OneRegion.

Or just rely on OpTrait::FunctionLike here, assuming that all functions execute attached region once.

ezhulenev updated this revision to Diff 303488.Nov 6 2020, 9:46 AM

Use FunctionLike trait to to get number of region invocations

rriddle requested changes to this revision.Nov 6 2020, 11:07 AM
rriddle added inline comments.
mlir/lib/Analysis/NumberOfExecutions.cpp
14

llvm includes go after MLIR includes.

19

Is the dependency on standard ops necessary?

23

Please trim some of these includes, several are unnecessary.

36

Missing static here?

66

This is not correct and seems like it would result in incorrect usages of the analysis. If you want to estimate function executions, you'll need to use the callgraph.

75

Drop trivial braces.

122

Nit: Flip the condition of this if, and remove the trivial braces.

if (!...)
  blockInfo.insert
mlir/lib/Dialect/Async/IR/Async.cpp
125

nit: Seems like push_back(1) is better than an explicit resize and assignment.

mlir/lib/Interfaces/ControlFlowInterfaces.cpp
78

Prefer explicit qualification, i.e. mlir::kUnknownNumRegionInvocations

This revision now requires changes to proceed.Nov 6 2020, 11:07 AM
rriddle added inline comments.Nov 6 2020, 11:08 AM
mlir/lib/Analysis/NumberOfExecutions.cpp
66

Why should functions be default to executed once? That doesn't seem like a conservatively safe default.

silvas added inline comments.Nov 6 2020, 12:48 PM
mlir/lib/Analysis/NumberOfExecutions.cpp
66

I agree, there seems to be a code smell here.

One solution is for queries to be of the form Optional<int64_t> getNumberOfExecutions(Block *block, Region *assumingThisRegionIsEnteredOnce).

So the analysis could hold a DenseMap<Block *, NumberOfExecutionsAssumingDirectParentRegionEnteredOnce>. Then on a query, you would walk up parents, multiplying NumberOfExecutionsAssumingDirectParentRegionEnteredOnce at each step until we reach assumingThisRegionIsEnteredOnce (and until we implement a call graph analysis, we will return None if assumingThisRegionIsEnteredOnce is not an ancestor of block)

And the body of a module should probably be an illegal Region to pass as assumingThisRegionIsEnteredOnce, because it is a "declarative region" and never "executes" (we might need a new IR property to represent this?)

test-print-number-of-executions would have some hardcoded policy to print values assuming assumingThisRegionIsEnteredOnce == enclosingFunc.getBody()

ezhulenev updated this revision to Diff 303528.Nov 6 2020, 12:52 PM
ezhulenev marked 8 inline comments as done.

Address PR comments

mlir/lib/Analysis/NumberOfExecutions.cpp
19

It's needed for dyn_cast<ConstantOp>

66

This is roughly ~= "when a function called it executes its entry block once". And this seems like a reasonable assumption. This is not about how many call operations are in the graph.

silvas added inline comments.Nov 6 2020, 12:59 PM
mlir/lib/Analysis/NumberOfExecutions.cpp
66

what about a function like

def f(i):
  print("here") # How many executions will this be assigned?
  if i > 0:
    f(i - 1)

I think we need to be more precise about what we mean by "execution" here.

Also, does this pass correctly handle a case like?

def g(i):
  if i == 0:
    abort()
  print("here") # How many executions?

At the moment it is somewhat unclear at what scope this analysis is intended to be constructed and used. It seems like this currently doesn't handle things related to the callgraph, doesn't handle control flow well, and assumes that it only operates locally within a single function? I'm slightly concerned that we are building an analysis that is very tied with the constraints on async tokens as determined by D90716, without much reusability outside of those constraints. Along with Sean's point, I don't think the concept of "Execution" is very well scoped here and leads to some weird code smells with functions/modules/etc.

mlir/include/mlir/Dialect/SCF/SCFOps.td
201

Replace -1 here with kUnknownNumRegionInvocations.

mlir/lib/Analysis/NumberOfExecutions.cpp
54

Use m_Constant here instead, we should avoid hardcoding operation types whenever possible.

ezhulenev updated this revision to Diff 303564.Nov 6 2020, 3:06 PM

Use matchPattern to match constants

ezhulenev marked 2 inline comments as done.Nov 6 2020, 3:09 PM

At the moment it is somewhat unclear at what scope this analysis is intended to be constructed and used. It seems like this currently doesn't handle things related to the callgraph ...

I'd say that call graph is irrelevant for this analysis. Call operations "consumes" SSA values and then it is the responsibility of the callee to do the right thing. But I'm speaking from the point of view of async-ref-counting, I'd love to hear what @ftynse can tell about his need for register assignment.

Maybe there is a better terminology then "number of executions" that is less confusing.

mlir/lib/Analysis/NumberOfExecutions.cpp
66

#1. 1 execution of the basic block that contains print operations. Function call is irrelevant here.

#2. Unknown, because it involves dynamic control flow

CFG will look something like this:

^bb0
  %cond = cmp i == 0
  cond_br %cond ^abort, ^print
^abort
  br ^print
^print
def g(i):
  abort()
  print("here") # is a bit more tricky, but let's assume this is an 
                # estimate given that no operations can terminate a program

I think we need to be more precise about what we mean by "execution" here.

"How many times this operation will be executed in this region assuming region invoked N times".

Call operation is about completely different region (different instance of a region).

FWIW Original async ref counting PR uses "number of operation instances in the region"

silvas added inline comments.Nov 6 2020, 3:23 PM
mlir/lib/Analysis/NumberOfExecutions.cpp
66

I see that the current code actually does walk parents in the way I described. Sorry.

I thought somewhat about the definition of "execution" that we mean here. It's really "for *each* time that Region is entered, how many times will given operation be executed". That makes the definition work for my case f above, since it yields getNumberOfExecutions(thePrintOp, f.getBody()) == 1.

Also, because of the possibility of aborting or calling a function that cannot be proven to terminate in the middle of a block, I think it is best for getNumberOfExecutions to take an Operation * instead of a Block, so that cases like this behave as intended, and users don't have to do their own within-a-block check:

def h():
  print("here1") # Number of executions is 1 per entry to h.getBody()
  call @weProvedThatItAbortsOrInfiniteLoops()
  print("here") # Number of executions is 0 per entry to h.getBody()

As such, could we do something like an interface to this analysis that looks like:

Optional<IntegerRange> getNumberOfExecutions(Operation *op, Region *perEntryOfThisRegion)

IntegerRange is just a pair of integers that represent a lower and upper bound on the number of executions, which composes nicely to handle cases where we can't prove that the statement executes at all (like those tricky abort/infinite-loop cases) by having a lower bound of 0 (we could do something more precise than an interval, such as an arbitrary set of integers, but that might get unwieldy).

To handle this correctly, getNumRegionInvocations will need to be wordsmithed to something like "number of times the body will be invoked if the body yields normally (i.e. doesn't abort or invoke an infinite loop)". Then this analysis can do a bottom up traversal of the region tree, doing the op-level analysis within each block looking for possibly aborting/infiniteLooping operations and correctly handling any ops that follow ops that might abort/infiniteLoop, and transitively propagating that up the region tree.

We can make this more precise with interprocedural analysis that does the equivalent of inferring attributes like LLVM's norecurse/willreturn/mustprogress (see https://llvm.org/docs/LangRef.html#function-attributes)

Under this definition, we have:

def f(i):
  print("here") # Number of executions is 1 per entry to f.getBody()
  if i > 0:
    f(i - 1)

def g(i):
  if i == 0:
    abort()
  print("here") # Number of executions is 0 or 1 per entry to g.getBody()

def h():
  print("here1") # Number of executions is 1 per entry to h.getBody()
  call @weProvedThatItAbortsOrInfiniteLoops()
  print("here") # Number of executions is 0 per entry to h.getBody()

def i():
  print("here1") # Number of executions is 1 per entry to i.getBody()
  call @cannotProveItDoesntAbortOrInfiniteLoop()
  print("here") # Number of executions is 0 or 1 per entry to i.getBody()
ezhulenev updated this revision to Diff 303568.Nov 6 2020, 3:35 PM

Reworderd NumberOfExecutions documentation

ezhulenev updated this revision to Diff 303571.Nov 6 2020, 3:40 PM

Update getNumRegionInvocations documentation to mention "yields normally" property

ezhulenev added inline comments.Nov 6 2020, 3:48 PM
mlir/lib/Analysis/NumberOfExecutions.cpp
66

I updated documentation for the interface and NumberOfExecutions to mention "yields normally" and "each time" properties.

I like this API ...

Optional<IntegerRange> getNumberOfExecutions(Operation *op, Region *perEntryOfThisRegion)

... but for now computing the IntegerRange is not really needed and it adds a lot of complexity.

I plan to add API with Optional<int64_t> to represent "known fixed number" vs "everything else"

Optional<int64_t> getNumberOfExecutions(Operation *op, Region *perEntryOfThisRegion);

and assume that all operations are "normal": do not abort, or go into infinite loop == "completed in finite amount of time".

ezhulenev updated this revision to Diff 303668.Nov 7 2020, 1:23 PM

Implment Optional<int64_t> getNumberOfExecutions(Operation *op, Region *perEntryOfThisRegion) API

ezhulenev updated this revision to Diff 303704.Nov 8 2020, 2:51 AM

Check that block lies inside the perEntryOfThisRegion

silvas added inline comments.Nov 9 2020, 12:55 PM
mlir/lib/Analysis/NumberOfExecutions.cpp
66

Thanks.

I think with the new API, we don't need these special cases. FunctionLike already didn't make sense because an op is not "executed" by the module region (we should probably return "unknown" (or an error) for getNumberOfExecutions(someOpInAFunction, module.getBody()). OpTrait::OneRegion doesn't guarantee anything about the number of invocations. E.g. scf::ForOp implements OpTrait::OneRegion.

136

The note that all operates are assumed to not abort/infloop should be in the documentation comment for users of the API; it is not an implementation detail because it makes the analysis incorrect for some potential users.

ezhulenev updated this revision to Diff 303988.Nov 9 2020, 2:26 PM
ezhulenev marked 2 inline comments as done.

Do not special case functions for number of executions computation

mlir/lib/Analysis/NumberOfExecutions.cpp
66

Done. It required few more changes, but the special case for functions is gone.

136

Added as a top-level documentation because it is also relevant to blocks.

silvas accepted this revision.Nov 9 2020, 2:38 PM

Thanks Eugene. All I have left are nits. This LGTM.

We probably want to wait on River to give final sign-off. In particular surrounding this analysis not being conservative w.r.t. potentially aborting/nonterminating computations.

mlir/include/mlir/Support/MathExtras.h
32

nit: do we have a unittest you could extend? (or add? :) )

ezhulenev updated this revision to Diff 303995.Nov 9 2020, 2:57 PM
ezhulenev marked an inline comment as done.

Add a unit test for Support/MathExtras

rriddle accepted this revision.Nov 10 2020, 11:06 PM

The new wording makes more sense to me and clears the confusion I had previously, thanks! LGTM after the remaining comments.

mlir/include/mlir/Analysis/NumberOfExecutions.h
38

nit: in a finite amount

98

nit: Use /// for comments here.

mlir/lib/Analysis/NumberOfExecutions.cpp
16

Can this function include be removed?

39

This seems like it would break when a graph region is nested within a CFG region. Can we avoid the direct assert here and just use kUnknownNumRegionInvocations instead? I could see an assert on the top level operation, but is there a reason to assert on nested operations?

51–60
129

nit: in a finite amount of time

209–222
This revision is now accepted and ready to land.Nov 10 2020, 11:06 PM
ezhulenev updated this revision to Diff 304424.Nov 11 2020, 1:03 AM
ezhulenev marked 13 inline comments as done.

Address PR comments

mlir/lib/Analysis/NumberOfExecutions.cpp
209–222

I had to walk block and then operations to get sensible output for filecheck, otherwise it prints inner operation before its parent.

This revision was landed with ongoing or failed builds.Nov 11 2020, 1:43 AM
This revision was automatically updated to reflect the committed changes.