This is an archive of the discontinued LLVM Phabricator instance.

[mlir][linalg][python] Explicit shape and dimension order in OpDSL.
ClosedPublic

Authored by gysit on Jun 29 2021, 7:08 AM.

Details

Summary

Extend the OpDSL syntax with an optional domain function that specifies an explicit dimension order. The extension gives more control over the dimension order, which is otherwise deduced implicitly from the formulation of the tensor comprehension. The patch also ensures that the symbols are ordered according to the operand definitions of the operation.
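
To illustrate the symbol ordering mentioned in the summary, here is a minimal sketch rather than the patch's actual code. It assumes that "ordered according to the operand definitions" means order of first appearance across the operand shapes. The helper name `ordered_symbols` and the plain-string symbols are hypothetical stand-ins for the OpDSL types.

```python
def ordered_symbols(operand_shapes):
    """Return symbols in order of first appearance across the operand shapes."""
    seen = []
    for shape in operand_shapes:
        for symbol in shape:
            if symbol not in seen:
                seen.append(symbol)
    return seen


# Hypothetical stand-in for the matmul operands A(M, K), B(K, N), C(M, N).
matmul_shapes = [("M", "K"), ("K", "N"), ("M", "N")]
print(ordered_symbols(matmul_shapes))
```

Under this assumption the matmul operands yield the symbols M, K, N, independent of how the comprehension body is written.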

Diff Detail

Event Timeline

gysit created this revision. Jun 29 2021, 7:08 AM
gysit requested review of this revision. Jun 29 2021, 7:08 AM

I initially planned to have a simpler loop syntax that, instead of listing the dimensions as domain parameters, would use inspection to set the dimension names according to the induction variable names of the for loop. For example, matrix multiplication would look as follows:

def matmul(
    A=TensorDef(T, S.M, S.K),
    B=TensorDef(T, S.K, S.N),
    C=TensorDef(U, S.M, S.N, output=True)):
  for m, n, k in domain():
    C[m, n] += cast(U, A[m, k]) * cast(U, B[k, n])

However, at least my implementation requires features that are not guaranteed to be available in all Python implementations. I thus went for a more verbose syntax that initializes the dimensions manually:

for m, n, k in domain(D.m, D.n, D.k):
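
A minimal sketch of how such a loop form could work in plain Python, without any interpreter-specific inspection. All names here (`DimDef`, `DimNamespace`, `declared_order`) are hypothetical stand-ins and not the OpDSL implementation. The idea is that `domain` records the order of its arguments and returns a one-element sequence, so the loop body runs exactly once with the dimensions bound to the induction variables.

```python
class DimDef:
    """Hypothetical stand-in for an OpDSL dimension such as D.m."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"D.{self.name}"


class DimNamespace:
    """Creates one DimDef per attribute name and caches it."""

    def __init__(self):
        self._dims = {}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        return self._dims.setdefault(name, DimDef(name))


D = DimNamespace()
declared_order = []  # dimension names in the order passed to domain()


def domain(*dims):
    """Record the explicit dimension order and yield the dims once."""
    declared_order[:] = [d.name for d in dims]
    return [dims]


for m, n, k in domain(D.m, D.n, D.k):
    loop_vars = (m, n, k)  # the induction variables alias D.m, D.n, D.k
```

The repetition between the induction variables and the arguments of `domain` is visible in the last two lines.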

stellaraccident accepted this revision. Jun 29 2021, 8:47 AM

Quite nice: I didn't know if this was going to work out so easily.

mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
19

Good: the implicit domain inference here was causing uncertainty/surprise for the ordering.

mlir/python/mlir/dialects/linalg/opdsl/lang/config.py
147

Can you also add the collected vs specified to the error message?

Actually, I feel that both of these conditions could trigger the same mismatch error message if it printed the full lists.
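
A rough sketch of what a single mismatch message could look like. It assumes that the two conditions are "a dimension used in the comprehension is missing from the domain" and "the domain lists a dimension that is never used". The function name `check_domain` and the message wording are hypothetical and not taken from config.py.

```python
def check_domain(collected, specified):
    """Raise one error that lists both the collected and the specified dims."""
    if set(collected) != set(specified):
        raise ValueError(
            f"domain mismatch: comprehension uses dims {sorted(collected)}, "
            f"but domain() specifies {list(specified)}")


# Matching dims pass silently; a missing dim reports both full lists.
check_domain(["m", "n", "k"], ["m", "n", "k"])
try:
    check_domain(["m", "n", "k"], ["m", "n"])
except ValueError as err:
    print(err)
```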

mlir/python/mlir/dialects/linalg/opdsl/lang/dsl.py
141

The for loop syntax is cute from a DSL standpoint but does introduce some repetition. Am I correct in reading this as a free-standing statement of:

domain(D.m, D.n)

Would also work? (Just continuing to use the dims directly in the rest of the body instead of aliasing them to local variables?)

This revision is now accepted and ready to land. Jun 29 2021, 8:47 AM
gysit added inline comments. Jun 29 2021, 10:08 AM
mlir/python/mlir/dialects/linalg/opdsl/lang/dsl.py
141

Right, I can change to the following syntax:

domain(D.m, D.n, D.k)
C[D.m, D.n] += A[D.m, D.k] * B[D.k, D.n]

It makes the index expressions a bit longer, but there is no duplication between induction variables and domain function arguments. I do not have a strong preference, so I will change to the version above.
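
To make the effect of the explicit order concrete, here is a small sketch in plain Python, unrelated to the actual OpDSL code generation. The hypothetical helper `matmul_in_order` takes the dimension order as a tuple and visits the iteration space with the first dimension outermost. The result is the same for every order, but the traversal differs, which is what the domain statement pins down.

```python
import itertools


def matmul_in_order(A, B, order):
    """Compute C = A @ B, iterating the dims m, n, k in the given order."""
    sizes = {"m": len(A), "k": len(B), "n": len(B[0])}
    C = [[0] * sizes["n"] for _ in range(sizes["m"])]
    visited = []  # iteration points, always stored as (m, n, k)
    for point in itertools.product(*(range(sizes[d]) for d in order)):
        idx = dict(zip(order, point))
        visited.append((idx["m"], idx["n"], idx["k"]))
        C[idx["m"]][idx["n"]] += A[idx["m"]][idx["k"]] * B[idx["k"]][idx["n"]]
    return C, visited


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_mnk, visits_mnk = matmul_in_order(A, B, ("m", "n", "k"))
C_kmn, visits_kmn = matmul_in_order(A, B, ("k", "m", "n"))
```

With `("m", "n", "k")` the reduction dimension k varies fastest, while with `("k", "m", "n")` it is the outermost loop.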

gysit updated this revision to Diff 355463. Jun 30 2021, 12:35 AM

Address comments.

gysit marked 2 inline comments as done. Jun 30 2021, 12:36 AM
gysit edited the summary of this revision.
gysit updated this revision to Diff 355480. Jun 30 2021, 1:57 AM

Reset loop order of vecmat to avoid downstream breakages.

This revision was landed with ongoing or failed builds. Jun 30 2021, 2:17 AM
This revision was automatically updated to reflect the committed changes.