Page MenuHomePhabricator

2d Arm Neon sdot op, and lowering to the intrinsic.
ClosedPublic

Authored by Benoit on May 14 2021, 8:58 AM.

Details

Summary

This adds Sdot2d op, which is similar to the usual Neon
intrinsic except that it takes 2d vector operands, reflecting the
structure of the arithmetic that it's performing: 4 separate
4-dimensional dot products, whence the vector<4x4xi8> shape.

This also adds a new pass, arm-neon-2d-to-intr, lowering
this new 2d op to the 1d intrinsic.

Diff Detail

Event Timeline

Benoit created this revision.May 14 2021, 8:58 AM
Benoit requested review of this revision.May 14 2021, 8:58 AM
nicolasvasilache accepted this revision.May 14 2021, 9:30 AM

Cool ! Accepting conditioned on these minor changes.

mlir/include/mlir/Conversion/ArmNeonStructuredToIntr/ArmNeonStructuredToIntr.h
11 ↗(On Diff #345453)

some minor comments + spacing prefixed with /// please

mlir/include/mlir/Dialect/ArmNeon/ArmNeon.td
127

So structured op has a specific meaning in a bunch of MLIR places related to Linalg and Vector.
Could we call this MamtulSomethingOp ?

135

Not sure there is tablegen support for that.
Just add it in C++ with a custom verifier ? e.g. https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/X86Vector/X86Vector.td#L78

mlir/lib/Conversion/ArmNeonStructuredToIntr/ArmNeonStructuredToIntr.cpp
2 ↗(On Diff #345453)

80 cols plz

24 ↗(On Diff #345453)

Note that one of the remaining challenges is to properly propagate the shape casts you introduce here all the way to memory operations and ideally have them fold into reindexings.
Add a TODO along those lines ?

29 ↗(On Diff #345453)

There is non LLVM type here, just 1-D vector type that map 1-1 with LLVM (unlike the 2+D vectors that don't).
Can you please rephrase a bit? I'd just say "convert to 1-D vector type + attribute to match the neon.intr.xxx operation requirements".

Those intr. ops are the separation of concern between LLVM and MLIR.

36 ↗(On Diff #345453)

camelcase everywhere plz

This revision is now accepted and ready to land.May 14 2021, 9:30 AM
Benoit planned changes to this revision.May 14 2021, 10:15 AM
Benoit updated this revision to Diff 345541.May 14 2021, 1:20 PM

Applied review comments

This revision is now accepted and ready to land.May 14 2021, 1:20 PM
Benoit marked an inline comment as done.May 14 2021, 1:24 PM

Please take a look - nontrivial enough changes that I'd like your opinion again.

mlir/include/mlir/Dialect/ArmNeon/ArmNeon.td
127

I renamed 'Structured' to '2d' everywhere.
This is the best descriptive term that I could think of.
'Matmul' would be both overly specific to certain Neon intrinsic, and not even well descriptive of the present intrinsic: there are 2 forms of ARM SDOT instruction, only one of them is a 4x4x1 matmul, and the one that LLVM has as an intrinsic (which we are dealing with here) is the other one. It's not a 4x4x1 matmul, it's 4 completely independent dot products of separate 4d vectors.

There is one of your comments which I haven't applied yet. It's the one about writing a TODO about the handling of these reshapes. Part of the problem is I didn't understand 100% of what you were saying, and part is that this topic is going to get more subtle still in the following commit, adding the lowering from vector.contract, which will involve a broadcast to map the contract to the flavor of SDOT that we have here, which (as said in the other comment) is actually 4 separate vector dot products. So if you're OK, I would suggest to do nothing about this in this CL, discuss with you next week, decide something for the next commit.

Benoit retitled this revision from structured 2d Arm Neon sdot op, and lowering to the intrinsic. to 2d Arm Neon sdot op, and lowering to the intrinsic..May 14 2021, 1:30 PM
Benoit edited the summary of this revision. (Show Details)

please add an invalid.mlir in the armneon dialect tests, one per verifier error that is not automatically produced by tablegen (we test those separately), look at other invalid.mlir to see how they are structured.

mlir/include/mlir/Dialect/ArmNeon/ArmNeon.td
127

sgtm, thanks!

mlir/lib/Dialect/ArmNeon/IR/ArmNeonDialect.cpp
28

with custom verifiers we also add invalid tests in an invalid.mlir that exercise each of the expected errors.

ftynse added inline comments.May 17 2021, 7:07 AM
mlir/include/mlir/Dialect/ArmNeon/ArmNeon.td
123–125
mlir/lib/Conversion/ArmNeon2dToIntr/ArmNeon2dToIntr.cpp
31

Please expand auto here and below. MLIR uses auto when the type is either obvious from context (e.g., there's a cast on the RHS) or prohibitively long/impossible to express.

41

Nit: consider op.res() instead of op->getResult(0) to avoid the generic API with magic numbers.

mlir/lib/Dialect/ArmNeon/IR/ArmNeonDialect.cpp
32–36

These seem to be already covered by the type constraint in ODS so no need to check them manually. If you follow Nicolas' advice from above, you'll notice which messages are never emitted.

33

Nit: op.emitOpError() / op.emitError() spare you the manual fetching of location and produce more semantically-meaningful errors.

Benoit updated this revision to Diff 350969.Jun 9 2021, 12:10 PM

Addressed review comments. Predicates now implemented using CPred.

Benoit updated this revision to Diff 350971.Jun 9 2021, 12:24 PM

addressed ftynse's review comments

Benoit updated this revision to Diff 350972.Jun 9 2021, 12:26 PM
Benoit marked 4 inline comments as done.

one last review comment

Benoit updated this revision to Diff 350973.Jun 9 2021, 12:27 PM

one more whitespace fix

Benoit marked an inline comment as done.Jun 9 2021, 12:27 PM
nicolasvasilache accepted this revision.Jun 9 2021, 1:01 PM
nicolasvasilache added inline comments.
mlir/lib/Conversion/ArmNeon2dToIntr/ArmNeon2dToIntr.cpp
33

You could give the magic constant a meaningful name and expose it as a static extra method of the op.
You could also use that static method in the .td and then everything gets normalized.

39

just a warning, a lot of the canonicalization work to push these ShapeCastOp all the way to address computations is missing.
So we'll prob. need to iterate when you start looking at a real kernel from memory and the quality of the assembly that gets generated.

Benoit marked an inline comment as done.Jun 9 2021, 1:49 PM
Benoit added inline comments.
mlir/lib/Conversion/ArmNeon2dToIntr/ArmNeon2dToIntr.cpp
33

Sorry, I don't know how to add such a static method or constant to a class defined in tablegen?

39

ah ok, thanks for the note.

mlir/lib/Conversion/ArmNeon2dToIntr/ArmNeon2dToIntr.cpp
33
Benoit updated this revision to Diff 351044.Jun 9 2021, 7:29 PM
Benoit marked an inline comment as done.

one more review comment

Benoit marked 2 inline comments as done.Jun 9 2021, 7:34 PM
nicolasvasilache accepted this revision.Jun 9 2021, 11:38 PM

Great, let's land this!

I don't think that I have write access. I read https://llvm.org/docs/Phabricator.html#committing-a-change but it's not very clear to me. As far as I can see all I can do here is write this message asking if someone can submit for me? I don't see something else than this free-text field to input to (so is this process dependent on contributors freely browsing open reviews for ones that need submitting?)

This revision was automatically updated to reflect the committed changes.