This is an archive of the discontinued LLVM Phabricator instance.

[mlir][tosa] Add lowering for tosa.rescale to linalg.generic
ClosedPublic

Authored by rsuderman on Mar 16 2021, 5:01 PM.

Details

Summary

This adds a tosa.apply_scale operation that handles the scaling operation
common to quantized operatons. This scalar operation is lowered
in TosaToStandard.

We use a separate ApplyScale factorization as this is a replicable pattern
within TOSA. ApplyScale can be reused within pool/convolution/mul/matmul
for their quantized variants.

Tests are added to both tosa-to-standard and tosa-to-linalg-on-tensors
that verify each pass is correct.

Diff Detail

Event Timeline

rsuderman created this revision.Mar 16 2021, 5:01 PM
rsuderman requested review of this revision.Mar 16 2021, 5:01 PM
silvas requested changes to this revision.Mar 16 2021, 5:27 PM
silvas added inline comments.
mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
650

with apply_scale, this should just be another elementwise conversion like any other. I don't understand why there is so much code involvedhere. (i.e. based on my a priori understanding, it seems like this patch should only change createLinalgBodyCalculationForElementwiseOp)

mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir
454

it feels like a better definition of tosa.apply_scale would be one that makes this entire body reduce to a single tosa.apply_scale op (perhaps with a constant). I.e. include all the details into there.

This revision now requires changes to proceed.Mar 16 2021, 5:27 PM
rsuderman added inline comments.Mar 16 2021, 5:33 PM
mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
650

There is quite a bit more required for Rescale. Values are not passed as tensors but as attributes and need to be serialized to constants. Furthermore, the multiply/shift values need to be conditionally broadcasted (depending whether we are rescaling per channel).

rsuderman updated this revision to Diff 331138.Mar 16 2021, 5:41 PM

Removing unneeded line.

silvas added inline comments.Mar 17 2021, 4:17 PM
mlir/include/mlir/Dialect/Tosa/IR/TosaOps.td
1497

Apply ElementwiseMappable traits. (requires allowing vectors and tensors as valid arguments, but that gives us linalg vectorization for free).

1504

I would prefer if this op was defined such that the body of the linalg op we produce contains a single op. It seems arbitrary to decompose part of the op as part of tosa->linalg and part in tosa->std. We want to preserve as much information as possible for the specialized code paths for neon / etc.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
50

Doc comment.

685

does scale32 need to be handled?

699

ah yes, now I see why this is not just a trivial elementwise op.

rsuderman retitled this revision from [MLIR][TOSA] Add lowering for tosa.rescale to linalg.generic to [mlir][tosa] Add lowering for tosa.rescale to linalg.generic.Mar 17 2021, 4:37 PM
rsuderman updated this revision to Diff 331686.Mar 18 2021, 2:14 PM

Addressed the majority of silvas@ comments excluding mapping directly to
Rescale. ApplyScale has a direct definition according to the TOSA
specification and replicates within multiple ops. The Rescale behavior includes
zero-point adjustments and clipping which are (mostly) unique to Rescale.

rsuderman marked 5 inline comments as done.Mar 18 2021, 2:19 PM
rsuderman added inline comments.
mlir/include/mlir/Dialect/Tosa/IR/TosaOps.td
1504

So ApplyScale has a specific meaning in the TOSA specification. The additional work covers handling zero-points and clipping, which are mostly unique to Rescale and should already be handled well by our vectorization library.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
685

The Scale32 boolean is fairly superfluous. If Scale32 is false it means the input is an i48, which we don't support yet. It is debateable whether we support i48 types in the future (and will likely require a fairly different codegen with only i64 simulation).

mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir
454

Discussed above but the additional code is unique to rescale while ApplyScale is a replicated pattern in the TOSA specification. If we want a reduction a formal clamp operator could be useful (as that is replicated in a number of operations). But that feels like a standard op improvement.

rsuderman edited the summary of this revision. (Show Details)Mar 18 2021, 2:29 PM
rsuderman updated this revision to Diff 331689.Mar 18 2021, 2:33 PM
rsuderman marked 2 inline comments as done.

Updated description as-per silvas@ request

silvas accepted this revision.Mar 18 2021, 3:05 PM
This revision is now accepted and ready to land.Mar 18 2021, 3:05 PM
rsuderman updated this revision to Diff 331718.Mar 18 2021, 4:13 PM

Sync to head.

This revision was landed with ongoing or failed builds.Mar 18 2021, 4:15 PM
This revision was automatically updated to reflect the committed changes.