This is an archive of the discontinued LLVM Phabricator instance.

Differential D136500

[mlir][tosa] Refactor tosa.resize
ClosedPublic

Authored by rsuderman on Oct 21 2022, 2:10 PM.

Details

Reviewers

nicolasvasilache
jpienaar
dcaballe
awarzynski

Commits

rG78503e1a2f50: [mlir][tosa] Refactor tosa.resize

Summary

Moved to using helper lambdas to avoid code repetition. The IR needed to be reordered to
accommodate this, which should be the only change to the existing tests.

This changes the quantized test to target i48 types to guarantee types are extended
correctly when necessary.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rsuderman created this revision. · Oct 21 2022, 2:10 PM

Herald added subscribers: zero9178, bzcheeseman, sdasgup3 and 24 others. · Oct 21 2022, 2:10 PM

Herald added a project: Restricted Project. · Oct 21 2022, 2:10 PM

rsuderman requested review of this revision. · Oct 21 2022, 2:10 PM

Herald added a reviewer: nicolasvasilache. · Oct 21 2022, 2:10 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

rsuderman added a reviewer: jpienaar. · Oct 21 2022, 2:11 PM

Harbormaster completed remote builds in B193640: Diff 469762. · Oct 21 2022, 2:43 PM

rsuderman added a reviewer: dcaballe. · Oct 31 2022, 3:16 PM

Herald added a subscriber: Moerafaat. · Oct 31 2022, 3:16 PM

I like all the red :-) Can't dig into the math at the moment, so leaving that to Diego and Nicholas until I'm back.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1455–1458	Could you expand on why?
1480	Don't know if phabricator or you left a tab char here
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
381	Why did this change?

Herald added a subscriber: hanchung. · Nov 18 2022, 12:24 PM

LGTM but I don't think I totally follow the math :). Perhaps @awarzynski could take a look?

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1480	I think it's the way Phabricator shows that we are only changing the indentation for that line.
1690	As a general comment, wondering if we could split this large match and rewrite into multiple patterns sharing some common implementation or at least refactor part of this code into utility functions, as this is growing significantly. WDYT?

dcaballe added a reviewer: awarzynski. · Nov 22 2022, 3:29 PM

Thanks for improving this Op! This is a neat simplification for cases when either H or W is 1 - I wish we had images like that :)

In general, this makes sense to me, but the fine details are tricky to follow. In an ideal world, this code would include more annotations that can be matched against the TOSA spec. In particular, with this change you are implementing a simplified version of this part of tosa.resize (BILINEAR mode, extracted from the spec):

acc = v00 * (unit_y - dy) * (unit_x - dx);
acc += v01 * (unit_y - dy) * dx;
acc += v10 * dy * (unit_x - dx);
acc += v11 * dy * dx;

With H equal 1, v00 = v10 and v01 = v11, right? And, IIUC, that's basically what you are leveraging here.

However, it's not obvious what the simplified formula actually is and where to find it in the code. So, could you add more comments so that we know what formula is being implemented and how that maps to the spec?
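For readers trying to follow the question above, here is a minimal scalar sketch of what the simplified formula could look like when the input height is 1. It is not the code from the patch: the function names `bilinearFull` and `bilinearHeightOne` are hypothetical, and it assumes that with H equal to 1 both row indices clamp to the same row, so `v10 == v00` and `v11 == v01`.

```cpp
#include <cstdint>

// Bilinear accumulation as quoted from the TOSA spec in this comment.
// v00/v01 are the two samples on the upper row, v10/v11 on the lower row.
int64_t bilinearFull(int64_t v00, int64_t v01, int64_t v10, int64_t v11,
                     int64_t dy, int64_t dx, int64_t unitY, int64_t unitX) {
  int64_t acc = v00 * (unitY - dy) * (unitX - dx);
  acc += v01 * (unitY - dy) * dx;
  acc += v10 * dy * (unitX - dx);
  acc += v11 * dy * dx;
  return acc;
}

// Hypothetical simplification for H == 1 (an assumption, not the patch's
// code): with v10 == v00 and v11 == v01 the dy terms cancel, leaving a
// one-dimensional interpolation along x scaled by unitY.
int64_t bilinearHeightOne(int64_t v0, int64_t v1, int64_t dx, int64_t unitY,
                          int64_t unitX) {
  return unitY * (v0 * (unitX - dx) + v1 * dx);
}
```

Substituting `v10 = v00` and `v11 = v01` into the four-term form gives `v00 * unit_y * (unit_x - dx) + v01 * unit_y * dx`, which is what the second function computes; `dy` no longer appears.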

I did scan the output for one of my examples "before" and "after" this change (BILINEAR mode) and it looked pretty much identical modulo instructions being moved around. I guess that refactoring could be extracted to a separate patch as an NFC?

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1449–1451	[nit] May as well just move this to the top alongside other refactoring.
1452–1453	One big difference here is that you introduce `resizeTy` on top of `resultTy`. Could you explain what the difference between the two is?

Rebased and updated for comments

Harbormaster completed remote builds in B201795: Diff 481040. · Dec 7 2022, 1:37 PM

Nit - rewriter changed to use builder

rsuderman marked an inline comment as done. · Dec 7 2022, 1:39 PM

rsuderman added inline comments.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1452–1453	A tosa.resize could do two separate parts: a resize (bilinear/nearest) and a broadcast (degenerate case). We try to separate this out as the resize is computationally much more expensive than broadcasting, so if we see this case we do a resize then just broadcast across the degenerate case.
1455–1458	I removed the comment instead. Originally I was going to override the result type then handle the broadcast in a separate pass. That's slightly problematic, it turns out. I have some ways to clean things up but I need to rework the rewriter interactions.
1480	Yes, it is just indicating that the indentation only changed due to now being in a block.
1690	I agree as well. I think I know how to fix this up but it would be better in a follow up that does not involve IR change. I will follow up once this lands.
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
381	The input test size changed from 23x23 to 23x24. This was to guarantee we actually separate checks for width / height.

rsuderman marked an inline comment as done. · Dec 7 2022, 1:42 PM

Harbormaster completed remote builds in B201796: Diff 481043. · Dec 7 2022, 11:01 PM

Hey Rob, thanks for all the updates! I've had a bit more time to dive a bit deeper and have 3 high level comments.

This patch implements two orthogonal things. There's the overall refactoring to use lambdas and, separately, there is some new logic to simplify the generated code in some special cases (when either height or width of the input image are 1). These changes are orthogonal and non-trivial - in my view it would make much more sense to submit them separately. This would make reviewing easier, as well the git history cleaner and more helpful. Please, can you split this patch?

The simplifications for tosa.resize are tricky to match against the spec - I feel that we are really missing more comments here. Also, IIUC, the following checks from the spec mean that for IH = 1, one should also have OH = 1 (similarly for OW and IW). However, for cases like (tensor<1x13x1x1xf32>) -> tensor<1x179x23x1xf32> (extracted from the tests) this is not satisfied. Perhaps I've missed something, but it seems that this implementation is diverging from the TOSA spec. Is this intended?

ERROR_IF(OH != idiv_check((IH - 1) * scale_y_n - offset_y + border_y, scale_y_d) + 1);
ERROR_IF(OW != idiv_check((IW - 1) * scale_x_n - offset_x + border_x, scale_x_d) + 1);

The lambdas that you've introduced help with code re-use, but IMHO the existing implementation is easier to match against the spec. I think that this could be improved with some renaming and additional comments. I've made some suggestions inline. WDYT?

I hope that this helps,

-Andrzej

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1452–1453	This is fine, but the naming is a bit confusing. Ultimately, `tosa.resize` will resize an image, so the "result" of this operator is the "resized" image and intuitively I would expect this to hold: `resizeTy == resultTy`. I think that having a plain `bool` (e.g. `resizeAndBrodcast` or `resizeDegenerate`) to switch between the two cases emerging here: `(height != 1) && (width != 1)`, and `(height == 1) || (width == 1)` would be more descriptive and intuitive.
1500	To me it would make more sense to follow LLVM's naming style for functions here: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly. For example, this could be `getIntIdxAndDelta` instead of `floatIndices`. In particular, this lambda is predominantly computing `ix` and `dx` as well as `iy` and `dy`, right? (using notation from TOSA spec). But that's not clear from the name and also this comment suggests that only `ix` and `dx` are calculated:
	// x = x * scale_d + offset;
	// ix = floor(x / scale_n)
	// dx = x / scale_n - ix
	Similar comment for `intIndices`. It would be helpful to rename this lambda, update the comment and also highlight that this is for both `ix`/`dx` and `iy`/`dy`.
1527–1531	It's not obvious to me that this is correct or, put differently, that it conforms with the TOSA spec.
1572–1629	I think that `clampEdges` is a bit misleading - this lambda takes `in` and calculates the following:
	val0 = clamp(in, min, max);
	val1 = clamp(in + 1, min, max);
	And in the spec this is:
	int16_t iy0 = apply_max(iy, 0);
	int16_t iy1 = apply_min(iy + 1, IH - 1);
	int16_t ix0 = apply_max(ix, 0);
	int16_t ix1 = apply_min(ix + 1, IW - 1);
	And these are indices for the input image. So perhaps instead of `clampEdges`, this could be `getInputIdxsAndClamp`? Also, could you add a comment that would make it easy to map it back to the spec? You could use the snippet above ^^^.
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
583	This example and the examples above seem to interpret scale as:
	[scale_x_n, scale_x_d, scale_y_n, scale_y_d]
	rather than:
	[scale_y_n, scale_y_d, scale_x_n, scale_x_d]
	In particular, (using formula from the spec):
	OH = ((13-1) * 11 / 7 ) + 1 = 19.857
	But OH should be 179. However, if I use `scale_x_n` and `scale_x_d` instead:
	OH = ((13-1) * 89 / 6 ) + 1 = 179
	Could you update this and the other examples?
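The output-size arithmetic in these comments can be checked with a small scalar sketch of the spec's `ERROR_IF` condition. The helper names `idivCheck` and `expectedOutputSize` are hypothetical, `idiv_check` is assumed to mean "division that is only valid when exact", and offset and border are taken as 0 because the comment's arithmetic omits them.

```cpp
#include <cstdint>

// Assumed meaning of the spec's idiv_check: the division is only valid when
// it is exact. Returns false where the spec would raise an error.
bool idivCheck(int64_t numerator, int64_t denominator, int64_t &quotient) {
  if (denominator == 0 || numerator % denominator != 0)
    return false;
  quotient = numerator / denominator;
  return true;
}

// Output extent along one axis, following
//   OH == idiv_check((IH - 1) * scale_n - offset + border, scale_d) + 1.
// Returns false when no valid output size exists for these parameters.
bool expectedOutputSize(int64_t inSize, int64_t scaleN, int64_t scaleD,
                        int64_t offset, int64_t border, int64_t &outSize) {
  int64_t quotient = 0;
  if (!idivCheck((inSize - 1) * scaleN - offset + border, scaleD, quotient))
    return false;
  outSize = quotient + 1;
  return true;
}
```

Under these assumptions, `expectedOutputSize(13, 89, 6, 0, 0, out)` yields 179, `expectedOutputSize(13, 11, 7, 0, 0, out)` fails because 132 is not divisible by 7, and any input of size 1 yields an output of size 1, which is the point raised about IH = 1 and OH = 1.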

Remove broadcasting work and simplify commit

Updated comments

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1452–1453	Removed the broadcast work - we can recontinue discussion on the next list.
1527–1531	When width / height are 1 the entire resize is degenerate, as we clamp the lookup index to [0,0] and would interpolate between two identical values. While it does not technically represent identical instructions, the result is identical; this avoids generating unoptimizable code for these degenerate cases.
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
583	Changes will appear in follow up.
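The degenerate case described in these replies can be illustrated with a scalar sketch. This is not the lowering itself: `clampedNeighbours` and `interpolate` are hypothetical names, and the integer weighting with a `unit` parameter is an assumption chosen to keep the arithmetic exact.

```cpp
#include <algorithm>
#include <cstdint>

struct IndexPair {
  int32_t lo;
  int32_t hi;
};

// Clamp the two neighbouring sample indices into [0, size - 1], mirroring
// val0 = clamp(in, min, max) and val1 = clamp(in + 1, min, max).
IndexPair clampedNeighbours(int32_t index, int32_t size) {
  const int32_t maxIndex = size - 1;
  const int32_t lo = std::max<int32_t>(0, std::min<int32_t>(index, maxIndex));
  const int32_t hi =
      std::max<int32_t>(0, std::min<int32_t>(index + 1, maxIndex));
  return {lo, hi};
}

// One-dimensional integer interpolation with weights (unit - delta) and delta.
int64_t interpolate(int64_t v0, int64_t v1, int64_t delta, int64_t unit) {
  return v0 * (unit - delta) + v1 * delta;
}
```

With `size == 1` both indices clamp to 0, so both samples are the same value `v` and `interpolate(v, v, delta, unit)` equals `v * unit` for every `delta`. That is why the interpolation along a size-1 dimension can be skipped without changing the result.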

rsuderman edited the summary of this revision. · Dec 8 2022, 2:38 PM

rsuderman edited the summary of this revision.

Harbormaster completed remote builds in B202079: Diff 481442. · Dec 8 2022, 8:04 PM

Thanks for all the refactoring! Your lambdas do make the code less verbose and interpolate in particular helps to see the actual interpolations. That was very hard in the original implementation, so thanks for doing this!

I've suggested a few more comments. Again, to make matching the code against the spec more straightforward. I appreciate that this is a very subjective thing, but I'm hoping that it will make life easier for others who decide to work in this area too. I still need to scan through the tests.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1498	Rather than:
	* `getIntIdxAndDeltaFp` and
	* `getIntIdxAndDeltaInt`
	I was suggesting:
	* `getIntIdxAndDelta`
	* `getFpIdxAndDeltaFp`
	Either one is fine, I don't mind. It's just that `getIntIdxAndDeltaFp` is rather confusing - is this for integer or fp values?
1516	To match the comment above.
1527–1531	"When width / height are 1 the entire resize is degenerate as we clamp the lookup index to [0,0] and would interpolate between two identical values." You are probably right, but it was hard to extract that from the code itself. Thanks for explaining!
1652–1653	Why "offset" in `offsetClampCastIndxs`? Naming is hard and I always really struggle. Perhaps `getClampedIdxs` would be more descriptive? I'm probably missing something here.
1681–1694	Weird, something seems to be missing here: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp#L1684-L1693
1686–1706	* It wasn't obvious to me that `w1` and `w2` are weights, hence the renaming as `weight1` and `weight2`
	* Added extra comments to make matching against the spec simpler
	* Removed `if (size ==1)` - that special case wasn't present in the original implementation, so shouldn't be needed here?
	* Moved the definition of `interpolate` to the top of the block to better separate from the rest
	WDYT? Similar suggestion for the block below.
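As a reading aid for the index-and-delta helpers discussed in these comments, here is a scalar restatement of the floating-point variant. It follows the formula quoted in the review (`x = x * scale_d + offset; ix = floor(x / scale_n); dx = x / scale_n - ix`); the struct `IndexAndDelta` is hypothetical and the signed integer-to-float conversions are an assumption of this sketch.

```cpp
#include <cmath>
#include <cstdint>

struct IndexAndDelta {
  int32_t index; // ix (or iy): integer sample position in the input
  float delta;   // dx (or dy): fractional distance past that position
};

// Scalar version of the float path:
//   x  = x * scale_d + offset
//   ix = floor(x / scale_n)
//   dx = x / scale_n - ix
IndexAndDelta getIndexAndDeltaFp(int32_t in, int32_t scaleN, int32_t scaleD,
                                 int32_t offset) {
  float val = static_cast<float>(in);
  val = val * static_cast<float>(scaleD) + static_cast<float>(offset);
  val = val / static_cast<float>(scaleN);
  const float floored = std::floor(val);
  return {static_cast<int32_t>(floored), val - floored};
}
```

The same function serves both axes: pass the y coordinate with the y scale and offset to get `iy`/`dy`, and the x coordinate with the x scale and offset to get `ix`/`dx`. For example, `getIndexAndDeltaFp(3, 4, 2, 0)` computes 3 * 2 / 4 = 1.5, giving index 1 and delta 0.5.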

rsuderman retitled this revision from "[mlir][tosa] Refactor tosa.resize and add broadcasting along one dimension" to "[mlir][tosa] Refactor tosa.resize". · Dec 9 2022, 11:28 AM

Updated for awarzynski comments

rsuderman marked 6 inline comments as done. · Dec 9 2022, 12:22 PM

rsuderman added inline comments.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1681–1694	Must have been wiped out during the merge
1686–1706	I implemented them for the most part. I moved the weight calculations into the interpolation function and did some rename.

Harbormaster completed remote builds in B202283: Diff 481724. · Dec 9 2022, 12:56 PM

I've finally gone over all the code. Sorry that this is taking so long - there's quite a lot to cover. I have left a few more minor comments (please address them before landing this), but otherwise LGTM. Btw, this patch trims matchAndRewrite from ~300 to ~250 LOC, that's super nice to see!

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1462–1530	What's the benefit of the extra indentation?
1530	[nit] I think that this comment can safely be skipped. If you'd like to keep it, can you add one for the Int case as well?
1542–1547	`roundIndex` -> `getNearestIndexAndClamp`?
1544–1547	IIUC, this is not needed?
1646–1651	`32` is a bit of a magic number here. IMHO, comparing against `resultETy` better conveys the rationale behind the extension. Similar comment for L1642.
1656–1657	This is the special "broadcast" case not covered by the spec, isn't it? (tested e.g. here). Could you add a comment so that this is easy to identify? Otherwise it's not clear what makes `size == 1` special.
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
270–273	Is this extra extension due to switching from `i32` to `i48`? Might be worth adding a comment.

Final comments for tosa.resize refactor

rsuderman edited the summary of this revision. · Dec 12 2022, 1:50 PM

rsuderman marked 7 inline comments as done. · Dec 12 2022, 1:53 PM

rsuderman added inline comments.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1462–1530	This indent is necessary for the block header guard. It guarantees the rewriter inserts new operations within the linalg op's block and restores the position to after the constructed linalg operation.
1542–1547	renamed
1544–1547	removed.
1646–1651	No, in this case we are already comparing after resultETy; we would need to compare to *ScaleN's type. Updated to use this instead, but this magic number is hard defined by the standard for all integer types.
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
270–273	Added to CL description.
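The scoping argument in the reply about lines 1462–1530 is the usual RAII pattern: a guard object saves the builder's insertion point and restores it when the enclosing braces close. The toy below only illustrates that pattern. `ToyBuilder` and `ToyInsertionGuard` are hypothetical stand-ins, not MLIR classes, and the op names are just strings.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an op builder: it only tracks which block newly
// created ops are placed in.
struct ToyBuilder {
  std::string insertionBlock = "function body";
  std::vector<std::string> log;
  void create(const std::string &op) {
    log.push_back(insertionBlock + ": " + op);
  }
};

// Saves the insertion block on construction and restores it on destruction.
class ToyInsertionGuard {
public:
  explicit ToyInsertionGuard(ToyBuilder &b)
      : builder(b), saved(b.insertionBlock) {}
  ~ToyInsertionGuard() { builder.insertionBlock = saved; }
  ToyInsertionGuard(const ToyInsertionGuard &) = delete;
  ToyInsertionGuard &operator=(const ToyInsertionGuard &) = delete;

private:
  ToyBuilder &builder;
  std::string saved;
};

std::vector<std::string> buildResize() {
  ToyBuilder b;
  b.create("linalg.generic");
  {
    ToyInsertionGuard guard(b);
    b.insertionBlock = "generic body"; // analogous to creating the region block
    b.create("linalg.index");
  } // guard goes out of scope here and restores the outer insertion block
  b.create("follow-up op");
  return b.log;
}
```

Without the extra braces the guard would live until the end of the function, so any op created after the body would still land inside it. That is the behaviour the added indentation in the patch avoids.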

Looks good, thanks for addressing all the comments. For 48 bit expansion, could we use Jakub's work there instead? (not needed in this one, more question for follow up as that seems to simplify a few things)

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
1648	Missing ; ?

This revision is now accepted and ready to land. · Dec 12 2022, 2:04 PM

Harbormaster completed remote builds in B202682: Diff 482258. · Dec 12 2022, 2:17 PM

jpienaar@ comments

In D136500#3990023, @jpienaar wrote:

Looks good, thanks for addressing all the comments. For 48 bit expansion, could we use Jakub's work there instead? (not needed in this one, more question for follow up as that seems to simplify a few things)

Yup, we should be able to integrate his work. I will add it to the broadcasting improvement changes.

Harbormaster completed remote builds in B202687: Diff 482266. · Dec 12 2022, 2:34 PM

Closed by commit rG78503e1a2f50: [mlir][tosa] Refactor tosa.resize (authored by rsuderman). · Dec 12 2022, 2:39 PM

This revision was automatically updated to reflect the committed changes.

rsuderman added a commit: rG78503e1a2f50: [mlir][tosa] Refactor tosa.resize.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Tosa/Utils/ConversionUtils.h (moved from CoversionUtils.h)	8 lines
mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp	460 lines
mlir/lib/Conversion/TosaToLinalg/TosaToLinalgNamed.cpp	2 lines
mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp	2 lines
mlir/lib/Dialect/Tosa/Utils/ConversionUtils.cpp	11 lines
mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir	210 lines

Diff 482274

mlir/include/mlir/Dialect/Tosa/Utils/ConversionUtils.h

This file was moved from mlir/include/mlir/Dialect/Tosa/Utils/CoversionUtils.h.

[24 lines hidden]
 SmallVector<utils::IteratorType>
 getNParallelLoopsAttrs(unsigned nParallelLoops);

 // Takes a vector of values and condenses them to a vector with no gaps.
 SmallVector<Value> condenseValues(const SmallVector<Value> &values);

 // Takes the parameters for a clamp and turns it into a series of ops for float
 // inputs.
-Value clampFloatHelper(Location loc, Value arg, arith::ConstantOp min,
-                       arith::ConstantOp max, OpBuilder &rewriter);
+Value clampFloatHelper(Location loc, Value arg, Value min, Value max,
+                       OpBuilder &rewriter);

 // Takes the parameters for a clamp and turns it into a series of ops for
 // integer inputs.
-Value clampIntHelper(Location loc, Value arg, arith::ConstantOp min,
-                     arith::ConstantOp max, OpBuilder &rewriter);
+Value clampIntHelper(Location loc, Value arg, Value min, Value max,
+                     OpBuilder &rewriter);

 // Returns the values in an attribute as an array of values.
 template <typename T>
 void getValuesFromIntArrayAttribute(ArrayAttr attr,
                                     SmallVector<T> &arrayValues) {
   for (Attribute val : attr.getValue()) {
     arrayValues.push_back(val.cast<IntegerAttr>().getValue().getSExtValue());
   }
[36 lines hidden]

mlir/include/mlir/Dialect/Tosa/Utils/CoversionUtils.h

This file was moved to mlir/include/mlir/Dialect/Tosa/Utils/ConversionUtils.h.

mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp

Show All 12 Lines

 #include "mlir/Conversion/TosaToLinalg/TosaToLinalg.h"
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/Linalg/IR/Linalg.h"
 #include "mlir/Dialect/Math/IR/Math.h"
 #include "mlir/Dialect/SCF/IR/SCF.h"
 #include "mlir/Dialect/Tensor/IR/Tensor.h"
 #include "mlir/Dialect/Tensor/Utils/Utils.h"
 #include "mlir/Dialect/Tosa/IR/TosaOps.h"
-#include "mlir/Dialect/Tosa/Utils/CoversionUtils.h"
+#include "mlir/Dialect/Tosa/Utils/ConversionUtils.h"
 #include "mlir/Dialect/Utils/ReshapeOpsUtils.h"
 #include "mlir/IR/ImplicitLocOpBuilder.h"
 #include "mlir/IR/Matchers.h"
 #include "mlir/IR/PatternMatch.h"
 #include "mlir/Transforms/DialectConversion.h"
 #include "mlir/Transforms/GreedyPatternRewriteDriver.h"

 #include <numeric>

[142 lines hidden]
     Value zpAddValue = rewriter.create<arith::ConstantOp>(
         loc, rewriter.getIntegerAttr(intermediateType, zpAdd));

     // The negation can be applied by doing:
     //  outputValue = inZp + outZp - inputValue
     auto ext = rewriter.create<arith::ExtSIOp>(loc, intermediateType, args[0]);
     auto sub = rewriter.create<arith::SubIOp>(loc, zpAddValue, ext);

     // Clamp to the negation range.
-    auto min = rewriter.create<arith::ConstantIntOp>(
+    Value min = rewriter.create<arith::ConstantIntOp>(
         loc, APInt::getSignedMinValue(inputBitWidth).getSExtValue(),
         intermediateType);
-    auto max = rewriter.create<arith::ConstantIntOp>(
+    Value max = rewriter.create<arith::ConstantIntOp>(
         loc, APInt::getSignedMaxValue(inputBitWidth).getSExtValue(),
         intermediateType);
     auto clamp = clampIntHelper(loc, sub, min, max, rewriter);

     // Truncate to the final value.
     return rewriter.create<arith::TruncIOp>(loc, elementTy, clamp);
   }

[1,234 lines hidden]
 class GenericResizeConverter : public OpRewritePattern<tosa::ResizeOp> {
 public:
   using OpRewritePattern<tosa::ResizeOp>::OpRewritePattern;

   LogicalResult matchAndRewrite(tosa::ResizeOp op,
                                 PatternRewriter &rewriter) const final {
     Location loc = op.getLoc();
+    ImplicitLocOpBuilder b(loc, rewriter);
     auto input = op.getInput();
     auto inputTy = input.getType().cast<ShapedType>();
     auto resultTy = op.getType().cast<ShapedType>();
-    auto resultElementTy = resultTy.getElementType();
+    auto resultETy = resultTy.getElementType();

     auto imageH = inputTy.getShape()[1];
     auto imageW = inputTy.getShape()[2];

     auto dynamicDimsOr =
         checkHasDynamicBatchDims(rewriter, op, {input, op.getOutput()});
     if (!dynamicDimsOr.has_value())
       return rewriter.notifyMatchFailure(
           op, "unable to get dynamic dimensions of tosa.resize");
-    SmallVector<Value> dynamicDims = dynamicDimsOr.value();

     if (op.getMode() != "NEAREST_NEIGHBOR" && op.getMode() != "BILINEAR")
       return rewriter.notifyMatchFailure(
           op, "tosa.resize mode should be NEAREST_NEIGHBOR or BILINEAR");

awarzynski (inline comment, marked done): [nit] May as well just move this to the top alongside other refactoring.

-    auto emptyTensor = rewriter.create<tensor::EmptyOp>(
-        loc, resultTy.getShape(), resultElementTy, dynamicDims);
     SmallVector<AffineMap, 2> affineMaps = {

awarzynski (inline comment, marked done): One big difference here is that you introduce resizeTy on top of resultTy. Could you explain what the difference between the two is?

rsuderman (author, inline comment, marked done): A tosa.resize could do two separate parts: a resize (bilinear/nearest) and a broadcast (degenerate case). We try to separate this out as the resize is computationally much more expensive than broadcasting, so if we see this case we do a resize then just broadcast across the degenerate case.

awarzynski (inline comment, marked done): This is fine, but the naming is a bit confusing. Ultimately, tosa.resize will resize an image, so the "result" of this operator is the "resized" image and intuitively I would expect this to hold: resizeTy == resultTy.

I think that having a plain bool (e.g. resizeAndBrodcast or resizeDegenerate) to switch between the two cases emerging here:

(height != 1) && (width != 1), and
(height == 1) || (width == 1)

would be more descriptive and intuitive.

rsuderman (author, inline comment, marked done): Removed the broadcast work - we can recontinue discussion on the next list.

         rewriter.getMultiDimIdentityMap(resultTy.getRank())};
-    Value resize = input;
-    auto genericOp = rewriter.create<linalg::GenericOp>(
-        loc, resultTy, ValueRange({}), ValueRange{emptyTensor}, affineMaps,
+    auto emptyTensor = b.create<tensor::EmptyOp>(resultTy.getShape(), resultETy,
+                                                 dynamicDimsOr.value());
+    auto genericOp = b.create<linalg::GenericOp>(
+        resultTy, ValueRange({}), ValueRange{emptyTensor}, affineMaps,

jpienaar (inline comment, marked done): Could you expand on why?

rsuderman (author, inline comment, marked done): I removed the comment instead. Originally I was going to override the result type then handle the broadcast in a separate pass. That's slightly problematic, it turns out. I have some ways to clean things up but I need to rework the rewriter interactions.

         getNParallelLoopsAttrs(resultTy.getRank()));
-    resize = genericOp.getResult(0);
+    Value resize = genericOp.getResult(0);

-    OpBuilder::InsertionGuard regionGuard(rewriter);
-    rewriter.createBlock(&genericOp.getRegion(), genericOp.getRegion().end(),
-                         TypeRange({resultElementTy}), loc);
-    Value batch = rewriter.create<linalg::IndexOp>(loc, 0);
-    Value y = rewriter.create<linalg::IndexOp>(loc, 1);
-    Value x = rewriter.create<linalg::IndexOp>(loc, 2);
-    Value channel = rewriter.create<linalg::IndexOp>(loc, 3);
+    {
+      OpBuilder::InsertionGuard regionGuard(b);
+      b.createBlock(&genericOp.getRegion(), genericOp.getRegion().end(),
+                    TypeRange({resultETy}), loc);
+      Value batch = b.create<linalg::IndexOp>(0);
+      Value y = b.create<linalg::IndexOp>(1);
+      Value x = b.create<linalg::IndexOp>(2);
+      Value channel = b.create<linalg::IndexOp>(3);

-    auto hwMin =
-        rewriter.create<arith::ConstantOp>(loc, rewriter.getI32IntegerAttr(0));
-    auto hMax = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(imageH - 1));
-    auto wMax = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(imageW - 1));
+      Value zeroI32 = b.create<arith::ConstantOp>(b.getI32IntegerAttr(0));
+      Value hMax = b.create<arith::ConstantOp>(b.getI32IntegerAttr(imageH - 1));
+      Value wMax = b.create<arith::ConstantOp>(b.getI32IntegerAttr(imageW - 1));

-    Value inY =
-        rewriter.create<arith::IndexCastOp>(loc, rewriter.getI32Type(), y);
-    Value inX =
-        rewriter.create<arith::IndexCastOp>(loc, rewriter.getI32Type(), x);
+      Value inY = b.create<arith::IndexCastOp>(b.getI32Type(), y);
+      Value inX = b.create<arith::IndexCastOp>(b.getI32Type(), x);

-    bool floatingPointMode = resultElementTy.isF32();
+      bool floatingPointMode = resultETy.isF32();

-    Value yScaleN, yScaleD, xScaleN, xScaleD, yOffset, xOffset, yBorder,
-        xBorder;
       SmallVector<int32_t> scale, offset, border;

jpienaar (inline comment, marked done): Don't know if phabricator or you left a tab char here

dcaballe (inline comment, marked done): I think it's the way Phabricator shows that we are only changing the indentation for that line.

rsuderman (author, inline comment, marked done): Yes, it is just indicating that the indentation only changed due to now being in a block.

       getValuesFromIntArrayAttribute(op.getScale(), scale);
       getValuesFromIntArrayAttribute(op.getOffset(), offset);
       getValuesFromIntArrayAttribute(op.getBorder(), border);

-    yScaleN = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(scale[0]));
-    yScaleD = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(scale[1]));
-    xScaleN = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(scale[2]));
-    xScaleD = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(scale[3]));
-    yOffset = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(offset[0]));
-    xOffset = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(offset[1]));
-    yBorder = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(border[0]));
+      Value yScaleN, yScaleD, xScaleN, xScaleD;
+      yScaleN = b.create<arith::ConstantOp>(b.getI32IntegerAttr(scale[0]));
+      yScaleD = b.create<arith::ConstantOp>(b.getI32IntegerAttr(scale[1]));
+      xScaleN = b.create<arith::ConstantOp>(b.getI32IntegerAttr(scale[2]));
+      xScaleD = b.create<arith::ConstantOp>(b.getI32IntegerAttr(scale[3]));
+
+      Value yOffset, xOffset, yBorder, xBorder;
+      yOffset = b.create<arith::ConstantOp>(b.getI32IntegerAttr(offset[0]));
+      xOffset = b.create<arith::ConstantOp>(b.getI32IntegerAttr(offset[1]));
+      yBorder = b.create<arith::ConstantOp>(b.getI32IntegerAttr(border[0]));
+      xBorder = b.create<arith::ConstantOp>(b.getI32IntegerAttr(border[1]));
+
+      // Compute the ix and dx values for both the X and Y dimensions.
+      auto getIndexAndDeltaFp = [&](Value &index, Value &delta, Value in,

awarzynski (inline comment, marked done): Rather than:

getIntIdxAndDeltaFp and
getIntIdxAndDeltaInt

I was suggesting:
getIntIdxAndDelta
getFpIdxAndDeltaFp

Either one is fine, I don't mind. It's just that getIntIdxAndDeltaFp is rather confusing - is this for integer or fp values?

-    xBorder = rewriter.create<arith::ConstantOp>(
-        loc, rewriter.getI32IntegerAttr(border[1]));
+                                    Value scaleN, Value scaleD, Value offset,
+                                    int size, ImplicitLocOpBuilder &b) {

awarzynski:

To me it would make more sense to follow LLVM's naming style for functions here: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly. For example, this could be `getIntIdxAndDelta` instead of `floatIndices`.

In particular, this lambda is predominantly computing `ix` and `dx` as well as `iy` and `dy`, right? (using notation from the TOSA spec). But that's not clear from the name, and this comment suggests that only `ix` and `dx` are calculated:

// x = x * scale_d + offset;
// ix = floor(x / scale_n)
// dx = x / scale_n - ix

Similar comment for `intIndices`. It would be helpful to rename this lambda, update the comment and also highlight that this is for both `ix`/`dx` and `iy`/`dy`.
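For readers following the thread, the index/delta arithmetic described in the comment above can be sketched on plain scalars. This is only an illustration, not the code from the patch: the struct and function names are hypothetical, and the same helper would be called once for the Y dimension and once for the X dimension. It assumes `scale_n` is positive; the integer variant uses C++ truncating division (as `arith.divsi` does), which matches `floor` only for non-negative values.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result types for this sketch.
struct IndexAndDeltaInt {
  int32_t index;
  int32_t delta;
};
struct IndexAndDeltaFp {
  int32_t index;
  float delta;
};

// Integer case:
//   x  = in * scale_d + offset
//   ix = x / scale_n        (truncating division)
//   dx = x - ix * scale_n
inline IndexAndDeltaInt indexAndDeltaInt(int32_t in, int32_t scaleN,
                                         int32_t scaleD, int32_t offset) {
  assert(scaleN > 0 && "scale_n is assumed to be positive");
  int32_t x = in * scaleD + offset;
  int32_t index = x / scaleN;
  int32_t delta = x - index * scaleN;
  return {index, delta};
}

// Floating-point case:
//   x  = (in * scale_d + offset) / scale_n
//   ix = floor(x)
//   dx = x - ix
inline IndexAndDeltaFp indexAndDeltaFp(int32_t in, int32_t scaleN,
                                       int32_t scaleD, int32_t offset) {
  assert(scaleN > 0 && "scale_n is assumed to be positive");
  float x = (static_cast<float>(in) * static_cast<float>(scaleD) +
             static_cast<float>(offset)) /
            static_cast<float>(scaleN);
  float floored = std::floor(x);
  return {static_cast<int32_t>(floored), x - floored};
}
```

In the integer case the delta stays in units of `1 / scale_n`, which is why the quantized interpolation later weights by `scale_n - dx` and `dx` rather than `1 - dx` and `dx`.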

// Compute the integer index and partial offset.

Value ix, iy, dx, dy;

// x = x * scale_d + offset; // x = x * scale_d + offset;

// ix = floor(x / scale_n) // ix = floor(x / scale_n)

if (floatingPointMode) {

// dx = x / scale_n - ix // dx = x / scale_n - ix

Value y = Value val = b.create<arith::UIToFPOp>(b.getF32Type(), in);

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), inY); scaleN = b.create<arith::UIToFPOp>(b.getF32Type(), scaleN);

Value x = scaleD = b.create<arith::UIToFPOp>(b.getF32Type(), scaleD);

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), inX); offset = b.create<arith::UIToFPOp>(b.getF32Type(), offset);

val = b.create<arith::MulFOp>(val, scaleD);

yScaleN = val = b.create<arith::AddFOp>(val, offset);

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), yScaleN); val = b.create<arith::DivFOp>(val, scaleN);

yScaleD = index = b.create<math::FloorOp>(val);

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), yScaleD); delta = b.create<arith::SubFOp>(val, index);

xScaleN = index = b.create<arith::FPToSIOp>(b.getI32Type(), index);

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), xScaleN); };

xScaleD =

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), xScaleD);

yOffset =

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), yOffset);

xOffset =

rewriter.create<arith::UIToFPOp>(loc, rewriter.getF32Type(), xOffset);

y = rewriter.create<arith::MulFOp>(loc, y, yScaleD);

x = rewriter.create<arith::MulFOp>(loc, x, xScaleD);

y = rewriter.create<arith::AddFOp>(loc, y, yOffset);

x = rewriter.create<arith::AddFOp>(loc, x, xOffset);

y = rewriter.create<arith::DivFOp>(loc, y, yScaleN);

x = rewriter.create<arith::DivFOp>(loc, x, xScaleN);

iy = rewriter.create<math::FloorOp>(loc, y);

ix = rewriter.create<math::FloorOp>(loc, x);

dy = rewriter.create<arith::SubFOp>(loc, y, iy);

dx = rewriter.create<arith::SubFOp>(loc, x, ix);

iy = rewriter.create<arith::FPToSIOp>(loc, rewriter.getI32Type(), iy); // Compute the ix and dx values for the X and Y dimensions - int case.

awarzynski:

index = b.create<arith::FPToSIOp>(b.getI32Type(), index);

};

- // Compute the index and delta values for the integer case.

+ // Compute the ix and dx values for both the X and Y dimensions - integer case.

auto getIntIdxAndDeltaInt = [&](Value &index, Value &delta, Value in,

To match the comment above.


ix = rewriter.create<arith::FPToSIOp>(loc, rewriter.getI32Type(), ix); auto getIndexAndDeltaInt = [&](Value &index, Value &delta, Value in,

} else { Value scaleN, Value scaleD, Value offset,

int size, ImplicitLocOpBuilder &b) {

// x = x * scale_d + offset;

// ix = floor(x / scale_n)

// dx = x - ix * scale_n; // dx = x - ix * scale_n;

Value y = rewriter.create<arith::MulIOp>(loc, inY, yScaleD); Value val = b.create<arith::MulIOp>(in, scaleD);

Value x = rewriter.create<arith::MulIOp>(loc, inX, xScaleD); val = b.create<arith::AddIOp>(val, offset);

index = b.create<arith::DivSIOp>(val, scaleN);

y = rewriter.create<arith::AddIOp>(loc, y, yOffset); delta = b.create<arith::MulIOp>(index, scaleN);

x = rewriter.create<arith::AddIOp>(loc, x, xOffset); delta = b.create<arith::SubIOp>(val, delta);

};

iy = rewriter.create<arith::DivSIOp>(loc, y, yScaleN);

ix = rewriter.create<arith::DivSIOp>(loc, x, xScaleN);

Value tempY = rewriter.create<arith::MulIOp>(loc, iy, yScaleN);

Value tempX = rewriter.create<arith::MulIOp>(loc, ix, xScaleN);

dy = rewriter.create<arith::SubIOp>(loc, y, tempY); Value ix, iy, dx, dy;

awarzynski: What's the benefit of the extra indentation?

rsuderman (author): This indent is necessary for the scope of the block header guard. It guarantees that the rewriter inserts new operations within the linalg op's block and restores the insertion position to after the constructed linalg operation.

awarzynski: [nit] I think that this comment can safely be skipped. If you'd like to keep it, can you add one for the Int case as well?

dx = rewriter.create<arith::SubIOp>(loc, x, tempX); if (floatingPointMode) {

awarzynski: It's not obvious to me that this is correct or, put differently, that it conforms with the TOSA spec.

rsuderman (author): When width / height are 1 the entire resize is degenerate, as we clamp the lookup index to [0,0] and would interpolate between two identical values. While this does not technically produce identical instructions, the result is identical, and it avoids generating unoptimizable code for these degenerate cases.

awarzynski:

> When width / height are 1 the entire resize is degenerate as we clamp the lookup index to [0,0] and would interpolate between two identical values.

You are probably right, but it was hard to extract that from the code itself. Thanks for explaining!
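The degenerate case described in this thread can be checked on scalars. The sketch below is not the lowering itself: `lerpScaled` and `sampleAxis` are hypothetical names, and a single row of samples stands in for one image dimension. With a size of 1 the maximum index is 0, so both clamped indices collapse to 0 and the weighted sum reduces to the single sample times the scale, whatever the incoming index and delta are.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Integer-weighted interpolation in the style of the quantized path:
// v0 * (scale - d) + v1 * d.
inline int64_t lerpScaled(int64_t v0, int64_t v1, int64_t d, int64_t scale) {
  return v0 * (scale - d) + v1 * d;
}

// Hypothetical helper: interpolate along one axis of a row of samples,
// clamping both lookup indices to [0, size - 1].
inline int64_t sampleAxis(const std::vector<int32_t> &row, int32_t idx,
                          int64_t d, int64_t scale) {
  assert(!row.empty() && "the row must hold at least one sample");
  int32_t maxIdx = static_cast<int32_t>(row.size()) - 1;
  int32_t i0 = std::min(std::max(idx, 0), maxIdx);
  int32_t i1 = std::min(std::max(idx + 1, 0), maxIdx);
  return lerpScaled(row[i0], row[i1], d, scale);
}
```

For a one-element row, `i0` and `i1` are both 0, so the result is `row[0] * scale` independent of `idx` and `d`.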

getIndexAndDeltaFp(iy, dy, inY, yScaleN, yScaleD, yOffset, imageH, b);

getIndexAndDeltaFp(ix, dx, inX, xScaleN, xScaleD, xOffset, imageW, b);

} else {

getIndexAndDeltaInt(iy, dy, inY, yScaleN, yScaleD, yOffset, imageH, b);

getIndexAndDeltaInt(ix, dx, inX, xScaleN, xScaleD, xOffset, imageW, b);

} }

if (op.getMode() == "NEAREST_NEIGHBOR") { if (op.getMode() == "NEAREST_NEIGHBOR") {

Value yPred, xPred; auto one = b.create<arith::ConstantOp>(b.getI32IntegerAttr(1));

auto zeroVal = rewriter.create<arith::ConstantOp>(

loc, rewriter.getI32IntegerAttr(0)); auto getNearestIndexAndClamp = [&](Value val, Value dval, Value scale,

auto oneVal = rewriter.create<arith::ConstantOp>( Value max, int size,

loc, rewriter.getI32IntegerAttr(1)); ImplicitLocOpBuilder &b) -> Value {

if (size == 1) {

return b.create<arith::ConstantIndexOp>(0);

}

awarzynski: IIUC, this is not needed?

rsuderman (author): removed.

awarzynski: `roundIndex` -> `getNearestIndexAndClamp`?

rsuderman (author): renamed

// Round the index position towards the closest pixel location. Value pred;

if (floatingPointMode) { if (floatingPointMode) {

auto halfVal = rewriter.create<arith::ConstantOp>( auto h = b.create<arith::ConstantOp>(b.getF32FloatAttr(0.5f));

loc, rewriter.getF32FloatAttr(0.5f)); pred = b.create<arith::CmpFOp>(arith::CmpFPredicate::OGE, dval, h);

yPred = rewriter.create<arith::CmpFOp>(loc, arith::CmpFPredicate::OGE,

dy, halfVal);

xPred = rewriter.create<arith::CmpFOp>(loc, arith::CmpFPredicate::OGE,

dx, halfVal);

} else { } else {

Value dyDoubled = rewriter.create<arith::ShLIOp>(loc, dy, oneVal); Value dvalDouble = b.create<arith::ShLIOp>(dval, one);

Value dxDoubled = rewriter.create<arith::ShLIOp>(loc, dx, oneVal); pred = b.create<arith::CmpIOp>(arith::CmpIPredicate::sge,

yPred = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::sge, dvalDouble, scale);

dyDoubled, yScaleN); }

xPred = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::sge,

dxDoubled, xScaleN); auto offset = b.create<arith::SelectOp>(pred, one, zeroI32);

} val = b.create<arith::AddIOp>(val, offset);

val = clampIntHelper(loc, val, zeroI32, max, b);

auto yOffset = return b.create<arith::IndexCastOp>(b.getIndexType(), val);

rewriter.create<arith::SelectOp>(loc, yPred, oneVal, zeroVal); };

auto xOffset =

rewriter.create<arith::SelectOp>(loc, xPred, oneVal, zeroVal);

iy = rewriter.create<arith::AddIOp>(loc, iy, yOffset);

ix = rewriter.create<arith::AddIOp>(loc, ix, xOffset);

// Clamp the index to be within the bounds of the input image.

iy = clampIntHelper(loc, iy, hwMin, hMax, rewriter);

ix = clampIntHelper(loc, ix, hwMin, wMax, rewriter);

// Read the value from the input array.

iy =

rewriter.create<arith::IndexCastOp>(loc, rewriter.getIndexType(), iy);

ix =

rewriter.create<arith::IndexCastOp>(loc, rewriter.getIndexType(), ix);

Value result = rewriter.create<tensor::ExtractOp>( iy = getNearestIndexAndClamp(iy, dy, yScaleN, hMax, imageH, b);

loc, input, ValueRange{batch, iy, ix, channel}); ix = getNearestIndexAndClamp(ix, dx, xScaleN, wMax, imageW, b);

rewriter.create<linalg::YieldOp>(loc, result); Value result = b.create<tensor::ExtractOp>(

input, ValueRange{batch, iy, ix, channel});

b.create<linalg::YieldOp>(result);

} else { } else {

// The mode here must be BILINEAR. // The mode here must be BILINEAR.

assert(op.getMode() == "BILINEAR"); assert(op.getMode() == "BILINEAR");

Value y0 = iy;

Value x0 = ix;

auto oneVal = rewriter.create<arith::ConstantOp>( auto oneVal = b.create<arith::ConstantOp>(b.getI32IntegerAttr(1));

loc, rewriter.getI32IntegerAttr(1));

Value y1 = rewriter.create<arith::AddIOp>(loc, y0, oneVal);

Value x1 = rewriter.create<arith::AddIOp>(loc, x0, oneVal);

y0 = clampIntHelper(loc, y0, hwMin, hMax, rewriter);

y1 = clampIntHelper(loc, y1, hwMin, hMax, rewriter);

x0 = clampIntHelper(loc, x0, hwMin, wMax, rewriter); auto getClampedIdxs = [&](Value &val0, Value &val1, int size, Value in,

x1 = clampIntHelper(loc, x1, hwMin, wMax, rewriter); Value max, ImplicitLocOpBuilder &b) {

val0 = in;

val1 = b.create<arith::AddIOp>(val0, oneVal);

val0 = clampIntHelper(loc, val0, zeroI32, max, b);

val1 = clampIntHelper(loc, val1, zeroI32, max, b);

val0 = b.create<arith::IndexCastOp>(b.getIndexType(), val0);

val1 = b.create<arith::IndexCastOp>(b.getIndexType(), val1);

};

y0 = // Linalg equivalent to the section below:

rewriter.create<arith::IndexCastOp>(loc, rewriter.getIndexType(), y0); // int16_t iy0 = apply_max(iy, 0);

y1 = // int16_t iy1 = apply_min(iy + 1, IH - 1);

rewriter.create<arith::IndexCastOp>(loc, rewriter.getIndexType(), y1); // int16_t ix0 = apply_max(ix, 0);

x0 = // int16_t ix1 = apply_min(ix + 1, IW - 1);

rewriter.create<arith::IndexCastOp>(loc, rewriter.getIndexType(), x0); Value x0, x1, y0, y1;

x1 = getClampedIdxs(y0, y1, imageH, iy, hMax, b);

rewriter.create<arith::IndexCastOp>(loc, rewriter.getIndexType(), x1); getClampedIdxs(x0, x1, imageW, ix, wMax, b);

Value y0x0 = rewriter.create<tensor::ExtractOp>( Value y0x0 = b.create<tensor::ExtractOp>(

loc, input, ValueRange{batch, y0, x0, channel}); input, ValueRange{batch, y0, x0, channel});

Value y0x1 = rewriter.create<tensor::ExtractOp>( Value y0x1 = b.create<tensor::ExtractOp>(

loc, input, ValueRange{batch, y0, x1, channel}); input, ValueRange{batch, y0, x1, channel});

Value y1x0 = rewriter.create<tensor::ExtractOp>( Value y1x0 = b.create<tensor::ExtractOp>(

loc, input, ValueRange{batch, y1, x0, channel}); input, ValueRange{batch, y1, x0, channel});

Value y1x1 = rewriter.create<tensor::ExtractOp>( Value y1x1 = b.create<tensor::ExtractOp>(

loc, input, ValueRange{batch, y1, x1, channel}); input, ValueRange{batch, y1, x1, channel});

if (floatingPointMode) { if (floatingPointMode) {

Value rightPart = dx; auto oneVal = b.create<arith::ConstantOp>(b.getF32FloatAttr(1.0f));

auto oneVal = rewriter.create<arith::ConstantOp>( auto interpolate = [&](Value val0, Value val1, Value delta,

loc, rewriter.getF32FloatAttr(1.0f)); ImplicitLocOpBuilder &b) -> Value {

Value leftPart = rewriter.create<arith::SubFOp>(loc, oneVal, dx); Value oneMinusDelta = b.create<arith::SubFOp>(oneVal, delta);

Value mul0 = b.create<arith::MulFOp>(val0, oneMinusDelta);

y0x0 = rewriter.create<arith::MulFOp>(loc, y0x0, leftPart); Value mul1 = b.create<arith::MulFOp>(val1, delta);

y0x1 = rewriter.create<arith::MulFOp>(loc, y0x1, rightPart); return b.create<arith::AddFOp>(mul0, mul1);

Value topAcc = rewriter.create<arith::AddFOp>(loc, y0x0, y0x1); };

y1x0 = rewriter.create<arith::MulFOp>(loc, y1x0, leftPart);

y1x1 = rewriter.create<arith::MulFOp>(loc, y1x1, rightPart);

Value bottomAcc = rewriter.create<arith::AddFOp>(loc, y1x0, y1x1);

Value bottomPart = dy;

Value topPart = rewriter.create<arith::SubFOp>(loc, oneVal, dy);

topAcc = rewriter.create<arith::MulFOp>(loc, topAcc, topPart);

bottomAcc = rewriter.create<arith::MulFOp>(loc, bottomAcc, bottomPart);

Value result = rewriter.create<arith::AddFOp>(loc, topAcc, bottomAcc);

rewriter.create<linalg::YieldOp>(loc, result); // Linalg equivalent to the section below:

// topAcc = v00 * (unit_x - dx);

// topAcc += v01 * dx;

Value topAcc = interpolate(y0x0, y0x1, dx, b);

// Linalg equivalent to the section below:

// bottomAcc = v10 * (unit_x - dx);

// bottomAcc += v11 * dx;

Value bottomAcc = interpolate(y1x0, y1x1, dx, b);

// Linalg equivalent to the section below:

// result = topAcc * (unit_y - dy) + bottomAcc * dy

Value result = interpolate(topAcc, bottomAcc, dy, b);

b.create<linalg::YieldOp>(result);

awarzynski:

I think that `clampEdges` is a bit misleading - this lambda takes `in` and calculates the following:

val0 = clamp(in, min, max);
val1 = clamp(in + 1, min, max);

And in the spec this is:

int16_t iy0 = apply_max(iy, 0);
int16_t iy1 = apply_min(iy + 1, IH - 1);
int16_t ix0 = apply_max(ix, 0);
int16_t ix1 = apply_min(ix + 1, IW - 1);

And these are indices for the input image. So perhaps instead of `clampEdges`, this could be `getInputIdxsAndClamp`? Also, could you add a comment that would make it easy to map it back to the spec? You could use the snippet above ^^^.

} else { } else {

// Perform in quantized space. // Perform in quantized space.

y0x0 = rewriter.create<arith::ExtSIOp>(loc, resultElementTy, y0x0); y0x0 = b.create<arith::ExtSIOp>(resultETy, y0x0);

y0x1 = rewriter.create<arith::ExtSIOp>(loc, resultElementTy, y0x1); y0x1 = b.create<arith::ExtSIOp>(resultETy, y0x1);

y1x0 = rewriter.create<arith::ExtSIOp>(loc, resultElementTy, y1x0); y1x0 = b.create<arith::ExtSIOp>(resultETy, y1x0);

y1x1 = rewriter.create<arith::ExtSIOp>(loc, resultElementTy, y1x1); y1x1 = b.create<arith::ExtSIOp>(resultETy, y1x1);

if (resultElementTy.getIntOrFloatBitWidth() > 32) { const int64_t deltaBitwidth = dx.getType().getIntOrFloatBitWidth();

dx = rewriter.create<arith::ExtSIOp>(loc, resultElementTy, dx); if (resultETy.getIntOrFloatBitWidth() > deltaBitwidth) {

dy = rewriter.create<arith::ExtSIOp>(loc, resultElementTy, dy); dx = b.create<arith::ExtSIOp>(resultETy, dx);

dy = b.create<arith::ExtSIOp>(resultETy, dy);

} }

Value xScaleNExt = xScaleN;

Value yScaleNExt = yScaleN; Value yScaleNExt = yScaleN;

Value xScaleNExt = xScaleN;

if (xScaleN.getType() != resultElementTy) const int64_t scaleBitwidth =

xScaleNExt = xScaleN.getType().getIntOrFloatBitWidth();

rewriter.create<arith::ExtSIOp>(loc, resultElementTy, xScaleN); if (resultETy.getIntOrFloatBitWidth() > scaleBitwidth) {

jpienaar: Missing ; ?

yScaleNExt = b.create<arith::ExtSIOp>(resultETy, yScaleN);

if (yScaleN.getType() != resultElementTy) xScaleNExt = b.create<arith::ExtSIOp>(resultETy, xScaleN);

yScaleNExt =

rewriter.create<arith::ExtSIOp>(loc, resultElementTy, yScaleN);

awarzynski: Weird, something seems to be missing here: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp#L1684-L1693

rsuderman (author): Must have been wiped out during the merge

Value topAcc, bottomAcc;

if (imageW == 1) {

topAcc = rewriter.create<arith::MulIOp>(loc, y0x0, xScaleNExt);

bottomAcc = rewriter.create<arith::MulIOp>(loc, y1x0, xScaleNExt);

} else {

Value rightPart = dx;

Value leftPart = rewriter.create<arith::SubIOp>(loc, xScaleNExt, dx);

y0x0 = rewriter.create<arith::MulIOp>(loc, y0x0, leftPart);

y0x1 = rewriter.create<arith::MulIOp>(loc, y0x1, rightPart);

topAcc = rewriter.create<arith::AddIOp>(loc, y0x0, y0x1);

y1x0 = rewriter.create<arith::MulIOp>(loc, y1x0, leftPart);

y1x1 = rewriter.create<arith::MulIOp>(loc, y1x1, rightPart);

bottomAcc = rewriter.create<arith::AddIOp>(loc, y1x0, y1x1);

} }

awarzynski:

Value xScaleNExt = xScaleN;

- if (resultETy.getIntOrFloatBitWidth() > 32) {

+ if (yScaleN.getType() != resultETy) {

yScaleNExt = b.create<arith::ExtSIOp>(resultETy, yScaleN);

`32` is a bit of a magic number here. IMHO, comparing against `resultETy` better conveys the rationale behind the extension. Similar comment for L1642.

rsuderman (author): No, in this case we are already comparing after `resultETy`; we would need to compare to `*ScaleN`'s type. Updated to use this instead, but this magic number is hard-defined by the standard for all integer types.

Value result; auto interpolate = [](Value val0, Value val1, Value weight0,

awarzynski: Why "offset" in `offsetClampCastIndxs`? Naming is hard and I always really struggle. Perhaps `getClampedIdxs` would be more descriptive? I'm probably missing something here.

if (imageH == 1) { Value weight1,

result = rewriter.create<arith::MulIOp>(loc, topAcc, yScaleNExt); ImplicitLocOpBuilder &b) -> Value {

} else { Value mul0 = b.create<arith::MulIOp>(val0, weight0);

Value bottomPart = dy; Value mul1 = b.create<arith::MulIOp>(val1, weight1);

awarzynski: This is the special "broadcast" case not covered by the spec, isn't it? (tested e.g. here). Could you add a comment so that this is easy to identify? Otherwise it's not clear what makes `size == 1` special.

Value topPart = rewriter.create<arith::SubIOp>(loc, yScaleNExt, dy); return b.create<arith::AddIOp>(mul0, mul1);

topAcc = rewriter.create<arith::MulIOp>(loc, topAcc, topPart); };

bottomAcc =

rewriter.create<arith::MulIOp>(loc, bottomAcc, bottomPart);

result = rewriter.create<arith::AddIOp>(loc, topAcc, bottomAcc);

}

rewriter.create<linalg::YieldOp>(loc, result); Value weight0 = b.create<arith::SubIOp>(xScaleNExt, dx);

Value weight1 = dx;

Value topAcc = interpolate(y0x0, y0x1, weight0, weight1, b);

Value bottomAcc = interpolate(y1x0, y1x1, weight0, weight1, b);

weight0 = b.create<arith::SubIOp>(yScaleNExt, dy);

weight1 = dy;

Value result = interpolate(topAcc, bottomAcc, weight0, weight1, b);

b.create<linalg::YieldOp>(result);

}

} }

rewriter.replaceOp(op, resize); rewriter.replaceOp(op, resize);

return success(); return success();

} }

}; };

// At the codegen level any identity operations should be removed. Any cases // At the codegen level any identity operations should be removed. Any cases

// where identity is load-bearing (e.g. cross device computation) should be // where identity is load-bearing (e.g. cross device computation) should be

// handled before lowering to codegen. // handled before lowering to codegen.

template <typename SrcOp> template <typename SrcOp>

class IdentityNConverter : public OpRewritePattern<SrcOp> { class IdentityNConverter : public OpRewritePattern<SrcOp> {

public: public:

using OpRewritePattern<SrcOp>::OpRewritePattern; using OpRewritePattern<SrcOp>::OpRewritePattern;

LogicalResult matchAndRewrite(SrcOp op, LogicalResult matchAndRewrite(SrcOp op,

PatternRewriter &rewriter) const final { PatternRewriter &rewriter) const final {

rewriter.replaceOp(op, op.getOperation()->getOperands()); rewriter.replaceOp(op, op.getOperation()->getOperands());

return success(); return success();

dcaballe: As a general comment, wondering if we could split this large match and rewrite into multiple patterns sharing some common implementation or at least refactor part of this code into utility functions, as this is growing significantly. WDYT?

rsuderman (author): I agree as well. I think I know how to fix this up, but it would be better in a follow-up that does not involve IR changes. I will follow up once this lands.

} }

}; };

template <typename SrcOp> template <typename SrcOp>

class ReduceConverter : public OpRewritePattern<SrcOp> { class ReduceConverter : public OpRewritePattern<SrcOp> {

public: public:

using OpRewritePattern<SrcOp>::OpRewritePattern; using OpRewritePattern<SrcOp>::OpRewritePattern;

LogicalResult matchAndRewrite(SrcOp reduceOp, LogicalResult matchAndRewrite(SrcOp reduceOp,

PatternRewriter &rewriter) const final { PatternRewriter &rewriter) const final {

return reduceMatchAndRewriteHelper(reduceOp, reduceOp.getAxis(), rewriter); return reduceMatchAndRewriteHelper(reduceOp, reduceOp.getAxis(), rewriter);

} }

}; };

struct ConcatConverter : public OpConversionPattern<tosa::ConcatOp> { struct ConcatConverter : public OpConversionPattern<tosa::ConcatOp> {

using OpConversionPattern<tosa::ConcatOp>::OpConversionPattern; using OpConversionPattern<tosa::ConcatOp>::OpConversionPattern;

awarzynski:

if (floatingPointMode) {

- auto oneVal = b.create<arith::ConstantOp>(b.getF32FloatAttr(1.0f));

- Value w0 = b.create<arith::SubFOp>(oneVal, dx);

- Value w1 = dx;

- auto interpolate = [](Value val0, Value val1, Value d0, Value d1,

- int64_t size,

+ auto interpolate = [](Value val0, Value val1, Value weight1, Value weight2,

ImplicitLocOpBuilder &b) -> Value {

- if (size == 1)

- return val0;

- Value mul0 = b.create<arith::MulFOp>(val0, d0);

- Value mul1 = b.create<arith::MulFOp>(val1, d1);

+ Value mul0 = b.create<arith::MulFOp>(val0, weight1);

+ Value mul1 = b.create<arith::MulFOp>(val1, weight2);

return b.create<arith::AddFOp>(mul0, mul1);

- };

+ };

+ auto oneVal = b.create<arith::ConstantOp>(b.getF32FloatAttr(1.0f));

+ Value w0 = b.create<arith::SubFOp>(oneVal, dx);

+ Value w1 = dx;

+ // Linalg equivalent to the section below:

+ // topAcc = v00 * (unit_x - dx);

+ // topAcc += v01 * dx;

Value topAcc = interpolate(y0x0, y0x1, w0, w1, imageW, b);

+ // Linalg equivalent to the section below:

+ // bottomAcc = v10 * (unit_x - dx);

+ // bottomAcc += v11 * dx;

Value bottomAcc = interpolate(y1x0, y1x1, w0, w1, imageW, b);

w0 = b.create<arith::SubFOp>(oneVal, dy);

w1 = dy;

+ // Linalg equivalent to the section below:

+ // result = topAcc * (unit_y - dy) + bottomAcc * dy

Value result = interpolate(topAcc, bottomAcc, w0, w1, imageH, b);

b.create<linalg::YieldOp>(result);

} else {

* It wasn't obvious to me that `w1` and `w2` are weights, hence the renaming as `weight1` and `weight2`
* Added extra comments to make matching against the spec simpler
* Removed `if (size == 1)` - that special case wasn't present in the original implementation, so shouldn't be needed here?
* Moved the definition of `interpolate` to the top of the block to better separate it from the rest

WDYT? Similar suggestion for the block below.

rsuderman (author): I implemented them for the most part. I moved the weight calculations into the interpolation function and did some renaming.
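As a scalar illustration of the floating-point interpolation discussed in this thread, with the weight computation folded into the helper as the author describes, a minimal sketch might look like the following. The names `lerpFp` and `bilinearFp` are hypothetical, and the deltas are assumed to lie in [0, 1]; the actual lowering builds `arith` ops rather than computing values directly.

```cpp
#include <cassert>

// One-dimensional step: val0 * (1 - delta) + val1 * delta.
inline float lerpFp(float val0, float val1, float delta) {
  assert(delta >= 0.0f && delta <= 1.0f && "delta is assumed to be in [0, 1]");
  float oneMinusDelta = 1.0f - delta;
  return val0 * oneMinusDelta + val1 * delta;
}

// Bilinear interpolation over four neighbouring samples:
//   topAcc    = y0x0 * (1 - dx) + y0x1 * dx
//   bottomAcc = y1x0 * (1 - dx) + y1x1 * dx
//   result    = topAcc * (1 - dy) + bottomAcc * dy
inline float bilinearFp(float y0x0, float y0x1, float y1x0, float y1x1,
                        float dx, float dy) {
  float topAcc = lerpFp(y0x0, y0x1, dx);
  float bottomAcc = lerpFp(y1x0, y1x1, dx);
  return lerpFp(topAcc, bottomAcc, dy);
}
```

Reusing one helper for the X pass (twice) and the Y pass (once) is the same shape as the three `interpolate` calls in the refactored code.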

LogicalResult LogicalResult

matchAndRewrite(tosa::ConcatOp op, OpAdaptor adaptor, matchAndRewrite(tosa::ConcatOp op, OpAdaptor adaptor,

ConversionPatternRewriter &rewriter) const override { ConversionPatternRewriter &rewriter) const override {

auto inputType = op.getOperand(0).getType().template cast<ShapedType>(); auto inputType = op.getOperand(0).getType().template cast<ShapedType>();

auto resultType = op.getType().dyn_cast<RankedTensorType>(); auto resultType = op.getType().dyn_cast<RankedTensorType>();

Location loc = op.getLoc(); Location loc = op.getLoc();


mlir/lib/Conversion/TosaToLinalg/TosaToLinalgNamed.cpp

	#include "mlir/Conversion/TosaToLinalg/TosaToLinalg.h"			#include "mlir/Conversion/TosaToLinalg/TosaToLinalg.h"
	#include "mlir/Dialect/Arith/IR/Arith.h"			#include "mlir/Dialect/Arith/IR/Arith.h"
	#include "mlir/Dialect/Linalg/IR/Linalg.h"			#include "mlir/Dialect/Linalg/IR/Linalg.h"
	#include "mlir/Dialect/Math/IR/Math.h"			#include "mlir/Dialect/Math/IR/Math.h"
	#include "mlir/Dialect/SCF/IR/SCF.h"			#include "mlir/Dialect/SCF/IR/SCF.h"
	#include "mlir/Dialect/Tensor/IR/Tensor.h"			#include "mlir/Dialect/Tensor/IR/Tensor.h"
	#include "mlir/Dialect/Tensor/Utils/Utils.h"			#include "mlir/Dialect/Tensor/Utils/Utils.h"
	#include "mlir/Dialect/Tosa/IR/TosaOps.h"			#include "mlir/Dialect/Tosa/IR/TosaOps.h"
	#include "mlir/Dialect/Tosa/Utils/CoversionUtils.h"			#include "mlir/Dialect/Tosa/Utils/ConversionUtils.h"
	#include "mlir/Dialect/Utils/ReshapeOpsUtils.h"			#include "mlir/Dialect/Utils/ReshapeOpsUtils.h"
	#include "mlir/IR/Matchers.h"			#include "mlir/IR/Matchers.h"
	#include "mlir/IR/PatternMatch.h"			#include "mlir/IR/PatternMatch.h"
	#include "mlir/Transforms/DialectConversion.h"			#include "mlir/Transforms/DialectConversion.h"
	#include "mlir/Transforms/GreedyPatternRewriteDriver.h"			#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

	#include <numeric>			#include <numeric>


mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp

	//===- TosaCanonicalizations.cpp - Canonicalization patterns & folders ----===//			//===- TosaCanonicalizations.cpp - Canonicalization patterns & folders ----===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// \file			// \file
	// TOSA canonicalization patterns and folders.			// TOSA canonicalization patterns and folders.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "mlir/Dialect/Quant/QuantOps.h"			#include "mlir/Dialect/Quant/QuantOps.h"
	#include "mlir/Dialect/Tensor/IR/Tensor.h"			#include "mlir/Dialect/Tensor/IR/Tensor.h"
	#include "mlir/Dialect/Tosa/IR/TosaOps.h"			#include "mlir/Dialect/Tosa/IR/TosaOps.h"
	#include "mlir/Dialect/Tosa/Utils/CoversionUtils.h"			#include "mlir/Dialect/Tosa/Utils/ConversionUtils.h"
	#include "mlir/Dialect/Tosa/Utils/QuantUtils.h"			#include "mlir/Dialect/Tosa/Utils/QuantUtils.h"
	#include "mlir/Dialect/Tosa/Utils/ShapeUtils.h"			#include "mlir/Dialect/Tosa/Utils/ShapeUtils.h"
	#include "mlir/IR/BuiltinTypes.h"			#include "mlir/IR/BuiltinTypes.h"
	#include "mlir/IR/DialectImplementation.h"			#include "mlir/IR/DialectImplementation.h"
	#include "mlir/IR/Matchers.h"			#include "mlir/IR/Matchers.h"
	#include "mlir/IR/PatternMatch.h"			#include "mlir/IR/PatternMatch.h"
	#include "mlir/Transforms/FoldUtils.h"			#include "mlir/Transforms/FoldUtils.h"
	#include "mlir/Transforms/InliningUtils.h"			#include "mlir/Transforms/InliningUtils.h"

mlir/lib/Dialect/Tosa/Utils/ConversionUtils.cpp

	//===- ConversionUtils.cpp ------------------------------------------------===//			//===- ConversionUtils.cpp ------------------------------------------------===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// Utility functions for TOSA lowering			// Utility functions for TOSA lowering
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "mlir/Dialect/Tosa/Utils/CoversionUtils.h"			#include "mlir/Dialect/Tosa/Utils/ConversionUtils.h"

	using namespace mlir;			using namespace mlir;
	using namespace mlir::tosa;			using namespace mlir::tosa;

	SmallVector<utils::IteratorType>			SmallVector<utils::IteratorType>
	mlir::tosa::getNParallelLoopsAttrs(unsigned nParallelLoops) {			mlir::tosa::getNParallelLoopsAttrs(unsigned nParallelLoops) {
	return SmallVector<utils::IteratorType>(nParallelLoops,			return SmallVector<utils::IteratorType>(nParallelLoops,
	utils::IteratorType::parallel);			utils::IteratorType::parallel);
	}			}

	SmallVector<Value>			SmallVector<Value>
	mlir::tosa::condenseValues(const SmallVector<Value> &values) {			mlir::tosa::condenseValues(const SmallVector<Value> &values) {
	SmallVector<Value> condensedValues;			SmallVector<Value> condensedValues;
	for (auto value : values)			for (auto value : values)
	if (value)			if (value)
	condensedValues.push_back(value);			condensedValues.push_back(value);
	return condensedValues;			return condensedValues;
	}			}

	Value mlir::tosa::clampFloatHelper(Location loc, Value arg,			Value mlir::tosa::clampFloatHelper(Location loc, Value arg, Value min,
	arith::ConstantOp min, arith::ConstantOp max,			Value max, OpBuilder &rewriter) {
	OpBuilder &rewriter) {
	Value minValue = rewriter.create<arith::MinFOp>(loc, arg, max);			Value minValue = rewriter.create<arith::MinFOp>(loc, arg, max);
	return rewriter.create<arith::MaxFOp>(loc, minValue, min);			return rewriter.create<arith::MaxFOp>(loc, minValue, min);
	}			}

	Value mlir::tosa::clampIntHelper(Location loc, Value arg, arith::ConstantOp min,			Value mlir::tosa::clampIntHelper(Location loc, Value arg, Value min, Value max,
	arith::ConstantOp max, OpBuilder &rewriter) {			OpBuilder &rewriter) {
	auto smallerThanMin =			auto smallerThanMin =
	rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, arg, min);			rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, arg, min);
	auto minOrArg =			auto minOrArg =
	rewriter.create<arith::SelectOp>(loc, smallerThanMin, min, arg);			rewriter.create<arith::SelectOp>(loc, smallerThanMin, min, arg);
	auto largerThanMax =			auto largerThanMax =
	rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, max, arg);			rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, max, arg);
	return rewriter.create<arith::SelectOp>(loc, largerThanMax, max, minOrArg);			return rewriter.create<arith::SelectOp>(loc, largerThanMax, max, minOrArg);
	}			}
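
For readers following the signature change above (the bounds are now plain `Value`s rather than `arith::ConstantOp`s), the select chains the two helpers emit can be modeled on scalars. This is a minimal sketch, not MLIR API code: `clampIntModel` and `clampFloatModel` are hypothetical names, `int64_t`/`double` stand in for the element types, and NaN handling of the float ops is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the ops built by clampIntHelper.
// Local names mirror the values created in the helper.
int64_t clampIntModel(int64_t arg, int64_t min, int64_t max) {
  assert(min <= max && "the model assumes a non-empty range");
  bool smallerThanMin = arg < min;                // arith.cmpi slt, arg, min
  int64_t minOrArg = smallerThanMin ? min : arg;  // arith.select
  bool largerThanMax = max < arg;                 // arith.cmpi slt, max, arg
  return largerThanMax ? max : minOrArg;          // arith.select
}

// Hypothetical scalar model of clampFloatHelper: take the minimum against
// the upper bound first, then the maximum against the lower bound.
double clampFloatModel(double arg, double min, double max) {
  double minValue = std::min(arg, max);
  return std::max(minValue, min);
}
```

Because both bounds are ordinary values in the model as well, the same function serves the shared zero lower bound and the per-dimension upper bounds used in the resize tests.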

mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir

	// CHECK-LABEL: @resize_nearest_int			// CHECK-LABEL: @resize_nearest_int
	func.func @resize_nearest_int(%arg0: tensor<1x15x13x1xi8>) -> () {			func.func @resize_nearest_int(%arg0: tensor<1x15x13x1xi8>) -> () {
	// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x23x179x1xi8>			// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x23x179x1xi8>
	// CHECK: %[[GENERIC:.+]] = linalg.generic			// CHECK: %[[GENERIC:.+]] = linalg.generic
	// CHECK: %[[IDX_0:.+]] = linalg.index 0			// CHECK: %[[IDX_0:.+]] = linalg.index 0
	// CHECK: %[[IDX_1:.+]] = linalg.index 1			// CHECK: %[[IDX_1:.+]] = linalg.index 1
	// CHECK: %[[IDX_2:.+]] = linalg.index 2			// CHECK: %[[IDX_2:.+]] = linalg.index 2
	// CHECK: %[[IDX_3:.+]] = linalg.index 3			// CHECK: %[[IDX_3:.+]] = linalg.index 3
	// CHECK-DAG: %[[XY_MIN:.+]] = arith.constant 0			// CHECK-DAG: %[[ZERO:.+]] = arith.constant 0
	// CHECK-DAG: %[[Y_MAX:.+]] = arith.constant 14			// CHECK-DAG: %[[Y_MAX:.+]] = arith.constant 14
	// CHECK-DAG: %[[X_MAX:.+]] = arith.constant 12			// CHECK-DAG: %[[X_MAX:.+]] = arith.constant 12

	// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX_1]]			// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX_1]]
	// CHECK: %[[X:.+]] = arith.index_cast %[[IDX_2]]			// CHECK: %[[X:.+]] = arith.index_cast %[[IDX_2]]
	// CHECK-DAG: %[[SCALE_Y_N:.*]] = arith.constant 11			// CHECK-DAG: %[[SCALE_Y_N:.*]] = arith.constant 11
	// CHECK-DAG: %[[SCALE_Y_D:.*]] = arith.constant 7			// CHECK-DAG: %[[SCALE_Y_D:.*]] = arith.constant 7
	// CHECK-DAG: %[[SCALE_X_N:.*]] = arith.constant 89			// CHECK-DAG: %[[SCALE_X_N:.*]] = arith.constant 89
	// CHECK-DAG: %[[SCALE_X_D:.*]] = arith.constant 6			// CHECK-DAG: %[[SCALE_X_D:.*]] = arith.constant 6
	// CHECK-DAG: %[[OFFSET_Y:.*]] = arith.constant 0			// CHECK-DAG: %[[OFFSET_Y:.*]] = arith.constant 0
	// CHECK-DAG: %[[OFFSET_X:.*]] = arith.constant 0			// CHECK-DAG: %[[OFFSET_X:.*]] = arith.constant 0
	// CHECK-DAG: %[[BORDER_Y:.*]] = arith.constant 0			// CHECK-DAG: %[[BORDER_Y:.*]] = arith.constant 0
	// CHECK-DAG: %[[BORDER_X:.*]] = arith.constant 0			// CHECK-DAG: %[[BORDER_X:.*]] = arith.constant 0

	// find the remainder and integer component of the target index.			// find the remainder and integer component of the target index.

	// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]			// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
	// CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
	// CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]			// CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
	// CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
	// CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]			// CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
	// CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
	// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]			// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
	// CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
	// CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]			// CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
	// CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]

	// Round to the nearest neighbor.			// CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
				// CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
				// CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
				// CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
				// CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]

	// CHECK-DAG: %[[ZERO:.*]] = arith.constant 0			// Compute the offset and bound for the Y position.
	// CHECK-DAG: %[[ONE:.*]] = arith.constant 1			// CHECK-DAG: %[[ONE:.*]] = arith.constant 1
	// CHECK: %[[D_Y_DOUBLE:.*]] = arith.shli %[[D_Y]], %[[ONE]]			// CHECK: %[[D_Y_DOUBLE:.*]] = arith.shli %[[D_Y]], %[[ONE]]
	// CHECK: %[[D_X_DOUBLE:.*]] = arith.shli %[[D_X]], %[[ONE]]
	// CHECK: %[[PRED_Y:.*]] = arith.cmpi sge, %[[D_Y_DOUBLE]], %[[SCALE_Y_N]]			// CHECK: %[[PRED_Y:.*]] = arith.cmpi sge, %[[D_Y_DOUBLE]], %[[SCALE_Y_N]]
	// CHECK: %[[PRED_X:.*]] = arith.cmpi sge, %[[D_X_DOUBLE]], %[[SCALE_X_N]]
	// CHECK: %[[VAL_37:.*]] = arith.select %[[PRED_Y]], %[[ONE]], %[[ZERO]]			// CHECK: %[[VAL_37:.*]] = arith.select %[[PRED_Y]], %[[ONE]], %[[ZERO]]
	// CHECK: %[[VAL_38:.*]] = arith.select %[[PRED_X]], %[[ONE]], %[[ZERO]]
	// CHECK: %[[VAL_39:.*]] = arith.addi %[[I_Y]], %[[VAL_37]]			// CHECK: %[[VAL_39:.*]] = arith.addi %[[I_Y]], %[[VAL_37]]
	// CHECK: %[[VAL_40:.*]] = arith.addi %[[I_X]], %[[VAL_38]]			// CHECK: %[[VAL_41:.*]] = arith.cmpi slt, %[[VAL_39]], %[[ZERO]]
				// CHECK: %[[VAL_42:.*]] = arith.select %[[VAL_41]], %[[ZERO]], %[[VAL_39]]
	// This section applies bound checking to be within the input image.

	// CHECK: %[[VAL_41:.*]] = arith.cmpi slt, %[[VAL_39]], %[[XY_MIN]]
	// CHECK: %[[VAL_42:.*]] = arith.select %[[VAL_41]], %[[XY_MIN]], %[[VAL_39]]
	// CHECK: %[[VAL_43:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[VAL_39]]			// CHECK: %[[VAL_43:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[VAL_39]]
	// CHECK: %[[VAL_44:.*]] = arith.select %[[VAL_43]], %[[Y_MAX]], %[[VAL_42]]			// CHECK: %[[VAL_44:.*]] = arith.select %[[VAL_43]], %[[Y_MAX]], %[[VAL_42]]
	// CHECK: %[[VAL_45:.*]] = arith.cmpi slt, %[[VAL_40]], %[[XY_MIN]]			// CHECK: %[[IDY:.+]] = arith.index_cast %[[VAL_44]]
	// CHECK: %[[VAL_46:.*]] = arith.select %[[VAL_45]], %[[XY_MIN]], %[[VAL_40]]
				// Compute the offset and bound for the X position.
				// CHECK: %[[D_X_DOUBLE:.*]] = arith.shli %[[D_X]], %[[ONE]]
				// CHECK: %[[PRED_X:.*]] = arith.cmpi sge, %[[D_X_DOUBLE]], %[[SCALE_X_N]]
				// CHECK: %[[VAL_38:.*]] = arith.select %[[PRED_X]], %[[ONE]], %[[ZERO]]
				// CHECK: %[[VAL_40:.*]] = arith.addi %[[I_X]], %[[VAL_38]]
				// CHECK: %[[VAL_45:.*]] = arith.cmpi slt, %[[VAL_40]], %[[ZERO]]
				// CHECK: %[[VAL_46:.*]] = arith.select %[[VAL_45]], %[[ZERO]], %[[VAL_40]]
	// CHECK: %[[VAL_47:.*]] = arith.cmpi slt, %[[X_MAX]], %[[VAL_40]]			// CHECK: %[[VAL_47:.*]] = arith.cmpi slt, %[[X_MAX]], %[[VAL_40]]
	// CHECK: %[[VAL_48:.*]] = arith.select %[[VAL_47]], %[[X_MAX]], %[[VAL_46]]			// CHECK: %[[VAL_48:.*]] = arith.select %[[VAL_47]], %[[X_MAX]], %[[VAL_46]]

	// Extract the nearest value using the computed indices.

	// CHECK: %[[IDY:.+]] = arith.index_cast %[[VAL_44]]
	// CHECK: %[[IDX:.+]] = arith.index_cast %[[VAL_48]]			// CHECK: %[[IDX:.+]] = arith.index_cast %[[VAL_48]]

	// CHECK: %[[EXTRACT:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[IDY]], %[[IDX]], %[[IDX_3]]]			// CHECK: %[[EXTRACT:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[IDY]], %[[IDX]], %[[IDX_3]]]
	// CHECK: linalg.yield %[[EXTRACT]]			// CHECK: linalg.yield %[[EXTRACT]]

	// Round to the nearest index.			// Round to the nearest index.
	%0 = "tosa.resize"(%arg0) {mode = "NEAREST_NEIGHBOR", scale = [11, 7, 89, 6], offset = [0, 0], border = [0, 0]} : (tensor<1x15x13x1xi8>) -> tensor<1x23x179x1xi8>			%0 = "tosa.resize"(%arg0) {mode = "NEAREST_NEIGHBOR", scale = [11, 7, 89, 6], offset = [0, 0], border = [0, 0]} : (tensor<1x15x13x1xi8>) -> tensor<1x23x179x1xi8>
	return			return
	}			}
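
The per-dimension sequence checked in `@resize_nearest_int` (scale, divide, remainder, round, clamp) can be read as the scalar computation below. This is a sketch under assumptions: `nearestIndexModel` is a hypothetical name, `int32_t` stands in for the integer type used inside the `linalg.generic` body, and the left shift by one is written as a multiplication by two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one dimension of the nearest-neighbor
// index computation checked above.
int32_t nearestIndexModel(int32_t out, int32_t scaleN, int32_t scaleD,
                          int32_t offset, int32_t maxIndex) {
  assert(scaleN > 0 && "the model assumes a positive scale numerator");
  int32_t y = out * scaleD + offset;           // arith.muli, arith.addi
  int32_t i = y / scaleN;                      // arith.divsi
  int32_t d = y - i * scaleN;                  // arith.muli, arith.subi
  int32_t round = (d * 2 >= scaleN) ? 1 : 0;   // arith.shli, cmpi sge, select
  int32_t idx = i + round;                     // arith.addi
  int32_t lower = (idx < 0) ? 0 : idx;         // cmpi slt + select against zero
  return (maxIndex < idx) ? maxIndex : lower;  // cmpi slt + select against max
}
```

With the Y parameters from the test (`scale = [11, 7, ...]`, `Y_MAX = 14`), the last output row 22 maps to input row 14, and output row 1 rounds up to input row 1 because its doubled remainder 14 is at least 11.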

	// -----			// -----

	// CHECK-LABEL: @resize_bilinear_int			// CHECK-LABEL: @resize_bilinear_int
	// CHECK-SAME: (%[[ARG0:[0-9a-zA-Z_]*]]:			// CHECK-SAME: (%[[ARG0:[0-9a-zA-Z_]*]]:
	func.func @resize_bilinear_int(%arg0: tensor<1x19x19x1xi8>) {			func.func @resize_bilinear_int(%arg0: tensor<1x19x20x1xi8>) {
	// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x289x289x1xi32>			// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x304x320x1xi48>
	// CHECK: %[[GENERIC:.+]] = linalg.generic			// CHECK: %[[GENERIC:.+]] = linalg.generic
	// CHECK: %[[IDX_0:.+]] = linalg.index 0			// CHECK: %[[IDX_0:.+]] = linalg.index 0
	// CHECK: %[[IDX_1:.+]] = linalg.index 1			// CHECK: %[[IDX_1:.+]] = linalg.index 1
	// CHECK: %[[IDX_2:.+]] = linalg.index 2			// CHECK: %[[IDX_2:.+]] = linalg.index 2
	// CHECK: %[[IDX_3:.+]] = linalg.index 3			// CHECK: %[[IDX_3:.+]] = linalg.index 3
	// CHECK-DAG: %[[XY_MIN:.+]] = arith.constant 0			// CHECK-DAG: %[[ZERO:.+]] = arith.constant 0
	// CHECK-DAG: %[[Y_MAX:.+]] = arith.constant 18			// CHECK-DAG: %[[Y_MAX:.+]] = arith.constant 18
	// CHECK-DAG: %[[X_MAX:.+]] = arith.constant 18			// CHECK-DAG: %[[X_MAX:.+]] = arith.constant 19
	// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX_1]]			// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX_1]]
	// CHECK: %[[X:.+]] = arith.index_cast %[[IDX_2]]			// CHECK: %[[X:.+]] = arith.index_cast %[[IDX_2]]
	// CHECK-DAG: %[[SCALE_Y_N:.*]] = arith.constant 16			// CHECK-DAG: %[[SCALE_Y_N:.*]] = arith.constant 16
	// CHECK-DAG: %[[SCALE_Y_D:.*]] = arith.constant 1			// CHECK-DAG: %[[SCALE_Y_D:.*]] = arith.constant 1
	// CHECK-DAG: %[[SCALE_X_N:.*]] = arith.constant 16			// CHECK-DAG: %[[SCALE_X_N:.*]] = arith.constant 16
	// CHECK-DAG: %[[SCALE_X_D:.*]] = arith.constant 1			// CHECK-DAG: %[[SCALE_X_D:.*]] = arith.constant 1
	// CHECK-DAG: %[[OFFSET_Y:.*]] = arith.constant 0			// CHECK-DAG: %[[OFFSET_Y:.*]] = arith.constant 0
	// CHECK-DAG: %[[OFFSET_X:.*]] = arith.constant 0			// CHECK-DAG: %[[OFFSET_X:.*]] = arith.constant 0
	// CHECK-DAG: %[[BORDER_Y:.*]] = arith.constant 0			// CHECK-DAG: %[[BORDER_Y:.*]] = arith.constant 0
	// CHECK-DAG: %[[BORDER_X:.*]] = arith.constant 0			// CHECK-DAG: %[[BORDER_X:.*]] = arith.constant 0

	// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]			// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
	// CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
	// CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]			// CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
	// CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
	// CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]			// CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
	// CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
	// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]			// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
	// CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
	// CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]			// CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]

				// CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
				// CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
				// CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
				// CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
	// CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]			// CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]

	// Compute the left, right, and top indices for the bilinear interpolation.			// Compute the left, right, and top indices for the bilinear interpolation.

	// CHECK-DAG: %[[ONE:.*]] = arith.constant 1			// CHECK-DAG: %[[ONE:.*]] = arith.constant 1
	// CHECK: %[[Y1:.*]] = arith.addi %[[I_Y]], %[[ONE]]			// CHECK: %[[Y1:.*]] = arith.addi %[[I_Y]], %[[ONE]]
	// CHECK: %[[X1:.*]] = arith.addi %[[I_X]], %[[ONE]]

	// Bound check each dimension.			// Bound check each dimension.

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_Y]], %[[XY_MIN]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_Y]], %[[ZERO]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[I_Y]]			// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[I_Y]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[I_Y]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[I_Y]]
	// CHECK: %[[YLO:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]			// CHECK: %[[YLO:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y1]], %[[XY_MIN]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y1]], %[[ZERO]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[Y1]]			// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[Y1]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[Y1]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[Y1]]
	// CHECK: %[[YHI:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]			// CHECK: %[[YHI:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_X]], %[[XY_MIN]]			// CHECK: %[[YLOI:.+]] = arith.index_cast %[[YLO]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[I_X]]			// CHECK: %[[YHII:.+]] = arith.index_cast %[[YHI]]

				// CHECK: %[[X1:.*]] = arith.addi %[[I_X]], %[[ONE]]
				// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_X]], %[[ZERO]]
				// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[I_X]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[I_X]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[I_X]]
	// CHECK: %[[XLO:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]			// CHECK: %[[XLO:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X1]], %[[XY_MIN]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X1]], %[[ZERO]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[X1]]			// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[X1]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[X1]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[X1]]
	// CHECK: %[[XHI:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]			// CHECK: %[[XHI:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]

	// Extract each corner of the bilinear interpolation.

	// CHECK: %[[YLOI:.+]] = arith.index_cast %[[YLO]]
	// CHECK: %[[YHII:.+]] = arith.index_cast %[[YHI]]
	// CHECK: %[[XLOI:.+]] = arith.index_cast %[[XLO]]			// CHECK: %[[XLOI:.+]] = arith.index_cast %[[XLO]]
	// CHECK: %[[XHII:.+]] = arith.index_cast %[[XHI]]			// CHECK: %[[XHII:.+]] = arith.index_cast %[[XHI]]

				// Extract each corner of the bilinear interpolation.

	// CHECK: %[[LOLO:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YLOI]], %[[XLOI]], %[[IDX_3]]]			// CHECK: %[[LOLO:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YLOI]], %[[XLOI]], %[[IDX_3]]]
	// CHECK: %[[LOHI:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YLOI]], %[[XHII]], %[[IDX_3]]]			// CHECK: %[[LOHI:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YLOI]], %[[XHII]], %[[IDX_3]]]
	// CHECK: %[[HILO:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YHII]], %[[XLOI]], %[[IDX_3]]]			// CHECK: %[[HILO:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YHII]], %[[XLOI]], %[[IDX_3]]]
	// CHECK: %[[HIHI:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YHII]], %[[XHII]], %[[IDX_3]]]			// CHECK: %[[HIHI:.+]] = tensor.extract %[[ARG0]][%[[IDX_0]], %[[YHII]], %[[XHII]], %[[IDX_3]]]

	// CHECK: %[[XLOLO:.+]] = arith.extsi %[[LOLO]]			// CHECK: %[[XLOLO:.+]] = arith.extsi %[[LOLO]]
	// CHECK: %[[XLOHI:.+]] = arith.extsi %[[LOHI]]			// CHECK: %[[XLOHI:.+]] = arith.extsi %[[LOHI]]
	// CHECK: %[[XHILO:.+]] = arith.extsi %[[HILO]]			// CHECK: %[[XHILO:.+]] = arith.extsi %[[HILO]]
	// CHECK: %[[XHIHI:.+]] = arith.extsi %[[HIHI]]			// CHECK: %[[XHIHI:.+]] = arith.extsi %[[HIHI]]

				// CHECK-NEXT: %[[D_X_EXT:.+]] = arith.extsi %[[D_X]]
				// CHECK-NEXT: %[[D_Y_EXT:.+]] = arith.extsi %[[D_Y]]
				// CHECK-NEXT: %[[Y_N_EXT:.+]] = arith.extsi %[[SCALE_Y_N]]
				// CHECK-NEXT: %[[X_N_EXT:.+]] = arith.extsi %[[SCALE_X_N]]
				awarzynski: Is this extra extension due to switching from `i32` to `i48`? Might be worth adding a comment.
				rsuderman: Added to CL description.

	// Compute the bilinear interpolation.			// Compute the bilinear interpolation.

	// CHECK: %[[NDX:.+]] = arith.subi %[[SCALE_X_N]], %[[D_X]]			// CHECK: %[[NDX:.+]] = arith.subi %[[X_N_EXT]], %[[D_X_EXT]]
	// CHECK: %[[WLOLO:.+]] = arith.muli %[[XLOLO]], %[[NDX]]			// CHECK: %[[WLOLO:.+]] = arith.muli %[[XLOLO]], %[[NDX]]
	// CHECK: %[[WLOHI:.+]] = arith.muli %[[XLOHI]], %[[D_X]]			// CHECK: %[[WLOHI:.+]] = arith.muli %[[XLOHI]], %[[D_X_EXT]]
	// CHECK: %[[LO:.+]] = arith.addi %[[WLOLO]], %[[WLOHI]]			// CHECK: %[[LO:.+]] = arith.addi %[[WLOLO]], %[[WLOHI]]
	// CHECK: %[[WHILO:.+]] = arith.muli %[[XHILO]], %[[NDX]]			// CHECK: %[[WHILO:.+]] = arith.muli %[[XHILO]], %[[NDX]]
	// CHECK: %[[WHIHI:.+]] = arith.muli %[[XHIHI]], %[[D_X]]			// CHECK: %[[WHIHI:.+]] = arith.muli %[[XHIHI]], %[[D_X_EXT]]
	// CHECK: %[[HI:.+]] = arith.addi %[[WHILO]], %[[WHIHI]]			// CHECK: %[[HI:.+]] = arith.addi %[[WHILO]], %[[WHIHI]]
	// CHECK: %[[NDY:.+]] = arith.subi %[[SCALE_Y_N]], %[[D_Y]]			// CHECK: %[[NDY:.+]] = arith.subi %[[Y_N_EXT]], %[[D_Y_EXT]]
	// CHECK: %[[WLO:.+]] = arith.muli %[[LO]], %[[NDY]]			// CHECK: %[[WLO:.+]] = arith.muli %[[LO]], %[[NDY]]
	// CHECK: %[[WHI:.+]] = arith.muli %[[HI]], %[[D_Y]]			// CHECK: %[[WHI:.+]] = arith.muli %[[HI]], %[[D_Y_EXT]]
	// CHECK: %[[RESULT:.+]] = arith.addi %[[WLO]], %[[WHI]]			// CHECK: %[[RESULT:.+]] = arith.addi %[[WLO]], %[[WHI]]
	// CHECK: linalg.yield %[[RESULT]]			// CHECK: linalg.yield %[[RESULT]]

	// Round to the nearest index.			// Round to the nearest index.
	%0 = "tosa.resize"(%arg0) {mode = "BILINEAR", scale = [16, 1, 16, 1], offset = [0, 0], border = [0, 0]} : (tensor<1x19x19x1xi8>) -> tensor<1x289x289x1xi32>			%0 = "tosa.resize"(%arg0) {mode = "BILINEAR", scale = [16, 1, 16, 1], offset = [0, 0], border = [0, 0]} : (tensor<1x19x20x1xi8>) -> tensor<1x304x320x1xi48>
	return			return
	}			}
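
The weighted sum checked in `@resize_bilinear_int`, including the extension of the remainders and scale numerators discussed in the inline comments, can be modeled as follows. This is a sketch, not the lowering itself: `bilinearIntModel` is a hypothetical name and `int64_t` stands in for the `i48` result type, which has no native C++ equivalent.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the integer bilinear interpolation.
// Every operand is widened before the arithmetic, mirroring the
// arith.extsi ops that precede the multiplies in the CHECK lines.
int64_t bilinearIntModel(int8_t lolo, int8_t lohi, int8_t hilo, int8_t hihi,
                         int32_t dx, int32_t dy,
                         int32_t scaleXN, int32_t scaleYN) {
  assert(dx >= 0 && dx < scaleXN && dy >= 0 && dy < scaleYN);
  int64_t xlolo = lolo, xlohi = lohi, xhilo = hilo, xhihi = hihi;
  int64_t dxExt = dx, dyExt = dy, xNExt = scaleXN, yNExt = scaleYN;

  int64_t ndx = xNExt - dxExt;                 // arith.subi
  int64_t lo = xlolo * ndx + xlohi * dxExt;    // muli, muli, addi
  int64_t hi = xhilo * ndx + xhihi * dxExt;    // muli, muli, addi
  int64_t ndy = yNExt - dyExt;                 // arith.subi
  return lo * ndy + hi * dyExt;                // muli, muli, addi
}
```

The result is not normalized: it stays scaled by `scaleXN * scaleYN`, so corners of 10, 20, 30 and 40 with `dx = 4`, `dy = 8` and both numerators equal to 16 give 5760, which is 22.5 times 256.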

	// -----			// -----

	// CHECK-LABEL: @resize_nearest_fp			// CHECK-LABEL: @resize_nearest_fp
	func.func @resize_nearest_fp(%input: tensor<1x50x48x1xf32>) -> () {			func.func @resize_nearest_fp(%input: tensor<1x50x48x1xf32>) -> () {
	// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x1600x1536x1xf32>			// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x1600x1536x1xf32>
	// CHECK: %[[GENERIC:.+]] = linalg.generic			// CHECK: %[[GENERIC:.+]] = linalg.generic
	// CHECK: %[[IDX0:.+]] = linalg.index 0			// CHECK: %[[IDX0:.+]] = linalg.index 0
	// CHECK: %[[IDX1:.+]] = linalg.index 1			// CHECK: %[[IDX1:.+]] = linalg.index 1
	// CHECK: %[[IDX2:.+]] = linalg.index 2			// CHECK: %[[IDX2:.+]] = linalg.index 2
	// CHECK: %[[IDX3:.+]] = linalg.index 3			// CHECK: %[[IDX3:.+]] = linalg.index 3
	// CHECK-DAG: %[[XYMIN:.*]] = arith.constant 0			// CHECK-DAG: %[[ZERO:.*]] = arith.constant 0
	// CHECK-DAG: %[[YMAX:.*]] = arith.constant 49			// CHECK-DAG: %[[YMAX:.*]] = arith.constant 49
	// CHECK-DAG: %[[XMAX:.*]] = arith.constant 47			// CHECK-DAG: %[[XMAX:.*]] = arith.constant 47
	// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX1]]			// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX1]]
	// CHECK: %[[X:.+]] = arith.index_cast %[[IDX2]]			// CHECK: %[[X:.+]] = arith.index_cast %[[IDX2]]
	// CHECK-DAG: %[[ISCALE_Y_N:.*]] = arith.constant 64			// CHECK-DAG: %[[ISCALE_Y_N:.*]] = arith.constant 64
	// CHECK-DAG: %[[ISCALE_Y_D:.*]] = arith.constant 2			// CHECK-DAG: %[[ISCALE_Y_D:.*]] = arith.constant 2
	// CHECK-DAG: %[[ISCALE_X_N:.*]] = arith.constant 64			// CHECK-DAG: %[[ISCALE_X_N:.*]] = arith.constant 64
	// CHECK-DAG: %[[ISCALE_X_D:.*]] = arith.constant 2			// CHECK-DAG: %[[ISCALE_X_D:.*]] = arith.constant 2
	// CHECK-DAG: %[[IOFFSET_Y:.*]] = arith.constant -31			// CHECK-DAG: %[[IOFFSET_Y:.*]] = arith.constant -31
	// CHECK-DAG: %[[IOFFSET_X:.*]] = arith.constant -31			// CHECK-DAG: %[[IOFFSET_X:.*]] = arith.constant -31
	// CHECK-DAG: %[[IBORDER_Y:.*]] = arith.constant 31			// CHECK-DAG: %[[IBORDER_Y:.*]] = arith.constant 31
	// CHECK-DAG: %[[IBORDER_X:.*]] = arith.constant 31			// CHECK-DAG: %[[IBORDER_X:.*]] = arith.constant 31

	// CHECK: %[[Y0:.+]] = arith.uitofp %[[Y]]			// CHECK: %[[Y0:.+]] = arith.uitofp %[[Y]]
	// CHECK: %[[X0:.+]] = arith.uitofp %[[X]]
	// CHECK: %[[SCALE_Y_N:.*]] = arith.uitofp %[[ISCALE_Y_N]]			// CHECK: %[[SCALE_Y_N:.*]] = arith.uitofp %[[ISCALE_Y_N]]
	// CHECK: %[[SCALE_Y_D:.*]] = arith.uitofp %[[ISCALE_Y_D]]			// CHECK: %[[SCALE_Y_D:.*]] = arith.uitofp %[[ISCALE_Y_D]]
				// CHECK: %[[OFFSET_Y:.*]] = arith.uitofp %[[IOFFSET_Y]]
				// CHECK: %[[VAL_29:.*]] = arith.mulf %[[Y0]], %[[SCALE_Y_D]]
				// CHECK: %[[VAL_31:.*]] = arith.addf %[[VAL_29]], %[[OFFSET_Y]]
				// CHECK: %[[VAL_33:.*]] = arith.divf %[[VAL_31]], %[[SCALE_Y_N]]
				// CHECK: %[[VAL_35:.*]] = math.floor %[[VAL_33]]
				// CHECK: %[[D_Y:.*]] = arith.subf %[[VAL_33]], %[[VAL_35]]
				// CHECK: %[[VAL_39:.*]] = arith.fptosi %[[VAL_35]]

				// CHECK: %[[X0:.+]] = arith.uitofp %[[X]]
	// CHECK: %[[SCALE_X_N:.*]] = arith.uitofp %[[ISCALE_X_N]]			// CHECK: %[[SCALE_X_N:.*]] = arith.uitofp %[[ISCALE_X_N]]
	// CHECK: %[[SCALE_X_D:.*]] = arith.uitofp %[[ISCALE_X_D]]			// CHECK: %[[SCALE_X_D:.*]] = arith.uitofp %[[ISCALE_X_D]]
	// CHECK: %[[OFFSET_Y:.*]] = arith.uitofp %[[IOFFSET_Y]]
	// CHECK: %[[OFFSET_X:.*]] = arith.uitofp %[[IOFFSET_X]]			// CHECK: %[[OFFSET_X:.*]] = arith.uitofp %[[IOFFSET_X]]

	// CHECK: %[[VAL_29:.*]] = arith.mulf %[[Y0]], %[[SCALE_Y_D]]
	// CHECK: %[[VAL_30:.*]] = arith.mulf %[[X0]], %[[SCALE_X_D]]			// CHECK: %[[VAL_30:.*]] = arith.mulf %[[X0]], %[[SCALE_X_D]]
	// CHECK: %[[VAL_31:.*]] = arith.addf %[[VAL_29]], %[[OFFSET_Y]]
	// CHECK: %[[VAL_32:.*]] = arith.addf %[[VAL_30]], %[[OFFSET_X]]			// CHECK: %[[VAL_32:.*]] = arith.addf %[[VAL_30]], %[[OFFSET_X]]
	// CHECK: %[[VAL_33:.*]] = arith.divf %[[VAL_31]], %[[SCALE_Y_N]]
	// CHECK: %[[VAL_34:.*]] = arith.divf %[[VAL_32]], %[[SCALE_X_N]]			// CHECK: %[[VAL_34:.*]] = arith.divf %[[VAL_32]], %[[SCALE_X_N]]

	// Find the remainder and integer component of the target index.

	// CHECK: %[[VAL_35:.*]] = math.floor %[[VAL_33]]
	// CHECK: %[[VAL_36:.*]] = math.floor %[[VAL_34]]			// CHECK: %[[VAL_36:.*]] = math.floor %[[VAL_34]]
	// CHECK: %[[D_Y:.*]] = arith.subf %[[VAL_33]], %[[VAL_35]]
	// CHECK: %[[D_X:.*]] = arith.subf %[[VAL_34]], %[[VAL_36]]			// CHECK: %[[D_X:.*]] = arith.subf %[[VAL_34]], %[[VAL_36]]
	// CHECK: %[[VAL_39:.*]] = arith.fptosi %[[VAL_35]]
	// CHECK: %[[VAL_40:.*]] = arith.fptosi %[[VAL_36]]			// CHECK: %[[VAL_40:.*]] = arith.fptosi %[[VAL_36]]

	// CHECK-DAG: %[[ZERO:.*]] = arith.constant 0
	// CHECK-DAG: %[[ONE:.*]] = arith.constant 1			// CHECK-DAG: %[[ONE:.*]] = arith.constant 1
	// CHECK-DAG: %[[HALF:.*]] = arith.constant 5.000000e-01			// CHECK-DAG: %[[HALF:.*]] = arith.constant 5.000000e-01
	// CHECK: %[[PRED_Y:.*]] = arith.cmpf oge, %[[D_Y]], %[[HALF]]			// CHECK: %[[PRED_Y:.*]] = arith.cmpf oge, %[[D_Y]], %[[HALF]]
	// CHECK: %[[PRED_X:.*]] = arith.cmpf oge, %[[D_X]], %[[HALF]]
	// CHECK: %[[ROUND_Y:.*]] = arith.select %[[PRED_Y]], %[[ONE]], %[[ZERO]]			// CHECK: %[[ROUND_Y:.*]] = arith.select %[[PRED_Y]], %[[ONE]], %[[ZERO]]
	// CHECK: %[[ROUND_X:.*]] = arith.select %[[PRED_X]], %[[ONE]], %[[ZERO]]
	// CHECK: %[[VAL_48:.*]] = arith.addi %[[VAL_39]], %[[ROUND_Y]]			// CHECK: %[[VAL_48:.*]] = arith.addi %[[VAL_39]], %[[ROUND_Y]]
	// CHECK: %[[VAL_49:.*]] = arith.addi %[[VAL_40]], %[[ROUND_X]]			// CHECK: %[[VAL_50:.*]] = arith.cmpi slt, %[[VAL_48]], %[[ZERO]]
				// CHECK: %[[VAL_51:.*]] = arith.select %[[VAL_50]], %[[ZERO]], %[[VAL_48]]
	// CHECK: %[[VAL_50:.*]] = arith.cmpi slt, %[[VAL_48]], %[[XYMIN]]
	// CHECK: %[[VAL_51:.*]] = arith.select %[[VAL_50]], %[[XYMIN]], %[[VAL_48]]
	// CHECK: %[[VAL_52:.*]] = arith.cmpi slt, %[[YMAX]], %[[VAL_48]]			// CHECK: %[[VAL_52:.*]] = arith.cmpi slt, %[[YMAX]], %[[VAL_48]]
	// CHECK: %[[VAL_53:.*]] = arith.select %[[VAL_52]], %[[YMAX]], %[[VAL_51]]			// CHECK: %[[VAL_53:.*]] = arith.select %[[VAL_52]], %[[YMAX]], %[[VAL_51]]
	// CHECK: %[[VAL_54:.*]] = arith.cmpi slt, %[[VAL_49]], %[[XYMIN]]			// CHECK: %[[IDY:.*]] = arith.index_cast %[[VAL_53]]
	// CHECK: %[[VAL_55:.*]] = arith.select %[[VAL_54]], %[[XYMIN]], %[[VAL_49]]
				// CHECK-DAG: %[[HALF:.*]] = arith.constant 5.000000e-01
				// CHECK: %[[PRED_X:.*]] = arith.cmpf oge, %[[D_X]], %[[HALF]]
				// CHECK: %[[ROUND_X:.*]] = arith.select %[[PRED_X]], %[[ONE]], %[[ZERO]]
				// CHECK: %[[VAL_49:.*]] = arith.addi %[[VAL_40]], %[[ROUND_X]]
				// CHECK: %[[VAL_54:.*]] = arith.cmpi slt, %[[VAL_49]], %[[ZERO]]
				// CHECK: %[[VAL_55:.*]] = arith.select %[[VAL_54]], %[[ZERO]], %[[VAL_49]]
	// CHECK: %[[VAL_56:.*]] = arith.cmpi slt, %[[XMAX]], %[[VAL_49]]			// CHECK: %[[VAL_56:.*]] = arith.cmpi slt, %[[XMAX]], %[[VAL_49]]
	// CHECK: %[[VAL_57:.*]] = arith.select %[[VAL_56]], %[[XMAX]], %[[VAL_55]]			// CHECK: %[[VAL_57:.*]] = arith.select %[[VAL_56]], %[[XMAX]], %[[VAL_55]]

	// CHECK: %[[IDY:.*]] = arith.index_cast %[[VAL_53]]
	// CHECK: %[[IDX:.*]] = arith.index_cast %[[VAL_57]]			// CHECK: %[[IDX:.*]] = arith.index_cast %[[VAL_57]]

	// CHECK: %[[EXTRACT:.+]] = tensor.extract %arg0[%[[IDX0]], %[[IDY]], %[[IDX]], %[[IDX3]]]			// CHECK: %[[EXTRACT:.+]] = tensor.extract %arg0[%[[IDX0]], %[[IDY]], %[[IDX]], %[[IDX3]]]
	// CHECK: linalg.yield %[[EXTRACT]]			// CHECK: linalg.yield %[[EXTRACT]]

	%output = "tosa.resize"(%input) {mode = "NEAREST_NEIGHBOR", scale = [64, 2, 64, 2], offset = [-31, -31], border = [31, 31]} : (tensor<1x50x48x1xf32>) -> tensor<1x1600x1536x1xf32>			%output = "tosa.resize"(%input) {mode = "NEAREST_NEIGHBOR", scale = [64, 2, 64, 2], offset = [-31, -31], border = [31, 31]} : (tensor<1x50x48x1xf32>) -> tensor<1x1600x1536x1xf32>

	return			return
	}			}
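
The floating-point variant checked in `@resize_nearest_fp` follows the same shape, with `math.floor` supplying the integer part. The sketch below is an assumption-laden model: `nearestIndexFpModel` is a hypothetical name, and the integer-to-float conversions are written as plain `static_cast`s, whereas the CHECK lines use `arith.uitofp`; the two only agree for non-negative inputs, so the usage below sticks to a zero offset.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar model of one dimension of the floating-point
// nearest-neighbor index computation.
int32_t nearestIndexFpModel(int32_t out, int32_t scaleN, int32_t scaleD,
                            int32_t offset, int32_t maxIndex) {
  assert(scaleN > 0 && "the model assumes a positive scale numerator");
  float y0 = static_cast<float>(out);
  float n = static_cast<float>(scaleN);
  float d = static_cast<float>(scaleD);
  float off = static_cast<float>(offset);

  float pos = (y0 * d + off) / n;              // arith.mulf, addf, divf
  float fl = std::floor(pos);                  // math.floor
  float delta = pos - fl;                      // arith.subf
  int32_t i = static_cast<int32_t>(fl);        // arith.fptosi
  int32_t round = (delta >= 0.5f) ? 1 : 0;     // arith.cmpf oge, select
  int32_t idx = i + round;                     // arith.addi
  int32_t lower = (idx < 0) ? 0 : idx;         // clamp against zero
  return (maxIndex < idx) ? maxIndex : lower;  // clamp against the max index
}
```

For a scale of 4/1 and a maximum index of 22, output position 2 lands exactly on 0.5 and rounds up to 1, while output position 91 rounds to 23 and is clamped back to 22.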

	// -----			// -----

	// CHECK-LABEL: @resize_bilinear_fp			// CHECK-LABEL: @resize_bilinear_fp
	func.func @resize_bilinear_fp(%input: tensor<1x23x23x1xf32>) -> () {			func.func @resize_bilinear_fp(%input: tensor<1x23x24x1xf32>) -> () {
	// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x89x89x1xf32>			// CHECK: %[[INIT:.+]] = tensor.empty() : tensor<1x92x96x1xf32>
	// CHECK: %[[GENERIC:.+]] = linalg.generic			// CHECK: %[[GENERIC:.+]] = linalg.generic
	// CHECK: %[[IDX_0:.+]] = linalg.index 0			// CHECK: %[[IDX_0:.+]] = linalg.index 0
	// CHECK: %[[IDX_1:.+]] = linalg.index 1			// CHECK: %[[IDX_1:.+]] = linalg.index 1
	// CHECK: %[[IDX_2:.+]] = linalg.index 2			// CHECK: %[[IDX_2:.+]] = linalg.index 2
	// CHECK: %[[IDX_3:.+]] = linalg.index 3			// CHECK: %[[IDX_3:.+]] = linalg.index 3
	// CHECK-DAG: %[[XY_MIN:.*]] = arith.constant 0			// CHECK-DAG: %[[ZERO:.*]] = arith.constant 0
	// CHECK-DAG: %[[Y_MAX:.*]] = arith.constant 22			// CHECK-DAG: %[[Y_MAX:.*]] = arith.constant 22
	// CHECK-DAG: %[[X_MAX:.*]] = arith.constant 22			// CHECK-DAG: %[[X_MAX:.*]] = arith.constant 23
				jpienaar: Why did this change?
				rsuderman: The input test size changed from 23x23 to 23x24. This was to guarantee we actually separate checks for width / height.
	// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX_1]]			// CHECK: %[[Y:.+]] = arith.index_cast %[[IDX_1]]
	// CHECK: %[[X:.+]] = arith.index_cast %[[IDX_2]]			// CHECK: %[[X:.+]] = arith.index_cast %[[IDX_2]]
	// CHECK-DAG: %[[ISCALE_Y_N:.*]] = arith.constant 4			// CHECK-DAG: %[[ISCALE_Y_N:.*]] = arith.constant 4
	// CHECK-DAG: %[[ISCALE_Y_D:.*]] = arith.constant 1			// CHECK-DAG: %[[ISCALE_Y_D:.*]] = arith.constant 1
	// CHECK-DAG: %[[ISCALE_X_N:.*]] = arith.constant 4			// CHECK-DAG: %[[ISCALE_X_N:.*]] = arith.constant 4
	// CHECK-DAG: %[[ISCALE_X_D:.*]] = arith.constant 1			// CHECK-DAG: %[[ISCALE_X_D:.*]] = arith.constant 1
	// CHECK-DAG: %[[IOFFSET_Y:.*]] = arith.constant 0			// CHECK-DAG: %[[IOFFSET_Y:.*]] = arith.constant 0
	// CHECK-DAG: %[[IOFFSET_X:.*]] = arith.constant 0			// CHECK-DAG: %[[IOFFSET_X:.*]] = arith.constant 0
	// CHECK-DAG: %[[IBORDER_Y:.*]] = arith.constant 0			// CHECK-DAG: %[[IBORDER_Y:.*]] = arith.constant 0
	// CHECK-DAG: %[[IBORDER_X:.*]] = arith.constant 0			// CHECK-DAG: %[[IBORDER_X:.*]] = arith.constant 0

	// CHECK: %[[Y0:.+]] = arith.uitofp %[[Y]]			// CHECK: %[[Y0:.+]] = arith.uitofp %[[Y]]
	// CHECK: %[[X0:.+]] = arith.uitofp %[[X]]
	// CHECK: %[[SCALE_Y_N:.*]] = arith.uitofp %[[ISCALE_Y_N]]			// CHECK: %[[SCALE_Y_N:.*]] = arith.uitofp %[[ISCALE_Y_N]]
	// CHECK: %[[SCALE_Y_D:.*]] = arith.uitofp %[[ISCALE_Y_D]]			// CHECK: %[[SCALE_Y_D:.*]] = arith.uitofp %[[ISCALE_Y_D]]
				// CHECK: %[[OFFSET_Y:.*]] = arith.uitofp %[[IOFFSET_Y]]
				// CHECK: %[[VAL_29:.*]] = arith.mulf %[[Y0]], %[[SCALE_Y_D]]
				// CHECK: %[[VAL_31:.*]] = arith.addf %[[VAL_29]], %[[OFFSET_Y]]
				// CHECK: %[[VAL_33:.*]] = arith.divf %[[VAL_31]], %[[SCALE_Y_N]]
				// CHECK: %[[VAL_35:.*]] = math.floor %[[VAL_33]]
				// CHECK: %[[D_Y:.*]] = arith.subf %[[VAL_33]], %[[VAL_35]]
				// CHECK: %[[I_Y:.*]] = arith.fptosi %[[VAL_35]]

				// CHECK: %[[X0:.+]] = arith.uitofp %[[X]]
	// CHECK: %[[SCALE_X_N:.*]] = arith.uitofp %[[ISCALE_X_N]]			// CHECK: %[[SCALE_X_N:.*]] = arith.uitofp %[[ISCALE_X_N]]
	// CHECK: %[[SCALE_X_D:.*]] = arith.uitofp %[[ISCALE_X_D]]			// CHECK: %[[SCALE_X_D:.*]] = arith.uitofp %[[ISCALE_X_D]]
	// CHECK: %[[OFFSET_Y:.*]] = arith.uitofp %[[IOFFSET_Y]]
	// CHECK: %[[OFFSET_X:.*]] = arith.uitofp %[[IOFFSET_X]]			// CHECK: %[[OFFSET_X:.*]] = arith.uitofp %[[IOFFSET_X]]

	// CHECK: %[[VAL_29:.*]] = arith.mulf %[[Y0]], %[[SCALE_Y_D]]
	// CHECK: %[[VAL_30:.*]] = arith.mulf %[[X0]], %[[SCALE_X_D]]			// CHECK: %[[VAL_30:.*]] = arith.mulf %[[X0]], %[[SCALE_X_D]]
	// CHECK: %[[VAL_31:.*]] = arith.addf %[[VAL_29]], %[[OFFSET_Y]]
	// CHECK: %[[VAL_32:.*]] = arith.addf %[[VAL_30]], %[[OFFSET_X]]			// CHECK: %[[VAL_32:.*]] = arith.addf %[[VAL_30]], %[[OFFSET_X]]
	// CHECK: %[[VAL_33:.*]] = arith.divf %[[VAL_31]], %[[SCALE_Y_N]]
	// CHECK: %[[VAL_34:.*]] = arith.divf %[[VAL_32]], %[[SCALE_X_N]]			// CHECK: %[[VAL_34:.*]] = arith.divf %[[VAL_32]], %[[SCALE_X_N]]

	// CHECK: %[[VAL_35:.*]] = math.floor %[[VAL_33]]
	// CHECK: %[[VAL_36:.*]] = math.floor %[[VAL_34]]			// CHECK: %[[VAL_36:.*]] = math.floor %[[VAL_34]]
	// CHECK: %[[D_Y:.*]] = arith.subf %[[VAL_33]], %[[VAL_35]]
	// CHECK: %[[D_X:.*]] = arith.subf %[[VAL_34]], %[[VAL_36]]			// CHECK: %[[D_X:.*]] = arith.subf %[[VAL_34]], %[[VAL_36]]
	// CHECK: %[[I_Y:.*]] = arith.fptosi %[[VAL_35]]
	// CHECK: %[[I_X:.*]] = arith.fptosi %[[VAL_36]]			// CHECK: %[[I_X:.*]] = arith.fptosi %[[VAL_36]]

	// Compute the left, right, and top indices for the bilinear interpolation.			// Compute the left, right, and top indices for the bilinear interpolation.

	// CHECK-DAG: %[[ONE:.*]] = arith.constant 1			// CHECK: %[[ONE:.*]] = arith.constant 1
	// CHECK: %[[Y1:.*]] = arith.addi %[[I_Y]], %[[ONE]]
	// CHECK: %[[X1:.*]] = arith.addi %[[I_X]], %[[ONE]]

	// Bound check each dimension.			// Bound check each dimension.

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_Y]], %[[XY_MIN]]			// CHECK: %[[Y1:.*]] = arith.addi %[[I_Y]], %[[ONE]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[I_Y]]
				// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_Y]], %[[ZERO]]
				// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[I_Y]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[I_Y]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[I_Y]]
	// CHECK: %[[YLO:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]			// CHECK: %[[YLO:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y1]], %[[XY_MIN]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y1]], %[[ZERO]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[Y1]]			// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[Y1]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[Y1]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[Y_MAX]], %[[Y1]]
	// CHECK: %[[YHI:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]			// CHECK: %[[YHI:.*]] = arith.select %[[PRED]], %[[Y_MAX]], %[[BOUND]]
				// CHECK: %[[YLOI:.+]] = arith.index_cast %[[YLO]]
				// CHECK: %[[YHII:.+]] = arith.index_cast %[[YHI]]

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_X]], %[[XY_MIN]]			// CHECK: %[[X1:.*]] = arith.addi %[[I_X]], %[[ONE]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[I_X]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[I_X]], %[[ZERO]]
				// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[I_X]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[I_X]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[I_X]]
	// CHECK: %[[XLO:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]			// CHECK: %[[XLO:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]

	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X1]], %[[XY_MIN]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X1]], %[[ZERO]]
	// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[XY_MIN]], %[[X1]]			// CHECK: %[[BOUND:.*]] = arith.select %[[PRED]], %[[ZERO]], %[[X1]]
	// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[X1]]			// CHECK: %[[PRED:.*]] = arith.cmpi slt, %[[X_MAX]], %[[X1]]
	// CHECK: %[[XHI:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]			// CHECK: %[[XHI:.*]] = arith.select %[[PRED]], %[[X_MAX]], %[[BOUND]]

	// CHECK: %[[YLOI:.+]] = arith.index_cast %[[YLO]]
	// CHECK: %[[YHII:.+]] = arith.index_cast %[[YHI]]
	// CHECK: %[[XLOI:.+]] = arith.index_cast %[[XLO]]			// CHECK: %[[XLOI:.+]] = arith.index_cast %[[XLO]]
	// CHECK: %[[XHII:.+]] = arith.index_cast %[[XHI]]			// CHECK: %[[XHII:.+]] = arith.index_cast %[[XHI]]

	// CHECK: %[[LOLO:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YLOI]], %[[XLOI]], %[[IDX_3]]]			// CHECK: %[[LOLO:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YLOI]], %[[XLOI]], %[[IDX_3]]]
	// CHECK: %[[LOHI:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YLOI]], %[[XHII]], %[[IDX_3]]]			// CHECK: %[[LOHI:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YLOI]], %[[XHII]], %[[IDX_3]]]
	// CHECK: %[[HILO:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YHII]], %[[XLOI]], %[[IDX_3]]]			// CHECK: %[[HILO:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YHII]], %[[XLOI]], %[[IDX_3]]]
	// CHECK: %[[HIHI:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YHII]], %[[XHII]], %[[IDX_3]]]			// CHECK: %[[HIHI:.+]] = tensor.extract %arg0[%[[IDX_0]], %[[YHII]], %[[XHII]], %[[IDX_3]]]

	// CHECK-DAG: %[[ONE:.+]] = arith.constant 1.000000e+00 : f32			// CHECK-DAG: %[[ONE:.+]] = arith.constant 1.000000e+00 : f32
	// CHECK: %[[NDX:.+]] = arith.subf %[[ONE]], %[[D_X]]			// CHECK: %[[NDX:.+]] = arith.subf %[[ONE]], %[[D_X]]
	// CHECK: %[[WLOLO:.+]] = arith.mulf %[[LOLO]], %[[NDX]]			// CHECK: %[[WLOLO:.+]] = arith.mulf %[[LOLO]], %[[NDX]]
	// CHECK: %[[WLOHI:.+]] = arith.mulf %[[LOHI]], %[[D_X]]			// CHECK: %[[WLOHI:.+]] = arith.mulf %[[LOHI]], %[[D_X]]
	// CHECK: %[[LO:.+]] = arith.addf %[[WLOLO]], %[[WLOHI]]			// CHECK: %[[LO:.+]] = arith.addf %[[WLOLO]], %[[WLOHI]]
				// CHECK: %[[NDX:.+]] = arith.subf %[[ONE]], %[[D_X]]
	// CHECK: %[[WHILO:.+]] = arith.mulf %[[HILO]], %[[NDX]]			// CHECK: %[[WHILO:.+]] = arith.mulf %[[HILO]], %[[NDX]]
	// CHECK: %[[WHIHI:.+]] = arith.mulf %[[HIHI]], %[[D_X]]			// CHECK: %[[WHIHI:.+]] = arith.mulf %[[HIHI]], %[[D_X]]
	// CHECK: %[[HI:.+]] = arith.addf %[[WHILO]], %[[WHIHI]]			// CHECK: %[[HI:.+]] = arith.addf %[[WHILO]], %[[WHIHI]]
	// CHECK: %[[NDY:.+]] = arith.subf %[[ONE]], %[[D_Y]]			// CHECK: %[[NDY:.+]] = arith.subf %[[ONE]], %[[D_Y]]
	// CHECK: %[[WLO:.+]] = arith.mulf %[[LO]], %[[NDY]]			// CHECK: %[[WLO:.+]] = arith.mulf %[[LO]], %[[NDY]]
	// CHECK: %[[WHI:.+]] = arith.mulf %[[HI]], %[[D_Y]]			// CHECK: %[[WHI:.+]] = arith.mulf %[[HI]], %[[D_Y]]
	// CHECK: %[[RESULT:.+]] = arith.addf %[[WLO]], %[[WHI]]			// CHECK: %[[RESULT:.+]] = arith.addf %[[WLO]], %[[WHI]]
	// CHECK: linalg.yield %[[RESULT]]			// CHECK: linalg.yield %[[RESULT]]

	// Round by bilinear interpolation			// Round by bilinear interpolation
	%output = "tosa.resize"(%input) {mode = "BILINEAR", scale = [4, 1, 4, 1], offset = [0, 0], border = [0, 0]} : (tensor<1x23x23x1xf32>) -> tensor<1x89x89x1xf32>			%output = "tosa.resize"(%input) {mode = "BILINEAR", scale = [4, 1, 4, 1], offset = [0, 0], border = [0, 0]} : (tensor<1x23x24x1xf32>) -> tensor<1x92x96x1xf32>

	return			return
	}			}

	// -----			// -----

	// CHECK-LABEL: @resize_dyn			// CHECK-LABEL: @resize_dyn
	// CHECK-SAME: (%[[ARG0:[0-9a-zA-Z_]*]]:			// CHECK-SAME: (%[[ARG0:[0-9a-zA-Z_]*]]:
	func.func @resize_dyn(%input: tensor<?x2x2x1xi8>) -> () {			func.func @resize_dyn(%input: tensor<?x2x2x1xi8>) -> () {
	// CHECK-DAG: %[[C0:.+]] = arith.constant 0			// CHECK-DAG: %[[C0:.+]] = arith.constant 0
	// CHECK: %[[BATCH:.+]] = tensor.dim %arg0, %[[C0]]			// CHECK: %[[BATCH:.+]] = tensor.dim %arg0, %[[C0]]
	// CHECK: %[[INIT:.+]] = tensor.empty(%[[BATCH]]) : tensor<?x4x4x1xi32>			// CHECK: %[[INIT:.+]] = tensor.empty(%[[BATCH]]) : tensor<?x4x4x1xi32>
	// CHECK: %[[GENERIC:.+]] = linalg.generic			// CHECK: %[[GENERIC:.+]] = linalg.generic
	%output = "tosa.resize"(%input) { scale = [4, 2, 4, 2], offset = [-1, -1], border = [1, 1], mode = "BILINEAR" } : (tensor<?x2x2x1xi8>) -> (tensor<?x4x4x1xi32>)			%output = "tosa.resize"(%input) { scale = [4, 2, 4, 2], offset = [-1, -1], border = [1, 1], mode = "BILINEAR" } : (tensor<?x2x2x1xi8>) -> (tensor<?x4x4x1xi32>)
	return			return
	}			}

	// -----			// -----

	// CHECK-LABEL: @resize_bilinear_int48			// CHECK-LABEL: @resize_bilinear_int48
	func.func @resize_bilinear_int48(%arg0: tensor<1x19x19x1xi16>) {			func.func @resize_bilinear_int48(%arg0: tensor<1x19x19x1xi16>) {
	%0 = "tosa.resize"(%arg0) {mode = "BILINEAR", scale = [16, 1, 16, 1], offset = [0, 0], border = [0, 0]} : (tensor<1x19x19x1xi16>) -> tensor<1x289x289x1xi48>			%0 = "tosa.resize"(%arg0) {mode = "BILINEAR", scale = [16, 1, 16, 1], offset = [0, 0], border = [0, 0]} : (tensor<1x19x19x1xi16>) -> tensor<1x289x289x1xi48>
	return			return
	}			}
awarzynski (inline comment):
This example and the examples above seem to interpret scale as:
    [scale_x_n, scale_x_d, scale_y_n, scale_y_d]
rather than:
    [scale_y_n, scale_y_d, scale_x_n, scale_x_d]
In particular, (using formula from the spec):
    OH = ((13-1) * 11 / 7 ) + 1 = 19.857
But OH should be 179. However, if I use `scale_x_n` and `scale_x_d` instead:
    OH = ((13-1) * 89 / 6 ) + 1 = 179
Could you update this and the other examples?

rsuderman (author, inline comment):
Changes will appear in follow up.
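
The output-size arithmetic in the comment above can be checked with a small standalone helper. This is only a sketch: `resizeOutputDim` is a hypothetical name, not an MLIR or TOSA API, and the `offset` and `border` terms are an assumption based on the TOSA specification's resize formula (the comment itself only uses the case where both are zero). The helper returns no value when the division is not exact, which is how the mismatched 11/7 scale shows up.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of MLIR): computes one spatial output
// dimension of a resize as
//   out = ((in - 1) * scale_n - offset + border) / scale_d + 1
// The offset/border terms are assumed from the TOSA spec; with both set to
// zero this reduces to the formula quoted in the review comment.
// Returns std::nullopt when the division is not exact.
std::optional<int64_t> resizeOutputDim(int64_t inDim, int64_t scaleN,
                                       int64_t scaleD, int64_t offset = 0,
                                       int64_t border = 0) {
  assert(scaleD > 0 && "scale denominator must be positive");
  const int64_t numerator = (inDim - 1) * scaleN - offset + border;
  if (numerator % scaleD != 0)
    return std::nullopt; // No integral output size for this scale.
  return numerator / scaleD + 1;
}
```

With the numbers from the comment, `resizeOutputDim(13, 89, 6)` yields 179, while `resizeOutputDim(13, 11, 7)` yields no value because 132 is not divisible by 7. The shapes in the tests shown above follow the same rule: 23 with scale 4/1 gives 89, 19 with scale 16/1 gives 289, and 2 with scale 4/2, offset -1 and border 1 gives 4.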

Revision Contents

Diff 482274
