This is an archive of the discontinued LLVM Phabricator instance.

Differential D116069

[mlir][vector] Allow values outside of [0; dim-size] in create_mask
ClosedPublic

Authored by sgrechanik on Dec 20 2021, 5:59 PM.

Details

Reviewers

dcaballe
nicolasvasilache
aartbik
andydavis1

Commits

rG5abf11632245: [mlir][vector] Allow values outside of [0; dim-size] in create_mask

Summary

This commit explicitly states that negative values and values exceeding
vector dimensions are allowed in vector.create_mask (but not in
vector.constant_mask). These values are now truncated when
canonicalizing vector.create_mask to vector.constant_mask.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sgrechanik created this revision. Dec 20 2021, 5:59 PM

Herald added subscribers: sdasgup3, wenzhicui, wrengr and 18 others. Dec 20 2021, 5:59 PM

sgrechanik requested review of this revision. Dec 20 2021, 5:59 PM

Herald added a subscriber: stephenneuendorffer. Dec 20 2021, 5:59 PM

Harbormaster completed remote builds in B140168: Diff 395561. Dec 20 2021, 6:24 PM

Kindly ping

Thanks for the contribution, Sergei! I think I don't have enough experience with this op so I'll leave this to @nicolasvasilache.
How do you end up generating values that are out of the expected bounds of the mask?
Folding invalid values into valid ones could be surprising and maybe lead to silent bugs (?) but maybe it makes sense for this op. An alternative would be generating code in your use case to handle these out-of-bounds values and truncate them accordingly before they are passed to the create_mask op. It would be useful if you could share a bit more about your use case.

Thanks,
Diego

@aartbik is actually the original contributor of this abstraction and the main user at this time.
Offhand it would seem to me that we wouldn't want negative values here?
I would personally rather go for an explicit truncation, but @aartbik will know better.

Values that are greater than the vector size are already used by the vectorizer when vectorizing reductions and creating a mask. In this example the mask filters out garbage elements (with index >= 400) and is based on the value %elts_left, which is often greater than 64:

func @vecdim_reduction_masked(%arg0: memref<?xf32>, %arg1: memref<f32>) {
  %cst = arith.constant 0.000000e+00 : f32
  %cst_0 = arith.constant dense<0.000000e+00> : vector<64xf32>
  %0 = affine.for %arg2 = 0 to 400 step 64 iter_args(%arg3 = %cst_0) -> (vector<64xf32>) {
    %elts_left = affine.apply affine_map<(d0) -> (400 - d0)>(%arg2)
    %3 = vector.create_mask %elts_left : vector<64xi1>
    %4 = vector.transfer_read %arg0[%arg2], %cst : memref<?xf32>, vector<64xf32>
    %5 = arith.addf %arg3, %4 : vector<64xf32>
    %6 = select %3, %5, %arg3 : vector<64xi1>, vector<64xf32>
    affine.yield %6 : vector<64xf32>
  }
  %1 = vector.reduction "add", %0 : vector<64xf32> into f32
  affine.store %1, %arg1[] : memref<f32>
  return
}

(It then fails when peeling and unrolling multiple loop iterations, in which case create_mask ops become constant_mask ops, which statically check that the value is within bounds.)

Negative values are rarer. In our use case we sometimes add empty loop iterations to avoid remainder loops when performing loop unrolling. In the code above this would change the loop bound:

...
// The upper bound is changed from 400 to 512, adding an empty iteration:
%0 = affine.for %arg2 = 0 to 512 step 64 iter_args(%arg3 = %cst_0) -> (vector<64xf32>) {
  // But we still use the constant 400 here to make sure that the last iteration is really empty:
  %elts_left = affine.apply affine_map<(d0) -> (400 - d0)>(%arg2)
   ...

When %arg2 equals 448, the value of %elts_left becomes negative (400 - 448 = -48), but the intention here is the same: to filter out elements with index >= 400. The mask must therefore be all zeros in this case.

In my opinion, both cases are natural continuations of create_mask's semantics. We might want to keep the checks on the constant_mask op though, to have better protection against mistakes.
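To make the two cases above concrete, here is a minimal C++ sketch of how many lanes each loop iteration would enable, assuming the clamping semantics proposed in this revision. The helper names `activeLanes` and `lanesPerIteration` are hypothetical and not part of MLIR; the constants 400 and 64 are taken from the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of leading lanes set to 1 by a 1-D
// vector.create_mask, assuming the operand is clamped to [0, dimSize].
int64_t activeLanes(int64_t operand, int64_t dimSize) {
  return std::max<int64_t>(0, std::min(operand, dimSize));
}

// Lane counts for each iteration of
//   affine.for %arg2 = 0 to upperBound step 64
// with %elts_left = 400 - %arg2 and a vector<64xi1> mask.
std::vector<int64_t> lanesPerIteration(int64_t upperBound) {
  std::vector<int64_t> lanes;
  for (int64_t iv = 0; iv < upperBound; iv += 64)
    lanes.push_back(activeLanes(400 - iv, 64));
  return lanes;
}
```

With an upper bound of 400, the first six iterations request between 400 and 80 elements and all saturate at 64 lanes, and the last one enables 16. With an upper bound of 512, one more iteration is added whose operand is -48, which this model maps to zero lanes.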

Thanks, Sergei! Much clearer now. I think your proposal makes more sense to me now. The problem is very specific to the way you are generating the code but it's clearly exposing some corner cases of the create_mask operation. I'll leave it to Aart, though, since he has much more context on these ops (I'm not even sure I understand why we have the constant and non-constant variant of this op so I can't fully understand the implications of this change).

My suggestion here, though, is that you shouldn't be limited by the create_mask op. You could always create masks by using logical operations. For example, you could compute something like (assuming VF=4): ([iv, iv, iv, iv] + [0, 1, 2, 3]) < [ub, ub, ub, ub]. I'm currently working on a proposal for masking and I'm not always able to use create_mask to generate all the masks needed.

Hopefully that helps!

Thanks,
Diego
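The compare-based alternative suggested above can be sketched in scalar C++ as follows. This is only an emulation of the vector expression ([iv, ...] + [0, 1, 2, ...]) < [ub, ...]; the function name `compareMask` and its parameters are hypothetical, and a signed comparison is assumed.

```cpp
#include <cstdint>
#include <vector>

// Scalar emulation of a mask built from logical operations:
// lane i is active iff (iv + i) < ub, using signed comparison.
std::vector<bool> compareMask(int64_t iv, int64_t ub, int64_t vf) {
  std::vector<bool> mask;
  for (int64_t lane = 0; lane < vf; ++lane)
    mask.push_back(iv + lane < ub);
  return mask;
}
```

For VF=4 and ub=400, iv=398 enables only the first two lanes, and any iv at or past 400 yields an all-false mask without a separate clamping step.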

The issue here is that the index type does not have a sign, so all mask values are interpreted as >= 0. In the original implementation, the constant

%0 = arith.constant -2 : index

would really give a mask value of

18446744073709551614

This change gives an interpretation to the mask index.

I looked at some of the actual implementations, and calling

func @create_vector_mask_dyn(%c : index) -> vector<10xi1> {
   %0 = vector.create_mask %c : vector<10xi1>
   return %0 : vector<10xi1>
 }

with %c between -4 and 11 gives:

-4 == 18446744073709551612 ( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
-3 == 18446744073709551613 ( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
-2 == 18446744073709551614 ( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
-1 == 18446744073709551615 ( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
0 ( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
1 ( 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
2 ( 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 )
3 ( 1, 1, 1, 0, 0, 0, 0, 0, 0, 0 )
4 ( 1, 1, 1, 1, 0, 0, 0, 0, 0, 0 )
5 ( 1, 1, 1, 1, 1, 0, 0, 0, 0, 0 )
6 ( 1, 1, 1, 1, 1, 1, 0, 0, 0, 0 )
7 ( 1, 1, 1, 1, 1, 1, 1, 0, 0, 0 )
8 ( 1, 1, 1, 1, 1, 1, 1, 1, 0, 0 )
9 ( 1, 1, 1, 1, 1, 1, 1, 1, 1, 0 )
10 ( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 )
11 ( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 )

So this change seems in line with what the de facto implementation does (since I see signed comparison in the generated code).
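The difference between the two readings of the index operand can be illustrated with a small C++ sketch. It assumes a 64-bit index and models the mask as "lane i is set iff i < bound"; the helper names `asUnsigned` and `maskLanes` are hypothetical and do not correspond to anything in the lowering.

```cpp
#include <cstdint>
#include <vector>

// Reinterpret a signed 64-bit value as unsigned, which is what an
// unsigned reading of a 64-bit index would amount to.
uint64_t asUnsigned(int64_t value) { return static_cast<uint64_t>(value); }

// Lane i is set iff i < bound, with the comparison done either as
// signed or as unsigned.
std::vector<int> maskLanes(int64_t bound, int64_t numLanes, bool signedCompare) {
  std::vector<int> lanes;
  for (int64_t i = 0; i < numLanes; ++i) {
    bool active = signedCompare ? (i < bound)
                                : (asUnsigned(i) < asUnsigned(bound));
    lanes.push_back(active ? 1 : 0);
  }
  return lanes;
}
```

Under the signed comparison a bound of -2 produces an all-zero mask, matching the table above, whereas the unsigned reading of the same bit pattern would set every lane.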

This revision is now accepted and ready to land. Jan 18 2022, 5:35 PM

Thanks! I'll merge this tomorrow if there are no more comments.
(As an alternative we can merge only the half of this change related to values larger than the vector size, and keep the old behavior for negative values, but it's probably better to explicitly require that the mask index should be interpreted as a signed integer than to leave this unspecified).

Closed by commit rG5abf11632245: [mlir][vector] Allow values outside of [0; dim-size] in create_mask (authored by sgrechanik). Jan 20 2022, 10:00 AM

This revision was automatically updated to reflect the committed changes.

sgrechanik added a commit: rG5abf11632245: [mlir][vector] Allow values outside of [0; dim-size] in create_mask.

jsetoain mentioned this in D118248: [mlir][Vector] Enable create_mask for scalable vectors. Feb 9 2022, 9:11 AM

Revision Contents

Path                                                              Size
mlir/include/mlir/Dialect/Vector/VectorOps.td                     8 lines
mlir/lib/Dialect/Vector/VectorOps.cpp                             15 lines
mlir/test/Dialect/Vector/canonicalize.mlir                        33 lines
mlir/test/Integration/Dialect/Vector/CPU/test-create-mask.mlir    6 lines

Diff 401680

mlir/include/mlir/Dialect/Vector/VectorOps.td

@@ def Vector_ConstantMaskOp :
   let description = [{
     Creates and returns a vector mask where elements of the result vector
     are set to '0' or '1', based on whether the element indices are contained
     within a hyper-rectangular region specified by the 'mask_dim_sizes'
     array attribute argument. Each element of the 'mask_dim_sizes' array,
     specifies an exclusive upper bound [0, mask-dim-size-element-value)
     for a unique dimension in the vector result. The conjunction of the ranges
     define a hyper-rectangular region within which elements values are set to 1
-    (otherwise element values are set to 0).
+    (otherwise element values are set to 0). Each value of 'mask_dim_sizes' must
+    be non-negative and not greater than the size of the corresponding vector
+    dimension (as opposed to vector.create_mask which allows this).

     Example:

     ```mlir
     // create a constant vector mask of size 4x3xi1 with elements in range
     // 0 <= row <= 2 and 0 <= col <= 1 are set to 1 (others to 0).
     %1 = vector.constant_mask [3, 2] : vector<4x3xi1>

@@ def Vector_CreateMaskOp :
   let summary = "creates a vector mask";
   let description = [{
     Creates and returns a vector mask where elements of the result vector
     are set to '0' or '1', based on whether the element indices are contained
     within a hyper-rectangular region specified by the operands. Specifically,
     each operand specifies a range [0, operand-value) for a unique dimension in
     the vector result. The conjunction of the operand ranges define a
     hyper-rectangular region within which elements values are set to 1
-    (otherwise element values are set to 0).
+    (otherwise element values are set to 0). If operand-value is negative, it is
+    treated as if it were zero, and if it is greater than the corresponding
+    dimension size, it is treated as if it were equal to the dimension size.

     Example:

     ```mlir
     // create a vector mask of size 4x3xi1 where elements in range
     // 0 <= row <= 2 and 0 <= col <= 1 are set to 1 (others to 0).
     %1 = vector.create_mask %c3, %c2 : vector<4x3xi1>
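The updated create_mask description can be read as the following small model for the 2-D case. This is a sketch under the documented semantics only (negative operands act as zero, oversized operands act as the dimension size); `createMask2D` is a hypothetical name and the function is not how MLIR lowers the op.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model of a 2-D vector.create_mask: element (i, j) is 1 iff
// i < clamp(rowBound, 0, rows) and j < clamp(colBound, 0, cols).
std::vector<std::vector<int>> createMask2D(int64_t rowBound, int64_t colBound,
                                           int64_t rows, int64_t cols) {
  int64_t r = std::max<int64_t>(0, std::min(rowBound, rows));
  int64_t c = std::max<int64_t>(0, std::min(colBound, cols));
  std::vector<std::vector<int>> mask(rows, std::vector<int>(cols, 0));
  for (int64_t i = 0; i < r; ++i)
    for (int64_t j = 0; j < c; ++j)
      mask[i][j] = 1;
  return mask;
}
```

For a vector<4x3xi1> result, operands (3, 2) set rows 0 to 2 and columns 0 to 1, operands (5, 2) behave like (4, 2), and operands (5, -2) give an all-zero mask.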

mlir/lib/Dialect/Vector/VectorOps.cpp

@@ LogicalResult matchAndRewrite(CreateMaskOp createMaskOp,
     // Return if any of 'createMaskOp' operands are not defined by a constant.
     auto isNotDefByConstant = [](Value operand) {
       return !isa_and_nonnull<arith::ConstantIndexOp>(operand.getDefiningOp());
     };
     if (llvm::any_of(createMaskOp.operands(), isNotDefByConstant))
       return failure();
     // Gather constant mask dimension sizes.
     SmallVector<int64_t, 4> maskDimSizes;
-    for (auto operand : createMaskOp.operands()) {
-      auto *defOp = operand.getDefiningOp();
-      maskDimSizes.push_back(cast<arith::ConstantIndexOp>(defOp).value());
+    for (auto it : llvm::zip(createMaskOp.operands(),
+                             createMaskOp.getType().getShape())) {
+      auto *defOp = std::get<0>(it).getDefiningOp();
+      int64_t maxDimSize = std::get<1>(it);
+      int64_t dimSize = cast<arith::ConstantIndexOp>(defOp).value();
+      dimSize = std::min(dimSize, maxDimSize);
+      // If one of dim sizes is zero, set all dims to zero.
+      if (dimSize <= 0) {
+        maskDimSizes.assign(createMaskOp.getType().getRank(), 0);
+        break;
+      }
+      maskDimSizes.push_back(dimSize);
     }
     // Replace 'createMaskOp' with ConstantMaskOp.
     rewriter.replaceOpWithNewOp<ConstantMaskOp>(
         createMaskOp, createMaskOp.getResult().getType(),
         vector::getVectorSubscriptAttr(rewriter, maskDimSizes));
     return success();
   }
 };
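The clamping loop in this hunk can be tried out in isolation with a standalone sketch that replaces the MLIR types by plain vectors. The function name `foldMaskDimSizes` is hypothetical; the logic is meant to mirror the patched loop, with the operand constants and the result shape passed in directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone model of the patched loop: clamp each constant operand to its
// dimension size; if any clamped value is <= 0 the mask is empty, so every
// dimension size is set to zero.
std::vector<int64_t> foldMaskDimSizes(const std::vector<int64_t> &operands,
                                      const std::vector<int64_t> &shape) {
  assert(operands.size() == shape.size() && "one operand per dimension");
  std::vector<int64_t> maskDimSizes;
  for (std::size_t i = 0; i < operands.size(); ++i) {
    int64_t dimSize = std::min(operands[i], shape[i]);
    if (dimSize <= 0) {
      maskDimSizes.assign(shape.size(), 0);
      break;
    }
    maskDimSizes.push_back(dimSize);
  }
  return maskDimSizes;
}
```

For a 4x3 shape, operands (5, 2) fold to [4, 2], while (5, -2) and (0, 2) both fold to [0, 0], which are the values checked by the new canonicalization tests in this revision. Zeroing every dimension reflects that a hyper-rectangle with any empty extent contains no elements.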

mlir/test/Dialect/Vector/canonicalize.mlir

 // RUN: mlir-opt %s -pass-pipeline='builtin.func(canonicalize)' -split-input-file -allow-unregistered-dialect | FileCheck %s

 // -----

 // CHECK-LABEL: create_vector_mask_to_constant_mask
 func @create_vector_mask_to_constant_mask() -> (vector<4x3xi1>) {
   %c2 = arith.constant 2 : index
   %c3 = arith.constant 3 : index
   // CHECK: vector.constant_mask [3, 2] : vector<4x3xi1>
   %0 = vector.create_mask %c3, %c2 : vector<4x3xi1>
   return %0 : vector<4x3xi1>
 }

 // -----

+// CHECK-LABEL: create_vector_mask_to_constant_mask_truncation
+func @create_vector_mask_to_constant_mask_truncation() -> (vector<4x3xi1>) {
+  %c2 = arith.constant 2 : index
+  %c5 = arith.constant 5 : index
+  // CHECK: vector.constant_mask [4, 2] : vector<4x3xi1>
+  %0 = vector.create_mask %c5, %c2 : vector<4x3xi1>
+  return %0 : vector<4x3xi1>
+}
+
+// -----
+
+// CHECK-LABEL: create_vector_mask_to_constant_mask_truncation_neg
+func @create_vector_mask_to_constant_mask_truncation_neg() -> (vector<4x3xi1>) {
+  %cneg2 = arith.constant -2 : index
+  %c5 = arith.constant 5 : index
+  // CHECK: vector.constant_mask [0, 0] : vector<4x3xi1>
+  %0 = vector.create_mask %c5, %cneg2 : vector<4x3xi1>
+  return %0 : vector<4x3xi1>
+}
+
+// -----
+
+// CHECK-LABEL: create_vector_mask_to_constant_mask_truncation_zero
+func @create_vector_mask_to_constant_mask_truncation_zero() -> (vector<4x3xi1>) {
+  %c2 = arith.constant 2 : index
+  %c0 = arith.constant 0 : index
+  // CHECK: vector.constant_mask [0, 0] : vector<4x3xi1>
+  %0 = vector.create_mask %c0, %c2 : vector<4x3xi1>
+  return %0 : vector<4x3xi1>
+}
+
+// -----
+
 func @extract_strided_slice_of_constant_mask() -> (vector<2x2xi1>) {
   %0 = vector.constant_mask [2, 2] : vector<4x3xi1>
   %1 = vector.extract_strided_slice %0
     {offsets = [0, 0], sizes = [2, 2], strides = [1, 1]}
       : vector<4x3xi1> to vector<2x2xi1>
   // CHECK: vector.constant_mask [2, 2] : vector<2x2xi1>
   return %1 : vector<2x2xi1>
 }

mlir/test/Integration/Dialect/Vector/CPU/test-create-mask.mlir

 // RUN: mlir-opt %s -convert-scf-to-std -convert-vector-to-llvm -convert-std-to-llvm -reconcile-unrealized-casts | \
 // RUN: mlir-cpu-runner -e entry -entry-point-result=void \
 // RUN: -shared-libs=%mlir_integration_test_dir/libmlir_c_runner_utils%shlibext | \
 // RUN: FileCheck %s

 func @entry() {
+  %cneg1 = arith.constant -1 : index
   %c0 = arith.constant 0 : index
   %c1 = arith.constant 1 : index
   %c2 = arith.constant 2 : index
   %c3 = arith.constant 3 : index
   %c6 = arith.constant 6 : index
+  %c7 = arith.constant 7 : index

   //
   // 1-D.
   //

   %1 = vector.create_mask %c2 : vector<5xi1>
   vector.print %1 : vector<5xi1>
   // CHECK: ( 1, 1, 0, 0, 0 )

-  scf.for %i = %c0 to %c6 step %c1 {
+  scf.for %i = %cneg1 to %c7 step %c1 {
     %2 = vector.create_mask %i : vector<5xi1>
     vector.print %2 : vector<5xi1>
   }
   // CHECK: ( 0, 0, 0, 0, 0 )
+  // CHECK: ( 0, 0, 0, 0, 0 )
   // CHECK: ( 1, 0, 0, 0, 0 )
   // CHECK: ( 1, 1, 0, 0, 0 )
   // CHECK: ( 1, 1, 1, 0, 0 )
   // CHECK: ( 1, 1, 1, 1, 0 )
   // CHECK: ( 1, 1, 1, 1, 1 )
+  // CHECK: ( 1, 1, 1, 1, 1 )

   //
   // 2-D.
   //

   %3 = vector.create_mask %c2, %c3 : vector<5x5xi1>
   vector.print %3 : vector<5x5xi1>
   // CHECK: ( ( 1, 1, 1, 0, 0 ), ( 1, 1, 1, 0, 0 ), ( 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0 ) )
