I've been trying to come up with a simple and clean implementation for
ReLU. TOSA uses clamp, which is probably the end goal, but that means
TableGen work to make it efficient (attributes, lowering only min or max).
For now, max is a reasonable named op in its own right, beyond just ReLU,
so we can start using it for tiling and fusion, and once that works, we
create a more complete clamp op that doesn't need a whole tensor filled
with zeroes or ones to implement the different activation functions.
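
As a rough sketch (made-up shapes and SSA names, assuming the max named op
lands as proposed), ReLU at the linalg level would be an element-wise max
against a zero-filled tensor:

```mlir
// Materialise the zero tensor that ReLU compares against.
%c0    = arith.constant 0.0 : f32
%empty = tensor.empty() : tensor<4x8xf32>
%zeros = linalg.fill ins(%c0 : f32) outs(%empty : tensor<4x8xf32>) -> tensor<4x8xf32>

// ReLU(%x) = element-wise max(%x, 0).
%init = tensor.empty() : tensor<4x8xf32>
%relu = linalg.max ins(%x, %zeros : tensor<4x8xf32>, tensor<4x8xf32>)
                   outs(%init : tensor<4x8xf32>) -> tensor<4x8xf32>
```

That whole zero-filled tensor is exactly the overhead the future
clamp-with-attributes op would avoid.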
As with other named ops, we start by "requiring" explicit type casts,
broadcasts, and zero-filled constant tensors, leaving that complexity to a
more elaborate pattern-matcher, and we can slowly simplify with attributes
or structured matchers (ex. PDL) in the future.
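
For comparison, this is roughly what the attribute-based form already looks
like in TOSA (generic syntax; attribute names may differ between TOSA
versions):

```mlir
// ReLU expressed as a clamp whose bounds are attributes,
// so no zero-filled tensor is needed.
%relu = "tosa.clamp"(%x) {
          min_fp = 0.0 : f32, max_fp = 3.40282347E+38 : f32,
          min_int = 0 : i64, max_int = 2147483647 : i64
        } : (tensor<4x8xf32>) -> tensor<4x8xf32>
```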