This is an archive of the discontinued LLVM Phabricator instance.


Differential D74365

[mlir][Linalg] Update semantics for Linalg generic ops with tensors.
Abandoned Public

Authored by hanchung on Feb 10 2020, 3:50 PM.


Details

Reviewers

nicolasvasilache
mravishankar

Summary

This diff enforces that the last arguments of the block/function correspond to
the result tensors. In a reduction, the result tensors must be read as inputs
during the calculation, so the block/function takes one additional argument per
result tensor.
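The argument layout described in the summary can be sketched as a small standalone model. This is not MLIR code: `GenericOpShape`, `expectedBlockArgCount`, and `firstResultArgIndex` are hypothetical names, and the layout (index arguments first for the indexed variant, then one argument per operand, then one per result tensor) is an assumption read off the verifier changes in this revision.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the block-argument layout proposed in this revision:
//   [index args (indexed_generic only)] [operands] [result tensors]
// None of these names are MLIR APIs; they are illustrative only.
struct GenericOpShape {
  std::size_t numIndexArgs; // 0 for linalg.generic, nLoops for indexed_generic
  std::size_t numOperands;  // operands passed to the op
  std::size_t numResults;   // result tensors returned by the op
};

// Total number of block (or function) arguments the verifier would expect.
std::size_t expectedBlockArgCount(const GenericOpShape &s) {
  return s.numIndexArgs + s.numOperands + s.numResults;
}

// Position of the first block argument that stands for a result tensor.
std::size_t firstResultArgIndex(const GenericOpShape &s) {
  return s.numIndexArgs + s.numOperands;
}
```

For the reduction test added to roundtrip.mlir (one tensor operand, one tensor result), this model gives two block arguments, with the second one standing for the result.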

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

hanchung created this revision. Feb 10 2020, 3:50 PM

Herald added a project: Restricted Project. Feb 10 2020, 3:50 PM

Herald added subscribers: llvm-commits, Joonsoo, liufengdb and 9 others.

Format lines within 80 characters.

Harbormaster failed remote builds in B46150: Diff 243690! Feb 10 2020, 4:22 PM

Harbormaster failed remote builds in B46153: Diff 243693! Feb 10 2020, 4:31 PM

mravishankar added inline comments. Feb 11 2020, 1:14 PM

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
Line 224: I think this is really a stop-gap measure, and it has some semantic issues because tensors are SSA values and this effectively implements in-place update semantics on tensors. I might be mistaken, but this only happens in cases where the iterator_types are "reduction". One way to get around this might be to use the approach in https://bugs.llvm.org/show_bug.cgi?id=44777

Fix index access in verify functions, and use more meaningful naming for indices.

Harbormaster failed remote builds in B46274: Diff 243999! Feb 11 2020, 2:49 PM

There are unsolved issues going from the buffer world to the tensor world when reductions are involved.
Changing the semantics of all the ops, including ones that don't have reductions, is not the right way to go here.
We need a proper way to tie a tensor result to a tensor operand.
We don't have a good mechanism for this at the moment, and this should be discussed with the bigger group.

If the objective is to unblock yourself, you could add an attribute to encode the convention that result p flows into operand k.
You can make this attribute whatever you want to encode this in your local work and ensure that the lowering to buffers does the right mapping when seeing the attribute.

None of this should change the semantics of the ops: this has farther-reaching consequences that we do not understand yet.
It is possible that an extra region that operates at the level of tensors is needed to encode this cleanly, along the lines of what Mahesh proposes in: https://bugs.llvm.org/show_bug.cgi?id=44777.
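The "result p flows into operand k" convention suggested above could be modeled as a list of index pairs. The sketch below is purely illustrative: the comment does not specify a format for the attribute, so `ResultOperandTies` and `tiesAreValid` are hypothetical names, and the one-to-one restriction (each result and each operand tied at most once) is an added assumption rather than something the review requires.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical encoding of "result p flows into operand k" as (p, k) pairs.
// This is not an MLIR attribute; it only models the convention.
using ResultOperandTies = std::vector<std::pair<std::size_t, std::size_t>>;

// Returns true if every pair is in range and, by assumption, no result or
// operand is tied more than once.
bool tiesAreValid(const ResultOperandTies &ties, std::size_t numResults,
                  std::size_t numOperands) {
  std::set<std::size_t> seenResults;
  std::set<std::size_t> seenOperands;
  for (const auto &tie : ties) {
    const std::size_t result = tie.first;
    const std::size_t operand = tie.second;
    if (result >= numResults || operand >= numOperands)
      return false; // index out of range
    if (!seenResults.insert(result).second)
      return false; // result already tied
    if (!seenOperands.insert(operand).second)
      return false; // operand already tied
  }
  return true;
}
```

A lowering to buffers could consult such a mapping to decide which operand buffer backs each result, which keeps the op semantics themselves unchanged.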

This revision now requires changes to proceed. Feb 11 2020, 2:57 PM

mravishankar resigned from this revision. Mar 11 2020, 3:37 PM

hanchung abandoned this revision. Jul 13 2020, 9:37 AM

Herald added a project: Restricted Project. Jul 13 2020, 9:37 AM

Herald added subscribers: limo1996, msifontes, jurahul and 4 others.

Revision Contents

Path	Size
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp	126 lines
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp	11 lines
mlir/test/Dialect/Linalg/fusion-tensor.mlir	24 lines
mlir/test/Dialect/Linalg/invalid.mlir	4 lines
mlir/test/Dialect/Linalg/roundtrip.mlir	40 lines

Diff 243999

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

if (parser.parseOptionalArrowTypeList(tensorResultTypes))
return failure();		return failure();
if (!tensorResultTypes.empty())		if (!tensorResultTypes.empty())
result.addTypes(tensorResultTypes);		result.addTypes(tensorResultTypes);
return parser.resolveOperands(operandsInfo, operandTypes,		return parser.resolveOperands(operandsInfo, operandTypes,
parser.getCurrentLocation(), result.operands);		parser.getCurrentLocation(), result.operands);
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyBlockArgs(GenericOpType op, Block &block);		static LogicalResult verifyBlockArgsWithOffset(GenericOpType op, Block &block,
		int offset) {
template <> LogicalResult verifyBlockArgs(GenericOp op, Block &block) {
auto nOperands = op.getNumOperands();
if (block.getNumArguments() != nOperands)
return op.emitOpError("expected number of block arguments to match number "
"of operands");

// Note: the number and type of yield values are checked in the YieldOp.		// Note: the number and type of yield values are checked in the YieldOp.
		auto nOperands = op.getNumOperands();
auto nInputViews = op.getNumInputs();		auto nInputViews = op.getNumInputs();
for (unsigned i = 0; i < nOperands; ++i) {		for (unsigned i = 0; i < nOperands; ++i) {
		int idx = i + offset;
auto viewType = op.getShapedType(i);		auto viewType = op.getShapedType(i);
if (viewType.getElementType() != block.getArgument(i).getType())		if (viewType.getElementType() != block.getArgument(idx).getType())
return op.emitOpError("expected block argument ")		return op.emitOpError("expected block argument ")
<< (i + 1) << " of the same type as elemental type of "		<< (idx + 1) << " of the same type as elemental type of "
<< ((i < nInputViews) ? "input " : "output ")		<< ((i < nInputViews) ? "input " : "output ")
<< "operand: " << viewType;		<< "operand: " << viewType;
}		}

		auto nResults = op.getNumResults();
		for (unsigned i = 0; i < nResults; ++i) {
		int shapedTypeIndex = i + nOperands;
		int blockArgIndex = shapedTypeIndex + offset;
		auto viewType = op.getShapedType(shapedTypeIndex);
		if (viewType.getElementType() != block.getArgument(blockArgIndex).getType())
		return op.emitOpError("expected block argument ")
		<< (blockArgIndex + 1)
		<< " of the same type as elemental type of result "
		<< "operand: " << viewType;
		}
return success();		return success();
}		}

		template <typename GenericOpType>
		static LogicalResult verifyBlockArgs(GenericOpType op, Block &block);

		template <> LogicalResult verifyBlockArgs(GenericOp op, Block &block) {
		auto nOperands = op.getNumOperands();
		auto nResults = op.getNumResults();
		if (block.getNumArguments() != nOperands + nResults)
		return op.emitOpError("expected number of block arguments to match number "
		"of operands + number of results");
		return verifyBlockArgsWithOffset(op, block, /*offset=*/0);
		}

template <> LogicalResult verifyBlockArgs(IndexedGenericOp op, Block &block) {		template <> LogicalResult verifyBlockArgs(IndexedGenericOp op, Block &block) {
auto nInputViews = op.getNumInputs();
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
auto nOperands = op.getNumOperands();		auto nOperands = op.getNumOperands();
if (block.getNumArguments() != nOperands + nLoops)		auto nResults = op.getNumResults();
		if (block.getNumArguments() != nOperands + nLoops + nResults)
return op.emitOpError(		return op.emitOpError(
"expected number of block arguments to match number of operands + "		"expected number of block arguments to match number of operands + "
"number of loops");		"number of loops + number of results");

// Note: the number and type of yield values are checked in the YieldOp.
for (unsigned i = 0; i < nLoops; ++i)		for (unsigned i = 0; i < nLoops; ++i)
if (!block.getArgument(i).getType().isIndex())		if (!block.getArgument(i).getType().isIndex())
return op.emitOpError("expected block argument ")		return op.emitOpError("expected block argument ")
<< (i + 1) << " to be an index";		<< (i + 1) << " to be an index";
		return verifyBlockArgsWithOffset(op, block, nLoops);
for (unsigned i = 0; i < nOperands; ++i) {
unsigned memrefArgIndex = i + nLoops;
auto viewType = op.getShapedType(i);
if (viewType.getElementType() !=
block.getArgument(memrefArgIndex).getType())
return op.emitOpError("expected block argument ")
<< (memrefArgIndex + 1)
<< " of the same type as elemental type of "
<< ((i < nInputViews) ? "input " : "output ")
<< "operand: " << viewType;
}
return success();
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyFuncArgs(GenericOpType op, FunctionType funType);		static LogicalResult verifyFuncArgs(GenericOpType op, FunctionType funType);

template <typename GenericOpType>		template <typename GenericOpType>
LogicalResult verifyFuncArgsGeneric(GenericOpType op, FunctionType funType) {		LogicalResult verifyFuncArgsGeneric(GenericOpType op, FunctionType funType) {
auto res = verifyFuncArgs(op, funType);		auto res = verifyFuncArgs(op, funType);
if (failed(res))		if (failed(res))
return res;		return res;

auto nInputs = op.getNumInputs();		auto nInputs = op.getNumInputs();
auto nOutputs = op.getNumOutputs();		auto nOutputs = op.getNumOutputs();
// linalg.generic output element types are exactly the function results.		// linalg.generic output element types are exactly the function results.
for (unsigned idx = 0; idx < nOutputs; ++idx) {		for (unsigned idx = 0; idx < nOutputs; ++idx) {
ShapedType shapedType = op.getShapedType(nInputs + idx);		ShapedType shapedType = op.getShapedType(nInputs + idx);
if (funType.getResult(idx) != shapedType.getElementType())		if (funType.getResult(idx) != shapedType.getElementType())
return op.emitOpError("expected function result ")		return op.emitOpError("expected function result ")
<< (idx + 1) << " of the same type as elemental type "		<< (idx + 1) << " of the same type as elemental type "
<< shapedType.getElementType() << " of output " << (idx + 1);		<< shapedType.getElementType() << " of output " << (idx + 1);
}		}
return success();		return success();
}		}

		template <typename GenericOpType>
		static LogicalResult
		verifyFuncArgsWithOffset(GenericOpType op, FunctionType funType, int offset) {
		// linalg.generic operands element types are exactly the first function
		// arguments.
		auto nOperands = op.getNumOperands();
		for (unsigned i = 0; i < nOperands; ++i) {
		int funcArgIndex = i + offset;
		ShapedType shapedType = op.getShapedType(i);
		if (funType.getInput(funcArgIndex) != shapedType.getElementType())
		return op.emitOpError("expected function argument ")
		<< (funcArgIndex + 1) << " of the same type as elemental type "
		<< shapedType.getElementType() << " of input " << (i + 1);
		}

		auto nResults = op.getNumResults();
		for (unsigned i = 0; i < nResults; ++i) {
		int shapedTypeIndex = i + nOperands;
		int funcArgIndex = shapedTypeIndex + offset;
		ShapedType shapedType = op.getShapedType(shapedTypeIndex);
		if (funType.getInput(funcArgIndex) != shapedType.getElementType())
		return op.emitOpError("expected function argument ")
		<< (funcArgIndex + 1) << " of the same type as elemental type "
		<< shapedType.getElementType() << " of result " << (i + 1);
		}

		return success();
		}

template <> LogicalResult verifyFuncArgs(GenericOp op, FunctionType funType) {		template <> LogicalResult verifyFuncArgs(GenericOp op, FunctionType funType) {
auto nOperands = op.getNumOperands();		auto nOperands = op.getNumOperands();
if (funType.getNumInputs() != nOperands)		auto nResults = op.getNumResults();
		if (funType.getNumInputs() != nOperands + nResults)
return op.emitOpError(		return op.emitOpError(
"expected function arguments to match number of operands");		"expected function arguments to match number of operands + number of "
		"results");
if (funType.getNumResults() != op.getNumOutputs())		if (funType.getNumResults() != op.getNumOutputs())
return op.emitOpError("expected function results(")		return op.emitOpError("expected function results(")
<< funType.getNumResults() << ") to match number of outputs("		<< funType.getNumResults() << ") to match number of outputs("
<< op.getNumOutputs() << ")";		<< op.getNumOutputs() << ")";

// linalg.generic operands element types are exactly the first function		return verifyFuncArgsWithOffset(op, funType, /*offset=*/0);
// arguments.
for (unsigned idx = 0; idx < nOperands; ++idx) {
ShapedType shapedType = op.getShapedType(idx);
if (funType.getInput(idx) != shapedType.getElementType())
return op.emitOpError("expected function argument ")
<< (idx + 1) << " of the same type as elemental type "
<< shapedType.getElementType() << " of operand " << (idx + 1);
}

return success();
}		}

template <>		template <>
LogicalResult verifyFuncArgs(IndexedGenericOp op, FunctionType funType) {		LogicalResult verifyFuncArgs(IndexedGenericOp op, FunctionType funType) {
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
auto nOutputs = op.getNumOutputs();		auto nOutputs = op.getNumOutputs();
auto nOperands = op.getNumOperands();		auto nOperands = op.getNumOperands();
if (funType.getNumInputs() != nOperands + nLoops)		auto nResults = op.getNumResults();
		if (funType.getNumInputs() != nOperands + nLoops + nResults)
return op.emitOpError("expected function arguments to match number of "		return op.emitOpError("expected function arguments to match number of "
"loops + number of operands");		"loops + number of operands + number of results");
if (funType.getNumResults() != nOutputs)		if (funType.getNumResults() != nOutputs)
return op.emitOpError(		return op.emitOpError(
"expected function results to match number of outputs");		"expected function results to match number of outputs");
for (unsigned i = 0; i < nLoops; ++i)		for (unsigned i = 0; i < nLoops; ++i)
if (!funType.getInput(i).isIndex())		if (!funType.getInput(i).isIndex())
return op.emitOpError("expected function argument ")		return op.emitOpError("expected function argument ")
<< (i + 1) << " to be an index";		<< (i + 1) << " to be an index";

// linalg.generic operands element types are exactly the first function		return verifyFuncArgsWithOffset(op, funType, nLoops);
// arguments.
for (unsigned idx = 0; idx < nOperands; ++idx) {
ShapedType shapedType = op.getShapedType(idx);
if (funType.getInput(idx + nLoops) != shapedType.getElementType())
return op.emitOpError("expected function argument ")
<< (idx + nLoops + 1) << " of the same type as elemental type "
<< shapedType.getElementType() << " of input " << (idx + 1);
}

return success();
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyGenericOp(GenericOpType op) {		static LogicalResult verifyGenericOp(GenericOpType op) {
auto nInputViews = op.getNumInputs();		auto nInputViews = op.getNumInputs();
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();		auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();
if (nInputsAndOutputBuffers != llvm::size(op.views()))		if (nInputsAndOutputBuffers != llvm::size(op.views()))
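The offset-based check that `verifyBlockArgsWithOffset` performs in the diff above can be modeled without MLIR. In this minimal sketch, element types are plain strings and `blockArgsMatch` is a hypothetical helper; it assumes the shaped types are listed as operands followed by results, and that `offset` counts the leading index arguments (0 for linalg.generic, the loop count for linalg.indexed_generic).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for an element type; the real verifier compares mlir::Type values.
using ElemType = std::string;

// Returns true when the block arguments, after skipping `offset` leading
// index arguments, match the element types of the operands followed by the
// results, both in count and in type.
bool blockArgsMatch(const std::vector<ElemType> &elemTypes,
                    const std::vector<ElemType> &blockArgTypes,
                    std::size_t offset) {
  if (blockArgTypes.size() != elemTypes.size() + offset)
    return false; // wrong number of block arguments
  for (std::size_t i = 0; i < elemTypes.size(); ++i) {
    if (blockArgTypes[i + offset] != elemTypes[i])
      return false; // element type mismatch at block argument i + offset
  }
  return true;
}
```

Sharing one helper between the two op kinds is the design choice the diff makes: only the offset differs, so the per-argument type comparison does not need to be duplicated.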

mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp

Optional<LinalgOp> mlir::linalg::fuseTensorOps(OpBuilder &b, LinalgOp producer,
Block &consumerOpBlock = consumerOp.region().front();		Block &consumerOpBlock = consumerOp.region().front();
Block *fusedBlock = new Block();		Block *fusedBlock = new Block();
fusedOpRegion.push_back(fusedBlock);		fusedOpRegion.push_back(fusedBlock);
BlockAndValueMapping mapper;		BlockAndValueMapping mapper;
// Map the arguments for the unmodified args from the consumer.		// Map the arguments for the unmodified args from the consumer.
for (auto consumerOpArg : llvm::enumerate(consumerOpBlock.getArguments())) {		for (auto consumerOpArg : llvm::enumerate(consumerOpBlock.getArguments())) {
if (consumerOpArg.index() == consumerIdx) {		if (consumerOpArg.index() == consumerIdx) {
// Map the arguments for the args from the producer.		// Map the arguments for the args from the producer.
for (auto producerOpArg : producerOpBlock.getArguments())		for (auto producerOpArg :
mapper.map(producerOpArg,		llvm::enumerate(producerOpBlock.getArguments())) {
fusedBlock->addArgument(producerOpArg.getType()));		// Skip the operands corresponding to the results.
		if (producerOpArg.index() >= producerOp.getNumInputs())
		continue;
		mapper.map(producerOpArg.value(),
		fusedBlock->addArgument(producerOpArg.value().getType()));
		}
continue;		continue;
}		}
mapper.map(consumerOpArg.value(),		mapper.map(consumerOpArg.value(),
fusedBlock->addArgument(consumerOpArg.value().getType()));		fusedBlock->addArgument(consumerOpArg.value().getType()));
}		}

// Add operations from producer (except the yield operation) to the fused op.		// Add operations from producer (except the yield operation) to the fused op.
for (auto &op : producerOpBlock.getOperations()) {		for (auto &op : producerOpBlock.getOperations()) {
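The Fusion.cpp change above skips producer block arguments that stand for results when building the fused block. A minimal standalone sketch of that filtering step follows, assuming block argument types are represented as strings; `producerArgsToForward` is a hypothetical name, not part of the Linalg fusion code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a block argument type; the real code uses mlir::Type.
using ArgType = std::string;

// Returns the producer block arguments that would be re-created on the fused
// block: only the first `numInputs` arguments. Trailing arguments, which
// correspond to result tensors under this revision, are skipped.
std::vector<ArgType>
producerArgsToForward(const std::vector<ArgType> &producerBlockArgs,
                      std::size_t numInputs) {
  std::vector<ArgType> forwarded;
  for (std::size_t i = 0; i < producerBlockArgs.size(); ++i) {
    if (i >= numInputs)
      continue; // argument corresponds to a result, not an input
    forwarded.push_back(producerBlockArgs[i]);
  }
  return forwarded;
}
```

For a producer with two inputs and one result, the block has three `f32` arguments and only the first two are forwarded.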

mlir/test/Dialect/Linalg/fusion-tensor.mlir

	// RUN: mlir-opt %s -linalg-fusion-for-tensor-ops -split-input-file | FileCheck %s --dump-input-on-failure			// RUN: mlir-opt %s -linalg-fusion-for-tensor-ops -split-input-file | FileCheck %s --dump-input-on-failure

	// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>

	// CHECK-LABEL: @add_mul_fusion			// CHECK-LABEL: @add_mul_fusion
	func @add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {			%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64
	// CHECK-SAME: indexing_maps = {{\[}}[[MAP0]], [[MAP0]], [[MAP0]], [[MAP0]]{{\]}}			// CHECK-SAME: indexing_maps = {{\[}}[[MAP0]], [[MAP0]], [[MAP0]], [[MAP0]]{{\]}}
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: [[ARG0:%[a-zA-Z0-9_]*]]			// CHECK-SAME: [[ARG0:%[a-zA-Z0-9_]*]]
	// CHECK-SAME: [[ARG1:%[a-zA-Z0-9_]*]]			// CHECK-SAME: [[ARG1:%[a-zA-Z0-9_]*]]
	// CHECK-SAME: [[ARG2:%[a-zA-Z0-9_]*]]			// CHECK-SAME: [[ARG2:%[a-zA-Z0-9_]*]]
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors
	// CHECK: [[T1:%[a-zA-Z0-9_]*]] = addf [[ARG0]], [[ARG1]]			// CHECK: [[T1:%[a-zA-Z0-9_]*]] = addf [[ARG0]], [[ARG1]]
	// CHECK-NOT: linalg.yield			// CHECK-NOT: linalg.yield
	// CHECK: mulf [[T1]], [[ARG2]]			// CHECK: mulf [[T1]], [[ARG2]]
	// CHECK: linalg.yield			// CHECK: linalg.yield
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg6, %arg7 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-DAG: [[MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>			// CHECK-DAG: [[MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1) -> (d1, d0)>			#map1 = affine_map<(d0, d1) -> (d1, d0)>

	// CHECK-LABEL: @transpose_add_mul_fusion			// CHECK-LABEL: @transpose_add_mul_fusion
	func @transpose_add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @transpose_add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {			%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64
	// CHECK-SAME: indexing_maps = {{\[}}[[MAP0]], [[MAP1]], [[MAP0]], [[MAP0]]{{\]}}			// CHECK-SAME: indexing_maps = {{\[}}[[MAP0]], [[MAP1]], [[MAP0]], [[MAP0]]{{\]}}
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg6, %arg7 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-DAG: [[MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>			// CHECK-DAG: [[MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1) -> (d1, d0)>			#map1 = affine_map<(d0, d1) -> (d1, d0)>

	// CHECK-LABEL: @add_transpose_mul_fusion			// CHECK-LABEL: @add_transpose_mul_fusion
	func @add_transpose_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @add_transpose_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {			%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64
	// CHECK-SAME: indexing_maps = {{\[}}[[MAP1]], [[MAP0]], [[MAP0]], [[MAP0]]{{\]}}			// CHECK-SAME: indexing_maps = {{\[}}[[MAP1]], [[MAP0]], [[MAP0]], [[MAP0]]{{\]}}
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg6, %arg7 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-DAG: [[MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0)>			// CHECK-DAG: [[MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0)>
	// CHECK-DAG: [[MAP2:#[a-zA-Z0-9_]*]] = affine_map<(d0) -> (d0)>			// CHECK-DAG: [[MAP2:#[a-zA-Z0-9_]*]] = affine_map<(d0) -> (d0)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1) -> (d0)>			#map1 = affine_map<(d0, d1) -> (d0)>
	#map2 = affine_map<(d0) -> (d0)>			#map2 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: @add_broadcast_mul_fusion			// CHECK-LABEL: @add_broadcast_mul_fusion
	func @add_broadcast_mul_fusion(%arg0: tensor<?xf32>, %arg1 : tensor<?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @add_broadcast_mul_fusion(%arg0: tensor<?xf32>, %arg1 : tensor<?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map2, #map2, #map2], iterator_types = ["parallel"]} %arg0, %arg1 {			%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map2, #map2, #map2], iterator_types = ["parallel"]} %arg0, %arg1 {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>			}: tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64
	// CHECK-SAME: indexing_maps = {{\[}}[[MAP1]], [[MAP1]], [[MAP0]], [[MAP0]]			// CHECK-SAME: indexing_maps = {{\[}}[[MAP1]], [[MAP1]], [[MAP0]], [[MAP0]]
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg6, %arg7 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			}: tensor<?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

mlir/test/Dialect/Linalg/invalid.mlir

func @generic_mismatched_num_returns(%arg0: memref<f32>) {
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

func @foo(%0: i32, %1: i32, %2: i32) { return }		func @foo(%0: i32, %1: i32, %2: i32) { return }

func @generic_mismatched_num_returns(%0: memref<i32>, %1: memref<f32>) {		func @generic_mismatched_num_returns(%0: memref<i32>, %1: memref<f32>) {
// expected-error @+1 {{op expected function argument 2 of the same type as elemental type 'f32' of operand 2}}		// expected-error @+1 {{op expected function argument 2 of the same type as elemental type 'f32' of input 2}}
linalg.generic {		linalg.generic {
args_in = 3,		args_in = 3,
args_out = 0,		args_out = 0,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %0, %1, %1: memref<i32>, memref<f32>, memref<f32>		} %0, %1, %1: memref<i32>, memref<f32>, memref<f32>
}		}
// -----		// -----

func @foo(%0: i32) -> f32 {		func @foo(%0: i32) -> f32 {
%1 = constant 0.0: f32		%1 = constant 0.0: f32
return %1: f32		return %1: f32
}		}

func @generic_fun_arg_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_fun_arg_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected function argument 1 of the same type as elemental type 'f32' of operand 1}}		// expected-error @+1 {{op expected function argument 1 of the same type as elemental type 'f32' of input 1}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>		} %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>
}		}

mlir/test/Dialect/Linalg/roundtrip.mlir

	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, fun = @foo,			// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, fun = @foo,
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_1"}			// CHECK-SAME: library_call = "some_external_function_name_1"}
	// CHECK-SAME: {foo = 1 : i64}:			// CHECK-SAME: {foo = 1 : i64}:
	// CHECK-SAME: tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, #[[strided3D]]>			// CHECK-SAME: tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, #[[strided3D]]>

	// -----			// -----

	func @foo(%0: vector<3x4xi4>, %1: f32) -> f32 {			func @foo(%0: vector<3x4xi4>, %1: f32, %2: f32) -> f32 {
	%f0 = constant 0.0 : f32			%f0 = constant 0.0 : f32
	return %f0 : f32			return %f0 : f32
	}			}

	#accesses = [			#accesses = [
	affine_map<(i, j, k) -> (j, i)>,			affine_map<(i, j, k) -> (j, i)>,
	affine_map<(i, j, k) -> (i, k, i + j)>			affine_map<(i, j, k) -> (i, k, i + j)>
	]			]

	#trait2 = {			#trait2 = {
	args_in = 2,			args_in = 2,
	args_out = 1,			args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel"],
	fun = @foo,			fun = @foo,
	library_call = "some_external_function_name_1"			library_call = "some_external_function_name_1"
	}			}

	func @generic_with_tensor_input_and_output(			func @generic_function_with_tensor_input_and_output(
	%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>)			%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>)
	-> (tensor<?x?x?xf32>) {			-> (tensor<?x?x?xf32>) {
	%0 = linalg.generic #trait2 %arg0, %arg1 {foo = 1} :			%0 = linalg.generic #trait2 %arg0, %arg1 {foo = 1} :
	tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	return %0 : tensor<?x?x?xf32>			return %0 : tensor<?x?x?xf32>
	}			}
	// CHECK-LABEL: func @generic_with_tensor_input_and_output			// CHECK-LABEL: func @generic_function_with_tensor_input_and_output
	// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64, fun = @foo,			// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64, fun = @foo,
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}:			// CHECK-SAME: library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}:
	// CHECK-SAME: tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			// CHECK-SAME: tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	// CHECK: return {{.*}} : tensor<?x?x?xf32>			// CHECK: return {{.*}} : tensor<?x?x?xf32>

	// -----			// -----

				#accesses = [
				affine_map<(i, j) -> (i, j)>,
				affine_map<(i, j) -> (i)>
				]

				#trait = {
				args_in = 1,
				args_out = 1,
				indexing_maps = #accesses,
				iterator_types = ["parallel", "reduction"],
				library_call = "some_external_function_name_1"
				}

				func @generic_block_with_tensor_input_and_output(
				%arg0: tensor<2x4xf32>, %arg1: tensor<2xf32>) -> (tensor<2xf32>) {
				%0 = linalg.generic #trait %arg0 {
				^bb0(%arg2: f32, %arg3: f32): // no predecessors
				%res = addf %arg2, %arg3 : f32
				linalg.yield %res : f32
				}: tensor<2x4xf32> -> tensor<2xf32>
				return %0 : tensor<2xf32>
				}
				// CHECK-LABEL: func @generic_block_with_tensor_input_and_output
				// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64,
				// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "reduction"]
				// CHECK-SAME: } %{{.*}} {
				// CHECK: ^bb0([[in:%.*]]: f32, [[out:%.*]]: f32):
				// CHECK: [[res:%.*]] = addf [[in]], [[out]] : f32
				// CHECK: linalg.yield [[res]] : f32
				// CHECK: }: tensor<2x4xf32> -> tensor<2xf32>
				// CHECK: return {{.*}} : tensor<2xf32>

				// -----

	// CHECK-DAG: #[[strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>			// CHECK-DAG: #[[strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>
	// CHECK-DAG: #[[strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>			// CHECK-DAG: #[[strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>

	#accesses = [			#accesses = [
	affine_map<(i, j, k) -> (j, i)>,			affine_map<(i, j, k) -> (j, i)>,
	affine_map<(i, j, k) -> (i, k, i + j)>			affine_map<(i, j, k) -> (i, k, i + j)>
	]			]
