This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Linalg] NFC - Add bufferization options
AbandonedPublic

Authored by nicolasvasilache on Nov 13 2020, 7:41 AM.

Details

Summary

This revision adds bufferization options to Linalg with a hook to allow changing the Alloc behavior. The default behavior is unchanged.

Diff Detail

Event Timeline

nicolasvasilache requested review of this revision. Nov 13 2020, 7:41 AM
silvas requested changes to this revision. Nov 13 2020, 11:55 AM

We are generally moving in the direction of using std.alloc by default, and having a separate pass that promotes std.alloc to std.alloca before buffer-deallocation runs. For example, with the promote-buffers-to-stack pass.

If you just want to convert all allocs to allocas, we might need a minor tweak to the logic in that pass to allow "-1" to mean "any buffer". E.g. promote-buffers-to-stack{max-alloc-size-in-bytes=-1}.

Would that work for you?

This revision now requires changes to proceed. Nov 13 2020, 11:55 AM

We are generally moving in the direction of using std.alloc by default, and having a separate pass that promotes std.alloc to std.alloca before buffer-deallocation runs. For example, with the promote-buffers-to-stack pass.

Hmm... pretty much everything in MLIR is evolving towards Op interfaces (e.g. AllocLike / CopyLike / ViewLike op interfaces) + TypeConverter, which is itself really a Type -> Type interface.
Doesn't the evolution you describe become limited by what std.alloc can express, which seems to go against generalizability and composability?

If you just want to convert all allocs to allocas, we might need a minor tweak to the logic in that pass to allow "-1" to mean "any buffer". E.g. promote-buffers-to-stack{max-alloc-size-in-bytes=-1}.

Alloca is not the only use case, assigning to various memory spaces is another use case.
In the future we will also likely see dialect-specific alloc/copy/dealloc + types (e.g. sparse and quantized).
Such ops and types will need to compose with, at least, the patterns for control-flow.

Would that work for you?

To answer your Q, yes I can easily implement a legalization that converts an Alloc to either Alloca or Alloc + memory space.
I expect it will have the usual "fixup pass" issues: (a) potentially needing to recover information that may have been lost, (b) creating yet another pass, and (c) two passes meaning new opportunities for phase-ordering issues.

It seems to me that the hook proposed in this revision is in the spirit of previous MLIR evolutions that I would also expect bufferization to follow at some point in the future.

Then, there is also the timing aspect: if you're saying that you have a general solution that will be available very soon then great, happy to wait.
If not, then I'd prefer to not be blocked on something quite simple that can easily be evolved.

Maybe the easiest Q I have is whether you have some design doc for what bufferization should look like in a longer-term future?
I've been wondering in particular if bufferization is a generic dialect conversion with some additional AllocLike and CopyLike behavior?

We are generally moving in the direction of using std.alloc by default, and having a separate pass that promotes std.alloc to std.alloca before buffer-deallocation runs. For example, with the promote-buffers-to-stack pass.

Hmm... pretty much everything in MLIR is evolving towards Op interfaces (e.g. AllocLike / CopyLike / ViewLike op interfaces) + TypeConverter, which is itself really a Type -> Type interface.
Doesn't the evolution you describe become limited by what std.alloc can express, which seems to go against generalizability and composability?

Interfaces are just one way to generalize things in MLIR. It's not the right abstraction mechanism for everything.

In particular, I think that interfaces are mostly useful for analysis purposes and some transformation purposes, but not as much for lowering. The reason is that when you are lowering, you really are just creating an op with a specific signature. E.g. if I need to create a "copy" when lowering, I am going to call a callback createCopy(Value from, Value to). So you might as well create std.copy %from, %to and let consumers lower std.copy to whatever they want. There is no extra information created by directly creating my.copy %from, %to vs simply lowering std.copy -> my.copy later.

If you just want to convert all allocs to allocas, we might need a minor tweak to the logic in that pass to allow "-1" to mean "any buffer". E.g. promote-buffers-to-stack{max-alloc-size-in-bytes=-1}.

Alloca is not the only use case, assigning to various memory spaces is another use case.
In the future we will also likely see dialect-specific alloc/copy/dealloc + types (e.g. sparse and quantized).
Such ops and types will need to compose with, at least, the patterns for control-flow.

Would that work for you?

To answer your Q, yes I can easily implement a legalization that converts an Alloc to either Alloca or Alloc + memory space.
I expect it will have the usual "fixup pass" issues: (a) potentially needing to recover information that may have been lost, (b) creating yet another pass, and (c) two passes meaning new opportunities for phase-ordering issues.

Handling Alloc + memory space is highly nontrivial, because it entails context-dependent type conversions. Everything about the current type conversion infrastructure assumes that types are converted context-independently. That is, if you see a tensor type, then you know which memref type it turns into; once you open up the door to the same tensor type converting to either memref<.., 0> or memref<..., 3> you're walking on very thin ice for the dialect conversion infrastructure. It will require a lot of thought to do properly, if it is even feasible to do at all.

Example:

  %0 = "foo" : tensor<2xf32> // hypothetically converts to memref<2xf32, 0>
  %1 = "bar" : tensor<2xf32> // hypothetically converts to memref<2xf32, 3>
  br ^bb1(%0, %1)
^bb1(%bbarg0: tensor<2xf32>, %bbarg1: tensor<2xf32>):
  // use %bbarg0, %bbarg1

The key problem is knowing what type %bbarg0 and %bbarg1 need to be converted to. This is difficult to do in the current dialect conversion framework, but is relatively easy to do in a post-pass.

If you just want to blanket convert all tensors to memref<..., 3> that's easier to handle in the dialect conversion framework, but I don't see what that buys you over a simple post-pass that just adds the address space to all memref types.

It seems to me that the hook proposed in this revision is in the spirit of previous MLIR evolutions that I would also expect bufferization to follow at some point in the future.

Then, there is also the timing aspect: if you're saying that you have a general solution that will be available very soon then great, happy to wait.

I think the alloca case can be easily worked around as I described. The memref + address space is less trivial, and this patch doesn't do anything to fix the fundamental difficulties with that approach (actually, it moves us away from solving them).

If not, then I'd prefer to not be blocked on something quite simple that can easily be evolved.

Maybe the easiest Q I have is whether you have some design doc for what bufferization should look like in a longer-term future?

I think we're still figuring it out. This area is still very early in development.

I've been wondering in particular if bufferization is a generic dialect conversion with some additional AllocLike and CopyLike behavior?

I don't think additional AllocLike or CopyLike behavior is load-bearing. Users can convert std.alloc/std.copy to whatever they like. There's no extra information available at bufferization time that can create ops carrying any information not already carried by std.alloc / std.copy. We can look into having some sort of transformation that precomputes some attribute for different tensor ops, and then propagates that to the std.alloc/copy. It requires thought to make sure that composes right, though.

One option going forward here is to allow the tensor type to carry attributes on it (as Chris mentions here: https://llvm.discourse.group/t/rfc-memref-memory-shape-as-attribute/2229/3), such as a memory space or desired allocation kind, and then have bufferization respect that. That puts the brunt of the effort of updating types (such as basic block args and scf.if types) on the pass that annotates those memory spaces rather than conflating it with the actual dialect conversion itself.

Overall, I think one of the key issues is to what extent we want to expose "what this tensor is going to bufferize to" at the tensor level. Especially when such annotations come into existence and how they are kept valid / up to date.

nicolasvasilache abandoned this revision.Dec 7 2020, 6:02 AM