When lowering memref.copy operations whose operands do not have a contiguous layout in memory, the MemRefToLLVM conversion pass emits llvm.alloca operations to promote the memref descriptors to the stack. The original stack position is never restored after these allocations, so when the copy operation is embedded in a loop with a high trip count, the stack grows by one allocation per iteration until the program crashes with a segmentation fault.
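Schematically, the lowered loop body claims a fresh stack slot on every iteration and nothing ever releases it. The following is a hand-written sketch of that pattern in the LLVM dialect, not the literal pass output (the promoted descriptor struct is replaced by a plain i64 for brevity, and @grows_stack is a hypothetical name):

llvm.func @grows_stack(%n: i64) {
  %c0 = llvm.mlir.constant(0 : i64) : i64
  %c1 = llvm.mlir.constant(1 : i64) : i64
  llvm.br ^loop(%c0 : i64)
^loop(%i: i64):
  %cont = llvm.icmp "slt" %i, %n : i64
  llvm.cond_br %cont, ^body, ^exit
^body:
  // Stands in for the promoted descriptor: one stack slot per iteration,
  // never released, so the stack pointer only moves down.
  %slot = llvm.alloca %c1 x i64 : (i64) -> !llvm.ptr<i64>
  %next = llvm.add %i, %c1 : i64
  llvm.br ^loop(%next : i64)
^exit:
  llvm.return
}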
Below is a minimal example illustrating the issue:
#map = affine_map<(d0, d1) -> (d0 * 64 + d1 + 1056)>
module {
  func.func @main() {
    %arg0 = memref.alloc() : memref<32x64xi64>
    %arg1 = memref.alloc() : memref<16x32xi64>
    %lb = arith.constant 0 : index
    %ub = arith.constant 100000 : index
    %step = arith.constant 1 : index
    %slice = memref.subview %arg0[16, 32] [16, 32] [1, 1] : memref<32x64xi64> to memref<16x32xi64, #map>
    scf.for %i = %lb to %ub step %step {
      memref.copy %slice, %arg1 : memref<16x32xi64, #map> to memref<16x32xi64>
    }
    return
  }
}
When running the code above, e.g., with mlir-cpu-runner, the execution crashes with a segmentation fault:
$ mlir-opt \
    --convert-scf-to-cf \
    --convert-memref-to-llvm \
    --convert-func-to-llvm \
    --convert-cf-to-llvm \
    --reconcile-unrealized-casts <file> | \
  mlir-cpu-runner \
    -e main -entry-point-result=void \
    --shared-libs=$PWD/build/lib/libmlir_c_runner_utils.so
[...]
Segmentation fault
This patch makes the memref.copy lowering in the MemRefToLLVM pass emit a matching pair of llvm.intr.stacksave and llvm.intr.stackrestore operations around the promotion of the memref descriptors and the subsequent call to memrefCopy, so that the stack returns to its original position after the call.
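For illustration, the emitted code now has roughly the following shape. This is a hand-written sketch with simplified types, not the exact pass output: @copy_sketch is a hypothetical function, %elemSize, %src and %dst stand in for the values the lowering actually computes, and the promoted descriptor is again reduced to an i64:

llvm.func @memrefCopy(i64, !llvm.ptr<i8>, !llvm.ptr<i8>)

llvm.func @copy_sketch(%elemSize: i64, %src: !llvm.ptr<i8>, %dst: !llvm.ptr<i8>) {
  %c1 = llvm.mlir.constant(1 : i64) : i64
  // Remember the stack position before promoting the descriptors.
  %sp = llvm.intr.stacksave : !llvm.ptr<i8>
  // Descriptor promotion: one alloca per memref operand (type simplified).
  %desc = llvm.alloca %c1 x i64 : (i64) -> !llvm.ptr<i64>
  llvm.call @memrefCopy(%elemSize, %src, %dst) : (i64, !llvm.ptr<i8>, !llvm.ptr<i8>) -> ()
  // Pop the promoted descriptors so an enclosing loop cannot grow the stack.
  llvm.intr.stackrestore %sp : !llvm.ptr<i8>
  llvm.return
}

With the save/restore pair in place, the per-iteration allocations are released on every trip through an enclosing loop, so the stack usage stays bounded.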