This is an archive of the discontinued LLVM Phabricator instance.

Differential D100444

[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment
Closed · Public

Authored by stephenneuendorffer on Apr 13 2021, 9:57 PM.

Details

Reviewers

aartbik
ftynse
nicolasvasilache

Commits

rG29a50c5864dd: [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment

Summary

vector.transfer_read and vector.transfer_write operations are converted
to LLVM intrinsics with specific alignment information; however, there
doesn't seem to be a way in LLVM to take information from llvm.assume
intrinsics and change this alignment information. In any
event, due to the structure of the llvm.assume intrinsic, applying
this information at the LLVM level is more cumbersome. Instead, let's
generate the masked vector load and store intrinsics with the right
alignment information from MLIR in the first place. Since
we're bothering to do this, let's just emit the proper alignment for
loads, stores, scatter, and gather ops too.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

stephenneuendorffer created this revision. Apr 13 2021, 9:57 PM

Herald added a reviewer: aartbik. Apr 13 2021, 9:57 PM

Herald added a reviewer: ftynse.

Herald added subscribers: dcaballe, cota, teijeong and 16 others.

stephenneuendorffer requested review of this revision. Apr 13 2021, 9:57 PM

Herald added a project: Restricted Project. Apr 13 2021, 9:57 PM

Herald added a subscriber: nicolasvasilache.

stephenneuendorffer added a reviewer: nicolasvasilache. Apr 13 2021, 9:59 PM

Harbormaster completed remote builds in B98612: Diff 337332. Apr 14 2021, 12:14 AM

nicolasvasilache added inline comments. Apr 14 2021, 1:26 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
123	Can we fold this into a:
	LogicalResult getMemRefAlignment(LLVMTypeConverter &typeConverter, Value memRef, unsigned &align) {
	  MemRefType memrefType = memRef.getType().cast<MemRefType>();
	  // old code
	  // your code
	}
	?
130	`align = std::max(align, op.alignment())` (with a cast if needed) ?

aartbik added inline comments. Apr 14 2021, 11:53 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
123	+1 with probably a name to reflect that getMemRefAlignmentAndUpdate() or something like that

stephenneuendorffer updated this revision to Diff 346352. May 19 2021, 12:18 AM

stephenneuendorffer edited the summary of this revision.

In incorporating your feedback, I decided to emit the proper alignment for other loads and stores too. Thoughts?

Harbormaster completed remote builds in B105158: Diff 346352. May 19 2021, 12:47 AM

In D100444#2767792, @stephenneuendorffer wrote:

In incorporating your feedback, I decided to emit the proper alignment for other loads and stores too. Thoughts?

Looks good!

Now I'm wondering whether we should go for the LICM instead?

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
123	"If no assumed alignment" -> "Return the minimal alignment value that satisfies all the AssumeAlignment uses of `value`. If no such uses exist, return 0."
130	Thanks for re-spelling. Now that I see it in this form, should this be `licm` instead of `max` ?

And by licm I mean lcm :p ..

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
130	And by `licm` I mean `lcm` :p ..

stephenneuendorffer marked 3 inline comments as done. May 19 2021, 10:36 AM

stephenneuendorffer added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
130	Probably.. I guess I was implicitly assuming everything was a power of 2.

stephenneuendorffer updated this revision to Diff 346504. May 19 2021, 10:43 AM

stephenneuendorffer marked an inline comment as done.

stephenneuendorffer added inline comments. May 19 2021, 10:44 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
130	Sure enough : 'error: 'memref.assume_alignment' op alignment must be power of 2'
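
A minimal sketch of why `max` and `lcm` give the same answer here, assuming every alignment is a power of two (which the verifier error above enforces). The helper names `assumedAlignmentByMax` and `assumedAlignmentByLcm` are hypothetical stand-ins for the fold in `getAssumedAlignment()`; they operate on a plain list of alignments rather than on MLIR uses and are not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Returns true if `x` is a nonzero power of two.
constexpr bool isPowerOfTwo(uint64_t x) { return x != 0 && (x & (x - 1)) == 0; }

// Hypothetical model of the fold in getAssumedAlignment(): take the largest
// alignment among all assume_alignment uses, or 0 if there are none.
inline unsigned assumedAlignmentByMax(const std::vector<unsigned> &alignments) {
  unsigned align = 0;
  for (unsigned a : alignments) {
    assert(isPowerOfTwo(a) && "assume_alignment requires a power of 2");
    align = std::max(align, a);
  }
  return align;
}

// The same fold with std::lcm (C++17). The first alignment is taken as-is,
// because std::lcm(0, a) is 0 and would otherwise stick at 0.
inline unsigned assumedAlignmentByLcm(const std::vector<unsigned> &alignments) {
  unsigned align = 0;
  for (unsigned a : alignments) {
    assert(isPowerOfTwo(a) && "assume_alignment requires a power of 2");
    align = (align == 0) ? a : std::lcm(align, a);
  }
  return align;
}
```

For powers of two, the smaller value always divides the larger one, so the least common multiple of two alignments is simply the larger of the two. With non-power-of-two inputs the two folds would diverge, which is the case the question about `lcm` was guarding against.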

This revision was not accepted when it landed; it landed in state Needs Review. May 19 2021, 10:51 AM

This revision was landed with ongoing or failed builds.

Closed by commit rG29a50c5864dd: [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment (authored by stephenneuendorffer).

This revision was automatically updated to reflect the committed changes.

stephenneuendorffer added a commit: rG29a50c5864dd: [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment.

thanks!

Harbormaster completed remote builds in B105266: Diff 346504. May 19 2021, 11:31 AM

ThomasRaoux added a subscriber: ThomasRaoux. Nov 30 2021, 1:29 PM

ThomasRaoux added inline comments.

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1555	How can we know that this store is aligned on 32B without knowing anything about %i and %j. I'm running into a bug because the memref alignment is propagated to load/store associated without considering the indices used for indexing. I don't believe we can infer any kind of alignment without having alignment information about the indices. @stephenneuendorffer, could you clarify how you expect this to work?

Herald added subscribers: sdasgup3, wenzhicui, wrengr, Chia-hungDuan. Nov 30 2021, 1:29 PM

stephenneuendorffer added a reverting change: rG73863648892e: Revert "[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment". Nov 30 2021, 3:18 PM

nicolasvasilache added inline comments. Dec 1 2021, 12:00 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1555	This one is clearly not aligned on 32B but it should be aligned on:
	lcm(annotation + offset_alignment, most_minor_dim_alignment) == lcm(32 + 0 * 4, 100 * 4) == 16B
There seems indeed to be a small bug somewhere but I would prefer we fix rather than flatly revert.

mehdi_amini added inline comments. Dec 1 2021, 12:09 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1555	Why? Revert is conservative and cheap: it is trivial to reapply the patch with the fix without pressure.

Fair enough, my bias was against losing the contribution; didn't think about it your way, thanks!

ThomasRaoux added inline comments. Dec 1 2021, 6:53 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1555	Yes Stephen reverted it as a quick solution as this was causing miscompile on IREE side. @nicolasvasilache, I don't understand how we can prove that this is aligned on 16B. If %j = 1 and %i = 0 for instance we would be accessing 4B from the base and since the base is 32B aligned we would only be able to get 4B alignment. What am I missing?
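
The counterexample above (%i = 0, %j = 1) can be checked with a small sketch. It assumes a row-major `memref<200x100xf32>` with 4-byte elements; the base address `0x1020` is an arbitrary illustrative value chosen to be 32B aligned but not 64B aligned, and the function names are hypothetical.

```cpp
#include <cstdint>

// Largest power of two that divides `address` (address must be nonzero).
constexpr uint64_t alignmentOf(uint64_t address) {
  return address & (~address + 1);
}

// Byte address of element [i, j] in a row-major memref<200x100xf32>.
constexpr uint64_t elementAddress(uint64_t base, uint64_t i, uint64_t j) {
  constexpr uint64_t kCols = 100;    // innermost dimension of the memref
  constexpr uint64_t kElemBytes = 4; // sizeof(f32)
  return base + (i * kCols + j) * kElemBytes;
}

// Illustrative base address: 32B aligned, as memref.assume_alignment promises.
constexpr uint64_t kBase = 0x1020;

// The base element inherits the assumed alignment...
static_assert(alignmentOf(elementAddress(kBase, 0, 0)) == 32, "base is 32B aligned");
// ...but one element further along the row, only 4B alignment remains.
static_assert(alignmentOf(elementAddress(kBase, 0, 1)) == 4, "[0, 1] is only 4B aligned");
```

Since %i and %j are runtime values in the test, nothing stronger than the element alignment can be claimed for the store without extra information about the indices.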

nicolasvasilache added inline comments. Dec 1 2021, 7:33 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1555	You're right, I was very focused on the memref type itself and did not look deeper at the indexing.. What I meant to say is that the alignment for each start of every row of memref is already degraded wrt the base pointer. More precisely:
	@%memref[0, 0] is aligned at 32B
	@%memref[1, 0] is aligned at 16B but not 32B
	@%memref[2, 0] is aligned at 32B
	...
	the indexing within the row can indeed be any 4B aligned.
But even worse, even `@%memref[1, 32]` is aligned at 16B and not 32B.
This is one of the important things to grasp when talking about memref and alignment: a memref is not a pointer. An n-D memref is closer in principle to a: (n-1)-D array of pointers to 1-D row (even if they are not materialized)
Please ignore my earlier comment on the revert and sorry about the noise 😛
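
The per-row alignments in this comment can be reproduced with a short sketch. It assumes a row-major `memref<200x100xf32>` with 4-byte elements and a base pointer known only to be 32B aligned; all names are hypothetical. Because the base alignment is a power of two, the alignment that can be guaranteed for `base + offset` is `gcd(baseAlign, offset)`. Note that the 16B figure for row 1 is a greatest common divisor: gcd(32, 400) is 16, whereas lcm(32, 400) is 800.

```cpp
#include <cstdint>
#include <numeric>

constexpr uint64_t kCols = 100;     // innermost dimension of memref<200x100xf32>
constexpr uint64_t kElemBytes = 4;  // sizeof(f32)
constexpr uint64_t kBaseAlign = 32; // from memref.assume_alignment %memref, 32

// Byte offset of element [i, j] from the base pointer (row-major layout).
constexpr uint64_t elementOffsetBytes(uint64_t i, uint64_t j) {
  return (i * kCols + j) * kElemBytes;
}

// Alignment guaranteed for base + byteOffset when only the (power-of-two)
// alignment of the base is known. std::gcd(a, 0) is a, so offset 0 keeps
// the full base alignment.
constexpr uint64_t guaranteedAlignment(uint64_t baseAlign, uint64_t byteOffset) {
  return std::gcd(baseAlign, byteOffset);
}

// Bound that holds for every [i, j] when both indices are unknown: each
// offset is a multiple of the element size, and no larger unit is guaranteed.
constexpr uint64_t alignmentForUnknownIndices(uint64_t baseAlign) {
  return std::gcd(std::gcd(baseAlign, kCols * kElemBytes), kElemBytes);
}

static_assert(guaranteedAlignment(kBaseAlign, elementOffsetBytes(0, 0)) == 32, "row 0");
static_assert(guaranteedAlignment(kBaseAlign, elementOffsetBytes(1, 0)) == 16, "row 1");
static_assert(guaranteedAlignment(kBaseAlign, elementOffsetBytes(2, 0)) == 32, "row 2");
static_assert(guaranteedAlignment(kBaseAlign, elementOffsetBytes(1, 32)) == 16, "[1, 32]");
static_assert(alignmentForUnknownIndices(kBaseAlign) == 4, "unknown indices");
```

Under these assumptions, the only alignment that is safe to attach to a load or store with unknown indices is the 4B element alignment, which matches the behavior before the patch.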

Revision Contents

Path	Size
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp	57 lines
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir	80 lines

Diff 346352

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp

Show First 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	LogicalResult getMemRefAlignment(LLVMTypeConverter &typeConverter,
// TODO: this should use the MLIR data layout when it becomes available and		// TODO: this should use the MLIR data layout when it becomes available and
// stop depending on translation.		// stop depending on translation.
llvm::LLVMContext llvmContext;		llvm::LLVMContext llvmContext;
align = LLVM::TypeToLLVMIRTranslator(llvmContext)		align = LLVM::TypeToLLVMIRTranslator(llvmContext)
.getPreferredAlignment(elementTy, typeConverter.getDataLayout());		.getPreferredAlignment(elementTy, typeConverter.getDataLayout());
return success();		return success();
}		}

		// Helper that returns assumed data layout alignment of a value with memref
		// type using information from assume_alignment calls. If no assumed alignment
		// then return 0;
		static unsigned getAssumedAlignment(Value value) {
		unsigned align = 0;
		for (auto &u : value.getUses()) {
		Operation *owner = u.getOwner();
		if (auto op = dyn_cast<memref::AssumeAlignmentOp>(owner))
		align = std::max(align, op.alignment());
		}
		return align;
		}
		// Helper that returns data layout alignment of a memref associated with a
		// transfer op, including additional information from assume_alignment calls
		// on the source of the transfer
		LogicalResult getTransferOpAlignment(LLVMTypeConverter &typeConverter,
		VectorTransferOpInterface xfer,
		unsigned &align) {
		if (failed(getMemRefAlignment(
		typeConverter, xfer.getShapedType().cast<MemRefType>(), align)))
		return failure();
		align = std::max(align, getAssumedAlignment(xfer.source()));
		return success();
		}

		// Helper that returns data layout alignment of a memref associated with a
		// load, store, scatter, or gather op, including additional information from
		// assume_alignment calls on the source of the transfer
		template <class OpAdaptor>
		LogicalResult getMemRefOpAlignment(LLVMTypeConverter &typeConverter,
		OpAdaptor op, unsigned &align) {
		if (failed(getMemRefAlignment(typeConverter, op.getMemRefType(), align)))
		return failure();
		align = std::max(align, getAssumedAlignment(op.base()));
		return success();
		}

// Add an index vector component to a base pointer. This almost always succeeds		// Add an index vector component to a base pointer. This almost always succeeds
// unless the last stride is non-unit or the memory space is not zero.		// unless the last stride is non-unit or the memory space is not zero.
static LogicalResult getIndexedPtrs(ConversionPatternRewriter &rewriter,		static LogicalResult getIndexedPtrs(ConversionPatternRewriter &rewriter,
Location loc, Value memref, Value base,		Location loc, Value memref, Value base,
Value index, MemRefType memRefType,		Value index, MemRefType memRefType,
VectorType vType, Value &ptrs) {		VectorType vType, Value &ptrs) {
int64_t offset;		int64_t offset;
SmallVector<int64_t, 4> strides;		SmallVector<int64_t, 4> strides;
Show All 16 Lines
}		}

static LogicalResult		static LogicalResult
replaceTransferOpWithLoadOrStore(ConversionPatternRewriter &rewriter,		replaceTransferOpWithLoadOrStore(ConversionPatternRewriter &rewriter,
LLVMTypeConverter &typeConverter, Location loc,		LLVMTypeConverter &typeConverter, Location loc,
TransferReadOp xferOp,		TransferReadOp xferOp,
ArrayRef<Value> operands, Value dataPtr) {		ArrayRef<Value> operands, Value dataPtr) {
unsigned align;		unsigned align;
if (failed(getMemRefAlignment(		if (failed(getTransferOpAlignment(typeConverter, xferOp, align)))
typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
return failure();		return failure();
rewriter.replaceOpWithNewOp<LLVM::LoadOp>(xferOp, dataPtr, align);		rewriter.replaceOpWithNewOp<LLVM::LoadOp>(xferOp, dataPtr, align);
return success();		return success();
}		}

static LogicalResult		static LogicalResult
replaceTransferOpWithMasked(ConversionPatternRewriter &rewriter,		replaceTransferOpWithMasked(ConversionPatternRewriter &rewriter,
LLVMTypeConverter &typeConverter, Location loc,		LLVMTypeConverter &typeConverter, Location loc,
TransferReadOp xferOp, ArrayRef<Value> operands,		TransferReadOp xferOp, ArrayRef<Value> operands,
Value dataPtr, Value mask) {		Value dataPtr, Value mask) {
Type vecTy = typeConverter.convertType(xferOp.getVectorType());		Type vecTy = typeConverter.convertType(xferOp.getVectorType());
if (!vecTy)		if (!vecTy)
return failure();		return failure();

auto adaptor = TransferReadOpAdaptor(operands, xferOp->getAttrDictionary());		auto adaptor = TransferReadOpAdaptor(operands, xferOp->getAttrDictionary());
Value fill = rewriter.create<SplatOp>(loc, vecTy, adaptor.padding());		Value fill = rewriter.create<SplatOp>(loc, vecTy, adaptor.padding());

unsigned align;		unsigned align;
if (failed(getMemRefAlignment(		if (failed(getTransferOpAlignment(typeConverter, xferOp, align)))
typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
return failure();		return failure();

rewriter.replaceOpWithNewOp<LLVM::MaskedLoadOp>(		rewriter.replaceOpWithNewOp<LLVM::MaskedLoadOp>(
xferOp, vecTy, dataPtr, mask, ValueRange{fill},		xferOp, vecTy, dataPtr, mask, ValueRange{fill},
rewriter.getI32IntegerAttr(align));		rewriter.getI32IntegerAttr(align));
return success();		return success();
}		}

static LogicalResult		static LogicalResult
replaceTransferOpWithLoadOrStore(ConversionPatternRewriter &rewriter,		replaceTransferOpWithLoadOrStore(ConversionPatternRewriter &rewriter,
LLVMTypeConverter &typeConverter, Location loc,		LLVMTypeConverter &typeConverter, Location loc,
TransferWriteOp xferOp,		TransferWriteOp xferOp,
ArrayRef<Value> operands, Value dataPtr) {		ArrayRef<Value> operands, Value dataPtr) {
unsigned align;		unsigned align;
if (failed(getMemRefAlignment(		if (failed(getTransferOpAlignment(typeConverter, xferOp, align)))
typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
return failure();		return failure();
auto adaptor = TransferWriteOpAdaptor(operands, xferOp->getAttrDictionary());		auto adaptor = TransferWriteOpAdaptor(operands, xferOp->getAttrDictionary());
rewriter.replaceOpWithNewOp<LLVM::StoreOp>(xferOp, adaptor.vector(), dataPtr,		rewriter.replaceOpWithNewOp<LLVM::StoreOp>(xferOp, adaptor.vector(), dataPtr,
align);		align);
return success();		return success();
}		}

static LogicalResult		static LogicalResult
replaceTransferOpWithMasked(ConversionPatternRewriter &rewriter,		replaceTransferOpWithMasked(ConversionPatternRewriter &rewriter,
LLVMTypeConverter &typeConverter, Location loc,		LLVMTypeConverter &typeConverter, Location loc,
TransferWriteOp xferOp, ArrayRef<Value> operands,		TransferWriteOp xferOp, ArrayRef<Value> operands,
Value dataPtr, Value mask) {		Value dataPtr, Value mask) {
unsigned align;		unsigned align;
if (failed(getMemRefAlignment(		if (failed(getTransferOpAlignment(typeConverter, xferOp, align)))
typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
return failure();		return failure();

auto adaptor = TransferWriteOpAdaptor(operands, xferOp->getAttrDictionary());		auto adaptor = TransferWriteOpAdaptor(operands, xferOp->getAttrDictionary());
rewriter.replaceOpWithNewOp<LLVM::MaskedStoreOp>(		rewriter.replaceOpWithNewOp<LLVM::MaskedStoreOp>(
xferOp, adaptor.vector(), dataPtr, mask,		xferOp, adaptor.vector(), dataPtr, mask,
rewriter.getI32IntegerAttr(align));		rewriter.getI32IntegerAttr(align));
return success();		return success();
}		}
▲ Show 20 Lines • Show All 117 Lines • ▼ Show 20 Lines	if (vectorTy.getRank() > 1)
return failure();		return failure();

auto loc = loadOrStoreOp->getLoc();		auto loc = loadOrStoreOp->getLoc();
auto adaptor = LoadOrStoreOpAdaptor(operands);		auto adaptor = LoadOrStoreOpAdaptor(operands);
MemRefType memRefTy = loadOrStoreOp.getMemRefType();		MemRefType memRefTy = loadOrStoreOp.getMemRefType();

// Resolve alignment.		// Resolve alignment.
unsigned align;		unsigned align;
if (failed(getMemRefAlignment(*this->getTypeConverter(), memRefTy, align)))		if (failed(getMemRefOpAlignment(*this->getTypeConverter(), loadOrStoreOp,
		align)))
return failure();		return failure();

// Resolve address.		// Resolve address.
auto vtype = this->typeConverter->convertType(loadOrStoreOp.getVectorType())		auto vtype = this->typeConverter->convertType(loadOrStoreOp.getVectorType())
.template cast<VectorType>();		.template cast<VectorType>();
Value dataPtr = this->getStridedElementPtr(loc, memRefTy, adaptor.base(),		Value dataPtr = this->getStridedElementPtr(loc, memRefTy, adaptor.base(),
adaptor.indices(), rewriter);		adaptor.indices(), rewriter);
Value ptr = castDataPtr(rewriter, loc, dataPtr, memRefTy, vtype);		Value ptr = castDataPtr(rewriter, loc, dataPtr, memRefTy, vtype);
Show All 13 Lines	public:
matchAndRewrite(vector::GatherOp gather, ArrayRef<Value> operands,		matchAndRewrite(vector::GatherOp gather, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
auto loc = gather->getLoc();		auto loc = gather->getLoc();
auto adaptor = vector::GatherOpAdaptor(operands);		auto adaptor = vector::GatherOpAdaptor(operands);
MemRefType memRefType = gather.getMemRefType();		MemRefType memRefType = gather.getMemRefType();

// Resolve alignment.		// Resolve alignment.
unsigned align;		unsigned align;
if (failed(getMemRefAlignment(*getTypeConverter(), memRefType, align)))		if (failed(getMemRefOpAlignment(*getTypeConverter(), gather, align)))
return failure();		return failure();

// Resolve address.		// Resolve address.
Value ptrs;		Value ptrs;
VectorType vType = gather.getVectorType();		VectorType vType = gather.getVectorType();
Value ptr = getStridedElementPtr(loc, memRefType, adaptor.base(),		Value ptr = getStridedElementPtr(loc, memRefType, adaptor.base(),
adaptor.indices(), rewriter);		adaptor.indices(), rewriter);
if (failed(getIndexedPtrs(rewriter, loc, adaptor.base(), ptr,		if (failed(getIndexedPtrs(rewriter, loc, adaptor.base(), ptr,
Show All 18 Lines	public:
matchAndRewrite(vector::ScatterOp scatter, ArrayRef<Value> operands,		matchAndRewrite(vector::ScatterOp scatter, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
auto loc = scatter->getLoc();		auto loc = scatter->getLoc();
auto adaptor = vector::ScatterOpAdaptor(operands);		auto adaptor = vector::ScatterOpAdaptor(operands);
MemRefType memRefType = scatter.getMemRefType();		MemRefType memRefType = scatter.getMemRefType();

// Resolve alignment.		// Resolve alignment.
unsigned align;		unsigned align;
if (failed(getMemRefAlignment(*getTypeConverter(), memRefType, align)))		if (failed(getMemRefOpAlignment(*getTypeConverter(), scatter, align)))
return failure();		return failure();

// Resolve address.		// Resolve address.
Value ptrs;		Value ptrs;
VectorType vType = scatter.getVectorType();		VectorType vType = scatter.getVectorType();
Value ptr = getStridedElementPtr(loc, memRefType, adaptor.base(),		Value ptr = getStridedElementPtr(loc, memRefType, adaptor.base(),
adaptor.indices(), rewriter);		adaptor.indices(), rewriter);
if (failed(getIndexedPtrs(rewriter, loc, adaptor.base(), ptr,		if (failed(getIndexedPtrs(rewriter, loc, adaptor.base(), ptr,
▲ Show 20 Lines • Show All 1,010 Lines • Show Last 20 Lines

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir

Show First 20 Lines • Show All 1,289 Lines • ▼ Show 20 Lines
// CHECK: %[[loaded:.]] = llvm.intr.masked.load %{{.}}, %{{.}}, %{{.}} {alignment = 8 : i32} :		// CHECK: %[[loaded:.]] = llvm.intr.masked.load %{{.}}, %{{.}}, %{{.}} {alignment = 8 : i32} :
// CHECK-SAME: (!llvm.ptr<vector<17xi64>>, vector<17xi1>, vector<17xi64>) -> vector<17xi64>		// CHECK-SAME: (!llvm.ptr<vector<17xi64>>, vector<17xi1>, vector<17xi64>) -> vector<17xi64>

// CHECK: llvm.intr.masked.store %[[loaded]], %{{.}}, %{{.}} {alignment = 8 : i32} :		// CHECK: llvm.intr.masked.store %[[loaded]], %{{.}}, %{{.}} {alignment = 8 : i32} :
// CHECK-SAME: vector<17xi64>, vector<17xi1> into !llvm.ptr<vector<17xi64>>		// CHECK-SAME: vector<17xi64>, vector<17xi1> into !llvm.ptr<vector<17xi64>>

// -----		// -----

		func @transfer_read_1d_aligned(%A : memref<?xf32>, %base: index) -> vector<17xf32> {
		memref.assume_alignment %A, 32 : memref<?xf32>
		%f7 = constant 7.0: f32
		%f = vector.transfer_read %A[%base], %f7
		{permutation_map = affine_map<(d0) -> (d0)>} :
		memref<?xf32>, vector<17xf32>
		vector.transfer_write %f, %A[%base]
		{permutation_map = affine_map<(d0) -> (d0)>} :
		vector<17xf32>, memref<?xf32>
		return %f: vector<17xf32>
		}
		// CHECK: llvm.intr.masked.load
		// CHECK-SAME: {alignment = 32 : i32}
		// CHECK-SAME: (!llvm.ptr<vector<17xf32>>, vector<17xi1>, vector<17xf32>) -> vector<17xf32>
		// CHECK: llvm.intr.masked.store
		// CHECK-SAME: {alignment = 32 : i32}
		// CHECK-SAME: vector<17xf32>, vector<17xi1> into !llvm.ptr<vector<17xf32>>

		// -----

func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: index) -> vector<17xf32> {		func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: index) -> vector<17xf32> {
%f7 = constant 7.0: f32		%f7 = constant 7.0: f32
%f = vector.transfer_read %A[%base0, %base1], %f7		%f = vector.transfer_read %A[%base0, %base1], %f7
{permutation_map = affine_map<(d0, d1) -> (d1)>} :		{permutation_map = affine_map<(d0, d1) -> (d1)>} :
memref<?x?xf32>, vector<17xf32>		memref<?x?xf32>, vector<17xf32>
return %f: vector<17xf32>		return %f: vector<17xf32>
}		}
// CHECK-LABEL: func @transfer_read_2d_to_1d		// CHECK-LABEL: func @transfer_read_2d_to_1d
▲ Show 20 Lines • Show All 176 Lines • ▼ Show 20 Lines
}		}
// CHECK-LABEL: func @vector_load_op_index		// CHECK-LABEL: func @vector_load_op_index
// CHECK: %[[T0:.]] = llvm.load %{{.}} {alignment = 8 : i64} : !llvm.ptr<vector<8xi64>>		// CHECK: %[[T0:.]] = llvm.load %{{.}} {alignment = 8 : i64} : !llvm.ptr<vector<8xi64>>
// CHECK: %[[T1:.*]] = llvm.mlir.cast %[[T0]] : vector<8xi64> to vector<8xindex>		// CHECK: %[[T1:.*]] = llvm.mlir.cast %[[T0]] : vector<8xi64> to vector<8xindex>
// CHECK: return %[[T1]] : vector<8xindex>		// CHECK: return %[[T1]] : vector<8xindex>

// -----		// -----

		func @vector_load_op_aligned(%memref : memref<200x100xf32>, %i : index, %j : index) -> vector<8xf32> {
		memref.assume_alignment %memref, 32 : memref<200x100xf32>
		%0 = vector.load %memref[%i, %j] : memref<200x100xf32>, vector<8xf32>
		return %0 : vector<8xf32>
		}

		// CHECK-LABEL: func @vector_load_op_aligned
		// CHECK: %[[c100:.*]] = llvm.mlir.constant(100 : index) : i64
		// CHECK: %[[mul:.]] = llvm.mul %{{.}}, %[[c100]] : i64
		// CHECK: %[[add:.]] = llvm.add %[[mul]], %{{.}} : i64
		// CHECK: %[[gep:.]] = llvm.getelementptr %{{.}}[%[[add]]] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
		// CHECK: %[[bcast:.*]] = llvm.bitcast %[[gep]] : !llvm.ptr<f32> to !llvm.ptr<vector<8xf32>>
		// CHECK: llvm.load %[[bcast]] {alignment = 32 : i64} : !llvm.ptr<vector<8xf32>>

		// -----

func @vector_store_op(%memref : memref<200x100xf32>, %i : index, %j : index) {		func @vector_store_op(%memref : memref<200x100xf32>, %i : index, %j : index) {
%val = constant dense<11.0> : vector<4xf32>		%val = constant dense<11.0> : vector<4xf32>
vector.store %val, %memref[%i, %j] : memref<200x100xf32>, vector<4xf32>		vector.store %val, %memref[%i, %j] : memref<200x100xf32>, vector<4xf32>
return		return
}		}

// CHECK-LABEL: func @vector_store_op		// CHECK-LABEL: func @vector_store_op
// CHECK: %[[c100:.*]] = llvm.mlir.constant(100 : index) : i64		// CHECK: %[[c100:.*]] = llvm.mlir.constant(100 : index) : i64
Show All 10 Lines	func @vector_store_op_index(%memref : memref<200x100xindex>, %i : index, %j : index) {
vector.store %val, %memref[%i, %j] : memref<200x100xindex>, vector<4xindex>		vector.store %val, %memref[%i, %j] : memref<200x100xindex>, vector<4xindex>
return		return
}		}
// CHECK-LABEL: func @vector_store_op_index		// CHECK-LABEL: func @vector_store_op_index
// CHECK: llvm.store %{{.}}, %{{.}} {alignment = 8 : i64} : !llvm.ptr<vector<4xi64>>		// CHECK: llvm.store %{{.}}, %{{.}} {alignment = 8 : i64} : !llvm.ptr<vector<4xi64>>

// -----		// -----

		func @vector_store_op_aligned(%memref : memref<200x100xf32>, %i : index, %j : index) {
		memref.assume_alignment %memref, 32 : memref<200x100xf32>
		%val = constant dense<11.0> : vector<4xf32>
		vector.store %val, %memref[%i, %j] : memref<200x100xf32>, vector<4xf32>
		return
		}

		// CHECK-LABEL: func @vector_store_op_aligned
		// CHECK: %[[c100:.*]] = llvm.mlir.constant(100 : index) : i64
		// CHECK: %[[mul:.]] = llvm.mul %{{.}}, %[[c100]] : i64
		// CHECK: %[[add:.]] = llvm.add %[[mul]], %{{.}} : i64
		// CHECK: %[[gep:.]] = llvm.getelementptr %{{.}}[%[[add]]] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
		// CHECK: %[[bcast:.*]] = llvm.bitcast %[[gep]] : !llvm.ptr<f32> to !llvm.ptr<vector<4xf32>>
		// CHECK: llvm.store %{{.*}}, %[[bcast]] {alignment = 32 : i64} : !llvm.ptr<vector<4xf32>>

		// -----

func @masked_load_op(%arg0: memref<?xf32>, %arg1: vector<16xi1>, %arg2: vector<16xf32>) -> vector<16xf32> {		func @masked_load_op(%arg0: memref<?xf32>, %arg1: vector<16xi1>, %arg2: vector<16xf32>) -> vector<16xf32> {
%c0 = constant 0: index		%c0 = constant 0: index
%0 = vector.maskedload %arg0[%c0], %arg1, %arg2 : memref<?xf32>, vector<16xi1>, vector<16xf32> into vector<16xf32>		%0 = vector.maskedload %arg0[%c0], %arg1, %arg2 : memref<?xf32>, vector<16xi1>, vector<16xf32> into vector<16xf32>
return %0 : vector<16xf32>		return %0 : vector<16xf32>
}		}

// CHECK-LABEL: func @masked_load_op		// CHECK-LABEL: func @masked_load_op
// CHECK: %[[CO:.*]] = constant 0 : index		// CHECK: %[[CO:.*]] = constant 0 : index
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines

// CHECK-LABEL: func @gather_op_index		// CHECK-LABEL: func @gather_op_index
// CHECK: %[[P:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<i64>, vector<3xi64>) -> !llvm.vec<3 x ptr<i64>>		// CHECK: %[[P:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<i64>, vector<3xi64>) -> !llvm.vec<3 x ptr<i64>>
// CHECK: %[[G:.]] = llvm.intr.masked.gather %{{.}}, %{{.}}, %{{.}} {alignment = 8 : i32} : (!llvm.vec<3 x ptr<i64>>, vector<3xi1>, vector<3xi64>) -> vector<3xi64>		// CHECK: %[[G:.]] = llvm.intr.masked.gather %{{.}}, %{{.}}, %{{.}} {alignment = 8 : i32} : (!llvm.vec<3 x ptr<i64>>, vector<3xi1>, vector<3xi64>) -> vector<3xi64>
// CHECK: %{{.*}} = llvm.mlir.cast %[[G]] : vector<3xi64> to vector<3xindex>		// CHECK: %{{.*}} = llvm.mlir.cast %[[G]] : vector<3xi64> to vector<3xindex>

// -----		// -----

		func @gather_op_aligned(%arg0: memref<?xf32>, %arg1: vector<3xi32>, %arg2: vector<3xi1>, %arg3: vector<3xf32>) -> vector<3xf32> {
		memref.assume_alignment %arg0, 32 : memref<?xf32>
		%0 = constant 0: index
		%1 = vector.gather %arg0[%0][%arg1], %arg2, %arg3 : memref<?xf32>, vector<3xi32>, vector<3xi1>, vector<3xf32> into vector<3xf32>
		return %1 : vector<3xf32>
		}

		// CHECK-LABEL: func @gather_op_aligned
		// CHECK: %[[P:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<f32>, vector<3xi32>) -> !llvm.vec<3 x ptr<f32>>
		// CHECK: %[[G:.]] = llvm.intr.masked.gather %[[P]], %{{.}}, %{{.*}} {alignment = 32 : i32} : (!llvm.vec<3 x ptr<f32>>, vector<3xi1>, vector<3xf32>) -> vector<3xf32>
		// CHECK: return %[[G]] : vector<3xf32>

		// -----

func @gather_2d_op(%arg0: memref<4x4xf32>, %arg1: vector<4xi32>, %arg2: vector<4xi1>, %arg3: vector<4xf32>) -> vector<4xf32> {		func @gather_2d_op(%arg0: memref<4x4xf32>, %arg1: vector<4xi32>, %arg2: vector<4xi1>, %arg3: vector<4xf32>) -> vector<4xf32> {
%0 = constant 3 : index		%0 = constant 3 : index
%1 = vector.gather %arg0[%0, %0][%arg1], %arg2, %arg3 : memref<4x4xf32>, vector<4xi32>, vector<4xi1>, vector<4xf32> into vector<4xf32>		%1 = vector.gather %arg0[%0, %0][%arg1], %arg2, %arg3 : memref<4x4xf32>, vector<4xi32>, vector<4xi1>, vector<4xf32> into vector<4xf32>
return %1 : vector<4xf32>		return %1 : vector<4xf32>
}		}

// CHECK-LABEL: func @gather_2d_op		// CHECK-LABEL: func @gather_2d_op
// CHECK: %[[B:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>		// CHECK: %[[B:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
Show All 22 Lines
}		}

// CHECK-LABEL: func @scatter_op_index		// CHECK-LABEL: func @scatter_op_index
// CHECK: %[[P:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<i64>, vector<3xi64>) -> !llvm.vec<3 x ptr<i64>>		// CHECK: %[[P:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<i64>, vector<3xi64>) -> !llvm.vec<3 x ptr<i64>>
// CHECK: llvm.intr.masked.scatter %{{.}}, %[[P]], %{{.}} {alignment = 8 : i32} : vector<3xi64>, vector<3xi1> into !llvm.vec<3 x ptr<i64>>		// CHECK: llvm.intr.masked.scatter %{{.}}, %[[P]], %{{.}} {alignment = 8 : i32} : vector<3xi64>, vector<3xi1> into !llvm.vec<3 x ptr<i64>>

// -----		// -----

		func @scatter_op_aligned(%arg0: memref<?xf32>, %arg1: vector<3xi32>, %arg2: vector<3xi1>, %arg3: vector<3xf32>) {
		memref.assume_alignment %arg0, 32 : memref<?xf32>
		%0 = constant 0: index
		vector.scatter %arg0[%0][%arg1], %arg2, %arg3 : memref<?xf32>, vector<3xi32>, vector<3xi1>, vector<3xf32>
		return
		}

		// CHECK-LABEL: func @scatter_op_aligned
		// CHECK: %[[P:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<f32>, vector<3xi32>) -> !llvm.vec<3 x ptr<f32>>
		// CHECK: llvm.intr.masked.scatter %{{.}}, %[[P]], %{{.}} {alignment = 32 : i32} : vector<3xf32>, vector<3xi1> into !llvm.vec<3 x ptr<f32>>

		// -----

func @scatter_2d_op(%arg0: memref<4x4xf32>, %arg1: vector<4xi32>, %arg2: vector<4xi1>, %arg3: vector<4xf32>) {		func @scatter_2d_op(%arg0: memref<4x4xf32>, %arg1: vector<4xi32>, %arg2: vector<4xi1>, %arg3: vector<4xf32>) {
%0 = constant 3 : index		%0 = constant 3 : index
vector.scatter %arg0[%0, %0][%arg1], %arg2, %arg3 : memref<4x4xf32>, vector<4xi32>, vector<4xi1>, vector<4xf32>		vector.scatter %arg0[%0, %0][%arg1], %arg2, %arg3 : memref<4x4xf32>, vector<4xi32>, vector<4xi1>, vector<4xf32>
return		return
}		}

// CHECK-LABEL: func @scatter_2d_op		// CHECK-LABEL: func @scatter_2d_op
// CHECK: %[[B:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>		// CHECK: %[[B:.]] = llvm.getelementptr %{{.}}[%{{.*}}] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
▲ Show 20 Lines • Show All 51 Lines • Show Last 20 Lines