This is an archive of the discontinued LLVM Phabricator instance.


Differential D100444

[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment
Closed · Public

Authored by stephenneuendorffer on Apr 13 2021, 9:57 PM.

Details

Reviewers

aartbik
ftynse
nicolasvasilache

Commits

rG29a50c5864dd: [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment

Summary

vector.transfer_read and vector.transfer_write operations are converted
to LLVM intrinsics with specific alignment information; however, there
doesn't seem to be a way in LLVM to take information from llvm.assume
intrinsics and change this alignment information. In any
event, due to the structure of the llvm.assume intrinsic, applying
this information at the LLVM level is more cumbersome. Instead, let's
generate the masked vector load and store intrinsics with the right
alignment information from MLIR in the first place. Since
we're bothering to do this, let's just emit the proper alignment for
loads, stores, scatter, and gather ops too.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

stephenneuendorffer created this revision. · Apr 13 2021, 9:57 PM

Herald added a reviewer: aartbik. · Apr 13 2021, 9:57 PM

Herald added a reviewer: ftynse.

Herald added subscribers: dcaballe, cota, teijeong and 16 others.

stephenneuendorffer requested review of this revision. · Apr 13 2021, 9:57 PM

Herald added a project: Restricted Project. · Apr 13 2021, 9:57 PM

Herald added a subscriber: nicolasvasilache.

stephenneuendorffer added a reviewer: nicolasvasilache. · Apr 13 2021, 9:59 PM

Harbormaster completed remote builds in B98612: Diff 337332. · Apr 14 2021, 12:14 AM

nicolasvasilache added inline comments. · Apr 14 2021, 1:26 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
123	Can we fold this into a:
	LogicalResult getMemRefAlignment(LLVMTypeConverter &typeConverter, Value memRef, unsigned &align) {
	  MemRefType memrefType = memRef.getType().cast<MemRefType>();
	  // old code
	  // your code
	}
	?
130	`align = std::max(align, op.alignment())` (with a cast if needed) ?

aartbik added inline comments. · Apr 14 2021, 11:53 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
123	+1 with probably a name to reflect that getMemRefAlignmentAndUpdate() or something like that

stephenneuendorffer updated this revision to Diff 346352. · May 19 2021, 12:18 AM

stephenneuendorffer edited the summary of this revision.

In incorporating your feedback, I decided to emit the proper alignment for other loads and stores too. Thoughts?

Harbormaster completed remote builds in B105158: Diff 346352. · May 19 2021, 12:47 AM

In D100444#2767792, @stephenneuendorffer wrote:

In incorporating your feedback, I decided to emit the proper alignment for other loads and stores too. Thoughts?

Looks good!

Now I'm wondering whether we should go for the LICM instead?

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
123	"If no assumed alignment" -> "Return the minimal alignment value that satisfies all the AssumeAlignment uses of `value`. If no such uses exist, return 0."
130	Thanks for re-spelling. Now that I see it in this form, should this be `licm` instead of `max` ?

And by licm I mean lcm :p ..

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
130	And by `licm` I mean `lcm` :p ..

stephenneuendorffer marked 3 inline comments as done. · May 19 2021, 10:36 AM

stephenneuendorffer added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
130	Probably.. I guess I was implicitly assuming everything was a power of 2.

stephenneuendorffer updated this revision to Diff 346504. · May 19 2021, 10:43 AM

stephenneuendorffer marked an inline comment as done.

stephenneuendorffer added inline comments. · May 19 2021, 10:44 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
130	Sure enough : 'error: 'memref.assume_alignment' op alignment must be power of 2'
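
The max-versus-lcm question above comes down to a small arithmetic fact: when every alignment is a power of two, the least common multiple of two alignments equals their maximum, so both spellings give the same result. The following is a minimal, MLIR-free sketch of that fold, assuming C++17. `isPowerOfTwo` and `combineAssumedAlignments` are hypothetical names used only for illustration; they are not functions from the patch. The return-0-when-empty convention follows the reviewer's suggested comment wording.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: true if x is a nonzero power of two.
inline bool isPowerOfTwo(unsigned x) { return x != 0 && (x & (x - 1)) == 0; }

// Hypothetical stand-in for folding the alignments of all assume_alignment
// uses of one value. Returns 0 when there are no such uses.
inline unsigned combineAssumedAlignments(const std::vector<unsigned> &assumed) {
  unsigned align = 0;
  for (unsigned a : assumed) {
    assert(isPowerOfTwo(a) && "alignment must be a power of 2");
    // lcm is the general way to satisfy two alignment requirements at once.
    // std::lcm(0, a) is 0, so the first alignment is taken as-is.
    unsigned viaLcm = (align == 0) ? a : std::lcm(align, a);
    // For powers of two, lcm and max coincide.
    assert(viaLcm == std::max(align, a));
    align = viaLcm;
  }
  return align;
}
```

Because the verifier rejects non-power-of-two alignments, a conversion that takes the maximum loses nothing relative to one that takes the lcm.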

This revision was not accepted when it landed; it landed in state Needs Review. · May 19 2021, 10:51 AM

This revision was landed with ongoing or failed builds.

Closed by commit rG29a50c5864dd: [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment (authored by stephenneuendorffer).

This revision was automatically updated to reflect the committed changes.

stephenneuendorffer added a commit: rG29a50c5864dd: [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment.

thanks!

Harbormaster completed remote builds in B105266: Diff 346504. · May 19 2021, 11:31 AM

ThomasRaoux added a subscriber: ThomasRaoux. · Nov 30 2021, 1:29 PM

ThomasRaoux added inline comments.

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1539	How can we know that this store is aligned on 32B without knowing anything about %i and %j? I'm running into a bug because the memref alignment is propagated to the associated load/store without considering the indices used for indexing. I don't believe we can infer any kind of alignment without having alignment information about the indices. @stephenneuendorffer, could you clarify how you expect this to work?

Herald added subscribers: sdasgup3, wenzhicui, wrengr, Chia-hungDuan. · Nov 30 2021, 1:29 PM

stephenneuendorffer added a reverting change: rG73863648892e: Revert "[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment". · Nov 30 2021, 3:18 PM

nicolasvasilache added inline comments. · Dec 1 2021, 12:00 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1539	This one is clearly not aligned on 32B but it should be aligned on:
	lcm(annotation + offset_alignment, most_minor_dim_alignment) == lcm(32 + 0 * 4, 100 * 4) == 16B
	There seems indeed to be a small bug somewhere but I would prefer we fix rather than flatly revert.

mehdi_amini added inline comments. · Dec 1 2021, 12:09 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1539	Why? Revert is conservative and cheap: it is trivial to reapply the patch with the fix without pressure.

Fair enough, my bias was against losing the contribution; didn't think about it your way, thanks!

ThomasRaoux added inline comments. · Dec 1 2021, 6:53 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1539	Yes, Stephen reverted it as a quick solution as this was causing a miscompile on the IREE side. @nicolasvasilache, I don't understand how we can prove that this is aligned on 16B. If %j = 1 and %i = 0 for instance we would be accessing 4B from the base and since the base is 32B aligned we would only be able to get 4B alignment. What am I missing?

nicolasvasilache added inline comments. · Dec 1 2021, 7:33 AM

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1539	You're right, I was very focused on the memref type itself and did not look deeper at the indexing. What I meant to say is that the alignment of the start of every row of the memref is already degraded wrt the base pointer. More precisely:
	@%memref[0, 0] is aligned at 32B
	@%memref[1, 0] is aligned at 16B but not 32B
	@%memref[2, 0] is aligned at 32B
	...
	The indexing within the row can indeed be any 4B-aligned offset. But even worse, even `@%memref[1, 32]` is aligned at 16B and not 32B.
	This is one of the important things to grasp when talking about memref and alignment: a memref is not a pointer. An n-D memref is closer in principle to an (n-1)-D array of pointers to 1-D rows (even if they are not materialized).
	Please ignore my earlier comment on the revert and sorry about the noise 😛
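
The per-element alignments quoted in this thread can be checked with a little arithmetic. If all that is known about the base pointer is that it is a multiple of a power-of-two alignment A, then the alignment guaranteed for base + offset is gcd(A, offset). The sketch below assumes C++17, a 32B-aligned base, 4-byte f32 elements, and a row stride of 100 elements (the last two read off the "100 * 4" term in the comment above; the memref type itself is not shown here). `provableAlignment` and `byteOffset2D` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical helper: the alignment guaranteed for (base + byteOffset) when
// the only thing known about base is that it is a multiple of baseAlign.
inline std::uint64_t provableAlignment(std::uint64_t baseAlign,
                                       std::uint64_t byteOffset) {
  assert(baseAlign != 0 && (baseAlign & (baseAlign - 1)) == 0 &&
         "base alignment must be a power of 2");
  // gcd(baseAlign, 0) == baseAlign, so a zero offset keeps the full alignment.
  return std::gcd(baseAlign, byteOffset);
}

// Hypothetical helper: byte offset of element [i, j] in a row-major 2-D
// buffer with the given row stride (in elements) and element size (in bytes).
inline std::uint64_t byteOffset2D(std::uint64_t i, std::uint64_t j,
                                  std::uint64_t rowStrideElems,
                                  std::uint64_t elemBytes) {
  return (i * rowStrideElems + j) * elemBytes;
}
```

Under these assumptions, [0, 0] and [2, 0] come out 32B-aligned, [1, 0] and [1, 32] only 16B-aligned, and [0, 1] only 4B-aligned, matching the thread. The 16B figure quoted earlier equals gcd(32, 400). With unknown indices, nothing beyond the element alignment can be proven.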

Revision Contents

Path	Size
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp	17 lines
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir	20 lines

Diff 337332

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp

[113 lines not shown] LogicalResult getMemRefAlignment(LLVMTypeConverter &typeConverter,
   // TODO: this should use the MLIR data layout when it becomes available and
   // stop depending on translation.
   llvm::LLVMContext llvmContext;
   align = LLVM::TypeToLLVMIRTranslator(llvmContext)
               .getPreferredAlignment(elementTy, typeConverter.getDataLayout());
   return success();
 }

+// Helper to find assume_alignment information.
+void updateAlignment(Value value, unsigned &align) {
+  for (auto &u : value.getUses()) {
+    Operation *owner = u.getOwner();
+    if(auto op = dyn_cast<memref::AssumeAlignmentOp>(owner)) {
Lint: Pre-merge checks: clang-format: please reformat the code: `if(auto op = dyn_cast<memref::AssumeAlignmentOp>(owner)) {` -> `if (auto op = dyn_cast<memref::AssumeAlignmentOp>(owner)) {`
+      unsigned newAlignment = op.alignment();
+      if(newAlignment > align)
Lint: Pre-merge checks: clang-format: please reformat the code: `if(newAlignment > align)` -> `if (newAlignment > align)`
+        align = newAlignment;
+    }
+  }
+}

 // Add an index vector component to a base pointer. This almost always succeeds
 // unless the last stride is non-unit or the memory space is not zero.
 static LogicalResult getIndexedPtrs(ConversionPatternRewriter &rewriter,
                                     Location loc, Value memref, Value base,
                                     Value index, MemRefType memRefType,
                                     VectorType vType, Value &ptrs) {
   int64_t offset;
   SmallVector<int64_t, 4> strides;
[19 lines not shown]
 replaceTransferOpWithLoadOrStore(ConversionPatternRewriter &rewriter,
                                  LLVMTypeConverter &typeConverter, Location loc,
                                  TransferReadOp xferOp,
                                  ArrayRef<Value> operands, Value dataPtr) {
   unsigned align;
   if (failed(getMemRefAlignment(
           typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
     return failure();
+  updateAlignment(xferOp.source(), align);
   rewriter.replaceOpWithNewOp<LLVM::LoadOp>(xferOp, dataPtr, align);
   return success();
 }

 static LogicalResult
 replaceTransferOpWithMasked(ConversionPatternRewriter &rewriter,
                             LLVMTypeConverter &typeConverter, Location loc,
                             TransferReadOp xferOp, ArrayRef<Value> operands,
                             Value dataPtr, Value mask) {
   Type vecTy = typeConverter.convertType(xferOp.getVectorType());
   if (!vecTy)
     return failure();

   auto adaptor = TransferReadOpAdaptor(operands, xferOp->getAttrDictionary());
   Value fill = rewriter.create<SplatOp>(loc, vecTy, adaptor.padding());

   unsigned align;
   if (failed(getMemRefAlignment(
           typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
     return failure();
+  updateAlignment(xferOp.source(), align);
   rewriter.replaceOpWithNewOp<LLVM::MaskedLoadOp>(
       xferOp, vecTy, dataPtr, mask, ValueRange{fill},
       rewriter.getI32IntegerAttr(align));
   return success();
 }

 static LogicalResult
 replaceTransferOpWithLoadOrStore(ConversionPatternRewriter &rewriter,
                                  LLVMTypeConverter &typeConverter, Location loc,
                                  TransferWriteOp xferOp,
                                  ArrayRef<Value> operands, Value dataPtr) {
   unsigned align;
   if (failed(getMemRefAlignment(
           typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
     return failure();
+  updateAlignment(xferOp.source(), align);
   auto adaptor = TransferWriteOpAdaptor(operands, xferOp->getAttrDictionary());
   rewriter.replaceOpWithNewOp<LLVM::StoreOp>(xferOp, adaptor.vector(), dataPtr,
                                              align);
   return success();
 }

 static LogicalResult
 replaceTransferOpWithMasked(ConversionPatternRewriter &rewriter,
                             LLVMTypeConverter &typeConverter, Location loc,
                             TransferWriteOp xferOp, ArrayRef<Value> operands,
                             Value dataPtr, Value mask) {
   unsigned align;
   if (failed(getMemRefAlignment(
           typeConverter, xferOp.getShapedType().cast<MemRefType>(), align)))
     return failure();

   auto adaptor = TransferWriteOpAdaptor(operands, xferOp->getAttrDictionary());
+  updateAlignment(xferOp.source(), align);
   rewriter.replaceOpWithNewOp<LLVM::MaskedStoreOp>(
       xferOp, adaptor.vector(), dataPtr, mask,
       rewriter.getI32IntegerAttr(align));
   return success();
 }

 static TransferReadOpAdaptor getTransferOpAdapter(TransferReadOp xferOp,
                                                   ArrayRef<Value> operands) {
[1,191 lines not shown]

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir

[1,289 lines not shown]
 // CHECK: %[[loaded:.*]] = llvm.intr.masked.load %{{.*}}, %{{.*}}, %{{.*}} {alignment = 8 : i32} :
 // CHECK-SAME: (!llvm.ptr<vector<17xi64>>, vector<17xi1>, vector<17xi64>) -> vector<17xi64>

 // CHECK: llvm.intr.masked.store %[[loaded]], %{{.*}}, %{{.*}} {alignment = 8 : i32} :
 // CHECK-SAME: vector<17xi64>, vector<17xi1> into !llvm.ptr<vector<17xi64>>

 // -----

+func @transfer_read_1d_aligned(%A : memref<?xf32>, %base: index) -> vector<17xf32> {
+  memref.assume_alignment %A, 32 : memref<?xf32>
+  %f7 = constant 7.0: f32
+  %f = vector.transfer_read %A[%base], %f7
+      {permutation_map = affine_map<(d0) -> (d0)>} :
+    memref<?xf32>, vector<17xf32>
+  vector.transfer_write %f, %A[%base]
+      {permutation_map = affine_map<(d0) -> (d0)>} :
+    vector<17xf32>, memref<?xf32>
+  return %f: vector<17xf32>
+}
+// CHECK: llvm.intr.masked.load
+// CHECK-SAME: {alignment = 32 : i32}
+// CHECK-SAME: (!llvm.ptr<vector<17xf32>>, vector<17xi1>, vector<17xf32>) -> vector<17xf32>
+// CHECK: llvm.intr.masked.store
+// CHECK-SAME: {alignment = 32 : i32}
+// CHECK-SAME: vector<17xf32>, vector<17xi1> into !llvm.ptr<vector<17xf32>>
+
+// -----
+
 func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: index) -> vector<17xf32> {
   %f7 = constant 7.0: f32
   %f = vector.transfer_read %A[%base0, %base1], %f7
       {permutation_map = affine_map<(d0, d1) -> (d1)>} :
     memref<?x?xf32>, vector<17xf32>
   return %f: vector<17xf32>
 }
 // CHECK-LABEL: func @transfer_read_2d_to_1d
[205 lines not shown]
 // CHECK-LABEL: func @vector_store_op_index
 // CHECK: llvm.store %{{.*}}, %{{.*}} {alignment = 8 : i64} : !llvm.ptr<vector<4xi64>>

 // -----

 func @masked_load_op(%arg0: memref<?xf32>, %arg1: vector<16xi1>, %arg2: vector<16xf32>) -> vector<16xf32> {
   %c0 = constant 0: index
   %0 = vector.maskedload %arg0[%c0], %arg1, %arg2 : memref<?xf32>, vector<16xi1>, vector<16xf32> into vector<16xf32>
   return %0 : vector<16xf32>
 }

 // CHECK-LABEL: func @masked_load_op
 // CHECK: %[[CO:.*]] = constant 0 : index
 // CHECK: %[[C:.*]] = llvm.mlir.cast %[[CO]] : index to i64
 // CHECK: %[[P:.*]] = llvm.getelementptr %{{.*}}[%[[C]]] : (!llvm.ptr<f32>, i64) -> !llvm.ptr<f32>
 // CHECK: %[[B:.*]] = llvm.bitcast %[[P]] : !llvm.ptr<f32> to !llvm.ptr<vector<16xf32>>
 // CHECK: %[[L:.*]] = llvm.intr.masked.load %[[B]], %{{.*}}, %{{.*}} {alignment = 4 : i32} : (!llvm.ptr<vector<16xf32>>, vector<16xi1>, vector<16xf32>) -> vector<16xf32>
[157 lines not shown]