This is an archive of the discontinued LLVM Phabricator instance.

Differential D74401

[MLIR] Add std.atomic_rmw op
ClosedPublic

Authored by flaub on Feb 11 2020, 6:05 AM.

Details

Reviewers

ftynse
mehdi_amini
rriddle
jbruestle
earhart

Commits

rGfe210a1ff2e9: [MLIR] Add std.atomic_rmw op

Summary

The RFC for this op is here: https://llvm.discourse.group/t/rfc-add-std-atomic-rmw-op/489

The std.atomic_rmw op provides a way to support read-modify-write
sequences with data race freedom. It is intended to be used in the lowering
of an upcoming affine.atomic_rmw op which can be used for reductions.

A lowering to LLVM is provided with 2 paths:

Simple patterns: llvm.atomicrmw
Everything else: llvm.cmpxchg
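
Editorial sketch (not part of the patch): the two lowering paths have a close analogue in plain C++ `std::atomic`. The helper names `atomicAddI` and `atomicMaxF` are hypothetical and chosen only for illustration. An integer add maps onto a single read-modify-write instruction, while a float max has to be retried with compare-exchange until the store succeeds. Both helpers return the value that was in memory before the update, and the orderings mirror the `acq_rel` and `monotonic` orderings used in the lowering.

```cpp
#include <atomic>
#include <cassert>

// Simple pattern: one hardware-level read-modify-write, analogous to
// lowering an "addi" kind to llvm.atomicrmw add.
int atomicAddI(std::atomic<int> &cell, int value) {
  return cell.fetch_add(value, std::memory_order_acq_rel);
}

// Everything else: retry a compare-exchange until it succeeds, analogous
// to the llvm.cmpxchg loop. `loaded` plays the role of the loop block
// argument; on failure it is refreshed with the current memory contents.
float atomicMaxF(std::atomic<float> &cell, float value) {
  float loaded = cell.load(std::memory_order_relaxed);
  for (;;) {
    float desired = loaded > value ? loaded : value;
    if (cell.compare_exchange_weak(loaded, desired,
                                   std::memory_order_acq_rel,
                                   std::memory_order_relaxed))
      return loaded; // the value observed just before the successful store
  }
}
```

The weak compare-exchange may fail spuriously, which is harmless here because the loop simply recomputes `desired` and retries.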

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

flaub created this revision. Feb 11 2020, 6:05 AM

Herald added a project: Restricted Project. Feb 11 2020, 6:05 AM

Herald added subscribers: llvm-commits, Joonsoo, liufengdb and 9 others.

Remove stdx.

Harbormaster failed remote builds in B46219: Diff 243841! Feb 11 2020, 6:28 AM

Harbormaster failed remote builds in B46218: Diff 243840!

jbruestle requested changes to this revision. Feb 11 2020, 10:48 AM

jbruestle added inline comments.

mlir/include/mlir/Dialect/StandardOps/Ops.td
230 ↗	(On Diff #243841)	I think you just mean 'block argument' not induction variable here (and below in a few places), since it's not iterating over anything.
252 ↗	(On Diff #243841)	Maybe rename getInitialValue(), or getLoadedValue() or something similar?

This revision now requires changes to proceed. Feb 11 2020, 10:48 AM

Review updates

flaub marked 2 inline comments as done. Feb 11 2020, 12:43 PM

Remove iv

flaub updated this revision to Diff 243963. Feb 11 2020, 12:48 PM

Remove iv

Review addressed in latest push.

jbruestle accepted this revision. Feb 11 2020, 1:16 PM

This revision is now accepted and ready to land. Feb 11 2020, 1:16 PM

Harbormaster failed remote builds in B46262: Diff 243963! Feb 11 2020, 1:18 PM

Harbormaster failed remote builds in B46259: Diff 243960!

rriddle requested changes to this revision. Feb 11 2020, 1:25 PM

rriddle added inline comments.

mlir/include/mlir/Dialect/StandardOps/Ops.td
227 ↗	(On Diff #243963)	typo: indicies -> indices
239 ↗	(On Diff #243963)	Wrap these in a mlir code block
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2507	static functions should be in the global namespace and marked as `static`. Only classes should be placed within anonymous namespaces.
2540	Drop trivial braces.
2563	Same here.
2619	Top-level comments should be ///
2677	Don't create SmallVectors for things like this. Use ArrayRef or arrays.
2705	Same here.
mlir/lib/Dialect/StandardOps/Ops.cpp
3019 ↗	(On Diff #243963)	Remove trivial braces. I would have expected that this could be covered by ODS constraints.

This revision now requires changes to proceed. Feb 11 2020, 1:25 PM

Harbormaster failed remote builds in B46261: Diff 243962! Feb 11 2020, 1:36 PM

Review updates

Herald added a subscriber: aartbik. Feb 11 2020, 3:40 PM

flaub marked 3 inline comments as not done. Feb 11 2020, 3:40 PM

flaub added inline comments.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2507	I'm OK with this, but I'm unfamiliar with this convention. What's the purpose behind having functions being `static` instead of being in the anonymous namespace? I was always under the impression that the two were functionally equivalent and that the more 'C++' way was to use anonymous namespaces. Also, should I close out the namespace here and then add the static functions and then re-open the anonymous namespace? Or would it make sense to move these up above?
mlir/lib/Dialect/StandardOps/Ops.cpp
3019 ↗	(On Diff #243963)	Did you have a specific trait in mind? I'm comparing the parent's element type to the op's operand type. I didn't see anything in `OpBase.td` at first glance.

rriddle added inline comments. Feb 11 2020, 3:51 PM

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2507	It's about readability and consistency. https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
mlir/lib/Dialect/StandardOps/Ops.cpp
3019 ↗	(On Diff #243963)	Hmmm, I thought there was one for this already. I think a lot of the current usages are abusing `mlir::getElementTypeOrSelf`.

rriddle added inline comments. Feb 11 2020, 3:52 PM

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2507	> Also, should I close out the namespace here and then add the static functions and then re-open the anonymous namespace? Or would it make sense to move these up above?
Whichever one makes sense. You can close the namespace or hoist the functions.
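
Editorial sketch (not part of the patch): the convention from the linked LLVM Coding Standards section can be illustrated as follows. The names `Doubler`, `addOne`, and `doubleThenAddOne` are hypothetical. Classes go in a small anonymous namespace because that is the only way to give a type internal linkage, while file-local functions are marked `static` so that their linkage is visible at the definition without scanning back for an enclosing namespace.

```cpp
#include <cassert>

namespace {
// A class needs the anonymous namespace to get internal linkage.
struct Doubler {
  int apply(int x) const { return 2 * x; }
};
} // namespace

// A file-local function is marked `static` instead, so a reader sees at
// the definition that it is private to this translation unit.
static int addOne(int x) { return x + 1; }

// Externally visible entry point that uses both file-local helpers.
int doubleThenAddOne(int x) { return addOne(Doubler{}.apply(x)); }
```

Both forms give internal linkage; the difference is only in how easily a reader can tell.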

Drop trivial braces
Trivial braces

Harbormaster failed remote builds in B46283: Diff 244020! Feb 11 2020, 4:02 PM

flaub marked an inline comment as done. Feb 11 2020, 4:06 PM

flaub added inline comments.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2507	Thanks for the link!

Harbormaster failed remote builds in B46285: Diff 244026! Feb 11 2020, 4:20 PM

nicolasvasilache added inline comments. Feb 11 2020, 5:01 PM

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2645	This seems like it'd be better as a `loop.while` + progressive lowering?
mlir/test/Conversion/StandardToLLVM/convert-to-llvmir.mlir
869	So this is interesting to me. Weren't you and/or @jbruestle advocating that we should have the reduction op encoded as an attribute in the case of affine.parallel_for with reduction semantics? It seems a very similar scenario to me; I'd be interested in where you draw the distinction between "encoded as an attribute" and just using a region.

flaub marked 2 inline comments as done. Feb 11 2020, 5:12 PM

flaub added inline comments.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2645	I agree that it'd be a lot nicer to write the loop at a higher level, but it wasn't clear to me where this would go. Also, the determination of whether to use a cmpxchg or atomic_rmw is really specific to LLVM lowering. So we wouldn't want to use a loop in the case of simple bodies that can map to a single intrinsic/op at the lower level. I'm also thinking about how we will want to have a lowering from std to SPIR-V or OpenMP, in which case a loop may or may not make sense for those lowerings.
mlir/test/Conversion/StandardToLLVM/convert-to-llvmir.mlir
869	I think we initially thought having a closed attribute would be good, but it seemed that providing the ability to lower arbitrary reductions into cmpxchg wasn't too hard to do and it was easy enough to identify these simple cases that do lower to a single intrinsic. Our current plan is to still use an enum at the top level (the tile dialect), but then use a region for affine and below. The upcoming `affine.atomic_rmw` should basically mirror the standard one in regards to region vs attribute.

nicolasvasilache added inline comments. Feb 11 2020, 5:36 PM

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2645	Not for this revision but in the future there are some nice tricks we could play here. Your Conversion could very well decide to introduce a higher level loop construct, rewrite the region into that and let the rewrite infrastructure stitch the pieces together. In other words, I wasn't advocating that the atomic behavior should leak into the other targets, but that you could decide during lowering to introduce a higher level construct that is implemented and tested independently.

Great work! I only have some nits.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2510	I'd appreciate some documentation on this function and below.
2607	Type conversion may fail and return null, please check the result.
2610	Nit: `rewriter.replaceOpWithNewOp<LLVM::AtomicRMWOp>(op, resultType, ...);`
2645	This is similar to the discussion @nicolasvasilache and I had about memory copies, and is worth discussing in general. One practical thing I'd like to point out: we need to make sure we don't introduce cyclic library dependencies, and it may be tricky.
2702	Nit: normally, single-result ops should be convertible to `Value` so you shouldn't be needing the `.res()` part

Remove stdx.
Review updates
Remove iv
Review updates
Drop trivial braces
Trivial braces
Address feedback

@rriddle Are there any more blockers that need to be addressed?

Harbormaster failed remote builds in B46373: Diff 244317! Feb 12 2020, 6:55 PM

LGTM when extra documentation is added as discussed on Discourse.

mlir/include/mlir/Dialect/StandardOps/Ops.td
235 ↗	(On Diff #244317)	Could you please describe the restrictions on the body contents of the atomic region as discussed in the RFC?

LGTM after comments are resolved.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2501	Can you document this struct please.
2508	Use /// for all top-level comments. Here and below.
2727	For single element things you should be able to pass the values directly. Is there a problem you are running into doing that?

This revision is now accepted and ready to land. Feb 14 2020, 9:28 PM

Simplified design based on RFC feedback

Comments

Fix example

Harbormaster failed remote builds in B46889: Diff 245584! Feb 19 2020, 11:28 PM

Harbormaster failed remote builds in B46891: Diff 245586! Feb 19 2020, 11:36 PM

Harbormaster failed remote builds in B46890: Diff 245585!

Thanks for the update Frank! Added a few comments.

mlir/include/mlir/Dialect/StandardOps/Ops.td
262 ↗	(On Diff #245586)	Should this be a more constrained type, like 'IntegerOrFloatLike'?
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2698	This is invalid, all IR mutations need to go through the rewriter. Some pattern drivers, like DialectConversion which is being used here, will undo transformations if something goes wrong. If something is done outside of the rewriter, this leads to invalid code/crashes. Seems like you want to do `rewriter.replaceOp` here.
2764	Can we keep this ordered?
mlir/lib/Dialect/StandardOps/Ops.cpp
2919 ↗	(On Diff #245586)	Could you switch to the declarative format? It will format the enum as a string instead of a keyword for now, but that is worth it to remove all of this parsing code.
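
Editorial sketch (not part of the patch, and not MLIR's API): the comment on line 2698 is easier to follow with a toy model. `ToyOp` and `ToyRewriter` below are hypothetical types that assume a driver which records an undo action for every change it is asked to make. A change made behind the rewriter's back leaves no record, so a rollback cannot revert it.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an operation; only its name can change.
struct ToyOp {
  std::string name;
};

// Hypothetical rewriter that logs an undo action per mutation.
class ToyRewriter {
public:
  // Rename through the rewriter: apply the change and remember how to undo it.
  void rename(ToyOp &op, std::string newName) {
    std::string oldName = op.name;
    undoLog.push_back([&op, oldName] { op.name = oldName; });
    op.name = std::move(newName);
  }

  // Revert every recorded change, newest first, as a driver might do when a
  // conversion fails part-way through.
  void rollback() {
    for (auto it = undoLog.rbegin(); it != undoLog.rend(); ++it)
      (*it)();
    undoLog.clear();
  }

private:
  std::vector<std::function<void()>> undoLog;
};
```

In this model, an op renamed via `rename` is restored by `rollback`, whereas an op whose `name` field is assigned directly keeps the new name, which is the kind of inconsistent state the review comment warns about.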

This revision now requires changes to proceed. Feb 19 2020, 11:51 PM

flaub marked 4 inline comments as done. Feb 20 2020, 12:13 AM

flaub added inline comments.

mlir/include/mlir/Dialect/StandardOps/Ops.td
262 ↗	(On Diff #245586)	OK, I suppose that will work here for now. I could see this being expanded later if the lowering supported more types (which in theory it could by lowering to `cmpxchg` or others).
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
2698	OK, thanks for clarifying. I'll try to use `rewriter.replaceOp` instead.
2764	Will do.
mlir/lib/Dialect/StandardOps/Ops.cpp
2919 ↗	(On Diff #245586)	Cool, I will try that, thanks (sounds like a worthwhile tradeoff).

Review feedback

Fix example

Harbormaster failed remote builds in B46900: Diff 245598! Feb 20 2020, 1:15 AM

Harbormaster failed remote builds in B46901: Diff 245599! Feb 20 2020, 1:23 AM

Merge branch 'master' into arcpatch-D74401
Merge branch 'master' into arcpatch-D74401
Update with master

@rriddle Thanks for the review, does this look better now?

Thanks!

mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
273	nit: Can you keep these two lines aligned?

This revision is now accepted and ready to land. Feb 24 2020, 4:39 PM

Closed by commit rGfe210a1ff2e9: [MLIR] Add std.atomic_rmw op (authored by flaub). Feb 24 2020, 4:54 PM

This revision was automatically updated to reflect the committed changes.

Harbormaster completed remote builds in B47171: Diff 246342. Feb 24 2020, 5:13 PM

Revision Contents

Path	Size
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td	63 lines
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp	184 lines
mlir/lib/Dialect/StandardOps/IR/Ops.cpp	38 lines
mlir/test/Conversion/StandardToLLVM/convert-to-llvmir.mlir	40 lines
mlir/test/IR/core-ops.mlir	7 lines
mlir/test/IR/invalid-ops.mlir	24 lines

Diff 246345

mlir/include/mlir/Dialect/StandardOps/IR/Ops.td

[unchanged lines elided; context: def AllocOp : Std_Op<"alloc"> {]
let hasCanonicalizer = 1;
}

def AndOp : IntArithmeticOp<"and", [Commutative]> {
let summary = "integer binary and";
let hasFolder = 1;
}

		def ATOMIC_RMW_KIND_ADDF : I64EnumAttrCase<"addf", 0>;
		def ATOMIC_RMW_KIND_ADDI : I64EnumAttrCase<"addi", 1>;
		def ATOMIC_RMW_KIND_ASSIGN : I64EnumAttrCase<"assign", 2>;
		def ATOMIC_RMW_KIND_MAXF : I64EnumAttrCase<"maxf", 3>;
		def ATOMIC_RMW_KIND_MAXS : I64EnumAttrCase<"maxs", 4>;
		def ATOMIC_RMW_KIND_MAXU : I64EnumAttrCase<"maxu", 5>;
		def ATOMIC_RMW_KIND_MINF : I64EnumAttrCase<"minf", 6>;
		def ATOMIC_RMW_KIND_MINS : I64EnumAttrCase<"mins", 7>;
		def ATOMIC_RMW_KIND_MINU : I64EnumAttrCase<"minu", 8>;
		def ATOMIC_RMW_KIND_MULF : I64EnumAttrCase<"mulf", 9>;
		def ATOMIC_RMW_KIND_MULI : I64EnumAttrCase<"muli", 10>;

		def AtomicRMWKindAttr : I64EnumAttr<
		"AtomicRMWKind", "",
		[ATOMIC_RMW_KIND_ADDF, ATOMIC_RMW_KIND_ADDI, ATOMIC_RMW_KIND_ASSIGN,
		ATOMIC_RMW_KIND_MAXF, ATOMIC_RMW_KIND_MAXS, ATOMIC_RMW_KIND_MAXU,
		ATOMIC_RMW_KIND_MINF, ATOMIC_RMW_KIND_MINS, ATOMIC_RMW_KIND_MINU,
		ATOMIC_RMW_KIND_MULF, ATOMIC_RMW_KIND_MULI]> {
		let cppNamespace = "::mlir";
		}

		def AtomicRMWOp : Std_Op<"atomic_rmw", [
		AllTypesMatch<["value", "result"]>,
		TypesMatchWith<"value type matches element type of memref",
		"memref", "value",
		"$_self.cast<MemRefType>().getElementType()">
		]> {
		let summary = "atomic read-modify-write operation";
		let description = [{
		The "atomic_rmw" operation provides a way to perform a read-modify-write
		sequence that is free from data races. The kind enumeration specifies the
		modification to perform. The value operand represents the new value to be
		applied during the modification. The memref operand represents the buffer
		that the read and write will be performed against, as accessed by the
		specified indices. The arity of the indices is the rank of the memref. The
		result represents the latest value that was stored.

		Example:

		```mlir
		%x = atomic_rmw "addf" %value, %I[%i] : (f32, memref<10xf32>) -> f32
		```
		}];

		let arguments = (ins
		AtomicRMWKindAttr:$kind,
		AnyTypeOf<[AnySignlessInteger, AnyFloat]>:$value,
		MemRefOf<[AnySignlessInteger, AnyFloat]>:$memref,
		Variadic<Index>:$indices);
		let results = (outs AnyTypeOf<[AnySignlessInteger, AnyFloat]>:$result);

		let assemblyFormat = [{
		$kind $value `,` $memref `[` $indices `]` attr-dict `:` `(` type($value) `,`
		type($memref) `)` `->` type($result)
		}];

		let extraClassDeclaration = [{
		MemRefType getMemRefType() {
		return memref().getType().cast<MemRefType>();
		}
		}];
		}

def BranchOp : Std_Op<"br", [Terminator]> {
let summary = "branch operation";
let description = [{
The "br" operation represents a branch operation in a function.
The operation takes variable number of operands and produces no results.
The operand number and types for each successor must match the arguments of
the block successor. For example:

[remaining unchanged lines elided]

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp

[unchanged lines elided; context: for (unsigned i = 0; i < numResults; ++i) {]
op->getLoc(), type, newOp.getOperation()->getResult(0),
rewriter.getI64ArrayAttr(i)));
}
rewriter.replaceOp(op, results);
return this->matchSuccess();
}
};

template <typename SourceOp, unsigned OpCount>
struct OpCountValidator {
static_assert(
std::is_base_of<
typename OpTrait::NOperands<OpCount>::template Impl<SourceOp>,
SourceOp>::value,
"wrong operand count");
};

template <typename SourceOp>
struct OpCountValidator<SourceOp, 1> {
static_assert(std::is_base_of<OpTrait::OneOperand<SourceOp>, SourceOp>::value,
"expected a single operand");
};

template <typename SourceOp, unsigned OpCount>
void ValidateOpCount() {
OpCountValidator<SourceOp, OpCount>();
}

// Basic lowering implementation for rewriting from Standard Ops to LLVM Dialect
// Ops for N-ary ops with one result. This supports higher-dimensional vector
// types.
template <typename SourceOp, typename TargetOp, unsigned OpCount>
struct NaryOpLLVMOpLowering : public LLVMLegalizationPattern<SourceOp> {
[unchanged lines elided; context: void rewrite(Operation *op, ArrayRef<Value> operands,]
// Iterate strides in reverse order, compute runningStride and strideValues.
auto nStrides = strides.size();
SmallVector<Value, 4> strideValues(nStrides, nullptr);
for (unsigned i = 0; i < nStrides; ++i) {
int64_t index = nStrides - 1 - i;
if (strides[index] == MemRefType::getDynamicStrideOrOffset())
// Identity layout map is enforced in the match function, so we compute:
// `runningStride *= sizes[index + 1]`
runningStride = runningStride
? rewriter.create<LLVM::MulOp>(loc, runningStride,
sizes[index + 1])
: createIndexConstant(rewriter, loc, 1);
else
runningStride = createIndexConstant(rewriter, loc, strides[index]);
strideValues[index] = runningStride;
}
// Fill size and stride descriptors in memref.
for (auto indexedSize : llvm::enumerate(sizes)) {
int64_t index = indexedSize.index();
memRefDescriptor.setSize(rewriter, loc, index, indexedSize.value());
[unchanged lines elided; context: for (int i = viewMemRefType.getRank() - 1; i >= 0; --i) {]
nextSize = size;
}

rewriter.replaceOp(op, {targetMemRef});
return matchSuccess();
}
};

struct AssumeAlignmentOpLowering
: public LLVMLegalizationPattern<AssumeAlignmentOp> {
using LLVMLegalizationPattern<AssumeAlignmentOp>::LLVMLegalizationPattern;

PatternMatchResult
matchAndRewrite(Operation *op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {
OperandAdaptor<AssumeAlignmentOp> transformed(operands);
Value memref = transformed.memref();
unsigned alignment = cast<AssumeAlignmentOp>(op).alignment().getZExtValue();

MemRefDescriptor memRefDescriptor(memref);
Value ptr = memRefDescriptor.alignedPtr(rewriter, memref.getLoc());

// Emit llvm.assume(memref.alignedPtr & (alignment - 1) == 0). Notice that
// the asserted memref.alignedPtr isn't used anywhere else, as the real
// users like load/store/views always re-extract memref.alignedPtr as they
// get lowered.
[13 unchanged lines elided; context: rewriter.create<LLVM::AssumeOp>(]
op->getLoc(), LLVM::ICmpPredicate::eq,
rewriter.create<LLVM::AndOp>(op->getLoc(), ptrValue, mask), zero));

rewriter.eraseOp(op);
return matchSuccess();
}
};

} // namespace

		/// Try to match the kind of a std.atomic_rmw to determine whether to use a
		/// lowering to llvm.atomicrmw or fallback to llvm.cmpxchg.
		static Optional<LLVM::AtomicBinOp> matchSimpleAtomicOp(AtomicRMWOp atomicOp) {
		switch (atomicOp.kind()) {
		case AtomicRMWKind::addf:
		return LLVM::AtomicBinOp::fadd;
		case AtomicRMWKind::addi:
		return LLVM::AtomicBinOp::add;
		case AtomicRMWKind::assign:
		return LLVM::AtomicBinOp::xchg;
		case AtomicRMWKind::maxs:
		return LLVM::AtomicBinOp::max;
		case AtomicRMWKind::maxu:
		return LLVM::AtomicBinOp::umax;
		case AtomicRMWKind::mins:
		return LLVM::AtomicBinOp::min;
		case AtomicRMWKind::minu:
		return LLVM::AtomicBinOp::umin;
		default:
		return llvm::None;
		}
		llvm_unreachable("Invalid AtomicRMWKind");
		}

		namespace {

		struct AtomicRMWOpLowering : public LoadStoreOpLowering<AtomicRMWOp> {
		using Base::Base;

		PatternMatchResult
		matchAndRewrite(Operation *op, ArrayRef<Value> operands,
		ConversionPatternRewriter &rewriter) const override {
		auto atomicOp = cast<AtomicRMWOp>(op);
		auto maybeKind = matchSimpleAtomicOp(atomicOp);
		if (!maybeKind)
		return matchFailure();
		OperandAdaptor<AtomicRMWOp> adaptor(operands);
		auto resultType = adaptor.value().getType();
		auto memRefType = atomicOp.getMemRefType();
		auto dataPtr = getDataPtr(op->getLoc(), memRefType, adaptor.memref(),
		adaptor.indices(), rewriter, getModule());
		rewriter.replaceOpWithNewOp<LLVM::AtomicRMWOp>(
		op, resultType, *maybeKind, dataPtr, adaptor.value(),
		LLVM::AtomicOrdering::acq_rel);
		return matchSuccess();
		}
		};

		/// Wrap a llvm.cmpxchg operation in a while loop so that the operation can be
		/// retried until it succeeds in atomically storing a new value into memory.
		///
		///      +---------------------------------+
		///      |   <code before the AtomicRMWOp> |
		///      |   <compute initial %loaded>     |
		///      |   br loop(%loaded)              |
		///      +---------------------------------+
		///             |
		///  -------|   |
		///  |      v   v
		///  |   +--------------------------------+
		///  |   | loop(%loaded):                 |
		///  |   |   <body contents>              |
		///  |   |   %pair = cmpxchg              |
		///  |   |   %ok = %pair[0]               |
		///  |   |   %new = %pair[1]              |
		///  |   |   cond_br %ok, end, loop(%new) |
		///  |   +--------------------------------+
		///  |          |        |
		///  |-----------        |
		///                      v
		///      +--------------------------------+
		///      | end:                           |
		///      |   <code after the AtomicRMWOp> |
		///      +--------------------------------+
		///
		struct AtomicCmpXchgOpLowering : public LoadStoreOpLowering<AtomicRMWOp> {
		using Base::Base;

		PatternMatchResult
		matchAndRewrite(Operation *op, ArrayRef<Value> operands,
		ConversionPatternRewriter &rewriter) const override {
		auto atomicOp = cast<AtomicRMWOp>(op);
		auto maybeKind = matchSimpleAtomicOp(atomicOp);
		if (maybeKind)
		return matchFailure();

		LLVM::FCmpPredicate predicate;
		switch (atomicOp.kind()) {
		case AtomicRMWKind::maxf:
		predicate = LLVM::FCmpPredicate::ogt;
		break;
		case AtomicRMWKind::minf:
		predicate = LLVM::FCmpPredicate::olt;
		break;
		default:
		return matchFailure();
		}

		OperandAdaptor<AtomicRMWOp> adaptor(operands);
		auto loc = op->getLoc();
		auto valueType = adaptor.value().getType().cast<LLVM::LLVMType>();

		// Split the block into initial, loop, and ending parts.
		auto *initBlock = rewriter.getInsertionBlock();
		auto initPosition = rewriter.getInsertionPoint();
		auto *loopBlock = rewriter.splitBlock(initBlock, initPosition);
		auto loopArgument = loopBlock->addArgument(valueType);
		auto loopPosition = rewriter.getInsertionPoint();
		auto *endBlock = rewriter.splitBlock(loopBlock, loopPosition);

		// Compute the loaded value and branch to the loop block.
		rewriter.setInsertionPointToEnd(initBlock);
		auto memRefType = atomicOp.getMemRefType();
		auto dataPtr = getDataPtr(loc, memRefType, adaptor.memref(),
		adaptor.indices(), rewriter, getModule());
		auto init = rewriter.create<LLVM::LoadOp>(loc, dataPtr);
		std::array<Value, 1> brRegionOperands{init};
		std::array<ValueRange, 1> brOperands{brRegionOperands};
		rewriter.create<LLVM::BrOp>(loc, ArrayRef<Value>{}, loopBlock, brOperands);

		// Prepare the body of the loop block.
		rewriter.setInsertionPointToStart(loopBlock);
		auto predicateI64 =
		rewriter.getI64IntegerAttr(static_cast<int64_t>(predicate));
		auto boolType = LLVM::LLVMType::getInt1Ty(&getDialect());
		auto lhs = loopArgument;
		auto rhs = adaptor.value();
		auto cmp =
		rewriter.create<LLVM::FCmpOp>(loc, boolType, predicateI64, lhs, rhs);
		auto select = rewriter.create<LLVM::SelectOp>(loc, cmp, lhs, rhs);

		// Prepare the epilog of the loop block.
		rewriter.setInsertionPointToEnd(loopBlock);
		// Append the cmpxchg op to the end of the loop block.
		auto successOrdering = LLVM::AtomicOrdering::acq_rel;
		auto failureOrdering = LLVM::AtomicOrdering::monotonic;
		rriddle (Done): Don't create SmallVectors for things like this. Use ArrayRef or arrays.
		auto pairType = LLVM::LLVMType::getStructTy(valueType, boolType);
		auto cmpxchg = rewriter.create<LLVM::AtomicCmpXchgOp>(
		loc, pairType, dataPtr, loopArgument, select, successOrdering,
		failureOrdering);
		// Extract the %new_loaded and %ok values from the pair.
		auto newLoaded = rewriter.create<LLVM::ExtractValueOp>(
		loc, valueType, cmpxchg, rewriter.getI64ArrayAttr({0}));
		auto ok = rewriter.create<LLVM::ExtractValueOp>(
		loc, boolType, cmpxchg, rewriter.getI64ArrayAttr({1}));

		// Conditionally branch to the end or back to the loop depending on %ok.
		std::array<Value, 1> condBrProperOperands{ok};
		std::array<Block *, 2> condBrDestinations{endBlock, loopBlock};
		std::array<Value, 1> condBrRegionOperands{newLoaded};
		std::array<ValueRange, 2> condBrOperands{ArrayRef<Value>{},
		condBrRegionOperands};
		rewriter.create<LLVM::CondBrOp>(loc, condBrProperOperands,
		condBrDestinations, condBrOperands);

		// The 'result' of the atomic_rmw op is the newly loaded value.
		rewriter.replaceOp(op, {newLoaded});
		rriddle (Not Done): This is invalid; all IR mutations need to go through the rewriter. Some pattern drivers, like DialectConversion, which is being used here, will undo transformations if something goes wrong. If something is done outside of the rewriter, this leads to invalid code or crashes. It seems like you want to use `rewriter.replaceOp` here.
		flaub (Author, Done): OK, thanks for clarifying. I'll try to use `rewriter.replaceOp` instead.

		return matchSuccess();
		}
		};
		ftynse (Done): Nit: normally, single-result ops should be convertible to `Value`, so you shouldn't need the `.res()` part.

		} // namespace

		rriddle (Done): Same here.
static void ensureDistinctSuccessors(Block &bb) {		static void ensureDistinctSuccessors(Block &bb) {
auto *terminator = bb.getTerminator();		auto *terminator = bb.getTerminator();

// Find repeated successors with arguments.		// Find repeated successors with arguments.
llvm::SmallDenseMap<Block *, SmallVector<int, 4>> successorPositions;		llvm::SmallDenseMap<Block *, SmallVector<int, 4>> successorPositions;
for (int i = 0, e = terminator->getNumSuccessors(); i < e; ++i) {		for (int i = 0, e = terminator->getNumSuccessors(); i < e; ++i) {
Block *successor = terminator->getSuccessor(i);		Block *successor = terminator->getSuccessor(i);
// Blocks with no arguments are safe even if they appear multiple times		// Blocks with no arguments are safe even if they appear multiple times
// because they don't need PHI nodes.		// because they don't need PHI nodes.
if (successor->getNumArguments() == 0)		if (successor->getNumArguments() == 0)
continue;		continue;
successorPositions[successor].push_back(i);		successorPositions[successor].push_back(i);
}		}

// If a successor appears for the second or more time in the terminator,		// If a successor appears for the second or more time in the terminator,
// create a new dummy block that unconditionally branches to the original		// create a new dummy block that unconditionally branches to the original
// destination, and retarget the terminator to branch to this new block.		// destination, and retarget the terminator to branch to this new block.
// There is no need to pass arguments to the dummy block because it will be		// There is no need to pass arguments to the dummy block because it will be
// dominated by the original block and can therefore use any values defined in		// dominated by the original block and can therefore use any values defined in
// the original block.		// the original block.
for (const auto &successor : successorPositions) {		for (const auto &successor : successorPositions) {
const auto &positions = successor.second;		const auto &positions = successor.second;
		rriddle (Not Done): For single-element things you should be able to pass the values directly. Is there a problem you are running into doing that?
// Start from the second occurrence of a block in the successor list.		// Start from the second occurrence of a block in the successor list.
for (auto position = std::next(positions.begin()), end = positions.end();		for (auto position = std::next(positions.begin()), end = positions.end();
position != end; ++position) {		position != end; ++position) {
auto *dummyBlock = new Block();		auto *dummyBlock = new Block();
bb.getParent()->push_back(dummyBlock);		bb.getParent()->push_back(dummyBlock);
auto builder = OpBuilder(dummyBlock);		auto builder = OpBuilder(dummyBlock);
SmallVector<Value, 8> operands(		SmallVector<Value, 8> operands(
terminator->getSuccessorOperands(*position));		terminator->getSuccessorOperands(*position));
Show All 19 Lines	void mlir::populateStdToLLVMNonMemoryConversionPatterns(
LLVMTypeConverter &converter, OwningRewritePatternList &patterns) {		LLVMTypeConverter &converter, OwningRewritePatternList &patterns) {
// FIXME: this should be tablegen'ed		// FIXME: this should be tablegen'ed
// clang-format off		// clang-format off
patterns.insert<		patterns.insert<
AbsFOpLowering,		AbsFOpLowering,
AddFOpLowering,		AddFOpLowering,
AddIOpLowering,		AddIOpLowering,
AndOpLowering,		AndOpLowering,
		AtomicCmpXchgOpLowering,
		AtomicRMWOpLowering,
		rriddle (Not Done): Can we keep this ordered?
		flaub (Author, Done): Will do.
BranchOpLowering,		BranchOpLowering,
CallIndirectOpLowering,		CallIndirectOpLowering,
CallOpLowering,		CallOpLowering,
CeilFOpLowering,		CeilFOpLowering,
CmpFOpLowering,		CmpFOpLowering,
CmpIOpLowering,		CmpIOpLowering,
CondBranchOpLowering,		CondBranchOpLowering,
CopySignOpLowering,		CopySignOpLowering,
▲ Show 20 Lines • Show All 226 Lines • Show Last 20 Lines
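
For readers less familiar with the cmpxchg path, the loop built by the lowering above (load, compare, select, cmpxchg, retry on failure) can be approximated in plain C++ with `std::atomic`. This is only a sketch under stated assumptions, not code from the patch: the helper name `atomicMaxF` is hypothetical, `std::memory_order_relaxed` is used as the C++ counterpart of LLVM's `monotonic` ordering, and the initial load is made atomic here even though the lowering emits a plain `llvm.load`.

```cpp
#include <atomic>

// Hypothetical helper (not part of the patch): atomically replaces the
// contents of `cell` with max(contents, value) and returns the value that
// was observed immediately before the successful update.
float atomicMaxF(std::atomic<float> &cell, float value) {
  // Plays the role of the initial load that seeds the loop block argument.
  float loaded = cell.load(std::memory_order_relaxed);
  for (;;) {
    // Mirrors `fcmp "ogt"` followed by `select`.
    float desired = (loaded > value) ? loaded : value;
    // Success ordering acq_rel, failure ordering relaxed (monotonic).
    // On failure, `loaded` is overwritten with the current contents, which
    // matches branching back to the loop block with the newly loaded value.
    if (cell.compare_exchange_strong(loaded, desired,
                                     std::memory_order_acq_rel,
                                     std::memory_order_relaxed))
      return loaded;
  }
}
```

On success the returned value equals the previous contents of the cell, which is the same quantity the lowering forwards as the result of the op (element 0 of the cmpxchg pair).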

mlir/lib/Dialect/StandardOps/IR/Ops.cpp

	Show First 20 Lines • Show All 129 Lines • ▼ Show 20 Lines
	static void printStandardCastOp(Operation *op, OpAsmPrinter &p) {			static void printStandardCastOp(Operation *op, OpAsmPrinter &p) {
	int stdDotLen = StandardOpsDialect::getDialectNamespace().size() + 1;			int stdDotLen = StandardOpsDialect::getDialectNamespace().size() + 1;
	p << op->getName().getStringRef().drop_front(stdDotLen) << ' '			p << op->getName().getStringRef().drop_front(stdDotLen) << ' '
	<< op->getOperand(0) << " : " << op->getOperand(0).getType() << " to "			<< op->getOperand(0) << " : " << op->getOperand(0).getType() << " to "
	<< op->getResult(0).getType();			<< op->getResult(0).getType();
	}			}

	/// A custom cast operation verifier.			/// A custom cast operation verifier.
	template <typename T> static LogicalResult verifyCastOp(T op) {			template <typename T>
				static LogicalResult verifyCastOp(T op) {
	auto opType = op.getOperand().getType();			auto opType = op.getOperand().getType();
	auto resType = op.getType();			auto resType = op.getType();
	if (!T::areCastCompatible(opType, resType))			if (!T::areCastCompatible(opType, resType))
	return op.emitError("operand type ") << opType << " and result type "			return op.emitError("operand type ") << opType << " and result type "
	<< resType << " are cast incompatible";			<< resType << " are cast incompatible";

	return success();			return success();
	}			}
	▲ Show 20 Lines • Show All 2,463 Lines • ▼ Show 20 Lines
	bool FPTruncOp::areCastCompatible(Type a, Type b) {			bool FPTruncOp::areCastCompatible(Type a, Type b) {
	if (auto fa = a.dyn_cast<FloatType>())			if (auto fa = a.dyn_cast<FloatType>())
	if (auto fb = b.dyn_cast<FloatType>())			if (auto fb = b.dyn_cast<FloatType>())
	return fa.getWidth() > fb.getWidth();			return fa.getWidth() > fb.getWidth();
	return false;			return false;
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
				// AtomicRMWOp
				//===----------------------------------------------------------------------===//

				static LogicalResult verify(AtomicRMWOp op) {
				if (op.getMemRefType().getRank() != op.getNumOperands() - 2)
				return op.emitOpError(
				"expects the number of subscripts to be equal to memref rank");
				switch (op.kind()) {
				case AtomicRMWKind::addf:
				case AtomicRMWKind::maxf:
				case AtomicRMWKind::minf:
				case AtomicRMWKind::mulf:
				if (!op.value().getType().isa<FloatType>())
				return op.emitOpError()
				<< "with kind '" << stringifyAtomicRMWKind(op.kind())
				<< "' expects a floating-point type";
				break;
				case AtomicRMWKind::addi:
				case AtomicRMWKind::maxs:
				case AtomicRMWKind::maxu:
				case AtomicRMWKind::mins:
				case AtomicRMWKind::minu:
				case AtomicRMWKind::muli:
				if (!op.value().getType().isa<IntegerType>())
				return op.emitOpError()
				<< "with kind '" << stringifyAtomicRMWKind(op.kind())
				<< "' expects an integer type";
				break;
				default:
				break;
				}
				return success();
				}

				//===----------------------------------------------------------------------===//
	// TableGen'd op method definitions			// TableGen'd op method definitions
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#define GET_OP_CLASSES			#define GET_OP_CLASSES
	#include "mlir/Dialect/StandardOps/IR/Ops.cpp.inc"			#include "mlir/Dialect/StandardOps/IR/Ops.cpp.inc"
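
The verifier added above enforces two rules: the number of subscripts must equal the memref rank, and the value type must match the class implied by the kind. A standalone approximation, without any MLIR dependency, might look like the following. The enum `RMWKind`, the enum `ElemClass`, and the function `verifyAtomicRMW` are hypothetical stand-ins introduced only for illustration; the set of kinds is assumed from the names that appear in the verifier and the tests. The sketch requires C++17.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for AtomicRMWKind and for the value type class.
enum class RMWKind { addf, addi, assign, maxf, maxs, maxu, minf, mins, minu, mulf, muli };
enum class ElemClass { Float, Integer };

// Returns an error message when the op would be rejected, or std::nullopt
// when it passes both checks.
std::optional<std::string> verifyAtomicRMW(RMWKind kind, ElemClass valueType,
                                           unsigned memrefRank,
                                           unsigned numIndices) {
  if (memrefRank != numIndices)
    return std::string(
        "expects the number of subscripts to be equal to memref rank");
  switch (kind) {
  case RMWKind::addf:
  case RMWKind::maxf:
  case RMWKind::minf:
  case RMWKind::mulf:
    if (valueType != ElemClass::Float)
      return std::string("expects a floating-point type");
    break;
  case RMWKind::addi:
  case RMWKind::maxs:
  case RMWKind::maxu:
  case RMWKind::mins:
  case RMWKind::minu:
  case RMWKind::muli:
    if (valueType != ElemClass::Integer)
      return std::string("expects an integer type");
    break;
  case RMWKind::assign:
    // No type restriction, as in the `default` case of the real verifier.
    break;
  }
  return std::nullopt;
}
```

The three negative tests in invalid-ops.mlir correspond to the three non-empty results this function can produce.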

mlir/test/Conversion/StandardToLLVM/convert-to-llvmir.mlir

	Show First 20 Lines • Show All 852 Lines • ▼ Show 20 Lines
	// CHECK: module {			// CHECK: module {
	// CHECK: llvm.func @tanh(!llvm.double) -> !llvm.double			// CHECK: llvm.func @tanh(!llvm.double) -> !llvm.double
	// CHECK: llvm.func @tanhf(!llvm.float) -> !llvm.float			// CHECK: llvm.func @tanhf(!llvm.float) -> !llvm.float
	// CHECK-LABEL: func @check_tanh_func_added_only_once_to_symbol_table			// CHECK-LABEL: func @check_tanh_func_added_only_once_to_symbol_table
	}			}

	// -----			// -----

				// CHECK-LABEL: func @atomic_rmw
				func @atomic_rmw(%I : memref<10xi32>, %ival : i32, %F : memref<10xf32>, %fval : f32, %i : index) {
				atomic_rmw "assign" %fval, %F[%i] : (f32, memref<10xf32>) -> f32
				// CHECK: llvm.atomicrmw xchg %{{.*}}, %{{.*}} acq_rel
				atomic_rmw "addi" %ival, %I[%i] : (i32, memref<10xi32>) -> i32
				// CHECK: llvm.atomicrmw add %{{.*}}, %{{.*}} acq_rel
				atomic_rmw "maxs" %ival, %I[%i] : (i32, memref<10xi32>) -> i32
				// CHECK: llvm.atomicrmw max %{{.*}}, %{{.*}} acq_rel
				atomic_rmw "mins" %ival, %I[%i] : (i32, memref<10xi32>) -> i32
				nicolasvasilache (Not Done): So this is interesting to me. Weren't you and/or @jbruestle advocating that we should have the reduction op encoded as an attribute in the case of affine.parallel_for with reduction semantics? It seems a very similar scenario to me; I'd be interested in where you draw the distinction between "encoded as an attribute" and just using a region.
				flaub (Author, Done): I think we initially thought having a closed attribute would be good, but it seemed that providing the ability to lower arbitrary reductions into cmpxchg wasn't too hard to do, and it was easy enough to identify the simple cases that lower to a single intrinsic. Our current plan is still to use an enum at the top level (the tile dialect), but then use a region for affine and below. The upcoming `affine.atomic_rmw` should basically mirror the standard one with regard to region vs. attribute.
				// CHECK: llvm.atomicrmw min %{{.*}}, %{{.*}} acq_rel
				atomic_rmw "maxu" %ival, %I[%i] : (i32, memref<10xi32>) -> i32
				// CHECK: llvm.atomicrmw umax %{{.*}}, %{{.*}} acq_rel
				atomic_rmw "minu" %ival, %I[%i] : (i32, memref<10xi32>) -> i32
				// CHECK: llvm.atomicrmw umin %{{.*}}, %{{.*}} acq_rel
				atomic_rmw "addf" %fval, %F[%i] : (f32, memref<10xf32>) -> f32
				// CHECK: llvm.atomicrmw fadd %{{.*}}, %{{.*}} acq_rel
				return
				}

				// -----

				// CHECK-LABEL: func @cmpxchg
				func @cmpxchg(%F : memref<10xf32>, %fval : f32, %i : index) -> f32 {
				%x = atomic_rmw "maxf" %fval, %F[%i] : (f32, memref<10xf32>) -> f32
				// CHECK: %[[init:.*]] = llvm.load %{{.*}} : !llvm<"float*">
				// CHECK-NEXT: llvm.br ^bb1(%[[init]] : !llvm.float)
				// CHECK-NEXT: ^bb1(%[[loaded:.*]]: !llvm.float):
				// CHECK-NEXT: %[[cmp:.*]] = llvm.fcmp "ogt" %[[loaded]], %{{.*}} : !llvm.float
				// CHECK-NEXT: %[[max:.*]] = llvm.select %[[cmp]], %[[loaded]], %{{.*}} : !llvm.i1, !llvm.float
				// CHECK-NEXT: %[[pair:.*]] = llvm.cmpxchg %{{.*}}, %[[loaded]], %[[max]] acq_rel monotonic : !llvm.float
				// CHECK-NEXT: %[[new:.*]] = llvm.extractvalue %[[pair]][0] : !llvm<"{ float, i1 }">
				// CHECK-NEXT: %[[ok:.*]] = llvm.extractvalue %[[pair]][1] : !llvm<"{ float, i1 }">
				// CHECK-NEXT: llvm.cond_br %[[ok]], ^bb2, ^bb1(%[[new]] : !llvm.float)
				// CHECK-NEXT: ^bb2:
				return %x : f32
				// CHECK-NEXT: llvm.return %[[new]]
				}

				// -----

	// CHECK-LABEL: func @assume_alignment			// CHECK-LABEL: func @assume_alignment
	func @assume_alignment(%0 : memref<4x4xf16>) {			func @assume_alignment(%0 : memref<4x4xf16>) {
	// CHECK: %[[PTR:.*]] = llvm.extractvalue %[[MEMREF:.*]][1] : !llvm<"{ half*, half*, i64, [2 x i64], [2 x i64] }">			// CHECK: %[[PTR:.*]] = llvm.extractvalue %[[MEMREF:.*]][1] : !llvm<"{ half*, half*, i64, [2 x i64], [2 x i64] }">
	// CHECK-NEXT: %[[ZERO:.*]] = llvm.mlir.constant(0 : index) : !llvm.i64			// CHECK-NEXT: %[[ZERO:.*]] = llvm.mlir.constant(0 : index) : !llvm.i64
	// CHECK-NEXT: %[[MASK:.*]] = llvm.mlir.constant(15 : index) : !llvm.i64			// CHECK-NEXT: %[[MASK:.*]] = llvm.mlir.constant(15 : index) : !llvm.i64
	// CHECK-NEXT: %[[INT:.*]] = llvm.ptrtoint %[[PTR]] : !llvm<"half*"> to !llvm.i64			// CHECK-NEXT: %[[INT:.*]] = llvm.ptrtoint %[[PTR]] : !llvm<"half*"> to !llvm.i64
	// CHECK-NEXT: %[[MASKED_PTR:.*]] = llvm.and %[[INT]], %[[MASK:.*]] : !llvm.i64			// CHECK-NEXT: %[[MASKED_PTR:.*]] = llvm.and %[[INT]], %[[MASK:.*]] : !llvm.i64
	// CHECK-NEXT: %[[CONDITION:.*]] = llvm.icmp "eq" %[[MASKED_PTR]], %[[ZERO]] : !llvm.i64			// CHECK-NEXT: %[[CONDITION:.*]] = llvm.icmp "eq" %[[MASKED_PTR]], %[[ZERO]] : !llvm.i64
	// CHECK-NEXT: "llvm.intr.assume"(%[[CONDITION]]) : (!llvm.i1) -> ()			// CHECK-NEXT: "llvm.intr.assume"(%[[CONDITION]]) : (!llvm.i1) -> ()
	assume_alignment %0, 16 : memref<4x4xf16>			assume_alignment %0, 16 : memref<4x4xf16>
	return			return
	}			}

mlir/test/IR/core-ops.mlir

	Show First 20 Lines • Show All 735 Lines • ▼ Show 20 Lines
	func @tensor_load_store(%0 : memref<4x4xi32>) {			func @tensor_load_store(%0 : memref<4x4xi32>) {
	// CHECK: %[[TENSOR:.*]] = tensor_load %[[MEMREF:.*]] : memref<4x4xi32>			// CHECK: %[[TENSOR:.*]] = tensor_load %[[MEMREF:.*]] : memref<4x4xi32>
	%1 = tensor_load %0 : memref<4x4xi32>			%1 = tensor_load %0 : memref<4x4xi32>
	// CHECK: tensor_store %[[TENSOR]], %[[MEMREF]] : memref<4x4xi32>			// CHECK: tensor_store %[[TENSOR]], %[[MEMREF]] : memref<4x4xi32>
	tensor_store %1, %0 : memref<4x4xi32>			tensor_store %1, %0 : memref<4x4xi32>
	return			return
	}			}

				// CHECK-LABEL: func @atomic_rmw
				func @atomic_rmw(%I: memref<10xf32>, %val: f32, %i : index) {
				// CHECK: %{{.*}} = atomic_rmw "addf" %{{.*}}, %{{.*}}[%{{.*}}]
				%x = atomic_rmw "addf" %val, %I[%i] : (f32, memref<10xf32>) -> f32
				return
				}

	// CHECK-LABEL: func @assume_alignment			// CHECK-LABEL: func @assume_alignment
	// CHECK-SAME: %[[MEMREF:.*]]: memref<4x4xf16>			// CHECK-SAME: %[[MEMREF:.*]]: memref<4x4xf16>
	func @assume_alignment(%0: memref<4x4xf16>) {			func @assume_alignment(%0: memref<4x4xf16>) {
	// CHECK: assume_alignment %[[MEMREF]], 16 : memref<4x4xf16>			// CHECK: assume_alignment %[[MEMREF]], 16 : memref<4x4xf16>
	assume_alignment %0, 16 : memref<4x4xf16>			assume_alignment %0, 16 : memref<4x4xf16>
	return			return
	}			}

mlir/test/IR/invalid-ops.mlir

Show First 20 Lines • Show All 1,033 Lines • ▼ Show 20 Lines	func @invalid_memref_cast() {
%1 = memref_cast %0 : memref<2x5xf32, 0> to memref<*xf32, 0>		%1 = memref_cast %0 : memref<2x5xf32, 0> to memref<*xf32, 0>
// expected-error@+1 {{operand type 'memref<*xf32>' and result type 'memref<*xf32>' are cast incompatible}}		// expected-error@+1 {{operand type 'memref<*xf32>' and result type 'memref<*xf32>' are cast incompatible}}
%2 = memref_cast %1 : memref<*xf32, 0> to memref<*xf32, 0>		%2 = memref_cast %1 : memref<*xf32, 0> to memref<*xf32, 0>
return		return
}		}

// -----		// -----

		func @atomic_rmw_idxs_rank_mismatch(%I: memref<16x10xf32>, %i : index, %val : f32) {
		// expected-error@+1 {{expects the number of subscripts to be equal to memref rank}}
		%x = atomic_rmw "addf" %val, %I[%i] : (f32, memref<16x10xf32>) -> f32
		return
		}

		// -----

		func @atomic_rmw_expects_float(%I: memref<16x10xi32>, %i : index, %val : i32) {
		// expected-error@+1 {{expects a floating-point type}}
		%x = atomic_rmw "addf" %val, %I[%i, %i] : (i32, memref<16x10xi32>) -> i32
		return
		}

		// -----

		func @atomic_rmw_expects_int(%I: memref<16x10xf32>, %i : index, %val : f32) {
		// expected-error@+1 {{expects an integer type}}
		%x = atomic_rmw "addi" %val, %I[%i, %i] : (f32, memref<16x10xf32>) -> f32
		return
		}

		// -----

// alignment is not power of 2.		// alignment is not power of 2.
func @assume_alignment(%0: memref<4x4xf16>) {		func @assume_alignment(%0: memref<4x4xf16>) {
// expected-error@+1 {{alignment must be power of 2}}		// expected-error@+1 {{alignment must be power of 2}}
std.assume_alignment %0, 12 : memref<4x4xf16>		std.assume_alignment %0, 12 : memref<4x4xf16>
return		return
}		}

// -----		// -----

// 0 alignment value.		// 0 alignment value.
func @assume_alignment(%0: memref<4x4xf16>) {		func @assume_alignment(%0: memref<4x4xf16>) {
// expected-error@+1 {{'std.assume_alignment' op attribute 'alignment' failed to satisfy constraint: positive 32-bit integer attribute}}		// expected-error@+1 {{'std.assume_alignment' op attribute 'alignment' failed to satisfy constraint: positive 32-bit integer attribute}}
std.assume_alignment %0, 0 : memref<4x4xf16>		std.assume_alignment %0, 0 : memref<4x4xf16>
return		return
}		}
