This is an archive of the discontinued LLVM Phabricator instance.

Differential D82647

[mlir] support returning unranked memrefs
ClosedPublic

Authored by ftynse on Jun 26 2020, 5:35 AM.

Details

Reviewers

herhut
pifon2a

Commits

rG6323065fd602: [mlir] support returning unranked memrefs

Summary

Initially, unranked memref descriptors in the LLVM dialect were designed only
to be passed into functions. An assertion was guarding against returning
unranked memrefs from functions in the standard-to-LLVM conversion. This is
insufficient for functions that wish to return an unranked memref such that the
caller does not know the rank in advance, and hence cannot allocate the
descriptor and pass it in as an argument.

Introduce a calling convention for returning unranked memref descriptors as
follows. An unranked memref descriptor always points to a ranked memref
descriptor stored on the stack of the current function. When an unranked memref
descriptor is returned from a function, the ranked memref descriptor it points
to is copied to dynamically allocated memory, the ownership of which is
transferred to the caller. The caller is responsible for deallocating the
dynamically allocated memory and for copying the pointed-to ranked memref
descriptor onto its stack.

Provide default lowerings for std.return, std.call and std.call_indirect that
maintain the convention defined above.

This convention is additionally exercised by a runtime test to guard against
memory errors.
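
The convention described in this summary can be sketched in plain C++, outside of MLIR. This is only an illustration under stated assumptions: `UnrankedDescriptor`, `rankedDescriptorSize`, `returnUnranked` and `receiveUnranked` are hypothetical names that do not appear in the patch, the index type is assumed to be 64 bits wide, and caller-provided storage stands in for the caller's stack allocation. The size formula mirrors the one in `computeSizes` from this diff, which assumes a densely packed ranked descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the unranked descriptor `{ i64, i8* }`.
struct UnrankedDescriptor {
  std::int64_t rank;
  void *ranked; // type-erased pointer to the ranked descriptor
};

// Size in bytes of a ranked descriptor laid out as
// { ptr, ptr, index, index[rank], index[rank] }, assuming dense packing
// and a 64-bit index type (both are assumptions of this sketch).
std::size_t rankedDescriptorSize(std::int64_t rank) {
  return 2 * sizeof(void *) +
         (1 + 2 * static_cast<std::size_t>(rank)) * sizeof(std::int64_t);
}

// Callee side: immediately before returning, copy the ranked descriptor
// into dynamically allocated memory and hand ownership to the caller.
UnrankedDescriptor returnUnranked(UnrankedDescriptor onStack) {
  std::size_t size = rankedDescriptorSize(onStack.rank);
  void *heap = std::malloc(size);
  assert(heap && "allocation failed");
  std::memcpy(heap, onStack.ranked, size);
  // Build a new descriptor instead of mutating the old one.
  return UnrankedDescriptor{onStack.rank, heap};
}

// Caller side: copy the ranked descriptor into caller-owned storage
// (modelling the caller's stack), then free the callee's allocation.
UnrankedDescriptor receiveUnranked(UnrankedDescriptor returned,
                                   void *callerStorage) {
  std::size_t size = rankedDescriptorSize(returned.rank);
  std::memcpy(callerStorage, returned.ranked, size);
  std::free(returned.ranked);
  return UnrankedDescriptor{returned.rank, callerStorage};
}
```

In this sketch the caller has to supply at least `rankedDescriptorSize(rank)` bytes of storage. The actual lowering emits a dynamically sized `alloca` instead, which has no portable C++ equivalent.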

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ftynse created this revision. Jun 26 2020, 5:35 AM

Herald added a project: Restricted Project. Jun 26 2020, 5:35 AM

Herald added subscribers: msifontes, jurahul, Kayjukh and 14 others.

Thanks for adding this and unblocking the use of unranked results!

This can be optimized in many ways to reduce the number of allocations that are required. Some ideas we discussed offline (just to document them):

only allocate once for all returned descriptors
the caller allocates some scratchpad memory on its stack and passes it in as arg 0 if the called function returns unranked memrefs. The size could be chosen so that it can hold, e.g., rank-5 descriptors for all results. That memory is then used for return values, and we allocate only if it is not large enough.

However, that is future work. Getting it supported and correct is more important.
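
For reference, the scratchpad idea could look roughly like the sketch below. It is not part of this patch and only illustrates the fallback logic: `Placement` and `placeDescriptor` are hypothetical names, and the caller-provided scratchpad is modelled as a plain pointer plus a size in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Where a returned descriptor ended up, and whether the caller must free it.
struct Placement {
  void *memory;
  bool heapAllocated;
};

// Copies `size` bytes of a ranked descriptor into the caller-provided
// scratchpad when it fits, and falls back to malloc otherwise.
// Precondition of this sketch: size > 0.
Placement placeDescriptor(const void *descriptor, std::size_t size,
                          void *scratch, std::size_t scratchSize) {
  bool fits = size <= scratchSize;
  void *target = fits ? scratch : std::malloc(size);
  assert(target && "no memory available for the descriptor");
  std::memcpy(target, descriptor, size);
  return Placement{target, !fits};
}
```

With such a scheme the caller would call `free` only when `heapAllocated` is set, so the common low-rank case would need no dynamic allocation at all.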

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp
1918	nit: remove one `descriptors`.
1927	Nit: operands.
1966	Move out of loop?
2050	Not doing an early return would avoid duplication here.

This revision is now accepted and ready to land. Jun 26 2020, 6:25 AM

ftynse added a child revision: D82656: [mlir] Modernize LLVM dialect rountrip test. Jun 26 2020, 6:28 AM

Closed by commit rG6323065fd602: [mlir] support returning unranked memrefs (authored by ftynse). Jun 26 2020, 7:03 AM

This revision was automatically updated to reflect the committed changes.

ftynse marked 2 inline comments as done.

Harbormaster completed remote builds in B61917: Diff 273692. Jun 26 2020, 7:37 AM

rriddle added inline comments. Jun 26 2020, 12:01 PM

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp
1930	nit: Flip the condition and drop the braces?

mehdi_amini added inline comments. Jun 29 2020, 8:34 PM

mlir/docs/ConversionToLLVMDialect.md
323	Is this correct? `For unranked memrefs [...] same as the unranked memref descriptor`
325	`stack allocation` threw me off at first because I was wondering what prevents a heap alloc.
328	This last sentence is really not clear to me. What's specific about thread safety here? What about removing stack allocations? What should a caller do here? Is this about optimization?
376	I don't understand how the fact that it is allocated on the stack or not is part of the convention? The lifetime aspect is also not clear to me: is the motivation to be able to reuse this stack memory in the function itself when it is done with the memref?
378	The part about stack overflows seems to indicate that the lifetime is about the caller and not the callee. This seems all implementation details of the lowering and not part of the calling convention.
381	Typo: `accordingly`

Fixed some in 4ab43980450baf3c49bebbc526c6c96c3ed9f06e

mlir/docs/ConversionToLLVMDialect.md
323	Yes, it's a calling convention about unpacking the `{ i64, i8* }` struct into two separate arguments
325	If we allocate on heap when lowering the cast, we need some lifetime analysis to understand where to insert the corresponding deallocation. Allocation on stack does not have this problem.
328	This was here way before the current patch. Thread safety was a concern when we were passing a pointer to the descriptor that could be mutated, but I don't think it's a problem in the current model. This should be rephrased as something like "the caller is in charge of managing the allocation".
376	The motivation is that the caller guarantees to the callee that the pointer lives longer than the callee.
378	I can factor this out into some "unranked memref lifetime management" section, but it will only scatter the information that is relevant here.

mehdi_amini added inline comments. Jun 30 2020, 1:58 PM

mlir/docs/ConversionToLLVMDialect.md
325	I guess you're trying to document some specific lowering while I'm reading "calling convention" independently of any lowering.
328	Makes sense.
376	OK but the "stack" aspect still does not seem to belong, and the sentence is overly restrictive: as written, it says that the lifetime ends with the function instead of outliving the function. This is a big difference because inside the function it impacts whether I can clobber the memory or not.
378	The issue to me (across all these comments above) is really separating the specification of the "calling convention" from a particular lowering strategy.

ftynse marked an inline comment as done. Jul 1 2020, 1:01 AM

ftynse added inline comments.

mlir/docs/ConversionToLLVMDialect.md
325	Well, this is a side effect of MLIR: a calling convention is nothing else but the set of compatible patterns lowering std.func, std.call, std.indirect_call and std.return. I don't think it's really separable, and there are users that use different patterns and get a different convention (eg "bare pointer")

mehdi_amini added inline comments. Jul 4 2020, 2:46 PM

mlir/docs/ConversionToLLVMDialect.md
325	I am not sure I follow: the fact that the caller is turned into a stack allocation seems like a pure implementation detail of a lowering pattern. The signature of the function (the actual calling convention) isn't: it describes how anyone could implement their lowering to call such a function, or to lower their own function definition so that they can be called by the standard call. You seem to conflate both aspects together here, while they are fundamentally different to me.

Revision Contents

Path	Size
mlir/docs/ConversionToLLVMDialect.md	20 lines
mlir/include/mlir/Conversion/StandardToLLVM/ConvertStandardToLLVM.h	10 lines
mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td	7 lines
mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp	182 lines
mlir/test/Conversion/StandardToLLVM/calling-convention.mlir	131 lines
mlir/test/Dialect/LLVMIR/roundtrip.mlir	19 lines
mlir/test/Target/llvmir-intrinsics.mlir	13 lines
mlir/test/mlir-cpu-runner/unranked_memref.mlir	47 lines

Diff 273692

mlir/docs/ConversionToLLVMDialect.md

Show First 20 Lines • Show All 240 Lines • ▼ Show 20 Lines	func @bar() {
%4 = llvm.extractvalue %2[1] : !llvm<"{i32, i64}">		%4 = llvm.extractvalue %2[1] : !llvm<"{i32, i64}">

// use as before		// use as before
"use_i32"(%3) : (!llvm.i32) -> ()		"use_i32"(%3) : (!llvm.i32) -> ()
"use_i64"(%4) : (!llvm.i64) -> ()		"use_i64"(%4) : (!llvm.i64) -> ()
}		}
```		```

### Calling Convention for `memref`		### Calling Convention for Ranked `memref`

Function _arguments_ of `memref` type, ranked or unranked, are _expanded_ into a		Function _arguments_ of `memref` type, ranked or unranked, are _expanded_ into a
list of arguments of non-aggregate types that the memref descriptor defined		list of arguments of non-aggregate types that the memref descriptor defined
above comprises. That is, the outer struct type and the inner array types are		above comprises. That is, the outer struct type and the inner array types are
replaced with individual arguments.		replaced with individual arguments.

This convention is implemented in the conversion of `std.func` and `std.call` to		This convention is implemented in the conversion of `std.func` and `std.call` to
the LLVM dialect, with the former unpacking the descriptor into a set of		the LLVM dialect, with the former unpacking the descriptor into a set of
▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	llvm.func @bar() {

// Pass individual values to the callee.		// Pass individual values to the callee.
llvm.call @foo(%1, %2, %3, %4, %5) : (!llvm<"float">, !llvm<"float">, !llvm.i64, !llvm.i64, !llvm.i64) -> ()		llvm.call @foo(%1, %2, %3, %4, %5) : (!llvm<"float">, !llvm<"float">, !llvm.i64, !llvm.i64, !llvm.i64) -> ()
llvm.return		llvm.return
}		}

```		```

For unranked memrefs, the list of function arguments always contains two		### Calling Convention for Unranked `memref`

		For unranked memrefs, the list of function arguments always contains two
elements, same as the unranked memref descriptor: an integer rank, and a		elements, same as the unranked memref descriptor: an integer rank, and a
type-erased (`!llvm<"i8*">`) pointer to the ranked memref descriptor. Note that		type-erased (`!llvm<"i8*">`) pointer to the ranked memref descriptor. Note that
while the _calling convention_ does not require stack allocation, _casting_ to		while the _calling convention_ does not require stack allocation, _casting_ to
unranked memref does since one cannot take an address of an SSA value containing		unranked memref does since one cannot take an address of an SSA value containing
the ranked memref. The caller is in charge of ensuring the thread safety and		the ranked memref. The caller is in charge of ensuring the thread safety and
eventually removing unnecessary stack allocations in cast operations.		eventually removing unnecessary stack allocations in cast operations.

Example		Example

```mlir		```mlir
llvm.func @foo(%arg0: memref<*xf32>) -> () {		llvm.func @foo(%arg0: memref<*xf32>) -> () {
"use"(%arg0) : (memref<*xf32>) -> ()		"use"(%arg0) : (memref<*xf32>) -> ()
return		return
}		}
Show All 29 Lines	llvm.func @bar() {
%2 = llvm.extractvalue %0[1] : !llvm<"{ i64, i8* }">		%2 = llvm.extractvalue %0[1] : !llvm<"{ i64, i8* }">

// Pass individual values to the callee.		// Pass individual values to the callee.
llvm.call @foo(%1, %2) : (!llvm.i64, !llvm<"i8*">)		llvm.call @foo(%1, %2) : (!llvm.i64, !llvm<"i8*">)
llvm.return		llvm.return
}		}
```		```

		Lifetime. The second element of the unranked memref descriptor points to
		some memory in which the ranked memref descriptor is stored. By convention, this
		memory is allocated on stack and has the lifetime of the function. (Note: due
		to function-length lifetime, creation of multiple unranked memref descriptors,
		e.g., in a loop, may lead to stack overflows.) If an unranked descriptor has to
		be returned from a function, the ranked descriptor it points to is copied into
		dynamically allocated memory, and the pointer in the unranked descriptor is
		updated accodingly. The allocation happens immediately before returning. It is
		the responsibility of the caller to free the dynamically allocated memory. The
		default conversion of `std.call` and `std.call_indirect` copies the ranked
		descriptor to newly allocated memory on the caller's stack. Thus, the convention
		of the ranked memref descriptor pointed to by an unranked memref descriptor
		being stored on stack is respected.

*This convention may or may not apply if the conversion of MemRef types is		*This convention may or may not apply if the conversion of MemRef types is
overridden by the user.*		overridden by the user.*

### C-compatible wrapper emission		### C-compatible wrapper emission

In practical cases, it may be desirable to have externally-facing functions with		In practical cases, it may be desirable to have externally-facing functions with
a single attribute corresponding to a MemRef argument. When interfacing with		a single attribute corresponding to a MemRef argument. When interfacing with
LLVM IR produced from C, the code needs to respect the corresponding calling		LLVM IR produced from C, the code needs to respect the corresponding calling
▲ Show 20 Lines • Show All 266 Lines • Show Last 20 Lines

mlir/include/mlir/Conversion/StandardToLLVM/ConvertStandardToLLVM.h

Show First 20 Lines • Show All 123 Lines • ▼ Show 20 Lines	public:

/// Gets the LLVM representation of the index type. The returned type is an		/// Gets the LLVM representation of the index type. The returned type is an
/// integer type with the size configured for this type converter.		/// integer type with the size configured for this type converter.
LLVM::LLVMType getIndexType();		LLVM::LLVMType getIndexType();

/// Gets the bitwidth of the index type when converted to LLVM.		/// Gets the bitwidth of the index type when converted to LLVM.
unsigned getIndexTypeBitwidth() { return customizations.indexBitwidth; }		unsigned getIndexTypeBitwidth() { return customizations.indexBitwidth; }

		/// Gets the pointer bitwidth.
		unsigned getPointerBitwidth(unsigned addressSpace = 0);

protected:		protected:
/// LLVM IR module used to parse/create types.		/// LLVM IR module used to parse/create types.
llvm::Module *module;		llvm::Module *module;
LLVM::LLVMDialect *llvmDialect;		LLVM::LLVMDialect *llvmDialect;

private:		private:
/// Convert a function type. The arguments and results are converted one by		/// Convert a function type. The arguments and results are converted one by
/// one. Additionally, if the function returns more than one value, pack the		/// one. Additionally, if the function returns more than one value, pack the
▲ Show 20 Lines • Show All 241 Lines • ▼ Show 20 Lines	public:
/// Builds IR extracting individual elements that compose an unranked memref		/// Builds IR extracting individual elements that compose an unranked memref
/// descriptor and returns them as `results` list.		/// descriptor and returns them as `results` list.
static void unpack(OpBuilder &builder, Location loc, Value packed,		static void unpack(OpBuilder &builder, Location loc, Value packed,
SmallVectorImpl<Value> &results);		SmallVectorImpl<Value> &results);

/// Returns the number of non-aggregate values that would be produced by		/// Returns the number of non-aggregate values that would be produced by
/// `unpack`.		/// `unpack`.
static unsigned getNumUnpackedValues() { return 2; }		static unsigned getNumUnpackedValues() { return 2; }

		/// Builds IR computing the sizes in bytes (suitable for opaque allocation)
		/// and appends the corresponding values into `sizes`.
		static void computeSizes(OpBuilder &builder, Location loc,
		LLVMTypeConverter &typeConverter,
		ArrayRef<UnrankedMemRefDescriptor> values,
		SmallVectorImpl<Value> &sizes);
};		};

/// Base class for operation conversions targeting the LLVM IR dialect. Provides		/// Base class for operation conversions targeting the LLVM IR dialect. Provides
/// conversion patterns with access to an LLVMTypeConverter.		/// conversion patterns with access to an LLVMTypeConverter.
class ConvertToLLVMPattern : public ConversionPattern {		class ConvertToLLVMPattern : public ConversionPattern {
public:		public:
ConvertToLLVMPattern(StringRef rootOpName, MLIRContext *context,		ConvertToLLVMPattern(StringRef rootOpName, MLIRContext *context,
LLVMTypeConverter &typeConverter,		LLVMTypeConverter &typeConverter,
▲ Show 20 Lines • Show All 136 Lines • Show Last 20 Lines

mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td

Show First 20 Lines • Show All 788 Lines • ▼ Show 20 Lines	def LLVM_Prefetch : LLVM_ZeroResultIntrOp<"prefetch", [0]>,
Arguments<(ins LLVM_Type:$addr, LLVM_Type:$rw,		Arguments<(ins LLVM_Type:$addr, LLVM_Type:$rw,
LLVM_Type:$hint, LLVM_Type:$cache)>;		LLVM_Type:$hint, LLVM_Type:$cache)>;
def LLVM_SinOp : LLVM_UnaryIntrinsicOp<"sin">;		def LLVM_SinOp : LLVM_UnaryIntrinsicOp<"sin">;
def LLVM_SqrtOp : LLVM_UnaryIntrinsicOp<"sqrt">;		def LLVM_SqrtOp : LLVM_UnaryIntrinsicOp<"sqrt">;
def LLVM_PowOp : LLVM_BinarySameArgsIntrinsicOp<"pow">;		def LLVM_PowOp : LLVM_BinarySameArgsIntrinsicOp<"pow">;
def LLVM_BitReverseOp : LLVM_UnaryIntrinsicOp<"bitreverse">;		def LLVM_BitReverseOp : LLVM_UnaryIntrinsicOp<"bitreverse">;
def LLVM_CtPopOp : LLVM_UnaryIntrinsicOp<"ctpop">;		def LLVM_CtPopOp : LLVM_UnaryIntrinsicOp<"ctpop">;

		def LLVM_MemcpyOp : LLVM_ZeroResultIntrOp<"memcpy", [0, 1, 2]>,
		Arguments<(ins LLVM_Type:$dst, LLVM_Type:$src,
		LLVM_Type:$len, LLVM_Type:$isVolatile)>;
		def LLVM_MemcpyInlineOp : LLVM_ZeroResultIntrOp<"memcpy.inline", [0, 1, 2]>,
		Arguments<(ins LLVM_Type:$dst, LLVM_Type:$src,
		LLVM_Type:$len, LLVM_Type:$isVolatile)>;

//		//
// Vector Reductions.		// Vector Reductions.
//		//

def LLVM_experimental_vector_reduce_add : LLVM_VectorReduction<"add">;		def LLVM_experimental_vector_reduce_add : LLVM_VectorReduction<"add">;
def LLVM_experimental_vector_reduce_and : LLVM_VectorReduction<"and">;		def LLVM_experimental_vector_reduce_and : LLVM_VectorReduction<"and">;
def LLVM_experimental_vector_reduce_mul : LLVM_VectorReduction<"mul">;		def LLVM_experimental_vector_reduce_mul : LLVM_VectorReduction<"mul">;
def LLVM_experimental_vector_reduce_fmax : LLVM_VectorReduction<"fmax">;		def LLVM_experimental_vector_reduce_fmax : LLVM_VectorReduction<"fmax">;
▲ Show 20 Lines • Show All 221 Lines • Show Last 20 Lines

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp

Show All 18 Lines
#include "mlir/IR/Attributes.h"		#include "mlir/IR/Attributes.h"
#include "mlir/IR/BlockAndValueMapping.h"		#include "mlir/IR/BlockAndValueMapping.h"
#include "mlir/IR/Builders.h"		#include "mlir/IR/Builders.h"
#include "mlir/IR/MLIRContext.h"		#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/Module.h"		#include "mlir/IR/Module.h"
#include "mlir/IR/PatternMatch.h"		#include "mlir/IR/PatternMatch.h"
#include "mlir/IR/TypeUtilities.h"		#include "mlir/IR/TypeUtilities.h"
#include "mlir/Support/LogicalResult.h"		#include "mlir/Support/LogicalResult.h"
		#include "mlir/Support/MathExtras.h"
#include "mlir/Transforms/DialectConversion.h"		#include "mlir/Transforms/DialectConversion.h"
#include "mlir/Transforms/Passes.h"		#include "mlir/Transforms/Passes.h"
#include "mlir/Transforms/Utils.h"		#include "mlir/Transforms/Utils.h"
#include "llvm/ADT/TypeSwitch.h"		#include "llvm/ADT/TypeSwitch.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
▲ Show 20 Lines • Show All 144 Lines • ▼ Show 20 Lines
llvm::LLVMContext &LLVMTypeConverter::getLLVMContext() {		llvm::LLVMContext &LLVMTypeConverter::getLLVMContext() {
return module->getContext();		return module->getContext();
}		}

LLVM::LLVMType LLVMTypeConverter::getIndexType() {		LLVM::LLVMType LLVMTypeConverter::getIndexType() {
return LLVM::LLVMType::getIntNTy(llvmDialect, getIndexTypeBitwidth());		return LLVM::LLVMType::getIntNTy(llvmDialect, getIndexTypeBitwidth());
}		}

		unsigned LLVMTypeConverter::getPointerBitwidth(unsigned addressSpace) {
		return module->getDataLayout().getPointerSizeInBits(addressSpace);
		}

Type LLVMTypeConverter::convertIndexType(IndexType type) {		Type LLVMTypeConverter::convertIndexType(IndexType type) {
return getIndexType();		return getIndexType();
}		}

Type LLVMTypeConverter::convertIntegerType(IntegerType type) {		Type LLVMTypeConverter::convertIntegerType(IntegerType type) {
return LLVM::LLVMType::getIntNTy(llvmDialect, type.getWidth());		return LLVM::LLVMType::getIntNTy(llvmDialect, type.getWidth());
}		}

▲ Show 20 Lines • Show All 569 Lines • ▼ Show 20 Lines	void UnrankedMemRefDescriptor::unpack(OpBuilder &builder, Location loc,
Value packed,		Value packed,
SmallVectorImpl<Value> &results) {		SmallVectorImpl<Value> &results) {
UnrankedMemRefDescriptor d(packed);		UnrankedMemRefDescriptor d(packed);
results.reserve(results.size() + 2);		results.reserve(results.size() + 2);
results.push_back(d.rank(builder, loc));		results.push_back(d.rank(builder, loc));
results.push_back(d.memRefDescPtr(builder, loc));		results.push_back(d.memRefDescPtr(builder, loc));
}		}

		void UnrankedMemRefDescriptor::computeSizes(
		OpBuilder &builder, Location loc, LLVMTypeConverter &typeConverter,
		ArrayRef<UnrankedMemRefDescriptor> values, SmallVectorImpl<Value> &sizes) {
		if (values.empty())
		return;

		// Cache the index type.
		LLVM::LLVMType indexType = typeConverter.getIndexType();

		// Initialize shared constants.
		Value one = createIndexAttrConstant(builder, loc, indexType, 1);
		Value two = createIndexAttrConstant(builder, loc, indexType, 2);
		Value pointerSize = createIndexAttrConstant(
		builder, loc, indexType, ceilDiv(typeConverter.getPointerBitwidth(), 8));
		Value indexSize =
		createIndexAttrConstant(builder, loc, indexType,
		ceilDiv(typeConverter.getIndexTypeBitwidth(), 8));

		sizes.reserve(sizes.size() + values.size());
		for (UnrankedMemRefDescriptor desc : values) {
		// Emit IR computing the memory necessary to store the descriptor. This
		// assumes the descriptor to be
		// { type, type, index, index[rank], index[rank] }
		// and densely packed, so the total size is
		// 2 * sizeof(pointer) + (1 + 2 * rank) * sizeof(index).
		// TODO: consider including the actual size (including eventual padding due
		// to data layout) into the unranked descriptor.
		Value doublePointerSize =
		builder.create<LLVM::MulOp>(loc, indexType, two, pointerSize);

		// (1 + 2 * rank) * sizeof(index)
		Value rank = desc.rank(builder, loc);
		Value doubleRank = builder.create<LLVM::MulOp>(loc, indexType, two, rank);
		Value doubleRankIncremented =
		builder.create<LLVM::AddOp>(loc, indexType, doubleRank, one);
		Value rankIndexSize = builder.create<LLVM::MulOp>(
		loc, indexType, doubleRankIncremented, indexSize);

		// Total allocation size.
		Value allocationSize = builder.create<LLVM::AddOp>(
		loc, indexType, doublePointerSize, rankIndexSize);
		sizes.push_back(allocationSize);
		}
		}

LLVM::LLVMDialect &ConvertToLLVMPattern::getDialect() const {		LLVM::LLVMDialect &ConvertToLLVMPattern::getDialect() const {
return *typeConverter.getDialect();		return *typeConverter.getDialect();
}		}

llvm::LLVMContext &ConvertToLLVMPattern::getContext() const {		llvm::LLVMContext &ConvertToLLVMPattern::getContext() const {
return typeConverter.getLLVMContext();		return typeConverter.getLLVMContext();
}		}

▲ Show 20 Lines • Show All 1,078 Lines • ▼ Show 20 Lines
struct AllocOpLowering : public AllocLikeOpLowering<AllocOp> {		struct AllocOpLowering : public AllocLikeOpLowering<AllocOp> {
explicit AllocOpLowering(LLVMTypeConverter &converter,		explicit AllocOpLowering(LLVMTypeConverter &converter,
bool useAlignedAlloc = false)		bool useAlignedAlloc = false)
: AllocLikeOpLowering<AllocOp>(converter, useAlignedAlloc) {}		: AllocLikeOpLowering<AllocOp>(converter, useAlignedAlloc) {}
};		};

using AllocaOpLowering = AllocLikeOpLowering<AllocaOp>;		using AllocaOpLowering = AllocLikeOpLowering<AllocaOp>;

		/// Copies the shaped descriptor part to (if `toDynamic` is set) or from
		/// (otherwise) the dynamically allocated memory for any operands that were
		/// unranked descriptors descriptors originally.
		static LogicalResult copyUnrankedDescriptors(OpBuilder &builder, Location loc,
		LLVMTypeConverter &typeConverter,
		TypeRange origTypes,
		SmallVectorImpl<Value> &operands,
		bool toDynamic) {
		assert(origTypes.size() == operands.size() &&
		"expected as may original types as operands");

		// Find opreands of unranked memref type and store them.
		SmallVector<UnrankedMemRefDescriptor, 4> unrankedMemrefs;
		for (unsigned i = 0, e = operands.size(); i < e; ++i) {
		if (!origTypes[i].isa<UnrankedMemRefType>())
		continue;
		unrankedMemrefs.emplace_back(operands[i]);
		}

		if (unrankedMemrefs.empty())
		return success();

		// Compute allocation sizes.
		SmallVector<Value, 4> sizes;
		UnrankedMemRefDescriptor::computeSizes(builder, loc, typeConverter,
		unrankedMemrefs, sizes);

		// Find the malloc and free.
		auto module = builder.getInsertionPoint()->getParentOfType<ModuleOp>();
		auto mallocFunc = module.lookupSymbol<LLVM::LLVMFuncOp>("malloc");
		auto freeFunc = module.lookupSymbol<LLVM::LLVMFuncOp>("free");

		// Get frequently used types.
		auto voidType = LLVM::LLVMType::getVoidTy(typeConverter.getDialect());
		auto voidPtrType = LLVM::LLVMType::getInt8PtrTy(typeConverter.getDialect());
		auto i1Type = LLVM::LLVMType::getInt1Ty(typeConverter.getDialect());
		LLVM::LLVMType indexType = typeConverter.getIndexType();

		// Initialize shared constants.
		Value zero =
		builder.create<LLVM::ConstantOp>(loc, i1Type, builder.getBoolAttr(false));

		unsigned unrankedMemrefPos = 0;
		for (unsigned i = 0, e = operands.size(); i < e; ++i) {
		Type type = origTypes[i];
		if (!type.isa<UnrankedMemRefType>())
		continue;
		Value allocationSize = sizes[unrankedMemrefPos++];
		UnrankedMemRefDescriptor desc(operands[i]);

		// If nobody declared "malloc" or "free" yet, do so.
		herhut (inline comment, not done): Move out of loop?
		if (!mallocFunc && toDynamic) {
		OpBuilder::InsertionGuard guard(builder);
		builder.setInsertionPointToStart(module.getBody());
		mallocFunc = builder.create<LLVM::LLVMFuncOp>(
		builder.getUnknownLoc(), "malloc",
		LLVM::LLVMType::getFunctionTy(
		voidPtrType, llvm::makeArrayRef(indexType), /*isVarArg=*/false));
		}
		if (!freeFunc && !toDynamic) {
		OpBuilder::InsertionGuard guard(builder);
		builder.setInsertionPointToStart(module.getBody());
		freeFunc = builder.create<LLVM::LLVMFuncOp>(
		builder.getUnknownLoc(), "free",
		LLVM::LLVMType::getFunctionTy(
		voidType, llvm::makeArrayRef(voidPtrType), /*isVarArg=*/false));
		}

		// Allocate memory, copy, and free the source if necessary.
		Value memory =
		toDynamic
		? builder.create<LLVM::CallOp>(loc, mallocFunc, allocationSize)
		.getResult(0)
		: builder.create<LLVM::AllocaOp>(loc, voidPtrType, allocationSize,
		/*alignment=*/0);

		Value source = desc.memRefDescPtr(builder, loc);
		builder.create<LLVM::MemcpyOp>(loc, memory, source, allocationSize, zero);
		if (!toDynamic)
		builder.create<LLVM::CallOp>(loc, freeFunc, source);

		// Create a new descriptor. The same descriptor can be returned multiple
		// times, attempting to modify its pointer can lead to memory leaks
		// (allocated twice and overwritten) or double frees (the caller does not
		// know if the descriptor points to the same memory).
		Type descriptorType = typeConverter.convertType(type);
		if (!descriptorType)
		return failure();
		auto updatedDesc =
		UnrankedMemRefDescriptor::undef(builder, loc, descriptorType);
		Value rank = desc.rank(builder, loc);
		updatedDesc.setRank(builder, loc, rank);
		updatedDesc.setMemRefDescPtr(builder, loc, memory);

		operands[i] = updatedDesc;
		}

		return success();
		}
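
The ownership transfer that `copyUnrankedDescriptors` emits can be modeled in plain C++. The following is only a sketch under stated assumptions, not the MLIR implementation: `UnrankedDesc`, `copyToHeap` and `copyToCallerAndFree` are hypothetical names, the descriptor size is passed in explicitly instead of being computed from the rank, and a `std::vector` stands in for the caller's `alloca`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the unranked descriptor: a rank plus an opaque
// pointer to a ranked descriptor.
struct UnrankedDesc {
  int64_t rank;
  void *ranked;
};

// Callee side (toDynamic = true): copy the pointed-to ranked descriptor into
// malloc'ed memory and build a fresh descriptor that points to the copy.
UnrankedDesc copyToHeap(const UnrankedDesc &src, std::size_t bytes) {
  void *memory = std::malloc(bytes);
  assert(memory && "allocation failed");
  std::memcpy(memory, src.ranked, bytes);
  return UnrankedDesc{src.rank, memory};
}

// Caller side (toDynamic = false): copy the ranked descriptor into
// caller-owned storage (modeling the alloca), then free the heap copy that
// the callee handed over.
UnrankedDesc copyToCallerAndFree(const UnrankedDesc &src, std::size_t bytes,
                                 std::vector<unsigned char> &storage) {
  storage.resize(bytes);
  std::memcpy(storage.data(), src.ranked, bytes);
  std::free(src.ranked);
  return UnrankedDesc{src.rank, storage.data()};
}
```

In both directions a new descriptor value is built rather than mutating the source, which mirrors the comment above about avoiding leaks and double frees when the same descriptor is returned more than once.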

// A CallOp automatically promotes MemRefType to a sequence of alloca/store and		// A CallOp automatically promotes MemRefType to a sequence of alloca/store and
// passes the pointer to the MemRef across function boundaries.		// passes the pointer to the MemRef across function boundaries.
template <typename CallOpType>		template <typename CallOpType>
struct CallOpInterfaceLowering : public ConvertOpToLLVMPattern<CallOpType> {		struct CallOpInterfaceLowering : public ConvertOpToLLVMPattern<CallOpType> {
using ConvertOpToLLVMPattern<CallOpType>::ConvertOpToLLVMPattern;		using ConvertOpToLLVMPattern<CallOpType>::ConvertOpToLLVMPattern;
using Super = CallOpInterfaceLowering<CallOpType>;		using Super = CallOpInterfaceLowering<CallOpType>;
using Base = ConvertOpToLLVMPattern<CallOpType>;		using Base = ConvertOpToLLVMPattern<CallOpType>;

LogicalResult		LogicalResult
matchAndRewrite(Operation *op, ArrayRef<Value> operands,		matchAndRewrite(Operation *op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
typename CallOpType::Adaptor transformed(operands);		typename CallOpType::Adaptor transformed(operands);
auto callOp = cast<CallOpType>(op);		auto callOp = cast<CallOpType>(op);

// Pack the result types into a struct.		// Pack the result types into a struct.
Type packedResult;		Type packedResult;
unsigned numResults = callOp.getNumResults();		unsigned numResults = callOp.getNumResults();
auto resultTypes = llvm::to_vector<4>(callOp.getResultTypes());		auto resultTypes = llvm::to_vector<4>(callOp.getResultTypes());

for (Type resType : resultTypes) {
assert(!resType.isa<UnrankedMemRefType>() &&
"Returning unranked memref is not supported. Pass result as an"
"argument instead.");
(void)resType;
}

if (numResults != 0) {		if (numResults != 0) {
if (!(packedResult =		if (!(packedResult =
this->typeConverter.packFunctionResults(resultTypes)))		this->typeConverter.packFunctionResults(resultTypes)))
return failure();		return failure();
}		}

auto promoted = this->typeConverter.promoteMemRefDescriptors(		auto promoted = this->typeConverter.promoteMemRefDescriptors(
op->getLoc(), /*opOperands=*/op->getOperands(), operands, rewriter);		op->getLoc(), /*opOperands=*/op->getOperands(), operands, rewriter);
auto newOp = rewriter.create<LLVM::CallOp>(op->getLoc(), packedResult,		auto newOp = rewriter.create<LLVM::CallOp>(op->getLoc(), packedResult,
promoted, op->getAttrs());		promoted, op->getAttrs());

// If < 2 results, packing did not do anything and we can just return.		// If < 2 results, packing did not do anything and we can just return.
		SmallVector<Value, 4> results;
if (numResults < 2) {		if (numResults < 2) {
rewriter.replaceOp(op, newOp.getResults());		results.append(newOp.result_begin(), newOp.result_end());
		if (failed(copyUnrankedDescriptors(rewriter, op->getLoc(),
		herhut (inline comment, not done): Not doing an early return would avoid duplication here.
		this->typeConverter,
		op->getResultTypes(), results,
		/*toDynamic=*/false)))
		return failure();
		rewriter.replaceOp(op, results);
return success();		return success();
}		}

// Otherwise, it had been converted to an operation producing a structure.		// Otherwise, it had been converted to an operation producing a structure.
// Extract individual results from the structure and return them as list.		// Extract individual results from the structure and return them as list.
// TODO(aminim, ntv, riverriddle, zinenko): this seems like patching around
// a particular interaction between MemRefType and CallOp lowering. Find a
// way to avoid special casing.
SmallVector<Value, 4> results;
results.reserve(numResults);		results.reserve(numResults);
for (unsigned i = 0; i < numResults; ++i) {		for (unsigned i = 0; i < numResults; ++i) {
auto type = this->typeConverter.convertType(op->getResult(i).getType());		auto type = this->typeConverter.convertType(op->getResult(i).getType());
results.push_back(rewriter.create<LLVM::ExtractValueOp>(		results.push_back(rewriter.create<LLVM::ExtractValueOp>(
op->getLoc(), type, newOp.getOperation()->getResult(0),		op->getLoc(), type, newOp.getOperation()->getResult(0),
rewriter.getI64ArrayAttr(i)));		rewriter.getI64ArrayAttr(i)));
}		}
		if (failed(copyUnrankedDescriptors(
		rewriter, op->getLoc(), this->typeConverter, op->getResultTypes(),
		results, /*toDynamic=*/false)))
		return failure();
rewriter.replaceOp(op, results);		rewriter.replaceOp(op, results);

return success();		return success();
}		}
};		};

struct CallOpLowering : public CallOpInterfaceLowering<CallOp> {		struct CallOpLowering : public CallOpInterfaceLowering<CallOp> {
using Super::Super;		using Super::Super;
▲ Show 20 Lines • Show All 462 Lines • ▼ Show 20 Lines
// necessary before returning it		// necessary before returning it
struct ReturnOpLowering : public ConvertOpToLLVMPattern<ReturnOp> {		struct ReturnOpLowering : public ConvertOpToLLVMPattern<ReturnOp> {
using ConvertOpToLLVMPattern<ReturnOp>::ConvertOpToLLVMPattern;		using ConvertOpToLLVMPattern<ReturnOp>::ConvertOpToLLVMPattern;

LogicalResult		LogicalResult
matchAndRewrite(Operation *op, ArrayRef<Value> operands,		matchAndRewrite(Operation *op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
unsigned numArguments = op->getNumOperands();		unsigned numArguments = op->getNumOperands();
		auto updatedOperands = llvm::to_vector<4>(operands);
		copyUnrankedDescriptors(rewriter, op->getLoc(), typeConverter,
		op->getOperands().getTypes(), updatedOperands,
		/*toDynamic=*/true);

// If ReturnOp has 0 or 1 operand, create it and return immediately.		// If ReturnOp has 0 or 1 operand, create it and return immediately.
if (numArguments == 0) {		if (numArguments == 0) {
rewriter.replaceOpWithNewOp<LLVM::ReturnOp>(		rewriter.replaceOpWithNewOp<LLVM::ReturnOp>(
op, ArrayRef<Type>(), ArrayRef<Value>(), op->getAttrs());		op, ArrayRef<Type>(), ArrayRef<Value>(), op->getAttrs());
return success();		return success();
}		}
if (numArguments == 1) {		if (numArguments == 1) {
rewriter.replaceOpWithNewOp<LLVM::ReturnOp>(		rewriter.replaceOpWithNewOp<LLVM::ReturnOp>(
op, ArrayRef<Type>(), operands.front(), op->getAttrs());		op, ArrayRef<Type>(), updatedOperands, op->getAttrs());
return success();		return success();
}		}

// Otherwise, we need to pack the arguments into an LLVM struct type before		// Otherwise, we need to pack the arguments into an LLVM struct type before
// returning.		// returning.
auto packedType = typeConverter.packFunctionResults(		auto packedType = typeConverter.packFunctionResults(
llvm::to_vector<4>(op->getOperandTypes()));		llvm::to_vector<4>(op->getOperandTypes()));

Value packed = rewriter.create<LLVM::UndefOp>(op->getLoc(), packedType);		Value packed = rewriter.create<LLVM::UndefOp>(op->getLoc(), packedType);
for (unsigned i = 0; i < numArguments; ++i) {		for (unsigned i = 0; i < numArguments; ++i) {
packed = rewriter.create<LLVM::InsertValueOp>(		packed = rewriter.create<LLVM::InsertValueOp>(
op->getLoc(), packedType, packed, operands[i],		op->getLoc(), packedType, packed, updatedOperands[i],
rewriter.getI64ArrayAttr(i));		rewriter.getI64ArrayAttr(i));
}		}
rewriter.replaceOpWithNewOp<LLVM::ReturnOp>(op, ArrayRef<Type>(), packed,		rewriter.replaceOpWithNewOp<LLVM::ReturnOp>(op, ArrayRef<Type>(), packed,
op->getAttrs());		op->getAttrs());
return success();		return success();
}		}
};		};

▲ Show 20 Lines • Show All 770 Lines • Show Last 20 Lines

mlir/test/Conversion/StandardToLLVM/calling-convention.mlir

Show First 20 Lines • Show All 103 Lines • ▼ Show 20 Lines	func @other_callee(%arg0: memref<?xf32>, %arg1: index) attributes { llvm.emit_c_interface } {
return		return
}		}

// CHECK: @_mlir_ciface_other_callee		// CHECK: @_mlir_ciface_other_callee
// CHECK: llvm.call @other_callee		// CHECK: llvm.call @other_callee

// EMIT_C_ATTRIBUTE: @_mlir_ciface_other_callee		// EMIT_C_ATTRIBUTE: @_mlir_ciface_other_callee
// EMIT_C_ATTRIBUTE: llvm.call @other_callee		// EMIT_C_ATTRIBUTE: llvm.call @other_callee

		//===========================================================================//
		// Calling convention on returning unranked memrefs.
		//===========================================================================//

		// CHECK-LABEL: llvm.func @return_var_memref_caller
		func @return_var_memref_caller(%arg0: memref<4x3xf32>) {
		// CHECK: %[[CALL_RES:.*]] = llvm.call @return_var_memref
		%0 = call @return_var_memref(%arg0) : (memref<4x3xf32>) -> memref<*xf32>

		// CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : index)
		// CHECK: %[[TWO:.*]] = llvm.mlir.constant(2 : index)
		// These sizes may depend on the data layout, not matching specific values.
		// CHECK: %[[PTR_SIZE:.*]] = llvm.mlir.constant
		// CHECK: %[[IDX_SIZE:.*]] = llvm.mlir.constant

		// CHECK: %[[DOUBLE_PTR_SIZE:.*]] = llvm.mul %[[TWO]], %[[PTR_SIZE]]
		// CHECK: %[[RANK:.*]] = llvm.extractvalue %[[CALL_RES]][0] : !llvm<"{ i64, i8* }">
		// CHECK: %[[DOUBLE_RANK:.*]] = llvm.mul %[[TWO]], %[[RANK]]
		// CHECK: %[[DOUBLE_RANK_INC:.*]] = llvm.add %[[DOUBLE_RANK]], %[[ONE]]
		// CHECK: %[[TABLES_SIZE:.*]] = llvm.mul %[[DOUBLE_RANK_INC]], %[[IDX_SIZE]]
		// CHECK: %[[ALLOC_SIZE:.*]] = llvm.add %[[DOUBLE_PTR_SIZE]], %[[TABLES_SIZE]]
		// CHECK: %[[FALSE:.*]] = llvm.mlir.constant(false)
		// CHECK: %[[ALLOCA:.*]] = llvm.alloca %[[ALLOC_SIZE]] x !llvm.i8
		// CHECK: %[[SOURCE:.*]] = llvm.extractvalue %[[CALL_RES]][1]
		// CHECK: "llvm.intr.memcpy"(%[[ALLOCA]], %[[SOURCE]], %[[ALLOC_SIZE]], %[[FALSE]])
		// CHECK: llvm.call @free(%[[SOURCE]])
		// CHECK: %[[DESC:.*]] = llvm.mlir.undef : !llvm<"{ i64, i8* }">
		// CHECK: %[[RANK:.*]] = llvm.extractvalue %[[CALL_RES]][0] : !llvm<"{ i64, i8* }">
		// CHECK: %[[DESC_1:.*]] = llvm.insertvalue %[[RANK]], %[[DESC]][0]
		// CHECK: llvm.insertvalue %[[ALLOCA]], %[[DESC_1]][1]
		return
		}

		// CHECK-LABEL: llvm.func @return_var_memref
		func @return_var_memref(%arg0: memref<4x3xf32>) -> memref<*xf32> {
		// Match the construction of the unranked descriptor.
		// CHECK: %[[ALLOCA:.*]] = llvm.alloca
		// CHECK: %[[MEMORY:.*]] = llvm.bitcast %[[ALLOCA]]
		// CHECK: %[[DESC_0:.*]] = llvm.mlir.undef : !llvm<"{ i64, i8* }">
		// CHECK: %[[DESC_1:.*]] = llvm.insertvalue %{{.*}}, %[[DESC_0]][0]
		// CHECK: %[[DESC_2:.*]] = llvm.insertvalue %[[MEMORY]], %[[DESC_1]][1]
		%0 = memref_cast %arg0: memref<4x3xf32> to memref<*xf32>

		// CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : index)
		// CHECK: %[[TWO:.*]] = llvm.mlir.constant(2 : index)
		// These sizes may depend on the data layout, not matching specific values.
		// CHECK: %[[PTR_SIZE:.*]] = llvm.mlir.constant
		// CHECK: %[[IDX_SIZE:.*]] = llvm.mlir.constant

		// CHECK: %[[DOUBLE_PTR_SIZE:.*]] = llvm.mul %[[TWO]], %[[PTR_SIZE]]
		// CHECK: %[[RANK:.*]] = llvm.extractvalue %[[DESC_2]][0] : !llvm<"{ i64, i8* }">
		// CHECK: %[[DOUBLE_RANK:.*]] = llvm.mul %[[TWO]], %[[RANK]]
		// CHECK: %[[DOUBLE_RANK_INC:.*]] = llvm.add %[[DOUBLE_RANK]], %[[ONE]]
		// CHECK: %[[TABLES_SIZE:.*]] = llvm.mul %[[DOUBLE_RANK_INC]], %[[IDX_SIZE]]
		// CHECK: %[[ALLOC_SIZE:.*]] = llvm.add %[[DOUBLE_PTR_SIZE]], %[[TABLES_SIZE]]
		// CHECK: %[[FALSE:.*]] = llvm.mlir.constant(false)
		// CHECK: %[[ALLOCATED:.*]] = llvm.call @malloc(%[[ALLOC_SIZE]])
		// CHECK: %[[SOURCE:.*]] = llvm.extractvalue %[[DESC_2]][1]
		// CHECK: "llvm.intr.memcpy"(%[[ALLOCATED]], %[[SOURCE]], %[[ALLOC_SIZE]], %[[FALSE]])
		// CHECK: %[[NEW_DESC:.*]] = llvm.mlir.undef : !llvm<"{ i64, i8* }">
		// CHECK: %[[RANK:.*]] = llvm.extractvalue %[[DESC_2]][0] : !llvm<"{ i64, i8* }">
		// CHECK: %[[NEW_DESC_1:.*]] = llvm.insertvalue %[[RANK]], %[[NEW_DESC]][0]
		// CHECK: %[[NEW_DESC_2:.*]] = llvm.insertvalue %[[ALLOCATED]], %[[NEW_DESC_1]][1]
		// CHECK: llvm.return %[[NEW_DESC_2]]
		return %0 : memref<*xf32>
		}

		// CHECK-LABEL: llvm.func @return_two_var_memref_caller
		func @return_two_var_memref_caller(%arg0: memref<4x3xf32>) {
		// Only check that we create two different descriptors using different
		// memory, and deallocate both sources. The size computation is same as for
		// the single result.
		// CHECK: %[[CALL_RES:.*]] = llvm.call @return_two_var_memref
		// CHECK: %[[RES_1:.*]] = llvm.extractvalue %[[CALL_RES]][0]
		// CHECK: %[[RES_2:.*]] = llvm.extractvalue %[[CALL_RES]][1]
		%0:2 = call @return_two_var_memref(%arg0) : (memref<4x3xf32>) -> (memref<*xf32>, memref<*xf32>)

		// CHECK: %[[ALLOCA_1:.*]] = llvm.alloca %{{.*}} x !llvm.i8
		// CHECK: %[[SOURCE_1:.*]] = llvm.extractvalue %[[RES_1:.*]][1] : ![[DESC_TYPE:.*]]
		// CHECK: "llvm.intr.memcpy"(%[[ALLOCA_1]], %[[SOURCE_1]], %{{.*}}, %[[FALSE:.*]])
		// CHECK: llvm.call @free(%[[SOURCE_1]])
		// CHECK: %[[DESC_1:.*]] = llvm.mlir.undef : ![[DESC_TYPE]]
		// CHECK: %[[DESC_11:.*]] = llvm.insertvalue %{{.*}}, %[[DESC_1]][0]
		// CHECK: llvm.insertvalue %[[ALLOCA_1]], %[[DESC_11]][1]

		// CHECK: %[[ALLOCA_2:.*]] = llvm.alloca %{{.*}} x !llvm.i8
		// CHECK: %[[SOURCE_2:.*]] = llvm.extractvalue %[[RES_2:.*]][1]
		// CHECK: "llvm.intr.memcpy"(%[[ALLOCA_2]], %[[SOURCE_2]], %{{.*}}, %[[FALSE]])
		// CHECK: llvm.call @free(%[[SOURCE_2]])
		// CHECK: %[[DESC_2:.*]] = llvm.mlir.undef : ![[DESC_TYPE]]
		// CHECK: %[[DESC_21:.*]] = llvm.insertvalue %{{.*}}, %[[DESC_2]][0]
		// CHECK: llvm.insertvalue %[[ALLOCA_2]], %[[DESC_21]][1]
		return
		}

		// CHECK-LABEL: llvm.func @return_two_var_memref
		func @return_two_var_memref(%arg0: memref<4x3xf32>) -> (memref<*xf32>, memref<*xf32>) {
		// Match the construction of the unranked descriptor.
		// CHECK: %[[ALLOCA:.*]] = llvm.alloca
		// CHECK: %[[MEMORY:.*]] = llvm.bitcast %[[ALLOCA]]
		// CHECK: %[[DESC_0:.*]] = llvm.mlir.undef : !llvm<"{ i64, i8* }">
		// CHECK: %[[DESC_1:.*]] = llvm.insertvalue %{{.*}}, %[[DESC_0]][0]
		// CHECK: %[[DESC_2:.*]] = llvm.insertvalue %[[MEMORY]], %[[DESC_1]][1]
		%0 = memref_cast %arg0 : memref<4x3xf32> to memref<*xf32>

		// Only check that we allocate the memory for each operand of the "return"
		// separately, even if both operands are the same value. The calling
		// convention requires the caller to free them and the caller cannot know
		// whether they are the same value or not.
		// CHECK: %[[ALLOCATED_1:.*]] = llvm.call @malloc(%{{.*}})
		// CHECK: %[[SOURCE_1:.*]] = llvm.extractvalue %[[DESC_2]][1]
		// CHECK: "llvm.intr.memcpy"(%[[ALLOCATED_1]], %[[SOURCE_1]], %{{.*}}, %[[FALSE:.*]])
		// CHECK: %[[RES_1:.*]] = llvm.mlir.undef
		// CHECK: %[[RES_11:.*]] = llvm.insertvalue %{{.*}}, %[[RES_1]][0]
		// CHECK: %[[RES_12:.*]] = llvm.insertvalue %[[ALLOCATED_1]], %[[RES_11]][1]

		// CHECK: %[[ALLOCATED_2:.*]] = llvm.call @malloc(%{{.*}})
		// CHECK: %[[SOURCE_2:.*]] = llvm.extractvalue %[[DESC_2]][1]
		// CHECK: "llvm.intr.memcpy"(%[[ALLOCATED_2]], %[[SOURCE_2]], %{{.*}}, %[[FALSE]])
		// CHECK: %[[RES_2:.*]] = llvm.mlir.undef
		// CHECK: %[[RES_21:.*]] = llvm.insertvalue %{{.*}}, %[[RES_2]][0]
		// CHECK: %[[RES_22:.*]] = llvm.insertvalue %[[ALLOCATED_2]], %[[RES_21]][1]

		// CHECK: %[[RESULTS:.*]] = llvm.mlir.undef : !llvm<"{ { i64, i8* }, { i64, i8* } }">
		// CHECK: %[[RESULTS_1:.*]] = llvm.insertvalue %[[RES_12]], %[[RESULTS]]
		// CHECK: %[[RESULTS_2:.*]] = llvm.insertvalue %[[RES_22]], %[[RESULTS_1]]
		// CHECK: llvm.return %[[RESULTS_2]]
		return %0, %0 : memref<*xf32>, memref<*xf32>
		}
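
The allocation size matched by the `DOUBLE_PTR_SIZE`, `TABLES_SIZE` and `ALLOC_SIZE` checks above can be restated as ordinary arithmetic. This is a hedged sketch: the function name `rankedDescriptorBytes` is hypothetical, and the pointer and index sizes are parameters because, as the test comments note, they depend on the data layout. The reading of the formula (two pointers, one offset, then `rank` sizes and `rank` strides) follows the ranked memref descriptor layout.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to hold a ranked memref descriptor of the given rank:
// allocated and aligned pointers, then offset + sizes[rank] + strides[rank]
// stored as index-typed values.
std::size_t rankedDescriptorBytes(std::size_t rank, std::size_t ptrSize,
                                  std::size_t idxSize) {
  std::size_t doublePtrSize = 2 * ptrSize;          // DOUBLE_PTR_SIZE
  std::size_t doubleRankInc = 2 * rank + 1;         // DOUBLE_RANK_INC
  std::size_t tablesSize = doubleRankInc * idxSize; // TABLES_SIZE
  return doublePtrSize + tablesSize;                // ALLOC_SIZE
}
```

For example, assuming 8-byte pointers and an 8-byte index type, a rank-2 descriptor such as the one for `memref<4x3xf32>` needs 2 * 8 + (2 * 2 + 1) * 8 = 56 bytes.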

mlir/test/Dialect/LLVMIR/roundtrip.mlir

// RUN: mlir-opt %s \| mlir-opt \| FileCheck %s		// RUN: mlir-opt %s \| mlir-opt \| FileCheck %s

// CHECK-LABEL: func @ops(%arg0: !llvm.i32, %arg1: !llvm.float)		// CHECK-LABEL: func @ops
func @ops(%arg0 : !llvm.i32, %arg1 : !llvm.float) {		func @ops(%arg0: !llvm.i32, %arg1: !llvm.float,
		%arg2: !llvm<"i8*">, %arg3: !llvm<"i8*">,
		%arg4: !llvm.i1) {
// Integer arithmetic binary operations.		// Integer arithmetic binary operations.
//		//
// CHECK-NEXT: %0 = llvm.add %arg0, %arg0 : !llvm.i32		// CHECK-NEXT: %0 = llvm.add %arg0, %arg0 : !llvm.i32
// CHECK-NEXT: %1 = llvm.sub %arg0, %arg0 : !llvm.i32		// CHECK-NEXT: %1 = llvm.sub %arg0, %arg0 : !llvm.i32
// CHECK-NEXT: %2 = llvm.mul %arg0, %arg0 : !llvm.i32		// CHECK-NEXT: %2 = llvm.mul %arg0, %arg0 : !llvm.i32
// CHECK-NEXT: %3 = llvm.udiv %arg0, %arg0 : !llvm.i32		// CHECK-NEXT: %3 = llvm.udiv %arg0, %arg0 : !llvm.i32
// CHECK-NEXT: %4 = llvm.sdiv %arg0, %arg0 : !llvm.i32		// CHECK-NEXT: %4 = llvm.sdiv %arg0, %arg0 : !llvm.i32
// CHECK-NEXT: %5 = llvm.urem %arg0, %arg0 : !llvm.i32		// CHECK-NEXT: %5 = llvm.urem %arg0, %arg0 : !llvm.i32
▲ Show 20 Lines • Show All 91 Lines • ▼ Show 20 Lines	// CHECK: "llvm.intr.pow"(%arg1, %arg1) : (!llvm.float, !llvm.float) -> !llvm.float
%31 = "llvm.intr.pow"(%arg1, %arg1) : (!llvm.float, !llvm.float) -> !llvm.float		%31 = "llvm.intr.pow"(%arg1, %arg1) : (!llvm.float, !llvm.float) -> !llvm.float

// CHECK: "llvm.intr.bitreverse"(%{{.*}}) : (!llvm.i32) -> !llvm.i32		// CHECK: "llvm.intr.bitreverse"(%{{.*}}) : (!llvm.i32) -> !llvm.i32
%32 = "llvm.intr.bitreverse"(%arg0) : (!llvm.i32) -> !llvm.i32		%32 = "llvm.intr.bitreverse"(%arg0) : (!llvm.i32) -> !llvm.i32

// CHECK: "llvm.intr.ctpop"(%{{.*}}) : (!llvm.i32) -> !llvm.i32		// CHECK: "llvm.intr.ctpop"(%{{.*}}) : (!llvm.i32) -> !llvm.i32
%33 = "llvm.intr.ctpop"(%arg0) : (!llvm.i32) -> !llvm.i32		%33 = "llvm.intr.ctpop"(%arg0) : (!llvm.i32) -> !llvm.i32

		// CHECK: "llvm.intr.memcpy"(%{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i32, !llvm.i1) -> ()
		"llvm.intr.memcpy"(%arg2, %arg3, %arg0, %arg4) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i32, !llvm.i1) -> ()

		// CHECK: "llvm.intr.memcpy"(%{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i32, !llvm.i1) -> ()
		"llvm.intr.memcpy"(%arg2, %arg3, %arg0, %arg4) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i32, !llvm.i1) -> ()

		// CHECK: %[[SZ:.*]] = llvm.mlir.constant
		%sz = llvm.mlir.constant(10: i64) : !llvm.i64
		// CHECK: "llvm.intr.memcpy.inline"(%{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i64, !llvm.i1) -> ()
		"llvm.intr.memcpy.inline"(%arg2, %arg3, %sz, %arg4) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i64, !llvm.i1) -> ()

// CHECK: llvm.return		// CHECK: llvm.return
llvm.return		llvm.return
}		}

// A larger self-contained function.		// A larger self-contained function.
// CHECK-LABEL:func @foo(%arg0: !llvm.i32) -> !llvm<"{ i32, double, i32 }"> {		// CHECK-LABEL:func @foo(%arg0: !llvm.i32) -> !llvm<"{ i32, double, i32 }"> {
func @foo(%arg0: !llvm.i32) -> !llvm<"{ i32, double, i32 }"> {		func @foo(%arg0: !llvm.i32) -> !llvm<"{ i32, double, i32 }"> {
// CHECK-NEXT: %0 = llvm.mlir.constant(3 : i64) : !llvm.i32		// CHECK-NEXT: %0 = llvm.mlir.constant(3 : i64) : !llvm.i32
▲ Show 20 Lines • Show All 190 Lines • ▼ Show 20 Lines
func @useFenceInst() {		func @useFenceInst() {
// CHECK: syncscope("agent") seq_cst		// CHECK: syncscope("agent") seq_cst
llvm.fence syncscope("agent") seq_cst		llvm.fence syncscope("agent") seq_cst
// CHECK: seq_cst		// CHECK: seq_cst
llvm.fence syncscope("") seq_cst		llvm.fence syncscope("") seq_cst
// CHECK: release		// CHECK: release
llvm.fence release		llvm.fence release
return		return
}		}
No newline at end of file

mlir/test/Target/llvmir-intrinsics.mlir

Show First 20 Lines • Show All 196 Lines • ▼ Show 20 Lines	llvm.func @masked_intrinsics(%A: !llvm<"<7 x float>*">, %mask: !llvm<"<7 x i1>">) {
%b = llvm.intr.masked.load %A, %mask, %a { alignment = 1: i32} :		%b = llvm.intr.masked.load %A, %mask, %a { alignment = 1: i32} :
(!llvm<"<7 x float>*">, !llvm<"<7 x i1>">, !llvm<"<7 x float>">) -> !llvm<"<7 x float>">		(!llvm<"<7 x float>*">, !llvm<"<7 x i1>">, !llvm<"<7 x float>">) -> !llvm<"<7 x float>">
// CHECK: call void @llvm.masked.store.v7f32.p0v7f32(<7 x float> %{{.*}}, <7 x float>* %0, i32 {{.*}}, <7 x i1> %{{.*}})		// CHECK: call void @llvm.masked.store.v7f32.p0v7f32(<7 x float> %{{.*}}, <7 x float>* %0, i32 {{.*}}, <7 x i1> %{{.*}})
llvm.intr.masked.store %b, %A, %mask { alignment = 1: i32} :		llvm.intr.masked.store %b, %A, %mask { alignment = 1: i32} :
!llvm<"<7 x float>">, !llvm<"<7 x i1>"> into !llvm<"<7 x float>*">		!llvm<"<7 x float>">, !llvm<"<7 x i1>"> into !llvm<"<7 x float>*">
llvm.return		llvm.return
}		}

		// CHECK-LABEL: @memcpy_test
		llvm.func @memcpy_test(%arg0: !llvm.i32, %arg1: !llvm.i1, %arg2: !llvm<"i8*">, %arg3: !llvm<"i8*">) {
		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %{{.*}}, i8* %{{.*}}, i32 %{{.*}}, i1 %{{.*}})
		"llvm.intr.memcpy"(%arg2, %arg3, %arg0, %arg1) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i32, !llvm.i1) -> ()
		%sz = llvm.mlir.constant(10: i64) : !llvm.i64
		// CHECK: call void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* %{{.*}}, i8* %{{.*}}, i64 10, i1 %{{.*}})
		"llvm.intr.memcpy.inline"(%arg2, %arg3, %sz, %arg1) : (!llvm<"i8*">, !llvm<"i8*">, !llvm.i64, !llvm.i1) -> ()
		llvm.return
		}


// Check that intrinsics are declared with appropriate types.		// Check that intrinsics are declared with appropriate types.
// CHECK-DAG: declare float @llvm.fma.f32(float, float, float)		// CHECK-DAG: declare float @llvm.fma.f32(float, float, float)
// CHECK-DAG: declare <8 x float> @llvm.fma.v8f32(<8 x float>, <8 x float>, <8 x float>) #0		// CHECK-DAG: declare <8 x float> @llvm.fma.v8f32(<8 x float>, <8 x float>, <8 x float>) #0
// CHECK-DAG: declare float @llvm.fmuladd.f32(float, float, float)		// CHECK-DAG: declare float @llvm.fmuladd.f32(float, float, float)
// CHECK-DAG: declare <8 x float> @llvm.fmuladd.v8f32(<8 x float>, <8 x float>, <8 x float>) #0		// CHECK-DAG: declare <8 x float> @llvm.fmuladd.v8f32(<8 x float>, <8 x float>, <8 x float>) #0
// CHECK-DAG: declare void @llvm.prefetch.p0i8(i8* nocapture readonly, i32 immarg, i32 immarg, i32)		// CHECK-DAG: declare void @llvm.prefetch.p0i8(i8* nocapture readonly, i32 immarg, i32 immarg, i32)
// CHECK-DAG: declare float @llvm.exp.f32(float)		// CHECK-DAG: declare float @llvm.exp.f32(float)
// CHECK-DAG: declare <8 x float> @llvm.exp.v8f32(<8 x float>) #0		// CHECK-DAG: declare <8 x float> @llvm.exp.v8f32(<8 x float>) #0
Show All 13 Lines
// CHECK-DAG: declare <8 x float> @llvm.cos.v8f32(<8 x float>) #0		// CHECK-DAG: declare <8 x float> @llvm.cos.v8f32(<8 x float>) #0
// CHECK-DAG: declare float @llvm.copysign.f32(float, float)		// CHECK-DAG: declare float @llvm.copysign.f32(float, float)
// CHECK-DAG: declare <12 x float> @llvm.matrix.multiply.v12f32.v64f32.v48f32(<64 x float>, <48 x float>, i32 immarg, i32 immarg, i32 immarg)		// CHECK-DAG: declare <12 x float> @llvm.matrix.multiply.v12f32.v64f32.v48f32(<64 x float>, <48 x float>, i32 immarg, i32 immarg, i32 immarg)
// CHECK-DAG: declare <48 x float> @llvm.matrix.transpose.v48f32(<48 x float>, i32 immarg, i32 immarg)		// CHECK-DAG: declare <48 x float> @llvm.matrix.transpose.v48f32(<48 x float>, i32 immarg, i32 immarg)
// CHECK-DAG: declare <48 x float> @llvm.matrix.column.major.load.v48f32.p0f32(float* nocapture, i64, i1 immarg, i32 immarg, i32 immarg)		// CHECK-DAG: declare <48 x float> @llvm.matrix.column.major.load.v48f32.p0f32(float* nocapture, i64, i1 immarg, i32 immarg, i32 immarg)
// CHECK-DAG: declare void @llvm.matrix.column.major.store.v48f32.p0f32(<48 x float>, float* nocapture writeonly, i64, i1 immarg, i32 immarg, i32 immarg)		// CHECK-DAG: declare void @llvm.matrix.column.major.store.v48f32.p0f32(<48 x float>, float* nocapture writeonly, i64, i1 immarg, i32 immarg, i32 immarg)
// CHECK-DAG: declare <7 x float> @llvm.masked.load.v7f32.p0v7f32(<7 x float>*, i32 immarg, <7 x i1>, <7 x float>)		// CHECK-DAG: declare <7 x float> @llvm.masked.load.v7f32.p0v7f32(<7 x float>*, i32 immarg, <7 x i1>, <7 x float>)
// CHECK-DAG: declare void @llvm.masked.store.v7f32.p0v7f32(<7 x float>, <7 x float>*, i32 immarg, <7 x i1>)		// CHECK-DAG: declare void @llvm.masked.store.v7f32.p0v7f32(<7 x float>, <7 x float>*, i32 immarg, <7 x i1>)
		// CHECK-DAG: declare void @llvm.memcpy.p0i8.p0i8.i32(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i32, i1 immarg)
		// CHECK-DAG: declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64 immarg, i1 immarg)

mlir/test/mlir-cpu-runner/unranked_memref.mlir

Show All 12 Lines
// CHECK: rank = 2		// CHECK: rank = 2
// CHECK-SAME: sizes = [10, 3]		// CHECK-SAME: sizes = [10, 3]
// CHECK-SAME: strides = [3, 1]		// CHECK-SAME: strides = [3, 1]
// CHECK-COUNT-10: [2, 2, 2]		// CHECK-COUNT-10: [2, 2, 2]
//		//
// CHECK: rank = 0		// CHECK: rank = 0
// 122 is ASCII for 'z'.		// 122 is ASCII for 'z'.
// CHECK: [z]		// CHECK: [z]
		//
		// CHECK: rank = 2
		// CHECK-SAME: sizes = [4, 3]
		// CHECK-SAME: strides = [3, 1]
		// CHECK-COUNT-4: [1, 1, 1]
		//
		// CHECK: rank = 2
		// CHECK-SAME: sizes = [4, 3]
		// CHECK-SAME: strides = [3, 1]
		// CHECK-COUNT-4: [1, 1, 1]
		//
		// CHECK: rank = 2
		// CHECK-SAME: sizes = [4, 3]
		// CHECK-SAME: strides = [3, 1]
		// CHECK-COUNT-4: [1, 1, 1]
func @main() -> () {		func @main() -> () {
%A = alloc() : memref<10x3xf32, 0>		%A = alloc() : memref<10x3xf32, 0>
%f2 = constant 2.00000e+00 : f32		%f2 = constant 2.00000e+00 : f32
%f5 = constant 5.00000e+00 : f32		%f5 = constant 5.00000e+00 : f32
%f10 = constant 10.00000e+00 : f32		%f10 = constant 10.00000e+00 : f32

%V = memref_cast %A : memref<10x3xf32, 0> to memref<?x?xf32>		%V = memref_cast %A : memref<10x3xf32, 0> to memref<?x?xf32>
linalg.fill(%V, %f10) : memref<?x?xf32, 0>, f32		linalg.fill(%V, %f10) : memref<?x?xf32, 0>, f32
Show All 14 Lines	func @main() -> () {
// 122 is ASCII for 'z'.		// 122 is ASCII for 'z'.
%i8_z = constant 122 : i8		%i8_z = constant 122 : i8
%I8 = alloc() : memref<i8>		%I8 = alloc() : memref<i8>
store %i8_z, %I8[]: memref<i8>		store %i8_z, %I8[]: memref<i8>
%U4 = memref_cast %I8 : memref<i8> to memref<*xi8>		%U4 = memref_cast %I8 : memref<i8> to memref<*xi8>
call @print_memref_i8(%U4) : (memref<*xi8>) -> ()		call @print_memref_i8(%U4) : (memref<*xi8>) -> ()

dealloc %A : memref<10x3xf32, 0>		dealloc %A : memref<10x3xf32, 0>

		call @return_var_memref_caller() : () -> ()
		call @return_two_var_memref_caller() : () -> ()
return		return
}		}

func @print_memref_i8(memref<*xi8>) attributes { llvm.emit_c_interface }		func @print_memref_i8(memref<*xi8>) attributes { llvm.emit_c_interface }
func @print_memref_f32(memref<*xf32>) attributes { llvm.emit_c_interface }		func @print_memref_f32(memref<*xf32>) attributes { llvm.emit_c_interface }

		func @return_two_var_memref_caller() {
		%0 = alloca() : memref<4x3xf32>
		%c0f32 = constant 1.0 : f32
		linalg.fill(%0, %c0f32) : memref<4x3xf32>, f32
		%1:2 = call @return_two_var_memref(%0) : (memref<4x3xf32>) -> (memref<*xf32>, memref<*xf32>)
		call @print_memref_f32(%1#0) : (memref<*xf32>) -> ()
		call @print_memref_f32(%1#1) : (memref<*xf32>) -> ()
		return
		}

		func @return_two_var_memref(%arg0: memref<4x3xf32>) -> (memref<*xf32>, memref<*xf32>) {
		%0 = memref_cast %arg0 : memref<4x3xf32> to memref<*xf32>
		return %0, %0 : memref<*xf32>, memref<*xf32>
		}

		func @return_var_memref_caller() {
		%0 = alloca() : memref<4x3xf32>
		%c0f32 = constant 1.0 : f32
		linalg.fill(%0, %c0f32) : memref<4x3xf32>, f32
		%1 = call @return_var_memref(%0) : (memref<4x3xf32>) -> memref<*xf32>
		call @print_memref_f32(%1) : (memref<*xf32>) -> ()
		return
		}

		func @return_var_memref(%arg0: memref<4x3xf32>) -> memref<*xf32> {
		%0 = memref_cast %arg0: memref<4x3xf32> to memref<*xf32>
		return %0 : memref<*xf32>
		}

Revision Contents

Diff 273692
