This adds an OwningUnrankedMemRef, which wraps an UnrankedMemRefType descriptor. New helper methods to initialize and index the descriptor, as well as an iterator (UnrankedMemrefIterator), are also provided.
Details
- Reviewers: nicolasvasilache, ftynse, bondhugula, aartbik

Diff Detail
- Repository: rG LLVM Github Monorepo

Event Timeline
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 400: Note to self: refactor this to share the implementation with the ranked iterator.
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: CRunner utils is meant to run on HW that does not have a C++ runtime library.
- Line 386: An implementation that is easier to follow and reuses common utils is to just keep the offset, add 1 to it, and delinearize the indices in the stride basis.
- Line 421: That's not OK for the CRunnerUtils.
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: We should probably refactor all this; it seems quite poorly organized to me right now, including mixing the definition of the descriptors with a random set of utilities that look quite "ad-hoc" (printComma(), getTensorFilename(), ...).
- Line 386: I didn't quite follow what you describe here?
- Line 421: I can use a SmallVector :)
Use LLVM SmallVector instead of std::vector and refactor iterator indexing to share the code between Ranked/Unranked iterators
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: I'm very happy if you have the cycles to devote to this and tidy things up. Basically, the original version of CRunnerUtils was running on ARM micro, but it was a half-manual process with kludges to connect to a Makefile-based system that I didn't have access to (i.e. shipping pre-cross-compiled .o files across the wall). It would be great if there was a simple way to test the "works without a C++ runtime" aspect, as this was the original
- Line 386: I meant you could make available and reuse these functions from mlir/include/mlir/Dialect/Vector/VectorUtils.h:
    /// Computes and returns the linearized index of 'offsets' w.r.t. 'basis'.
    int64_t linearize(ArrayRef<int64_t> offsets, ArrayRef<int64_t> basis);
    /// Given the strides together with a linear index in the dimension
    /// space, returns the vector-space offsets in each dimension for a
    /// de-linearized index.
    SmallVector<int64_t, 4> delinearize(ArrayRef<int64_t> strides, int64_t linearIndex);
  Then your function would just be delinearize(strides, linearize(offset, strides) + 1);
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: To be clear, by "C++ runtime" do you mean the non-header part of the STL?
- Line 386: Ah, I see what you mean now, thanks. It isn't clear to me that the +1 is enough here if the outermost stride isn't 1? I'm not sure the behavior of delinearize should even be specified for an offset that is impossible to get from linearize...
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 371: This could be: linearize(indices, strides)?
- Line 386: You're right, sorry, I took the wrong basis. Here is the thinking, dropping the word "offset" as it is overloaded by now. Indexing in the basis of memref "sizes" gives you the virtual index, and you can go from linear to multi-D. Indexing in the basis of memref "strides" gives you the physical displacement, which skips holes; again you can go from linear to multi-D. So, assuming you keep a list of indices, you can get the physical displacement of your next element compared to the base. If you prefer instead to keep the 1-D physical displacement, you can get the displacement of the next entry directly. Does this make sense?
mlir/unittests/ExecutionEngine/Invoke.cpp
- Line 408: Is this a typo here?
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: There is another wrinkle here: this file is built in C++11 mode? Why?
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: I think this file is built the same way as the rest of MLIR. The comment there perhaps exists in the interest of avoiding link-time issues with libmlir_runner_utils.so, for example, to link objects compiled from MLIR outside of the mlir-cpu-runner JIT?
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: We have this in mlir/lib/ExecutionEngine/CMakeLists.txt: set_property(TARGET mlir_c_runner_utils PROPERTY CXX_STANDARD 11)
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34:
  > There is another wrinkle here: this file is built in C++11 mode? Why? It is a bit confusing right now to have this piece of the project compiled differently, and in a way that it can't interact with any other data structure from ADT/Support; what is the purpose here? If there are specific needs, I'd like to see this isolated in its own component (separate directory, etc.) rather than mixed with the ExecutionEngine folder, which is part of the regular C++ infra.

  Yes, there are specific needs for this file: when it was created it was used to run ModelBuilder on some ARM micro HW (both arm32 and arm64, I believe). I don't know offhand the exact processor, but we can find out if relevant. Anyway, what's relevant is that the SDK, compiler, etc. for those required what is documented at the top of the file.

  If you want to iterate on a better set of requirements and think there is a better place/way to do this, by all means let's start the discussion. In the meantime, let's not break it.
- Line 35: I don't see value in introducing a dependency on LLVM at runtime here.
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: I don't think it is reasonable to have this kind of requirement here right now. Wanting to support an embedded environment which does not support the LLVM C++ standard is interesting, but I don't think we have an agreed goal to have this in-tree at the moment. I'm more inclined to align everything right now with what the project supports, rather than having a "project within the project" for out-of-tree reasons.
- Line 35: std::array is fine if you have a fixed rank at compile time; it does not scale with unranked...
Looks good. Could you add an additional line to the commit summary for better context? The commit title is a bit cryptic and the connection to runtime utils and runners is missing.
mlir/include/mlir/ExecutionEngine/MemRefUtils.h
- Line 144: Nit: * -> .
- Lines 322–323: Doc comments here, please.

mlir/unittests/ExecutionEngine/Invoke.cpp
- Lines 303–307: Can these five lines be put in a helper registerDialectsAndParseSourceString? You have four repetitions.
I haven't worked around the specific requirement in the CRunnerUtils file, and I suspect it requires some refactoring. But I also don't know the exact use case and how to test it, in order to be able to propose a path forward here. @nicolasvasilache, can you help figure out who's using this and how?
I don't either, which is why the best proxy I found is mlir/lib/ExecutionEngine/CMakeLists.txt: set_property(TARGET mlir_c_runner_utils PROPERTY CXX_STANDARD 11).
I don't think there are active users though so this could be your window of opportunity to move everything into RunnerUtils as I suggested above.
I still question the decision not to use the lowest common denominator when the logic you need is barely more than ptr + offset.
But I also see more value in progress and iteration than in YAGNI discussions: it is perfectly valid to say that interested parties should grow/bring their own tensor library and that this is merely about providing the batteries that MLIR is lacking.
If you want to grow this into something more than simple 2xAAA, an RFC would be useful.
Please address the comments on the indexing logic.
mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
- Line 34: As usual with MLIR, everything is on a per-need basis; the need for setting up such goals has not come up before. In the meantime, to unblock progress, I recommend moving all the code from this file into RunnerUtils.cpp and revisiting later.
- Line 35: pointer + offset?
- Line 371: I don't believe this has been addressed?
- Line 386: I don't believe this has been addressed?
I'm not sure how I could "move" things without knowing who depends on the things I'd move :)
That said my best shot at a plan for a future layering would be:
- A low-level layer to manipulate memref descriptors, written purely in C. That would make it a C runtime very suitable for embedding in micro environments as well, I think.
- A higher-level layer written in modern C++ with all the possible niceties one wants. This layer could target the low-level C API to manipulate the descriptors; that said, it is possible that template+inline would give better performance for C++ users.
WDYT?
@bondhugula as well?
> That said my best shot at a plan for a future layering would be:

This sounds great to me!
> CRunner utils is meant to run on HW that does not have C++ runtime library.
> In particular, uses of vector are prohibited and array is used instead.

Does this file fit the bill?