This offers the ability to pass numpy arrays to the corresponding memref arguments of the invoked function.
|27 ↗||(On Diff #335948)|
I believe this will not work with a memref that has non-trivial strides. You'll need to use something like np.lib.stride_tricks.as_strided.
(If you can add a test with an array and a strided view of it, that would demonstrate this.)
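Such a test could be seeded with a view like the following (a minimal sketch; the array contents and shape are illustrative, not from the patch):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(6, dtype=np.float32)  # [0., 1., 2., 3., 4., 5.]
# A strided view selecting every other element: shape (3,),
# byte stride of 2 elements. Note as_strided takes strides in bytes.
view = as_strided(a, shape=(3,), strides=(2 * a.itemsize,))
print(view)  # [0. 2. 4.]
```

Passing `view` across the boundary exercises exactly the non-contiguous case that a contiguity-assuming conversion would get wrong.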
Debug left-over? (same below)
I'd really like to see this support! Thanks for implementing this.
|4 ↗||(On Diff #335952)|
Please add a line on what this file is about.
|41 ↗||(On Diff #335952)|
Terminate all comments with a full stop.
Is there a way to restrict this patch to identity layout map? This would enable and unlock *many* things even if it just handles the default/row major identity layout. Can we check and bail out / assert on non-default strides?
Unrelated to this PR: If the wrong number of arguments are provided, will ExecutionEngine catch it? I think from a developer standpoint, this is quite useful since folks might sometimes lower the MLIR module to a form where there is a mismatch in the number of function arguments.
Strides are really easy to add though, maybe easier than bailing out (and adding tests for all the bail-out paths).
But if you prefer to add these tests, I can implement the strides in a follow up.
This layer of consistency is important, but we'll have to build above this; these are all quite low-level APIs: the ExecutionEngine does not have any knowledge of the APIs present in the module.
The only comment I have (not covered elsewhere) is a weak preference to unnest by one directory level (i.e. remove the memref directory and put this in the parent). All of the exported APIs already have memref in the name, and fewer imports / less repetition is better for these kinds of things, imo. Also fine to see how things evolve and rework the namespace once/if there is more there.
|68 ↗||(On Diff #336556)|
Note that numpy uses byte quantities to express strides.
MLIR, OTOH, uses the torch abstraction, which specifies strides in terms of elements; the conversion to LLVM takes care of generating the right addresses (which will also require further hooking up the data layout).
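The byte-vs-element discrepancy can be seen directly from numpy:

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float32)
# numpy expresses strides in bytes: (4 cols * 4 bytes, 4 bytes).
print(a.strides)  # (16, 4)
# MLIR/torch-style strides are in elements; divide by the item size.
elem_strides = [s // a.itemsize for s in a.strides]
print(elem_strides)  # [4, 1]
```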
Bottom line, I experimented with memref<...xf32> by hacking the following on top of this:
+    # x.strides = nparray.ctypes.strides
+    strides_ctype_t = ctypes.c_longlong * nparray.ndim
+    x.strides = strides_ctype_t(*[t // 4 for t in nparray.strides])
It seemed to help a bit but I still saw issues that I have not yet debugged.
Also note the discrepancy between np.strides and memref / torch strides I signal above.
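Expanding the hack above into a standalone sketch of the full descriptor conversion (the helper name is hypothetical; this assumes the standard LLVM lowering layout of allocated pointer, aligned pointer, offset, sizes, strides, with offset and strides in elements):

```python
import ctypes
import numpy as np

def make_memref_descriptor(nparray):
    """Build a ctypes struct mimicking a ranked memref descriptor.

    Hypothetical helper for illustration; assumes the layout
    (allocated ptr, aligned ptr, index offset, index sizes[rank],
    index strides[rank]), with offset/strides in *elements*.
    """
    index_array_t = ctypes.c_longlong * nparray.ndim

    class MemRefDescriptor(ctypes.Structure):
        _fields_ = [
            ("allocated", ctypes.c_void_p),
            ("aligned", ctypes.c_void_p),
            ("offset", ctypes.c_longlong),
            ("sizes", index_array_t),
            ("strides", index_array_t),
        ]

    d = MemRefDescriptor()
    d.allocated = nparray.ctypes.data
    d.aligned = nparray.ctypes.data
    d.offset = 0  # numpy bakes any view offset into the data pointer
    d.sizes = index_array_t(*nparray.shape)
    # numpy strides are in bytes; memref strides are in elements.
    d.strides = index_array_t(*[s // nparray.itemsize
                                for s in nparray.strides])
    return d

a = np.ones((2, 3), dtype=np.float32)
d = make_memref_descriptor(a)
print(list(d.sizes), list(d.strides))  # [2, 3] [3, 1]
```

Dividing by `itemsize` (rather than the hardcoded `// 4`) keeps the conversion correct for dtypes other than f32.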
This is not the element size but the displacement from the "base pointer address" at which the relevant data lives.
I verified that setting this to 0 in my local experiment (here and below), worked in simple cases.
In the general case you need to translate between MLIR and NP offsets, similarly to what you do for strides.
I tried setting this to multiple different values and it worked. If offset is represented as a number of elements in MLIR, then the offset should be 1. Similarly, it should be 1 * nparray.itemsize while converting back to NumPy from memrefs world.
I am unclear what "it worked" means in the absence of more context on what you tried.
Please note that lowering will do the right thing in the static case here: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp#L3399.
For instance, this commit will fail with an offset different from 0 (at runtime; in this case I hardcoded 0 for expediency while waiting for this PR to land): https://github.com/google/iree-llvm-sandbox/commit/87015b29c5f7cbc445bd85e1ce4a5d7597e80361.
The underlying reason is that after tiling, the offset becomes dependent on the loop IV (i.e. the memref has a dynamic `?` in its type), and then the dynamic branch of StandardToLLVM.cpp kicks in.
@mehdi_amini not yet, the author should at the very least set the 2 offsets to 0 as described in the commit message of https://github.com/google/iree-llvm-sandbox/commit/87015b29c5f7cbc445bd85e1ce4a5d7597e80361, so that it passes without crashing.
I'm fine with iterating in-tree once that is fixed.
I don't quite understand this x.offset = ctypes.c_longlong(4). Shouldn't it always be zero? I suspect numpy is baking the offset into the pointer we get with nparray.ctypes.data_as.
@mehdi_amini indeed, in https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html we can see:
>>> np.ndarray((2,), buffer=np.array([1,2,3]),
...            offset=np.int_().itemsize,
...            dtype=int)  # offset = 1*itemsize, i.e. skip first element
array([2, 3])
as well as the default = 0 for offset:
class numpy.ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)
0 is the right value to use for now.
When we do fancier numpy / linalg level subviews etc. it will be a different story; but for now anything other than 0 will crash non-trivial cases.
Right, but the ctypes interface does not expose the offset, so I wonder whether in such cases the offset you provide on construction isn't just folded into the "data" field internally (I could check the source code for ndarray...)
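This speculation is easy to check without reading the ndarray source; the data pointer of an offset view does move by the offset:

```python
import numpy as np

base = np.array([1, 2, 3], dtype=np.int64)
# A view skipping the first element via the offset argument.
view = np.ndarray((2,), buffer=base, offset=base.itemsize, dtype=np.int64)
print(view)  # [2 3]
# The ctypes data pointer already includes the offset:
delta = view.ctypes.data - base.ctypes.data
print(delta)  # 8, i.e. one int64 element
```

So passing 0 as the memref offset and using `nparray.ctypes.data` directly is consistent.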
It supports a subset of cases (e.g. anything that takes subviews and crosses the NP -> MLIR or MLIR -> NP boundary should be considered broken atm).
+1 on adding a simple mixed static / dynamic 2-D add and compare against NP's.