mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1075	Make it a little more clear that these are the output parameters (so we expect pointers that will be populated by this method)
1088	also add TODO on generalizing type, just like L1051
1096	Does the allocation leak? Why not just use a std::vector<uint64_t> for this: std::vector<uint64_t> perm(rank); std::iota(perm.begin(), perm.end(), 0);
mlir/test/Integration/Dialect/SparseTensor/python/test_elementwise_add_sparse_output.py
23	Make this a TODO since we will have to work on allowing this
117	actually, in Python land we don't need to CHECK output, we can actually just verify that the values are as expected, see the other py test in this dir
mlir/test/Integration/Dialect/SparseTensor/python/tools/np_to_sparse_tensor.py
29	is there a way to do this only once?

aartbik added inline comments. Dec 10 2021, 4:13 PM

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1113	how about using a "base" for this to avoid the * as was done above (not trusting compiler optimization ;-)
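
The "base" idiom suggested here keeps a running offset into the flat "nse x rank" index array instead of recomputing i * rank on every iteration. A minimal Python sketch of the same bookkeeping follows; the names `flatten_indices` and `elements` are hypothetical and not part of the patch.

```python
def flatten_indices(elements, rank):
    """Flattens per-element index tuples into one row-major list.

    `elements` is a hypothetical list of (indices, value) pairs standing in
    for the COO elements; it is not the MLIR data structure itself.
    """
    flat = [0] * (rank * len(elements))
    base = 0  # Running offset, advanced by `rank` after each element.
    for idx, _value in elements:
        for j in range(rank):
            flat[base + j] = idx[j]
        base += rank
    return flat


elements = [((0, 0), 2.2), ((0, 3), 2.8), ((2, 2), 6.6)]
flat = flatten_indices(elements, rank=2)
```

Element i occupies positions base through base + rank - 1, which is the layout the C++ loop writes into `indices`.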

Harbormaster completed remote builds in B138755: Diff 393619. Dec 10 2021, 4:21 PM

Address review comment.

bixia added inline comments. Dec 10 2021, 4:29 PM

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1096	Good catch, added a delete. The function toCOO only accepts a C-style array.

Fixed test.

Harbormaster completed remote builds in B138766: Diff 393635. Dec 10 2021, 5:06 PM

aartbik added inline comments. Dec 10 2021, 5:57 PM

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1096	Yes, but you can pass perm.data() for that. I have a slight preference for that since that is the idiom already used at line 1057 and further. This way we have a single way of doing the permutation setup.
1113	did you see this one?

Replace the use of an allocated array with std::vector.

bixia added inline comments. Dec 11 2021, 7:29 AM

mlir/test/Integration/Dialect/SparseTensor/python/tools/np_to_sparse_tensor.py
29	We would probably need to make c_lib a global and have the user set it to None to trigger this initialization. Shall we do this?

Harbormaster completed remote builds in B138800: Diff 393676. Dec 11 2021, 7:42 AM

Put the code to look up the c shared library in a function and decorate the function with functools.lru_cache.

Harbormaster completed remote builds in B138813: Diff 393690. Dec 11 2021, 1:11 PM

bixia added inline comments. Dec 11 2021, 1:21 PM

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1113	Use "base".
mlir/test/Integration/Dialect/SparseTensor/python/tools/np_to_sparse_tensor.py
29	The style guide recommends avoiding global variables. I outlined the lookup of c_lib into a routine and added the lru_cache decorator.
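
The effect of the lru_cache decorator can be seen without loading a real library. The sketch below is illustrative only: `get_lib` and its placeholder return value are assumptions standing in for the patch's loader and for ctypes.CDLL.

```python
import functools


@functools.lru_cache()
def get_lib(lib_name: str):
    """Stands in for an expensive load such as ctypes.CDLL(lib_name)."""
    # Placeholder object instead of a real shared library handle.
    return {"name": lib_name}


first = get_lib("libexample.so")
second = get_lib("libexample.so")  # Served from the cache, no second load.
info = get_lib.cache_info()
```

Because the cache is keyed on the argument, each distinct library path is loaded once per process and no module-level variable is needed.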

Fixed python function docstrings.

Fixed docstrings.

Harbormaster completed remote builds in B138821: Diff 393699. Dec 11 2021, 2:47 PM

Fixed a typo.

Harbormaster completed remote builds in B138822: Diff 393701. Dec 11 2021, 3:14 PM

bixia marked an inline comment as done. Dec 13 2021, 10:04 AM

aartbik added inline comments. Dec 13 2021, 11:24 AM

mlir/test/Integration/Dialect/SparseTensor/python/test_elementwise_add_sparse_output.py
117	We will have to check the return value, i.e. something like if np.allclose(....): pass else: quit(f'FAILURE')

Use the result of np.allclose.
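
The point of this comment is that np.allclose only returns a boolean, so the test has to act on it. A minimal sketch of that pattern, assuming a hypothetical helper `report` and using sys.exit in place of quit:

```python
import sys

import numpy as np


def report(values, expected):
    """Returns 'PASSED' on a match, otherwise exits with 'FAILURE'."""
    if not np.allclose(values, expected):
        # A non-empty string argument makes the process exit with status 1.
        sys.exit('FAILURE')
    return 'PASSED'


status = report(np.array([2.2, 2.8, 6.6]), [2.2, 2.8, 6.6])
```

Calling np.allclose without using its result would let a wrong answer pass silently.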

bixia marked an inline comment as done. Dec 13 2021, 11:45 AM

One last suggestion on testing the other output variables as well, but other than that, solid work. Ship it!

mlir/test/Integration/Dialect/SparseTensor/python/test_elementwise_add_sparse_output.py
117	perhaps also test the other output parameters, either with a similar test or by printing and CHECKing the value. E.g. the rank is of course also reflected in the indices, but still nice to add an explicit check in this early phase of developing the tests
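
One way to add such an explicit check is to compare the scalar outputs against the array outputs. The sketch below assumes plain Python integers for rank and nse (the helper in this patch returns ctypes objects, so `.value` would be needed there); `check_coo_metadata` is a hypothetical name.

```python
import numpy as np


def check_coo_metadata(rank, nse, shape, values, indices):
    """Returns True if the scalar outputs agree with the array outputs."""
    return (rank == len(shape) and nse == len(values) and
            np.shape(indices) == (nse, rank))


ok = check_coo_metadata(
    rank=2, nse=3, shape=np.array([3, 4]),
    values=np.array([2.2, 2.8, 6.6]),
    indices=np.array([[0, 0], [0, 3], [2, 2]]))
```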

This revision is now accepted and ready to land. Dec 13 2021, 12:00 PM

Harbormaster completed remote builds in B139031: Diff 393980. Dec 13 2021, 12:02 PM

Closed by commit rG2f49e6b0dbf7: Support sparse tensor output. (authored by bixia). Dec 13 2021, 12:06 PM

This revision was automatically updated to reflect the committed changes.

bixia added a commit: rG2f49e6b0dbf7: Support sparse tensor output.

Diff 393997

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp

  for (uint64_t i = 0, base = 0; i < nse; i++) {
    tensor->add(idx, values[i]);
    base += rank;
  }
  // Return sparse tensor storage format as opaque pointer.
  return SparseTensorStorage<uint64_t, uint64_t, double>::newSparseTensor(
      rank, shape, perm.data(), sparse.data(), tensor);
}

/// Converts a sparse tensor to COO-flavored format expressed using C-style
/// data structures. The expected output parameters are pointers for these
/// values:
///
///   rank: rank of tensor
///   nse: number of specified elements (usually the nonzeros)
///   shape: array with dimension size for each rank
///   values: a "nse" array with values for all specified elements
///   indices: a flat "nse x rank" array with indices for all specified elements
///
/// The input is a pointer to SparseTensorStorage<P, I, V>, typically returned
/// from convertToMLIRSparseTensor.
///
// TODO: Currently, values are copied from SparseTensorStorage to
// SparseTensorCOO, then to the output. We may want to reduce the number of
// copies.
//
// TODO: for now f64 tensors only, no dim ordering, all dimensions compressed
//
void convertFromMLIRSparseTensor(void *tensor, uint64_t *p_rank,
                                 uint64_t *p_nse, uint64_t **p_shape,
                                 double **p_values, uint64_t **p_indices) {
  SparseTensorStorage<uint64_t, uint64_t, double> *sparse_tensor =
      static_cast<SparseTensorStorage<uint64_t, uint64_t, double> *>(tensor);
  uint64_t rank = sparse_tensor->getRank();
  std::vector<uint64_t> perm(rank);
  std::iota(perm.begin(), perm.end(), 0);
  SparseTensorCOO<double> *coo = sparse_tensor->toCOO(perm.data());

  const std::vector<Element<double>> &elements = coo->getElements();
  uint64_t nse = elements.size();

  uint64_t *shape = new uint64_t[rank];
  for (uint64_t i = 0; i < rank; i++)
    shape[i] = coo->getSizes()[i];

  double *values = new double[nse];
  uint64_t *indices = new uint64_t[rank * nse];

  for (uint64_t i = 0, base = 0; i < nse; i++) {
    values[i] = elements[i].value;
    for (uint64_t j = 0; j < rank; j++)
      indices[base + j] = elements[i].indices[j];
    base += rank;
  }

  delete coo;
  *p_rank = rank;
  *p_nse = nse;
  *p_shape = shape;
  *p_values = values;
  *p_indices = indices;
}
} // extern "C"

#endif // MLIR_CRUNNERUTILS_DEFINE_FUNCTIONS

mlir/test/Integration/Dialect/SparseTensor/python/test_elementwise_add_sparse_output.py

This file was added.

# RUN: SUPPORT_LIB=%mlir_runner_utils_dir/libmlir_c_runner_utils%shlibext %PYTHON %s | FileCheck %s

import ctypes
import numpy as np
import os
import sys

import mlir.all_passes_registration

from mlir import ir
from mlir import runtime as rt
from mlir import execution_engine
from mlir import passmanager
from mlir.dialects import sparse_tensor as st
from mlir.dialects import builtin
from mlir.dialects.linalg.opdsl import lang as dsl

_SCRIPT_PATH = os.path.dirname(os.path.abspath(__file__))
sys.path.append(_SCRIPT_PATH)
from tools import np_to_sparse_tensor as test_tools

# TODO: Use linalg_structured_op to generate the kernel after making it to
# handle sparse tensor outputs.
_KERNEL_STR = """
#DCSR = #sparse_tensor.encoding<{
  dimLevelType = [ "compressed", "compressed" ]
}>

#trait_add_elt = {
  indexing_maps = [
    affine_map<(i,j) -> (i,j)>, // A
    affine_map<(i,j) -> (i,j)>, // B
    affine_map<(i,j) -> (i,j)> // X (out)
  ],
  iterator_types = ["parallel", "parallel"],
  doc = "X(i,j) = A(i,j) + B(i,j)"
}

func @sparse_add_elt(
    %arga: tensor<3x4xf64, #DCSR>, %argb: tensor<3x4xf64, #DCSR>) -> tensor<3x4xf64, #DCSR> {
  %c3 = arith.constant 3 : index
  %c4 = arith.constant 4 : index
  %argx = sparse_tensor.init [%c3, %c4] : tensor<3x4xf64, #DCSR>
  %0 = linalg.generic #trait_add_elt
    ins(%arga, %argb: tensor<3x4xf64, #DCSR>, tensor<3x4xf64, #DCSR>)
    outs(%argx: tensor<3x4xf64, #DCSR>) {
      ^bb(%a: f64, %b: f64, %x: f64):
        %1 = arith.addf %a, %b : f64
        linalg.yield %1 : f64
  } -> tensor<3x4xf64, #DCSR>
  return %0 : tensor<3x4xf64, #DCSR>
}

func @main(%ad: tensor<3x4xf64>, %bd: tensor<3x4xf64>) -> tensor<3x4xf64, #DCSR>
  attributes { llvm.emit_c_interface } {
  %a = sparse_tensor.convert %ad : tensor<3x4xf64> to tensor<3x4xf64, #DCSR>
  %b = sparse_tensor.convert %bd : tensor<3x4xf64> to tensor<3x4xf64, #DCSR>
  %0 = call @sparse_add_elt(%a, %b) : (tensor<3x4xf64, #DCSR>, tensor<3x4xf64, #DCSR>) -> tensor<3x4xf64, #DCSR>
  return %0 : tensor<3x4xf64, #DCSR>
}
"""


class _SparseCompiler:
  """Sparse compiler passes."""

  def __init__(self):
    self.pipeline = (
        f'sparsification,'
        f'sparse-tensor-conversion,'
        f'builtin.func(linalg-bufferize,convert-linalg-to-loops,convert-vector-to-scf),'
        f'convert-scf-to-std,'
        f'func-bufferize,'
        f'tensor-constant-bufferize,'
        f'builtin.func(tensor-bufferize,std-bufferize,finalizing-bufferize),'
        f'convert-vector-to-llvm{{reassociate-fp-reductions=1 enable-index-optimizations=1}},'
        f'lower-affine,'
        f'convert-memref-to-llvm,'
        f'convert-std-to-llvm,'
        f'reconcile-unrealized-casts')

  def __call__(self, module: ir.Module):
    passmanager.PassManager.parse(self.pipeline).run(module)


def _run_test(support_lib, kernel):
  """Compiles, runs and checks results."""
  module = ir.Module.parse(kernel)
  _SparseCompiler()(module)
  engine = execution_engine.ExecutionEngine(
      module, opt_level=0, shared_libs=[support_lib])

  # Set up numpy inputs and buffer for output.
  a = np.array(
      [[1.1, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 6.6, 0.0]],
      np.float64)
  b = np.array(
      [[1.1, 0.0, 0.0, 2.8], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
      np.float64)

  mem_a = ctypes.pointer(ctypes.pointer(rt.get_ranked_memref_descriptor(a)))
  mem_b = ctypes.pointer(ctypes.pointer(rt.get_ranked_memref_descriptor(b)))

  # The sparse tensor output is a pointer to pointer of char.
  out = ctypes.c_char(0)
  mem_out = ctypes.pointer(ctypes.pointer(out))

  # Invoke the kernel.
  engine.invoke('main', mem_a, mem_b, mem_out)

  # Retrieve and check the result.
  rank, nse, shape, values, indices = test_tools.sparse_tensor_to_coo_tensor(
      support_lib, mem_out[0], np.float64)

  # CHECK: PASSED
  if np.allclose(values, [2.2, 2.8, 6.6]) and np.allclose(
      indices, [[0, 0], [0, 3], [2, 2]]):
    print('PASSED')
  else:
    quit('FAILURE')


def test_elementwise_add():
  # Obtain path to runtime support library.
  support_lib = os.getenv('SUPPORT_LIB')
  assert support_lib is not None, 'SUPPORT_LIB is undefined'
  assert os.path.exists(support_lib), f'{support_lib} does not exist'
  with ir.Context() as ctx, ir.Location.unknown():
    _run_test(support_lib, _KERNEL_STR)


test_elementwise_add()

mlir/test/Integration/Dialect/SparseTensor/python/tools/lit.local.cfg

This file was added.

# Files in this directory are tools, not tests.
config.unsupported = True

mlir/test/Integration/Dialect/SparseTensor/python/tools/np_to_sparse_tensor.py

This file was added.

# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

# This file contains functions to process sparse tensor outputs.

import ctypes
import functools
import numpy as np


@functools.lru_cache()
def _get_c_shared_lib(lib_name: str):
  """Loads and returns the requested C shared library.

  Args:
    lib_name: A string representing the C shared library.

  Returns:
    The C shared library.

  Raises:
    OSError: If there is any problem in loading the shared library.
    ValueError: If the shared library doesn't contain the needed routine.
  """
  # This raises OSError exception if there is any problem in loading the shared
  # library.
  c_lib = ctypes.CDLL(lib_name)

  try:
    c_lib.convertFromMLIRSparseTensor.restype = ctypes.c_void_p
  except Exception as e:
    raise ValueError('Missing function convertFromMLIRSparseTensor from '
                     f'the C shared library: {e} ') from e

  return c_lib


def sparse_tensor_to_coo_tensor(support_lib, sparse, dtype):
  """Converts a sparse tensor to COO-flavored format.

  Args:
    support_lib: A string for the supporting C shared library.
    sparse: A ctypes.pointer to the sparse tensor descriptor.
    dtype: The numpy data type for the tensor elements.

  Returns:
    A tuple that contains the following values:
    rank: An integer for the rank of the tensor.
    nse: An integer for the number of non-zero values in the tensor.
    shape: A 1D numpy array of integers, for the shape of the tensor.
    values: A 1D numpy array, for the non-zero values in the tensor.
    indices: A 2D numpy array of integers, representing the indices for the
      non-zero values in the tensor.

  Raises:
    OSError: If there is any problem in loading the shared library.
    ValueError: If the shared library doesn't contain the needed routine.
  """
  c_lib = _get_c_shared_lib(support_lib)

  rank = ctypes.c_ulonglong(0)
  nse = ctypes.c_ulonglong(0)
  shape = ctypes.POINTER(ctypes.c_ulonglong)()
  values = ctypes.POINTER(np.ctypeslib.as_ctypes_type(dtype))()
  indices = ctypes.POINTER(ctypes.c_ulonglong)()
  c_lib.convertFromMLIRSparseTensor(sparse, ctypes.byref(rank),
                                    ctypes.byref(nse), ctypes.byref(shape),
                                    ctypes.byref(values), ctypes.byref(indices))
  # Convert the returned values to the corresponding numpy types.
  shape = np.ctypeslib.as_array(shape, shape=[rank.value])
  values = np.ctypeslib.as_array(values, shape=[nse.value])
  indices = np.ctypeslib.as_array(indices, shape=[nse.value, rank.value])
  return rank, nse, shape, values, indices
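
The final conversion step above, turning a raw ctypes pointer into a shaped numpy array, can be tried in isolation. In this sketch a ctypes array plays the role of the buffer that the C routine would allocate; the sample values and variable names are assumptions for illustration.

```python
import ctypes

import numpy as np

rank, nse = 2, 3
# Stand-in for the flat "nse x rank" index buffer filled in by the C routine.
buffer = (ctypes.c_ulonglong * (rank * nse))(0, 0, 0, 3, 2, 2)
ptr = ctypes.cast(buffer, ctypes.POINTER(ctypes.c_ulonglong))

# A bare pointer carries no length, so the shape must be supplied explicitly.
indices = np.ctypeslib.as_array(ptr, shape=(nse, rank))
```

The resulting array is a view of the underlying memory rather than a copy, so the buffer has to stay alive for as long as the array is used.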

This is an archive of the discontinued LLVM Phabricator instance.

Support sparse tensor output.
ClosedPublic
