This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
mlir/
3
CMakeLists.txt
-
benchmark/python/
-
python/
-
__init__.py
4
benchmark_sparse.py
2/6
common.py
-
utils/mbr/
-
mbr/
-
CMakeLists.txt
-
README.md
-
mbr/
-
__init__.py
1
config.ini
1
discovery.py
1
main.py
1
stats.py
-
mlir-mbr.in
-
requirements.txt
-
setup.py

Differential D115174

[mlir] Set up boilerplate build for MLIR benchmarks
ClosedPublic

Authored by SaurabhJha on Dec 6 2021, 11:45 AM.

Download Raw Diff

Details

Reviewers

mehdi_amini
aartbik
lebedev.ri

Commits

rGfa90c9d5e7a3: [mlir] Set up boilerplate build for MLIR benchmarks

Summary

This is the start of mlir benchmarks. It configures cmake files necessary to get started writing actual benchmarks.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

SaurabhJha created this revision.Dec 6 2021, 11:45 AM

Herald added subscribers: sdasgup3, wenzhicui, wrengr and 21 others. · View Herald TranscriptDec 6 2021, 11:45 AM

SaurabhJha requested review of this revision.Dec 6 2021, 11:45 AM

Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2021, 11:45 AM

Herald added subscribers: stephenneuendorffer, nicolasvasilache. · View Herald Transcript

SaurabhJha added reviewers: mehdi_amini, aartbik.Dec 6 2021, 11:46 AM

Changing commit message by removing references to sparse kernel since we are introducing a general benchmark framework here

SaurabhJha retitled this revision from [mlir] Set up boilerplate build for MLIR sparse kernel benchmarks to [mlir] Set up boilerplate build for MLIR benchmarks.Dec 6 2021, 12:02 PM

SaurabhJha edited the summary of this revision. (Show Details)

Harbormaster completed remote builds in B137719: Diff 392147.Dec 6 2021, 12:23 PM

mehdi_amini added inline comments.Dec 6 2021, 1:38 PM

mlir/CMakeLists.txt
239	I think this should all be inside the benchmark folder.
244	This should be behind an option `-DMLIR_ENABLE_CXX_BENCHMARKS` or something like that. I wouldn't use an overly generic name like `MLIR_ENABLE_BENCHMARKS` because I can imagine other infrastructure (Python for example) that will also do benchmarking.

Address comments: move benchmarking configuration to inner directory and enable/disable benchmarking with a flag

Harbormaster completed remote builds in B137740: Diff 392175.Dec 6 2021, 2:19 PM

aartbik added inline comments.Dec 6 2021, 4:21 PM

mlir/CMakeLists.txt
218	looks like newline is missing
mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	jumping ahead a little, you are setting up a framework that runs C++ code how are you going to make the link from C++ to MLIR IR (i.e. how do you envision running a sparse kernel benchmark this way)?

SaurabhJha added inline comments.Dec 7 2021, 11:24 AM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	I don't know the answer to it yet but I am going through some more documentation about how MLIR IR, specifically around what's the equivalent of `IRBuilder` in MLIR world. What metrics of MLIR IR do we care about? For example, is code size important?

mehdi_amini added inline comments.Dec 7 2021, 11:32 AM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	code size could be something important on Mobile indeed. It isn't the main focus in general though (compared to execution time). We can write C++ execution tests (example: https://github.com/llvm/llvm-project/blob/main/mlir/unittests/ExecutionEngine/Invoke.cpp ) but we'll need to significantly improve the runtime support to make this convenient I think.

SaurabhJha added inline comments.Dec 7 2021, 11:46 AM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	Thanks for the pointers @mehdi_amini. That's what I was thinking, somehow generating IR inline and then timing it, but looks like it's not viable yet. I will think about it more and post my findings here.

SaurabhJha added inline comments.Dec 7 2021, 12:16 PM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	We can have an external python program, similar to `llvm-lit`, that can do these transformations externally and run some benchmarks. What do you think about this high level idea? I can start on this direction if that sounds viable.

mehdi_amini added inline comments.Dec 7 2021, 9:06 PM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	I'm not sure what you mean actually, can you elaborate what you have in mind?

aartbik added inline comments.Dec 7 2021, 10:06 PM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	Note that we are actually much more advanced in building, compiling, and running MLIR IR from Python programs, take a look at https://github.com/llvm/llvm-project/blob/main/mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py for example (which would give many opportunities to time part of the generated IR in a very fine-grained manner [after a warm JIT execution startup]). So in that sense, perhaps a Python based benchmarking framework for starters would make some sense too. But please explore a bit and report back to us.

SaurabhJha added inline comments.Dec 8 2021, 12:27 AM

mlir/benchmarks/MLIRStarter.cpp
3–8 ↗	(On Diff #392175)	Yeah, my idea is not very concrete yet but https://github.com/llvm/llvm-project/blob/main/mlir/test/Integration/Dialect/SparseTensor/python/test_SpMM.py can be a good starting point. I will flesh this out more and come back.

Introduce a python framework for benchmarking llvm programs

Herald added a reviewer: lebedev.ri. · View Herald TranscriptDec 12 2021, 7:52 AM

Herald added a subscriber: lebedev.ri. · View Herald Transcript

@mehdi_amini @aartbik I have introduced a python example benchmark. Let me know what you think about this new approach. It needs more fleshing out and a possible integration with llvm-lit. I have suggested a way we can use FileCheck for these benchmarks.

I am also posting a comment on discourse thread with more context so that more people can weigh in.

Harbormaster completed remote builds in B138865: Diff 393755.Dec 12 2021, 8:16 AM

aartbik added inline comments.Dec 13 2021, 11:07 AM

mlir/benchmark/Dialect/SparseTensor/python/test_SpMM.py
135 ↗	(On Diff #393755)	if you are just showing this as a first very rough example, that's fine, but we cannot check this in as is, since this is a very bad test for a benchmark (it stress tests all sparsity annotations over a very small test case) In the long run, we want a proper benchmark to (1) pick one sparsity annotation (or report numbers for each different annotation at least) (2) pick one test input, larger size (3) split the JIT part from the actual kernel execution part (4) maybe even report various metrics, such time to compile, time to execute etc.

SaurabhJha added inline comments.Dec 13 2021, 11:13 AM

mlir/benchmark/Dialect/SparseTensor/python/test_SpMM.py
135 ↗	(On Diff #393755)	Yes, I couldn't think of a good example so just copied the original sparse multiplication test just to demonstrate what it would look like. In fact, we should probably not even use `timeit` utility of python.

Set up a python script to run benchmarks. Remove google benchmark setup.

Even if the current benchmarking is the way to go, I couldn't find a way to consistently run them. In local, I have been running them from command line like this

bash
PYTHONPATH=build/tools/mlir/python_packages/mlir_core MLIR_C_RUNNER_UTILS=build/lib/libmlir_c_runner_utils.dylib MLIR_RUNNER_UTILS=build/lib/libmlir_runner_utils.dylib python mlir/benchmark/python/*.bench.py

Harbormaster completed remote builds in B140077: Diff 395427.Dec 20 2021, 6:51 AM

Added support of pushing benchmarks to an LNT server. Also added a README.

Harbormaster completed remote builds in B140374: Diff 395836.Dec 22 2021, 4:06 AM

mehdi_amini added inline comments.Dec 22 2021, 1:04 PM

mlir/benchmark/python/README.md
25 ↗	(On Diff #395836)	I'm not convinced by the parameters here: I would expect the "driver" / "framework" to be agnostic to what the benchmark function does mostly. Somehow like Python Unit-test maybe? I expect mostly from a framework like this to: discover benchmarks organize suites (so we need either some metadata, some naming convention, a manifest, or a directory convention eventually). Think about gtests for example. allow to run entire suites or just a subset with filtering. for each benchmark, runs multiple times until you get confidence in the result (google benchmarks does that I think?) instead of a fixed arbitrary number of times. collect and aggregate results in a report file.
mlir/benchmark/python/run.py
33 ↗	(On Diff #395836)	Can you add a mode where I get a local report file instead? (offline mode)

SaurabhJha added inline comments.Dec 22 2021, 3:18 PM

mlir/benchmark/python/README.md
25 ↗	(On Diff #395836)	Makes total sense. I have never implemented discovery or organising suites so give me some time to design and implement it. 🙂 I do hope that the approach is right and the final outcome would mostly look like a `pytest` invocation.

SaurabhJha added inline comments.Dec 23 2021, 7:00 AM

mlir/benchmark/python/README.md
25 ↗	(On Diff #395836)	Hey, I was implementing this and a question came to me. In the current implementation, we compile the module once and then run the compiled module n times. If we make the driver benchmark agnostic, we would be both compiling and running n number of times. Would that be acceptable? One way we could reproduce the current arrangement is by having a convention that the benchmarks return the compiled module and then we run the compiled module at the driver level. But because we are recording running times using `nano_time`, the driver would have to make assumptions about kernel's function signature to access the running times like we are doing now. I don't think we should be imposing this much from the driver. Do you have any thoughts on this?

SaurabhJha added inline comments.Dec 23 2021, 10:49 AM

mlir/benchmark/python/README.md
25 ↗	(On Diff #395836)	Thinking about this more, I think what I am looking for is "setup" for unit testing. This could be a good way to divide responsibilities between the driver and the sparse matrix specific setup. I also think that each benchmark can return a float for the time taken in a run which can then be collected by the driver.

SaurabhJha added inline comments.Dec 23 2021, 12:52 PM

mlir/benchmark/python/README.md
25 ↗	(On Diff #395836)	Sorry for too many messages here. I will spend some more time. It would be much easier if I have an implementation of these scattered ideas.

mehdi_amini added inline comments.Dec 23 2021, 4:31 PM

mlir/benchmark/python/README.md

25 ↗

(On Diff #395836)

There are many ways to handle this, for example a function decorated with @benchmark could actually return a tuple of functors to split various phases, instead of the function itself being invoked and timed.

```python
@benchmark(pipeline_string, ntimes)
def benchmark_something_module():
   def setup():
      module = ir.Module.create()
      # Define arguments for the kernel function
      with ir.InsertionPoint(module.body):
          @builtin.FuncOp.from_py_func(<arguments_defined_above>)
          def kernel_name(<parameters>):
              # Kernel implementation

      compiled_module = compile(module)
      execution_engine = mlir.jit.init()
      execution_engine.register(compiled_module)
      return compiled_module, execution_engine

  def run(args):
      compiled_module = args[0]
      execution_engine = args[1]
      execution_engine.run("main", ....)   

  return (setup, run)

The framework can then do:

run_args = setup()

time(run, run_args)

I just made that up right now, not saying it is necessarily the best option :)

Still a work in progress but I have addressed some comments.

I have abstracted everything into a library.
Implemented benchmark discovery similar to pytests.
Better separation of running passes, compiling, and running.

Some issues still need resolution:

I am still running benchmarks a fixed number of times instead of running till they are statistically significant.
The module discovery doesn't include filtering like pytests.

I am working on these two things right now.

Harbormaster completed remote builds in B140661: Diff 396226.Dec 26 2021, 10:50 AM

Glancing at it:

it's nice that you're splitting the general library in mlir/utils/
it seems that there is still a strong coupling between the benchmark framework and the "thing to benchmark". It isn't clear to me why the code mlir/utils/mbr needs to know anything about the IR or the execution engine. Ideally it should be able to benchmark any random python code. Think that we may use this to benchmark a Numpy implementation vs MLIR for example. This is why the example I showed you before was directly returning callbacks: they can run any arbitrary code and the framework does not know about it.

This diff has these changes:

Address the comment of returning a compile function and a run function from a benchmark which the framework can use.
Add filtering of benchmark paths.
Dynamically determine the number of runs required for a benchmark function. I have used a strategy similar to python's timeit (https://github.com/python/cpython/blob/main/Lib/timeit.py#L31-L33) and google benchmark (https://github.com/google/benchmark/blob/main/src/benchmark_runner.cc#L231-L253). Let me know if need something different here.

I haven't included a README since I am not yet sure if the framework interface is something we agree on. Once we have an agreement, I will add a user guide of this library in the README.

Harbormaster completed remote builds in B140740: Diff 396329.Dec 27 2021, 12:42 PM

The latest revision contains the following changes

Adding a README.
Having a configuration file for the library.
Having a numpy benchmark as an example where there is no compile function.
Improve benchmark filtering to filter by benchmark name.

Harbormaster completed remote builds in B141326: Diff 397059.Jan 3 2022, 7:22 AM

Hey @mehdi_amini @aartbik , I have addressed all outstanding issues. Can you please take a look and help me decide how to proceed further on this?

this is converging to something very nice, thanks for all your work and patience!

mlir/benchmark/python/benchmark_sparse.py
23	Please add top level comment """xxxx""" describing the benchmark, just to set the right example for future benchmarks also, nitpicky, do we want a "flat" python/bencmark_.py structure, or do we want to specialize like benchmark/sparse/.py benchmark/dense/*.py Just curious if we should set that example now already, or whether we refactor later if the number of benchmarks grows
24	this probably belongs in common now, ie. setting up various pipelines that will be shared by benchmarks
mlir/benchmark/python/common.py
10	top level doc on what is in this file """Common utilities .... """
mlir/utils/mbr/mbr/config.ini
10	what are these newline missing comments?
mlir/utils/mbr/mbr/discovery.py
10	please document every method with a python doc string
mlir/utils/mbr/mbr/main.py
33	try to keep the 80-col limit

Addressed latest round of comments

Added python docstrings for functions and modules.
Made all lines wrap around 80 character limit.
Added newlines.

Regarding flat python/bencmark_*.py structure, my vote would be for the current structure as its simpler and we could always move
things around later. I am open to other directory structures as well so let me know if you have any alternate thoughts :)

Harbormaster completed remote builds in B143258: Diff 399814.Jan 13 2022, 4:20 PM

SaurabhJha added inline comments.Jan 18 2022, 10:06 AM

mlir/benchmark/python/common.py
27	I wonder whether this `setup_pass` belongs to `common` as it seems specific to sparse tensors.

Add CMake targets for installation of the library

Add trailing line to mlir/.gitignore

Harbormaster completed remote builds in B144104: Diff 400972.Jan 18 2022, 2:38 PM

mehdi_amini added inline comments.Jan 18 2022, 11:55 PM

mlir/.gitignore
10 ↗	(On Diff #400972)	Why do all these folder show up here? In general the execution is in the build (which is at the top level and not under MLIR). We should leave our source directory "pristine".
mlir/benchmark/python/common.py
27	Yes it isn't "common" from this point of view, but this is also the case of a helper like `create_sparse_np_tensor` below.
91	This is not describing the loop that is emitted and the wrapping
mlir/utils/mbr/mbr/stats.py
15	These seems fairly arbitrary and overly large to me. Where is this coming from? Did you look into how other solution are using some statistical approach to evaluate confidence?

Address latest round of comments.

Why do all these folder show up here? In general the execution is in the build (which is at the top level and not under MLIR).

We should leave our source directory "pristine".
I have now made an llvm-lit kind of arrangement where we move the executable to the build directory and mlir-mbr is invoked from there. That way, we won't be bothered by "*.pyc" files. Unfortunately, the egg-info and build/ directories are created by pip install -e which we have to do to install mlir-mbr. I included manual rm -rs in CMakeLists.txt.

Yes it isn't "common" from this point of view, but this is also the case of a helper like create_sparse_np_tensor below.

Absolutely. I tried splitting common into benchmark wide common and sparse specific common by creating a directory called mlir/benchmark/python/sparse and moving sparse specific common and sparse benchmark there. Unfortunately, I couldn't work around parent package import problem where benchmark is trying to import from common in the parent directory. I kept it unchanged for now.

We need to solve this problem as I expect benchmarks would be divided into different directories.

This is not describing the loop that is emitted and the wrapping

Updated the comment.

These seems fairly arbitrary and overly large to me. Where is this coming from?

I took the general idea from google benchmark https://github.com/google/benchmark/blob/main/src/benchmark_runner.cc#L260-L265 and python timeit https://github.com/python/cpython/blob/main/Lib/timeit.py#L31-L33. The values probably need tweaking but my understanding is the general idea is to either run it till some max time has elapsed or we have a sufficient number of measurements.

Harbormaster completed remote builds in B144437: Diff 401434.Jan 19 2022, 5:02 PM

In D115174#3256557, @SaurabhJha wrote:

Address latest round of comments.

Why do all these folder show up here? In general the execution is in the build (which is at the top level and not under MLIR).

We should leave our source directory "pristine".
I have now made an llvm-lit kind of arrangement where we move the executable to the build directory and mlir-mbr is invoked from there. That way, we won't be bothered by "*.pyc" files. Unfortunately, the egg-info and build/ directories are created by pip install -e which we have to do to install mlir-mbr. I included manual rm -rs in CMakeLists.txt.

llvm-lit is also a python program, how is it setup?
The rm looks a bit hacky to me, I rather not touch the source directory at all.

These seems fairly arbitrary and overly large to me. Where is this coming from?

I took the general idea from google benchmark https://github.com/google/benchmark/blob/main/src/benchmark_runner.cc#L260-L265 and python timeit https://github.com/python/cpython/blob/main/Lib/timeit.py#L31-L33. The values probably need tweaking but my understanding is the general idea is to either run it till some max time has elapsed or we have a sufficient number of measurements.

Wow, that really low-tech...
Probably good enough to get started, but please make a default to a couple of seconds, not 100.

llvm-lit is also a python program, how is it setup?
The rm looks a bit hacky to me, I rather not touch the source directory at all.

Yep, got a solution for it! The problem was mbr seeked to be both a library and a CLI runner. As a library, it used to provide BenchmarkRunConfig so that benchmarks
could import them and return them like this.

python
from mbr import BenchmarkRunConfig


def some_benchmark():
    ...
    return BenchmarkRunConfig(compiler=compiler, runner=runner)

To make mbr available, we had to do pip install which created those build directories. Now I just return a two element tuple from each benchmark and check them in the runner. So we don't need a library and thus we can remove the pip install.

Probably good enough to get started, but please make a default to a couple of seconds, not 100.

Made it 1 second (1e9 ns).

I have also made it default to continue running benchmarks if any benchmark raises an exception. This is similar to how unittests handle testcases that raise an exception.

Remove references to BenchmarkRunConfig from README

lebedev.ri removed a reviewer: lebedev.ri.Jan 20 2022, 2:21 PM

Herald added a reviewer: lebedev.ri. · View Herald TranscriptJan 20 2022, 2:21 PM

Harbormaster completed remote builds in B144688: Diff 401781.Jan 20 2022, 4:20 PM

LGTM, if @aartbik is happy with this.

• aktech added a subscriber: • aktech.Jan 22 2022, 5:56 AM

In D115174#3260328, @mehdi_amini wrote:

LGTM, if @aartbik is happy with this.

@aartbik polite ping 🙂

Thanks a lot for all your efforts and patience!

LGTM as well, but with a few last nits on naming

please address those before submitting (but I am giving you a green light now already to avoid further delay once you are done with those ;-) ;-)

mlir/benchmark/python/benchmark_sparse.py
2	matrices -> tensors
31	throughout the file please replace sparse multiplication with sparse matrix multiplication (note that we use "sparse tensor" when we talk about the support in general, but for specific 2-d cases, using matirx is okay)
mlir/benchmark/python/common.py
27	Also note that thanks to Wren's work, we soon can simplify this a lot!

This revision is now accepted and ready to land.Jan 26 2022, 6:34 PM

Address final comments on naming

SaurabhJha added inline comments.Jan 27 2022, 10:17 AM

mlir/benchmark/python/common.py
27	Nice, I guess you are talking about this one. I am keeping track of that and will change this once that's in the main tree. I hope I am guessing it correctly that you are okay with this going in as of now. I have addressed the name changes that you raised in the last comment.

I'll wait for the build to pass before merging this in.

The build is taking quite a while. Is it okay to restart if it doesn't finish in about 9-10 hours?

Harbormaster completed remote builds in B146083: Diff 403717.Jan 27 2022, 1:32 PM

Alright, this is it! The build has passed, merging.

Closed by commit rGfa90c9d5e7a3: [mlir] Set up boilerplate build for MLIR benchmarks (authored by SaurabhJha). · Explain WhyJan 27 2022, 1:45 PM

This revision was automatically updated to reflect the committed changes.

SaurabhJha added a commit: rGfa90c9d5e7a3: [mlir] Set up boilerplate build for MLIR benchmarks.

Revision Contents

Path

Size

mlir/

CMakeLists.txt

4 lines

benchmark/

python/

__init__.py

benchmark_sparse.py

121 lines

common.py

124 lines

utils/

mbr/

CMakeLists.txt

1 line

README.md

86 lines

mbr/

13 lines

9 lines

75 lines

110 lines

39 lines

86 lines

14 lines

Diff 403790

mlir/CMakeLists.txt

Show First 20 Lines • Show All 205 Lines • ▼ Show 20 Lines	if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
if (NOT LLVM_ENABLE_IDE)		if (NOT LLVM_ENABLE_IDE)
add_llvm_install_targets(install-mlir-headers		add_llvm_install_targets(install-mlir-headers
DEPENDS mlir-headers		DEPENDS mlir-headers
COMPONENT mlir-headers)		COMPONENT mlir-headers)
endif()		endif()
endif()		endif()

add_subdirectory(cmake/modules)		add_subdirectory(cmake/modules)

		if (MLIR_ENABLE_PYTHON_BENCHMARKS)
		add_subdirectory(utils/mbr)
		endif()
		mehdi_aminiUnsubmitted Not Done Reply Inline Actions I think this should all be inside the benchmark folder. mehdi_amini: I think this should all be inside the benchmark folder.
		mehdi_aminiUnsubmitted Not Done Reply Inline Actions This should be behind an option `-DMLIR_ENABLE_CXX_BENCHMARKS` or something like that. I wouldn't use an overly generic name like `MLIR_ENABLE_BENCHMARKS` because I can imagine other infrastructure (Python for example) that will also do benchmarking. mehdi_amini: This should be behind an option `-DMLIR_ENABLE_CXX_BENCHMARKS` or something like that. I…
		aartbikUnsubmitted Not Done Reply Inline Actions looks like newline is missing aartbik: looks like newline is missing

mlir/benchmark/python/init.py

This file was added.

This is an empty file.

mlir/benchmark/python/benchmark_sparse.py

This file was added.

				"""This file contains benchmarks for sparse tensors. In particular, it
				contains benchmarks for both mlir sparse tensor dialect and numpy so that they
				aartbikUnsubmitted Not Done Reply Inline Actions matrices -> tensors aartbik: matrices -> tensors
				can be compared against each other.
				"""
				import ctypes
				import numpy as np
				import os
				import re
				import time

				from mlir import ir
				from mlir import runtime as rt
				from mlir.dialects import builtin
				from mlir.dialects.linalg.opdsl import lang as dsl
				from mlir.execution_engine import ExecutionEngine

				from common import create_sparse_np_tensor
				from common import emit_timer_func
				from common import emit_benchmark_wrapped_main_func
				from common import get_kernel_func_from_module
				from common import setup_passes


				aartbikUnsubmitted Not Done Reply Inline Actions Please add top level comment """xxxx""" describing the benchmark, just to set the right example for future benchmarks also, nitpicky, do we want a "flat" python/bencmark_.py structure, or do we want to specialize like benchmark/sparse/.py benchmark/dense/.py Just curious if we should set that example now already, or whether we refactor later if the number of benchmarks grows aartbik:* Please add top level comment """xxxx""" describing the benchmark, just to set the right…
				@dsl.linalg_structured_op
				aartbikUnsubmitted Not Done Reply Inline Actions this probably belongs in common now, ie. setting up various pipelines that will be shared by benchmarks aartbik: this probably belongs in common now, ie. setting up various pipelines that will be shared by…
				def matmul_dsl(
				A=dsl.TensorDef(dsl.T, dsl.S.M, dsl.S.K),
				B=dsl.TensorDef(dsl.T, dsl.S.K, dsl.S.N),
				C=dsl.TensorDef(dsl.T, dsl.S.M, dsl.S.N, output=True)
				):
				"""Helper function for mlir sparse matrix multiplication benchmark."""
				C[dsl.D.m, dsl.D.n] += A[dsl.D.m, dsl.D.k] * B[dsl.D.k, dsl.D.n]
				aartbikUnsubmitted Not Done Reply Inline Actions throughout the file please replace sparse multiplication with sparse matrix multiplication (note that we use "sparse tensor" when we talk about the support in general, but for specific 2-d cases, using matirx is okay) aartbik: throughout the file please replace sparse multiplication with sparse matrix multiplication…


				def benchmark_sparse_mlir_multiplication():
				"""Benchmark for mlir sparse matrix multiplication. Because its an
				MLIR benchmark we need to return both a `compiler` function and a `runner`
				function.
				"""
				with ir.Context(), ir.Location.unknown():
				module = ir.Module.create()
				f64 = ir.F64Type.get()
				param1_type = ir.RankedTensorType.get([1000, 1500], f64)
				param2_type = ir.RankedTensorType.get([1500, 2000], f64)
				result_type = ir.RankedTensorType.get([1000, 2000], f64)
				with ir.InsertionPoint(module.body):
				@builtin.FuncOp.from_py_func(param1_type, param2_type, result_type)
				def sparse_kernel(x, y, z):
				return matmul_dsl(x, y, outs=[z])

				def compiler():
				with ir.Context(), ir.Location.unknown():
				kernel_func = get_kernel_func_from_module(module)
				timer_func = emit_timer_func()
				wrapped_func = emit_benchmark_wrapped_main_func(
				kernel_func,
				timer_func
				)
				main_module_with_benchmark = ir.Module.parse(
				str(timer_func) + str(wrapped_func) + str(kernel_func)
				)
				setup_passes(main_module_with_benchmark)
				c_runner_utils = os.getenv("MLIR_C_RUNNER_UTILS", "")
				assert os.path.exists(c_runner_utils),\
				f"{c_runner_utils} does not exist." \
				f" Please pass a valid value for" \
				f" MLIR_C_RUNNER_UTILS environment variable."
				runner_utils = os.getenv("MLIR_RUNNER_UTILS", "")
				assert os.path.exists(runner_utils),\
				f"{runner_utils} does not exist." \
				f" Please pass a valid value for MLIR_RUNNER_UTILS" \
				f" environment variable."

				engine = ExecutionEngine(
				main_module_with_benchmark,
				3,
				shared_libs=[c_runner_utils, runner_utils]
				)
				return engine.invoke

				def runner(engine_invoke):
				compiled_program_args = []
				for argument_type in [
				result_type, param1_type, param2_type, result_type
				]:
				argument_type_str = str(argument_type)
				dimensions_str = re.sub("<\|>\|tensor", "", argument_type_str)
				dimensions = [int(dim) for dim in dimensions_str.split("x")[:-1]]
				if argument_type == result_type:
				argument = np.zeros(dimensions, np.float64)
				else:
				argument = create_sparse_np_tensor(dimensions, 1000)
				compiled_program_args.append(
				ctypes.pointer(
				ctypes.pointer(rt.get_ranked_memref_descriptor(argument))
				)
				)
				np_timers_ns = np.array([0], dtype=np.int64)
				compiled_program_args.append(
				ctypes.pointer(
				ctypes.pointer(rt.get_ranked_memref_descriptor(np_timers_ns))
				)
				)
				engine_invoke("main", *compiled_program_args)
				return int(np_timers_ns[0])

				return compiler, runner


				def benchmark_np_matrix_multiplication():
				"""Benchmark for numpy matrix multiplication. Because its a python
				benchmark, we don't have any `compiler` function returned. We just return
				the `runner` function.
				"""
				def runner():
				argument1 = np.random.uniform(low=0.0, high=100.0, size=(1000, 1500))
				argument2 = np.random.uniform(low=0.0, high=100.0, size=(1500, 2000))
				start_time = time.time_ns()
				np.matmul(argument1, argument2)
				return time.time_ns() - start_time

				return None, runner

mlir/benchmark/python/common.py

This file was added.

				"""Common utilities that are useful for all the benchmarks."""
				import numpy as np

				import mlir.all_passes_registration

				from mlir import ir
				from mlir.dialects import arith
				from mlir.dialects import builtin
				from mlir.dialects import memref
				from mlir.dialects import scf
				aartbikUnsubmitted Not Done Reply Inline Actions top level doc on what is in this file """Common utilities .... """ aartbik: top level doc on what is in this file """Common utilities .... """
				from mlir.dialects import std
				from mlir.passmanager import PassManager


				def setup_passes(mlir_module):
				"""Setup pass pipeline parameters for benchmark functions.
				"""
				opt = (
				"parallelization-strategy=0"
				" vectorization-strategy=0 vl=1 enable-simd-index32=False"
				)
				pipeline = (
				f"builtin.func"
				f"(linalg-generalize-named-ops,linalg-fuse-elementwise-ops),"
				f"sparsification{{{opt}}},"
				f"sparse-tensor-conversion,"
				f"builtin.func"
				SaurabhJhaAuthorUnsubmitted Done Reply Inline Actions I wonder whether this `setup_pass` belongs to `common` as it seems specific to sparse tensors. SaurabhJha: I wonder whether this `setup_pass` belongs to `common` as it seems specific to sparse tensors.
				mehdi_aminiUnsubmitted Not Done Reply Inline Actions Yes it isn't "common" from this point of view, but this is also the case of a helper like `create_sparse_np_tensor` below. mehdi_amini: Yes it isn't "common" from this point of view, but this is also the case of a helper like…
				aartbikUnsubmitted Not Done Reply Inline Actions Also note that thanks to Wren's work, we soon can simplify this a lot! aartbik: Also note that thanks to Wren's work, we soon can simplify this a lot!
				SaurabhJhaAuthorUnsubmitted Done Reply Inline Actions Nice, I guess you are talking about this one. I am keeping track of that and will change this once that's in the main tree. I hope I am guessing it correctly that you are okay with this going in as of now. I have addressed the name changes that you raised in the last comment. SaurabhJha: Nice, I guess you are talking about [this](https://reviews.llvm.org/D117919) one. I am keeping…
				f"(linalg-bufferize,convert-linalg-to-loops,convert-vector-to-scf),"
				f"convert-scf-to-std,"
				f"func-bufferize,"
				f"tensor-constant-bufferize,"
				f"builtin.func(tensor-bufferize,std-bufferize,finalizing-bufferize),"
				f"convert-vector-to-llvm"
				f"{{reassociate-fp-reductions=1 enable-index-optimizations=1}},"
				f"lower-affine,"
				f"convert-memref-to-llvm,"
				f"convert-std-to-llvm,"
				f"reconcile-unrealized-casts"
				)
				PassManager.parse(pipeline).run(mlir_module)


				def create_sparse_np_tensor(dimensions, number_of_elements):
				"""Constructs a numpy tensor of dimensions `dimensions` that has only a
				specific number of nonzero elements, specified by the `number_of_elements`
				argument.
				"""
				tensor = np.zeros(dimensions, np.float64)
				tensor_indices_list = [
				[np.random.randint(0, dimension) for dimension in dimensions]
				for _ in range(number_of_elements)
				]
				for tensor_indices in tensor_indices_list:
				current_tensor = tensor
				for tensor_index in tensor_indices[:-1]:
				current_tensor = current_tensor[tensor_index]
				current_tensor[tensor_indices[-1]] = np.random.uniform(1, 100)
				return tensor


				def get_kernel_func_from_module(module: ir.Module) -> builtin.FuncOp:
				"""Takes an mlir module object and extracts the function object out of it.
				This function only works for a module with one region, one block, and one
				operation.
				"""
				assert len(module.operation.regions) == 1, \
				"Expected kernel module to have only one region"
				assert len(module.operation.regions[0].blocks) == 1, \
				"Expected kernel module to have only one block"
				assert len(module.operation.regions[0].blocks[0].operations) == 1, \
				"Expected kernel module to have only one operation"
				return module.operation.regions[0].blocks[0].operations[0]


				def emit_timer_func() -> builtin.FuncOp:
				"""Returns the declaration of nano_time function. If nano_time function is
				used, the `MLIR_RUNNER_UTILS` and `MLIR_C_RUNNER_UTILS` must be included.
				"""
				i64_type = ir.IntegerType.get_signless(64)
				nano_time = builtin.FuncOp(
				"nano_time", ([], [i64_type]), visibility="private")
				nano_time.attributes["llvm.emit_c_interface"] = ir.UnitAttr.get()
				return nano_time


				def emit_benchmark_wrapped_main_func(func, timer_func):
				"""Takes a function and a timer function, both represented as FuncOp
				objects, and returns a new function. This new function wraps the call to
				the original function between calls to the timer_func and this wrapping
				in turn is executed inside a loop. The loop is executed
				len(func.type.results) times. This function can be used to create a
				mehdi_aminiUnsubmitted Not Done Reply Inline Actions This is not describing the loop that is emitted and the wrapping mehdi_amini: This is not describing the loop that is emitted and the wrapping
				"time measuring" variant of a function.
				"""
				i64_type = ir.IntegerType.get_signless(64)
				memref_of_i64_type = ir.MemRefType.get([-1], i64_type)
				wrapped_func = builtin.FuncOp(
				# Same signature and an extra buffer of indices to save timings.
				"main",
				(func.arguments.types + [memref_of_i64_type], func.type.results),
				visibility="public"
				)
				wrapped_func.attributes["llvm.emit_c_interface"] = ir.UnitAttr.get()

				num_results = len(func.type.results)
				with ir.InsertionPoint(wrapped_func.add_entry_block()):
				timer_buffer = wrapped_func.arguments[-1]
				zero = arith.ConstantOp.create_index(0)
				n_iterations = memref.DimOp(ir.IndexType.get(), timer_buffer, zero)
				one = arith.ConstantOp.create_index(1)
				iter_args = list(wrapped_func.arguments[-num_results - 1:-1])
				loop = scf.ForOp(zero, n_iterations, one, iter_args)
				with ir.InsertionPoint(loop.body):
				start = std.CallOp(timer_func, [])
				call = std.CallOp(
				func,
				wrapped_func.arguments[:-num_results - 1] + loop.inner_iter_args
				)
				end = std.CallOp(timer_func, [])
				time_taken = arith.SubIOp(end, start)
				memref.StoreOp(time_taken, timer_buffer, [loop.induction_variable])
				scf.YieldOp(list(call.results))
				std.ReturnOp(loop)

				return wrapped_func

mlir/utils/mbr/CMakeLists.txt

This file was added.

configure_file(mlir-mbr.in ${CMAKE_BINARY_DIR}/bin/mlir-mbr @ONLY)

mlir/utils/mbr/README.md

This file was added.

				# MBR - MLIR Benchmark Runner
				MBR is a tool to run benchmarks. It measures compilation and running times of
				benchmark programs. It uses MLIR's python bindings for MLIR benchmarks.

				## Installation
				To build and enable MLIR benchmarks, pass `-DMLIR_ENABLE_PYTHON_BENCHMARKS=ON`
				while building MLIR. If you make some changes to the `mbr` files itself, build
				again with `-DMLIR_ENABLE_PYTHON_BENCHMARKS=ON`.

				## Writing benchmarks
				As mentioned in the intro, this tool measures compilation and running times.
				An MBR benchmark is a python function that returns two callables, a compiler
				and a runner. Here's an outline of a benchmark; we explain its working after
				the example code.

				```python
				def benchmark_something():
				# Preliminary setup
				def compiler():
				# Compiles a program and creates an "executable object" that can be
				# called to invoke the compiled program.
				...

				def runner(executable_object):
				# Sets up arguments for executable_object and calls it. The
				# executable_object is returned by the compiler.
				# Returns an integer representing running time in nanoseconds.
				...

				return compiler, runner
				```

				The benchmark function's name must be prefixed by `"benchmark_"` and benchmarks
				must be in the python files prefixed by `"benchmark_` for them to be
				discoverable. The file and function prefixes are configurable using the
				configuration file `mbr/config.ini` relative to this README's directory.

				A benchmark returns two functions, a `compiler` and a `runner`. The `compiler`
				returns a callable which is accepted as an argument by the runner function.
				So the two functions work like this
				1. `compiler`: configures and returns a callable.
				2. `runner`: takes that callable in as input, sets up its arguments, and calls
				it. Returns an int representing running time in nanoseconds.

				The `compiler` callable is optional if there is no compilation step, for
				example, for benchmarks involving numpy. In that case, the benchmarks look
				like this.

				```python
				def benchmark_something():
				# Preliminary setup
				def runner():
				# Run the program and return the running time in nanoseconds.
				...

				return None, runner
				```
				In this case, the runner does not take any input as there is no compiled object
				to invoke.

				## Running benchmarks
				MLIR benchmarks can be run like this

				```bash
				PYTHONPATH=<path_to_python_mlir_core> <other_env_vars> python <llvm-build-path>/bin/mlir-mbr --machine <machine_identifier> --revision <revision_string> --result-stdout <path_to_start_search_for_benchmarks>
				```
				For a description of command line arguments, run

				```bash
				python mlir/utils/mbr/mbr/main.py -h
				```
				And to learn more about the other arguments, check out the LNT's
				documentation page [here](https://llvm.org/docs/lnt/concepts.html).

				If you want to run only specific benchmarks, you can use the positional argument
				`top_level_path` appropriately.

				1. If you want to run benchmarks in a specific directory or a file, set
				`top_level_path` to that.
				2. If you want to run a specific benchmark function, set the `top_level_path` to
				the file containing that benchmark function, followed by a `::`, and then the
				benchmark function name. For example, `mlir/benchmark/python/benchmark_sparse.py::benchmark_sparse_mlir_multiplication`.

				## Configuration
				Various aspects about the framework can be configured using the configuration
				file in the `mbr/config.ini` relative to the directory of this README.

mlir/utils/mbr/mbr/init.py

This file was added.

				"""The public API of this library is defined or imported here."""
				import dataclasses
				import typing


				@dataclasses.dataclass
				class BenchmarkRunConfig:
				"""Any benchmark runnable by this library must return an instance of this
				class. The `compiler` attribute is optional, for example for python
				benchmarks.
				"""
				runner: typing.Callable
				compiler: typing.Optional[typing.Callable] = None

mlir/utils/mbr/mbr/config.ini

This file was added.

				[discovery]
				function_prefix = benchmark_
				filename_prefix = benchmark_

				[stats]
				# 1 billion
				max_number_of_measurements = 1e9
				# 10 seconds
				max_time_for_a_benchmark_ns = 1e9
				aartbikUnsubmitted Not Done Reply Inline Actions what are these newline missing comments? aartbik: what are these newline missing comments?

mlir/utils/mbr/mbr/discovery.py

This file was added.

				"""This file contains functions for discovering benchmark functions. It works
				in a similar way to python's unittest library.
				"""
				import configparser
				import importlib
				import os
				import pathlib
				import re
				import sys
				import types
				aartbikUnsubmitted Not Done Reply Inline Actions please document every method with a python doc string aartbik: please document every method with a python doc string


				def discover_benchmark_modules(top_level_path):
				"""Starting from the `top_level_path`, discover python files which contains
				benchmark functions. It looks for files with a specific prefix, which
				defaults to "benchmark_"
				"""
				config = configparser.ConfigParser()
				config.read(
				os.path.join(os.path.dirname(os.path.realpath(__file__)), "config.ini")
				)
				if "discovery" in config.sections():
				filename_prefix = config["discovery"]["filename_prefix"]
				else:
				filename_prefix = "benchmark_"
				if re.search(fr"{filename_prefix}.*.py$", top_level_path):
				# A specific python file so just include that.
				benchmark_files = [top_level_path]
				else:
				# A directory so recursively search for all python files.
				benchmark_files = pathlib.Path(
				top_level_path
				).rglob(f"{filename_prefix}*.py")
				for benchmark_filename in benchmark_files:
				benchmark_abs_dir = os.path.abspath(os.path.dirname(benchmark_filename))
				sys.path.append(benchmark_abs_dir)
				module_file_name = os.path.basename(benchmark_filename)
				module_name = module_file_name.replace(".py", "")
				module = importlib.import_module(module_name)
				yield module
				sys.path.pop()


				def get_benchmark_functions(module, benchmark_function_name=None):
				"""Discover benchmark functions in python file. It looks for functions with
				a specific prefix, which defaults to "benchmark_".
				"""
				config = configparser.ConfigParser()
				config.read(
				os.path.join(os.path.dirname(os.path.realpath(__file__)), "config.ini")
				)
				if "discovery" in config.sections():
				function_prefix = config["discovery"].get("function_prefix")
				else:
				function_prefix = "benchmark_"

				module_functions = []
				for attribute_name in dir(module):
				attribute = getattr(module, attribute_name)
				if (
				isinstance(attribute, types.FunctionType)
				and attribute_name.startswith(function_prefix)
				):
				module_functions.append(attribute)

				if benchmark_function_name:
				# If benchmark_function_name is present, just yield the corresponding
				# function and nothing else.
				for function in module_functions:
				if function.__name__ == benchmark_function_name:
				yield function
				else:
				# If benchmark_function_name is not present, yield all functions.
				for function in module_functions:
				yield function

mlir/utils/mbr/mbr/main.py

This file was added.

				"""This file contains the main function that's called by the CLI of the library.
				"""

				import os
				import sys
				import time

				import numpy as np

				from discovery import discover_benchmark_modules, get_benchmark_functions
				from stats import has_enough_measurements


				def main(top_level_path, stop_on_error):
				"""Top level function called when the CLI is invoked.
				"""
				if "::" in top_level_path:
				if top_level_path.count("::") > 1:
				raise AssertionError(f"Invalid path {top_level_path}")
				top_level_path, benchmark_function_name = top_level_path.split("::")
				else:
				benchmark_function_name = None

				if not os.path.exists(top_level_path):
				raise AssertionError(
				f"The top-level path {top_level_path} doesn't exist"
				)

				modules = [module for module in discover_benchmark_modules(top_level_path)]
				benchmark_dicts = []
				for module in modules:
				benchmark_functions = [
				function for function in
				aartbikUnsubmitted Not Done Reply Inline Actions try to keep the 80-col limit aartbik: try to keep the 80-col limit
				get_benchmark_functions(module, benchmark_function_name)
				]
				for benchmark_function in benchmark_functions:
				try:
				compiler, runner = benchmark_function()
				except (TypeError, ValueError):
				error_message = (
				f"benchmark_function '{benchmark_function.__name__}'"
				f" must return a two tuple value (compiler, runner)."
				)
				if stop_on_error is False:
				print(error_message, file=sys.stderr)
				continue
				else:
				raise AssertionError(error_message)
				measurements_ns = np.array([])
				if compiler:
				start_compile_time_s = time.time()
				try:
				compiled_callable = compiler()
				except Exception as e:
				error_message = (
				f"Compilation of {benchmark_function.__name__} failed"
				f" because of {e}"
				)
				if stop_on_error is False:
				print(error_message, file=sys.stderr)
				continue
				else:
				raise AssertionError(error_message)
				total_compile_time_s = time.time() - start_compile_time_s
				runner_args = (compiled_callable,)
				else:
				total_compile_time_s = 0
				runner_args = ()
				while not has_enough_measurements(measurements_ns):
				try:
				measurement_ns = runner(*runner_args)
				except Exception as e:
				error_message = (
				f"Runner of {benchmark_function.__name__} failed"
				f" because of {e}"
				)
				if stop_on_error is False:
				print(error_message, file=sys.stderr)
				# Recover from runner error by breaking out of this loop
				# and continuing forward.
				break
				else:
				raise AssertionError(error_message)
				if not isinstance(measurement_ns, int):
				error_message = (
				f"Expected benchmark runner function"
				f" to return an int, got {measurement_ns}"
				)
				if stop_on_error is False:
				print(error_message, file=sys.stderr)
				continue
				else:
				raise AssertionError(error_message)
				measurements_ns = np.append(measurements_ns, measurement_ns)

				if len(measurements_ns) > 0:
				measurements_s = [t * 1e-9 for t in measurements_ns]
				benchmark_identifier = ":".join([
				module.__name__,
				benchmark_function.__name__
				])
				benchmark_dicts.append(
				{
				"name": benchmark_identifier,
				"compile_time": total_compile_time_s,
				"execution_time": list(measurements_s),
				}
				)

				return benchmark_dicts

mlir/utils/mbr/mbr/stats.py

This file was added.

				"""This file contains functions related to interpreting measurement results
				of benchmarks.
				"""
				import configparser
				import numpy as np
				import os


				def has_enough_measurements(measurements):
				"""Takes a list/numpy array of measurements and determines whether we have
				enough measurements to make a confident judgement of the performance. The
				criteria for determining whether we have enough measurements is as follows.
				1. Whether enough time, defaulting to 1 second, has passed.
				2. Whether we have a max number of measurements, defaulting to a billion.

				mehdi_aminiUnsubmitted Not Done Reply Inline Actions These seems fairly arbitrary and overly large to me. Where is this coming from? Did you look into how other solution are using some statistical approach to evaluate confidence? mehdi_amini: These seems fairly arbitrary and overly large to me. Where is this coming from? Did you look…
				If 1. is true, 2. doesn't need to be true.
				"""
				config = configparser.ConfigParser()
				config.read(
				os.path.join(os.path.dirname(os.path.realpath(__file__)), "config.cfg")
				)
				if "stats" in config:
				stats_dict = {
				"max_number_of_measurements": int(
				float(config["stats"]["max_number_of_measurements"])
				),
				"max_time_for_a_benchmark_ns": int(
				float(config["stats"]["max_time_for_a_benchmark_ns"])
				),
				}
				else:
				stats_dict = {
				"max_number_of_measurements": 1e9,
				"max_time_for_a_benchmark_ns": 1e9,
				}
				return (
				np.sum(measurements) >= stats_dict["max_time_for_a_benchmark_ns"] or
				np.size(measurements) >= stats_dict["max_number_of_measurements"]
				)

mlir/utils/mbr/mlir-mbr.in

This file was added.

				#!@Python3_EXECUTABLE@
				# -- coding: utf-8 --

				import argparse
				import datetime
				import json
				import os
				import sys

				from urllib import error as urlerror
				from urllib import parse as urlparse
				from urllib import request


				mlir_source_root = "@MLIR_SOURCE_DIR@"
				sys.path.insert(0, os.path.join(mlir_source_root, "utils", "mbr", "mbr"))

				from main import main


				if __name__ == "__main__":
				parser = argparse.ArgumentParser()
				parser.add_argument(
				"--machine",
				required=True,
				help="A platform identifier on which the "
				"benchmarks are run. For example"
				" <hardware>-<arch>-<optimization level>-<branch-name>"
				)
				parser.add_argument(
				"--revision",
				required=True,
				help="The key used to identify different runs. "
				"Could be anything as long as it"
				" can be sorted by python's sort function"
				)
				parser.add_argument(
				"--url",
				help="The lnt server url to send the results to",
				default="http://localhost:8000/db_default/v4/nts/submitRun"
				)
				parser.add_argument(
				"--result-stdout",
				help="Print benchmarking results to stdout instead"
				" of sending it to lnt",
				default=False,
				action=argparse.BooleanOptionalAction
				)
				parser.add_argument(
				"top_level_path",
				help="The top level path from which to search for benchmarks",
				default=os.getcwd(),
				)
				parser.add_argument(
				"--stop_on_error",
				help="Should we stop the benchmark run on errors? Defaults to false",
				default=False,
				)
				args = parser.parse_args()

				complete_benchmark_start_time = datetime.datetime.utcnow().isoformat()
				benchmark_function_dicts = main(args.top_level_path, args.stop_on_error)
				complete_benchmark_end_time = datetime.datetime.utcnow().isoformat()
				lnt_dict = {
				"format_version": "2",
				"machine": {"name": args.machine},
				"run": {
				"end_time": complete_benchmark_start_time,
				"start_time": complete_benchmark_end_time,
				"llvm_project_revision": args.revision
				},
				"tests": benchmark_function_dicts,
				"name": "MLIR benchmark suite"
				}
				lnt_json = json.dumps(lnt_dict, indent=4)
				if args.result_stdout is True:
				print(lnt_json)
				else:
				request_data = urlparse.urlencode(
				{"input_data": lnt_json}
				).encode("ascii")
				req = request.Request(args.url, request_data)
				try:
				resp = request.urlopen(req)
				except urlerror.HTTPError as e:
				print(e)

mlir/utils/mbr/requirements.txt

This file was added.