This is an archive of the discontinued LLVM Phabricator instance.


Differential D148005

[mlir] Update SVE integration tests to use mlir-cpu-runner
Abandoned · Public

Authored by awarzynski on Apr 11 2023, 3:22 AM.


Details

Reviewers

aartbik
nicolasvasilache
dcaballe
c-rhodes
jsetoain

Summary

With the recent addition of "-mattr" and "-march" to the list of options
supported by mlir-cpu-runner, we can update the SVE integration tests
to use mlir-cpu-runner instead of lli. This way we make sure that these
tests better align with the other integration tests in MLIR.

This patch removes all the lli-related logic from the test setup
for SVE and replaces it with mlir-cpu-runner (e.g. CMake and LIT
variables). It also reduces the duplication of RUN lines in
"mlir/test/Integration/Dialect/SparseTensor/CPU/". More specifically,
at the moment there are usually 2 RUN lines to test for vectorisation:

one for VLS vectorisation,
one for VLA vectorisation whenever that's available and which reduces to VLS vectorisation when VLA is not supported.

When VLA is not available, VLS vectorisation is verified twice. This
should be avoided - integration tests are relatively expensive to run.
Instead, this patch makes sure that there's only one RUN line for
vectorisation:

"use VLA when requested/supported, otherwise use VLS".

A dedicated input file that implements main for lli is no longer
needed and hence removed.

[1] https://reviews.llvm.org/D146917
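
As a rough illustration of the "use VLA when requested/supported, otherwise use VLS" rule, the selection could be sketched in Python roughly as below. This is a minimal sketch, not the patch itself: the `config` attribute names follow the `lit.local.cfg` changes in this revision, while the helper `vla_substitutions`, the values used in the non-VLA branch, and the way the emulator command is prefixed are assumptions made for illustration.

```python
from types import SimpleNamespace


def vla_substitutions(config):
    """Return (pattern, replacement) pairs for the single vectorisation RUN line."""
    if not config.mlir_run_arm_sve_tests:
        # Assumed fallback: VLA was not requested, so the RUN line degrades to
        # VLS and runs on the host mlir-cpu-runner.
        return [
            ("%ENABLE_VLA", "false"),
            ("%VLA_ARCH_ATTR_OPTIONS", ""),
            ("%mlir_cpu_runner_vla", "mlir-cpu-runner"),
        ]

    # VLA requested: prefer the Arm-native runner when one was configured.
    runner = config.arm_emulator_mlir_cpu_runner_executable or "mlir-cpu-runner"
    if config.arm_emulator_executable:
        # Assumed layout: "<emulator> <emulator options> <runner>".
        parts = [config.arm_emulator_executable, config.arm_emulator_options, runner]
        runner = " ".join(p for p in parts if p)

    return [
        ("%ENABLE_VLA", "true"),
        ("%VLA_ARCH_ATTR_OPTIONS", '--march=aarch64 --mattr="+sve"'),
        ("%mlir_cpu_runner_vla", runner),
    ]


# Hypothetical configuration: SVE tests enabled, no emulator configured.
native = SimpleNamespace(
    mlir_run_arm_sve_tests=True,
    arm_emulator_executable="",
    arm_emulator_options="",
    arm_emulator_mlir_cpu_runner_executable="",
)
print(dict(vla_substitutions(native))["%mlir_cpu_runner_vla"])
```

Either branch yields exactly one set of substitutions, so a test needs only one vectorisation RUN line regardless of how the build was configured.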

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

awarzynski created this revision. Apr 11 2023, 3:22 AM

Herald added a reviewer: aartbik. Apr 11 2023, 3:22 AM

Herald added a reviewer: aartbik.

Herald added a project: Restricted Project.

Herald added subscribers: bviyer, hanchung, jsetoain and 28 others.

awarzynski requested review of this revision. Apr 11 2023, 3:22 AM

Herald added a reviewer: nicolasvasilache. Apr 11 2023, 3:23 AM

Herald added a reviewer: dcaballe.

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

NOTE: I've only updated a few tests so far - I want to make sure that the logic is OK before I proceed. Once that's approved, I'll refactor the remaining tests and update this patch. Thanks!

awarzynski added reviewers: c-rhodes, jsetoain. Apr 11 2023, 3:29 AM

Harbormaster completed remote builds in B224728: Diff 512386. Apr 11 2023, 3:34 AM

Matt added a subscriber: Matt. Apr 11 2023, 11:44 AM

c-rhodes added inline comments. Apr 12 2023, 4:08 AM

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
2	Unrelated changes such as this make reviewing more difficult, please could you commit (or post) them separately as NFC and base your actual changes on top.
19	I think `-vla` can be dropped to make this generic, and perhaps use underscores to be consistent with other binary substitutions? https://llvm.org/docs/TestingGuide.html#substitutions
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
32	With the above suggestion this could be moved to a higher Lit config (`mlir/test/lit.cfg.py`?) to remove duplication.

awarzynski mentioned this in D148111: [mlir][ArmSME] Add tests for Streaming SVE. Apr 12 2023, 7:25 AM

Revert unrelated changes, replace %mlir-cpu-runner-vla with %mlir_cpu_runner_vla

Cheers for taking a look @c-rhodes! I was meant to submit my replies before updating the patch, apologies if this feels out-of-sync.

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
2	Sorry about that, I was meant to revert that.
19	Using underscores is a good suggestion, but I'm still strongly in favor of some suffix (could be `-vla`, doesn't have to be). Otherwise, we will end up with `%mlir-cpu-runner` and `%mlir_cpu_runner`. That would be too easy to confuse. Perhaps `%mlir_cpu_runner_vec`? Or `%mlir_cpu_runner_emu`?
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
32	Good idea, though that will have impact on all testing in MLIR and I would feel more confident doing that in a separate patch :) (once reviewers approve the logic, I'll update the remaining tests in mlir/test/Integration/Dialect/SparseTensor/CPU, which will increase the size of this patch quite significantly). But yes, the overall goal is to reduce duplication and unify how things get configured.

Harbormaster completed remote builds in B225266: Diff 513088. Apr 13 2023, 12:58 AM

c-rhodes added inline comments. Apr 13 2023, 1:16 AM

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
19	> Using underscores is a good suggestion, but I'm still strongly in favor of some suffix (could be `-vla`, doesn't have to be). Otherwise, we will end up with `%mlir-cpu-runner` and `%mlir_cpu_runner`. That would be too easy to confuse. Perhaps `%mlir_cpu_runner_vec`? Or `%mlir_cpu_runner_emu`?

It's not clear to me why 2 substitutions (vla and non-vla?) for mlir-cpu-runner binary are required?

awarzynski added inline comments. Apr 13 2023, 1:39 AM

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
19	Let me illustrate with an example (_before_ and _after_):

BEFORE

    // RUN: %{compile} | %{run_config_1}
    // RUN: %{compile} | %{run_config_2}
    // RUN: %{compile} | %{run_config_3}
    // RUN: %{compile} | %{run_config_3_with_or_without_emulation}

The final `RUN` line depends on e.g. `ENABLE_VLA` and `ARM_EMULATOR_EXECUTABLE`.

AFTER

    // RUN: %{compile} | %{run_config_1}
    // RUN: %{compile} | %{run_config_2}
    // RUN: %{compile} | %{run_config_3_with_or_without_emulation}

AFTER v2

If there's only one `%mlir_cpu_runner` substitution, then we will have this:

    // RUN: %{compile} | %{run_config_1_with_or_without_emulation}
    // RUN: %{compile} | %{run_config_2_with_or_without_emulation}
    // RUN: %{compile} | %{run_config_3_with_or_without_emulation}

My goal in this patch is to preserve the original config as much as I can. I'm not against running everything in an emulator, but I'd rather do it in incremental steps (so, BEFORE --> AFTER --> AFTER v2).

c-rhodes added inline comments. Apr 13 2023, 2:24 AM

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
19	> If there's only one `%mlir_cpu_runner` substitution, then we will have this:
>
>     // RUN: %{compile} | %{run_config_1_with_or_without_emulation}
>     // RUN: %{compile} | %{run_config_2_with_or_without_emulation}
>     // RUN: %{compile} | %{run_config_3_with_or_without_emulation}

But `%mlir_cpu_runner` substitution is only used in the SVE run line? The other run lines use the binary name `mlir-cpu-runner`.

awarzynski added inline comments. Apr 14 2023, 1:00 AM

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
19	I feel that we might be referring to different things.

> But %mlir_cpu_runner substitution is only used in the SVE run line? The other run lines use the binary name mlir-cpu-runner

"the binary name mlir-cpu-runner" is also effectively a substitution (created here). Because of these "default" substitutions, things like `%mlir-cpu-runner` are best avoided [1].

Now, I want to keep two substitutions to preserve the original logic as much as I can:

`mlir-cpu-runner` for regular `RUN` lines (`config_1` and `config_2` above),
`%mlir_cpu_runner_vec` (or something else) for SVE `RUN` lines (`config_3` above).

This is not possible with one substitution. I'm against introducing `%mlir_cpu_runner` (or `%mlir-cpu-runner`), because it is too close to `mlir-cpu-runner` (and I want to make sure that the spelling makes it obvious that the two are completely different). That's why I am suggesting an alternative. As the 3rd configuration is meant to run vectorised code, which _may_ or _may not_ be run via an emulator, I see 3 potential suffixes: `_vec` | `_emu` | `_vla`. To me, `%mlir_cpu_runner_vec` seems like the closest match for `config_3` and that's what I suggest that we use here. WDYT?

[1] It would get expanded to `%/<absolute-path>/mlir-cpu-runner`, because the `mlir-cpu-runner` substitution would precede the one for `%mlir-cpu-runner`. Unless I misunderstood how LIT expansion works.
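
The concern in footnote [1] can be illustrated with a deliberately simplified model of ordered substitution. The sketch below is not LIT's implementation (LIT matches tool names with regular expressions); it only applies plain string replacements in list order. The `expand` helper and both paths are hypothetical.

```python
def expand(line, substitutions):
    """Apply (pattern, replacement) pairs to a line, in list order."""
    for pattern, replacement in substitutions:
        line = line.replace(pattern, replacement)
    return line


# Hypothetical paths, for illustration only.
HOST_RUNNER = "/build/bin/mlir-cpu-runner"
VEC_RUNNER = "/arm/bin/runner-under-emulation"

# The "default" tool substitution comes first, as described in the footnote.
with_dashes = [
    ("mlir-cpu-runner", HOST_RUNNER),
    ("%mlir-cpu-runner", VEC_RUNNER),
]
with_underscores = [
    ("mlir-cpu-runner", HOST_RUNNER),
    ("%mlir_cpu_runner_vec", VEC_RUNNER),
]

# The dashed spelling is rewritten by the tool substitution before its own
# pattern is tried, leaving a stray "%" in front of the host path.
clobbered = expand("%mlir-cpu-runner -e entry", with_dashes)

# The underscore spelling never matches the tool name, so it expands cleanly.
intact = expand("%mlir_cpu_runner_vec -e entry", with_underscores)

print(clobbered)
print(intact)
```

Under this model, a spelling with underscores and a suffix avoids the clash because the tool name `mlir-cpu-runner` is no longer a substring of the new pattern.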

c-rhodes added inline comments. Apr 14 2023, 2:18 AM

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir
19	> I feel that we might be referring to different things.
>
> > But %mlir_cpu_runner substitution is only used in the SVE run line? The other run lines use the binary name mlir-cpu-runner
>
> "the binary name mlir-cpu-runner" is also effectively a substitution (created here). Because of these "default" substitutions, things like `%mlir-cpu-runner` are best avoided [1].

Perhaps that's why underscores are used elsewhere?

> Now, I want to keep two substitutions to preserve the original logic as much as I can:
>
> `mlir-cpu-runner` for regular `RUN` lines (`config_1` and `config_2` above),
> `%mlir_cpu_runner_vec` (or something else) for SVE `RUN` lines (`config_3` above).
>
> This is not possible with one substitution. I'm against introducing `%mlir_cpu_runner` (or `%mlir-cpu-runner`), because it is too close to `mlir-cpu-runner` (and I want to make sure that the spelling makes it obvious that the two are completely different). That's why I am suggesting an alternative. As the 3rd configuration is meant to run vectorised code, which _may_ or _may not_ be run via an emulator, I see 3 potential suffixes: `_vec` | `_emu` | `_vla`. To me, `%mlir_cpu_runner_vec` seems like the closest match for `config_3` and that's what I suggest that we use here. WDYT?
>
> [1] It would get expanded to `%/<absolute-path>/mlir-cpu-runner`, because the `mlir-cpu-runner` substitution would precede the one for `%mlir-cpu-runner`. Unless I misunderstood how LIT expansion works.

I thought this substitution is to support cross-compiling so the path to the target `lli` (rather than host) binary can be specified and run on an emulator. I think this should be as simple as specifying `%mlir_cpu_runner` (much like `%clang`, `%clang_cc1`, etc) for RUN lines where this is the case and `mlir-cpu-runner` (defaulting to llvm-tools-dir) everywhere else?

awarzynski mentioned this in D148232: [mlir][aarch64] Enable MLIR integration tests for SVE/SME under emulation. May 2 2023, 2:18 PM

Other than the typo, all looks good to me.

One comment. In your patch you mention the fact that, when VLA is not enabled, we run VLS twice. Your solution is to run either/or, but that's also not a good testing approach. Ideally, it will run one or both. Given that these tests are, at the moment, really inexpensive in VLS, running 2xVLS or VLS+VLA looks to me like the least worst solution (under the principle that redundant testing is not as bad as incomplete testing).

Fwiw, when I first wrote these tests, I considered two other approaches. In one of them, the most straightforward, I simply duplicated the test under a different directory. The second, I (mostly) avoided code duplication by including the original tests from a subdirectory. I pushed that patch for review, but it was deemed too ugly, which is how we ended up with this approach. You are definitely more proficient with LIT than I am, tests look much better nowadays, so you may be interested in reconsidering that approach (check: Diff1@D12304).

mlir/test/CMakeLists.txt
23

This revision is now accepted and ready to land. Jun 11 2023, 3:46 AM

In D148005#4411951, @jsetoain wrote:

Other than the typo, all looks good to me.

Thanks for taking a look! I actually need to refactor this a bit to match the changes from https://reviews.llvm.org/D148929. I just haven't had the time recently, but will do this week.

One comment. In your patch you mention the fact that, when VLA is not enabled, we run VLS twice. Your solution is to run either/or, but that's also not a good testing approach. Ideally, it will run one or both.

I agree. The approach proposed here is a compromise to avoid running VLS twice. That's the only low hanging fruit that we have found so far to reduce the execution time of these tests.

Given that these tests are, at the moment, really inexpensive in VLS, running 2xVLS or VLS+VLA looks to me like the least worst solution

Based on this comment, I am guessing that you assume that VLS is cheap and VLA is expensive :) That would be the case if:

VLS was always run natively,
VLA was always run in an emulator.

That's not the case though. We always run VLA natively (emulator is an option for folks without access to hardware). And we also made sure that these tests are enabled in the public SVE buildbots (https://reviews.llvm.org/D136460). Also, I assume that most folks have VLA tests disabled anyway. So when people say that these integration tests are slow, they are most likely referring to the VLS code-path.

Now, I am making a lot of assumptions - please let me know if I am completely wrong or misinterpreted your comment :)

(under the principle that redundant testing is not as bad as incomplete testing).

Good point! That's why, before proposing this change, I made sure that the SVE integration tests are enabled in upstream SVE bots (https://reviews.llvm.org/D136460). Ultimately, we want to make sure that the VLA code-path is well defended against any breakage and I am hoping that these bots are a strong enough line of defence :)

Again, this is a compromise. But this way we make sure that:

these tests are regularly run (by upstream buildbots),
folks that don't/can't run them, don't have to pay the extra price for an additional RUN line.

Fwiw, when I first wrote these tests, I considered two other approaches. In one of them, the most straightforward, I simply duplicated the test under a different directory. The second, I (mostly) avoided code duplication by including the original tests from a subdirectory. I pushed that patch for review, but it was deemed too ugly, which is how we ended up with this approach. You are definitely more proficient with LIT than I am, tests look much better nowadays, so you may be interested in reconsidering that approach (check: Diff1@D12304).

Thanks for the context, this is super helpful! In fact, we have been discussing how to reduce the number of these RUN lines and what you experimented with there is one approach that we have considered. But we would like to do that globally, i.e.:

move the test input/expected output to dedicated files,
move RUN lines to separate files.

And then, use CMake flags to control which RUN lines to enable/disable. AFAIK, that's currently not supported by LIT. You achieved that by duplicating test files, but that would mean ... more files to maintain. Usually that's unpopular :(

I think that people do agree that these tests (and the LIT config) could benefit from some more refactoring/re-design. That is bound to require some compromise, which, IMHO, is fine as long as we make sure that all tests are regularly run in buildbots.

-Andrzej

In D148005#4413110, @awarzynski wrote:

Given that these tests are, at the moment, really inexpensive in VLS, running 2xVLS or VLS+VLA looks to me like the least worst solution

Based on this comment, I am guessing that you assume that VLS is cheap and VLA is expensive :) That would be the case if:

VLS was always run natively,

VLA was always run in an emulator.

Actually, from what I remember of both, neither of them took a particularly long time to finish. I might be wrong (it's been a while), but I half-remember native VLS/VLA being instantaneous, and emulated VLA almost instantaneous.

(under the principle that redundant testing is not as bad as incomplete testing).

Good point! That's why, before proposing this change, I made sure that the SVE integration tests are enabled in upstream SVE bots (https://reviews.llvm.org/D136460). Ultimately, we want to make sure that the VLA code-path is well defended against any breakage and I am hoping that these bots are a strong enough line of defence :)

Just to be clear, my concern here is that, with this approach, if I modify something that breaks VLS but I'm compiling with VLA support, the integration tests won't detect the integration errors locally because the test won't run in VLS mode. That said, it's true that this is not a huge problem because those bugs will eventually be caught by bots testing VLS, and if test performance has become an issue, then I agree that VLA ^ VLS + build bot tests for each is a reasonable compromise.
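
The trade-off described here can be written down as a small table of which vectorisation modes a local test run exercises. The sketch below is only an illustration of the argument: the function `modes_run_locally` and the policy names `two_run_lines` and `either_or` are hypothetical labels for the behaviour before this patch and the behaviour it proposes.

```python
def modes_run_locally(policy, vla_enabled):
    """Return the vectorisation modes exercised by one local test run."""
    if policy == "two_run_lines":
        # Before this patch: one VLS RUN line plus one RUN line that uses VLA
        # when it is enabled and degrades to VLS otherwise.
        return ["VLS", "VLA" if vla_enabled else "VLS"]
    if policy == "either_or":
        # Proposed in this patch: a single RUN line, VLA if enabled, else VLS.
        return ["VLA" if vla_enabled else "VLS"]
    raise ValueError(f"unknown policy: {policy}")


for policy in ("two_run_lines", "either_or"):
    for vla_enabled in (False, True):
        print(policy, vla_enabled, modes_run_locally(policy, vla_enabled))
```

With `either_or` and VLA enabled, VLS is not exercised locally at all, which is the gap that the buildbots are expected to cover.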

Thanks for the context, this is super helpful! In fact, we have been discussing how to reduce the number of these RUN lines and what you experimented with there is one approach that we have considered. But we would like to do that globally, i.e.:

move the test input/expected output to dedicated files,

move RUN lines to separate files.

And then, use CMake flags to control which RUN lines to enable/disable. AFAIK, that's currently not supported by LIT. You achieved that by duplicating test files, but that would mean ... more files to maintain. Usually that's unpopular :(

Not sure I understand this. I only duplicated the tests that could not run in VLA (because of unsupported reduction operations on Arm; at the time I asked the internal compiler team, and those were supposed to be eventually fixed). In all the others, the only thing the tests do is to include the other source and run with VLA options. If you mean that by "duplicating test files", then yes, indeed :-)

It does feel like this is something lit should support one way or another, back then I spent quite a lot of time (way too much) trying to figure it out, and this is the best I could come up with. Do you know of anybody interested in adding a "RUN-IF" directive to lit? 😬

In D148005#4413349, @jsetoain wrote:

Actually, from what I remember of both, neither of them took a particularly long time to finish. I might be wrong (it's been a while), but I half-remember native VLS/VLA being instantaneous, and emulated VLA almost instantaneous.

Thanks for clarifying. I will just add that quite a few tests have been added in the last 6-9 months, so it's possible that these would no longer be "instantaneous" for you :) (but I might be wrong)

Just to be clear, my concern here is that, with this approach, if I modify something that breaks VLS but I'm compiling with VLA support, the integration tests won't detect the integration errors locally because the test won't run in VLS mode. That said, it's true that this is not a huge problem because those bugs will eventually be caught by bots testing VLS, and if test performance has become an issue, then I agree that VLA ^ VLS + build bot tests for each is a reasonable compromise.

This is a valid concern. However:

I suspect that only folks with access to SVE hardware enable/test the VLA code-path anyway. So it won't be exercised as frequently as VLS anyway.
Any change in "core" MLIR could break something in Flang (which depends on MLIR)? Should MLIR devs run Flang tests on a regular basis? Given Flang build times, that's an unrealistic expectation (caveat: I worked on Flang for quite a while). In practice, buildbots are used to capture such breakages (that was one of the reasons to set them up - there's ~8 that run on AArch64).

I'm definitely not against the flexibility that you are suggesting and agree with pretty much everything that you have said. I am just thinking that we should also be a bit more open to rely on our CI infrastructure for certain things. Especially, given these rather tricky limitations in LIT.

Thanks for the context, this is super helpful! In fact, we have been discussing how to reduce the number of these RUN lines and what you experimented with there is one approach that we have considered. But we would like to do that globally, i.e.:

move the test input/expected output to dedicated files,

move RUN lines to separate files.

And then, use CMake flags to control which RUN lines to enable/disable. AFAIK, that's currently not supported by LIT. You achieved that by duplicating test files, but that would mean ... more files to maintain. Usually that's unpopular :(

Not sure I understand this. I only duplicated the tests that could not run in VLA (because of unsupported reduction operations on Arm; at the time I asked the internal compiler team, and those were supposed to be eventually fixed). In all the others, the only thing the tests do is to include the other source and run with VLA options. If you mean that by "duplicating test files", then yes, indeed :-)

Yeah, I meant the latter. Sorry for the confusion. In general, we should work towards some basic model where: 1 test == 1 file (or, 1 file for input/output, 1 file with RUN lines).

Btw, I am aware of the limitations that you were hitting and my comment wasn't meant as criticism. Setting these tests up was a non-trivial task and we are super lucky that you paved the way for testing SVE in MLIR 🙏 😄 .

Do you know of anybody interested in adding a "RUN-IF" directive to lit? 😬

Haha, I was just about to say - this is clearly a huge gap in LIT infrastructure and we just need some brave volunteer(s) to fill it with some magic 😂 .

Sorry about the delay!

In D148005#4413349, @jsetoain wrote:

It does feel like this is something lit should support one way or another, back then I spent quite a lot of time (way too much) trying to figure it out, and this is the best I could come up with. Do you know of anybody interested in adding a "RUN-IF" directive to lit? 😬

@jsetoain Looks like this is actually already available 😂 In https://reviews.llvm.org/D155403 I used LIT's "conditional substitution" to achieve that. I intend to rebase this change ("switch from lli to mlir-cpu-runner for SVE tests") on top of D155403 ("make sure that there are no duplicated RUN lines"). My ultimate goal is to update all integration tests to make sure that:

there's no duplication, i.e. that the VLS vectorisation is always run once (enabled in D155403),
all RUN lines use mlir-cpu-runner so that most of the logic is shared (enabled here).

WDYT?
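
For readers unfamiliar with LIT's conditional substitution, its effect on a RUN line can be approximated with the toy expander below. This is a simplified sketch, not LIT's parser: it handles only a single, non-nested `%if <feature> %{ ... %} %else %{ ... %}` with a plain feature name, and the feature name `mlir_arm_sve_tests` and the two `%{run_...}` placeholders are assumptions made for illustration.

```python
import re

# Matches "%if <feature> %{ then %}" with an optional "%else %{ other %}".
_CONDITIONAL = re.compile(
    r"%if\s+(?P<feat>[\w-]+)\s*%\{(?P<then>.*?)%\}"
    r"(?:\s*%else\s*%\{(?P<other>.*?)%\})?"
)


def expand_conditionals(line, features):
    """Keep the branch selected by the available features, drop the other."""

    def pick(match):
        if match.group("feat") in features:
            return match.group("then").strip()
        return (match.group("other") or "").strip()

    return _CONDITIONAL.sub(pick, line)


run_line = (
    "// RUN: %{compile} | "
    "%if mlir_arm_sve_tests %{ %{run_vla} %} %else %{ %{run_vls} %}"
)

print(expand_conditionals(run_line, {"mlir_arm_sve_tests"}))
print(expand_conditionals(run_line, set()))
```

Because exactly one branch survives, a single RUN line covers both configurations without running VLS twice.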

Abandoning in favor of:

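Revision Contents

Path                                                                      Size
mlir/test/CMakeLists.txt                                                  4 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/Inputs/main_for_lli.ll     (deleted)
mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir     20 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg              14 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir          24 lines
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/lit.local.cfg             12 lines
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/test-sve.mlir             8 lines
mlir/test/lit.site.cfg.py.in                                              2 lines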

Diff 513088

mlir/test/CMakeLists.txt

Show All 14 Lines

  if (MLIR_INCLUDE_INTEGRATION_TESTS)
    set(INTEL_SDE_EXECUTABLE "" CACHE STRING
        "If set, arch-specific integration tests are run with Intel SDE.")
    set(ARM_EMULATOR_EXECUTABLE "" CACHE STRING
        "If set, arch-specific Arm integration tests are run with an emulator.")
    set(ARM_EMULATOR_OPTIONS "" CACHE STRING
        "If arch-specific Arm integration tests run emulated, pass these as parameters to the emulator.")
-   set(ARM_EMULATOR_LLI_EXECUTABLE "" CACHE STRING
+   set(ARM_EMULATOR_MLIR_CPU_RUNNER_EXECUTABLE_EXECUTABLE "" CACHE STRING

jsetoain (inline comment, Unsubmitted, Not Done):

        "If arch-specific Arm integration tests run emulated, pass these as parameters to the emulator.")
-   set(ARM_EMULATOR_MLIR_CPU_RUNNER_EXECUTABLE_EXECUTABLE "" CACHE STRING
+   set(ARM_EMULATOR_MLIR_CPU_RUNNER_EXECUTABLE "" CACHE STRING
        "If arch-specific Arm integration tests run emulated, use this Arm native mlir-cpu-runner.")

-       "If arch-specific Arm integration tests run emulated, use this Arm native lli.")
+       "If arch-specific Arm integration tests run emulated, use this Arm native mlir-cpu-runner.")
    set(ARM_EMULATOR_UTILS_LIB_DIR "" CACHE STRING
        "If arch-specific Arm integration tests run emulated, find Arm native utility libraries in this directory.")
    option(MLIR_RUN_AMX_TESTS "Run AMX tests.")
    option(MLIR_RUN_X86VECTOR_TESTS "Run X86Vector tests.")
    option(MLIR_RUN_CUDA_TENSOR_CORE_TESTS "Run CUDA Tensor core WMMA tests.")
    option(MLIR_RUN_ARM_SVE_TESTS "Run Arm SVE tests.")

▲ Show 20 Lines • Show All 137 Lines • Show Last 20 Lines

mlir/test/Integration/Dialect/SparseTensor/CPU/Inputs/main_for_lli.ll

This file was deleted.

	; Dummy wrapper required by lli, which does not support void functions (i.e.
	; it fails if non-zero code is returned)
	define i32 @entry_lli() {
	  call void @entry()
	  ret i32 0
	}

	declare void @entry()

mlir/test/Integration/Dialect/SparseTensor/CPU/concatenate_dim_1.mlir

	// DEFINE: %{option} = enable-runtime-library=true			// DEFINE: %{compile_opts} = enable-runtime-library=true
	// DEFINE: %{run_option} =			// DEFINE: %{run_opts} =
	// DEFINE: %{compile} = mlir-opt %s --sparse-compiler=%{option}			// DEFINE: %{compile} = mlir-opt %s --sparse-compiler=%{compile_opts}
	// DEFINE: %{run} = mlir-cpu-runner \			// DEFINE: %{cpu_runner} = mlir-cpu-runner
				// DEFINE: %{run} = %{cpu_runner} \
	// DEFINE: -e entry -entry-point-result=void \			// DEFINE: -e entry -entry-point-result=void \
	// DEFINE: -shared-libs=%mlir_lib_dir/libmlir_c_runner_utils%shlibext,%mlir_lib_dir/libmlir_runner_utils%shlibext %{run_option} \| \			// DEFINE: -shared-libs=%mlir_lib_dir/libmlir_c_runner_utils%shlibext,%mlir_lib_dir/libmlir_runner_utils%shlibext %{run_opts} \| \
	// DEFINE: FileCheck %s			// DEFINE: FileCheck %s
	//			//
	// RUN: %{compile} \| %{run}			// RUN: %{compile} \| %{run}
	//			//
	// Do the same run, but now with direct IR generation.			// Do the same run, but now with direct IR generation.
	// REDEFINE: %{option} = "enable-runtime-library=false enable-buffer-initialization=true"			// REDEFINE: %{compile_opts} = "enable-runtime-library=false enable-buffer-initialization=true"
	// RUN: %{compile} \| %{run}			// RUN: %{compile} \| %{run}
	//			//
	// Do the same run, but now with direct IR generation and vectorization. Enable			// Do the same run, but now with direct IR generation and vectorization. Enable
	// Arm SVE if supported.			// Arm SVE/VLA if supported.
	// REDEFINE: %{option} = "enable-runtime-library=false enable-buffer-initialization=true vl=4 enable-arm-sve=%ENABLE_VLA reassociate-fp-reductions=true enable-index-optimizations=true"			// REDEFINE: %{compile_opts} = "enable-runtime-library=false enable-buffer-initialization=true vl=4 enable-arm-sve=%ENABLE_VLA reassociate-fp-reductions=true enable-index-optimizations=true"
	// REDEFINE: %{run_option} = %VLA_ARCH_ATTR_OPTIONS			// REDEFINE: %{cpu_runner} = %mlir_cpu_runner_vla
				// REDEFINE: %{run_opts} = %VLA_ARCH_ATTR_OPTIONS
	// RUN: %{compile} | %{run}			// RUN: %{compile} | %{run}

	#MAT_C_C = #sparse_tensor.encoding<{dimLevelType = ["compressed", "compressed"]}>			#MAT_C_C = #sparse_tensor.encoding<{dimLevelType = ["compressed", "compressed"]}>
	#MAT_D_C = #sparse_tensor.encoding<{dimLevelType = ["dense", "compressed"]}>			#MAT_D_C = #sparse_tensor.encoding<{dimLevelType = ["dense", "compressed"]}>
	#MAT_C_D = #sparse_tensor.encoding<{dimLevelType = ["compressed", "dense"]}>			#MAT_C_D = #sparse_tensor.encoding<{dimLevelType = ["compressed", "dense"]}>
	#MAT_D_D = #sparse_tensor.encoding<{			#MAT_D_D = #sparse_tensor.encoding<{
	dimLevelType = ["dense", "dense"],			dimLevelType = ["dense", "dense"],
	dimOrdering = affine_map<(i,j) -> (j,i)>			dimOrdering = affine_map<(i,j) -> (j,i)>
	[... 144 lines not shown ...]

mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg

	import sys			import sys

	# FIXME: %mlir_native_utils_lib_dir is set incorrectly on Windows			# FIXME: %mlir_native_utils_lib_dir is set incorrectly on Windows
	if sys.platform == 'win32':			if sys.platform == 'win32':
	    config.unsupported = True			    config.unsupported = True

	# ArmSVE tests must be enabled via build flag.			# ArmSVE tests must be enabled via build flag.
	if config.mlir_run_arm_sve_tests:			if config.mlir_run_arm_sve_tests:
	    config.substitutions.append(('%ENABLE_VLA', 'true'))			    config.substitutions.append(('%ENABLE_VLA', 'true'))
	    config.substitutions.append(('%VLA_ARCH_ATTR_OPTIONS', '--march=aarch64 --mattr="+sve"'))			    config.substitutions.append(('%VLA_ARCH_ATTR_OPTIONS', '--march=aarch64 --mattr="+sve"'))
	    lli_cmd = 'lli'			    mlir_cpu_runner_cmd = 'mlir-cpu-runner'
	    if config.arm_emulator_lli_executable:			    if config.arm_emulator_mlir_cpu_runner_executable:
	        lli_cmd = config.arm_emulator_lli_executable			        mlir_cpu_runner_cmd = config.arm_emulator_mlir_cpu_runner_executable

	    if config.arm_emulator_utils_lib_dir:			    if config.arm_emulator_utils_lib_dir:
	        config.substitutions.append(('%mlir_native_utils_lib_dir', config.arm_emulator_utils_lib_dir))			        config.substitutions.append(('%mlir_native_utils_lib_dir', config.arm_emulator_utils_lib_dir))
	    else:			    else:
	        config.substitutions.append(('%mlir_native_utils_lib_dir', config.mlir_lib_dir))			        config.substitutions.append(('%mlir_native_utils_lib_dir', config.mlir_lib_dir))

	    if config.arm_emulator_executable:			    if config.arm_emulator_executable:
	        # Run test in emulator (qemu or armie).			        # Run test in emulator (qemu or armie).
	        emulation_cmd = config.arm_emulator_executable			        emulation_cmd = config.arm_emulator_executable
	        if config.arm_emulator_options:			        if config.arm_emulator_options:
	            emulation_cmd = emulation_cmd + ' ' + config.arm_emulator_options			            emulation_cmd = emulation_cmd + ' ' + config.arm_emulator_options
	        emulation_cmd = emulation_cmd + ' ' + lli_cmd			        emulation_cmd = emulation_cmd + ' ' + mlir_cpu_runner_cmd
	        config.substitutions.append(('%lli', emulation_cmd))			        config.substitutions.append(('%mlir_cpu_runner_vla', emulation_cmd))
	    else:			    else:
	        config.substitutions.append(('%lli', lli_cmd))			        config.substitutions.append(('%mlir_cpu_runner_vla', mlir_cpu_runner_cmd))
	else:			else:
	    config.substitutions.append(('%ENABLE_VLA', 'false'))			    config.substitutions.append(('%ENABLE_VLA', 'false'))
	    config.substitutions.append(('%VLA_ARCH_ATTR_OPTIONS', ''))			    config.substitutions.append(('%VLA_ARCH_ATTR_OPTIONS', ''))
	    config.substitutions.append(('%lli', 'lli'))			    config.substitutions.append(('%mlir_cpu_runner_vla', 'mlir-cpu-runner'))
				c-rhodes: With the above suggestion this could be moved to a higher Lit config (`mlir/test/lit.cfg.py`?) to remove duplication.
				awarzynski (Author): Good idea, though that will have an impact on all testing in MLIR and I would feel more confident doing that in a separate patch :) (Once reviewers approve the logic, I'll update the remaining tests in mlir/test/Integration/Dialect/SparseTensor/CPU, which will increase the size of this patch quite significantly.) But yes, the overall goal is to reduce duplication and unify how things get configured.
	    config.substitutions.append(('%mlir_native_utils_lib_dir', config.mlir_lib_dir))			    config.substitutions.append(('%mlir_native_utils_lib_dir', config.mlir_lib_dir))
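The command-building logic in this file is repeated almost verbatim in the ArmSVE `lit.local.cfg`, which is the duplication mentioned in the inline comments. One possible way to factor it out is sketched below. This is only an illustration and not part of the patch: the helper name `build_vla_runner_command` is hypothetical, the `config` objects are stand-ins built with `SimpleNamespace`, and the emulator name, options and paths are made-up example values.

```python
from types import SimpleNamespace


def build_vla_runner_command(config):
    """Return the command line prefix used to launch mlir-cpu-runner.

    Mirrors the logic in the diff: pick the target runner binary if one is
    configured, then optionally prefix it with the emulator and its options.
    """
    runner = config.arm_emulator_mlir_cpu_runner_executable or "mlir-cpu-runner"
    if not config.arm_emulator_executable:
        # Native run: no emulator in front of the runner.
        return runner
    parts = [config.arm_emulator_executable]
    if config.arm_emulator_options:
        parts.append(config.arm_emulator_options)
    parts.append(runner)
    return " ".join(parts)


# Stand-in for a native AArch64 build (all emulator settings empty).
native = SimpleNamespace(
    arm_emulator_executable="",
    arm_emulator_options="",
    arm_emulator_mlir_cpu_runner_executable="",
)

# Stand-in for a cross-compiled setup run under an emulator (values are hypothetical).
emulated = SimpleNamespace(
    arm_emulator_executable="qemu-aarch64",
    arm_emulator_options="-L /hypothetical/sysroot",
    arm_emulator_mlir_cpu_runner_executable="/hypothetical/aarch64/bin/mlir-cpu-runner",
)

native_cmd = build_vla_runner_command(native)
emulated_cmd = build_vla_runner_command(emulated)
print(native_cmd)
print(emulated_cmd)
```

Each `lit.local.cfg` could then register the result under whichever substitution name the reviewers settle on, for example `config.substitutions.append(('%mlir_cpu_runner_vla', build_vla_runner_command(config)))`.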

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir

	// DEFINE: %{option} = enable-runtime-library=true			// DEFINE: %{option} = enable-runtime-library=true
				// DEFINE: %{run_option} =
	// DEFINE: %{compile} = mlir-opt %s --sparse-compiler=%{option}			// DEFINE: %{compile} = mlir-opt %s --sparse-compiler=%{option}
	// DEFINE: %{run} = mlir-cpu-runner \			// DEFINE: %{cpu_runner} = mlir-cpu-runner
				// DEFINE: %{run} = %{cpu_runner} \
	// DEFINE: -e entry -entry-point-result=void \			// DEFINE: -e entry -entry-point-result=void \
	// DEFINE: -shared-libs=%mlir_c_runner_utils | \			// DEFINE: -shared-libs=%mlir_c_runner_utils %{run_option} | \
	// DEFINE: FileCheck %s			// DEFINE: FileCheck %s
	//			//
	// RUN: %{compile} | %{run}			// RUN: %{compile} | %{run}
	//			//
	// Do the same run, but now with direct IR generation.			// Do the same run, but now with direct IR generation.
	// REDEFINE: %{option} = "enable-runtime-library=false enable-buffer-initialization=true"			// REDEFINE: %{option} = "enable-runtime-library=false enable-buffer-initialization=true"
	// RUN: %{compile} | %{run}			// RUN: %{compile} | %{run}
	//			//
	// Do the same run, but now with direct IR generation and vectorization.			// Do the same run, but now with direct IR generation and vectorization. Enable
	// REDEFINE: %{option} = "enable-runtime-library=false enable-buffer-initialization=true vl=2 reassociate-fp-reductions=true enable-index-optimizations=true"			// Arm SVE/VLA if supported.
				// REDEFINE: %{option} = "enable-runtime-library=false enable-buffer-initialization=true vl=2 enable-arm-sve=%ENABLE_VLA reassociate-fp-reductions=true enable-index-optimizations=true"
				// REDEFINE: %{cpu_runner} = %mlir_cpu_runner_vla
				// REDEFINE: %{run_option} = %VLA_ARCH_ATTR_OPTIONS
	// RUN: %{compile} | %{run}			// RUN: %{compile} | %{run}

	// Do the same run, but now with direct IR generation and, if available, VLA
	// vectorization.
	// REDEFINE: %{option} = "enable-runtime-library=false vl=4 enable-buffer-initialization=true reassociate-fp-reductions=true enable-index-optimizations=true enable-arm-sve=%ENABLE_VLA"
	// REDEFINE: %{run} = %lli \
	// REDEFINE: --entry-function=entry_lli \
	// REDEFINE: --extra-module=%S/Inputs/main_for_lli.ll \
	// REDEFINE: %VLA_ARCH_ATTR_OPTIONS \
	// REDEFINE: --dlopen=%mlir_native_utils_lib_dir/libmlir_c_runner_utils%shlibext | \
	// REDEFINE: FileCheck %s
	// RUN: %{compile} | mlir-translate -mlir-to-llvmir | %{run}
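To see how the single vectorisation RUN line on the right-hand side covers both cases ("use VLA when requested/supported, otherwise use VLS"), the sketch below models the expansion in Python. It is a simplified model and not lit's real algorithm: `%{name}` references are expanded repeatedly, then the site-level substitutions from `lit.local.cfg` are applied by plain string replacement. The function name `expand_run_line` and the `qemu-aarch64` emulator prefix are assumptions for illustration; `%s` and `%mlir_c_runner_utils` are deliberately left unexpanded.

```python
import re

# Matches %{name} references introduced by DEFINE/REDEFINE lines.
DEFINE_REF = re.compile(r"%\{(\w+)\}")


def expand_run_line(run_line, defines, site_substitutions):
    """Expand %{name} references, then apply site-level substitutions."""
    while DEFINE_REF.search(run_line):
        run_line = DEFINE_REF.sub(lambda m: defines[m.group(1)], run_line)
    for key, value in site_substitutions:
        run_line = run_line.replace(key, value)
    return run_line


# Values in effect for the last RUN line of the updated test.
defines = {
    "option": '"enable-runtime-library=false enable-buffer-initialization=true '
              'vl=2 enable-arm-sve=%ENABLE_VLA reassociate-fp-reductions=true '
              'enable-index-optimizations=true"',
    "run_option": "%VLA_ARCH_ATTR_OPTIONS",
    "cpu_runner": "%mlir_cpu_runner_vla",
    "compile": "mlir-opt %s --sparse-compiler=%{option}",
    "run": "%{cpu_runner} -e entry -entry-point-result=void "
           "-shared-libs=%mlir_c_runner_utils %{run_option} | FileCheck %s",
}

# Site substitutions when SVE tests are disabled (the 'else' branch of the config).
sve_off = [
    ("%ENABLE_VLA", "false"),
    ("%VLA_ARCH_ATTR_OPTIONS", ""),
    ("%mlir_cpu_runner_vla", "mlir-cpu-runner"),
]

# Site substitutions when SVE tests are enabled and run under an emulator
# (the emulator prefix is a hypothetical example).
sve_on = [
    ("%ENABLE_VLA", "true"),
    ("%VLA_ARCH_ATTR_OPTIONS", '--march=aarch64 --mattr="+sve"'),
    ("%mlir_cpu_runner_vla", "qemu-aarch64 mlir-cpu-runner"),
]

vls_line = expand_run_line("%{compile} | %{run}", defines, sve_off)
vla_line = expand_run_line("%{compile} | %{run}", defines, sve_on)
print(vls_line)
print(vla_line)
```

With SVE disabled, the model yields a plain `mlir-cpu-runner` invocation with `enable-arm-sve=false` and no architecture flags, so the line degrades to a VLS run instead of being duplicated.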

	#SparseVector = #sparse_tensor.encoding<{dimLevelType = ["compressed"]}>			#SparseVector = #sparse_tensor.encoding<{dimLevelType = ["compressed"]}>
	#DCSR = #sparse_tensor.encoding<{dimLevelType = ["compressed", "compressed"]}>			#DCSR = #sparse_tensor.encoding<{dimLevelType = ["compressed", "compressed"]}>

	//			//
	// Traits for tensor operations.			// Traits for tensor operations.
	//			//
	#trait_vec_scale = {			#trait_vec_scale = {
	indexing_maps = [			indexing_maps = [
	[... 253 lines not shown ...]

mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/lit.local.cfg

	import sys			import sys

	# ArmSVE tests must be enabled via build flag.			# ArmSVE tests must be enabled via build flag.
	if not config.mlir_run_arm_sve_tests:			if not config.mlir_run_arm_sve_tests:
	    config.unsupported = True			    config.unsupported = True

	# No JIT on win32.			# No JIT on win32.
	if sys.platform == 'win32':			if sys.platform == 'win32':
	    config.unsupported = True			    config.unsupported = True

	lli_cmd = 'lli'			mlir_cpu_runner_cmd = 'mlir-cpu-runner'
	if config.arm_emulator_lli_executable:			if config.arm_emulator_mlir_cpu_runner_executable:
	    lli_cmd = config.arm_emulator_lli_executable			    mlir_cpu_runner_cmd = config.arm_emulator_mlir_cpu_runner_executable

	config.substitutions.append(('%mlir_native_utils_lib_dir',			config.substitutions.append(('%mlir_native_utils_lib_dir',
	    config.arm_emulator_utils_lib_dir or config.mlir_lib_dir))			    config.arm_emulator_utils_lib_dir or config.mlir_lib_dir))

	if config.arm_emulator_executable:			if config.arm_emulator_executable:
	    # Run test in emulator (qemu or armie)			    # Run test in emulator (qemu or armie)
	    emulation_cmd = config.arm_emulator_executable			    emulation_cmd = config.arm_emulator_executable
	    if config.arm_emulator_options:			    if config.arm_emulator_options:
	        emulation_cmd = emulation_cmd + ' ' + config.arm_emulator_options			        emulation_cmd = emulation_cmd + ' ' + config.arm_emulator_options
	    emulation_cmd = emulation_cmd + ' ' + lli_cmd			    emulation_cmd = emulation_cmd + ' ' + mlir_cpu_runner_cmd
	    config.substitutions.append(('%lli', emulation_cmd))			    config.substitutions.append(('%mlir-cpu-runner-vla', emulation_cmd))
	else:			else:
	    config.substitutions.append(('%lli', lli_cmd))			    config.substitutions.append(('%mlir-cpu-runner-vla', mlir_cpu_runner_cmd))

mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/test-sve.mlir

// RUN: mlir-opt %s -lower-affine -convert-scf-to-cf -convert-vector-to-llvm="enable-arm-sve" -finalize-memref-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -canonicalize | \		// RUN: mlir-opt %s -lower-affine -convert-scf-to-cf -convert-vector-to-llvm="enable-arm-sve" -finalize-memref-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -canonicalize | \
// RUN: mlir-translate -mlir-to-llvmir | \		// RUN: %mlir-cpu-runner-vla -e=entry -entry-point-result=void --march=aarch64 --mattr="+sve" -shared-libs=%mlir_lib_dir/libmlir_c_runner_utils%shlibext | \
// RUN: %lli --entry-function=entry --march=aarch64 --mattr="+sve" --dlopen=%mlir_native_utils_lib_dir/libmlir_c_runner_utils%shlibext | \
// RUN: FileCheck %s		// RUN: FileCheck %s

// Note: To run this test, your CPU must support SVE		// Note: To run this test, your CPU must support SVE

// VLA memcopy		// VLA memcopy
func.func @kernel_copy(%src : memref<?xi64>, %dst : memref<?xi64>, %size : index) {		func.func @kernel_copy(%src : memref<?xi64>, %dst : memref<?xi64>, %size : index) {
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%c2 = arith.constant 2 : index		%c2 = arith.constant 2 : index
[... 71 lines not shown ...]	scf.for %i0 = %c0 to %N step %s {
%la = vector.maskedload %a[%i0], %mask, %v0f : memref<?xf32>, vector<[4]xi1>, vector<[4]xf32> into vector<[4]xf32>		%la = vector.maskedload %a[%i0], %mask, %v0f : memref<?xf32>, vector<[4]xi1>, vector<[4]xf32> into vector<[4]xf32>
%lb = vector.maskedload %b[%i0], %mask, %v0f : memref<?xf32>, vector<[4]xi1>, vector<[4]xf32> into vector<[4]xf32>		%lb = vector.maskedload %b[%i0], %mask, %v0f : memref<?xf32>, vector<[4]xi1>, vector<[4]xf32> into vector<[4]xf32>
%lc = arith.addf %la, %lb : vector<[4]xf32>		%lc = arith.addf %la, %lb : vector<[4]xf32>
vector.maskedstore %c[%i0], %mask, %lc : memref<?xf32>, vector<[4]xi1>, vector<[4]xf32>		vector.maskedstore %c[%i0], %mask, %lc : memref<?xf32>, vector<[4]xi1>, vector<[4]xf32>
}		}
return		return
}		}

func.func @entry() -> i32 {		func.func @entry() {
%i0 = arith.constant 0: i64		%i0 = arith.constant 0: i64
%i1 = arith.constant 1: i64		%i1 = arith.constant 1: i64
%r0 = arith.constant 0: i32
%f0 = arith.constant 0.0: f32		%f0 = arith.constant 0.0: f32
%c0 = arith.constant 0: index		%c0 = arith.constant 0: index
%c1 = arith.constant 1: index		%c1 = arith.constant 1: index
%c2 = arith.constant 2: index		%c2 = arith.constant 2: index
%c4 = arith.constant 4: index		%c4 = arith.constant 4: index
%c8 = arith.constant 8: index		%c8 = arith.constant 8: index
%c32 = arith.constant 32: index		%c32 = arith.constant 32: index
%c33 = arith.constant 33: index		%c33 = arith.constant 33: index
[... 104 lines not shown ...]	func.func @entry() {
memref.dealloc %a_copy : memref<32xi64>		memref.dealloc %a_copy : memref<32xi64>
memref.dealloc %b : memref<32xi64>		memref.dealloc %b : memref<32xi64>
memref.dealloc %c : memref<32xi64>		memref.dealloc %c : memref<32xi64>
memref.dealloc %d : memref<32xi64>		memref.dealloc %d : memref<32xi64>
memref.dealloc %e : memref<33xf32>		memref.dealloc %e : memref<33xf32>
memref.dealloc %f : memref<33xf32>		memref.dealloc %f : memref<33xf32>
memref.dealloc %g : memref<36xf32>		memref.dealloc %g : memref<36xf32>

return %r0 : i32		return
}		}

mlir/test/lit.site.cfg.py.in

	[... 36 lines not shown ...]
	config.mlir_run_amx_tests = @MLIR_RUN_AMX_TESTS@			config.mlir_run_amx_tests = @MLIR_RUN_AMX_TESTS@
	config.mlir_run_arm_sve_tests = @MLIR_RUN_ARM_SVE_TESTS@			config.mlir_run_arm_sve_tests = @MLIR_RUN_ARM_SVE_TESTS@
	config.mlir_run_x86vector_tests = @MLIR_RUN_X86VECTOR_TESTS@			config.mlir_run_x86vector_tests = @MLIR_RUN_X86VECTOR_TESTS@
	config.mlir_run_riscv_vector_tests = "@MLIR_RUN_RISCV_VECTOR_TESTS@"			config.mlir_run_riscv_vector_tests = "@MLIR_RUN_RISCV_VECTOR_TESTS@"
	config.mlir_run_cuda_tensor_core_tests = @MLIR_RUN_CUDA_TENSOR_CORE_TESTS@			config.mlir_run_cuda_tensor_core_tests = @MLIR_RUN_CUDA_TENSOR_CORE_TESTS@
	config.mlir_include_integration_tests = @MLIR_INCLUDE_INTEGRATION_TESTS@			config.mlir_include_integration_tests = @MLIR_INCLUDE_INTEGRATION_TESTS@
	config.arm_emulator_executable = "@ARM_EMULATOR_EXECUTABLE@"			config.arm_emulator_executable = "@ARM_EMULATOR_EXECUTABLE@"
	config.arm_emulator_options = "@ARM_EMULATOR_OPTIONS@"			config.arm_emulator_options = "@ARM_EMULATOR_OPTIONS@"
	config.arm_emulator_lli_executable = "@ARM_EMULATOR_LLI_EXECUTABLE@"			config.arm_emulator_mlir_cpu_runner_executable = "@ARM_EMULATOR_MLIR_CPU_RUNNER_EXECUTABLE@"
	config.arm_emulator_utils_lib_dir = "@ARM_EMULATOR_UTILS_LIB_DIR@"			config.arm_emulator_utils_lib_dir = "@ARM_EMULATOR_UTILS_LIB_DIR@"
	config.riscv_vector_emulator_executable = "@RISCV_VECTOR_EMULATOR_EXECUTABLE@"			config.riscv_vector_emulator_executable = "@RISCV_VECTOR_EMULATOR_EXECUTABLE@"
	config.riscv_vector_emulator_options = "@RISCV_VECTOR_EMULATOR_OPTIONS@"			config.riscv_vector_emulator_options = "@RISCV_VECTOR_EMULATOR_OPTIONS@"
	config.riscv_emulator_lli_executable = "@RISCV_EMULATOR_LLI_EXECUTABLE@"			config.riscv_emulator_lli_executable = "@RISCV_EMULATOR_LLI_EXECUTABLE@"
	config.riscv_emulator_utils_lib_dir = "@RISCV_EMULATOR_UTILS_LIB_DIR@"			config.riscv_emulator_utils_lib_dir = "@RISCV_EMULATOR_UTILS_LIB_DIR@"

	import lit.llvm			import lit.llvm
	lit.llvm.initialize(lit_config, config)			lit.llvm.initialize(lit_config, config)

	# Let the main config do the real work.			# Let the main config do the real work.
	lit_config.load_config(config, "@MLIR_SOURCE_DIR@/test/lit.cfg.py")			lit_config.load_config(config, "@MLIR_SOURCE_DIR@/test/lit.cfg.py")