This is an archive of the discontinued LLVM Phabricator instance.

[mlgo] Add infrastructure to use EmitC-generated models for inlining.
Needs ReviewPublic

Authored by jacobhegna on Mar 20 2023, 5:38 PM.

Details

Summary

Current MLGO models can be served in three ways:

  1. release mode, which uses a TF AOT compiler to compile models into binary blobs and a header file. This requires a TF dependency at (LLVM) build time.
  2. development mode, which dynamically loads TFLite models through the TFLite runtime, by providing LLVM with a path to the .tflite file. This requires a TFLite dependency.
  3. interactive mode, which fetches actions via two pipes (one for features going out, and one for actions going in).

None of these is suitable for a general clang release, where we cannot assume any TF dependencies and need to package a model in the LLVM source tree.

The EmitC serving path compiles TFLite models to pure C++ code with a mild runtime dependency. The runtime is a small set of tensor kernels necessary to run a neural net. The mlcompileropt repository contains (or at least, will eventually contain) a script which automates the process of compiling the .tflite file through the various MLIR stages down to C++, and also embeds the C++ runtime directly in the autogenerated .cpp file. The result is a single (.cpp, .h) pair which can live in the LLVM repository and be built in a normal CMake build, with no additional dependencies.
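As a rough illustration (hypothetical names and layout; the real files are whatever the ml-compiler-opt script emits), such a pair could look like:

  // InlineOzTestModel.emitc.h (sketch): only plain entry points are exposed, so
  // no TF or EmitC types appear in LLVM-facing code.
  void InlineOzTestModelRun(/* feature buffers in, inlining decision out */);

  // InlineOzTestModel.emitc.cpp (sketch): an anonymous namespace embeds the small
  // tensor-kernel runtime (matmul, add, relu, ...), followed by the generated
  // graph code implementing InlineOzTestModelRun().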

This patch adds two things:

  1. the infrastructure to load + run EmitC-generated models for the ML inlining advisor, and
  2. a "test" policy: a TF function that always returns 1, compiled through EmitC and loaded via the infrastructure from (1). This is used to test the code in (1).

You can use the EmitC module inliner in opt with the flags -enable-ml-inliner=emitc and -inliner-emitc-model-name=NAME_OF_MODEL. Currently the only supported model name is InlineOzTestModel.
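For example, a hypothetical invocation (the pass pipeline and file names here are illustrative, not part of this patch):

  opt -passes='default<Oz>' -enable-ml-inliner=emitc -inliner-emitc-model-name=InlineOzTestModel -S input.ll -o output.ll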

Diff Detail

Event Timeline

jacobhegna created this revision.Mar 20 2023, 5:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 5:38 PM
jacobhegna requested review of this revision.Mar 20 2023, 5:38 PM

Fixing the error message when trying to use the EmitC model when it was not built with clang.

phosek added a subscriber: phosek.Mar 23 2023, 11:53 AM
marbre added a subscriber: marbre.Apr 18 2023, 2:37 AM

Slight update to the patch; add command-line selection for registered EmitC models.

Note: this still isn't ready for review; I need to trim down this patch and split it up. Just posting for visibility.

Clean up the code & delete non-test models.

jacobhegna retitled this revision from Add initial EmitC inlining-for-size model. to Add infrastructure to use EmitC-generated models for inlining..Apr 21 2023, 5:32 PM
jacobhegna edited the summary of this revision. (Show Details)

Ok, this is ready for review. The code which compiles the .tflite models to C++ and embeds the runtime can be found in this external PR: https://github.com/google/ml-compiler-opt/pull/215

The phabricator interface/diff looked weird, updating diff.

Squashing commits to clean up diff.

Remove spurious changes.

jacobhegna edited the summary of this revision. (Show Details)Apr 21 2023, 8:14 PM

Add Apache 2.0 license to the autogenerated files + notice of original location.

Note to reviewers: I am not the most familiar with open source licensing and I
would like to get attribution + licensing reviewed carefully before this patch
lands.

Thanks for doing this!

At a high level, the dependency from the model to LLVM should go away. Also, the EmitC stuff should be an implementation detail (i.e. not in any headers).

The models can then have their own CMakeLists.txt, which IIUC is the expected way to handle subdirectories.

llvm/include/llvm/Analysis/EmitCModelRegistry.h
56

StringMap?

61

why do we need this, wouldn't the macro below be sufficient?

llvm/include/llvm/Analysis/EmitCTensor.h
1 ↗(On Diff #516020)

won't this be part of the model package, i.e. something that's (maybe cloned) per-model?

llvm/include/llvm/Analysis/MLInlineEmitCModel.h
1 ↗(On Diff #516020)

Nit: isn't it "Model for inlining Oz"?

35 ↗(On Diff #516020)

could these be kept from being surfaced to begin with?

llvm/lib/Analysis/MLInlinerEmitCRunner.h
26 ↗(On Diff #516020)

This should be in the emit-ed code, so emitc stuff shouldn't appear in any llvm-facing APIs.

63 ↗(On Diff #516020)

can't we set the buffers in the ctor once? they don't get reallocated.

llvm/lib/Analysis/models/emitc/InlineOzTestModel.emitc.h
21

Not sure what the inheritance relationship buys us. We already have a common mechanism for referring to features by their enum-named index. This also seems to add a tight coupling between codegen and llvm.

If all we need is a way to identify groups of models in the registry, that could be done with the address of a static specific to each consumer, i.e. MLInlineAdvisor could have a static char ID = '\0' and the address of that ID is the key (like the legacy PM does things, too)
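(For illustration, a minimal sketch of that "static ID" idea with hypothetical names - not code from this patch:)

  #include <map>
  #include <string>
  #include <utility>

  // Each consumer exposes a static char; its address identifies that consumer's
  // group of models, just like legacy pass manager IDs.
  struct MLInlineAdvisorKey {
    static char ID;
  };
  char MLInlineAdvisorKey::ID = '\0';

  // A registry keyed by (group ID address, model name).
  static std::map<std::pair<const void *, std::string>, void *> Models;

  static void registerModel(const void *GroupID, const std::string &Name,
                            void *Model) {
    Models[{GroupID, Name}] = Model;
  }

  // Usage: registerModel(&MLInlineAdvisorKey::ID, "InlineOzTestModel", ModelPtr);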

marbre added inline comments.Apr 26 2023, 9:15 AM
llvm/include/llvm/Analysis/EmitCTensor.h
1 ↗(On Diff #516020)

I actually had the same question in mind and I am not sure if llvm/include/llvm/Analysis is a good location for this. However, we discussed some time ago if we should upstream the whole reference implementation, see https://discourse.llvm.org/t/tosa-reference-model-from-mlir-using-emitc/4799/11.

Furthermore, I don't know if it is acceptable to upstream code not licensed under Apache-2.0 WITH LLVM exceptions. The good news is, we would be able to re-license (most of) the reference implementation.

Address comments from reviewers. The main changes are:

  1. Include the Python files to generate the EmitC models in this patch, instead of in the mlcompileropt GitHub repo.
  2. Remove the global declaration of the emitc::Tensor type and make it an implementation detail. This requires changing the signatures of the generated models to speak in terms of T* instead of Tensor<T, ...>.

llvm/include/llvm/Analysis/EmitCModelRegistry.h
56

Fixed.

61

We need to call EmitCModelRegistry<ModelT>::get().addModel(...) before main() runs in the final linked binary. The easiest way to do that is to define a global object whose constructor runs that code - that's the point of the EmitCModelRegistrationHandle. We can't (as far as I know) just have the macro, because then we wouldn't have a "context" in which to run the needed code (right now, that context is the constructor of ...Handle).

This is basically how cl::opt works, too.

Note that this mechanism is about to change when I update the patch. Instead of registering instantiated unique_ptrs of each model, I will register a factory method capable of creating an MLModelRunner pointer. The reason for the change is that, now that the EmitC models don't share a common base class, there isn't a good way to store them all in a common registry (but we can store the MLModelRunner* because the runners are a subclass templated on the EmitC model).
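(For illustration, a minimal sketch of the registration-handle idea described above; the names mirror the discussion, but the signatures are guesses and may differ from the actual patch:)

  #include <functional>
  #include <map>
  #include <memory>
  #include <string>

  struct MLModelRunner {}; // stand-in for llvm::MLModelRunner

  using ModelFactory = std::function<std::unique_ptr<MLModelRunner>()>;

  class EmitCModelRegistry {
  public:
    static EmitCModelRegistry &get() {
      static EmitCModelRegistry Instance;
      return Instance;
    }
    void addModel(const std::string &Name, ModelFactory F) {
      Factories[Name] = std::move(F);
    }

  private:
    std::map<std::string, ModelFactory> Factories;
  };

  // The global "context": its constructor runs at static-initialization time,
  // before main(), and performs the registration.
  struct EmitCModelRegistrationHandle {
    EmitCModelRegistrationHandle(const std::string &Name, ModelFactory F) {
      EmitCModelRegistry::get().addModel(Name, std::move(F));
    }
  };

  // A registration macro would typically expand to something like:
  //   static EmitCModelRegistrationHandle InlineOzTestModelHandle(
  //       "InlineOzTestModel",
  //       [] { return std::make_unique<MLModelRunner>(); });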

llvm/include/llvm/Analysis/EmitCTensor.h
1 ↗(On Diff #516020)

It is true that the EmitC runtime includes all of this code. It's not quite cloned per-model right now, because in the Python EmitC code I rename the Tensor type to something like _UnusedTensorType.

The reason is that when EmitC generates models, the signatures of the functions take Tensors for input and output. If the definition of Tensor were in a local, anonymous namespace (i.e. the embedded runtime), then the code wouldn't even compile, because there would be no accessible declaration of Tensor for the model header.

There are really only two solutions to this:

  1. Make Tensor a type that's declared in a common LLVM header that each model can #include, so that every model uses the same shared definition of Tensor. This is what I did in this commit.
  2. Make the model signatures only use void* or float*, and keep the definition of Tensor as an internal implementation detail.

I talked with Mircea offline, and (2) seems OK for us. It avoids exposing this code outside of generated models. It weakens the types in the exported function signatures, but that's not horrible, because we already deal with void* internally (and we're the only user of this modified EmitC backend).
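(For illustration, the two options side by side, with hypothetical function names and shapes:)

  #include <cstddef>
  #include <cstdint>

  // Option 1: a shared Tensor type appears in the exported signature, so a
  // common declaration must be visible to every model header.
  template <typename T, std::size_t... Dims> struct Tensor;
  void InlineOzTestModelRun(const Tensor<std::int64_t, 1> &In,
                            Tensor<std::int64_t, 1> &Out);

  // Option 2 (what we settled on): only raw element pointers cross the
  // boundary; Tensor stays an internal detail of the generated .cpp.
  void InlineOzTestModelRun(const std::int64_t *In, std::int64_t *Out);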

1 ↗(On Diff #516020)

(see my above comment to mircea as well).

+1 to making the license change; that will be a blocker regardless of how we want to use the library. Right now I don't use any of the Eigen code, so we can ignore that (if that was the main thing that can't be relicensed).

So, originally we were planning on putting the entire reference implementation in LLVM and having models #include it. The reason we didn't do this is essentially because we wanted two conflicting things:

  1. models to maintain bit-identical stability over time - if you give it these bits as input today, it will give you the same answer today or in 6 months from today.
  2. to be able to upgrade the runtime for performance/other reasons.

(2) can conflict with (1) because upgrades can create slight differences due to floating point numerical stability issues. So, instead we embed the runtime in each model individually so that new models get all the upgrades and old models get stability.

llvm/include/llvm/Analysis/MLInlineEmitCModel.h
1 ↗(On Diff #516020)

file gone, not an issue anymore

35 ↗(On Diff #516020)

the base model is going away, so not a problem anymore.

llvm/lib/Analysis/MLInlinerEmitCRunner.h
26 ↗(On Diff #516020)

fixed. this function actually doesn't exist anymore.

63 ↗(On Diff #516020)

fixed.

llvm/lib/Analysis/models/emitc/InlineOzTestModel.emitc.h
21

fixed, but not with the static char trick. see the patch update for details

Minor fixes.

mtrofin added inline comments.Apr 27 2023, 11:25 AM
llvm/include/llvm/Analysis/EmitCModelRegistry.h
61

ah, makes sense. thanks!

110

could this macro also #include the header - I think it'd be totally reasonable to follow a canonical naming convention where the header name is trivially derivable from EmitCModelType, for example.

llvm/include/llvm/Analysis/InlineModelFeatureMaps.h
25 ↗(On Diff #517643)

pls add a small doc accordingly

84 ↗(On Diff #517643)

can you peel this bit of the change out separately and just submit it (i.e. the part that adds the self-documentation and makes the two lists the same)?

llvm/lib/Analysis/MLInlinerEmitCRunner.h
28 ↗(On Diff #517643)

would it be possible for the emitC-ed code to just give us a void* lookupTensorBuffer(const std::string &) - it'd also keep the codegen-ed files smaller
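(For illustration, a rough sketch of what such an emitted lookup could look like; the feature names and buffer layout are hypothetical, not what the patch currently emits:)

  #include <cstdint>
  #include <string>

  namespace {
  // Buffers owned by the generated model (names are illustrative).
  std::int64_t CallsiteHeightBuffer[1];
  std::int64_t NodeCountBuffer[1];
  } // namespace

  void *lookupTensorBuffer(const std::string &Name) {
    if (Name == "callsite_height")
      return CallsiteHeightBuffer;
    if (Name == "node_count")
      return NodeCountBuffer;
    return nullptr;
  }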

llvm/lib/Analysis/models/emitc/InlineOzTestModel.emitc.h
18

can we add a "this is generated, avoid modifying me by hand" type of thing loud somewhere in the comment? same for the .cpp

Address most of comments; will move the feature macro stuff to a new patch in
the next update.

llvm/include/llvm/Analysis/EmitCModelRegistry.h
110

Mmm, we can if you want, but the problem is that I think we will eventually store models for different passes in different places - for example, the regalloc models might end up somewhere in llvm/lib/CodeGen. We can put the include here, but I'm not sure it's the most useful optimization.

llvm/lib/Analysis/MLInlinerEmitCRunner.h
28 ↗(On Diff #517643)

fixed

llvm/lib/Analysis/models/emitc/InlineOzTestModel.emitc.h
18

fixed

mtrofin added inline comments.Apr 27 2023, 2:28 PM
llvm/include/llvm/Analysis/EmitCModelRegistry.h
110

maybe add a parameter that is the header location, so this becomes 1 gesture rather than 2 (easier to discover what needs to be given, too)

llvm/lib/Analysis/MLInlinerEmitCRunner.h
28 ↗(On Diff #517643)

ok but now you can avoid the macro-ing, right?

Address comments - mainly making the emitc runner generic beyond inlining.

Rebasing to HEAD to get inline macro changes.

mtrofin retitled this revision from Add infrastructure to use EmitC-generated models for inlining. to [mlgo] Add infrastructure to use EmitC-generated models for inlining..