Currently, MLGO models can be served in three ways:
- release mode, which uses a TF AOT compiler to compile models into binary blobs and a header file. This requires a TF dependency at (LLVM) build time.
- development mode, which loads TFLite models using the TFLite runtime dynamically by providing LLVM with a path to the .tflite file. This requires a TFLite dependency.
- interactive mode, which fetches actions via two pipes (one for features going out, and one for actions going in).
None of these is suitable for a general clang release, where we cannot assume any TF dependencies and want to package a model directly in the LLVM source code.
The EmitC serving path compiles TFLite models to pure C++ code with a small runtime dependency. The runtime is a small set of tensor kernels needed to run a neural net. The mlcompileropt repository contains (or at least, will eventually contain) a script that automates compiling the .tflite file through the various MLIR stages down to C++, and also embeds the C++ runtime directly in the autogenerated .cpp file. The result is a single (.cpp, .h) pair that can live in the LLVM repository and be built in a normal CMake build process, with no additional dependencies.
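To make the "single (.cpp, .h) pair" idea concrete, here is a rough sketch of what such a self-contained file could look like. Everything in it is an assumption made for illustration: the namespace names, the Tensor type, the dense and argmax kernels, and the hard-coded weights are hypothetical and do not reflect the actual EmitC output or the runtime embedded by the script.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the embedded runtime: a fixed-size tensor
// and the few kernels this toy model needs.
namespace sketch_runtime {
template <typename T, std::size_t N> struct Tensor {
  std::array<T, N> data;
};

// Dense layer y = W * x + b, with W stored row-major as Out x In.
template <typename T, std::size_t In, std::size_t Out>
Tensor<T, Out> dense(const Tensor<T, In> &x, const Tensor<T, In * Out> &w,
                     const Tensor<T, Out> &b) {
  Tensor<T, Out> y{};
  for (std::size_t o = 0; o < Out; ++o) {
    T acc = b.data[o];
    for (std::size_t i = 0; i < In; ++i)
      acc += w.data[o * In + i] * x.data[i];
    y.data[o] = acc;
  }
  return y;
}

// Index of the first maximal element.
template <typename T, std::size_t N>
std::size_t argmax(const Tensor<T, N> &t) {
  std::size_t best = 0;
  for (std::size_t i = 1; i < N; ++i)
    if (t.data[i] > t.data[best])
      best = i;
  return best;
}
} // namespace sketch_runtime

// Hypothetical "generated" model: weights are baked in as constants, so
// the file builds with a plain C++ compiler and no TF dependency.
namespace sketch_model {
using sketch_runtime::Tensor;

inline int64_t decide(const Tensor<float, 2> &features) {
  // Illustrative weights only (identity matrix, zero bias).
  const Tensor<float, 4> w{{1.0f, 0.0f, 0.0f, 1.0f}};
  const Tensor<float, 2> b{{0.0f, 0.0f}};
  return static_cast<int64_t>(
      sketch_runtime::argmax(sketch_runtime::dense<float, 2, 2>(features, w, b)));
}
} // namespace sketch_model
```

With the identity weights above, decide simply returns the index of the larger feature. The point of the sketch is only that kernels and weights sit in one translation unit that CMake can build like any other source file.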
This patch adds two things:
- the infrastructure to load + run EmitC-generated models for the ML inlining advisor, and
- a "test" policy: a TF function that always returns 1, fed through EmitC and loaded by the above framework. This is used to test the infrastructure from the first item.
You can use the EmitC module inliner in opt with the flags -enable-ml-inliner=emitc -inliner-emitc-model-name=NAME_OF_MODEL. Currently the only supported model name is InlineOzTestModel.
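As an illustration of selecting a model by name, the following is a minimal sketch of a name-to-factory lookup. It is not the code in this patch: the InlineModel interface, the AlwaysOneModel class, the evaluate signature, and createModel are hypothetical names, and std::map is used only to keep the example self-contained (in-tree code would more likely use an LLVM container). Only the model name InlineOzTestModel comes from the description above.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>

namespace sketch {
// Hypothetical interface for an embedded inlining model.
struct InlineModel {
  virtual ~InlineModel() = default;
  // Returns 1 to inline, 0 otherwise. The single feature is illustrative.
  virtual int64_t evaluate(int64_t CalleeSize) const = 0;
};

// Mirrors the described test policy: it always returns 1.
struct AlwaysOneModel final : InlineModel {
  int64_t evaluate(int64_t /*CalleeSize*/) const override { return 1; }
};

using Factory = std::function<std::unique_ptr<InlineModel>()>;

// Registry mapping model names to factories.
inline const std::map<std::string, Factory> &registry() {
  static const std::map<std::string, Factory> Registry = {
      {"InlineOzTestModel",
       [] { return std::unique_ptr<InlineModel>(new AlwaysOneModel()); }},
  };
  return Registry;
}

// Returns nullptr for unknown names so the caller can report an error or
// fall back to the default advisor.
inline std::unique_ptr<InlineModel> createModel(const std::string &Name) {
  auto It = registry().find(Name);
  if (It == registry().end())
    return nullptr;
  return It->second();
}
} // namespace sketch
```

In this sketch, createModel("InlineOzTestModel") yields a model whose evaluate always returns 1, and any other name yields nullptr.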
StringMap?