This is an archive of the discontinued LLVM Phabricator instance.

[llvm] Release-mode ML InlineAdvisor
ClosedPublic

Authored by mtrofin on Jun 9 2020, 4:09 PM.

Details

Summary

This implementation uses a pre-trained model which is statically
compiled into a native function.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Diff Detail

Event Timeline

mtrofin created this revision.Jun 9 2020, 4:09 PM
Herald added a project: Restricted Project. Jun 9 2020, 4:09 PM
mtrofin updated this revision to Diff 269718.Jun 9 2020, 7:03 PM

added some comments

Including the models in the LLVM tree is problematic.

I'm not sure there's a formal policy on this, but generally part of being an open-source project is that the source is available in human-readable format. With the exception of a few regression tests for binary parsers, the entire LLVM tree is human-readable. A model clearly doesn't count as human-readable.

If it isn't practical to train the model as part of the LLVM build (because it would take too long), it might make sense to commit binary files. There's some precedent for this in-tree: lowering for shuffles on some targets is based on a precomputed table, built using a utility that isn't run as part of the normal build process. But I would expect reproducible instructions for how to generate the files.

Indeed, training as part of the build would be impractical. But that still doesn't mean we need binary files.

I believe there are 2 concerns:

  • binary files: I agree with the sentiment about binaries. We want to explore a way to offer the model in a text format; that would require changes to the AOT compiler. We decided to start with what we had, believing that, since this part of the project is a build-time opt-in, it shouldn't cause much hindrance in the interim, until we develop a text format.
  • how to train a model: the RFC gives a high-level description of how we trained the model, and, as outlined there, we intend to open-source a reference training tool. Our plan is to do that as the next step.

If there's some standardized binary format for models, that might be okay? By analogy, there are some PNG files in the documentation; we don't insist people use XPM or something like that. There are some technical reasons to prefer text, though: it would allow someone to identify or diff the contents of the files without specialized tools.

I'm more concerned about adding an opaque matrix of coefficients nobody can reproduce into the codebase. I think before we commit a generated model, the training tool needs to be committed, and someone needs to verify they can independently reproduce the generated model using that tool. I think it's important we set the right precedent here.

If there's some standardized binary format for models, that might be okay? By analogy, there are some PNG files in the documentation; we don't insist people use XPM or something like that. There are some technical reasons to prefer text, though: it would allow someone to identify or diff the contents of the files without specialized tools.

It's the tensorflow format for models - https://www.tensorflow.org/guide/saved_model

I'm more concerned about adding an opaque matrix of coefficients nobody can reproduce into the codebase. I think before we commit a generated model, the training tool needs to be committed, and someone needs to verify they can independently reproduce the generated model using that tool. I think it's important we set the right precedent here.

We are on the same page - we do plan to release the training tools for developers wishing to produce their own models. It may seem natural to do that step first, but in this case we believe the staging described in the RFC has some merit (come to think of it, we should have described this motivation in the RFC). The main motivation for starting with the LLVM components (both ‘release mode’ and ‘development mode’, which I plan to submit next), and only then making the training tools available (in a separate repository), is that having the LLVM components in place allows our partner teams to experiment more quickly, which lets us parallelize the work of upstreaming the training components with further ML exploration alongside those teams.

IIUC, being an experimental feature that is conditionally compiled into LLVM, this staging wouldn't have any material downside for anyone, while helping us maintain velocity. Importantly, because this is an optionally built component, there should be no impact on "business as usual" LLVM developers; in particular, the build bots testing this feature point to the silent master.

Can you also rebase the patch?

llvm/CMakeLists.txt
965

Add a comment here describing briefly how to download TF packages and set AOT_PATH?

llvm/cmake/modules/TensorFlowCompile.cmake
2

document the function.

20

groupped -- grouped.

llvm/lib/Analysis/InlineAdvisor.cpp
160

add an assert in the #else branch?

llvm/lib/Analysis/ML/Common/MLInlineAdvisor.cpp
62 ↗(On Diff #269718)

Add comments here explaining what it does -- e.g., feature extraction, etc.

105 ↗(On Diff #269718)

add top level comments documenting what it does.

206 ↗(On Diff #269718)

extract the feature extraction code into a small helper?

llvm/lib/Analysis/ML/InlineModelFeatureMaps.h
18 ↗(On Diff #269718)

add comment on each feature

38 ↗(On Diff #269718)

It is hard to keep the indices in sync with the names. How about something with a def table:

// ml_features.def

DEFINE_FEATURE(CalleeBasicCount, "callee bb count")
DEFINE_FEATURE(CallSiteHeight, "callsite_height")
.....

For the enum definition:
#define DEFINE_FEATURE(en, name) en,
#include "ml_features.def"
#undef DEFINE_FEATURE
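For illustration, the same .def file can also expand into the matching name table, so the enum indices and the strings stay in sync by construction. This is only a sketch following the example entries above; the file and feature names are not necessarily what gets committed:

// In the header: the enum, one entry per feature in ml_features.def.
enum FeatureIndex {
#define DEFINE_FEATURE(EnumName, StringName) EnumName,
#include "ml_features.def"
#undef DEFINE_FEATURE
  NumberOfFeatures
};

// In a single .cpp file: the parallel array of feature names, in the same order.
static const char *const FeatureNames[] = {
#define DEFINE_FEATURE(EnumName, StringName) StringName,
#include "ml_features.def"
#undef DEFINE_FEATURE
};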

mtrofin updated this revision to Diff 270428.Jun 12 2020, 9:34 AM
mtrofin marked 11 inline comments as done.

rebased + feedback

mtrofin updated this revision to Diff 270928.Jun 15 2020, 5:38 PM

Moved everything to Analysis

mtrofin added inline comments.Jun 15 2020, 5:40 PM
llvm/lib/Analysis/InlineAdvisor.cpp
160

Actually, we don't assert; rather, the tryCreate caller checks the return value of this function and emits an error if it didn't get an advisor - this is the current behavior.
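In other words, the calling side looks roughly like this (a sketch only; the helper and diagnostic names are illustrative, not the exact code in the patch):

// Caller side (illustrative): no assert, just a checked return value.
// M is the Module being compiled, MAM the module analysis manager.
std::unique_ptr<InlineAdvisor> Advisor = tryCreateMLAdvisor(M, MAM);
if (!Advisor) {
  M.getContext().emitError("Could not set up the requested ML inline advisor");
  // Fall back or bail out, depending on the caller.
}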

llvm/lib/Analysis/ML/Common/MLInlineAdvisor.cpp
206 ↗(On Diff #269718)

There are a lot of little parameters to pass in that case. It'll probably be more natural to have groups of helpers like that once the features become more involved (multi-dimensional).

mtrofin updated this revision to Diff 270932.Jun 15 2020, 5:56 PM

Default TENSORFLOW_AOT_PATH should be "", not its description.

davidxl added inline comments.Jun 17 2020, 12:57 PM
llvm/cmake/modules/TensorFlowCompile.cmake
27

there are lots of references to ${CMAKE_CURRENT_BINARY_DIR}/${fname} here, perhaps use a common variable for it?

llvm/include/llvm/Analysis/InlineModelFeatureMaps.h
58

put these static variables inside the .cpp file to avoid multiple copies if the header is included by different sources.
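The usual shape of that fix, sketched with illustrative names (not necessarily the ones in the patch): declare the table in the header and define it once in a .cpp file.

// Header: declaration only, so every including TU refers to the same object.
extern const char *const FeatureNameMap[];

// One .cpp file: the single definition.
const char *const FeatureNameMap[] = {"callee_basic_block_count",
                                      "callsite_height" /* , ... */};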

llvm/include/llvm/Analysis/InlineModelRunner.h
21 ↗(On Diff #270932)

Is this interface just for the inliner, or is it more general? Perhaps just name it MLModelRunner?

llvm/lib/Analysis/MLInlineAdvisor.cpp
38

Is this for controlling training time overhead?

73

reverse the condition and use continue to reduce the nesting level
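That is, the usual early-continue reshuffle (schematic only; the loop and predicate names are illustrative, not the actual code in the patch):

// Before: the interesting work sits one level deeper than it needs to be.
for (CallBase *CB : CallSites) {
  if (isCandidate(CB)) {
    // ... main work ...
  }
}

// After: invert the condition and continue, keeping the main work at the top level.
for (CallBase *CB : CallSites) {
  if (!isCandidate(CB))
    continue;
  // ... main work ...
}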

78

inlinable callee

llvm/lib/Analysis/ReleaseModeModelRunner.cpp
32

MLInferenceRunner?

mtrofin updated this revision to Diff 271474.Jun 17 2020, 1:51 PM
mtrofin marked 9 inline comments as done.

Feedback

llvm/include/llvm/Analysis/InlineModelRunner.h
21 ↗(On Diff #270932)

Renamed

llvm/lib/Analysis/MLInlineAdvisor.cpp
38

Not only that; it's also for guarding against misbehaving policies. The description was wrong: it's not the native size increase, it's the IR size increase.
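Schematically, the guard is of this shape (the names and exact bookkeeping here are made up for illustration; see the patch for the real logic):

// If the module's IR has grown past a multiple of its initial size,
// stop trusting the learned policy and refuse further inlining.
if (CurrentIRSize > SizeIncreaseThreshold * InitialIRSize)
  ForceStop = true; // subsequent advice defaults to "do not inline"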

llvm/lib/Analysis/ReleaseModeModelRunner.cpp
32

What would we call the Development mode one then?

asl added a subscriber: asl.Jun 17 2020, 2:21 PM
asl added inline comments.
llvm/cmake/modules/TensorFlowCompile.cmake
25

This misses the target triple. Otherwise, even on macOS, it will generate a Linux object file.

mtrofin updated this revision to Diff 271530.Jun 17 2020, 4:58 PM

target triple

mtrofin marked 2 inline comments as done.Jun 17 2020, 4:58 PM
mtrofin added inline comments.
llvm/cmake/modules/TensorFlowCompile.cmake
25

Thanks! Fixed.

asl added inline comments.Jun 18 2020, 4:34 AM
llvm/cmake/modules/TensorFlowCompile.cmake
25

Host triple should be used here, no?

mtrofin updated this revision to Diff 271703.Jun 18 2020, 6:49 AM
mtrofin marked an inline comment as done.

correct triple

mtrofin marked 2 inline comments as done.Jun 18 2020, 6:51 AM
mtrofin added inline comments.
llvm/cmake/modules/TensorFlowCompile.cmake
25

Done - thanks!

mtrofin updated this revision to Diff 271788.Jun 18 2020, 11:18 AM
mtrofin marked an inline comment as done.

fix some formatting

davidxl added inline comments.Jun 19 2020, 10:33 AM
llvm/test/Transforms/Inline/ML/bounds-checks.ll
38

can you explain the expected output?

llvm/test/Transforms/Inline/ML/ml-test-release-mode.ll
2

Why can't the default inliner handle this case (the adder call can be folded)?

mtrofin marked 4 inline comments as done.Jun 19 2020, 12:06 PM
mtrofin added inline comments.
llvm/test/Transforms/Inline/ML/bounds-checks.ll
38

Added more detail

llvm/test/Transforms/Inline/ML/ml-test-release-mode.ll
2

Cost evaluation - added explanation.

mtrofin updated this revision to Diff 272156.Jun 19 2020, 12:06 PM
mtrofin marked 2 inline comments as done.

more details in test

mtrofin updated this revision to Diff 272493.Jun 22 2020, 10:40 AM

clang-tidy

davidxl added inline comments.Jun 22 2020, 11:01 AM
llvm/test/Transforms/Inline/ML/bounds-checks.ll
38

ok. I do wish the increase threshold could be learned as well in the future, making this option unnecessary.

davidxl accepted this revision.Jun 22 2020, 11:01 AM

lgtm. Wait a few days to see if other reviewers have more feedback.

This revision is now accepted and ready to land.Jun 22 2020, 11:01 AM

Hi, your git commit contains extra Phabricator tags. You can drop the Reviewers:, Subscribers:, and Tags: lines and the Summary: text from the git commit message with the following script:

arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people. Please keep that tag. (I have updated my script to use --date=now, setting the author date to the committer date.)

https://reviews.llvm.org/D80978 contains a git pre-push hook to automate this.

This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jun 24 2020, 9:58 AM
thakis added inline comments.
llvm/test/lit.site.cfg.py.in
51

Please use llvm_canonicalize_cmake_booleans for this.

mtrofin marked an inline comment as done.Jun 24 2020, 11:18 AM
mtrofin added inline comments.
llvm/test/lit.site.cfg.py.in
51

could you elaborate how? I see it used in CMakeLists files - not super sure how I'd use it here.

Thanks!

mtrofin marked 2 inline comments as done.Jun 29 2020, 8:46 AM
mtrofin added inline comments.
llvm/test/lit.site.cfg.py.in
51

Being addressed in D82776.

Hi Mircea, could you also provide information on which specific tf-nightly and protobuf versions you used to save the two frozen models? Unfortunately, I can't seem to load the models using a number of tf-nightly versions, and I am receiving

google.protobuf.message.DecodeError: Error parsing message

After further investigation, I noticed this was done using TF's new SavedModel method and Keras: https://tensorflow.google.cn/tutorials/keras/save_and_load?hl=en#save_checkpoints_during_training

Would you provide scripts to load the model and see the layers?

Thanks,

  • Amir

Hello Amir,

To answer the first question (though I think you already figured this out): the authoritative versions are captured in the bot script, available at https://github.com/google/ml-compiler-opt/blob/master/buildbot/buildbot_init.sh

Re. second question, visualization - this is a question for Yundi, Gaurav, or Eugene (they are the ML experts). I'll venture "tensorboard" as an answer, but I'll make sure they give the authoritative one in a moment.

gjain added a subscriber: gjain.Oct 21 2020, 3:22 PM
gjain added a comment.Oct 21 2020, 4:04 PM

Would you provide scripts to load the model and see the layers?

Re. second question, visualization - this is a question for Yundi, Gaurav, or Eugene (they are the ML experts). I'll venture "tensorboard" as an answer, but I'll make sure they give the authoritative one in a moment.

You should be able to use tensorboard, but you first need to import the model into tensorboard with https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/import_pb_to_tensorboard.py. Something like python import_pb_to_tensorboard.py --model_dir=llvm/lib/Analysis/models/inliner/ --log_dir=/tmp/inliner should work. Then you'll be able to run tensorboard on the log_dir.

Here's a hosted visualization from tensorboard for your convenience: https://tensorboard.dev/experiment/C45o0HjZTPGRSqpOrdkbeg/#graphs

AmirJamez added a comment.EditedOct 22 2020, 10:15 PM

Thanks.

(1) May I ask what was the reason behind using a tf-nightly build rather than a tensorflow release?
(2) The tf-nightly version mentioned in https://github.com/google/ml-compiler-opt/blob/master/buildbot/buildbot_init.sh#L119 is no longer available at https://pypi.org/project/tf-nightly/#history :)
(3) I can confirm that I was able to generate logs and subsequently visualize the model with tensorboard 2.3.0 and tensorflow release 2.2.0 instead. Also, in the course of installing the packages, I ran into:

tensorboard duplicate plugins for name projector

which turned out to be a common issue for tensorboard when multiple packages are installed, a result of trying tf-nightly alongside a release. Removing the duplicate tensorboard fixed the issue.

(4) Will you also release the training scripts for building the ir2native model here: https://github.com/google/ml-compiler-opt

Thanks,

  • Amir

(1) May I ask what was the reason behind using a tf-nightly build rather than a tensorflow release?

Historic reason - at the time we started upstreaming the work, the necessary changes to the pip package were not in the release package yet.

(2) The tf-nightly version mentioned in https://github.com/google/ml-compiler-opt/blob/master/buildbot/buildbot_init.sh#L119 is no longer available at https://pypi.org/project/tf-nightly/#history :)

Thanks for pointing it out - I updated the script. One of the build bots was also having issues for this reason; it must have been a recent change (or the bots hadn't been rebooted in a while).

(3) I can confirm that I was able to generate logs and subsequently visualize the model with tensorboard 2.3.0 and tensorflow release 2.2.0 instead. Also, in the course of installing the packages, I ran into:

tensorboard duplicate plugins for name projector

which turned out to be a common issue for tensorboard when multiple packages are installed, a result of trying tf-nightly alongside a release. Removing the duplicate tensorboard fixed the issue.

To confirm, now that we're using the release 2.3.0 tensorflow pip package, this shouldn't be an issue anymore, correct?

(4) Will you also release the training scripts for building the ir2native model here: https://github.com/google/ml-compiler-opt

IR2Native is used for RL training algorithms where we want partial rewards. That's what we initially did, but we then got better characteristics with training algorithms that use just the final reward (i.e., the .text size of the native object). We abandoned partial-rewards training for the short term. We suspect it will start making sense again once we incorporate more global context than we currently do (currently, the global context is really thin: node/edge counts, uses, and a measure of the initial DAG position). So this is a long way of saying: we should probably yank out IR2Native right now, for code simplicity, but we haven't gotten around to doing it.

To confirm, now that we're using the release 2.3.0 tensorflow pip package, this shouldn't be an issue anymore, correct?

Yes. I confirm using TF 2.3.0 and Tensorboard 2.3.0; pip3 install tensorflow==2.3 --user did the job.

(4) Will you also release the training scripts for building the ir2native model here: https://github.com/google/ml-compiler-opt

IR2Native is used for RL training algorithms where we want partial rewards. That's what we initially did, but we then got better characteristics with training algorithms that use just the final reward (i.e., the .text size of the native object). We abandoned partial-rewards training for the short term. We suspect it will start making sense again once we incorporate more global context than we currently do (currently, the global context is really thin: node/edge counts, uses, and a measure of the initial DAG position). So this is a long way of saying: we should probably yank out IR2Native right now, for code simplicity, but we haven't gotten around to doing it.

I see. So there are two questions:

(Q1) Could you provide a definition of the IR2Native final/optimal partial rewards? I'd assume it was the final iteration of model weights when training was stopped; however, what was the stopping condition here?

(Q2) To make sense of it, let's consider:
(2-1) Training Phase:

  • If the models were trained together in the same pipeline: that would mean you trained the two (IR2Native and RL) together, so that while IR2Native is fed its training data, its partial rewards are fed into the RL model. If that's the case, it would be tricky, since the partial rewards change each iteration and, depending on the input data, only gradually converge to more accurate values (lower loss), while you keep feeding these still-inaccurate values to the RL model being trained. I guess as long as you had a unified strategy for dealing with the loss functions, this shouldn't be too tricky.
  • If IR2Native was trained first: based on your reply, and since you mentioned you fixed the buckets with their final partial rewards, I assume this is the method you used, meaning that you trained IR2Native and stopped the training at a certain iteration, perhaps at a low loss value or based on some other criterion. At that point, you used the final buckets of IR2Native to train the RL model. So, in a way, IR2Native's inference is used to train RL. Is that a correct assumption?

(2-2) Inference Phase:
So at deployment, when an LLVM user passes opt -passes=scc-oz-module-inliner -enable-ml-inliner=release -S, the caller's IR2Native features are collected and one bucket is chosen as the partial reward, which is then fed into the RL model to decide whether or not to inline a callee?

Thanks,

  • Amir

(Q1) Could you provide a definition of the IR2Native final/optimal partial rewards? I'd assume it was the final iteration of model weights when training was stopped; however, what was the stopping condition here?

IR2Native was trained through supervised learning: we captured the features after the last inlining, and we also captured the final native size of that function (at asm-printing time) as the label.

(Q2) To make sense of it, let's consider:
(2-1) Training Phase:

  • If the models were trained together in the same pipeline: that would mean you trained the two (IR2Native and RL) together, so that while IR2Native is fed its training data, its partial rewards are fed into the RL model. If that's the case, it would be tricky, since the partial rewards change each iteration and, depending on the input data, only gradually converge to more accurate values (lower loss), while you keep feeding these still-inaccurate values to the RL model being trained. I guess as long as you had a unified strategy for dealing with the loss functions, this shouldn't be too tricky.
  • If IR2Native was trained first: based on your reply, and since you mentioned you fixed the buckets with their final partial rewards, I assume this is the method you used, meaning that you trained IR2Native and stopped the training at a certain iteration, perhaps at a low loss value or based on some other criterion. At that point, you used the final buckets of IR2Native to train the RL model. So, in a way, IR2Native's inference is used to train RL. Is that a correct assumption?

(2-2) Inference Phase:
So at deployment, when an LLVM user passes opt -passes=scc-oz-module-inliner -enable-ml-inliner=release -S, the caller's IR2Native features are collected and one bucket is chosen as the partial reward, which is then fed into the RL model to decide whether or not to inline a callee?

IR2Native was trained completely separately: at one point, we captured the feature|label tuples from a corpus, then did supervised learning on that dataset and obtained the IR2Native model.

After that, we only used the IR2Native model in inference mode whenever we wanted to train the inliner model. The IR used for those training sessions was different (same overall codebase, but unrelated points in time). We didn't retrain IR2Native before training the inliner either.
