This is an archive of the discontinued LLVM Phabricator instance.

[mlgo] Introduce an "InteractiveModelRunner"
Closed, Public

Authored by mtrofin on Jan 26 2023, 8:42 AM.

Details

Summary

This is a model runner for ML researchers using environments like
CompilerGym. In such environments, researchers host the compiler and
want to observe the problem space (features) at each decision
step of some optimization pass. At that point the compiler is stopped,
waiting for the host to make a decision and provide advice back to
the compiler, which then continues its normal operation, and so on.

The InteractiveModelRunner supports this scenario for the feature set
exposed by the compiler at a given time. It uses 2 files - ideally FIFO
pipes - one to pass data to the host, the other to get advice back from
the host. This means this scenario is supported with no special
dependencies. File creation and deletion are the responsibility of
the host. Hooking up this model evaluator to an MLGO-ed pass is the
responsibility of the pass author; subsequent patches will do so for
the current set of mlgo passes, and offer an API to easily "just opt in"
by default when mlgo-ing a new pass.

The data protocol is that of the training logger: the host sees a training
log doled out observation by observation by reading from one of the
files, and passes back its advice as a serialized tensor (i.e. tensor value
memory dump) via the other file.
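
As a rough illustration of the "tensor value memory dump" idea, the following Python sketch packs advice values into raw bytes in native byte order, with no header or length prefix. The helper names and the default int64 element type are assumptions made for the example, not part of the patch.

```python
import struct


# Hypothetical helpers. The "=" prefix in a struct format string means
# native byte order with standard sizes and no padding.
def pack_advice(values, type_code="q"):
    """Serialize advice values as a flat memory dump (no header, no length)."""
    return struct.pack(f"={len(values)}{type_code}", *values)


def unpack_advice(buffer, type_code="q"):
    """Inverse of pack_advice; the element count is implied by the buffer size."""
    item_size = struct.calcsize(f"={type_code}")
    count = len(buffer) // item_size
    return list(struct.unpack(f"={count}{type_code}", buffer[: count * item_size]))


# A single int64 decision, as the host might write it to the advice file.
advice_bytes = pack_advice([1])
```

Because nothing in the payload describes its own size or type, both sides have to agree on the tensor spec ahead of time.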

There are some differences wrt the log seen during training: the
interactive model doesn't currently include the outcome (because it should be
identical to the decision, and it's also not present in the "release"
mode); and partial rewards aren't currently communicated back.

The assumption - just like with the training logger - is that the host
is co-located, thus avoiding any endianness concerns. In a distributed
environment, it is up to the hosting infrastructure to intermediate
that.
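
To make the endianness concern concrete, here is a small Python sketch (not taken from the patch) showing how the same int64 value looks under each byte order, and what happens if the reader assumes the wrong one.

```python
import struct
import sys

value = 1
little = struct.pack("<q", value)  # little-endian int64
big = struct.pack(">q", value)     # big-endian int64
native = struct.pack("=q", value)  # whatever this machine uses

# Interpreting big-endian bytes as little-endian silently yields a
# different number rather than an error.
misread = struct.unpack("<q", big)[0]

print(sys.byteorder, little.hex(), big.hex(), misread)
```

A co-located host reads and writes with the same native order as the compiler, so no conversion is needed; an intermediary bridging machines with different byte orders would have to swap bytes per element.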

Event Timeline

mtrofin created this revision. Jan 26 2023, 8:42 AM
Herald added a project: Restricted Project. Jan 26 2023, 8:42 AM
Herald added a subscriber: hiraditya.
mtrofin requested review of this revision. Jan 26 2023, 8:42 AM
mtrofin added a subscriber: ChrisCummins.
ChrisCummins added inline comments. Jan 26 2023, 5:17 PM
llvm/unittests/Analysis/MLModelRunnerTest.cpp
166

What does switchContext() do?

mtrofin added inline comments. Jan 26 2023, 7:51 PM
llvm/unittests/Analysis/MLModelRunnerTest.cpp
166

Suppose this was regalloc. It writes a one-line JSON saying that all following observations are for the function specified in the parameter of switchContext.

For inlining, the context is the whole compilation unit, so switchContext can have any parameter (i.e. it can be ignored on the reader side).

https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Analysis/Utils/TrainingLogger.h#L39
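
A reader-side sketch of handling such a one-line JSON record, in Python. The "context" and "observation" key names and the sample function name are assumptions for illustration; the linked TrainingLogger.h is the authority on the actual format.

```python
import json


def parse_context_line(line):
    """Return the context name from a one-line JSON record, or None
    if the record is not a context switch."""
    record = json.loads(line)
    return record.get("context")


# Hypothetical records: a regalloc-style context naming a function,
# followed by a record that is not a context switch.
current = parse_context_line('{"context": "my_function"}')
other = parse_context_line('{"observation": 0}')
```

For inlining, a reader could simply discard the returned name.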

mtrofin removed a subscriber: ChrisCummins.
ChrisCummins accepted this revision. Jan 27 2023, 11:41 AM

Thanks for the info @mtrofin. Code LGTM, looking forward to playing with this when it's hooked into the MLGO-d passes :)

Cheers,
Chris

This revision is now accepted and ready to land. Jan 27 2023, 11:41 AM
jacobhegna added inline comments. Jan 27 2023, 12:51 PM
llvm/lib/Analysis/InteractiveModelRunner.cpp
51

When waiting for advice, the compiler blocks until it reads the number of bytes specified by the output tensor spec size. Do we want to make any assurances of what will happen if the host sends bad data back over the wire, writes too much data, etc.? Or do we just say "that's the host's problem"? If so, I think we should document that in the description of the class.

jacobhegna accepted this revision. Jan 27 2023, 1:11 PM

LGTM! Btw, this is an interesting way that we could initially investigate having big models that can't run on CPUs for MLGO passes, where the host is communicating with a GPU server or something. Obviously that doesn't work well interacting with blaze, but interesting nonetheless.

mtrofin marked an inline comment as done. Jan 27 2023, 1:11 PM
mtrofin added inline comments.
llvm/lib/Analysis/InteractiveModelRunner.cpp
51

Good point, I added a doc header to the new class. To your question, we don't want to make any assurances. Bugs would be detectable quite quickly, and assurances would be difficult to offer: in the FIFO case, for example, it might appear that insufficient data was sent.
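
To illustrate the short-read issue mentioned for FIFOs, here is a minimal Python sketch of a reader that loops until it has the exact number of bytes a tensor spec calls for. The function name is hypothetical, and this is not the compiler's implementation; it only shows the general pattern a host might use when reading tensor bytes.

```python
import io


def read_exact(stream, size):
    """Read exactly `size` bytes, looping over short reads; raise on EOF."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # The writer closed its end before sending everything.
            raise EOFError(f"expected {size} bytes, got {size - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# An in-memory stream stands in for the pipe in this example.
stream = io.BytesIO(b"\x01" + b"\x00" * 7)
payload = read_exact(stream, 8)
```

A single read on a pipe may return fewer bytes than requested even when the writer is behaving correctly, which is why a partial read alone cannot be treated as proof of bad data.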

mtrofin updated this revision to Diff 492907. Jan 27 2023, 2:48 PM
mtrofin marked an inline comment as done.

added debugging

mtrofin updated this revision to Diff 492916. Jan 27 2023, 3:08 PM

debug mechanism

llvm/lib/Analysis/InteractiveModelRunner.cpp
51

Thought about it more and added a way for the user to get a dump of the values the compiler thinks it gets from the host, because otherwise debugging that side of things would have been harder - especially since one of the values of this feature is that ML researchers could just use a released clang "out of the box"

@ChrisCummins this means one can pass to clang -mllvm -interactive-model-runner-echo-type=<value>, where the value is the scalar type of the advice tensor, and clang will output to stderr the vector of values it received. So then the researcher can debug that side of things without needing to debug clang.

This is what the help for the option looks like:

--interactive-model-runner-echo-type=<value>                      - The InteractiveModelRunner will echo back to stderr the data received from the host as the specified type (for debugging purposes).
    =float                                                          -   float
    =double                                                         -   double
    =int8_t                                                         -   int8_t
    =uint8_t                                                        -   uint8_t
    =int16_t                                                        -   int16_t
    =uint16_t                                                       -   uint16_t
    =int32_t                                                        -   int32_t
    =uint32_t                                                       -   uint32_t
    =int64_t                                                        -   int64_t
    =uint64_t                                                       -   uint64_t 
    =disable                                                        -   Don't echo
This revision was automatically updated to reflect the committed changes.