This is an archive of the discontinued LLVM Phabricator instance.

[mlgo] Dependency-free training mode logger
ClosedPublic

Authored by mtrofin on Dec 5 2022, 2:18 PM.

Details

Summary

This is the next step in dropping the dependency on protobuf.

The simple logger produces an output consisting of lines of json
strings. Tensor values - which should constitute the bulk of the data -
are serialized as raw byte buffers. This allows for light-weight reading
of the values.

The next step is to switch the training logic to the new logging format,
following which the protobuf-based logger will be dropped, together with
the training dependency on protobuf.

Subsequent changes will also stop buffering and stream, instead - the
buffering model is just as a convenient point-in-time.

Diff Detail

Event Timeline

mtrofin created this revision.Dec 5 2022, 2:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2022, 2:18 PM
mtrofin requested review of this revision.Dec 5 2022, 2:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2022, 2:18 PM
jacobhegna added inline comments.Dec 5 2022, 2:48 PM
llvm/lib/Analysis/TrainingLogger.cpp
112

where in the header is the length of the tensor buffer stored? (to deal with newline bytes in the tensor buffer), in the value of the features key?

mtrofin marked an inline comment as done.Dec 5 2022, 2:51 PM
mtrofin added inline comments.
llvm/lib/Analysis/TrainingLogger.cpp
112

we have the shape and the type of the scalar value in the buffer, so we get the size from there. See for example how we write (L 168 this file) and then how we read (line 63 in log_reader.py)

jacobhegna accepted this revision.Dec 5 2022, 2:55 PM

just some small python style nits

llvm/lib/Analysis/models/log_reader.py
2

nit: docstring for file

30

nit: could make this a @classmethod

51

nit: return type annotations? same for next two methods

63

nit: annotate fs?

This revision is now accepted and ready to land.Dec 5 2022, 2:55 PM
mtrofin updated this revision to Diff 480256.Dec 5 2022, 3:05 PM
mtrofin marked 5 inline comments as done.

python nits

jacobhegna added inline comments.Dec 5 2022, 3:08 PM
llvm/lib/Analysis/models/log_reader.py
35

this won't work, no? if it's a classmethod the first argument could be cls, and the point is you could return cls(name=...) instead of TensorSpec(name=...), it's essentially the same but follows the convention I see people using for factory methods in python

mtrofin added inline comments.Dec 5 2022, 3:11 PM
llvm/lib/Analysis/models/log_reader.py
35

(sorry, went to fast over it)

what does it give us, though, compared to the @staticmethod way?

51

getitem is a bit trickier I think because it's the value of the underlying type. doesn't seem pytype lets us specify self._spec.type

jacobhegna added inline comments.Dec 5 2022, 3:20 PM
llvm/lib/Analysis/models/log_reader.py
35

in this case, basically nothing 😊 (that's why it was a nit)

in this case, it's just a stylistic choice: I tend to see people use @classmethod for factory-style methods, and @staticmethod for methods that don't return an instance of the type. it's not that important, though

mtrofin updated this revision to Diff 480261.Dec 5 2022, 3:22 PM

back to @staticmethod

This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Dec 6 2022, 9:21 AM

This doesn't build with some compilers: http://45.33.8.238/macm1/50259/step_4.txt

Please take a look and revert for now if it takes a while to fix.

fhahn added a subscriber: fhahn.Dec 6 2022, 9:26 AM

I went ahead and reverted the patch as it breaks the build on macOS to unblock bots. More details about the failure in the commit message of the revert.