This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Add an option for emitting LLDB logs as JSON
AcceptedPublic

Authored by teemperor on Jul 29 2021, 7:51 AM.

Details

Reviewers
aprantl
Group Reviewers
Restricted Project
Summary

LLDB has a quite extensive logging framework that is mostly used to help with bug reporting where users
are usually requested to attach the generated logs after reproducing a bug locally.

Right now all logging is done in plain text where usually one line in the text file equals one log message (but
that is more of a guideline, there are multi-line log messages too).

While the plain text logs are easy to read without any tools, they are often also really limiting the usefulness
of LLDB's logging framework:

  • The plain text logs become hard to read when they are filled out with too much information (e.g., the automatic

bactrace option was enabled). This usually means we have to pick between a readable log or one that contains
more information.

  • We can't easily write tools that make viewing logs easier. Writing a parser that extracts the different parts of a

log message from the plain text format is rather difficult.

This patch adds the log channel option to write the log message in JSON format. Every log message is now a
serialized JSON object that is just appended to the log file. For convenience I also added a trailing comma
behind every JSON object so that a log file can always be turned into a single top-level JSON object by just
surrounding it with [] and removing the last comma (for strict parsers that care about this).

The current plain text format is still available and the default. log enable -O json ... can be used to change
the channel format to JSON.

Diff Detail

Event Timeline

teemperor created this revision.Jul 29 2021, 7:51 AM
teemperor requested review of this revision.Jul 29 2021, 7:51 AM

I hacked together a quick example log viewer in HTML here: https://teemperor.de/pub/logviewer.html (I already loaded in a log from my machine, so you can just hit "parse log"). Note that you can search the log messages and you can click each one of them to reveal their associated stack trace. You can also export the log in a different format (Excel, CSV, etc.).

Some general notes:

  • Why JSON? In LLVM we have pretty much just JSON and YAML as formats available (well, and the bitcode format I guess). YAML is a much more complicated format and the main idea here is to let people write tools, so JSON seemed like an obvious pick.
  • Will there be an upstream log viewer? I don't know, but we could easily put a simple one on the website that just formats a log as a simple HTML table. That is easily done in vanilla JavaScript. But that should be a follow up patch.
  • Do we want to keep the old plain text format too? To be honest, I don't really see much
  • What about logs that are printed to stdin and we want to parse? I think you can easily make a tool that filters out the non-JSON output. Logs are guaranteed to have every object on its own line, so you can just go line-by-line and filter them based on that.

Some planned follow ups:

  • Enable more logging meta-information by default. Things such as pid/tid seems like obvious cheap choices.
  • For people that really like the old format, it should be a simple patch to add a converter that just takes a JSON file and outputs the old plaintext.
  • I want to remove the ability to turn off thread-safe logging. It seems like it could potentially crash LLDB but it would certainly break the JSON parser. It's usually (and luckily) enabled by default anyway.
  • Some more meta information would be actually helpful sometimes and there isn't any reason to not add them. E.g., classifying messages as error/warning/info.

Actually forgot this one: Having the ability to log the current channel in each message is also planned. This originally came up when I kept telling people to do 'log enable lldb all' and people were tired of filtering through the very verbose logs (and Jim suggested that something more structured than plaintext would be nice to have). With the added channels in each message we could just do the filtering on our side and tell people to just enable every channel in every category.

With the added channels in each message we could just do the filtering on our side and tell people to just enable every channel in every category.

+1 to this, was the first thing I tried to do on the viewer you linked.

Will there be an upstream log viewer? I don't know, but we could easily put a simple one on the website that just formats a log as a simple HTML table. That is easily done in vanilla JavaScript. But that should be a follow up patch.

As in on llvm.org? That would be neat.

Python does great with JSON so I'm sure it would be easy to hack on more specific scripts later too.

Will you be adding documentation about what the format is? (said web page would be a good place for it actually)

I know it won't be that complicated but it's always nice to see the total set of fields you might expect.

lldb/packages/Python/lldbsuite/test/lldbutil.py
115

What is the situation where you *would* have [, ] already? Will this be parsing raw text as well as output that is already JSON.

DavidSpickett added inline comments.Jul 29 2021, 8:22 AM
lldb/packages/Python/lldbsuite/test/lldbutil.py
1579

Could you show what the suffix is here?

# Remove the suffix ("ABC") that isn't part of the JSON payload
mib added a subscriber: mib.Jul 29 2021, 8:46 AM

This looks like a great effort to improve logging and debugging! Have you considered also adding logging levels (info, debug, warning, error ...) ? I think the CommandReturnObject and Status classes already makes the distinction between warnings and errors messages. We could enrich logging further more by having logging levels to filter / color-code the messages.

lldb/include/lldb/Utility/Log.h
188–189

+1 to this! We should standardise the log message format to always have the source info, as a follow-up (May be clang-tidy check ?).

mib added a comment.Jul 29 2021, 8:52 AM

Quick side-note: Status only has an "Error String". I guess this is for historical reason but I think we could change it (on another patch) to convey more information inside the debugger.

I'm supportive of this effort, but if we're removing the plaintext format, I think it's worth sending this out as an RFC.

Some general notes:

  • Why JSON? In LLVM we have pretty much just JSON and YAML as formats available (well, and the bitcode format I guess). YAML is a much more complicated format and the main idea here is to let people write tools, so JSON seemed like an obvious pick.

The YAML I/O library is also relatively slow, which would hurt the overhead of enabling logging (even more).

  • Will there be an upstream log viewer? I don't know, but we could easily put a simple one on the website that just formats a log as a simple HTML table. That is easily done in vanilla JavaScript. But that should be a follow up patch.

Personally I'd also want a script to view the logs in the terminal. If we write it in Python we can ship it with lldb.

  • Do we want to keep the old plain text format too? To be honest, I don't really see much

Me neither on the condition that users can convert the logs to some kind of plain text format (part of the motivation for my previous comment.)

  • What about logs that are printed to stdin and we want to parse? I think you can easily make a tool that filters out the non-JSON output. Logs are guaranteed to have every object on its own line, so you can just go line-by-line and filter them based on that.

Some planned follow ups:

  • Enable more logging meta-information by default. Things such as pid/tid seems like obvious cheap choices.

+1, if the data is structured we should always emit it (unless there's some large associated cost)

  • For people that really like the old format, it should be a simple patch to add a converter that just takes a JSON file and outputs the old plaintext.
  • I want to remove the ability to turn off thread-safe logging. It seems like it could potentially crash LLDB but it would certainly break the JSON parser. It's usually (and luckily) enabled by default anyway.
  • Some more meta information would be actually helpful sometimes and there isn't any reason to not add them. E.g., classifying messages as error/warning/info.

+1 on everything here.

More a comment than anything else: One thing I always wanted to explore was to implement LLDB's logging on Darwin on top of os_log (https://developer.apple.com/documentation/os/logging) which is also structured and also faster since it moves the the formatting out of process. This patch is great because it also works for non-Darwin platforms, and I guess there isn't anything preventing us from moving to os_log even if we take patch.

When you are using the logging for a reliably reproducible problem (some parsing issue, for instance) then there's no harm to the general "turn on all logging and we'll sort it out on our end". But for any issues that are timing sensitive the problem is that too much logging changes the timing and often causes bugs to disappear. This patch will add volume to log emission and slow its emission down, so it goes a bit in the wrong direction. I don't think that's a reason NOT to do this, hopefully the overhead is small. But it means we can't assume "turning on all logs" is a reasonable default behavior for logging. If anything, we need to go in the other direction and add a more fine-grained verbosity so we can ask for "a bit of log channel A and a bit of B and more of C" or whatever.

I'm not entirely sure what you intend when you say:

Every log message is now a serialized JSON object (that is now also guaranteed to occupy exactly one line) that is just appended to the log file.

Log messages are going to have new-lines in them because it's quite convenient to use the various object Description or Dump methods to log objects and those are generally laid out to be human readable (and are used for that purpose in other contexts.) By "single line" it seems you mean "all log messages with new lines will have the newline converted to "\n" as a character string before being inserted into the JSON. This seems more an implementation detail, I'm not sure why you call it out. In any case, preserving the multi-line structure of log content is important for eventually being able to read them, so as long as the encoders & printers do the right thing, it doesn't really matter how they are encoded.

Conversely, however, it is NOT the case that each LLDB_LOGF call is a single Log message. For instance, at the beginning and end of each round of "ShouldStop" negotiation, I run through the ThreadPlanStack of each thread that had a non-trivial stop reason and print each thread plan's description in order. I LOG a header for the output, then (fortunately) I have a routine that dumps all the info to a stream, so I can log that part in one go. Then I print an end message with another LLDB_LOGF. Putting these into separate JSON entries makes keeping them together harder - thus making automated tools harder to write - harder to scan with anything but a plain text dumper.

We could require that all unitary log messages build up the whole message off-line into a stream, then log in one go. But it might be more convenient to have a "start message/end message" pair, so you can call LLDB_LOGF in a loop printing individual objects, then close off the message. Some kind of RAII deal would be convenient for this.

It would be even more convenient if this allows nesting. Then you can do "logging the shared library state change - start the list of dylibs - start the log of dylib A, end dylib A, etc..." That way the individual entity dumps would be structured in the log message that dumped them, and you could progressively reveal them in a neat way.

We definitely need a plain text dumper that we provide. A very common task in looking through logs is to find an address of some entity in one log message, and then search through the other logs looking at each one to see if it is interesting. And very often to see if it is embedded in another interesting log transaction, so the context is important. That's best done with a plain text dump of the log.

lldb/source/Utility/Log.cpp
362

Don't use single capitol letter variable names. They are sort of okay in tiny functions, but tiny functions tend not to stay tiny and then you're trying to find a single letter amongst all the other letters when searching for definitions and uses.

Sorry for not commenting on this on earlier, I wanted to think about it a bit.

I think the usefulness of this can depend on the type of logging. l often turn on gdb-remote packet logging, or thread step logging, in an interactive debug session and watch the log messages streaming as I do lldb commands. I think interposing a JSON formatter in this is less straightforward - not that reading JSON is that difficult, but it's just adding layering between me and the log messages. WIth this patch the logging will be like {msg="..."} and fine, but we'll add more fields to this over time and it will become less easy to read.

Maybe this can be an *option* to logging like verbose logging is. I don't have any problem with the default mode being JSON and people who want rawness can add an option to log enable to not do that.

I can't get too high and mighty with my philosophy of disintermediation -- gdb-remote packets are often actually using binary data, and even compressed, and lldb reformats all of that before logging so my terminal doesn't get borked by escape/high-bit characters being dumped on to it. But I read the gdb-remote packet logs every day and wrapping it in JSON doesn't make anything better for this use case, I don't think. This may be more a Jason opinion than a gdb-remote packet reader opinion!

teemperor updated this revision to Diff 365737.Aug 11 2021, 7:04 AM
teemperor retitled this revision from [lldb] Change the log format to JSON instead of plain text to [lldb] Add an option for emitting LLDB logs as JSON.
teemperor edited the summary of this revision. (Show Details)
  • Made the logging format optional so both JSON and plain text can be used (with plain text the default). It's not a channel option like everything else.
  • Added the log format as an option to log enable (o, f and F are taken, so I just picked O as the short option).
  • Added documentation for logging and the log format to the website.
  • Added some tests for the possibility of having mixed plain and JSON logging in one file.
  • Removed the API test changes that tried to do JSON parsing instead of parsing the plain text (this might still be a good idea in the future in a separate patch).

Sorry for not commenting on this on earlier, I wanted to think about it a bit.

I think the usefulness of this can depend on the type of logging. l often turn on gdb-remote packet logging, or thread step logging, in an interactive debug session and watch the log messages streaming as I do lldb commands. I think interposing a JSON formatter in this is less straightforward - not that reading JSON is that difficult, but it's just adding layering between me and the log messages. WIth this patch the logging will be like {msg="..."} and fine, but we'll add more fields to this over time and it will become less easy to read.

Maybe this can be an *option* to logging like verbose logging is. I don't have any problem with the default mode being JSON and people who want rawness can add an option to log enable to not do that.

I can't get too high and mighty with my philosophy of disintermediation -- gdb-remote packets are often actually using binary data, and even compressed, and lldb reformats all of that before logging so my terminal doesn't get borked by escape/high-bit characters being dumped on to it. But I read the gdb-remote packet logs every day and wrapping it in JSON doesn't make anything better for this use case, I don't think. This may be more a Jason opinion than a gdb-remote packet reader opinion!

Yeah, I actually wanted to write finish my comment above that started as "Do we want to keep the old plain text format too? To be honest, I don't really see much" with "point in that but I don't mind keeping the old format around if someone thinks its useful." Somehow that part of my sentence got deleted. Anyway, the plain text format is still around and still the default so I think this concern is addressed.

When you are using the logging for a reliably reproducible problem (some parsing issue, for instance) then there's no harm to the general "turn on all logging and we'll sort it out on our end". But for any issues that are timing sensitive the problem is that too much logging changes the timing and often causes bugs to disappear. This patch will add volume to log emission and slow its emission down, so it goes a bit in the wrong direction. I don't think that's a reason NOT to do this, hopefully the overhead is small. But it means we can't assume "turning on all logs" is a reasonable default behavior for logging. If anything, we need to go in the other direction and add a more fine-grained verbosity so we can ask for "a bit of log channel A and a bit of B and more of C" or whatever.

I think this is also addressed by having the plain text format around in its current form.

I'm not entirely sure what you intend when you say:

Every log message is now a serialized JSON object (that is now also guaranteed to occupy exactly one line) that is just appended to the log file.

Log messages are going to have new-lines in them because it's quite convenient to use the various object Description or Dump methods to log objects and those are generally laid out to be human readable (and are used for that purpose in other contexts.) By "single line" it seems you mean "all log messages with new lines will have the newline converted to "\n" as a character string before being inserted into the JSON. This seems more an implementation detail, I'm not sure why you call it out. In any case, preserving the multi-line structure of log content is important for eventually being able to read them, so as long as the encoders & printers do the right thing, it doesn't really matter how they are encoded.

This is just about the fact that every JSON log message is guaranteed to be one line (even if the message contains newlines which will just be escaped). You can still log multiline messages as before The 1 object = 1 line property is just relevant for my comment below: What about logs that are printed to stdin and we want to parse? I think you can easily make a tool that filters out the non-JSON output. Logs are guaranteed to have every object on its own line, so you can just go line-by-line and filter them based on that.. It's technically also possible now with both formats around to have a plain text channel and a JSON channel print to the same file, so because of this 1 object = 1 line property that is something that log parser could handle if we wanted to. Not that we should ever actually tell someone to log both formats into one file, but at least the final output would be salvageable this way.

Conversely, however, it is NOT the case that each LLDB_LOGF call is a single Log message. For instance, at the beginning and end of each round of "ShouldStop" negotiation, I run through the ThreadPlanStack of each thread that had a non-trivial stop reason and print each thread plan's description in order. I LOG a header for the output, then (fortunately) I have a routine that dumps all the info to a stream, so I can log that part in one go. Then I print an end message with another LLDB_LOGF. Putting these into separate JSON entries makes keeping them together harder - thus making automated tools harder to write - harder to scan with anything but a plain text dumper.

We could require that all unitary log messages build up the whole message off-line into a stream, then log in one go. But it might be more convenient to have a "start message/end message" pair, so you can call LLDB_LOGF in a loop printing individual objects, then close off the message. Some kind of RAII deal would be convenient for this.

It would be even more convenient if this allows nesting. Then you can do "logging the shared library state change - start the list of dylibs - start the log of dylib A, end dylib A, etc..." That way the individual entity dumps would be structured in the log message that dumped them, and you could progressively reveal them in a neat way.

We definitely need a plain text dumper that we provide. A very common task in looking through logs is to find an address of some entity in one log message, and then search through the other logs looking at each one to see if it is interesting. And very often to see if it is embedded in another interesting log transaction, so the context is important. That's best done with a plain text dump of the log.

I agree with the point of making it easier to make one big structures log message. I don't see how the JSON format really changes the current situation that much. Right now if you do multiple LLDB_LOG calls there is no guarantee that the final log contains them next to each other and each LLDB_LOG call gets its own metainformation dumped next to it. The JSON format does exactly the same and with a plain text dumper the output is 100% identical. I think the RAII idea that just collects a log and then sends it all as one (potentially somehow structured) message seems like a good improvement, but it seems more a like a future work item (especially now that the plain text format is still a thing).

With the added channels in each message we could just do the filtering on our side and tell people to just enable every channel in every category.

+1 to this, was the first thing I tried to do on the viewer you linked.

Will there be an upstream log viewer? I don't know, but we could easily put a simple one on the website that just formats a log as a simple HTML table. That is easily done in vanilla JavaScript. But that should be a follow up patch.

As in on llvm.org? That would be neat.

Yeah, I think a simple JSON-to-HTML-table converter would fit on the official website. Not sure if we want to go full JS framework and start including external resources and so on. But writing a parser for the format is a matter of seconds, so I expect other people to write more complicated viewers.

Python does great with JSON so I'm sure it would be easy to hack on more specific scripts later too.

Will you be adding documentation about what the format is? (said web page would be a good place for it actually)

Done, I just put it on the website where the other internal design stuff is documented.

I know it won't be that complicated but it's always nice to see the total set of fields you might expect.

Agreed, I added a table with the fields + type + explanation on the website.

This looks like a great effort to improve logging and debugging! Have you considered also adding logging levels (info, debug, warning, error ...) ? I think the CommandReturnObject and Status classes already makes the distinction between warnings and errors messages. We could enrich logging further more by having logging levels to filter / color-code the messages.

Jepp, see my 'possible follow ups" comment above :). Having a log highlighting/filter for all the cases where we error but just keep chugging along would be really nice.

More a comment than anything else: One thing I always wanted to explore was to implement LLDB's logging on Darwin on top of os_log (https://developer.apple.com/documentation/os/logging) which is also structured and also faster since it moves the the formatting out of process. This patch is great because it also works for non-Darwin platforms, and I guess there isn't anything preventing us from moving to os_log even if we take patch.

Just logging messages to os_log shouldn't be a problem and I can do that as a follow-up. I don't think we can really use the optimization part where it stores the format string + args as that would require that everything is always using the old printf-style logging (which LLDB_LOG and LLDB_LOG_ERROR replace with a type safe API that uses different format strings).

teemperor updated this revision to Diff 365944.Aug 12 2021, 2:32 AM
  • Fix double printing and formatting of the old plain text format.

Friendly ping!

aprantl accepted this revision.Aug 27 2021, 12:49 PM

I think this is a great overall addition, might be good to have at least one other reviewer sign off though. (Summoning
@jingham!)
Should we add a python script to pretty-print a JSON log file?

This revision is now accepted and ready to land.Aug 27 2021, 12:49 PM

I have a few little grammatical nits.

The suggestion to use EnumArgs rather than a fresh argument type you can do or not as you see fit.

It really seems to me that SBDebugger.EnableLog should provide a way to set the OutputFormat. For humans, the plain text is probably preferable, but for programs that run lldb and want to control logging output, JSON would seem more natural. And programs that run lldb are more likely to want to use EnableLog than HandleCommand, since we've provided it...

But if you do that the OutputFormat should be a public enum, it would be dopey to have an API that takes a char * of "plain" or "json".

There should be a setting for the preferred OutputFormat.

lldb/docs/design/logging.rst
23 ↗(On Diff #365944)

You at least need "the format", but maybe

configuring the format with which log messages are emitted and the metainformation which should...

is a little easier to parse.

25 ↗(On Diff #365944)

These options are applied to log channels, and apply to all enabled categories in that channel.

47 ↗(On Diff #365944)

This makes it not proper JSON by itself, right? I'm guessing you do it this way so you don't have to track and last log emission? I can see that that would be more trouble than it's worth, so I'm not objecting. It is a little weird to call this JSON format when it is just JSON-adjacent, but that's probably too picky...

65 ↗(On Diff #365944)

We always know which channel/category pair produces the message. I think it would be handy to also always include the channel/category pairs. That seems like it would be useful information.

76 ↗(On Diff #365944)

The sequence ID is useful when the output is unstructured, but do we really need it in the structured form? This is just the index of the message in the JSON vector.

82 ↗(On Diff #365944)

Seems like the pid & tid are always ganged together. Maybe indicate that here?

But it does seem a shame that to get the thread info, I have to pay for "pid = NNNN" in every output message.

Seems like the pid should go in some sort of header, it won't change. You didn't add this behavior, but putting it in the JSON makes the pid emission cost about twice what it does in text form.

lldb/include/lldb/Utility/Log.h
60

I was a little surprised to see this enum in Log.h and not in the lldb public enums. Don't you need that so you can make an SB API to set the log output format? And don't you need that API?

lldb/source/Commands/Options.td
446 ↗(On Diff #365944)

You could also do this with the EnumArg descriptor. That way the help will show the options. I'm pretty sure that also auto-gen's the completion for the option as well.

lldb/unittests/Utility/LogTest.cpp
270

Since you allow the Output format to be changed while the channel stays enabled, you should add a test for that. You code should handle that gracefully, from what I can see. But still...

labath added a subscriber: labath.Sep 8 2021, 5:43 AM

I think that a structured logging format could be very useful. I am not sure what to make of the trailing "," in the json format though. It seems its usefulness is fairly limited, given that additional post processing is still needed to turn it into valid json. One could envision printing the surrounding [/] when logging is enabled/disabled, but I'm not sure that's such a good idea since one often inspects logs from crashing lldb instances. I have a feeling it would be cleanest to just drop the comma entirely. That way at least each individual log message would be valid json (so one could feed it to some log analyzer/storage backend in real time), and adding the trailing comma would would still be fairly trivial in post processing.

I too think that we should keep the textual log format, because it's much easier to read when printing to the terminal interactively. One thing to note there is that this patch removes the column alignment code from the textual output. With the advent of the structured format, this may not be so useful, as uninteresting fields can be removed differently (right now, I use vi block editing commands for that), but some alignment might still be useful for the ease of reading.

  • Added the log format as an option to log enable (o, f and F are taken, so I just picked O as the short option).

This seems like a good time to remind everyone that not every option needs to have a short form. I think that skipping one is a better idea than picking a random letter that noone is going to remember.

This looks like a great effort to improve logging and debugging! Have you considered also adding logging levels (info, debug, warning, error ...) ? I think the CommandReturnObject and Status classes already makes the distinction between warnings and errors messages. We could enrich logging further more by having logging levels to filter / color-code the messages.

We had a concept of different log levels at one point, but it wasn't really used, so it was removed during some refactoring. Though, with a structured log format, I can see how it could be useful to reintroduce that.

More a comment than anything else: One thing I always wanted to explore was to implement LLDB's logging on Darwin on top of os_log (https://developer.apple.com/documentation/os/logging) which is also structured and also faster since it moves the the formatting out of process. This patch is great because it also works for non-Darwin platforms, and I guess there isn't anything preventing us from moving to os_log even if we take patch.

BTW, there's a function called Host::SystemLog, which (I think) logs to some darwin-specific facility on darwin hosts. Personally, I think that having two logging mechanisms is just confusing, but I think I would be interesting if you could redirect the "normal" lldb logs to some os-specific target (like you can do with a file).

lldb/docs/design/logging.rst
65 ↗(On Diff #365944)

The way logging works right now, we only know which channel the message came from. Category is lost, and sometimes it's never really clear because of stuff like GetLogIfAny/AllCategoriesSet(CAT1 | CAT2).

To be able to do this (which I think is a great idea), we'd need to slightly refactor the logging code, so thateach message can be uniquely associated with a specific category.

lldb/source/Utility/Log.cpp
305–311

I guess you should use llvm::ListSeparator for this.

310–312

This is now dead code.