This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Use ObjectFileJSON to create modules for interactive crashlogs
ClosedPublic

Authored by JDevlieghere on Apr 12 2023, 2:01 PM.

Details

Summary

Create an artificial module using a JSON object file when we can't locate the module and dSYM through dsymForUUID (or however locate_module_and_debug_symbols is implemented). By parsing the symbols from the crashlog and making them part of the JSON object file, LLDB can symbolicate frames it otherwise wouldn't be able to, as there is no module for it. For non-interactive crashlogs, that never was a problem because we could simply show the "pre-symbolicated" frame from the input. For interactive crashlogs, we need a way to pass the symbol information to LLDB so that it can symbolicate the frames, which is what motivated the ObjectFileJSON format.

Diff Detail

Event Timeline

JDevlieghere created this revision.Apr 12 2023, 2:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 2:01 PM
JDevlieghere requested review of this revision.Apr 12 2023, 2:01 PM
JDevlieghere updated this revision to Diff 513027.
JDevlieghere retitled this revision from [lldb] Use ObjectFileJSON to create modules for interactive crashlogs (WIP) to [lldb] Use ObjectFileJSON to create modules for interactive crashlogs.
JDevlieghere edited the summary of this revision. (Show Details)

Add missing CHECK line in skipped_status_interactive_crashlog.test.

mib added inline comments.Apr 12 2023, 6:13 PM
lldb/examples/python/crashlog.py
460–464

Why do we need this ?

515

What do we use idx for ?

526–527

Seems like we're doing the same things for both self.images and self.crashlog.images ... In that case, is self.images really necessary ?

lldb/examples/python/scripted_process/crashlog_scripted_process.py
25–45

Nice!

lldb/examples/python/symbolication.py
413

Fancy! Can't we just do this and remove the extra import ?

428–429

This should work, right ?

mib added inline comments.Apr 12 2023, 6:21 PM
lldb/examples/python/symbolication.py
425

Since this field is optional in the ObjectFileJSON, I think it should be included here. We should just initialize it to 0 in lldb if the user didn't provide it.

JDevlieghere marked 7 inline comments as done.Apr 12 2023, 7:56 PM
JDevlieghere added inline comments.
lldb/examples/python/crashlog.py
460–464

The parser now stores a list of images sorted by their index. We can't use the list of images in the crashlog (self.crashlog.images) because they have the main module at index 0.

515

You're right, this isn't necessary anymore.

526–527

Yes, see my previous comment.

lldb/examples/python/symbolication.py
428–429

Apparently not: values() returns a view object rather than a list: https://docs.python.org/3/library/stdtypes.html#dict-views

JDevlieghere marked 4 inline comments as done.

Address @mib's code review feedback

mib added inline comments.Apr 13 2023, 10:44 AM
lldb/examples/python/crashlog.py
515

I'm really not a big fan of having very similar image lists ... may be we could use the from the crashlog object and skip the first entry (since we know it's the main executable).
What do you think ?

555

Does this needs to be initialized ?

lldb/examples/python/symbolication.py
427–428

What about :

mib added inline comments.Apr 13 2023, 10:46 AM
lldb/examples/python/crashlog.py
515

Otherwise, we could hoist the main executable image from the image list and handle it separately.

JDevlieghere marked 4 inline comments as done.
JDevlieghere added inline comments.
lldb/examples/python/crashlog.py
515

I understand the concern. To be fair, I didn't check whether the main executable coming first is something we rely on, but I'm pretty sure we are: we'll need it to create the target. I didn't want to mess with that and risk introducing a bug that way. It took me quite some time to figure out this was an issue when parsing the symbol data. If we don't want to break that assumption, there's nothing more efficient than keeping a second list of references. I also think it makes sense to keep that in the JSON parser, because the index of (parsed) image is only something that makes sense for that format because it cross references images based on their index. That's not the case in the textual or parser crashlogs.

FWIW this is the code that moves the main image to the top, invalidating the image indexes of every image before it:

def set_main_image(self, identifier):
    for i, image in enumerate(self.images):
        if image.identifier == identifier:
            self.images.insert(0, self.images.pop(i))
            break
mib accepted this revision.Apr 13 2023, 1:19 PM

LGTM!

lldb/examples/python/crashlog.py
515

Fair

This revision is now accepted and ready to land.Apr 13 2023, 1:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2023, 4:56 PM