This is an archive of the discontinued LLVM Phabricator instance.

[lldb/Plugins] Add ability to fetch crash information on crashed processes
ClosedPublic

Authored by mib on Feb 14 2020, 4:47 PM.

Details

Summary

Currently, in macOS, when a process crashes, lldb halts inside the
implementation disassembly without yielding any useful information.
The only way to get more information is to detach from the process, then wait
for ReportCrash to generate a report, find the report, then see what error
message was included in it. Instead of waiting for this to happen, lldb could
locate the error_string and make it available to the user.

This patch addresses this issue by enabling the user to fetch extended
crash information for crashed processes using process status --verbose.

Depending on the platform, this will try to gather different crash information
into an structured data dictionnary. This dictionnary is generic and extensible,
as it contains an array for each different type of crash information.

On Darwin Platforms, lldb will iterate over each of the target's images,
extract their __crash_info section and generated a StructuredData::Array
containing, in each entry, the module spec, its UUID, the crash messages
and the abort cause. The array will be inserted into the platform's
m_extended_crash_info dictionnary and FetchExtendedCrashInformation will
return its JSON representation like this:

{
  "crash-info annotations": [
    {
      "abort-cause": 0,
      "image": "/usr/lib/system/libsystem_malloc.dylib",
      "message": "main(76483,0x1000cedc0) malloc: *** error for object 0x1003040a0: pointer being freed was not allocated",
      "message2": "",
      "uuid": "5747D0C9-900D-3306-8D70-1E2EA4B7E821"
    },
    ...
  ],
  ...
}

This crash information can also be fetched using the SB API or lldb-rpc protocol
using SBTarget::GetExtendedCrashInformation().

rdar://37736535

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

Diff Detail

Event Timeline

mib created this revision.Feb 14 2020, 4:47 PM
mib updated this revision to Diff 244793.Feb 14 2020, 5:02 PM

Run clang-format

I have some comments (beside my usual documentation spam). But otherwise looks good to me.

lldb/bindings/interface/SBProcess.i
196 ↗(On Diff #244793)

documentation.

lldb/include/lldb/Target/Process.h
1333

Documentation pls

2635

Why make this a typedef'd anonymous struct?

Also maybe link to what struct this is mapping to (I assume this layout is somewhere on disk). That would also explain the mysterious type comments behind.

2646

documentation :p

2655

If this would return StructuredData then the structs above could be hidden in the implementation. Just a suggestion.

(Also documentation)

lldb/packages/Python/lldbsuite/test/commands/process/crash-info/TestProcessCrashInfo.py
5 ↗(On Diff #244793)

You don't need that if you don't use print in your test.

19 ↗(On Diff #244793)

I think I removed that comment from most tests, so you can also remove it here (it really doesn't tell me anything that isn't obvious).

64 ↗(On Diff #244793)

self.assertIn and you'll get a nice error message when this fails.

lldb/source/Commands/CommandObjectProcess.cpp
1557

I think there should be a the before the process'

1565

If you don't accept arguments you should check that there are no arguments or otherwise throw an error. Currently process crash-info banana doesn't give an error.

1581

Shouldn't that have a return (otherwise this dereferences the pointer below).

lldb/source/Target/Process.cpp
1133 ↗(On Diff #244793)

You can omit .AsCString(). when doing LLDB_LOG

1184 ↗(On Diff #244793)

.back() ?

1190 ↗(On Diff #244793)

early exit/

1197 ↗(On Diff #244793)

.back()

labath requested changes to this revision.Feb 17 2020, 12:53 AM
labath added a subscriber: labath.

All of this code you're adding to Process.cpp is extremely specific to one platform's notion of "crash info". This should be factored out so that macos-specific details live inside some macos class. (In the current setup, it's probably easiest to put it in PlatformDarwin, or some subclass thereof).

This revision now requires changes to proceed.Feb 17 2020, 12:53 AM
mib marked 15 inline comments as done.Feb 19 2020, 8:19 AM
mib updated this revision to Diff 245424.Feb 19 2020, 8:45 AM
mib retitled this revision from [lldb/Target] Add process crash-info command to [lldb/Plugins] Add `platform process crash-info` command.
mib edited the summary of this revision. (Show Details)

Moved implementation from Process to Platform / PlatformDarwin and made it more generic and extensive so other platforms could also use it.
Added documentation and addressed comments.

JDevlieghere added inline comments.Feb 19 2020, 9:46 AM
lldb/include/lldb/Target/Platform.h
838

I had a hard time understanding this sentence. Is the crash information type the key of the entry or the values in the array?

882

I find it suspicious that this member is stored here but not used. Does it need to be stored in the first place? Given the FetchExtendedCrashInformation I'd expect these dictionaries to be different per target?

lldb/source/Commands/CommandObjectPlatform.cpp
1520 ↗(On Diff #245424)

This comment is useless.

1521 ↗(On Diff #245424)

As I've said in other reviews I really dislike the #pragma mark. It's fine if the file already uses them but I really don't want to encourage new ones being added.

1541 ↗(On Diff #245424)

Shouldn't you update the return status? Same below.

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.cpp
1514

The braces are inconsistent with the single-line ifs below. Why not wrap this log statement in the if check below?

if (!annotations)
    LLDB_LOG(log, "Couldn't extract crash information annotations")
else
    m_extended_crash_info->AddItem("crash-info annotations", annotations);

If you want the log statement to stay close to the ExtractCrashInfoAnnotations call (which I think you should), you could move the !m_extended_crash_info check up.

1596

braces

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.h
93

Assuming the header defines this struct, why not do

#ifdef HAVE_CRASHREPORTERCLIENT_H 
#include <CrashReporterClient.h> 
#else 
struct CrashInfoAnnotations {
  ...
}
#endif
124

The log is a singleton, why pass it around?

This looks like a great feature, I see myself using this a lot in the future. Thanks for working on this!

mib marked 10 inline comments as done.Feb 19 2020, 10:41 AM
mib added inline comments.
lldb/include/lldb/Target/Platform.h
838

As I was trying to explain it on the diff summary, the "crash information type" (i.e. `crash-info annotations) is the key of the dictionary entry and the value is a dictionary array that contains all the entries for that specific crash information type

882

I did that so other platforms just need to implement FetchExtendedCrashInformation and also to make sure the SBAPI method keeps the same behaviour regardless of the platform.

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.h
93

The struct have a different name in the header (which is not a big deal), but I don't see the point of including the header if I give the structure definition below it.

I put the header comment following @teemperor's feedback (https://reviews.llvm.org/D74657#inline-679127) but maybe I could get rid of it ?

mib marked an inline comment as done.Feb 19 2020, 10:44 AM

This looks like a great feature, I see myself using this a lot in the future. Thanks for working on this!

Appreciated it!

JDevlieghere added inline comments.Feb 19 2020, 10:59 AM
lldb/include/lldb/Target/Platform.h
838

Would you mind updating the comment to explain it that way?

882

I'm not sure I understand how that answers my questions. The SB API calls FetchExtendedCrashInformation, what does it care about the member? Why not return a new DictionarySP every time?

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.h
93

I'll leave it up to your discretion, but I'd either specify the struct myself and say it's defined in CrashReporterClient.h, or alternatively put the ifdefs in the code and include the header and typedef it if the header is available, and use the custom struct otherwise, which means we still have the layout and type info Raphael asked about.

mib updated this revision to Diff 245528.Feb 19 2020, 2:45 PM
mib marked 5 inline comments as done.

Update documentation.
Address Jonas' comments.

This looks a bit better, but my the purpose of my previous command was not to move the location of the cli command itself -- I think it was ok just where it was. The fact that the part of the functionality is implemented in the Platform class does not mean the cli command has to be there.

lldb/source/Commands/CommandObjectPlatform.cpp
1574–1577 ↗(On Diff #245528)

I am not sure if this command really belongs here. If definitely doesn't fit the description "Commands to query, launch and attach to processes on the current platform.", and if you look at the subcommands, you see that these are all things that you do *before* you get an actual running process. "crash-info" is the exact opposite -- it only makes sense *after* you have a process.

The "process" (where you had it originally) command tree contains commands for interaction with a running process (although it also overlaps these commands somewhat :/). This command is pretty similar to "process status" so I think it makes sense for it to be next to it (in fact, it is so similar, that I'd consider putting this under some "verbose" switch of process status).

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.h
93

I don't think it makes sense to include the system definition if you're going to provide the alternative -- either you support both cross and native scenarios (in which case you can use the hand-rolled version everywhere), or you just support the native one (and you just use the system version). That said, if the struct contains pointer members, I am not sure if you can actually use the native struct like that. What if the process you're debugging is 32-bit ?

And whatever you do, please don't include a system header in the middle of a class.

mib marked 2 inline comments as done.Feb 20 2020, 12:20 PM
mib added inline comments.
lldb/source/Commands/CommandObjectPlatform.cpp
1574–1577 ↗(On Diff #245528)

I misunderstood your request. I'm moving it under "process status --verbose".

mib marked 2 inline comments as done.Feb 20 2020, 12:25 PM
mib updated this revision to Diff 245721.Feb 20 2020, 12:34 PM
mib retitled this revision from [lldb/Plugins] Add `platform process crash-info` command to [lldb/Plugins] Add ability to fetch crash information on crashed processes.
mib edited the summary of this revision. (Show Details)

Moved invocation to process status --verbose.
Removed macOS' CrashReporterClient header.

JDevlieghere accepted this revision.Feb 20 2020, 1:55 PM

A few small nits inline, but this LGTM if Pavel is on board.

lldb/source/Commands/CommandObjectProcess.cpp
1260

Please use single quotes like we do for other command objects.

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.cpp
1541

You can move this out of the loop

A few small nits inline, but this LGTM if Pavel is on board.

We're getting close, but I still see some room for improvement. :)

lldb/bindings/interface/SBPlatform.i
195–198 ↗(On Diff #245721)

If we keep the this way, then the user will have to remember to get the correct platform in order to ask it for the crash info. That's not the end of the world, but it might be better if this was a method on the SBTarget class, which would automatically consult the target's current platform.

lldb/source/Commands/CommandObjectProcess.cpp
1288–1291

So, this will print an error if we don't get a crash information for _any_ reason (process did not crash, platform does not support crash info, ...), is that right?

That seems overly aggressive. I think this should be more nuanced. Maybe if FetchExtendedCrashInformation returned Expected<StructuredDataSP>, then you could do something like

  • if the result is an error => something really went wrong => print the error message
  • nullptr => not an error, but we also didn't get anything => don't print anything
  • an actual dictionary => hurray => print the crash info

Then the default implementation can return nullptr, and in PlatformDarwin you can precisely decide what treatment does a certain situation deserve (e.g. whether not being able to find the __crash_info section is an error or just "nothing").

1294

Should you maybe print some kind of a header to make it clear that what follows is the crash information?

lldb/test/API/functionalities/process_crash_info/TestProcessCrashInfo.py
38

Besides the "positive" test, I think it would be also good to have some tests for the "negative" cases. E.g. a case where the process did not crash, or when we have not even launched it yet.

mib updated this revision to Diff 245939.Feb 21 2020, 11:48 AM
mib marked 6 inline comments as done.
mib edited the summary of this revision. (Show Details)

Addressed Pavel's comments.
Add "negative" tests.

mib updated this revision to Diff 245952.Feb 21 2020, 12:04 PM

Removed old revision artifacts
Renamed test_on_launch to test_before_launch

JDevlieghere accepted this revision.Feb 21 2020, 1:40 PM

LGTM'ing for Pavel by proxy as discussed on IRC.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 21 2020, 1:53 PM
This revision was automatically updated to reflect the committed changes.

A couple of more comments from me. The two on the test are minor, but I think it would be good to resolve the one about the SB method placement soon, before this thing starts having users.

lldb/bindings/interface/SBTarget.i
952–957

I know it was my idea to have this on the SBTarget class, but I think we should move this once more :/. I mean, this is definitely better than the SBPlatform class (which is what I was really trying to say before), but I think SBProcess would be even better here. The SBTarget methods are things that (more or less) make sense even when you don't have an actual process around. However, "crash info" (I think) only makes sense once you have an actual process, and so it would make sense to have it somewhere near SBProcess::GetState/GetDescription. It would also be more symmetric with the CLI interface for this, which lives under the "process" command tree.

The internal interfaces might also be better off passing a Process object (instead of Target) around, but that is not as important..

Sorry. :(

lldb/test/API/functionalities/process_crash_info/TestProcessCrashInfo.py
20

We have a line_number utility function to avoid hardcoding line numbers like this.

65–82

I'm hoping that these two tests can actually run (and pass) on non-darwin systems too...