This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Allow loading of minidumps with no process id
ClosedPublic

Authored by labath on Nov 14 2019, 5:43 AM.

Details

Summary

Normally, on linux we retrieve the process ID from the LinuxProcStatus
stream (which is just the contents of /proc/%d/status pseudo-file).

However, this stream is not strictly required (it's a breakpad
extension), and we are encountering a fair amount of minidumps which do
not have it present. It's not clear whether this is the case with all
these minidumps, but the two known situations where this stream can be
missing are:

  • /proc filesystem not mounted (or something to that effect)
  • process crashing after exhausting (almost) all file descriptors (so the minidump writer may not be able to open the /proc file)

At first I wanted to do something similar to what the gdb-remote plugin
does when talking to bare metal gdb stubs and like, and "invent" a
process ID in this case. However, then I noticed that most of the things
"just work" even if I leave the proces ID as "invalid".

Since it seems we have multiple use cases for a "process" without a
"process id", what I do in this patch instead is "downgrade" the missing
pid case to a warning. I also changes the "process status" output to
better handle the case of a missing/unknown pid.

The test case in this patch verifies that basic functionality still
works even with a pid-less minidump.

Event Timeline

labath created this revision.Nov 14 2019, 5:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2019, 5:43 AM

Note: I am certain that there will be things which will fail for these kinds of "processes" -- there's plenty of != LLDB_INVALID_PID checks, but most of them seem to be on the launch/attach code paths which are not exercised here. I am fine with fixing these as they are discovered -- the bigger question here is whether this is a reasonable direction to take lldb in. I think it is, because most of the != LLDB_INVALID_PID checks look like they could be replaced by a check of the process state.

labath added a comment.Jan 7 2020, 9:06 AM

Greg, Jim, any thoughts on pidless processes?

Greg, Jim, any thoughts on pidless processes?

First off, it would be useful to know that the PID for a process couldn't be recovered. For instance when inspecting a dump file, you might want to correlate the process in the dump with PID's recorded in some system log file. It would be confusing to go to "target list" and just have the Process: 1234 record that's usually there be just absent. So we need to represent that somehow. This could either be by cooking the PID, or as Pavel suggested by checking the process status when the PID is invalid and printing "unknown pid" if the PID is invalid but the process status is eStateStopped.

It seems like it would be cleaner to have GetID() express the difference between "process hasn't gotten to the point where I know it's pid yet", and "failure to recover the pid". But I don't think it's a good idea to randomly pick a PID - again if you are correlating the dump with other records that could send you off on a wild goose chase. We could make Process::GetID() return an optional, but that seems annoying and would be hard to express across the SB API boundary. We could either steal another PID for LLDB_UNKNOWN_PROCESS_ID (0xffffffffffffffff?).

I would prefer that we just pick a PID of 1 for minidumps if they are missing and we have minidump files that don't have PIDs. Then no other logic needs to change? Have you seen real minidumps that have memory and threads and no process ID?

The pid usually comes from the MinidumpMiscInfoFlags::ProcessID right? This is a minidump supported directory entry. We don't actually rely on /proc/%d/status right?

The pid usually comes from the MinidumpMiscInfoFlags::ProcessID right? This is a minidump supported directory entry. We don't actually rely on /proc/%d/status right?

Actually, we are. :/ Breakpad does not emit the Misc stream on linux. It always tries to emit the "status" stream, but it seems to sometimes fail (we are getting a fair amount of minidumps like this). It's not fully clear why this happens, but our current hypothesis is file descriptor exhaustion (EMFILE).

Crashpad is always out of process (so it should not suffer from EMFILE), and it also always emits the Misc stream. Since crashpad is the future, and it should always be able to save the PID, this problem will hopefully go away in the future, so I agree it does not make sense to go out of our way (and introducing LLDB_UNKNOWN_PROCESS_ID may be just that) to support this use case. However, I still wanted to see how far I can take this, since the gdb-remote code has a similar problem (though talking to bare metal stubs is also somewhat "niche"). And I do agree with Jim that it we avoid lying to the user.

I think I'll just set the pid to 1 instead. I'll moderate the lying aspect by mentioning this explicitly in the error/warning message.

labath updated this revision to Diff 237274.Jan 10 2020, 2:52 AM

PID := 1

This revision was not accepted when it landed; it landed in state Needs Review.Jan 20 2020, 4:17 AM
This revision was automatically updated to reflect the committed changes.