This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Support fetching symbols with dsymForUUID in the background
ClosedPublic

Authored by JDevlieghere on Aug 6 2022, 9:38 AM.

Details

Summary

On macOS, LLDB uses the DebugSymbols.framework to locate symbol rich dSYM bundles. [1] The framework uses a variety of methods, one of them calling into a binary or shell script to locate (and download) dSYMs. Internally at Apple, that tool is called dsymForUUID and for simplicity I'm just going to refer to it that way here too, even though it can be be an arbitrary executable.

The most common use case for dsymForUUID is to fetch symbols from the network. This can take a long time, and because the calls to the DebugSymbols.framework are blocking, lldb does not launch the process immediately. This is the expected behavior and many people therefore often don't use this functionality, but instead use add-dsym when they want symbols for a given frame, backtrace or module. This is a little faster because you're only fetching symbols for the module you care about, but it's still a slow, blocking operation.

This patch introduces a hybrid approach between the two. When symbols.enable-background-lookup is enabled, lldb will do the equivalent of add-dsym for every module it doesn't have symbols for in the background. From the user's perspective there is no slowdown, because the process launches immediately, with whatever symbols are available and more symbol information is added over time as the background fetching completes.

[1] https://lldb.llvm.org/use/symbols.html

Diff Detail

Event Timeline

JDevlieghere created this revision.Aug 6 2022, 9:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2022, 9:39 AM
JDevlieghere requested review of this revision.Aug 6 2022, 9:39 AM

Interesting idea, it seems worth trying out and seeing how it works in practice.

My read of the llvm::ThreadPool is that this won't create an unbounded number of threads doing background fetches, right. Some of the big UI apps in macOS these days can have a thousand dylib/frameworks loaded in them.

I wonder if it might be interesting to target this a bit - only start background downloading the dSYM for a Module once it has appeared in a backtrace, for instance. Any UI application on macOS etc has around 500 libraries loaded, but I believe only a few dozen of those libraries will actually appear in backtraces as they're debugging. We could delay it until the UnwindTable ctor is called for instance -- that's the first thing that will happen when a Module is in a backtrace. Just thinking aloud, not arguing for it especially.

Interesting idea, it seems worth trying out and seeing how it works in practice.

My read of the llvm::ThreadPool is that this won't create an unbounded number of threads doing background fetches, right. Some of the big UI apps in macOS these days can have a thousand dylib/frameworks loaded in them.

Correct, the thread pool is limited by the number of cores/physical CPUs (depending on the ThreadPoolStrategy). I contemplated using a dedicated thread pool for this, but didn't like the idea of having to manage its lifetime and pick and arbitrary fixed number. The only risk here is that we starve other work scheduled on this ThreadPool (currently that's building the manual DWARF index, so not really a concern on our platform).

I wonder if it might be interesting to target this a bit - only start background downloading the dSYM for a Module once it has appeared in a backtrace, for instance. Any UI application on macOS etc has around 500 libraries loaded, but I believe only a few dozen of those libraries will actually appear in backtraces as they're debugging. We could delay it until the UnwindTable ctor is called for instance -- that's the first thing that will happen when a Module is in a backtrace. Just thinking aloud, not arguing for it especially.

I really like this suggestion. It did bother me how naive/wasteful the current approach is. The downside of doing this more lazily is that you potentially had enough time to fetch everything by the time you did the first backtrace. But it's equally likely that you're waiting for those one or two modules you did care about (and are part of the backtrace) while 200 others you don't care about are being pulled in. I think your suggestion hits the sweet spot in terms of latency and doing only meaningful work.

Implement Jason's suggestion of delaying this until we need the unwind table for a module.

labath added a subscriber: labath.Aug 8 2022, 4:52 AM

I see two problems with this patch.

  • it makes shutting down lldb safely impossible (because you never know if there are any unfinished tasks running). We already have problems with shutting down (grep for "intentionally leaked"), but this is guaranteed to make the problem worse. Using the global thread pool would make things better, as then one could flush the pool in SBDebugger::Terminate, or something similar
  • there seems to be a race between new debuggers/targets being added, and their existing instances being notified. I'm not 100% sure what will happen in this case, but it seems very easy to miss some debuggers (or targets) that were added after we took the "snapshot" of the current state

(There's also the (obvious) problem of any breakpoints that are set on code within those modules becoming effective at an unpredictable future data, but I am assuming you are aware of that.)

I see two problems with this patch.

  • it makes shutting down lldb safely impossible (because you never know if there are any unfinished tasks running). We already have problems with shutting down (grep for "intentionally leaked"), but this is guaranteed to make the problem worse. Using the global thread pool would make things better, as then one could flush the pool in SBDebugger::Terminate, or something similar

It *is* using the global thread pool, but it doesn't look like that's being flushed. I can rework that in a separate patch to make it fit the Initialize/Terminate pattern.

  • there seems to be a race between new debuggers/targets being added, and their existing instances being notified. I'm not 100% sure what will happen in this case, but it seems very easy to miss some debuggers (or targets) that were added after we took the "snapshot" of the current state

As I (tried to) explain in the comment, since we set the symbol file in the module before taking the "snapshot", I don't think this is a problem. If a target was created after, it should already know about the symbol file. Did I miss something with that approach?

(There's also the (obvious) problem of any breakpoints that are set on code within those modules becoming effective at an unpredictable future data, but I am assuming you are aware of that.)

Yeah, but I don't see any way around that. The unpredictability is unfortunate, but from UX perspective it's not much worse than modules getting loaded at runtime (although hopefully that's more deterministic).

labath added a comment.Aug 8 2022, 8:31 AM

I see two problems with this patch.

  • it makes shutting down lldb safely impossible (because you never know if there are any unfinished tasks running). We already have problems with shutting down (grep for "intentionally leaked"), but this is guaranteed to make the problem worse. Using the global thread pool would make things better, as then one could flush the pool in SBDebugger::Terminate, or something similar

It *is* using the global thread pool, but it doesn't look like that's being flushed. I can rework that in a separate patch to make it fit the Initialize/Terminate pattern.

Ok, I'm sorry. I got confused by the static ThreadPoolTaskGroup thingy. In either case, I think we should flush that pool during termination. The reason we didn't need that before is because this is the first time that we have these sorts of fully async tasks (the other task pool tasks are "flushing" themselves).

  • there seems to be a race between new debuggers/targets being added, and their existing instances being notified. I'm not 100% sure what will happen in this case, but it seems very easy to miss some debuggers (or targets) that were added after we took the "snapshot" of the current state

As I (tried to) explain in the comment, since we set the symbol file in the module before taking the "snapshot", I don't think this is a problem. If a target was created after, it should already know about the symbol file. Did I miss something with that approach?

Right. I missed that part. I think this should be mostly fine. I mean, I suppose this can still race in the other "direction", where a target picks up the new symbol file, but we still end up calling SymbolsDidLoad manually. But hopefully that is ok (?)

(There's also the (obvious) problem of any breakpoints that are set on code within those modules becoming effective at an unpredictable future data, but I am assuming you are aware of that.)

Yeah, but I don't see any way around that. The unpredictability is unfortunate, but from UX perspective it's not much worse than modules getting loaded at runtime (although hopefully that's more deterministic).

Yeah, if that's the thing you want, then that's the thing you want. One could play some UX games, and add a mechanism to show the downloads that are in progress and/or force the completion of loading of some symbols of whatever, but there's no way to get rid of that completely.

There was another approach I considered initially which might alleviate some of the issues Pavel raised. Instead of using the global thread pool, that plan consisted of a dedicated thread to fetch symbols in the background. That would allow us to add the symbol info synchronously. For example on every stop we could ask if there are any new symbol files to add instead of trying to add the symbol info as soon as it becomes available.

labath added a comment.Aug 8 2022, 8:44 AM

There was another approach I considered initially which might alleviate some of the issues Pavel raised. Instead of using the global thread pool, that plan consisted of a dedicated thread to fetch symbols in the background. That would allow us to add the symbol info synchronously. For example on every stop we could ask if there are any new symbol files to add instead of trying to add the symbol info as soon as it becomes available.

I kinda like that. Bonus points if you make that single thread download multiple files at the same time (using threads to paralelize downloading is not the best approach, since the task is not cpu-bound). :)

  • Use a debugger event to execute the notification of the targets on the event handler thread

There was another approach I considered initially which might alleviate some of the issues Pavel raised. Instead of using the global thread pool, that plan consisted of a dedicated thread to fetch symbols in the background. That would allow us to add the symbol info synchronously. For example on every stop we could ask if there are any new symbol files to add instead of trying to add the symbol info as soon as it becomes available.

I kinda like that.

I tried prototyping this. I wasn't super happy with the results:

  • This one's minor, but compared to the current patch, there's a bunch of synchronization boilerplate: input/output mutex, condition variable, etc
  • Who should own this class? Inherently it's tied to the shared module list, but that never gets destroyed, so that has the flushing issue. The only viable alternative is to have an instance controlled by Debugger::{Initialize/Terminate}, which is exactly what we do for the global thread pool in D131407.
  • There's really no good place to poll for this in the target. You'd need to do it from the process.
  • Once you poll for new symbol files, you can't flush the newly added modules from the download (unless you store them in the target) because other targets will need to be notified later.

All these things can be overcome, but that was the point where I decided that maybe it wasn't worth it.

Instead I focussed on avoiding doing all this work from an arbitrary thread and using a broadcast event to do it from the event handler thread.

Bonus points if you make that single thread download multiple files at the same time (using threads to paralelize downloading is not the best approach, since the task is not cpu-bound). :)

If LLDB was doing the downloading itself we might be able to do better, but by shelling out to dsymForUUID all we see in an opaque blocking call.

This is an awesome enhancement for environments where the dSYM can be located programmatically --- downloading only the dSYMs that are actually being used in a debug session makes it low overhead versus grabbing everything, and it's great to have them loaded for you when you actually want to look at them. I think the LocateSymbolFileMacOSX.cpp part of the patch needs to be rebased now that https://reviews.llvm.org/D131303 landed. I hope we can get this integrated, I can't wait to live on it.

Rebase on top of D131303

This revision is now accepted and ready to land.Aug 15 2022, 5:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 15 2022, 5:57 PM