# Return "thread-pcs" in jstopinfo on Linux/Android.AbandonedPublicActions

Authored by labath on Nov 30 2016, 5:24 PM.

# Details

Reviewers
 jmajors
Summary

To prevent costly calls to the server to get the PC for every thread,
add all the thread's PCs to the jstopinfo message. This also makes the
Linux/Android server behave like the macOS server.

# Diff Detail

Build Status
 Buildable 1744 Build 1744: arc lint + arc unit

### Event Timeline

jmajors updated this revision to Diff 79846.Nov 30 2016, 5:24 PM
jmajors retitled this revision from to Return "thread-pcs" in jstopinfo on Linux/Android..
jmajors updated this object.

For what it's worth, this change was part of the changes we made to reduce packet traffic for "private stops". When a high-level "step" or "next" command is done, we instruction step (or fast-step with Jim's fast-step code when it sees a sequence of instructions that cannot branch - we set a breakpoint and continue instead of instruction stepping) until we reach the end of the stepping range and then we do a "public stop". Given that the private stops are more numerous, we spent a bunch of time looking at everything we could eliminate for those private stops. Getting all the pc values into the stop packet (aka T or ? packet) was a big one. For multi-threaded programs we also added a "jstopinfo" which gives the stop reasons for all threads that had a stop reason, e.g.

(I elided the expedited registers)

The jstopinfo above is ascii-hex encoding of the JSON,

[{"tid":591924,"metype":6,"medata":[2,0],"reason":"exception"}]

else lldb would iterate over all the threads to see if any of them stopped for some reason (e.g. hitting a breakpoint) while we were instruction stepping. If you don't see lldb doing that thread-stop-reason checking for your platform, it won't be needed.

(and looking at this, it's giving the stop reasoon for the thread that hit a breakpoint -- which we already get from the T packet. Hmmm, I see another tiny perf win in the near future! ;)

We're also sending up 4 words of stack memory which might be the start of this function's stack frame and it's caller's stack frame. For a simple "find the pc of the caller stack frame" in lldb, this can remove a couple of memory reads as the thread stepping logic needs to know who the caller stack frame is.

We also did some work on the public stop side, adding a jThreadsInfo packet which gives us all the information about all of the threads. We get all the expedited registers, the thread name, the stop reasons, information about the darwin libdispatch queues, and expedited stack memory to help speed up a stack walk (I don't remember offhand if debugserver tries to walk the entire stack using the simple follow-the-frame-chain-in-stack-memory technique or if it caps the stack walk. but our main IDE, Xcode, needs the full stack for its UI so we wanted to fetch that in one big packet so we can give the pc and function names without any additional packet traffic.)

labath edited edge metadata.Dec 1 2016, 2:12 AM

Looks like a good starting point. Apart from some style nits, I'd like us to remove sending of the registers field in the jstopinfo, as it is not necessary anymore. Also, we should add an lldb-server style test for this.

@jasonmolenda, we already have jThreadsInfo support in lldb-server, albeit without the memory snippet parts. A lot less functions use frame pointers on linux than on osx, so I was not sure whether implementing the simple FP-following will be useful. However, that is the next thing on my mind in terms of optimizations here.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
441

This function is called in only two sites. My preference would be to update them to use the new signature instead of adding another overload.

579

We can now change this code to:

if (!abridged) {
...
}

This was my attempt to send the PCs in the jstopinfo field, which used to work at some point, but now lldb client does not seem to recognize it. When we start sending the PC in the thread-pcs field, the whole registers field will become completely redundant, and we should remove it.

This will not affect the jThreadsInfo packet, as there we pass abridged = false.

(Obviously we should also clean up the GetRegistersAsJSON function to reflect the fact it does not need to support PC-only register sending.)

582

@labath ah I see I hadn't looked at the lldb-server packets so I didn't know you had jThreadsInfo, good to hear. Yes, if your target is built mostly -fomit-frame-pointer, lldb-server won't be able to do a stack walk without reading eh_frame or the arm unwind info that Tamas added a year or so ago. We've always avoided having lldb-server/debugserver know anything about files and symbols else the memory footprint will grow and we need these stubs to run in low-memory environments; I'm not sure how it could do a stalk walk. It's a pretty big perf hit for lldb to walk to stacks of modern apps that have tens of threads on every public stop. :(

Agree with not sending the registers in the jstopinfo kv pair in the T/? packet - \it's such a space-inefficient encoding compared to the usual kv pairs of info in the T packet, as weirdly as they're formatted. JSON requires we use base 10 for numbers, and then ascii-hex encoding it doubles it (I played with the idea of using base64 for jstopinfo originally, but instead worked on including the smallest amount of info we needed).

If we were going to be serious about the T packet format, I think we'd add a new JT packet (or whatever) which is all the information in straight JSON, and some request from lldb to enable JT packets instead of T packets. But it doesn't seem like a pressing concern, and the more we deviate from standard gdb-remote protocol (even if it's optional deviation with fallback to standard packets), I think it makes lldb more fragile for interop.

labath added a comment.Dec 2 2016, 2:47 AM

@labath ah I see I hadn't looked at the lldb-server packets so I didn't know you had jThreadsInfo, good to hear. Yes, if your target is built mostly -fomit-frame-pointer, lldb-server won't be able to do a stack walk without reading eh_frame or the arm unwind info that Tamas added a year or so ago. We've always avoided having lldb-server/debugserver know anything about files and symbols else the memory footprint will grow and we need these stubs to run in low-memory environments; I'm not sure how it could do a stalk walk. It's a pretty big perf hit for lldb to walk to stacks of modern apps that have tens of threads on every public stop. :(

I definitely don't want to add dwarf parsing code to lldb-server. We care a lot about the size of the binary as well.

Android studio does not crawl the stack of every thread after a stop -- it only displays the current one. What I was considering sending just K bytes of stack (for some reasonable value of K) around the SP of every thread, so that lldb does not need to do a memory read when computing the first frame of every thread (which I think we even if noone has asked that). But I'm talking in pretty vague terms now, as I haven't looked at this seriously yet. Last time I did a protocol profile, memory reads did not come up as a big problem, so I have not investigated this in details.

Agree with not sending the registers in the jstopinfo kv pair in the T/? packet - \it's such a space-inefficient encoding compared to the usual kv pairs of info in the T packet, as weirdly as they're formatted. JSON requires we use base 10 for numbers, and then ascii-hex encoding it doubles it (I played with the idea of using base64 for jstopinfo originally, but instead worked on including the smallest amount of info we needed).

If we were going to be serious about the T packet format, I think we'd add a new JT packet (or whatever) which is all the information in straight JSON, and some request from lldb to enable JT packets instead of T packets. But it doesn't seem like a pressing concern, and the more we deviate from standard gdb-remote protocol (even if it's optional deviation with fallback to standard packets), I think it makes lldb more fragile for interop.

I'd support switching to a new stop-reply format, the current one has a lot of ad-hoc hacks glued onto it.

@labath intersting idea, sending a blob of stack memory above the current stack pointer reg value to accelerate an unwind. If you have a lot of small stack frames, it could be a performance benefit. Common/simple methods do their work in 32-64 bytes of stack space (IMO) so we could look at the perf implications of sending up the 64 bytes of stack above \$sp in the T packet. We're locked into ascii for the T packet so we're talking about 128 characters plus 16 for the address. It would only be needed on platforms where a frame pointer register is not used by default. If you look at your packet logs, I bet lldb is doing a memory read on private stops as it tries to figure out what the caller function is - you could eliminate that packet with this, assuming smallish stack frames.

As mentioned, for wireless gdb-remote protocol (bluetooth, wifi) the performance characteristics are very different than for wired connections, but if we sent up that blob of stack data only on platforms without a fp reg used by default, it wouldn't impact the apple environments.

Agreed, I think we're all waiting for the addition of one more thing to the T packet to finally push us off of it, and to make up a new stop packet. Greg's addition of jstopinfo almost did it for me personally. :) The only thing I don't like about data sent in JSON is the base 10 requirement for numbers. If we want to send up 64 bytes of stack memory, should we send that as an array of bytes? Or an array of 64-bit ints that have been byteswapped by debugserver to big-endian format? Right now debugserver sends up two words of memory at the frame pointer value - so 16 bytes on a 64-bit process - in native endian order, like