This is an archive of the discontinued LLVM Phabricator instance.

Rewrite gdb-remote's SendContinuePacketAndWaitForResponse
ClosedPublic

Authored by labath on Jul 21 2016, 7:35 AM.

Details

Summary
SendContinuePacketAndWaitForResponse was huge function with very complex interactions with
several other functions (SendAsyncSignal, SendInterrupt, SendPacket). This meant that making any
changes to how packet sending functions and threads interact was very difficult and error-prone.

This change does not add any functionality yet, it merely paves the way for future changes. In a
follow-up, I plan to add the ability to have multiple query packets in flight (i.e.,
request,request,response,response instead of the usual request,response sequences) and use that
to speed up qModuleInfo packet processing.

Here, I introduce two special kinds of locks: ContinueLock, which is used by the continue thread,
and Lock, which is used by everyone else. ContinueLock (atomically) sends a continue packet, and
blocks any other async threads from accessing the connection. Other threads create an instance of
the Lock object when they want to access the connection. This object, while in scope prevents the
continue from being send. Optionally, it can also interrupt the process to gain access to the
connection for async processing.

Most of the syncrhonization logic is encapsulated within these two classes. Some of it still
had to bleed over into the SendContinuePacketAndWaitForResponse, but the function is still much
more manageable than before -- partly because of most of the work is done in the ContinueLock
class, and partly because I have factored out a lot of the packet processing code separate
functions (this also makes the functionality more easily testable). Most importantly, there is
none of syncrhonization code in the async thread users -- as far as they are concerned, they just
need to declare a Lock object, and they are good to go (SendPacketAndWaitForResponse is now a
very thin wrapper around the NoLock version of the function, whereas previously it had over 100
lines of synchronization code).  This will make my follow up changes there easy.

I have written a number of unit tests for the new code and I have ran the test suite on linux and
osx with no regressions.

Diff Detail

Repository
rL LLVM

Event Timeline

labath updated this revision to Diff 64888.Jul 21 2016, 7:35 AM
labath retitled this revision from to [WIP] Rewrite gdb-remote's SendContinuePacketAndWaitForResponse.
labath updated this object.
labath added a subscriber: tberghammer.
labath updated this revision to Diff 65377.Jul 25 2016, 10:44 AM

clean up a bit

labath updated this revision to Diff 65483.Jul 26 2016, 3:06 AM

Move things into place

labath updated this revision to Diff 65487.Jul 26 2016, 3:31 AM

Use StringRef in the interfaces, where possible. Unify packet/payload naming in interfaces.

labath updated this revision to Diff 65523.Jul 26 2016, 8:10 AM

Fix a race in m_should_stop handlind + drive-by refactor

labath updated this revision to Diff 65534.Jul 26 2016, 8:56 AM

Previous change fixed the race, but made it uncondtitionally broken. Fix that.

tberghammer added inline comments.Jul 27 2016, 3:52 AM
source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp
56 ↗(On Diff #65534)

(nit): I would make it an assert as it is clearly an LLDB bug if we hit this code path

81 ↗(On Diff #65534)

(nit): Not needed

291 ↗(On Diff #65534)

Should we do some sort of cleanup here (e.g. m_async_count--)?

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h
23 ↗(On Diff #65534)

(nit): I think you don't need this

unittests/Process/gdb-remote/GDBRemoteClientBaseTest.cpp
33 ↗(On Diff #65534)

(nit): Please put these into an anonymous namepsace

labath updated this revision to Diff 65713.Jul 27 2016, 5:31 AM

Address review comments. Thanks.

labath updated this revision to Diff 65718.Jul 27 2016, 6:14 AM

final tweaks before sending out for review

labath retitled this revision from [WIP] Rewrite gdb-remote's SendContinuePacketAndWaitForResponse to Rewrite gdb-remote's SendContinuePacketAndWaitForResponse.Jul 27 2016, 6:15 AM
labath updated this object.
labath marked 4 inline comments as done.Jul 27 2016, 6:21 AM
labath added a subscriber: lldb-commits.
clayborg requested changes to this revision.Jul 27 2016, 1:01 PM
clayborg edited edge metadata.

This provides a clearer implementation of our async code that grew over the years and was bolted on, thanks for digging into this.

Looks good a few changes, see inlined comments.

I have some reservations about your future changes to add concurrent packets. We should discuss this at length before you try to make any changes on the mailing list.

source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp
57 ↗(On Diff #65718)

if you add an assert, please use lldbassert

148–150 ↗(On Diff #65718)

Do we need to call lock.DidInterrupt() twice?

261–268 ↗(On Diff #65718)

We can probably switch over using a new function that was added after this code was created:

size_t
StringExtractor::GetHexByteString (std::string &str);

So this could be:

inferior_stdout.reserve(packet.GetBytesLeft() / 2);
packet. GetHexByteString(inferior_stdout.reserve);
source/Plugins/Process/gdb-remote/GDBRemoteClientBase.h
28–29 ↗(On Diff #65718)

Why not just use "const std::string &" here instead of llvm::StringRef? It is OK to use std::string in internal APIs and llvm::StringRef buys us nothing.

31 ↗(On Diff #65718)

Why not just use "const std::string &" here instead of llvm::StringRef? It is OK to use std::string in internal APIs and llvm::StringRef buys us nothing.

This revision now requires changes to proceed.Jul 27 2016, 1:01 PM
labath updated this revision to Diff 65895.Jul 28 2016, 3:17 AM
labath edited edge metadata.
labath marked 2 inline comments as done.
labath updated this object.

Address review comments.

I have some reservations about your future changes to add concurrent packets. We should discuss this at length before you try to make any changes on the mailing list.

Agreed. I have a follow-up almost ready. I think I'll be able to upload it as a WIP later in the day. I think it will be easier to explain the idea with a concrete example.

source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp
57 ↗(On Diff #65718)

This comment shows up in the wrong spot, I am moving it to the place where the assert() is

148–150 ↗(On Diff #65718)

The call is cheap so I don't think it matters, but I've reshuffled it a bit and now it's one.

297 ↗(On Diff #65718)

The assert() (and two more in the lock() function). I am going to change it back to lldbassert, but this checks a very critical part of the internal locking logic, and I would very much doubt our ability recover if these conditions are not met.

source/Plugins/Process/gdb-remote/GDBRemoteClientBase.h
31 ↗(On Diff #65718)

In the previous case it does not matter (right now (*)), but it this case it actually saves one string copy: The caller does a substr(1) on the received packet (to remove the leading 'A'), which is a cheap operation on a StringRef, but not on a string.

(*) It does however, future-proof the api. If the caller or the callee end up using something other than a std::string in the future, the API will remain the same, and cheap, as probably StringRef can be used to convert to/from that.

In general, I would reverse the logic here: there is nothing that can be done with a string that you cannot do with a StringRef(**), so why use the former? It's also consistent with the llvm guidelines.

(**) The only exception here is the c_str() function, but if you use StringRefs everywhere then you don't need to call it. Also, mitigating circumstances are: - a lot of our apis already use const char*+length style, which is just a hand-rolled StringRef and is cheap to convert back and forth; by the time you end up calling c_str(), you have probably have already saved a couple of copies, so doing one copy there is not that bad.

clayborg accepted this revision.Jul 28 2016, 10:11 AM
clayborg edited edge metadata.

I am fine with the llvm::StringRef until it causes a crash. So really look at all the ways you construct a llvm::StringRef and make sure it won't assert ever if you leave it as a llvm::StringRef.

source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp
268 ↗(On Diff #65718)

Actually, feel free to switch over to using GetHexByteString, but maybe just put the reserve part into that function's implementation? Then this code would be just:

packet. GetHexByteString(inferior_stdout);
source/Plugins/Process/gdb-remote/GDBRemoteClientBase.h
31 ↗(On Diff #65895)

I just really don't like that a llvm::StringRef constructed with a null "const char *" will crash LLDB due to an assertion. It has cause crashes in the past when you have the result of a function being fed into a llvm::StringRef. I really want to make a lldb::StringRef that subclasses llvm::StringRef and fixes this as I really don't think it is useful. Why can't an llvm::StringRef be constructed with NULL????

This revision is now accepted and ready to land.Jul 28 2016, 10:11 AM
source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp
269 ↗(On Diff #65895)

Will do. At that point this whole function becomes pretty irrelevant and i will inline it back into the caller.

source/Plugins/Process/gdb-remote/GDBRemoteClientBase.h
31 ↗(On Diff #65895)

I understand your concerns, but i also see the other side's point, who consider a null pointer an invalid input and the responsibility of the user to deal with that.

That said, we as a user have a lot of places which treat null as an empty string, so we need to do something about that. Instead of a new (sub-)class, I'd go for making a factory function instead (lldb::makeStringRef()?) and say we use that whereever we are not sure about the nullness of the pointer. The advantage of that being you don't have to worry about type conversions when dealing with different StringRefs (and once our StringRef is constructed, it works just like the vanilla object). In time, the need for that function should even diminish, as once we are already using a StringRef, we don't have to worry about the whole null pointer business.

BTW, constructing a std::string from a null pointer is also undefined behaviour, and implementations I've seen either assert or crash when trying do to a strlen().

This revision was automatically updated to reflect the committed changes.