Page MenuHomePhabricator

[LLDB] In ProcessGDBRemote::UpdateThreadIDList(), the thread PCs should not be cleared after they are updated from the stop reply packet
ClosedPublic

Authored by ramana-nvr on Jul 3 2018, 2:07 AM.

Details

Summary

The function ProcessGDBRemote::UpdateThreadIDsFromStopReplyThreadsValue(), which is being called after the thread PCs are updated, is clearing the thread PC list and that is wrong.

Diff Detail

Repository
rL LLVM

Event Timeline

ramana-nvr created this revision.Jul 3 2018, 2:07 AM
labath added a comment.Jul 9 2018, 6:20 AM

Could you elaborate on how/why is this wrong? E.g. in which situations does it matter whether we clear the PC list?

I could imagine there being a bug in this ProcessGDBRemote::UpdateThreadIDList -> ProcessGDBRemote::UpdateThreadIDsFromStopReplyThreadsValue sequence when talking to a gdb-remote stub that does not implement the lldb-enhancement jThreadsInfo. If jThreadsInfo isn't implemented, we fall back to looking for the thread-pcs: and threads: key-value pairs in the stop packet that we received. We clear m_thread_pcs, then look if thread-pcs: is present (that's one bug) to add all the thread pc values. After that, we look for the threads: list and if it is present, we call UpdateThreadIDsFromStopReplyThreadsValue which clears m_thread_pcs again (bug #2) and then adds any threads listed to the m_thread_ids.

I could believe there are problems here if we're talking to a gdb-remote that doesn't support jThreadsInfo but does provide threads: or thread-pcs: maybe? It'd be interesting to see what the packet log for a stop packet is showing, and what exactly the gdb-remote stub is.

(just to be clear, the m_thread_pcs.clear() in ProcessGDBRemote::UpdateThreadIDList that I called a bug -- today we only have two ways of populating that, jThreadsInfo or the thread-pcs: value in the stop packet. So clearing it unconditionally here, and then populating it from thread_pcs: seems reasonable.)

If jThreadsInfo isn't implemented, we fall back to looking for the thread-pcs: and threads: key-value pairs in the stop packet that we received.
We look for the threads: list and if it is present, we call UpdateThreadIDsFromStopReplyThreadsValue which clears m_thread_pcs again (bug #2) and then adds any threads listed to the m_thread_ids.

Yes, we shouldn't be clearing the m_thread_pcs in UpdateThreadIDsFromStopReplyThreadsValue() otherwise we would loose the thread PCs populated from stop reply thread-pcs: list just before the UpdateThreadIDsFromStopReplyThreadsValue() call.

The other m_thread_pcs.clear() in UpdateThreadIDList() is redundant as we are anyway clearing m_thread_pcs in UpdateThreadPCsFromStopReplyThreadsValue().

I think the m_thread_pcs.clear() in UpdateThreadIDList() is to clear out any old thread pc values that we might have populated in the past. The only way I can see this happening is if the remote stub SOMETIMES returned the thread-pcs: list in the stop packet. UpdateThreadPCsFromStopReplyThreadsValue() which we call if we did get a thread-pcs: in this stop reply packet, clears m_thread_pcs so the only reason to have this also in UpdateThreadIDList() is if we had a stub that was inconsistent.

I agree that the m_thread_pcs.clear() in UpdateThreadIDsFromStopReplyThreadsValue() is wrong, and your patch here is good.

I don't have strong feelings about removing the m_thread_pcs() in UpdateThreadIDList(), but I think I would prefer leaving it in to changing it unless we know it's causing problems. Does this cause problems in your environment?

Sorry, I wasn't clear in my previous comment.

I don't have strong feelings about removing the m_thread_pcs() in UpdateThreadIDList(), but I think I would prefer leaving it in to changing it unless we know it's causing problems. Does this cause problems in your environment?

No, it did not cause any problems in our environment.

But it is redundant as UpdateThreadPCsFromStopReplyThreadsValue(), which gets called immediately after clearing m_thread_pcs in UpdateThreadIDList(), will anyway clear the old m_thread_pcs (line 1545 without my changes) before populating them from the thread-pcs: received in the stop reply packet.

For now, leaving the m_thread_pcs.clear() in UpdateThreadIDList() unchanged as it is not causing any problems.

jasonmolenda accepted this revision.Jul 11 2018, 2:02 PM

looks good.

This revision is now accepted and ready to land.Jul 11 2018, 2:02 PM
This revision was automatically updated to reflect the committed changes.

Could you also add a test case for this?
I think it should be possible to test this via the gdb-client (test/testcases/functionalities/gdb_remote_client/) test suite. If I understood the previous comments correctly, you'll need to mock a server that sends a thread-pcs field, but does not implement a jThreadsInfo packet.

That's a good point Pavel. I tried to write one (below) but I never saw what the original failure mode was. Venkata, can you help to make a test case that fails before the patch and works after? Or explain what bug was being fixed exactly? I could see that the code was wrong from reading it, but I never understood how you got to this.

Index: packages/Python/lldbsuite/test/functionalities/gdb_remote_client/TestStopPCs.py

  • packages/Python/lldbsuite/test/functionalities/gdb_remote_client/TestStopPCs.py (nonexistent)

+++ packages/Python/lldbsuite/test/functionalities/gdb_remote_client/TestStopPCs.py (working copy)
@@ -0,0 +1,45 @@
+from future import print_function
+import lldb
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.decorators import *
+from gdbclientutils import *
+
+
+class TestThreadSelectionBug(GDBRemoteTestBase):
+ def test(self):
+ class MyResponder(MockGDBServerResponder):
+ def haltReason(self):
+ return "T02thread:1ff0d;threads:1ff0d,2ff0d;thread-pcs:10001bc00,10002bc00;"
+
+ def threadStopInfo(self, threadnum):
+ if threadnum == 0x1ff0d:
+ return "T02thread:1ff0d;threads:1ff0d,2ff0d;thread-pcs:10001bc00,10002bc00;0:0,1:00bc010001;"
+ if threadnum == 0x2ff0d:
+ return "T00thread:2ff0d;threads:1ff0d,2ff0d;thread-pcs:10001bc00,10002bc00;0:0,1:00bc020001;"
+
+ def qXferRead(self, obj, annex, offset, length):
+ if annex == "target.xml":
+ return """<?xml version="1.0"?>
+ <target version="1.0">
+ <architecture>i386:x86-64</architecture>
+ <feature name="org.gnu.gdb.i386.core">
+ <reg name="rax" bitsize="64" regnum="0" type="int" group="general"/>
+ <reg name="rip" bitsize="64" regnum="1" type="code_ptr" group="general"/>
+ </feature>
+ </target>""", False
+ else:
+ return None, False
+
+ self.server.responder = MyResponder()
+ target = self.dbg.CreateTarget('')
+ if self.TraceOn():
+ self.runCmd("log enable gdb-remote packets")
+ process = self.connect(target)
+
+ self.assertEqual(process.GetNumThreads(), 2)
+ th0 = process.GetThreadAtIndex(0)
+ th1 = process.GetThreadAtIndex(1)
+ self.assertEqual(th0.GetThreadID(), 0x1ff0d)
+ self.assertEqual(th1.GetThreadID(), 0x2ff0d)
+ self.assertEqual(th0.GetFrameAtIndex(0).GetPC(), 0x10001bc00)
+ self.assertEqual(th1.GetFrameAtIndex(0).GetPC(), 0x10002bc00)

Index: packages/Python/lldbsuite/test/functionalities/gdb_remote_client/gdbclientutils.py

  • packages/Python/lldbsuite/test/functionalities/gdb_remote_client/gdbclientutils.py (revision 337215)

+++ packages/Python/lldbsuite/test/functionalities/gdb_remote_client/gdbclientutils.py (working copy)
@@ -130,6 +130,8 @@

    return self.QEnableErrorStrings()
if packet == "?":
    return self.haltReason()

+ if packet == "s":
+ return self.haltReason()

if packet[0] == "H":
    return self.selectThread(packet[1], int(packet[2:], 16))
if packet[0:6] == "qXfer:":

@@ -144,6 +146,9 @@

    return self.vAttach(int(pid, 16))
if packet[0] == "Z":
    return self.setBreakpoint(packet)

+ if packet.startswith("qThreadStopInfo"):
+ threadnum = int (packet[15:], 16)
+ return self.threadStopInfo(threadnum)

    return self.other(packet)
 
def interrupt(self):

@@ -204,6 +209,9 @@

def setBreakpoint(self, packet):
    raise self.UnexpectedPacketException()

+ def threadStopInfo(self, threadnum):
+ return ""
+

def other(self, packet):
    # empty string means unsupported
    return ""

Sorry, I am not helpful to you in providing a unit test case for this patch. I am still learning about test suite framework and/or trying to get the lldb test suite up and running.

Regarding how I got to this, it is not that I have run into some issue due to the original code. For my private port, I had to create a thread ID list similar to m_thread_ids and during code browsing for that I happened to see the code in concern and observed that clearing m_thread_pcs is wrong there.

After the process gets stopped, while trying to set the thread stop info, we also update the thread list (ProcessGDBRemote::SetThreadStopInfo() ---> Process::UpdateThreadListIfNeeded() ---> ProcessGDBRemote::UpdateThreadList()) and here we can reach UpdateThreadIDList() if m_thread_ids.empty() is true i.e. if we somehow failed to populate m_thread_ids while parsing the stop info packet and need to populate the m_thread_ids now.