Page MenuHomePhabricator

Local Linux debugging via LLGS passing all Linux local debugging tests.
ClosedPublic

Authored by chaoren on Jan 28 2015, 4:19 PM.

Details

Summary

Added remote-build.py and wired it into Xcode for Linux and MacOSX+Linux builds.

Get initial thread state coordinator integration working.

  • Fixed bug in run loop where run loop return enum was being treated erroneously like an int, causing the TSC event loop to terminate prematurely.
  • Added an explicit scope in NativeProcessLinux::Resume() for the threads lock lifetime. (This was likely unnecessary but is more explicit.)
  • Fixed a bug in ThreadStateCoordinator where resume execution was not updating the internal state about the thread assumed to be running now. I'll add a test and upstream this in a moment.
  • Added a verbose logging mechanism to event processing within ThreadStateCoordinator. It is currently enabled when the 'log enable lldb thread' is true upon inferior launch/attach.

llgs: fixed a bug in deferred signal thread id; added coordinator enqueue logging.

The deferred signal thread id was being set with the process id
unintentionally in NativeProcessLinux::CallAfterRunningThreadsStop().

llgs: fix up some handling of stepping.

Tracked down while working on https://github.com/tfiala/lldb/issues/75.
This is not a complete fix for that issue, but moves us farther along.

Fixes:

  • When a thread step is requested via vCont:{s,S}, Resume() now marks the stepping thread as (1) currently stepping and (2) does trigger the deferred signal for the stepped thread. This fixes a bug where we were actually triggering a deferred stop cycle here for the non-stepping thread since the single step thread was not part of the Resume() deferred signal mechanism. The stepping thread is also marked in the thread state coordinator as running (via a resume callback).
  • When we get the SIGTRAP signal for the step completion, we don't do a deferred signal call - that happened during the vCont:{s,S} processing in Resume() already. Now we just need to mark that the stepping thread is now stopped. If this is the last thread in the set that needs to stop, it will trigger the process/delegate stop call that will notify lldb. Otherwise, that'll happen when the final thead we're waiting for stops.

Misc:

  • Fixed up thread stop logging to use a leading 0 (0x%PRIx32) so we don't get log lines like 0x5 for 0x05 SIGTRAP.

llgs: more work on thread stepping.

See https://github.com/tfiala/lldb/issues/75. Not fixed yet but
continuing to push this further.

Fixes:

  • Resume() now skips doing deferred notifications if we're doing a vCont;{c,C}. In this case, we're trying to start something up, not defer a stop notification. The default thread action stop mode pickup was triggering a stop because it had at least one stop, which was wrong in the case of a continue. (Bug introduced by previous change.)
  • Added a variant to ThreadStateCoordinator to specify a set of thread ids to be skipped when triggering stop notifications to non-stopped threads on a deferred signal call. For the case of a stepping thread, it is actually told to step (and is running) for a brief moment, but the thread state coordinator would think it needed to send the stepping thread a stop, which id doesn't need to do. This facility allows me to get around that cleanly.

With this change, behavior is now reduced to something I think is
essentially a different bug:

  • Doing a step into libc code from my code crashes llgs.
  • Doing a next out of a function in my own code crashes llgs.

Fix some bugs in llgs thread state handling.

  • When the thread state coordinator is told to skip sending a stop request for a running thread that is ignored (e.g. the thread that steps in a step operation is technically running and should not have a stop sent to it, since it will stop of its own accord per the kernel step operation), ensure the deferred signal notification logic still waits for the skipped thread. (i.e. we want to defer the notification until the stepping thread is indeed stopped, we just don't want to send it a tgkill).
  • Add ThreadStateCoordinator::RequestResumeAsNeeded(). This variant of the RequestResume() method does not call the error function when the thread is already running. Instead, it just logs that the thread is already running and skips the resume operation. This is useful for the case of vCont;c handling, where we tell all threads that they should be running. At the place we're calling, all we know is "we want this thread running if it isn't already," and that's exactly what this command does.
  • Formatting change (minor) in NativeThreadLinux logging.

Disabled local-llgs hijacking of state on startup; passed along signo to killed process.

It looks like Shawn's fix addresses what the initial hijacking was trying
to accomplish per conversations with Greg and Jim. The hijacking was
causing several tests to hang (#61, #62, #63, #64, #67, possibly more).
These tests now just fail rather than hang with this modification.

Fix step commands that mix running threads and stepping threads.

This fixes https://github.com/tfiala/lldb/issues/62.

Added code to prevent "administrative stop" from overwriting a real stop reason.

Note this code path should not happen - it implies a bug in another part of
the code. For the thread to receive the stop signal as it is handled, the
and for it to already have a stop reason, it implies the kernel was able to
tell the thread that it stopped while it was stopped. More likely this
seems to indicate a bug where an actual thread start was not getting correctly
logged. If it does get hit, we'll want to understand the sequence to figure
out if it is truly legitimate or if it implies another bug.

Fix up NativeProcessLinux::Interrupt() to use thread state coordinator mechanism.

LLGS local - signal stops inferior in debugger

NativeProcessLinux::MonitorSignal was automatically resuming threads
that stopped due to a signal.  This is inconsistent with the
behavior of lldb and gdb.  This change removes the automatic resume.

Fixes
TestSendSignal.py
TestSignalsAPI.py
if PLATFORM_LINUX_FORCE_LLGS_LOCAL is in the environment vars.

Added support for writing registers larger than 64 bits

Fixed TestInferiorCrashing failures

LLGS debugging is outputting different thread stop reasons than
local linux debugging. The stop reasons are reasonable so I've left
left them alone. Might update them to match darwin in the future.

Fixed TestInferiorChanged

replaced expected stop reason 'address invalid' with 'signal SIGSEGV'

added some missing ABIs

Refactor ptrace commands in NativeProcessLinux to use Error as result return type.

Modify ThreadStateCoodrinator in order to resume threads if stop wasn't requested.

Mark several tests as XFAIL with new expectedFailureLLGS decorator since they are failing in Darwin for the same reason.

Share crash information between LLGS and local POSIX debugging with CrashReason class. Deliver crash information from LLGS to lldb via description field of thread stop packet.

Mark TestProcessLaunch.test_set_working_dir_with_dwarf as expected to fail in LLGS mode due llvm.org/pr20265

Make ThreadStateCoordinator to handle properly failed stop/resume operations.

Moving header files from source/Host/common to proper location.

Implement setting and clearing watchpoints.

Add missing switch cases to silence warnings.

Fix compilation error and cleanup in ThreadStateCoordinatorTest

Fix TestThreadStepOut on Linux with LLGS

Remove implicit stop action on $vCont package for threads where no
explicit action or default action specified based on the specification
(they have to stay in there original state).

Diff Detail

Event Timeline

chaoren retitled this revision from to Added remote-build.py and wired it into Xcode for Linux and MacOSX+Linux builds..
chaoren updated this object.
chaoren edited the test plan for this revision. (Show Details)
chaoren added a subscriber: Unknown Object (MLST).
vharron retitled this revision from Added remote-build.py and wired it into Xcode for Linux and MacOSX+Linux builds. to Multiple bug fixes for LLGS.Jan 28 2015, 4:24 PM
vharron updated this object.
vharron retitled this revision from Multiple bug fixes for LLGS to Local Linux debugging via LLGS passing all Linux local debugging tests..Jan 28 2015, 4:27 PM
vharron added reviewers: tberghammer, flackr.

Preliminary tests don't show any regressions on OS X or Linux. If no one has any problems, I'll commit all of this in a day or two.

clayborg accepted this revision.Feb 2 2015, 12:59 PM
clayborg edited edge metadata.

This is the big llgs patch we have been eagerly waiting to see. If the MacOSX test suite is good, then I am ok with this.

This revision is now accepted and ready to land.Feb 2 2015, 12:59 PM
zturner added a subscriber: zturner.Feb 2 2015, 1:45 PM

Give me a chance to compile this on Windows before comitting, please.

zturner requested changes to this revision.Feb 2 2015, 1:51 PM
zturner added a reviewer: zturner.

I'm unable to apply this patch:

d:\src\llvm\tools\lldb>git apply -p 0 D7238.diff.txt
fatal: new file include/lldb/Target/NativeRegisterContextRegisterInfo.h depends on old contents

Can you rebase against r227863 and upload a new patch? Either here or to my direct email is fine.

This revision now requires changes to proceed.Feb 2 2015, 1:51 PM

Here you go. FWIW I also compiled this on Windows and FreeBSD, but with
hundreds of test failures (about the same as top of tree).

emaste added a comment.Feb 2 2015, 2:21 PM

Here you go. FWIW I also compiled this on Windows and FreeBSD, but with
hundreds of test failures (about the same as top of tree).

I see no non-XFAIL failures on FreeBSD other than 5 lldb-mi tests (for which I have added a decorator locally):

OK
Ran 346 tests.

What does your failure look like, and which FreeBSD version are you using?

Don't see any additional test failures or compilation failures on Windows.
So LGTM also.

ki.stfu added a subscriber: ki.stfu.Feb 2 2015, 3:06 PM

Ed,

Execuse me, I'm working on lldb-mi. Can you provide information about
broken lldm-mi tests?

Thanks,
Ilia

tfiala edited edge metadata.Feb 2 2015, 3:34 PM

(Ok - flipped to my google account)

Yay! Glad to see this approaching submission :-)

Hi Ed,

FreeBSD 10.1-RELEASE #0 r274401
Python 2.7.9

Ran 346 tests.
Failing Tests (346)

All of them for the same reason:

Traceback (most recent call last):

File "/usr/home/chaorenl/llvm/lldb/test/dotest.py", line 1307, in <module>
  os.path.walk(testdir, visit, 'Test')
File "/usr/local/lib/python2.7/posixpath.py", line 238, in walk
  func(arg, top, names)
File "/usr/home/chaorenl/llvm/lldb/test/dotest.py", line 1204, in visit
  suite.addTests(unittest2.defaultTestLoader.loadTestsFromName(base))
File "/usr/home/chaorenl/llvm/lldb/test/unittest2/loader.py", line 111,

in loa

  module = __import__('.'.join(parts_copy))
File "/usr/home/chaorenl/llvm/lldb/test/types/TestIntegerTypesExpr.py",

line 5

  import AbstractBase
File "/usr/home/chaorenl/llvm/lldb/test/types/AbstractBase.py", line 7,

in <mo

  import lldb
File

"/home/chaorenl/llvm/build/lib/python2.7/site-packages/lldb/__init__.py",

  _lldb = swig_import_helper()
File

"/home/chaorenl/llvm/build/lib/python2.7/site-packages/lldb/__init__.py",

import _lldb

ImportError: No module named _lldb

For some reason lib/python2.7/site-packages/lldb/_lldb.so links to liblldb
instead of liblldb.so on FreeBSD build

_lldb.so -> ../../../liblldb

Since this doesn't seem related to the CL, I'll go ahead and commit all the
changes.

emaste added a comment.Feb 2 2015, 5:56 PM

Hi Ed,

FreeBSD 10.1-RELEASE #0 r274401
Python 2.7.9

Ok, I suspect this is an unfortunate bug in libedit which I fixed in FreeBSD stable/10 here:
https://svnweb.freebsd.org/base?view=revision&revision=277931

It started showing up after a python update when this change made it in:
https://hg.python.org/cpython/rev/6303266beb80
which caused OS X crashes, fixed in:
https://hg.python.org/cpython/rev/3f08c1156050
but not fixed for FreeBSD.

The most straightforward way to fix it is to checkout the FreeBSD source tree, then run make in src/lib/libedit followed by sudo make install.

emaste added a comment.Feb 2 2015, 6:00 PM

For some reason lib/python2.7/site-packages/lldb/_lldb.so links to liblldb
instead of liblldb.so on FreeBSD build

_lldb.so -> ../../../liblldb

Since this doesn't seem related to the CL, I'll go ahead and commit all the
changes.

Strange - this doesn't happen for me. I use cmake+ninja for builds. Was yours the same, or configure+make?

joule% ls -l lib/python2.7/site-packages/lldb/_lldb.so
lrwxr-xr-x 1 emaste emaste 19 Aug 28 2013 lib/python2.7/site-packages/lldb/_lldb.so -> ../../../liblldb.so

cmake + ninja

Ugh, sorry guys. I didn't know about the --use-log-author flag for git-svn,
so all the commits ended up with my name on them.

tberghammer requested changes to this revision.Feb 3 2015, 3:46 AM
tberghammer edited edge metadata.

Most of my comments can be addressed later (mostly clarifications) if we want this patch upstream as soon as possible but fixing the two type (chcar -> char) and the two wrong return value (return pointer to local object) should be addressed now as they can cause some serious issues.

scripts/Python/remote-build.py
28–29

Can we replace it with some more generic defaults or make them mandatory?

source/Host/common/NativeWatchpointList.cpp
20

What is the expected behavior if there is already a watchpoint on the given address? Is it good to simply overwrite it?

source/Plugins/Platform/Linux/PlatformLinux.cpp
764–774

Can we remove it completely or add a comment why is it skipped for now?

source/Plugins/Process/Linux/NativeProcessLinux.cpp
2429–2433

Can we factor out this lambda as it is used several times? (Also do it for the other lambdas in this file)

2577

Please remove as not used anymore

2739–2741

Can we add an assert instead of a fall through for nullptr-s?

3488

Please include the error message in the log if possible (instead of just true/false)

3600–3623

What is the return value of this function? Can we remove it as it is always nullptr and can't be overridden as it is a static method?

source/Plugins/Process/POSIX/CrashReason.cpp
211

Typo: chcar -> char

211

Returning a pointer to a local, stack based object!

source/Plugins/Process/POSIX/ProcessMessage.cpp
25–26

Typo: chcar -> char
Returning pointer to stack based object!

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServer.cpp
3770

Please add handle for default case

labath added a subscriber: labath.Feb 3 2015, 5:08 AM
labath added inline comments.
source/Plugins/Process/Linux/NativeProcessLinux.cpp
3600–3623

I think this function signature is required by ThreadLauncher::LaunchThread().

vharron edited edge metadata.Feb 3 2015, 7:24 AM

Should we revert this with one mega patch and reapply with correct blame
info?

chaoren accepted this revision.Feb 9 2015, 2:03 PM
chaoren added a reviewer: chaoren.
chaoren removed reviewers: chaoren, tberghammer.
chaoren removed a reviewer: zturner.
This revision is now accepted and ready to land.Feb 9 2015, 2:05 PM
chaoren closed this revision.Feb 9 2015, 2:05 PM