This is an archive of the discontinued LLVM Phabricator instance.

Differential D96637

Make sure the interpreter module was loaded before making checks against it
ClosedPublic

Authored by aadsm on Feb 12 2021, 2:37 PM.


Details

Reviewers

mgorny
labath
emaste
clayborg

Commits

rGa83a825e9902: Make sure the interpreter module was loaded before making checks against it

Summary

This issue was introduced in https://reviews.llvm.org/D92187.
The guard I'm changing is supposed to act when Linux loads the linker for a second time (due to path differences such as symlinks).
It does so by checking module_sp != m_interpreter_module.lock(). However, that check is also true when m_interpreter_module was never initialized, so the linker module gets unloaded on Linux. The most visible result is that lldb stops getting notified about new modules loaded by the process, because it can't set the rendezvous breakpoint again after stepping over it once.
m_interpreter_module is not initialized when execution goes through this path: https://github.com/llvm/llvm-project/blob/dbfdb139f75470a9abc78e7c9faf743fdd963c2d/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp#L332, which happens when lldb was able to read the address from the dynamic section of the executable.

What I'm not sure about, though, is whether we still load the linker twice on Linux when we go through this path. If we do, then we need to somehow set m_interpreter_module instead of applying the fix I provide here. I've only tested this on Android.
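
The failure mode described in this summary can be modeled in a few lines. The sketch below is a Python analogy, not lldb code: `Module`, `Loader`, `is_duplicate_old`, and `is_duplicate_new` are hypothetical names, and a `weakref` stands in for the weak pointer behind m_interpreter_module. It only illustrates why comparing against an unset reference flags the real linker module as a duplicate, and how an extra "is it set?" check avoids that.

```python
import weakref


class Module:
    """Stand-in for a loaded module; only object identity matters here."""

    def __init__(self, name):
        self.name = name


class Loader:
    def __init__(self):
        # Models an uninitialized m_interpreter_module: lock() yields nothing.
        self._interpreter_ref = None

    def set_interpreter(self, module):
        self._interpreter_ref = weakref.ref(module)

    def lock(self):
        return self._interpreter_ref() if self._interpreter_ref else None

    def is_duplicate_old(self, module, at_interpreter_base=True):
        # Old guard: also true when the interpreter reference was never set.
        return at_interpreter_base and module is not self.lock()

    def is_duplicate_new(self, module, at_interpreter_base=True):
        # New guard: only compare once an interpreter module is known.
        interpreter = self.lock()
        return (at_interpreter_base and interpreter is not None
                and module is not interpreter)


ld_so = Module("ld.so")
loader = Loader()
old_when_unset = loader.is_duplicate_old(ld_so)   # linker wrongly flagged
new_when_unset = loader.is_duplicate_new(ld_so)   # linker left alone

loader.set_interpreter(ld_so)
via_symlink = Module("ld.so (loaded via symlink)")
new_for_duplicate = loader.is_duplicate_new(via_symlink)
new_for_original = loader.is_duplicate_new(ld_so)
```

With the reference unset, the old check reports the only copy of the linker as a duplicate, while the new check does not. Once the reference is set, the new check still catches a second instance loaded at the same base address.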

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aadsm requested review of this revision. Feb 12 2021, 2:37 PM

aadsm created this revision.

Herald added a project: Restricted Project. Feb 12 2021, 2:37 PM

Herald added a subscriber: lldb-commits.

aadsm added a reviewer: clayborg. Feb 12 2021, 3:30 PM

Herald added a subscriber: JDevlieghere. Feb 12 2021, 3:30 PM

Harbormaster completed remote builds in B89075: Diff 323485. Feb 12 2021, 6:26 PM

I'd like @mgorny to confirm this, but I think this should be fine -- the m_rendezvous.IsValid() branch does not do any funky module loading, so funky unloading should also be unnecessary.

From the description it seems like triggering this bug should be relatively simple. All it takes is to attach to a process (after it has finished setting up the dynamic section). And indeed, from my simple experiments, it seems that we are failing to load modules in that scenario.

=> We should have a test for this.

And thanks for catching this. :)

We should have a test for this.

How do you recommend doing this? I spent a couple of hours on this but got nowhere. From what I understood we should prefer lit tests, so I was thinking of creating a binary that dlopens a module. However, I wasn't able to create a binary that I can start and whose pid I can capture so that I can attach to it. Here's what I've tried so far:

// RUN: cp %s %s.cpp
// RUN: %clang -g -O0 --target=x86_64-linux-gnu %s.cpp -o %s.out
// RUN: PID=$(%s.out)
// RUN: %lldb -p $PID -b -o 'target list' | FileCheck %s
// RUN: kill -9 $PID
// CHECK: foo

#include <stdio.h>
#include <unistd.h>

int main() {
    pid_t pid = fork();
    if (pid > 0) {
        // parent process, print child pid
        printf("%d", pid);
        return 0;
    } else if (pid < 0) {
        printf("Unable to fork\n");
        return -1;
    }
    // child process
    pause();
}

The lit test gets stuck on // RUN: PID=$(%s.out). Not sure why; the parent process shouldn't wait on its children.


I would do an end-to-end test for this. We have many attach tests that should be easy to modify so that the inferior calls pause() and then dlopens a local dylib. Unless Pavel has a differing opinion?

Yeah, I'm with Greg. Although I would recommend using lit tests in general, I don't think they're a good fit for anything that involves attaching, or other kinds of inter-process synchronization. Once you start dealing with subprocesses you're entering very messy (and unportable) waters. Just make this a dotest test. You can base this off of one of the existing attach tests there...

The lit test gets stuck on // RUN: PID=$(%s.out). Not sure why; the parent process shouldn't wait on its children.

I don't think this does what you think it does. The $() doesn't give you the process id of anything -- it substitutes a string by the result of running that string as a shell command. So, the PID variable would get the (entire) stdout of %s.out. Obviously, the command has to terminate in order for it to be able to compute that...
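
One detail worth spelling out about the hang, offered as an observation rather than something stated in the thread: command substitution reads the command's stdout until end-of-file, and end-of-file only arrives once every process holding the write end of that pipe has closed it. A forked child that inherits stdout and then blocks in pause() keeps the pipe open, so PID=$(%s.out) would not return even though the parent exits right after printing. The Python sketch below reproduces the effect with stand-in processes; `PARENT`, `CHILD`, and `HOLD_SECONDS` are hypothetical names for illustration, not code from this review.

```python
import subprocess
import sys
import time

# The child stands in for the forked process blocked in pause(): it inherits
# the parent's stdout and simply stays alive for HOLD_SECONDS.
HOLD_SECONDS = 1.0
CHILD = f"import time; time.sleep({HOLD_SECONDS})"

# The parent prints the child's pid and exits immediately, like the C program.
PARENT = (
    "import subprocess, sys\n"
    f"child = subprocess.Popen([sys.executable, '-c', {CHILD!r}])\n"
    "print(child.pid, flush=True)\n"
)

start = time.monotonic()
# Like $(...): capturing stdout means reading the pipe until end-of-file.
result = subprocess.run([sys.executable, "-c", PARENT],
                        capture_output=True, text=True)
elapsed = time.monotonic() - start

pid_text = result.stdout.strip()
print(f"captured pid {pid_text} after {elapsed:.2f}s")
```

The pid is written almost immediately, but the capture only completes after the child exits and releases the inherited pipe. Redirecting the child's stdout and stderr away from the pipe (or closing them in the child) would let the capture return as soon as the parent exits.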

Add api test

I don't think this does what you think it does. The $() doesn't give you the process id of anything -- it substitutes a string by the result of running that string as a shell command. So, the PID variable would get the (entire) stdout of %s.out

I'm confused here, "the PID variable would get the (entire) stdout of %s.out" is exactly what I'm expecting to happen, the stdout of the program is its pid.

I was finally able to figure out what the issue was. I thought pause() would continue once the debugger attached because it sends a signal, but that doesn't seem to be the case?

Harbormaster completed remote builds in B89856: Diff 324866. Feb 18 2021, 10:36 PM


It does on Mac, but I don't think it does on Linux.

clayborg added inline comments. Feb 19 2021, 5:07 PM

lldb/test/API/functionalities/module_load_attach/TestModuleLoadAttach.py
29	Is this racy? What happens on a really slow system? Can we fail to attach? If we do attach, are we guaranteed to be at a place where we can set "flip_to_1_to_continue = 1"? The nice thing is it is a global variable that we should be able to set no matter where we stop.
33–34	Don't we need to break before the dlopen and make sure we don't have a libfeature.so in our module list, then run over the dlopen and verify we do see it afterwards? Wasn't the bug that we see shared libraries correctly once when we attach, but then get no updates after that?

aadsm added inline comments. Feb 20 2021, 10:36 AM

lldb/test/API/functionalities/module_load_attach/TestModuleLoadAttach.py
29	> Is this racy?
	I don't think so, because we already have a pid at that point in time, so we should always be able to attach.
	> If we do attach, are we guaranteed to be at a place where we can set "flip_to_1_to_continue = 1"?
	Yeah, that's exactly why I made it global. I could also wait until there's a `flip_to_1_to_continue` in the scope if you think it's worthwhile.
33–34	That was a completely different bug, and I have a different test for that situation as well. Something I could test, though, is that a breakpoint that was unresolved before the update did indeed transition from unresolved -> resolved. I'll add that.

aadsm added inline comments. Feb 20 2021, 10:44 AM

lldb/test/API/functionalities/module_load_attach/TestModuleLoadAttach.py
33–34	Forget this, I shouldn't be answering comments in the morning.
	> that was a completely different bug and I have a different test for that situation as well. Something that I could test though, is that before we got an update for an unresolved breakpoint to make sure we did indeed transitioned from unresolved -> resolved. I'll add that.

Checks the module is not loaded right after we attach

Harbormaster completed remote builds in B90065: Diff 325221. Feb 20 2021, 11:40 AM

This revision was not accepted when it landed; it landed in state Needs Review. Feb 21 2021, 9:28 AM

This revision was landed with ongoing or failed builds.

Closed by commit rGa83a825e9902: Make sure the interpreter module was loaded before making checks against it (authored by aadsm).

This revision was automatically updated to reflect the committed changes.

aadsm added a commit: rGa83a825e9902: Make sure the interpreter module was loaded before making checks against it.

oh no, I picked the wrong commit to land :(. I think this is fine because I already addressed the comments, but if there's still something I should work on here, I'll put another diff up.

aadsm added a reverting change: rGb19d3b092d4e: Revert "Make sure the interpreter module was loaded before making checks…. Feb 21 2021, 10:39 AM

I have some improvements to the test suite -- it would be great if you could incorporate them into the next version of the patch.

BTW, it would be nice if the revert commit message included a (brief) explanation of why the patch is being reverted.

lldb/test/API/functionalities/module_load_attach/Makefile
2	Delete, and use self.registerSharedLibrariesWithTarget in python code
5	`feature` sounds very specific and unusual. I guess that is inspired by whatever was the original use case that caused you to find this bug, but maybe you could pick a more generic name here: `liba`, or `libload_after_attach`, ...
lldb/test/API/functionalities/module_load_attach/TestModuleLoadAttach.py
29	> I don't think so because we already have a pid at that point in time, so we should always be able to attach.
	Attaching -- yes, but I think that if we attach _really_ early we may not be able to flip the variable, as the loader has not yet finished setting up the main module. It will also make the test nondeterministic, as (depending on how early we attach) we may or may not be getting notifications about the loading of dependent libraries (libc and stuff). Other attach tests use synchronization by having the inferior create a file when it's ready to be attached, and the test waits for this via `lldbutil.wait_for_file_on_target`. It would be good to use that here too.
32–33	With the other modifications, you should be able to drop these. The way this test is phrased, it should run everywhere, so it'd be a pity to not make use of that.
42	Use `self.platformContext.shlib_prefix` and `.shlib_extension` instead of `"lib"` and `".so"`.
lldb/test/API/functionalities/module_load_attach/main.c
1	replace with `"dylib.h"` and use `dylib_open` instead of `dlopen`
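
The file-based synchronization suggested for line 29 above can be sketched generically. This is a standalone illustration under stated assumptions, not the implementation of `lldbutil.wait_for_file_on_target`: the `wait_for_file` helper, the `ready` file name, and the Python one-liner standing in for the inferior are all hypothetical.

```python
import os
import subprocess
import sys
import tempfile
import time


def wait_for_file(path, timeout=10.0, poll_interval=0.05):
    """Poll until `path` exists; return True if it appeared before the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)
    return os.path.exists(path)


# Stand-in inferior: creates the marker file once it is "ready", then idles.
INFERIOR = "import sys, time; open(sys.argv[1], 'w').close(); time.sleep(30)"

with tempfile.TemporaryDirectory() as tmp:
    ready_path = os.path.join(tmp, "ready")
    inferior = subprocess.Popen([sys.executable, "-c", INFERIOR, ready_path])
    try:
        # Only attach after this returns True: the inferior has finished
        # starting up, so what the debugger sees at attach time is stable.
        appeared = wait_for_file(ready_path)
        missing = wait_for_file(os.path.join(tmp, "never-created"), timeout=0.2)
    finally:
        inferior.kill()
        inferior.wait()
```

The point of the marker file is that the test no longer depends on how quickly the inferior starts: attaching happens only after the inferior itself has signalled that it is past loader setup.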

jankratochvil mentioned this in D96680: [lldb-vscode] Emit the breakpoint changed event on location resolved. Feb 21 2021, 12:54 PM

Revision Contents

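Path	Size
lldb/packages/Python/lldbsuite/test/decorators.py	4 lines
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp	1 line
lldb/test/API/functionalities/module_load_attach/Makefile	10 lines
lldb/test/API/functionalities/module_load_attach/TestModuleLoadAttach.py	49 lines
lldb/test/API/functionalities/module_load_attach/feature.c	1 line
lldb/test/API/functionalities/module_load_attach/main.c	15 lines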

Diff 325313

lldb/packages/Python/lldbsuite/test/decorators.py

 def skipUnlessWindows(func):
     """Decorate the item to skip tests that should be skipped on any non-Windows platform."""
     return skipUnlessPlatform(["windows"])(func)


 def skipUnlessDarwin(func):
     """Decorate the item to skip tests that should be skipped on any non Darwin platform."""
     return skipUnlessPlatform(lldbplatformutil.getDarwinOSTriples())(func)

+def skipUnlessLinux(func):
+    """Decorate the item to skip tests that should be skipped on any non-Linux platform."""
+    return skipUnlessPlatform(["linux"])(func)

 def skipUnlessTargetAndroid(func):
     return unittest2.skipUnless(lldbplatformutil.target_is_android(),
                                 "requires target to be Android")(func)


 def skipIfHostIncompatibleWithRemote(func):
     """Decorate the item to skip tests if binaries built on this host are incompatible."""

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp

     if (m_initial_modules_added) {
       m_initial_modules_added = true;
     }
     for (; I != E; ++I) {
       ModuleSP module_sp =
           LoadModuleAtAddress(I->file_spec, I->link_addr, I->base_addr, true);
       if (module_sp.get()) {
         if (module_sp->GetObjectFile()->GetBaseAddress().GetLoadAddress(
                 &m_process->GetTarget()) == m_interpreter_base &&
+            m_interpreter_module.lock() &&
             module_sp != m_interpreter_module.lock()) {
           // If this is a duplicate instance of ld.so, unload it. We may end up
           // with it if we load it via a different path than before (symlink
           // vs real path).
           // TODO: remove this once we either fix library matching or avoid
           // loading the interpreter when setting the rendezvous breakpoint.
           UnloadSections(module_sp);
           loaded_modules.Remove(module_sp);

lldb/test/API/functionalities/module_load_attach/Makefile

This file was added.

C_SOURCES := main.c
LD_EXTRAS := -Wl,-rpath "-Wl,$(shell pwd)"
USE_LIBDL := 1

feature:
	$(MAKE) -f $(MAKEFILE_RULES) \
		DYLIB_ONLY=YES DYLIB_NAME=feature DYLIB_C_SOURCES=feature.c
all: feature

include Makefile.rules

lldb/test/API/functionalities/module_load_attach/TestModuleLoadAttach.py

This file was added.

import lldb
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil

class TestCase(TestBase):

    mydir = TestBase.compute_mydir(__file__)

    def build_launch_and_attach(self):
        self.build()
        # launch
        exe = self.getBuildArtifact("a.out")
        popen = self.spawnSubprocess(exe)
        # attach
        target = self.dbg.CreateTarget(exe)
        self.assertTrue(target, VALID_TARGET)
        listener = lldb.SBListener("my.attach.listener")
        error = lldb.SBError()
        process = target.AttachToProcessWithID(listener, popen.pid, error)
        self.assertTrue(error.Success() and process, PROCESS_IS_VALID)
        return process

    def assertModuleIsLoaded(self, module_name):
        feature_module = self.dbg.GetSelectedTarget().FindModule(lldb.SBFileSpec(module_name))
        self.assertTrue(feature_module.IsValid(), f"Module {module_name} should be loaded")

    def assertModuleIsNotLoaded(self, module_name):
        feature_module = self.dbg.GetSelectedTarget().FindModule(lldb.SBFileSpec(module_name))
        self.assertFalse(feature_module.IsValid(), f"Module {module_name} should not be loaded")

    @skipIfRemote
    @skipUnlessLinux
    @no_debug_info_test
    def test(self):
        '''
        This test makes sure that after attach lldb still gets notifications
        about new modules being loaded by the process
        '''
        process = self.build_launch_and_attach()
        thread = process.GetSelectedThread()
        self.assertModuleIsNotLoaded("libfeature.so")
        thread.GetSelectedFrame().EvaluateExpression("flip_to_1_to_continue = 1")
        # Continue so that dlopen is called.
        breakpoint = self.target().BreakpointCreateBySourceRegex(
            "// break after dlopen", lldb.SBFileSpec("main.c"))
        self.assertNotEqual(breakpoint.GetNumResolvedLocations(), 0)
        stopped_threads = lldbutil.continue_to_breakpoint(self.process(), breakpoint)
        self.assertModuleIsLoaded("libfeature.so")
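
The review asks for the hard-coded "libfeature.so" to be built from the platform's shared-library prefix and extension instead. The sketch below shows the idea in isolation. `PlatformContext` here is a hypothetical stand-in for the test suite's platform context object, and the prefix and extension values (including the assumption that the extension carries no leading dot) are illustrative rather than read from lldbsuite.

```python
from collections import namedtuple

# Hypothetical stand-in for the test suite's platform context; the values
# used below are assumptions chosen for illustration.
PlatformContext = namedtuple("PlatformContext",
                             ["shlib_prefix", "shlib_extension"])


def shared_library_name(ctx, base_name):
    """Compose a platform-specific library file name from its base name."""
    return f"{ctx.shlib_prefix}{base_name}.{ctx.shlib_extension}"


contexts = [
    PlatformContext("lib", "so"),      # Linux-like
    PlatformContext("lib", "dylib"),   # Darwin-like
    PlatformContext("", "dll"),        # Windows-like
]
names = [shared_library_name(ctx, "feature") for ctx in contexts]
```

Deriving the name this way is what would let the same assertion run on every platform instead of only on Linux.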

lldb/test/API/functionalities/module_load_attach/feature.c

This file was added.

extern void feature() {}

lldb/test/API/functionalities/module_load_attach/main.c

This file was added.

#include <dlfcn.h>
#include <assert.h>
#include <unistd.h>

volatile int flip_to_1_to_continue = 0;

int main() {
  lldb_enable_attach();
  while (! flip_to_1_to_continue) // Wait for debugger to attach
    sleep(1);
  // dlopen the feature
  void *feature = dlopen("libfeature.so", RTLD_NOW);
  assert(feature && "dlopen failed?");
  return 0; // break after dlopen
}