This is an archive of the discontinued LLVM Phabricator instance.

Differential D158583

Fix shared library loading when users define duplicate _r_debug structure.
ClosedPublic

Authored by clayborg on Aug 22 2023, 11:43 PM.

Details

Reviewers

labath
JDevlieghere
GeorgeHuyubo
yinghuitan
kusmour
rdhindsa
aprantl
DavidSpickett

Commits

rG07c215e8a8af: Fix shared library loading when users define duplicate _r_debug structure.

Summary

We ran into a case where shared libraries would fail to load in some processes on Linux. The issue turned out to be that if the main executable or a shared library defined a symbol named "_r_debug", it would cause problems once the executable that contained it was loaded into the process. The "_r_debug" structure is currently found by looking through the .dynamic section in the main executable and finding the DT_DEBUG entry, which points to this structure. The dynamic loader updates this structure as shared libraries are loaded, and LLDB checks the contents of this structure each time the dyld breakpoint is hit. Currently we expect the "state" in this structure to change as things happen. An issue comes up if someone defines another "_r_debug" struct in their program:

r_debug _r_debug;

If this code is included, a new "_r_debug" structure is created and it causes problems once the executable is loaded. This is because of the way symbol lookups happen on Linux: they use the shared library list in the order it was created, and the dynamic loader is always last. So at some point the dynamic loader will start updating this other copy of "_r_debug", yet LLDB is only watching the copy inside of the dynamic loader.

Steps that show the problem are:

- LLDB finds the "_r_debug" structure via the DT_DEBUG entry in the .dynamic section, which points to the "_r_debug" in ld.so.
- ld.so modifies its copy of "_r_debug" with "state = eAdd" before it loads the shared libraries, then calls the dyld function that LLDB has set a breakpoint on. LLDB sees this state and does nothing (we are waiting for a state of eConsistent to tell us the shared libraries have been fully loaded).
- ld.so loads the main executable and any dependent shared libraries and wants to update the "_r_debug" structure, but it now finds "_r_debug" in the a.out program and updates the state in this other copy.
- LLDB hits the notification breakpoint and checks the ld.so copy of "_r_debug", which still has a state of "eAdd". LLDB wants the new "eConsistent" state, which would trigger the shared libraries to load, but it gets stale data, doesn't do anything, and the library load is missed. The "_r_debug" in a.out has the state set correctly, but we don't know which "_r_debug" is the right one.

The new fix detects two consecutive "eAdd" states, loads the shared libraries anyway, and emits a log message in the "log enable lldb dyld" log channel stating that there might be multiple "_r_debug" structs.

The correct solution is that no one should be adding a duplicate "_r_debug" symbol to their binaries, but we have programs that are doing this already and since it can be done, we should be able to work with this and keep debug sessions working as expected. If a user #includes the <link.h> file, they can just use the existing "_r_debug" structure as it is defined in this header file as "extern struct r_debug _r_debug;" and no local copies need to be made.

If your ld.so has debug info, you can easily see the duplicate "_r_debug" structs by doing:

(lldb) target variable _r_debug --raw
(r_debug) _r_debug = {
  r_version = 1
  r_map = 0x00007ffff7e30210
  r_brk = 140737349972416
  r_state = RT_CONSISTENT
  r_ldbase = 0
}
(r_debug) _r_debug = {
  r_version = 1
  r_map = 0x00007ffff7e30210
  r_brk = 140737349972416
  r_state = RT_ADD
  r_ldbase = 140737349943296
}
(lldb) target variable &_r_debug
(r_debug *) &_r_debug = 0x0000555555601040
(r_debug *) &_r_debug = 0x00007ffff7e301e0

And if you do an "image lookup --address <addr>" on each of the addresses, you can see one is in the a.out and one is in the ld.so.

This patch also adds more logging that prints the m_previous and m_current Rendezvous structures to make things clearer, and adds a log message when we detect multiple eAdd states in a row so this problem can be spotted in logs.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

clayborg created this revision. Aug 22 2023, 11:43 PM

Herald added a project: Restricted Project. Aug 22 2023, 11:43 PM

clayborg requested review of this revision. Aug 22 2023, 11:43 PM

Herald added a subscriber: lldb-commits.

clayborg added reviewers: rdhindsa, aprantl. Aug 22 2023, 11:46 PM

Harbormaster completed remote builds in B254262: Diff 552606. Aug 22 2023, 11:47 PM

I'm not familiar with this mechanism but just out of curiosity: ld.so loads the main executable and any dependent shared libraries and wants to update the "_r_debug" structure, but it now finds "_r_debug" in the a.out program and updates the state in this other copy

Is this some undefined behaviour or is there a good use case for this? From the program's point of view.

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.cpp
260–266	Can we split this sentence just for readability? Bullet points of each step might be clearer.
260–266	What you have in the commit message basically.
lldb/test/API/functionalities/dyld-multiple-rdebug/TestDyldWithMultupleRDebug.py
23	Any specific reason for this? Not that it really matters, it'll get plenty of testing elsewhere.
29	Leftover debug print.

aprantl added inline comments.Aug 23 2023, 9:06 AM

lldb/test/API/functionalities/dyld-multiple-rdebug/TestDyldWithMultupleRDebug.py
55	You should be able to shorten the test setup considerably by using lldbutil.run_to_name_breakpoint() and only manually setting the second breakpoint afterwards.

In D158583#4609320, @DavidSpickett wrote:

I'm not familiar with this mechanism but just out of curiosity: ld.so loads the main executable and any dependent shared libraries and wants to update the "_r_debug" structure, but it now finds "_r_debug" in the a.out program and updates the state in this other copy

Is this some undefined behaviour or is there a good use case for this? From the program's point of view.

This is just how name lookups happen across shared library boundaries as far as I know. External symbols get resolved by searching the shared libraries including the main executable.

I have verified this is indeed what is happening by debugging the ld.so binary and stepping through it at this code in rtld.c:

/* Notify the debugger all new objects are now ready to go.  We must re-get
   the address since by now the variable might be in another object.  */
r = _dl_debug_initialize (0, LM_ID_BASE);
r->r_state = RT_CONSISTENT;
_dl_debug_state ();

The code for _dl_debug_initialize() looks like:

struct r_debug *
_dl_debug_initialize (ElfW(Addr) ldbase, Lmid_t ns)
{
  struct r_debug *r;

  if (ns == LM_ID_BASE)
    r = &_r_debug; /// <- this will switch to 
  else
    r = &GL(dl_ns)[ns]._ns_debug;

  if (r->r_map == NULL || ldbase != 0)
    {
      /* Tell the debugger where to find the map of loaded objects.  */
      r->r_version = 1	/* R_DEBUG_VERSION XXX */;
      r->r_ldbase = ldbase ?: _r_debug.r_ldbase;
      r->r_map = (void *) GL(dl_ns)[ns]._ns_loaded;
      r->r_brk = (ElfW(Addr)) &_dl_debug_state;
    }

  return r;
}

And the line "r = &_r_debug;" picks up the first item in the library search paths for "_r_debug" which gets the one from the a.out program...

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.cpp
260–266	yeah, I was tired last night when finishing this up, I will grab the stuff from the commit message
lldb/test/API/functionalities/dyld-multiple-rdebug/TestDyldWithMultupleRDebug.py
23	Yeah, copy paste issue. I would be able to remove this.
29	good catch, yes!
55	will do

Address review comments:

Added documentation in DYLDRendezvous.h to explain how things are supposed to work
Updated the documentation in DYLDRendezvous::GetAction() to explain why we need to work around seeing two consecutive eAdd states
Greatly simplify the test that was added per suggestions

Harbormaster completed remote builds in B254521: Diff 552966. Aug 23 2023, 8:02 PM

Test LGTM, I'll defer to others for the dynamic loader plugin.

Fixed more header documentation and cleaned up the test even more.

Harbormaster completed remote builds in B254723: Diff 553259. Aug 24 2023, 2:15 PM

Anyone have any objections? Super easy to repro this bug with the test program.

The added context helps document what was already there so that's a nice improvement.

Something I'm not clear on mechanically. The original r_debug has eAdd set. Then the second r_debug is loaded, which also has eAdd set. What is the state of r_map at that point? I wonder if it could be somehow different between the 2 copies, with each having a subset of the list.

Also, why do we not have to wait for eConsistent on the second r_debug? Is it that we do in fact wait for that, but we load the library list from the first r_debug, then when the second one gets eConsistent, we load the rest from that.

In D158583#4627644, @DavidSpickett wrote:

The added context helps document what was already there so that's a nice improvement.

Something I'm not clear on mechanically. The original r_debug has eAdd set. Then the second r_debug is loaded, which also has eAdd set. What is the state of r_map at that point? I wonder if it could be somehow different between the 2 copies, with each having a subset of the list.

Once we stop at the breakpoint the second time, second r_debug has eConsistent set when the original has eAdd, but we only look at the original r_debug in the ld.so binary.

Actually the _only_ thing the "_r_debug" in the main executable has set is the "r_state" when we stop at the second breakpoint, the r_map is set correctly in the ld.so version, but not in the a.out version. If you continue to run, the a.out version does eventually get updated. It just depends when the ld.so writes new things into the new r_debug struct.

I actually ran a test where I created a global variable with the same name but different type:

char _r_debug[40] = "012345678901234567890123456789012345678"; // Same size as "r_debug" for safety

And then ran the program to the entry point and the "r_state" bytes in the above character array had been written to zero, but nothing else was touched. So I know this definitely isn't meant to happen.

The ld.so's "_r_debug.r_map" is correct and has a linked list of all the current libraries. But it really depends on when the _r_debug struct is written to again by ld.so.

Also, why do we not have to wait for eConsistent on the second r_debug?

Because the first eAdd state indicates the ld.so is _about_ to load the shared libraries. So debuggers can't actually set breakpoints in them until all of the shared libraries are actually loaded.

Is it that we do in fact wait for that, but we load the library list from the first r_debug, then when the second one gets eConsistent, we load the rest from that.

It is because when we receive the eConsistent the libraries are actually now loaded into memory and then we can load the libraries in LLDB and can actually resolve any breakpoints since the .text sections are now mapped and we can write breakpoints into their .text sections.

I understand the mechanism now, so for lack of any other strong opinions, LGTM.

This revision is now accepted and ready to land. Aug 31 2023, 1:36 AM

Closed by commit rG07c215e8a8af: Fix shared library loading when users define duplicate _r_debug structure. (authored by clayborg). Aug 31 2023, 10:37 AM

This revision was automatically updated to reflect the committed changes.

clayborg added a commit: rG07c215e8a8af: Fix shared library loading when users define duplicate _r_debug structure..

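Revision Contents

Path                                                                                Size
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.h                       89 lines
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.cpp                     114 lines
lldb/test/API/functionalities/dyld-multiple-rdebug/Makefile                         4 lines
lldb/test/API/functionalities/dyld-multiple-rdebug/TestDyldWithMultupleRDebug.py    39 lines
lldb/test/API/functionalities/dyld-multiple-rdebug/library_file.h                   1 line
lldb/test/API/functionalities/dyld-multiple-rdebug/library_file.cpp                 7 lines
lldb/test/API/functionalities/dyld-multiple-rdebug/main.cpp                         32 lines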

Diff 555100

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.h

Show All 15 Lines
#include "lldb/lldb-defines.h"		#include "lldb/lldb-defines.h"
#include "lldb/lldb-types.h"		#include "lldb/lldb-types.h"

#include "lldb/Core/LoadedModuleInfoList.h"		#include "lldb/Core/LoadedModuleInfoList.h"

using lldb_private::LoadedModuleInfoList;		using lldb_private::LoadedModuleInfoList;

namespace lldb_private {		namespace lldb_private {
		class Log;
class Process;		class Process;
}		}

/// \class DYLDRendezvous		/// \class DYLDRendezvous
/// Interface to the runtime linker.		/// Interface to the runtime linker.
///		///
/// A structure is present in a processes memory space which is updated by the		/// A structure is present in a processes memory space which is updated by the
/// runtime liker each time a module is loaded or unloaded. This class		/// dynamic linker each time a module is loaded or unloaded. This class
/// provides an interface to this structure and maintains a consistent		/// provides an interface to this structure and maintains a consistent
/// snapshot of the currently loaded modules.		/// snapshot of the currently loaded modules.
		///
		/// In the dynamic loader sources, this structure has a type of "r_debug" and
		/// the name of the structure is "_r_debug". The structure looks like:
		///
		/// struct r_debug {
		/// // Version number for this protocol.
		/// int r_version;
		/// // Head of the chain of loaded objects.
		/// struct link_map *r_map;
		/// // The address the debugger should set a breakpoint at in order to get
		/// // notified when shared libraries are added or removed
		/// uintptr_t r_brk;
		/// // This state value describes the mapping change taking place when the
		/// // 'r_brk' address is called.
		/// enum {
		/// RT_CONSISTENT, // Mapping change is complete.
		/// RT_ADD, // Beginning to add a new object.
		/// RT_DELETE, // Beginning to remove an object mapping.
		/// } r_state;
		/// // Base address the linker is loaded at.
		/// uintptr_t r_ldbase;
		/// };
		///
		/// The dynamic linker then defines a global variable using this type named
		/// "_r_debug":
		///
		/// r_debug _r_debug;
		///
		/// The DYLDRendezvous class defines a local version of this structure named
		/// DYLDRendezvous::Rendezvous. See the definition inside the class definition
		/// for DYLDRendezvous.
		///
		/// This structure can be located by looking through the .dynamic section in
		/// the main executable and finding the DT_DEBUG tag entry. This value starts
		/// out with a value of zero when the program first is initially loaded, but
		/// the address of the "_r_debug" structure from ld.so is filled in by the
		/// dynamic loader during program initialization code in ld.so prior to loading
		/// or unloading and shared libraries.
		///
		/// The dynamic loader will update this structure as shared libraries are
		/// loaded and will call a specific function that LLDB knows to set a
		/// breakpoint on (from _r_debug.r_brk) so LLDB will find out when shared
		/// libraries are loaded or unloaded. Each time this breakpoint is hit, LLDB
		/// looks at the contents of this structure and the contents tell LLDB what
		/// needs to be done.
		///
		/// Currently we expect the "state" in this structure to change as things
		/// happen.
		///
		/// When any shared libraries are loaded the following happens:
		/// - _r_debug.r_map is updated with the new shared libraries. This is a
		/// doubly linked list of "link_map *" entries.
		/// - _r_debug.r_state is set to RT_ADD and the debugger notification
		/// function is called notifying the debugger that shared libraries are
		/// about to be added, but are not yet ready for use.
		/// - Once the the shared libraries are fully loaded, _r_debug.r_state is set
		/// to RT_CONSISTENT and the debugger notification function is called again
		/// notifying the debugger that shared libraries are ready for use.
		/// DYLDRendezvous must remember that the previous state was RT_ADD when it
		/// receives a RT_CONSISTENT in order to know to add libraries
		///
		/// When any shared libraries are unloaded the following happens:
		/// - _r_debug.r_map is updated and the unloaded libraries are removed.
		/// - _r_debug.r_state is set to RT_DELETE and the debugger notification
		/// function is called notifying the debugger that shared libraries are
		/// about to be removed.
		/// - Once the the shared libraries are removed _r_debug.r_state is set to
		/// RT_CONSISTENT and the debugger notification function is called again
		/// notifying the debugger that shared libraries have been removed.
		/// DYLDRendezvous must remember that the previous state was RT_DELETE when
		/// it receives a RT_CONSISTENT in order to know to remove libraries
		///
class DYLDRendezvous {		class DYLDRendezvous {

// This structure is used to hold the contents of the debug rendezvous		// This structure is used to hold the contents of the debug rendezvous
// information (struct r_debug) as found in the inferiors memory. Note that		// information (struct r_debug) as found in the inferiors memory. Note that
// the layout of this struct is not binary compatible, it is simply large		// the layout of this struct is not binary compatible, it is simply large
// enough to hold the information on both 32 and 64 bit platforms.		// enough to hold the information on both 32 and 64 bit platforms.
struct Rendezvous {		struct Rendezvous {
uint64_t version = 0;		uint64_t version = 0;
lldb::addr_t map_addr = 0;		lldb::addr_t map_addr = 0;
lldb::addr_t brk = 0;		lldb::addr_t brk = 0;
uint64_t state = 0;		uint64_t state = 0;
lldb::addr_t ldbase = 0;		lldb::addr_t ldbase = 0;

Rendezvous() = default;		Rendezvous() = default;

		void DumpToLog(lldb_private::Log *log, const char *label);
};		};

/// Locates the address of the rendezvous structure. It updates		/// Locates the address of the rendezvous structure. It updates
/// m_executable_interpreter if address is extracted from _r_debug.		/// m_executable_interpreter if address is extracted from _r_debug.
///		///
/// \returns address on success and LLDB_INVALID_ADDRESS on failure.		/// \returns address on success and LLDB_INVALID_ADDRESS on failure.
lldb::addr_t ResolveRendezvousAddress();		lldb::addr_t ResolveRendezvousAddress();

▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	public:
/// \returns true if modules have been unloaded from the inferior since the		/// \returns true if modules have been unloaded from the inferior since the
/// last call to Resolve().		/// last call to Resolve().
bool ModulesDidUnload() const { return !m_removed_soentries.empty(); }		bool ModulesDidUnload() const { return !m_removed_soentries.empty(); }

void DumpToLog(lldb_private::Log *log) const;		void DumpToLog(lldb_private::Log *log) const;

/// Constants describing the state of the rendezvous.		/// Constants describing the state of the rendezvous.
///		///
		/// These values are defined to match the r_debug.r_state enum from the
		/// actual dynamic loader sources.
		///
/// \see GetState().		/// \see GetState().
enum RendezvousState { eConsistent, eAdd, eDelete };		enum RendezvousState {
		eConsistent, // RT_CONSISTENT
		eAdd, // RT_ADD
		eDelete // RT_DELETE
		};

/// Structure representing the shared objects currently loaded into the		/// Structure representing the shared objects currently loaded into the
/// inferior process.		/// inferior process.
///		///
/// This object is a rough analogue to the struct link_map object which		/// This object is a rough analogue to the struct link_map object which
/// actually lives in the inferiors memory.		/// actually lives in the inferiors memory.
struct SOEntry {		struct SOEntry {
lldb::addr_t link_addr; ///< Address of this link_map.		lldb::addr_t link_addr; ///< Address of this link_map.
▲ Show 20 Lines • Show All 132 Lines • ▼ Show 20 Lines	protected:

enum RendezvousAction {		enum RendezvousAction {
eNoAction,		eNoAction,
eTakeSnapshot,		eTakeSnapshot,
eAddModules,		eAddModules,
eRemoveModules		eRemoveModules
};		};

		static const char *StateToCStr(RendezvousState state);
		static const char *ActionToCStr(RendezvousAction action);

/// Returns the current action to be taken given the current and previous		/// Returns the current action to be taken given the current and previous
/// state		/// state
RendezvousAction GetAction() const;		RendezvousAction GetAction() const;
};		};

#endif		#endif

lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.cpp

Show All 19 Lines

#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"

#include "DYLDRendezvous.h"		#include "DYLDRendezvous.h"

using namespace lldb;		using namespace lldb;
using namespace lldb_private;		using namespace lldb_private;

		const char *DYLDRendezvous::StateToCStr(RendezvousState state) {
		switch (state) {
		case DYLDRendezvous::eConsistent:
		return "eConsistent";
		case DYLDRendezvous::eAdd:
		return "eAdd";
		case DYLDRendezvous::eDelete:
		return "eDelete";
		}
		return "<invalid RendezvousState>";
		}

		const char *DYLDRendezvous::ActionToCStr(RendezvousAction action) {
		switch (action) {
		case DYLDRendezvous::RendezvousAction::eTakeSnapshot:
		return "eTakeSnapshot";
		case DYLDRendezvous::RendezvousAction::eAddModules:
		return "eAddModules";
		case DYLDRendezvous::RendezvousAction::eRemoveModules:
		return "eRemoveModules";
		case DYLDRendezvous::RendezvousAction::eNoAction:
		return "eNoAction";
		}
		return "<invalid RendezvousAction>";
		}

DYLDRendezvous::DYLDRendezvous(Process *process)		DYLDRendezvous::DYLDRendezvous(Process *process)
: m_process(process), m_rendezvous_addr(LLDB_INVALID_ADDRESS),		: m_process(process), m_rendezvous_addr(LLDB_INVALID_ADDRESS),
m_executable_interpreter(false), m_current(), m_previous(),		m_executable_interpreter(false), m_current(), m_previous(),
m_loaded_modules(), m_soentries(), m_added_soentries(),		m_loaded_modules(), m_soentries(), m_added_soentries(),
m_removed_soentries() {		m_removed_soentries() {
m_thread_info.valid = false;		m_thread_info.valid = false;
UpdateExecutablePath();		UpdateExecutablePath();
}		}
▲ Show 20 Lines • Show All 88 Lines • ▼ Show 20 Lines	if (exe_mod) {
LLDB_LOGF(log,		LLDB_LOGF(log,
"DYLDRendezvous::%s cannot cache exe module path: null "		"DYLDRendezvous::%s cannot cache exe module path: null "
"executable module pointer",		"executable module pointer",
__FUNCTION__);		__FUNCTION__);
}		}
}		}
}		}

		void DYLDRendezvous::Rendezvous::DumpToLog(Log *log, const char *label) {
		LLDB_LOGF(log, "%s Rendezvous: version = %" PRIu64 ", map_addr = 0x%16.16"
		PRIx64 ", brk = 0x%16.16" PRIx64 ", state = %" PRIu64
		" (%s), ldbase = 0x%16.16" PRIx64, label ? label : "", version,
		map_addr, brk, state, StateToCStr((RendezvousState)state), ldbase);
		}

bool DYLDRendezvous::Resolve() {		bool DYLDRendezvous::Resolve() {
Log *log = GetLog(LLDBLog::DynamicLoader);		Log *log = GetLog(LLDBLog::DynamicLoader);

const size_t word_size = 4;		const size_t word_size = 4;
Rendezvous info;		Rendezvous info;
size_t address_size;		size_t address_size;
size_t padding;		size_t padding;
addr_t info_addr;		addr_t info_addr;
Show All 31 Lines	bool DYLDRendezvous::Resolve() {
if (!(cursor = ReadPointer(cursor + padding, &info.ldbase)))		if (!(cursor = ReadPointer(cursor + padding, &info.ldbase)))
return false;		return false;

// The rendezvous was successfully read. Update our internal state.		// The rendezvous was successfully read. Update our internal state.
m_rendezvous_addr = info_addr;		m_rendezvous_addr = info_addr;
m_previous = m_current;		m_previous = m_current;
m_current = info;		m_current = info;

		m_previous.DumpToLog(log, "m_previous");
		m_current.DumpToLog(log, "m_current ");

if (m_current.map_addr == 0)		if (m_current.map_addr == 0)
return false;		return false;

if (UpdateSOEntriesFromRemote())		if (UpdateSOEntriesFromRemote())
return true;		return true;

return UpdateSOEntries();		return UpdateSOEntries();
}		}
Show All 25 Lines	case eConsistent:
case eAdd:		case eAdd:
return eAddModules;		return eAddModules;
case eDelete:		case eDelete:
return eRemoveModules;		return eRemoveModules;
}		}
break;		break;

case eAdd:		case eAdd:
		// If the main executable or a shared library defines a publicly visible
		// symbol named "_r_debug", then it will cause problems once the executable
		// that contains the symbol is loaded into the process. The correct
		// "_r_debug" structure is currently found by LLDB by looking through
		// the .dynamic section in the main executable and finding the DT_DEBUG tag
		// entry.
		//
		// An issue comes up if someone defines another publicly visible "_r_debug"
		// struct in their program. Sample code looks like:
		//
		// #include <link.h>
		// r_debug _r_debug;
		//
		// If code like this is in an executable or shared library, this creates a
		// new "_r_debug" structure and it causes problems once the executable is
		// loaded due to the way symbol lookups happen in linux: the shared library
		// list from _r_debug.r_map will be searched for a symbol named "_r_debug"
		// and the first match will be the new version that is used. The dynamic
		// loader is always last in this list. So at some point the dynamic loader
		// will start updating the copy of "_r_debug" that gets found first. The
		// issue is that LLDB will only look at the copy that is pointed to by the
		// DT_DEBUG entry, or the initial version from the ld.so binary.
		//
		// Steps that show the problem are:
		//
		// - LLDB finds the "_r_debug" structure via the DT_DEBUG entry in the
		// .dynamic section and this points to the "_r_debug" in ld.so
		// - ld.so updates its copy of "_r_debug" with "state = eAdd" before it
		// loads the dependent shared libraries for the main executable and
		// any dependencies of all shared libraries from the executable's list
		// and ld.so code calls the debugger notification function
		// that LLDB has set a breakpoint on.
		// - LLDB hits the breakpoint and the breakpoint has a callback function
		// where we read the _r_debug.state (eAdd) state and we do nothing as the
		// "eAdd" state indicates that the shared libraries are about to be added.
		// - ld.so finishes loading the main executable and any dependent shared
		// libraries and it will update the "_r_debug.state" member with a
		// "eConsistent", but it now updates the "_r_debug" in the a.out program
		// and it calls the debugger notification function.
		// - lldb hits the notification breakpoint and checks the ld.so copy of
		// "_r_debug.state" which still has a state of "eAdd", but LLDB needs to see a
		// "eConsistent" state to trigger the shared libraries to get loaded into
		// the debug session, but LLDB reads the ld.so _r_debug.state which still
		// contains "eAdd" and doesn't do anything and library load is missed.
		// The "_r_debug" in a.out has the state set correctly to "eConsistent"
		// but LLDB is still looking at the "_r_debug" from ld.so.
		//
		// So if we detect two "eAdd" states in a row, we assume this is the issue
		// and we now load shared libraries correctly and will emit a log message
		// in the "log enable lldb dyld" log channel which states there might be
		// multiple "_r_debug" structs causing problems.
		//
		// The correct solution is that no one should be adding a duplicate
		// publicly visible "_r_debug" symbols to their binaries, but we have
		// programs that are doing this already and since it can be done, we should
		// be able to work with this and keep debug sessions working as expected.
		//
		// If a user includes the <link.h> file, they can just use the existing
		// "_r_debug" structure as it is defined in this header file as "extern
		// struct r_debug _r_debug;" and no local copies need to be made.
		if (m_previous.state == eAdd) {
		Log *log = GetLog(LLDBLog::DynamicLoader);
		LLDB_LOG(log, "DYLDRendezvous::GetAction() found two eAdd states in a "
		"row, check process for multiple \"_r_debug\" symbols. "
		"Returning eAddModules to ensure shared libraries get loaded "
		"correctly");
		return eAddModules;
		}
		return eNoAction;
case eDelete:		case eDelete:
return eNoAction;		return eNoAction;
}		}

return eNoAction;		return eNoAction;
}		}

bool DYLDRendezvous::UpdateSOEntriesFromRemote() {		bool DYLDRendezvous::UpdateSOEntriesFromRemote() {
auto action = GetAction();		const auto action = GetAction();
		Log *log = GetLog(LLDBLog::DynamicLoader);
		LLDB_LOG(log, "{0} action = {1}", __PRETTY_FUNCTION__, ActionToCStr(action));

if (action == eNoAction)		if (action == eNoAction)
return false;		return false;

m_added_soentries.clear();		m_added_soentries.clear();
m_removed_soentries.clear();		m_removed_soentries.clear();
if (action == eTakeSnapshot) {		if (action == eTakeSnapshot) {
// We already have the loaded list from the previous update so no need to		// We already have the loaded list from the previous update so no need to
Show All 21 Lines	case eNoAction:
return false;		return false;
}		}
llvm_unreachable("Fully covered switch above!");		llvm_unreachable("Fully covered switch above!");
}		}

bool DYLDRendezvous::UpdateSOEntries() {		bool DYLDRendezvous::UpdateSOEntries() {
m_added_soentries.clear();		m_added_soentries.clear();
m_removed_soentries.clear();		m_removed_soentries.clear();
switch (GetAction()) {		const auto action = GetAction();
		Log *log = GetLog(LLDBLog::DynamicLoader);
		LLDB_LOG(log, "{0} action = {1}", __PRETTY_FUNCTION__, ActionToCStr(action));
		switch (action) {
case eTakeSnapshot:		case eTakeSnapshot:
m_soentries.clear();		m_soentries.clear();
return TakeSnapshot(m_soentries);		return TakeSnapshot(m_soentries);
case eAddModules:		case eAddModules:
return AddSOEntries();		return AddSOEntries();
case eRemoveModules:		case eRemoveModules:
return RemoveSOEntries();		return RemoveSOEntries();
case eNoAction:		case eNoAction:
▲ Show 20 Lines • Show All 404 Lines • Show Last 20 Lines

lldb/test/API/functionalities/dyld-multiple-rdebug/Makefile

This file was added.

				CXX_SOURCES := main.cpp
				DYLIB_NAME := testlib
				DYLIB_CXX_SOURCES := library_file.cpp
				include Makefile.rules

lldb/test/API/functionalities/dyld-multiple-rdebug/TestDyldWithMultupleRDebug.py

This file was added.

				"""
				Test that LLDB can launch a linux executable through the dynamic loader where
				the main executable has an extra exported "_r_debug" symbol that used to mess
				up shared library loading with DYLDRendezvous and the POSIX dynamic loader
				plug-in. What used to happen is that any shared libraries other than the main
				executable and the dynamic loader and vDSO would not get loaded. This test
				checks to make sure that we still load libraries correctly when we have
				multiple "_r_debug" symbols. See comments in the main.cpp source file for full
				details on what the problem is.
				"""

				import lldb
				import os

				from lldbsuite.test import lldbutil
				from lldbsuite.test.decorators import *
				from lldbsuite.test.lldbtest import *


				class TestDyldWithMultipleRDebug(TestBase):
				    @skipIf(oslist=no_match(["linux"]))
				    @no_debug_info_test
				    def test(self):
				DavidSpickett: Any specific reason for this? Not that it really matters, it'll get plenty of testing elsewhere.
				clayborg: Yeah, copy paste issue. I would be able to remove this.
				        self.build()
				        # Run to a breakpoint in main.cpp to ensure we can hit breakpoints
				        # in the main executable. Setting breakpoints by file and line ensures
				        # that the main executable was loaded correctly by the dynamic loader
				        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
				            self, "// Break here", lldb.SBFileSpec("main.cpp"),
				DavidSpickett: Leftover debug print.
				clayborg: good catch, yes!
				            extra_images=["testlib"]
				        )
				        # Set a breakpoint in the shared library function to ensure that
				        # we hit a source breakpoint in the shared library, which will only
				        # happen if we load the shared library correctly in the dynamic
				        # loader.
				        lldbutil.continue_to_source_breakpoint(
				            self, process, "// Library break here",
				            lldb.SBFileSpec("library_file.cpp", False)
				        )
				aprantl: You should be able to shorten the test setup considerably by using lldbutil.run_to_name_breakpoint() and only manually setting the second breakpoint afterwards.
				clayborg: will do

lldb/test/API/functionalities/dyld-multiple-rdebug/library_file.h

This file was added.

int library_function();

lldb/test/API/functionalities/dyld-multiple-rdebug/library_file.cpp

This file was added.

				#include "library_file.h"
				#include <stdio.h>

				int library_function(void) {
				  puts(__FUNCTION__); // Library break here
				  return 0;
				}

lldb/test/API/functionalities/dyld-multiple-rdebug/main.cpp

This file was added.

				#include "library_file.h"
				#include <link.h>
				#include <stdio.h>
				// Make a duplicate "_r_debug" symbol that is visible. This is the global
				// variable name that the dynamic loader uses to communicate changes in shared
				// libraries that get loaded and unloaded. LLDB finds the address of this
				// variable by reading the DT_DEBUG entry from the .dynamic section of the main
				// executable.
				// What will happen is the dynamic loader will use the "_r_debug" symbol from
				// itself until the a.out executable gets loaded. At this point the new
				// "_r_debug" symbol will take precedence over the original "_r_debug" symbol
				// from the dynamic loader and the copy below will get updated with shared
				// library state changes while the version that LLDB checks in the dynamic
				// loader stays the same forever after this.
				//
				// When our DYLDRendezvous.cpp tries to check the state in the _r_debug
				// structure, it will continue to get eAdd, the last state written before
				// the switch in symbol resolution.
				//
				// Before a fix in LLDB, this would mean that we wouldn't ever load any shared
				// libraries since DYLDRendezvous was waiting to see an eAdd state followed by
				// an eConsistent state which would trigger the adding of shared libraries, but
				// we would never see this change because the local copy below is actually what
				// would get updated. Now if DYLDRendezvous detects two eAdd states in a row,
				// it will load the shared libraries instead of doing nothing and a log message
				// will be printed out if "log enable lldb dyld" is active.
				r_debug _r_debug;

				int main() {
				  library_function(); // Break here
				  return 0;
				}
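
The comment in main.cpp describes the state sequence that DYLDRendezvous watches for and what changes once the watched "_r_debug" copy goes stale. The following is a minimal sketch of that decision, not the code from this patch. The names RState, Action, DecideAction and Replay are hypothetical, the snapshot case and all process reads are left out, and mapping a repeated eAdd to "add modules" is an assumption based on the comment saying the libraries get loaded "instead of doing nothing".

```cpp
#include <cassert>
#include <vector>

// Simplified model with hypothetical names; this is not LLDB's actual code.
// RState mirrors the r_state values RT_CONSISTENT, RT_ADD and RT_DELETE
// declared in <link.h>.
enum class RState { Consistent, Add, Delete };
enum class Action { None, AddModules, RemoveModules };

// Decide what to do from the previously observed state and the current one.
// The Add -> Add case models the stale copy: the loader is updating some
// other "_r_debug", so the watched copy never returns to Consistent.
inline Action DecideAction(RState previous, RState current) {
  switch (current) {
  case RState::Consistent:
    if (previous == RState::Add)
      return Action::AddModules;
    if (previous == RState::Delete)
      return Action::RemoveModules;
    return Action::None;
  case RState::Add:
    // Assumed handling of the fix: a repeated Add still loads libraries.
    return previous == RState::Add ? Action::AddModules : Action::None;
  case RState::Delete:
    return Action::None;
  }
  return Action::None;
}

// Feed a sequence of observed states (one per breakpoint hit) through
// DecideAction, starting from Consistent, and collect the resulting actions.
inline std::vector<Action> Replay(const std::vector<RState> &observed) {
  std::vector<Action> actions;
  RState previous = RState::Consistent;
  for (RState current : observed) {
    actions.push_back(DecideAction(previous, current));
    previous = current;
  }
  return actions;
}
```

In this model, replaying the normal handshake {Add, Consistent} yields one AddModules on the second hit. Replaying a stuck sequence {Add, Add, Add} yields AddModules on every hit after the first. Without the Add -> Add branch, every entry of the stuck sequence would be None, which corresponds to the "wouldn't ever load any shared libraries" behavior described in the comment.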