This is an archive of the discontinued LLVM Phabricator instance.

Parallelize demangling
Needs Review, Public

Authored by scott.smith on May 3 2017, 11:54 AM.

Details

Reviewers

clayborg
jingham

Summary

Use TaskMapOverInt to demangle the symbols in parallel. Defer categorization of C++ symbols until later, when it can be determined what contexts are definitely classes, and what might not be.
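
A minimal sketch of the two-phase shape described in the summary, under stated assumptions: `SymbolState`, `ClassifyOne`, `NameIndex` and `BuildIndex` are hypothetical names, and splitting on the last `::` stands in for real demangling. Phase 1 is written as a plain loop here; it is the part that could be handed to a parallel map, because each call writes only its own slot. Phase 2 runs serially once every class context is known.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical per-symbol result, loosely modeled on the patch's
// demangle_state.
struct SymbolState {
  std::string context;           // "Foo" for "Foo::bar"; empty if unqualified
  std::string basename;          // "bar" for "Foo::bar"
  bool definitely_class = false; // true when the name proves a class scope
};

// Phase 1 worker (illustrative, not a real demangler). It reads only its
// input and writes only its own output slot, so calls for different symbols
// do not depend on each other.
inline void ClassifyOne(SymbolState &state, const std::string &qualified_name) {
  const std::size_t pos = qualified_name.rfind("::");
  if (pos == std::string::npos) {
    state.basename = qualified_name;
  } else {
    state.context = qualified_name.substr(0, pos);
    state.basename = qualified_name.substr(pos + 2);
  }
  // A destructor can only belong to a class.
  state.definitely_class = !state.context.empty() &&
                           !state.basename.empty() &&
                           state.basename[0] == '~';
}

struct NameIndex {
  std::vector<std::size_t> methods;   // symbol indices filed as methods
  std::vector<std::size_t> basenames; // symbol indices filed as functions
};

inline NameIndex BuildIndex(const std::vector<std::string> &names) {
  std::vector<SymbolState> states(names.size());

  // Phase 1: independent per-symbol work (the parallelizable part).
  for (std::size_t i = 0; i < names.size(); ++i)
    ClassifyOne(states[i], names[i]);

  // Phase 2a: collect every context that is known to be a class.
  std::unordered_set<std::string> class_contexts;
  for (const SymbolState &s : states)
    if (s.definitely_class)
      class_contexts.insert(s.context);

  // Phase 2b: categorize each symbol now that all class contexts are known.
  NameIndex index;
  for (std::size_t i = 0; i < states.size(); ++i) {
    const SymbolState &s = states[i];
    if (s.basename.empty())
      continue;
    if (s.context.empty()) {
      index.basenames.push_back(i); // no scope: plain function
      continue;
    }
    index.methods.push_back(i);
    // The scope might be a namespace, so also file it as a function.
    if (!s.definitely_class && class_contexts.count(s.context) == 0)
      index.basenames.push_back(i);
  }
  return index;
}
```

Deferring the categorization means the result no longer depends on the order in which symbols are visited, which is what makes the first phase safe to run out of order.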

Diff Detail

Repository: rL LLVM

Event Timeline

This commit uses TaskMapOverInt from change D32757, which hasn't landed yet.

git clang-format
new TaskMapOverInt api (begin, end, fn instead of end, batch, fn)
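
For readers unfamiliar with the (begin, end, fn) shape, here is a rough sketch of what such a helper could look like. `MapOverInt` is a hypothetical stand-in written with the standard library only; it is not the TaskMapOverInt implementation from D32757, and the worker-count policy is an assumption.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a (begin, end, fn) parallel map. Calls fn(i)
// exactly once for every i in [begin, end), in no particular order.
inline void MapOverInt(std::size_t begin, std::size_t end,
                       const std::function<void(std::size_t)> &fn) {
  assert(begin <= end);
  std::atomic<std::size_t> next(begin);

  // Assumption: one worker per hardware thread, never more than the number
  // of items to process.
  const unsigned hw = std::thread::hardware_concurrency();
  const std::size_t num_workers =
      std::min<std::size_t>(hw ? hw : 1, end - begin);

  // Each worker repeatedly claims the next unprocessed index.
  auto worker = [&next, end, &fn] {
    for (;;) {
      const std::size_t i = next.fetch_add(1);
      if (i >= end)
        break;
      fn(i);
    }
  };

  std::vector<std::thread> threads;
  for (std::size_t t = 1; t < num_workers; ++t)
    threads.emplace_back(worker);
  worker(); // the calling thread takes part as well
  for (std::thread &th : threads)
    th.join();
}
```

A caller passes a function that writes only to slot `i` of a pre-sized output, for example `MapOverInt(0, out.size(), [&out](std::size_t i) { out[i] = i; });`, so no locking is needed on the output.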

labath added a subscriber: labath. May 5 2017, 4:08 AM

labath added inline comments.

source/Symbol/Symtab.cpp
257	The function looks big enough to deserve a name. (== Please move the lambda out of line)

scott.smith added inline comments. May 5 2017, 7:20 AM

source/Symbol/Symtab.cpp
257	ok but now's your chance to review it as a diff rather than a sea of red and green....

sea of red and green
Be explicit about what the lambda has access to (and make the inputs const) to prevent concurrency bugs.
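
As an illustration of that point (a sketch with made-up names, not the code in this revision): an explicit capture list and a worker whose inputs are const make it visible at a glance what the parallel body may read and what it may write. `Result`, `ProcessOne` and `ProcessAll` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Result {
  std::size_t length = 0;
};

// Out-of-line worker: the signature lists everything it can touch.
// The input is const; the only writable object is this call's own slot.
inline void ProcessOne(Result &out, const std::string &input) {
  out.length = input.size();
}

inline std::vector<Result> ProcessAll(const std::vector<std::string> &inputs) {
  std::vector<Result> results(inputs.size());

  // Explicit captures instead of [&]: the body can reach `results` and
  // `inputs` and nothing else from the enclosing scope.
  auto body = [&results, &inputs](std::size_t i) {
    assert(i < inputs.size());
    ProcessOne(results[i], inputs[i]);
  };

  // Shown serially here; the same body could be passed to a parallel map.
  for (std::size_t i = 0; i < inputs.size(); ++i)
    body(i);
  return results;
}
```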

What are the measured improvements here? We can't slow things down on any platforms. I know MacOS didn't respond well to making demangling run in parallel. I want to see some numbers here. And please don't quote numbers with tcmalloc or any special allocator unless you have a patch in LLDB already to make this a build option.

source/Symbol/Symtab.cpp
233–239	Not being able to search for symbols by name when they are trampolines? If you lookup symbols by name I would expect things not to fail and I would expect that I would get all the answers, not just ones that are omitted for performance reasons. I would also not expect to have to specify extra flags. Please remove

In D32820#747309, @clayborg wrote:

What are the measured improvements here? We can't slow things down on any platforms. I know MacOS didn't respond well to making demangling run in parallel. I want to see some numbers here. And please don't quote numbers with tcmalloc or any special allocator unless you have a patch in LLDB already to make this a build option.

Without tcmalloc, on Ubuntu 14.04, 40 core VM: 13%
With tcmalloc, on Ubuntu 14.04, 40 core VM: 24% (built using cmake ... -DCMAKE_EXE_LINKER_FLAGS=-ltcmalloc_minimal, which amazingly only works when building with clang, not gcc...)

I don't have access to a Mac, and of course YMMV depending on the application.

so yeah, it's a bigger improvement with tcmalloc. Interestingly, I tried adding back the demangler improvements I had queued up (which reduced memory allocations), and they didn't matter much, which makes me think this is contention allocating const strings. I could be wrong though.

By far the biggest performance improvement I have queued up is loading the shared libraries in parallel (D32597), but that's waiting on pulling parallel_for_each from LLD into LLVM (I think).

If you're really leery of this change, I could just make the structural changes to allow parallelization, and then keep a small patch internally to enable it. Or enabling it could be platform dependent. Or ... ?

source/Symbol/Symtab.cpp
233–239	This is just moved code, not new code. You can use the Phabricator history tool above to diff baseline against #97908, and see that the only change I made was continue->return, due to changing it from a loop to a lambda (now a separate function). This is why I published D32708 separately - I wanted to separate the functional change from the structural change.
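
The continue->return change can be shown in miniature. The functions below are hypothetical and unrelated to Symtab; they only show that an early `return` in an extracted per-item function does the same job as `continue` in the original loop body.

```cpp
#include <vector>

// Loop form: `continue` skips the current element.
inline int SumNonNegativeLoop(const std::vector<int> &values) {
  int sum = 0;
  for (int v : values) {
    if (v < 0)
      continue;
    sum += v;
  }
  return sum;
}

// Extracted form: the loop body is now its own function, so skipping the
// current element becomes an early `return`.
inline void AddIfNonNegative(int &sum, int v) {
  if (v < 0)
    return;
  sum += v;
}

inline int SumNonNegativeExtracted(const std::vector<int> &values) {
  int sum = 0;
  for (int v : values)
    AddIfNonNegative(sum, v);
  return sum;
}
```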

Without tcmalloc, on Ubuntu 14.04, 40 core VM: 13%
With tcmalloc, on Ubuntu 14.04, 40 core VM: 24% (built using cmake ... -DCMAKE_EXE_LINKER_FLAGS=-ltcmalloc_minimal, which amazingly only works when building with clang, not gcc...)

Do you have a brief set of steps you use for benchmarking? I'd like to compare on FreeBSD using a similar test.

In D32820#759141, @emaste wrote:

Without tcmalloc, on Ubuntu 14.04, 40 core VM: 13%
With tcmalloc, on Ubuntu 14.04, 40 core VM: 24% (built using cmake ... -DCMAKE_EXE_LINKER_FLAGS=-ltcmalloc_minimal, which amazingly only works when building with clang, not gcc...)

Do you have a brief set of steps you use for benchmarking? I'd like to compare on FreeBSD using a similar test.

time lldb -b -o 'b main' -o 'run' /my/program
(sometimes I use 'perf stat' instead of time; I don't know if FreeBSD has something similar to Linux's perf - basically instruction count, cycle count, branch counts and mispredict rate, etc.)

my program happens to have a lot of symbols and a lot of libraries. I tried this benchmark with lldb itself, and all but one of my changes have no effect because lldb only links in one large library, so YMMV depending on the application.

Revision Contents

Path	Size
include/lldb/Utility/ConstString.h	11 lines
source/Symbol/Symtab.cpp	270 lines

Diff 97970

include/lldb/Utility/ConstString.h

Context not available.

 } // namespace lldb_private

+namespace std {
+template <> struct hash<lldb_private::ConstString> {
+  size_t operator()(const lldb_private::ConstString &str) const {
+    return m_substr_hash(str.GetCString());
+  }
+
+private:
+  std::hash<const void *> m_substr_hash;
+};
+} // namespace std
+
 namespace llvm {
 template <> struct format_provider<lldb_private::ConstString> {
   static void format(const lldb_private::ConstString &CS, llvm::raw_ostream &OS,
Context not available.

source/Symbol/Symtab.cpp

Context not available.
 #include "Plugins/Language/CPlusPlus/CPlusPlusLanguage.h"
 #include "Plugins/Language/ObjC/ObjCLanguage.h"
 #include "lldb/Core/Module.h"
-#include "lldb/Core/Section.h"
 #include "lldb/Core/STLUtils.h"
+#include "lldb/Core/Section.h"
 #include "lldb/Core/Timer.h"
 #include "lldb/Symbol/ObjectFile.h"
 #include "lldb/Symbol/Symbol.h"
Context not available.
 #include "lldb/Symbol/Symtab.h"
 #include "lldb/Utility/RegularExpression.h"
 #include "lldb/Utility/Stream.h"
+#include "lldb/Utility/TaskPool.h"

 using namespace lldb;
 using namespace lldb_private;
Context not available.
   return nullptr;
 }

+namespace {
+
+struct demangle_state {
+  ConstString name_to_index[5];
+  ConstString selector_to_index[1];
+  ConstString const_context;
+  ConstString cxx_basename;
+  bool is_definitely_class_context = false;
+};
+
+} // namespace
+
+// This is not a method of Symtab so it is easier to understand its inputs.
+// This function is run in parallel so we need to be careful about side effects.
+static void IndexOneName(demangle_state &state, const Symbol &symbol,
+                         const ObjectFile &objfile) {
+  // Don't let trampolines get into the lookup by name map
+  // If we ever need the trampoline symbols to be searchable by name
+  // we can remove this and then possibly add a new bool to any of the
+  // Symtab functions that lookup symbols by name to indicate if they
+  // want trampolines.
+  if (symbol.IsTrampoline())
+    return;
+
+  const Mangled &mangled = symbol.GetMangled();
+  ConstString cstring = mangled.GetMangledName();
+  if (cstring) {
+    state.name_to_index[0] = cstring;
+
+    if (symbol.ContainsLinkerAnnotations()) {
+      // If the symbol has linker annotations, also add the version without
+      // the annotations.
+      cstring = ConstString(
+          objfile.StripLinkerSymbolAnnotations(cstring.GetStringRef()));
+      state.name_to_index[1] = cstring;
+    }
+
+    const SymbolType symbol_type = symbol.GetType();
+    if (symbol_type == eSymbolTypeCode || symbol_type == eSymbolTypeResolver) {
+      llvm::StringRef entry_ref(cstring.GetStringRef());
+      if (entry_ref[0] == '_' && entry_ref[1] == 'Z' &&
+          (entry_ref[2] != 'T' && // avoid virtual table, VTT structure,
+                                  // typeinfo structure, and typeinfo
+                                  // name
+           entry_ref[2] != 'G' && // avoid guard variables
+           entry_ref[2] != 'Z'))  // named local entities (if we
+                                  // eventually handle eSymbolTypeData,
+                                  // we will want this back)
+      {
+        CPlusPlusLanguage::MethodName cxx_method(
+            mangled.GetDemangledName(lldb::eLanguageTypeC_plus_plus));
+        cstring = ConstString(cxx_method.GetBasename());
+        if (cstring) {
+          // ConstString objects permanently store the string in the pool so
+          // calling
+          // GetCString() on the value gets us a const char * that will
+          // never go away
+          entry_ref = cstring.GetStringRef();
+          ConstString const_context = ConstString(cxx_method.GetContext());
+
+          state.const_context = const_context;
+          state.cxx_basename = cstring;
+          state.is_definitely_class_context =
+              entry_ref[0] == '~' || !cxx_method.GetQualifiers().empty();
+        }
+      }
+    }
+  }
+
+  cstring = mangled.GetDemangledName(symbol.GetLanguage());
+  if (cstring) {
+    state.name_to_index[2] = cstring;
+
+    if (symbol.ContainsLinkerAnnotations()) {
+      // If the symbol has linker annotations, also add the version without
+      // the annotations.
+      cstring = ConstString(
+          objfile.StripLinkerSymbolAnnotations(cstring.GetStringRef()));
+      state.name_to_index[3] = cstring;
+    }
+  }
+
+  // If the demangled name turns out to be an ObjC name, and
+  // is a category name, add the version without categories to the index
+  // too.
+  ObjCLanguage::MethodName objc_method(cstring.GetStringRef(), true);
+  if (objc_method.IsValid(true)) {
+    cstring = objc_method.GetSelector();
+    state.selector_to_index[0] = cstring;
+
+    ConstString objc_method_no_category(
+        objc_method.GetFullNameWithoutCategory(true));
+    if (objc_method_no_category) {
+      cstring = objc_method_no_category;
+      state.name_to_index[4] = cstring;
+    }
+  }
+}
+
 //----------------------------------------------------------------------
 // InitNameIndexes
 //----------------------------------------------------------------------
Context not available.
     m_name_to_index.Reserve(actual_count);
 #endif

-    NameToIndexMap::Entry entry;
+    std::vector<demangle_state> states(num_symbols);

-    // The "const char *" in "class_contexts" must come from a
-    // ConstString::GetCString()
-    std::set<const char *> class_contexts;
-    UniqueCStringMap<uint32_t> mangled_name_to_index;
-    std::vector<const char *> symbol_contexts(num_symbols, nullptr);
-
-    for (entry.value = 0; entry.value < num_symbols; ++entry.value) {
-      const Symbol *symbol = &m_symbols[entry.value];
-
-      // Don't let trampolines get into the lookup by name map
-      // If we ever need the trampoline symbols to be searchable by name
-      // we can remove this and then possibly add a new bool to any of the
-      // Symtab functions that lookup symbols by name to indicate if they
-      // want trampolines.
-      if (symbol->IsTrampoline())
-        continue;
-
-      const Mangled &mangled = symbol->GetMangled();
-      entry.cstring = mangled.GetMangledName();
-      if (entry.cstring) {
-        m_name_to_index.Append(entry);
+    TaskMapOverInt(0, num_symbols, [&states, this](size_t value) {
+      IndexOneName(states[value], m_symbols[value], *m_objfile);
+    });

-        if (symbol->ContainsLinkerAnnotations()) {
-          // If the symbol has linker annotations, also add the version without
-          // the annotations.
-          entry.cstring = ConstString(m_objfile->StripLinkerSymbolAnnotations(
-              entry.cstring.GetStringRef()));
-          m_name_to_index.Append(entry);
-        }
-
-        const SymbolType symbol_type = symbol->GetType();
-        if (symbol_type == eSymbolTypeCode ||
-            symbol_type == eSymbolTypeResolver) {
-          llvm::StringRef entry_ref(entry.cstring.GetStringRef());
-          if (entry_ref[0] == '_' && entry_ref[1] == 'Z' &&
-              (entry_ref[2] != 'T' && // avoid virtual table, VTT structure,
-                                      // typeinfo structure, and typeinfo
-                                      // name
-               entry_ref[2] != 'G' && // avoid guard variables
-               entry_ref[2] != 'Z'))  // named local entities (if we
-                                      // eventually handle eSymbolTypeData,
-                                      // we will want this back)
-          {
-            CPlusPlusLanguage::MethodName cxx_method(
-                mangled.GetDemangledName(lldb::eLanguageTypeC_plus_plus));
-            entry.cstring = ConstString(cxx_method.GetBasename());
-            if (entry.cstring) {
-              // ConstString objects permanently store the string in the pool so
-              // calling
-              // GetCString() on the value gets us a const char * that will
-              // never go away
-              const char *const_context =
-                  ConstString(cxx_method.GetContext()).GetCString();
-
-              if (!const_context || const_context[0] == 0) {
-                // No context for this function so this has to be a basename
-                m_basename_to_index.Append(entry);
-                // If there is no context (no namespaces or class scopes that
-                // come before the function name) then this also could be a
-                // fullname.
-                m_name_to_index.Append(entry);
-              } else {
-                entry_ref = entry.cstring.GetStringRef();
-                if (entry_ref[0] == '~' ||
-                    !cxx_method.GetQualifiers().empty()) {
-                  // The first character of the demangled basename is '~' which
-                  // means we have a class destructor. We can use this information
-                  // to help us know what is a class and what isn't.
-                  if (class_contexts.find(const_context) == class_contexts.end())
-                    class_contexts.insert(const_context);
-                  m_method_to_index.Append(entry);
-                } else {
-                  if (class_contexts.find(const_context) !=
-                      class_contexts.end()) {
-                    // The current decl context is in our "class_contexts" which
-                    // means
-                    // this is a method on a class
-                    m_method_to_index.Append(entry);
-                  } else {
-                    // We don't know if this is a function basename or a method,
-                    // so put it into a temporary collection so once we are done
-                    // we can look in class_contexts to see if each entry is a
-                    // class
-                    // or just a function and will put any remaining items into
-                    // m_method_to_index or m_basename_to_index as needed
-                    mangled_name_to_index.Append(entry);
-                    symbol_contexts[entry.value] = const_context;
-                  }
-                }
-              }
-            }
-          }
+    std::unordered_set<ConstString> class_contexts;
+    for (size_t i = 0; i < num_symbols; i++) {
+      demangle_state const &state = states[i];
+      for (auto name : state.name_to_index) {
+        if (name) {
+          m_name_to_index.Append(name, i);
         }
       }
-      entry.cstring = mangled.GetDemangledName(symbol->GetLanguage());
-      if (entry.cstring) {
-        m_name_to_index.Append(entry);
-
-        if (symbol->ContainsLinkerAnnotations()) {
-          // If the symbol has linker annotations, also add the version without
-          // the annotations.
-          entry.cstring = ConstString(m_objfile->StripLinkerSymbolAnnotations(
-              entry.cstring.GetStringRef()));
-          m_name_to_index.Append(entry);
+      for (auto name : state.selector_to_index) {
+        if (name) {
+          m_selector_to_index.Append(name, i);
         }
       }
-      // If the demangled name turns out to be an ObjC name, and
-      // is a category name, add the version without categories to the index
-      // too.
-      ObjCLanguage::MethodName objc_method(entry.cstring.GetStringRef(), true);
-      if (objc_method.IsValid(true)) {
-        entry.cstring = objc_method.GetSelector();
-        m_selector_to_index.Append(entry);
-
-        ConstString objc_method_no_category(
-            objc_method.GetFullNameWithoutCategory(true));
-        if (objc_method_no_category) {
-          entry.cstring = objc_method_no_category;
-          m_name_to_index.Append(entry);
-        }
+      if (state.is_definitely_class_context) {
+        class_contexts.insert(state.const_context);
       }
     }

-    size_t count;
-    if (!mangled_name_to_index.IsEmpty()) {
-      count = mangled_name_to_index.GetSize();
-      for (size_t i = 0; i < count; ++i) {
-        if (mangled_name_to_index.GetValueAtIndex(i, entry.value)) {
-          entry.cstring = mangled_name_to_index.GetCStringAtIndex(i);
-          if (symbol_contexts[entry.value] &&
-              class_contexts.find(symbol_contexts[entry.value]) !=
-                  class_contexts.end()) {
-            m_method_to_index.Append(entry);
-          } else {
-            // If we got here, we have something that had a context (was inside
-            // a namespace or class)
-            // yet we don't know if the entry
-            m_method_to_index.Append(entry);
-            m_basename_to_index.Append(entry);
-          }
+    for (size_t i = 0; i < num_symbols; i++) {
+      demangle_state const &state = states[i];
+      if (!state.cxx_basename)
+        continue;
+      if (!state.const_context) {
+        // No context for this function so this has to be a basename
+        m_basename_to_index.Append(state.cxx_basename, i);
+        // If there is no context (no namespaces or class scopes that
+        // come before the function name) then this also could be a
+        // fullname.
+        m_name_to_index.Append(state.cxx_basename, i);
+      } else {
+        m_method_to_index.Append(state.cxx_basename, i);
+        if (!state.is_definitely_class_context &&
+            class_contexts.find(state.const_context) == class_contexts.end()) {
+          // If we got here, we have something that had a context (was inside
+          // a namespace or class) yet we don't know if the entry
+          m_basename_to_index.Append(state.cxx_basename, i);
         }
       }
     }
Context not available.