This is an archive of the discontinued LLVM Phabricator instance.

Differential D6588

[compiler-rt] atos and dladdr symbolizers for OS X
ClosedPublic

Authored by kubamracek on Dec 9 2014, 7:07 PM.


Details

Reviewers

glider
samsonov

Summary

Hi everyone,

based on the discussion at https://groups.google.com/d/topic/address-sanitizer/2UaT7rvhvJ4/discussion about different symbolizers, especially on OS X, this patch implements a "fallback" symbolizer for OS X that uses the atos command-line tool, which already ships with OS X. The main reason is easier deployment to machines that don't have llvm-symbolizer. Even though it was pointed out that atos may not be as accurate, having a backup symbolizer is still useful for issue suppressions, where one of the suppression types relies on having a working symbolizer.

As also discussed, the "real" long-term solution is to have internal_symbolizer (llvm-symbolizer built as a standalone static library) built as part of the LLVM build, but that requires a significant amount of work. This atos patch is mostly meant to provide something that works in the meantime.

I tried to keep the changes to existing code in sanitizer_symbolizer_posix_libcdep.cc minimal, so I reused the POSIXSymbolizer and added the AtosSymbolizer as another variant of an external symbolizer (a subclass of ExternalSymbolizerInterface). It would maybe make more sense to subclass SymbolizerProcess but that would require more refactoring. The POSIXSymbolizer class is also already very tied to how llvm-symbolizer works, and probably should also be refactored to allow using another tool/format.

That being said, this patch may be sub-optimal from the refactoring point of view, and I'll be glad to redo it properly, but first I'd like to ask if I'm on the right track or if this should be implemented in a completely different way.

Thanks,
Kuba

Diff Detail

Event Timeline

kubamracek updated this revision to Diff 17110. Dec 9 2014, 7:07 PM

kubamracek retitled this revision to [compiler-rt] atos symbolizer for OS X.

kubamracek updated this object.

kubamracek edited the test plan for this revision.

kubamracek added a reviewer: glider.

kubamracek added a subscriber: Unknown Object (MLST).

kubamracek added subscribers: samsonov, kcc, glider.

samsonov added a reviewer: samsonov. Dec 9 2014, 8:29 PM

Once again, sorry for the delay. Will take a look at it this week.

glider added inline comments. Dec 17 2014, 1:14 AM

lib/sanitizer_common/sanitizer_symbolizer_mac.cc
25	Please consider using fd_t for |fd_to_child_| and kInvalidFd for invalid fd.
32	This blew my mind out ;) I'm curious whether using the same fd to write commands to the symbolizer and read the output actually works. Is it in any sense better than having ye olde pipes or a socketpair like in StartSymbolizerSubprocess() in sanitizer_symbolizer_posix_libcdep.cc? Anyway, I think we need to use the same code to spawn the subprocesses for both the addr2line and atos symbolizers. Not sure which version is better.
34	s/Fork/fork
35	I think you need to ensure fd isn't spoiled by the failed fork().
57	How about moving ExtractToken to a common file? There's a similar function in sanitizer_symbolizer_posix_libcdep.cc
104	I've mixed feelings about internal_read() returning a uptr, but don't think we need to clean it up in this CL.
lib/sanitizer_common/sanitizer_symbolizer_mac.h
11	This is not accurate anymore, because we've more than two tools (although only ASan supports OSX so far) How about "This file is shared between sanitizer tools run-time libraries"?
28	Please fix the include order (standard headers should be sorted alphabetically)

kubamracek added inline comments. Dec 17 2014, 12:55 PM

lib/sanitizer_common/sanitizer_symbolizer_mac.cc
32	We use forkpty because `atos` doesn't flush its output fd after it gives a response. Regular pipes therefore don't work here, because the response gets buffered within the C library. To disable this, we make a new pseudo-terminal which makes all output from atos unbuffered. This is not a problem for llvm-symbolizer, because it does an outs().flush() after each response.

samsonov added inline comments. Dec 17 2014, 6:54 PM

lib/sanitizer_common/sanitizer_symbolizer.h
144 ↗	(On Diff #17110)	Looks like some overrides need one half of the arguments (module_name/module_offset), and some need another (addr). This kind of interface is not really convenient.
149 ↗	(On Diff #17110)	This function has nothing to do with `ExternalSymbolizerInterface`
lib/sanitizer_common/sanitizer_symbolizer_mac.cc
95	Note: you need to use internal_iserror() for write/read syscalls. If we don't have them for existing code - this is most likely a bug.
lib/sanitizer_common/sanitizer_symbolizer_mac.h
22	Why do you need it?
34	don't you need to initialize fd_to_child_ here?
lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
692 ↗	(On Diff #17110)	I don't like the way the code is structured: first you call `SendCommand()`, which looks at which symbolizers are available and dispatches `SendCommand()` to one of them, and then you call `ParseSymbolizedStackFrame()`, which does a similar dispatch. I don't believe this is correct. On top of that, there is libbacktrace_symbolizer_, which works in yet another way.
792 ↗	(On Diff #17110)	I'd prefer having this under if (SANITIZER_MAC) , and all the AtosSymbolizer code not #ifdef'ed out on Linux - this way it would be easier to check that we don't break one platform while modifying the code for another.

Updating the patch to address review comments and refactor the symbolizers interface a little bit more:

SymbolizerInterface becomes a generic interface that all the symbolizers implement, and has these methods:

bool SymbolizePC(uptr addr, SymbolizedStack *stack);
bool SymbolizeData(uptr addr, DataInfo *info);

SymbolizerProcess, on the other hand, is now a completely separate class that handles the communication with the external process. It no longer deals with module names and offsets; it just receives a "command" and provides a "response". The new LLVMSymbolizer class is responsible for constructing the string command for LLVMSymbolizerProcess and parsing the response back into SymbolizedStack or DataInfo. This way, I can implement AtosSymbolizerProcess as a very simple subclass of SymbolizerProcess.
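The split described above can be illustrated with a small standalone sketch. This is not the code from the patch: `CommandChannel`, `SketchTool`, `HexAddressTool`, `FakeChannel` and `SketchFrame` are hypothetical names made up for illustration, and the one-line response format is an assumption. The point is only that the transport sees opaque strings, while the tool owns command construction and response parsing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical result type; the real patch fills SymbolizedStack instead.
struct SketchFrame {
  std::string function;
};

// Transport only: a command string goes in, a response string comes out.
// It knows nothing about modules, offsets or output formats.
class CommandChannel {
 public:
  virtual ~CommandChannel() = default;
  virtual bool Send(const std::string &command, std::string *response) = 0;
};

// Tool interface: every symbolizer answers the same question.
class SketchTool {
 public:
  virtual ~SketchTool() = default;
  virtual bool SymbolizePC(std::uintptr_t addr, SketchFrame *frame) = 0;
};

// A tool that builds the command text and parses the response itself.
class HexAddressTool : public SketchTool {
 public:
  explicit HexAddressTool(CommandChannel *channel) : channel_(channel) {}

  bool SymbolizePC(std::uintptr_t addr, SketchFrame *frame) override {
    char command[32];
    std::snprintf(command, sizeof(command), "0x%llx\n",
                  static_cast<unsigned long long>(addr));
    std::string response;
    if (!channel_->Send(command, &response)) return false;
    // Assumed format: the first line of the response is the function name.
    frame->function = response.substr(0, response.find('\n'));
    return !frame->function.empty();
  }

 private:
  CommandChannel *channel_;
};

// Test double standing in for a real subprocess.
class FakeChannel : public CommandChannel {
 public:
  bool Send(const std::string &command, std::string *response) override {
    last_command = command;
    *response = "main\n";
    return true;
  }
  std::string last_command;
};
```

With this shape, supporting a second external program means writing a new channel (how to start it, when its output is complete) and a new tool (how to format and parse), without touching the other pair.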

I changed the StartSymbolizerSubprocess() method to use the forkpty call (instead of fork, sock_pair and pipe) for all subprocesses. It's needed for atos, in order to disable buffering in the new terminal (otherwise the response gets buffered inside libc and never returned until the input stream is closed). This also simplifies this method a lot, and we don't need to handle the case when stdin/stdout/stderr are closed.

The patch also adds one more symbolizer, DlAddrSymbolizer, which is extremely simple, and just calls dladdr() to retrieve a symbol name, and doesn't provide any file names or line numbers. It's used as a fallback when spawning an external symbolizer fails (e.g. because we're in a no-fork-allowed sandbox).
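For readers unfamiliar with dladdr(), here is a minimal standalone sketch of a name-only lookup. `NearestSymbolName` is a hypothetical helper, not a function from the patch, and it uses `std::string` purely for brevity (the sanitizer runtime itself avoids the standard library). Note that `dli_sname` can be null even when dladdr() succeeds, so the sketch checks it before use.

```cpp
#include <dlfcn.h>

#include <cassert>
#include <string>

// Hypothetical helper: returns the name of the nearest exported symbol at or
// below `addr`, or an empty string if the address is not inside any loaded
// image or no symbol name is known for it.
std::string NearestSymbolName(const void *addr) {
  Dl_info info;
  // dladdr() returns 0 when no loaded image contains the address.
  if (dladdr(addr, &info) == 0) return "";
  // The image may be found while the symbol is not; dli_sname is then null.
  return info.dli_sname ? info.dli_sname : "";
}
```

The lookup gives no file name or line number, which matches the description above: it is only useful as a last resort when no external symbolizer can be started. On older glibc versions the program may need to be linked with -ldl.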

I added some new test cases that show that we can still provide symbol names in a no-fork sandbox, and that suppressions specified by a symbol name also work.

glider added inline comments. Dec 30 2014, 7:04 AM

lib/sanitizer_common/CMakeLists.txt
32	Please swap this line with the previous one.
lib/sanitizer_common/sanitizer_common.cc
293 ↗	(On Diff #17698)	As far as I can tell, both ExtractInt and ExtractUptr are used to extract unsigned integers. Why use both (and deal with the potential overflow situations)?
315 ↗	(On Diff #17698)	How 'bout "ExtractTokenUpToDelimiter" since this function is extracting tokens, not ints or uptrs?
325 ↗	(On Diff #17698)	Isn't that just prefix_len?
lib/sanitizer_common/sanitizer_common.h
71 ↗	(On Diff #17698)	This group of helpers needs at least a brief comment.
lib/sanitizer_common/sanitizer_symbolizer_mac.cc
30	In this case you're assigning "0xdeadbeef" to res->info.function, is that intentional?
lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
170 ↗	(On Diff #17698)	A comment may give better understanding of what you're parsing here.
205 ↗	(On Diff #17698)	Is the column part always present? I think for addr2line it is not. In that case, will info->column be initialized properly?
221 ↗	(On Diff #17698)	I think this function (and the other one) has little to do with POSIX, despite it's in a _posix_.cc file. Better remove "POSIX" from the name.
226 ↗	(On Diff #17698)	Two spaces before the comment, please (here and in the tests below)
test/asan/TestCases/closed-fds.cc
2 ↗	(On Diff #17698)	Wonder if we want/should check that exactly the required symbolizer is being used (e.g. print its name under verbosity=2)

Addressing review comments.

As far as I can tell, both ExtractInt and ExtractUptr are used to extract unsigned integers. Why use both (and deal with the potential overflow situations)?

Removed ExtractInt and kept only ExtractUptr.

prefix_end += internal_strlen(delimiter);

Isn't that just prefix_len?

No, this line moves the pointer *after* the delimiter (if one was found), I think we really need strlen(delimiter) here.
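To make the distinction concrete, here is a standalone sketch of the same logic written against the C++ standard library. `ExtractUpTo` is a hypothetical stand-in for ExtractTokenUpToDelimiter: it returns the token through a `std::string` instead of an InternalAlloc'ed buffer, but advances the pointer the same way.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for ExtractTokenUpToDelimiter. Copies everything
// before the first occurrence of `delimiter` into `*result` and returns a
// pointer just past the delimiter, or to the terminating '\0' if the
// delimiter does not occur in `str`.
const char *ExtractUpTo(const char *str, const char *delimiter,
                        std::string *result) {
  const char *found = std::strstr(str, delimiter);
  std::size_t prefix_len =
      found ? static_cast<std::size_t>(found - str) : std::strlen(str);
  result->assign(str, prefix_len);
  const char *prefix_end = str + prefix_len;
  // prefix_len is the length of the token; the pointer still has to skip
  // the delimiter itself, which is strlen(delimiter) characters long.
  if (*prefix_end != '\0') prefix_end += std::strlen(delimiter);
  return prefix_end;
}
```

For the input "myfunction (in library.dylib) (sourcefile.c:17)" and the delimiter " (in ", the token is "myfunction" (prefix_len is 10), and the returned pointer is advanced by a further 5 characters so that it points at "library.dylib) (sourcefile.c:17)".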

Wonder if we want/should check that exactly the required symbolizer is being used (e.g. print its name under verbosity=2)

Good idea. Added.

If you need to refactor some interfaces or move code around in order to implement Mac-specific symbolizers, let's start with small patches doing that, which introduce no behavior changes.

lib/sanitizer_common/sanitizer_common.h
75 ↗	(On Diff #17728)	I'd prefer to have these functions in sanitizer_symbolizer.(h|cc), as for now they are not used anywhere else.
lib/sanitizer_common/sanitizer_symbolizer.h
145 ↗	(On Diff #17728)	I don't like that some concrete classes (e.g. LibbacktraceSymbolizer) require that certain fields of the "stack" structure are filled by the caller.
154 ↗	(On Diff #17728)	This buffer has nothing to do with the interface, it is an implementation detail of specific concrete classes.
168 ↗	(On Diff #17728)	Please move implementation to .cc files.
lib/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc
159 ↗	(On Diff #17728)	This function is completely wrong. First of all, it should return bool now, not SymbolizedStack. Then, "SymbolizedStack" argument you pass to this function is leaked, so is its module_name. It seems that the "stack" argument is only required to pass in module_name/module_offset, and serves no actual purpose by itself. It means that the interface is broken.
lib/sanitizer_common/sanitizer_symbolizer_libbacktrace.h
35 ↗	(On Diff #17728)	add override keywords for method overloads please.
lib/sanitizer_common/sanitizer_symbolizer_mac.h
47	"new"?! No, please don't call system allocator in the symbolizer code.
lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
28 ↗	(On Diff #17728)	Does this header exist on all POSIX systems?
33 ↗	(On Diff #17728)	Does this header exist on all POSIX systems?
509 ↗	(On Diff #17728)	dladdr may not be available on some platforms. I'd suggest to use it on Mac only for now.
620 ↗	(On Diff #17728)	Wait, LibbacktraceSymbolizer and DladdrSymbolizer also implement SymbolizerInterface, why don't we return them here?
699 ↗	(On Diff #17728)	We have VReport for these purposes.

Thanks for the review, I'll extract a NFC refactoring patch, but first I'd like to ask you to clarify what you suggest in the following comments:

I don't like that some concrete classes (e.g. LibbacktraceSymbolizer) require that certain fields of the "stack" structure are filled by the caller.

I see. It's meant as both an input and output, and since it's created at a single place (in POSIXSymbolizer::SymbolizePC), it's always pre-filled with the address, module name and module offset. I did that to avoid passing these 3 values to all the function calls. Any better suggestions what the interface should look like?

Wait, LibbacktraceSymbolizer and DladdrSymbolizer also implement SymbolizerInterface, why don't we return them here?

I wanted to keep the current behavior, which is a weird chain: On every symbolication request, LibbacktraceSymbolizer is called first, and if it fails, we try the "regular" symbolizer, which is either internal_symbolizer or external_symbolizer. I also wanted to add the DladdrSymbolizer into that chain, so it's only used when a "regular" symbolizer failed. The reason is that we only realize that we cannot spawn an external symbolizer after we try to at least once.
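A rough sketch of that chain, under simplified assumptions: `ChainTool`, `ChainFrame`, `UnavailableTool`, `NameOnlyTool` and `SymbolizeWithFallback` are hypothetical names, not the classes in the patch, and `std::vector` is used only to keep the example short. Each request walks the tools in order and stops at the first one that reports success, which is why a dladdr-style tool placed last is only consulted after the external symbolizer has actually been tried and has failed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ChainFrame {
  std::string function;
};

class ChainTool {
 public:
  virtual ~ChainTool() = default;
  // Returns true if the tool produced a result for `addr`.
  virtual bool SymbolizePC(std::uintptr_t addr, ChainFrame *frame) = 0;
};

// Stands in for an external symbolizer that could not be spawned.
class UnavailableTool : public ChainTool {
 public:
  bool SymbolizePC(std::uintptr_t, ChainFrame *) override {
    ++attempts;
    return false;
  }
  int attempts = 0;
};

// Stands in for a dladdr-style tool that yields a symbol name only.
class NameOnlyTool : public ChainTool {
 public:
  bool SymbolizePC(std::uintptr_t, ChainFrame *frame) override {
    frame->function = "some_function";
    return true;
  }
};

// Tries each tool in order; the first one that succeeds wins.
bool SymbolizeWithFallback(const std::vector<ChainTool *> &chain,
                           std::uintptr_t addr, ChainFrame *frame) {
  for (ChainTool *tool : chain) {
    if (tool->SymbolizePC(addr, frame)) return true;
  }
  return false;
}
```

Putting the name-only tool at the end of the vector reproduces the behavior described above: a sandboxed process still gets symbol names, and a process that can fork never reaches the fallback.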

You're right that the change to LibbacktraceSymbolizer was completely broken, sorry about that.

A refactoring patch (http://reviews.llvm.org/D7827) is now extracted. Updating this patch to be an implementation of AtosSymbolizer and DlAddrSymbolizer on top of the refactorings.

Looks like the symbolizer refactoring is making some progress, so I'm updating this patch (that adds AtosSymbolizer and DlAddrSymbolizer) so it can be applied cleanly. I'm not adding these new symbolizer tools into the chain yet, but having them committed would be convenient for me. Alexey, what do you think? Or should I first finish all the refactoring?

ping

samsonov added inline comments. Mar 10 2015, 12:28 PM

lib/sanitizer_common/sanitizer_symbolizer_mac.cc
58	I don't like that this function essentially copies so much of the base class code. Do you have a good understanding of why we need to use one code to run llvm-symbolizer on Mac and different code (forkpty etc.) to run atos on Mac? Can we instead make certain bits of StartSymbolizerProcess() platform-specific?
67	Why do you need this?
116	no need to fill with zeroes, snprintf should do the right thing.
126	static?
138	factor out these constants and check in a helper function.
170	Wait, shouldn't AtosSymbolizerProcess be created via InternalAllocator?
178	Why do you need this here? If you want to avoid printing multiple error messages, you can just drop the reference to process_
188	See above - I don't think we need to add any implementation for non-Mac platforms.
lib/sanitizer_common/sanitizer_symbolizer_mac.h
2	Add this file to CMakeLists.txt
19	Do you really need the declaration of DlAddrSymbolizer and AtosSymbolizer on non-Mac? Maybe, we could instead protect the whole .h and .cc file with #if SANITIZER_MAC directive? E.g. you only provide AtosSymbolizerProcess under SANITIZER_MAC anyway.

Do you have a good understanding of why we need to use one code to run llvm-symbolizer on Mac and different code (forkpty etc.) to run atos on Mac? Can we instead make certain bits of StartSymbolizerProcess() platform-specific?

We use forkpty because atos doesn't flush its output fd after it gives a response. Regular pipes therefore don't work here, because the response gets buffered within the C library. To disable this, we make a new pseudo-terminal which makes all output from atos unbuffered. This is not a problem for llvm-symbolizer, because it does an outs().flush() after each response.

Addressing review comments. Moved the forkpty part into SymbolizerProcess under a use_forkpty option. Protected the .h and .cc files with #if SANITIZER_MAC.

LGTM after addressing comments below. Thanks!

lib/sanitizer_common/sanitizer_symbolizer_mac.cc
74	I believe we have ARRAY_SIZE macro for that.
97	InternalFree(trim);
131	You can remove extra temp variable: if (!ParseCommandOutput(buf, stack)) { process_ = nullptr; return false; } return true; Add a comment describing why you discard the process in this case.

This revision is now accepted and ready to land. Mar 11 2015, 11:54 AM

Landed in r232026.

Revision Contents

Path	Size
lib/sanitizer_common/CMakeLists.txt	2 lines
lib/sanitizer_common/sanitizer_symbolizer_internal.h	5 lines
lib/sanitizer_common/sanitizer_symbolizer_libcdep.cc	13 lines
lib/sanitizer_common/sanitizer_symbolizer_mac.h	48 lines
lib/sanitizer_common/sanitizer_symbolizer_mac.cc	143 lines
lib/sanitizer_common/sanitizer_symbolizer_process_libcdep.cc	152 lines
lib/sanitizer_common/tests/sanitizer_symbolizer_test.cc	9 lines

Diff 21725

lib/sanitizer_common/CMakeLists.txt

[... 21 lines not shown ...]	set(SANITIZER_SOURCES
sanitizer_procmaps_linux.cc		sanitizer_procmaps_linux.cc
sanitizer_procmaps_mac.cc		sanitizer_procmaps_mac.cc
sanitizer_stackdepot.cc		sanitizer_stackdepot.cc
sanitizer_stacktrace.cc		sanitizer_stacktrace.cc
sanitizer_stacktrace_printer.cc		sanitizer_stacktrace_printer.cc
sanitizer_suppressions.cc		sanitizer_suppressions.cc
sanitizer_symbolizer.cc		sanitizer_symbolizer.cc
sanitizer_symbolizer_libbacktrace.cc		sanitizer_symbolizer_libbacktrace.cc
		sanitizer_symbolizer_mac.cc
sanitizer_symbolizer_win.cc		sanitizer_symbolizer_win.cc
sanitizer_tls_get_addr.cc		sanitizer_tls_get_addr.cc
sanitizer_thread_registry.cc		sanitizer_thread_registry.cc
sanitizer_win.cc)		sanitizer_win.cc)

set(SANITIZER_LIBCDEP_SOURCES		set(SANITIZER_LIBCDEP_SOURCES
sanitizer_common_libcdep.cc		sanitizer_common_libcdep.cc
sanitizer_coverage_libcdep.cc		sanitizer_coverage_libcdep.cc
sanitizer_coverage_mapping_libcdep.cc		sanitizer_coverage_mapping_libcdep.cc
sanitizer_linux_libcdep.cc		sanitizer_linux_libcdep.cc
[... 49 lines not shown ...]	set(SANITIZER_HEADERS
sanitizer_stackdepotbase.h		sanitizer_stackdepotbase.h
sanitizer_stacktrace.h		sanitizer_stacktrace.h
sanitizer_stacktrace_printer.h		sanitizer_stacktrace_printer.h
sanitizer_stoptheworld.h		sanitizer_stoptheworld.h
sanitizer_suppressions.h		sanitizer_suppressions.h
sanitizer_symbolizer.h		sanitizer_symbolizer.h
sanitizer_symbolizer_internal.h		sanitizer_symbolizer_internal.h
sanitizer_symbolizer_libbacktrace.h		sanitizer_symbolizer_libbacktrace.h
		sanitizer_symbolizer_mac.h
sanitizer_symbolizer_win.h		sanitizer_symbolizer_win.h
sanitizer_syscall_generic.inc		sanitizer_syscall_generic.inc
sanitizer_syscall_linux_x86_64.inc		sanitizer_syscall_linux_x86_64.inc
sanitizer_thread_registry.h)		sanitizer_thread_registry.h)

set(SANITIZER_COMMON_DEFINITIONS)		set(SANITIZER_COMMON_DEFINITIONS)

if(MSVC)		if(MSVC)
[... 53 lines not shown ...]

lib/sanitizer_common/sanitizer_symbolizer_internal.h

[... 19 lines not shown ...]

// Parsing helpers, 'str' is searched for delimiter(s) and a string or uptr		// Parsing helpers, 'str' is searched for delimiter(s) and a string or uptr
// is extracted. When extracting a string, a newly allocated (using		// is extracted. When extracting a string, a newly allocated (using
// InternalAlloc) and null-terminated buffer is returned. They return a pointer		// InternalAlloc) and null-terminated buffer is returned. They return a pointer
// to the next character after the found delimiter.		// to the next character after the found delimiter.
const char *ExtractToken(const char *str, const char *delims, char **result);		const char *ExtractToken(const char *str, const char *delims, char **result);
const char *ExtractInt(const char *str, const char *delims, int *result);		const char *ExtractInt(const char *str, const char *delims, int *result);
const char *ExtractUptr(const char *str, const char *delims, uptr *result);		const char *ExtractUptr(const char *str, const char *delims, uptr *result);
		const char *ExtractTokenUpToDelimiter(const char *str, const char *delimiter,
		char **result);

// SymbolizerTool is an interface that is implemented by individual "tools"		// SymbolizerTool is an interface that is implemented by individual "tools"
// that can perform symbolication (external llvm-symbolizer, libbacktrace,		// that can perform symbolication (external llvm-symbolizer, libbacktrace,
// Windows DbgHelp symbolizer, etc.).		// Windows DbgHelp symbolizer, etc.).
class SymbolizerTool {		class SymbolizerTool {
public:		public:
// The main |Symbolizer| class implements a "fallback chain" of symbolizer		// The main |Symbolizer| class implements a "fallback chain" of symbolizer
// tools. In a request to symbolize an address, if one tool returns false,		// tools. In a request to symbolize an address, if one tool returns false,
[... 26 lines not shown ...]	public:
}		}
};		};

// SymbolizerProcess encapsulates communication between the tool and		// SymbolizerProcess encapsulates communication between the tool and
// external symbolizer program, running in a different subprocess.		// external symbolizer program, running in a different subprocess.
// SymbolizerProcess may not be used from two threads simultaneously.		// SymbolizerProcess may not be used from two threads simultaneously.
class SymbolizerProcess {		class SymbolizerProcess {
public:		public:
explicit SymbolizerProcess(const char *path);		explicit SymbolizerProcess(const char *path, bool use_forkpty = false);
const char *SendCommand(const char *command);		const char *SendCommand(const char *command);

private:		private:
bool Restart();		bool Restart();
const char *SendCommandImpl(const char *command);		const char *SendCommandImpl(const char *command);
bool ReadFromSymbolizer(char *buffer, uptr max_length);		bool ReadFromSymbolizer(char *buffer, uptr max_length);
bool WriteToSymbolizer(const char *buffer, uptr length);		bool WriteToSymbolizer(const char *buffer, uptr length);
bool StartSymbolizerSubprocess();		bool StartSymbolizerSubprocess();
[... 13 lines not shown ...]	private:
static const uptr kBufferSize = 16 * 1024;		static const uptr kBufferSize = 16 * 1024;
char buffer_[kBufferSize];		char buffer_[kBufferSize];

static const uptr kMaxTimesRestarted = 5;		static const uptr kMaxTimesRestarted = 5;
static const int kSymbolizerStartupTimeMillis = 10;		static const int kSymbolizerStartupTimeMillis = 10;
uptr times_restarted_;		uptr times_restarted_;
bool failed_to_start_;		bool failed_to_start_;
bool reported_invalid_path_;		bool reported_invalid_path_;
		bool use_forkpty_;
};		};

} // namespace __sanitizer		} // namespace __sanitizer

#endif // SANITIZER_SYMBOLIZER_INTERNAL_H		#endif // SANITIZER_SYMBOLIZER_INTERNAL_H

lib/sanitizer_common/sanitizer_symbolizer_libcdep.cc

[... 41 lines not shown ...]	const char *ExtractUptr(const char *str, const char *delims, uptr *result) {
const char *ret = ExtractToken(str, delims, &buff);		const char *ret = ExtractToken(str, delims, &buff);
if (buff != 0) {		if (buff != 0) {
*result = (uptr)internal_atoll(buff);		*result = (uptr)internal_atoll(buff);
}		}
InternalFree(buff);		InternalFree(buff);
return ret;		return ret;
}		}

		const char *ExtractTokenUpToDelimiter(const char *str, const char *delimiter,
		char **result) {
		const char *found_delimiter = internal_strstr(str, delimiter);
		uptr prefix_len =
		found_delimiter ? found_delimiter - str : internal_strlen(str);
		*result = (char *)InternalAlloc(prefix_len + 1);
		internal_memcpy(*result, str, prefix_len);
		(*result)[prefix_len] = '\0';
		const char *prefix_end = str + prefix_len;
		if (*prefix_end != '\0') prefix_end += internal_strlen(delimiter);
		return prefix_end;
		}

Symbolizer *Symbolizer::GetOrInit() {		Symbolizer *Symbolizer::GetOrInit() {
SpinMutexLock l(&init_mu_);		SpinMutexLock l(&init_mu_);
if (symbolizer_)		if (symbolizer_)
return symbolizer_;		return symbolizer_;
symbolizer_ = PlatformInit();		symbolizer_ = PlatformInit();
CHECK(symbolizer_);		CHECK(symbolizer_);
return symbolizer_;		return symbolizer_;
}		}

} // namespace __sanitizer		} // namespace __sanitizer

lib/sanitizer_common/sanitizer_symbolizer_mac.h

				//===-- sanitizer_symbolizer_mac.h ------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is shared between various sanitizers' runtime libraries.
				//
				// Header for Mac-specific "atos" symbolizer.
				//===----------------------------------------------------------------------===//

				#ifndef SANITIZER_SYMBOLIZER_MAC_H
				#define SANITIZER_SYMBOLIZER_MAC_H

				#include "sanitizer_platform.h"
				#if SANITIZER_MAC

				#include "sanitizer_symbolizer_internal.h"

				namespace __sanitizer {

				class DlAddrSymbolizer : public SymbolizerTool {
				public:
				bool SymbolizePC(uptr addr, SymbolizedStack *stack) override;
				bool SymbolizeData(uptr addr, DataInfo *info) override;
				};

				class AtosSymbolizerProcess;

				class AtosSymbolizer : public SymbolizerTool {
				public:
				explicit AtosSymbolizer(const char *path, LowLevelAllocator *allocator);

				bool SymbolizePC(uptr addr, SymbolizedStack *stack) override;
				bool SymbolizeData(uptr addr, DataInfo *info) override;

				private:
				AtosSymbolizerProcess *process_;
				};

				} // namespace __sanitizer

				#endif // SANITIZER_MAC

				#endif // SANITIZER_SYMBOLIZER_MAC_H

lib/sanitizer_common/sanitizer_symbolizer_mac.cc

				//===-- sanitizer_symbolizer_mac.cc ---------------------------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is shared between various sanitizers' runtime libraries.
				//
				// Implementation of Mac-specific "atos" symbolizer.
				//===----------------------------------------------------------------------===//

				#include "sanitizer_platform.h"
				#if SANITIZER_MAC

				#include "sanitizer_allocator_internal.h"
				#include "sanitizer_symbolizer_mac.h"

				namespace __sanitizer {

				#include <dlfcn.h>
				#include <errno.h>
				#include <stdlib.h>
				#include <sys/wait.h>
				#include <unistd.h>
				#include <util.h>

				bool DlAddrSymbolizer::SymbolizePC(uptr addr, SymbolizedStack *stack) {
				Dl_info info;
				int result = dladdr((const void *)addr, &info);
				if (!result) return false;
				stack->info.function = internal_strdup(info.dli_sname);
				return true;
				}

				bool DlAddrSymbolizer::SymbolizeData(uptr addr, DataInfo *info) {
				return false;
				}

				class AtosSymbolizerProcess : public SymbolizerProcess {
				public:
				explicit AtosSymbolizerProcess(const char *path, pid_t parent_pid)
				: SymbolizerProcess(path, /*use_forkpty*/ true),
				parent_pid_(parent_pid) {}

				private:
				bool ReachedEndOfOutput(const char *buffer, uptr length) const override {
				return (length >= 1 && buffer[length - 1] == '\n');
				}

				void ExecuteWithDefaultArgs(const char *path_to_binary) const override {
				// The `atos` binary has some issues with DYLD_ROOT_PATH on i386.
				unsetenv("DYLD_ROOT_PATH");

				char pid_str[16];
				internal_snprintf(pid_str, sizeof(pid_str), "%d", parent_pid_);
				execl(path_to_binary, path_to_binary, "-p", pid_str, (char *)0);
				}

				pid_t parent_pid_;
				};

				static const char *kAtosErrorMessages[] = {
				"atos cannot examine process",
				"unable to get permission to examine process",
				"An admin user name and password is required",
				"could not load inserted library",
				"architecture mismatch between analysis process",
				};

				static bool IsAtosErrorMessage(const char *str) {
				int n = sizeof(kAtosErrorMessages) / sizeof(kAtosErrorMessages[0]);
				for (int i = 0; i < n; i++) {
				if (internal_strstr(str, kAtosErrorMessages[i])) {
				return true;
				}
				}
				return false;
				}

				static bool ParseCommandOutput(const char *str, SymbolizedStack *res) {
				// Trim ending newlines.
				char *trim;
				ExtractTokenUpToDelimiter(str, "\n", &trim);

				// The line from `atos` is in one of these formats:
				// myfunction (in library.dylib) (sourcefile.c:17)
				// myfunction (in library.dylib) + 0x1fe
				// 0xdeadbeef (in library.dylib) + 0x1fe
				// 0xdeadbeef (in library.dylib)
				// 0xdeadbeef

				if (IsAtosErrorMessage(trim)) {
				samsonov: Note: you need to use internal_iserror() for write/read syscalls. If we don't have them for existing code, this is most likely a bug.
				Report("atos returned an error: %s\n", trim);
				return false;
				samsonov: InternalFree(trim);
				}

				const char *rest = trim;
				char *function_name;
				rest = ExtractTokenUpToDelimiter(rest, " (in ", &function_name);
				if (internal_strncmp(function_name, "0x", 2) != 0)
				res->info.function = function_name;
				glider: I've mixed feelings about internal_read() returning a uptr, but don't think we need to clean it up in this CL.
				else
				InternalFree(function_name);
				rest = ExtractTokenUpToDelimiter(rest, ") ", &res->info.module);

				if (rest[0] == '(') {
				rest++;
				rest = ExtractTokenUpToDelimiter(rest, ":", &res->info.file);
				char *extracted_line_number;
				rest = ExtractTokenUpToDelimiter(rest, ")", &extracted_line_number);
				res->info.line = internal_atoll(extracted_line_number);
				InternalFree(extracted_line_number);
				}
				samsonov: No need to fill with zeroes, snprintf should do the right thing.

				InternalFree(trim);
				return true;
				}

				AtosSymbolizer::AtosSymbolizer(const char *path, LowLevelAllocator *allocator)
				: process_(new(*allocator) AtosSymbolizerProcess(path, getpid())) {}

				bool AtosSymbolizer::SymbolizePC(uptr addr, SymbolizedStack *stack) {
				if (!process_) return false;
				samsonov: static?
				char command[32];
				internal_snprintf(command, sizeof(command), "0x%zx\n", addr);
				const char *buf = process_->SendCommand(command);
				if (!buf) return false;
				bool result = ParseCommandOutput(buf, stack);
				samsonov: You can remove the extra temp variable: if (!ParseCommandOutput(buf, stack)) { process_ = nullptr; return false; } return true; Add a comment describing why you discard the process in this case.
				if (!result) {
				process_ = nullptr;
				return false;
				}
				return true;
				}

				samsonov: Factor out these constants and check in a helper function.
				bool AtosSymbolizer::SymbolizeData(uptr addr, DataInfo *info) { return false; }

				} // namespace __sanitizer

				#endif // SANITIZER_MAC
				samsonov: Why do you need this here? If you want to avoid printing multiple error messages, you can just drop the reference to process_.
				samsonov: Wait, shouldn't AtosSymbolizerProcess be created via InternalAllocator?
				samsonov: See above. I don't think we need to add any implementation for non-Mac platforms.

lib/sanitizer_common/sanitizer_symbolizer_process_libcdep.cc

#if SANITIZER_POSIX		#if SANITIZER_POSIX
#include "sanitizer_symbolizer_internal.h"		#include "sanitizer_symbolizer_internal.h"

#include <errno.h>		#include <errno.h>
#include <stdlib.h>		#include <stdlib.h>
#include <sys/wait.h>		#include <sys/wait.h>
#include <unistd.h>		#include <unistd.h>

		#if SANITIZER_MAC
		#include <util.h> // for forkpty()
		#endif // SANITIZER_MAC

namespace __sanitizer {		namespace __sanitizer {

SymbolizerProcess::SymbolizerProcess(const char *path)		SymbolizerProcess::SymbolizerProcess(const char *path, bool use_forkpty)
: path_(path),		: path_(path),
input_fd_(kInvalidFd),		input_fd_(kInvalidFd),
output_fd_(kInvalidFd),		output_fd_(kInvalidFd),
times_restarted_(0),		times_restarted_(0),
failed_to_start_(false),		failed_to_start_(false),
reported_invalid_path_(false) {		reported_invalid_path_(false),
		use_forkpty_(use_forkpty) {
CHECK(path_);		CHECK(path_);
CHECK_NE(path_[0], '\0');		CHECK_NE(path_[0], '\0');
}		}

const char *SymbolizerProcess::SendCommand(const char *command) {		const char *SymbolizerProcess::SendCommand(const char *command) {
for (; times_restarted_ < kMaxTimesRestarted; times_restarted_++) {		for (; times_restarted_ < kMaxTimesRestarted; times_restarted_++) {
// Start or restart symbolizer if we failed to send command to it.		// Start or restart symbolizer if we failed to send command to it.
if (const char *res = SendCommandImpl(command))		if (const char *res = SendCommandImpl(command))
bool SymbolizerProcess::StartSymbolizerSubprocess() {		bool SymbolizerProcess::StartSymbolizerSubprocess() {
if (!FileExists(path_)) {		if (!FileExists(path_)) {
if (!reported_invalid_path_) {		if (!reported_invalid_path_) {
Report("WARNING: invalid path to external symbolizer!\n");		Report("WARNING: invalid path to external symbolizer!\n");
reported_invalid_path_ = true;		reported_invalid_path_ = true;
}		}
return false;		return false;
}		}

		int pid;
		if (use_forkpty_) {
		#if SANITIZER_MAC
		fd_t fd = kInvalidFd;
		// Use forkpty to disable buffering in the new terminal.
		pid = forkpty(&fd, 0, 0, 0);
		if (pid == -1) {
		// forkpty() failed.
		Report("WARNING: failed to fork external symbolizer (errno: %d)\n",
		errno);
		return false;
		} else if (pid == 0) {
		// Child subprocess.
		ExecuteWithDefaultArgs(path_);
		internal__exit(1);
		}

		// Continue execution in parent process.
		input_fd_ = output_fd_ = fd;

		// Disable echo in the new terminal, disable CR.
		struct termios termflags;
		tcgetattr(fd, &termflags);
		termflags.c_oflag &= ~ONLCR;
		termflags.c_lflag &= ~ECHO;
		tcsetattr(fd, TCSANOW, &termflags);
		#else // SANITIZER_MAC
		UNIMPLEMENTED();
		#endif // SANITIZER_MAC
		} else {
int *infd = NULL;		int *infd = NULL;
int *outfd = NULL;		int *outfd = NULL;
// The client program may close its stdin and/or stdout and/or stderr		// The client program may close its stdin and/or stdout and/or stderr
// thus allowing socketpair to reuse file descriptors 0, 1 or 2.		// thus allowing socketpair to reuse file descriptors 0, 1 or 2.
// In this case the communication between the forked processes may be		// In this case the communication between the forked processes may be
// broken if either the parent or the child tries to close or duplicate		// broken if either the parent or the child tries to close or duplicate
// these descriptors. The loop below produces two pairs of file		// these descriptors. The loop below produces two pairs of file
// descriptors, each greater than 2 (stderr).		// descriptors, each greater than 2 (stderr).
int sock_pair[5][2];		int sock_pair[5][2];
for (int i = 0; i < 5; i++) {		for (int i = 0; i < 5; i++) {
if (pipe(sock_pair[i]) == -1) {		if (pipe(sock_pair[i]) == -1) {
for (int j = 0; j < i; j++) {		for (int j = 0; j < i; j++) {
internal_close(sock_pair[j][0]);		internal_close(sock_pair[j][0]);
internal_close(sock_pair[j][1]);		internal_close(sock_pair[j][1]);
}		}
Report("WARNING: Can't create a socket pair to start "		Report("WARNING: Can't create a socket pair to start "
"external symbolizer (errno: %d)\n", errno);		"external symbolizer (errno: %d)\n", errno);
return false;		return false;
} else if (sock_pair[i][0] > 2 && sock_pair[i][1] > 2) {		} else if (sock_pair[i][0] > 2 && sock_pair[i][1] > 2) {
if (infd == NULL) {		if (infd == NULL) {
infd = sock_pair[i];		infd = sock_pair[i];
} else {		} else {
outfd = sock_pair[i];		outfd = sock_pair[i];
for (int j = 0; j < i; j++) {		for (int j = 0; j < i; j++) {
if (sock_pair[j] == infd) continue;		if (sock_pair[j] == infd) continue;
internal_close(sock_pair[j][0]);		internal_close(sock_pair[j][0]);
internal_close(sock_pair[j][1]);		internal_close(sock_pair[j][1]);
}		}
break;		break;
}		}
}		}
}		}
CHECK(infd);		CHECK(infd);
CHECK(outfd);		CHECK(outfd);

// Real fork() may call user callbacks registered with pthread_atfork().		// Real fork() may call user callbacks registered with pthread_atfork().
int pid = internal_fork();		pid = internal_fork();
if (pid == -1) {		if (pid == -1) {
// Fork() failed.		// Fork() failed.
internal_close(infd[0]);		internal_close(infd[0]);
internal_close(infd[1]);		internal_close(infd[1]);
internal_close(outfd[0]);		internal_close(outfd[0]);
internal_close(outfd[1]);		internal_close(outfd[1]);
Report("WARNING: failed to fork external symbolizer "		Report("WARNING: failed to fork external symbolizer "
" (errno: %d)\n", errno);		" (errno: %d)\n", errno);
return false;		return false;
} else if (pid == 0) {		} else if (pid == 0) {
// Child subprocess.		// Child subprocess.
internal_close(STDOUT_FILENO);		internal_close(STDOUT_FILENO);
internal_close(STDIN_FILENO);		internal_close(STDIN_FILENO);
internal_dup2(outfd[0], STDIN_FILENO);		internal_dup2(outfd[0], STDIN_FILENO);
internal_dup2(infd[1], STDOUT_FILENO);		internal_dup2(infd[1], STDOUT_FILENO);
internal_close(outfd[0]);		internal_close(outfd[0]);
internal_close(outfd[1]);		internal_close(outfd[1]);
internal_close(infd[0]);		internal_close(infd[0]);
internal_close(infd[1]);		internal_close(infd[1]);
for (int fd = sysconf(_SC_OPEN_MAX); fd > 2; fd--)		for (int fd = sysconf(_SC_OPEN_MAX); fd > 2; fd--)
internal_close(fd);		internal_close(fd);
ExecuteWithDefaultArgs(path_);		ExecuteWithDefaultArgs(path_);
internal__exit(1);		internal__exit(1);
}		}

// Continue execution in parent process.		// Continue execution in parent process.
internal_close(outfd[0]);		internal_close(outfd[0]);
internal_close(infd[1]);		internal_close(infd[1]);
input_fd_ = infd[0];		input_fd_ = infd[0];
output_fd_ = outfd[1];		output_fd_ = outfd[1];
		}

// Check that symbolizer subprocess started successfully.		// Check that symbolizer subprocess started successfully.
int pid_status;		int pid_status;
SleepForMillis(kSymbolizerStartupTimeMillis);		SleepForMillis(kSymbolizerStartupTimeMillis);
int exited_pid = waitpid(pid, &pid_status, WNOHANG);		int exited_pid = waitpid(pid, &pid_status, WNOHANG);
if (exited_pid != 0) {		if (exited_pid != 0) {
// Either waitpid failed, or child has already exited.		// Either waitpid failed, or child has already exited.
Report("WARNING: external symbolizer didn't start up correctly!\n");		Report("WARNING: external symbolizer didn't start up correctly!\n");

lib/sanitizer_common/tests/sanitizer_symbolizer_test.cc


	TEST(Symbolizer, ExtractUptr) {			TEST(Symbolizer, ExtractUptr) {
	uptr token;			uptr token;
	const char *rest = ExtractUptr("123,456;789", ";,", &token);			const char *rest = ExtractUptr("123,456;789", ";,", &token);
	EXPECT_EQ(123U, token);			EXPECT_EQ(123U, token);
	EXPECT_STREQ("456;789", rest);			EXPECT_STREQ("456;789", rest);
	}			}

				TEST(Symbolizer, ExtractTokenUpToDelimiter) {
				char *token;
				const char *rest =
				ExtractTokenUpToDelimiter("aaa-+-bbb-+-ccc", "-+-", &token);
				EXPECT_STREQ("aaa", token);
				EXPECT_STREQ("bbb-+-ccc", rest);
				InternalFree(token);
				}

	} // namespace __sanitizer			} // namespace __sanitizer