This is an archive of the discontinued LLVM Phabricator instance.

[asan] Search for llvm-symbolizer in the executable's directory instead of only searching in PATH.
AbandonedPublic

Authored by akhuang on Dec 10 2020, 2:09 PM.

Details

Reviewers
rnk
vitalybuka
Summary

Currenly asan only looks for llvm-symbolizer in PATH and ASAN_OPTIONS,
but it seems like it would make sense to also look in the same directory as the
executable.

For example, in Chromium, the easiest way to have asan tests use llvm-symbolizer
is to specify the path in ASAN_OPTIONS. It'd be convenient to have asan just find
the symbolizer if it's in the same directory.

Diff Detail

Event Timeline

akhuang requested review of this revision.Dec 10 2020, 2:09 PM
akhuang created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptDec 10 2020, 2:09 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
rnk added a comment.Dec 10 2020, 2:27 PM

For example, in Chromium, the easiest way to have asan tests use llvm-symbolizer
is to specify the path in ASAN_OPTIONS. It'd be convenient to have asan just find
the symbolizer if it's in the same directory.

I think it's worth pointing out that, for developers that don't put LLVM's bin directory on their PATH, it makes LLVM binary ASan report symbolization work out the box. Other projects, such as Chromium, can achieve the same effect by copying llvm-symbolizer into their build directories.

It's also consistent with what LLVM's crash handler does:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Support/Signals.cpp#L128

compiler-rt/lib/sanitizer_common/sanitizer_file.cpp
208

This can be expensive on Linux, where it means reading /proc/self/exe. I'd consider ReadBinaryNameCached.

compiler-rt/lib/sanitizer_common/sanitizer_win.cpp
1050

It seems worth noting that this function is documented to always null terminate the output string.

1054–1056

According to the documentation, the output is always null terminated if you include the null terminator in the input (the +1 here). So, I think you can remove the second = '\0'.

I am not sure about this change. For the most of users it's very unlikely to have llvm-symbolizer next to the tested binary. Looks like this just simplifies few tests. I'd rather use usual approach with ASAN_OPTIONS there.

rnk added a comment.Dec 11 2020, 10:00 AM

I think there is a huge amount of value in making asan binaries produce symbolized reports out of the box without having to set any environment variables or wrap the binaries in a script. I'm open to any solution that achieves that, but most users aren't going to read the docs and remember to set ASAN_OPTIONS before running a chrome test binary. Copying (or linking) llvm-symbolizer into the build directory is something reliable that any build system with asan support can do.

One alternative idea we had was to bake this ASAN_OPTION into chrome binaries:

external_symbolizer_path="../../llvm-build/Release+Asserts/bin/llvm-symbolizer.exe"

But this only works so long as the user's CWD is the build directory. Absolute paths aren't allowed for build hermeticity reasons.

Any other suggestions for making this work out of the box?

  1. Find llvm-symbolizer in /usr/bin, same as addr2line before?
  2. Support something like $ORIGIN substitution in llvm_symbolizer_path.

I'm a bit uneasy about running random binaries from the current directory. Consider for example if I download an asan binary and run it from the chrome downloads directory - I don't necessarily trust the llvm-symbolizer binary in the same directory.

rnk added a comment.Dec 11 2020, 11:07 AM
  1. Find llvm-symbolizer in /usr/bin, same as addr2line before?
  2. Support something like $ORIGIN substitution in llvm_symbolizer_path.

I'm a bit uneasy about running random binaries from the current directory. Consider for example if I download an asan binary and run it from the chrome downloads directory - I don't necessarily trust the llvm-symbolizer binary in the same directory.

Fair enough, but ASan today will check if the program exists in the CWD of the program before doing a PATH search.

I'm not that keen on 1, since there's no equivalent to /usr/bin/ on Windows, it doesn't really solve our problem. 2, a metavar like $ORIGIN, would solve the issue. Or a special ASAN_OPTION, similar to external_symbolizer_path.

Amy, do you want to run with the $ORIGIN idea?

compiler-rt/lib/sanitizer_common/sanitizer_file.cpp
182

Note that asan first checks if the program exists in the CWD here.

Sure, I'm not really familiar with $ORIGIN - how does it work?

Can we embed clang binary location when compiling the program?

rnk added a comment.Dec 14 2020, 10:45 AM

Sure, I'm not really familiar with $ORIGIN - how does it work?

You can search for "rpath origin" for background on the feature. The basic idea is that it's a special variable that expands to the directory containing the binary using the variable.

In Chromium, we'd use it by adding external_symbolizer_path=$ORIGIN/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer(.exe) to the __asan_default_option strings here:
https://source.chromium.org/chromium/chromium/src/+/master:build/sanitizers/sanitizer_options.cc;l=31?q=asan_default_options&ss=chromium

Can we embed clang binary location when compiling the program?

We used to embed paths to clang runtime libraries derived from the path to the clang binary, but it was error prone (https://crbug.com/989372). We removed them here in D65543. If the user (or distributed build system in our case) is not careful and invokes the compiler with an absolute path, any paths derived from the path to the compiler will become absolute, and potentially wrong in a distributed context.

vitalybuka resigned from this revision.Dec 14 2020, 3:55 PM

I don't have strong opinion

akhuang abandoned this revision.Feb 25 2021, 10:21 AM