This is an archive of the discontinued LLVM Phabricator instance.

[ASan] Add process basename to log name and error message to simplify analysis of sanitized systems logs.
ClosedPublic

Authored by ygribov on Feb 2 2015, 2:05 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

ygribov updated this revision to Diff 19139.Feb 2 2015, 2:05 AM
ygribov retitled this revision from to [ASan] Add process basename to log name and error message to simplify analysis of sanitized systems logs..
ygribov updated this object.
ygribov edited the test plan for this revision. (Show Details)
ygribov added reviewers: kcc, samsonov.
ygribov set the repository for this revision to rL LLVM.
ygribov added a subscriber: Unknown Object (MLST).
kcc edited edge metadata.Feb 19 2015, 5:00 PM

sorry for delay...

lib/sanitizer_common/sanitizer_common.cc
58 ↗(On Diff #19139)

What's "pname"? Can this be more descriptive?
Also, if this does not change over time, we should just precompute it at start up

299 ↗(On Diff #19139)

does binary name change over time?
If not, we should compute it at start up w/o any mutexes

lib/sanitizer_common/sanitizer_linux.cc
709 ↗(On Diff #19139)

I am lost here. What's being changed?

lib/sanitizer_common/sanitizer_printf.cc
257 ↗(On Diff #19139)

So, now everyone will have the module name in their logs by default?
Not good. Make it an option.

test/asan/lit.cfg
135 ↗(On Diff #19139)

I think this is an overkill. Just make the tests require linux/Freebsd

ygribov added inline comments.Feb 19 2015, 9:47 PM
lib/sanitizer_common/sanitizer_common.cc
58 ↗(On Diff #19139)

Ok.

299 ↗(On Diff #19139)

does binary name change over time?

You can mess with process name at least on some platforms. I'm not sure /proc/self/exe can be changed though.

If not, we should compute it at start up w/o any mutexes

You mean initialize it in _asan_init, _msan_init, etc.? Ok.

lib/sanitizer_common/sanitizer_linux.cc
709 ↗(On Diff #19139)

I removed the redundant 4.

lib/sanitizer_common/sanitizer_printf.cc
257 ↗(On Diff #19139)

I could but is there a good reason not to make this default (besides legacy scripts)?

kcc added a comment.Feb 20 2015, 8:42 AM

You mean initialize it in _asan_init, _msan_init, etc.? Ok.

Yes, create something like InitializeBinaryName() and GetBinaryName(), GetWhateverElseName() in sanitizer_common and call it in tool init routines.

I could but is there a good reason not to make this default (besides legacy scripts)?

legacy scripts is a good reason by itself, but more importantly, you are the first to request this in 4 years,
which means that most users don't need it. If they don't need it, extra bloat in the report will be a distraction.
If we see that the feature is useful we can always set the flag on by default.

but more importantly, you are the first to request this in 4 years,

That's probably because not many people sanitize full distributions.

ygribov updated this revision to Diff 20738.Feb 26 2015, 2:34 AM
ygribov edited edge metadata.

Fixed based on Kostya's feedback.

kcc added inline comments.Mar 2 2015, 7:01 PM
lib/asan/asan_rtl.cc
311 ↗(On Diff #20738)

empty line?

314 ↗(On Diff #20738)

do you really need to call this twice?

lib/sanitizer_common/sanitizer_common.cc
357 ↗(On Diff #20738)

So why do you still need the Mutex if you are calling this only once at init time?
Just add an asssertion that you've never initialized this before.
Then you also don't need binary_name_cache_initialized, you can use binary_name_cache_str[0],
which in turn could be function-scope because no one else needs it.

lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
647 ↗(On Diff #20738)

So, do we still need this here?

test/asan/TestCases/verbose-log-path_test.cc
9 ↗(On Diff #20738)

maybe just put it inside test/asan/TestCases/Linux?

16 ↗(On Diff #20738)

this test could be smaller, e.g.
int glob[10];
int main(int argc, char **argv) {

return glob[argc * 10];

}

ygribov added inline comments.Mar 3 2015, 1:05 PM
lib/asan/asan_rtl.cc
314 ↗(On Diff #20738)

Yes, just like AsanCheckIncompatibleRT is called twice in two types of initializers that we have (__asan_init and AsanInitializer). But - I should move it to AsanInitializer::AsanInitializer().

lib/sanitizer_common/sanitizer_common.cc
357 ↗(On Diff #20738)

I thought about that but I decided to keep CacheBinaryName more generic so that it could be used in other contexts besides pure program startup. I could remove this if you think it's unnecessary.

lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
647 ↗(On Diff #20738)

I don't know the whole sandboxing thing well enough so I kept it.

test/asan/TestCases/verbose-log-path_test.cc
9 ↗(On Diff #20738)

Well, it's absolutely trivial to support on other platforms so I thought it makes sense to simply xfail.

16 ↗(On Diff #20738)

That's a blind copy-paste of similar log_path test without log_exe_name.

kcc added inline comments.Mar 3 2015, 4:05 PM
lib/sanitizer_common/sanitizer_common.cc
357 ↗(On Diff #20738)

Yes, I'd prefer to simplify it and make sure it crashes if called twice, or while other threads are running (or at least after the init is done)

lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
647 ↗(On Diff #20738)

This should be safe, especially if the tests pass :)
Also, if you ensure that CacheBinaryName asserts on the 2-nd call, this call site will stop working.

test/asan/TestCases/verbose-log-path_test.cc
16 ↗(On Diff #20738)

Sure. Please make this one simpler.

kcc added a reviewer: earthdok.Mar 3 2015, 4:07 PM

adding one more reviewer in a time zone more suitable for you.

ygribov updated this revision to Diff 21748.Mar 11 2015, 11:17 AM

Updated based on previous review. Unfortunately I was unable to prohibit against multiple calls to CheckBinaryName because both UBSan and ASan call it.

kcc added inline comments.Mar 11 2015, 3:24 PM
lib/sanitizer_common/sanitizer_common.cc
347 ↗(On Diff #21748)

initialize to nullptr, so that if we forget to call CacheBinaryName we instantly crash

360 ↗(On Diff #21748)

This will not catch a situation where someone calls CacheBinaryName() twice, which should be illegal.
Instead, do a hard check that this is the first call.

364 ↗(On Diff #21748)

You can probably implement this thing w/o an extra static bool, just use binary_basename_cache_str

ygribov added inline comments.Mar 12 2015, 12:28 AM
lib/sanitizer_common/sanitizer_common.cc
347 ↗(On Diff #21748)

Yeah, I thought about this. But note that even though we call CacheBinaryName very early, some code has to execute before it. And I'd prefer this code to print something at least to stderr instead of crashing.

360 ↗(On Diff #21748)

Unfortunately we do have situations where double call is needed e.g. when app is both A- and UB-sanitized.

364 ↗(On Diff #21748)

Right.

kcc added inline comments.Mar 12 2015, 1:15 PM
lib/sanitizer_common/sanitizer_common.cc
347 ↗(On Diff #21748)

Sure, use CHECK_NE(foo, nullptr)

360 ↗(On Diff #21748)

I believe the recent refactoring by Alexey allows to solve this.
Really, if we allow double call but do not allow lazy initialization later on we are asking for trouble.

ygribov added inline comments.Mar 12 2015, 1:33 PM
lib/sanitizer_common/sanitizer_common.cc
347 ↗(On Diff #21748)

Sorry, I still don't understand - how do you propose to handle failing CHECKs in e.g. InitializeFlags (which is called before CacheBinaryName has a chance to set binary_basename_cache_str)? If we CHECK_NE(binary_basename_cache_str, nullptr) we'll abort without reporting the real error.

kcc added inline comments.Mar 12 2015, 2:06 PM
lib/sanitizer_common/sanitizer_common.cc
347 ↗(On Diff #21748)

Can you then read the binary name earlier?
Or don't use check and rely on SEGV handler to show you stack trace if binary_basename_cache_str is nullptr.
Or tolerate the binary name biign null?

ygribov added inline comments.Mar 16 2015, 7:20 AM
lib/sanitizer_common/sanitizer_common.cc
360 ↗(On Diff #21748)

Could you be more specific? I can see that ASan and UBSan have been changed to have isolated copies of some core structures (e.g. suppressions). But error reporting (e.g. report_file or log_path) are still shared across all sanitizers with mutexes being used to serialize accesses.

kcc added inline comments.Mar 16 2015, 1:25 PM
lib/sanitizer_common/sanitizer_common.cc
360 ↗(On Diff #21748)

Alexey actually tells that this part will be ready later this week.
The idea is that something like ubsan_common_init() will be called from asan_init() if ubsan and asan are bundled and
from ubsan_init() if ubsan runs separately.
In this case, we can call CacheBinaryName in
ubsan_init and in __asan_init.
This way it will be ensured that CacheBinaryName is executed strictly once, and we'll be able to add the CHECK to prevent from calling it for the second time.
Can you wait a few more days before Alexey lands his patch?

Sure, thanks for the update. Could you guys make sure that this works properly with dynamic A- and UB-runtimes in GCC (which have separate sanitizer_commons)?

kcc added a comment.Mar 17 2015, 11:10 AM

Sure, thanks for the update. Could you guys make sure that this works properly with dynamic A- and UB-runtimes in GCC (which have separate sanitizer_commons)?

OMG. I'll let Alexey speak here, but if I were him I would not support this mode.
What we should have is

  • separate UBSan run-time
  • bundled ASan+UBSan run-time

Having two sanitizers co-exist in the same process w/o properly knowing about each other is beyond of what we can reasonably maintain.

ygribov updated this revision to Diff 23145.Apr 2 2015, 6:12 AM
This comment was removed by ygribov.

I accidentally commited wrong commit. Will fix with next version of the patch.

ygribov updated this revision to Diff 23491.Apr 9 2015, 7:49 AM

Updated after recent ASan-UBSan refactoring. Addressed Kostya's remarks.

kcc added a comment.Apr 16 2015, 8:59 AM

I am on long leave. Alexey/Sergey, please review.

samsonov edited edge metadata.Apr 24 2015, 3:51 PM

Once again, sorry for delay :-/
I will look into it shortly.

Looks mostly good. I'd rather land this in three changes: (1) refactor Asan dynamic initialization (2) move CacheBinaryName to sanitizer_common and call it early (3) implement log_exe_name.

lib/asan/asan_linux.cc
116 ↗(On Diff #23491)

just if (!ASAN_DYNAMIC) return;

lib/asan/asan_rtl.cc
372 ↗(On Diff #23491)

I think that moving these functions to the AsanInitInternal is a positive change, do you think you can land it independently (as a refactoring), before submitting the rest of this patch?

lib/sanitizer_common/sanitizer_common.cc
237 ↗(On Diff #23491)

That's weird, don't we handle backslash on Windows above?

lib/sanitizer_common/sanitizer_common.h
218 ↗(On Diff #23491)

It is already declared below.

lib/sanitizer_common/sanitizer_printf.cc
261 ↗(On Diff #23491)

You should probably check that needed_length is not greater than buffer_size here as well.

lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
362 ↗(On Diff #23491)

Note that moving CacheBinaryName() to sanitizer_common (and calling it early during initialization) may also be separated to an independent change, that shouldn't change user-visible behavior.

lib/tsan/rtl/tsan_rtl.cc
309 ↗(On Diff #23491)

I'd rather do this after setting SanitizerToolName and parsing the flags, as you do in different sanitizers.
Also, what about MSan?

test/asan/TestCases/verbose-log-path_test.cc
1 ↗(On Diff #23491)

Consider changing the name of the output executable (%t-verbose-log-path_test-binary)...

6 ↗(On Diff #23491)

... and matching against this executable name here.

9 ↗(On Diff #23491)
XFAIL: win32,android,darwin

Once again, sorry for delay :-/

Np, this isn't the most interesting change for sure.

lib/asan/asan_rtl.cc
372 ↗(On Diff #23491)

Sure.

lib/sanitizer_common/sanitizer_common.cc
237 ↗(On Diff #23491)

True.

lib/sanitizer_common/sanitizer_printf.cc
261 ↗(On Diff #23491)

Good catch, otherwise there may be an overflow.

lib/tsan/rtl/tsan_rtl.cc
309 ↗(On Diff #23491)

Will do.

test/asan/TestCases/verbose-log-path_test.cc
6 ↗(On Diff #23491)

Sorry, not quite clear. What would be the benefit of this?

samsonov added inline comments.Apr 30 2015, 10:56 AM
test/asan/TestCases/verbose-log-path_test.cc
6 ↗(On Diff #23491)

Otherwise you seem to depend on the fact that %t (temporary string generated by lit) contains the source file name (verbose-log-path_test.cc).

ygribov updated this revision to Diff 27038.Jun 3 2015, 7:15 AM
ygribov edited edge metadata.

Updated based on previous review + split independent functionality to separate patches.

samsonov added inline comments.Jun 3 2015, 9:25 AM
lib/sanitizer_common/sanitizer_printf.cc
260 ↗(On Diff #27038)

Hm, now this chunk of code is copied three times in the function... Factor it out to a macro, maybe?

ygribov updated this revision to Diff 27097.Jun 4 2015, 1:28 AM

Added a macro to reduce code dup.

samsonov accepted this revision.Jun 4 2015, 11:36 AM
samsonov edited edge metadata.

LGTM

lib/sanitizer_common/sanitizer_printf.cc
277 ↗(On Diff #27097)

#undef macro here?

This revision is now accepted and ready to land.Jun 4 2015, 11:36 AM
This revision was automatically updated to reflect the committed changes.