This is an archive of the discontinued LLVM Phabricator instance.

[Sanitizer RT] Put the module name string ownership in Symbolizer in order
ClosedPublic

Authored by timurrrr on Mar 27 2015, 10:20 AM.

Details

Reviewers
kcc
samsonov
Summary

What do you think about this approach?
I haven't benchmarked it yet.

Diff Detail

Event Timeline

timurrrr updated this revision to Diff 22805.Mar 27 2015, 10:20 AM
timurrrr retitled this revision from to [Sanitizer RT] Put the module name string ownership in Symbolizer in order.
timurrrr updated this object.
timurrrr edited the test plan for this revision. (Show Details)
timurrrr added reviewers: kcc, samsonov.
timurrrr added a subscriber: Unknown Object (MLST).

I don't see any significant difference between runtimes of pdfium_test from Chromium with this patch and coverage enabled vs disabled on a 4 MB PDF.
Should I benchmark further?

kcc edited edge metadata.Mar 27 2015, 11:01 AM

What is the rationale? I see added complexity but don't see the benefit

sanitizer_symbolizer.cc
78

Why do you need a DTOR?
Are you ever going to call it except for when the process is dying?

Rationale: we have a possible UAF in the RTL.
We either leak [see r233257] or UAF or manage the strings somehow.
A possible alternative is to replace char **module_name kind of returning a string to "pass us a fixed-size buffer" and char *module_name.

sanitizer_symbolizer.cc
78

Correct. Should I remove it?

kcc added a comment.Mar 27 2015, 11:08 AM

Rationale: we have a possible UAF in the RTL.
We either leak [see r233257] or UAF or manage the strings somehow.
A possible alternative is to replace char **module_name kind of returning a string to "pass us a fixed-size buffer" and char *module_name.

"pass us a fixed-size buffer" (i.e. the caller owns the buffer, we copy data there is better here, imho

sanitizer_symbolizer.cc
78

Yes, if we decide to proceed with this, but I'd prefer we don't

How does the caller pick the buffer size? Also, this way we'll be copying
a lot of strings without the need to do so...

kcc added a comment.Mar 27 2015, 11:15 AM

How does the caller pick the buffer size?

isn't this kMaxPathLength ?

Also, this way we'll be copying
a lot of strings without the need to do so...

How bad is that?
Are we getting the module name in any perf critical place?

How does the caller pick the buffer size?

isn't this kMaxPathLength ?

Yes, probably :)

Also, this way we'll be copying a lot of strings without the need to do so...

How bad is that?
Are we getting the module name in any perf critical place?

I haven't measured that yet, but I think it's basically once for each
TU in the default coverage mode.
Possibly more for edge coverage?

timurrrr updated this revision to Diff 22889.Mar 30 2015, 8:21 AM
timurrrr edited edge metadata.

Like this?

Please note we have a warning if a function takes more than 512 bytes on stack, so I used InternalScopedString instead. This implies mmap+munmap on each TU on process startup, plus a few times on shutdown. I've tried the patch on pdfium_test and chrome and haven't observed much slowdown...

[TODO: update subject before comitting]

kcc added a comment.Mar 30 2015, 12:45 PM

Like this?

Yea.. This clearly sucks too. Probably more than you initial variant.
Let's get back to it and polish it.

Please note we have a warning if a function takes more than 512 bytes on stack, so I used InternalScopedString instead. This implies mmap+munmap on each TU on process startup, plus a few times on shutdown. I've tried the patch on pdfium_test and chrome and haven't observed much slowdown...

[TODO: update subject before comitting]

timurrrr updated this revision to Diff 22900.Mar 30 2015, 12:54 PM

Reverted to the original version, sans destructor

kcc added a comment.Mar 30 2015, 1:08 PM

Ok, let's go this way...

sanitizer_symbolizer.h
90

Can you modify the comments (and logic) by removing "as long as .. "
These names should be just live forever, independent of anything.

119

Why race conditions?
This may happen w/o any threads, right?

124

add a comment explaining thread-(un) safety and which lock should be held.

samsonov edited edge metadata.Mar 30 2015, 1:19 PM

I think this change is reasonable. Timur, please make sure that (after this goes in) you audit the calls to Symbolizer::GetModuleNameAndOffsetForPC() and Symbolizer::GetModuleNameForPc() and remove internal_strdup() that could encounter at call sites: now we don't have to take ownership of these strings, as they are always owned by the Symbolizer.

sanitizer_symbolizer.cc
153–160

nullptr

sanitizer_symbolizer.h
123

Please use a named constant.

127

const char *

128

Why is this not a const char * ?

timurrrr updated this revision to Diff 22908.Mar 30 2015, 1:43 PM
timurrrr edited edge metadata.

Addressed the review comments

PTAL

sanitizer_symbolizer.cc
153–160

Done

sanitizer_symbolizer.h
90

Comment – updated. What logic do I need to change now that the dtor is deleted?

119

Theoretically – yes, but not with the current code.
Updated the comment.

123

Done

124

Done

127

yeah, done

128

Growth rings... Fixed, thanks for catching!

Timur, please make sure that (after this goes in) you audit the calls to Symbolizer::GetModuleNameAndOffsetForPC() and Symbolizer::GetModuleNameForPc() and remove internal_strdup() that could encounter at call sites: now we don't have to take ownership of these strings, as they are always owned by the Symbolizer.

I think I've already done it, but will double-check before committing when everything else is OK.

samsonov accepted this revision.Mar 30 2015, 2:42 PM
samsonov edited edge metadata.

I'm OK with the change, but have a suggestion. Feel free to address it if you find it reasonable.

sanitizer_symbolizer.h
125

Suggestion: replace the ModuleNameOwner class with just a private method of Symbolizer class. Then you could self-documented code there:

mu_.CheckLocked();

and probably hide last_match_ in the implementation by making a function-static variable storing the cached const char *. It will be thread-safe (protected by mu_), and in practice we don't have many Symbolizer objects anyway.

This revision is now accepted and ready to land.Mar 30 2015, 2:42 PM
timurrrr closed this revision.Mar 31 2015, 5:55 AM

I think ModuleNameOwner might eventually turn into something bigger (e.g. ModulesHistory?) and/or it can be reused in other parts of the RTL (e.g. StringsCache/StringsOwner?), so I think it makes sense to keep it as a separate self-contained type.
I liked the idea of CheckLocked though, so I've decided to just pass a pointer to the Symbolizer::mu_ mutex to the ModuleNameOwner ctor.

Please take a look at r233687.