What do you think about this approach?
I haven't benchmarked it yet.
Diff Detail
Event Timeline
I don't see any significant difference between runtimes of pdfium_test from Chromium with this patch and coverage enabled vs disabled on a 4 MB PDF.
Should I benchmark further?
What is the rationale? I see added complexity but don't see the benefit
sanitizer_symbolizer.cc | ||
---|---|---|
78 ↗ | (On Diff #22805) | Why do you need a DTOR? |
Rationale: we have a possible UAF in the RTL.
We either leak [see r233257] or UAF or manage the strings somehow.
A possible alternative is to replace char **module_name kind of returning a string to "pass us a fixed-size buffer" and char *module_name.
sanitizer_symbolizer.cc | ||
---|---|---|
78 ↗ | (On Diff #22805) | Correct. Should I remove it? |
"pass us a fixed-size buffer" (i.e. the caller owns the buffer, we copy data there is better here, imho
sanitizer_symbolizer.cc | ||
---|---|---|
78 ↗ | (On Diff #22805) | Yes, if we decide to proceed with this, but I'd prefer we don't |
How does the caller pick the buffer size? Also, this way we'll be copying
a lot of strings without the need to do so...
isn't this kMaxPathLength ?
Also, this way we'll be copying
a lot of strings without the need to do so...
How bad is that?
Are we getting the module name in any perf critical place?
How does the caller pick the buffer size?
isn't this kMaxPathLength ?
Yes, probably :)
Also, this way we'll be copying a lot of strings without the need to do so...How bad is that?
Are we getting the module name in any perf critical place?
I haven't measured that yet, but I think it's basically once for each
TU in the default coverage mode.
Possibly more for edge coverage?
Like this?
Please note we have a warning if a function takes more than 512 bytes on stack, so I used InternalScopedString instead. This implies mmap+munmap on each TU on process startup, plus a few times on shutdown. I've tried the patch on pdfium_test and chrome and haven't observed much slowdown...
[TODO: update subject before comitting]
Yea.. This clearly sucks too. Probably more than you initial variant.
Let's get back to it and polish it.
Please note we have a warning if a function takes more than 512 bytes on stack, so I used InternalScopedString instead. This implies mmap+munmap on each TU on process startup, plus a few times on shutdown. I've tried the patch on pdfium_test and chrome and haven't observed much slowdown...
[TODO: update subject before comitting]
Ok, let's go this way...
sanitizer_symbolizer.h | ||
---|---|---|
90 ↗ | (On Diff #22900) | Can you modify the comments (and logic) by removing "as long as .. " |
120 ↗ | (On Diff #22900) | Why race conditions? |
125 ↗ | (On Diff #22900) | add a comment explaining thread-(un) safety and which lock should be held. |
I think this change is reasonable. Timur, please make sure that (after this goes in) you audit the calls to Symbolizer::GetModuleNameAndOffsetForPC() and Symbolizer::GetModuleNameForPc() and remove internal_strdup() that could encounter at call sites: now we don't have to take ownership of these strings, as they are always owned by the Symbolizer.
sanitizer_symbolizer.cc | ||
---|---|---|
153 ↗ | (On Diff #22900) | nullptr |
sanitizer_symbolizer.h | ||
124 ↗ | (On Diff #22900) | Please use a named constant. |
128 ↗ | (On Diff #22900) | const char * |
129 ↗ | (On Diff #22900) | Why is this not a const char * ? |
PTAL
sanitizer_symbolizer.cc | ||
---|---|---|
153 ↗ | (On Diff #22900) | Done |
sanitizer_symbolizer.h | ||
90 ↗ | (On Diff #22900) | Comment – updated. What logic do I need to change now that the dtor is deleted? |
120 ↗ | (On Diff #22900) | Theoretically – yes, but not with the current code. |
124 ↗ | (On Diff #22900) | Done |
125 ↗ | (On Diff #22900) | Done |
128 ↗ | (On Diff #22900) | yeah, done |
129 ↗ | (On Diff #22900) | Growth rings... Fixed, thanks for catching! |
Timur, please make sure that (after this goes in) you audit the calls to Symbolizer::GetModuleNameAndOffsetForPC() and Symbolizer::GetModuleNameForPc() and remove internal_strdup() that could encounter at call sites: now we don't have to take ownership of these strings, as they are always owned by the Symbolizer.
I think I've already done it, but will double-check before committing when everything else is OK.
I'm OK with the change, but have a suggestion. Feel free to address it if you find it reasonable.
sanitizer_symbolizer.h | ||
---|---|---|
125 ↗ | (On Diff #22908) | Suggestion: replace the ModuleNameOwner class with just a private method of Symbolizer class. Then you could self-documented code there: mu_.CheckLocked(); and probably hide last_match_ in the implementation by making a function-static variable storing the cached const char *. It will be thread-safe (protected by mu_), and in practice we don't have many Symbolizer objects anyway. |
I think ModuleNameOwner might eventually turn into something bigger (e.g. ModulesHistory?) and/or it can be reused in other parts of the RTL (e.g. StringsCache/StringsOwner?), so I think it makes sense to keep it as a separate self-contained type.
I liked the idea of CheckLocked though, so I've decided to just pass a pointer to the Symbolizer::mu_ mutex to the ModuleNameOwner ctor.
Please take a look at r233687.