This is an archive of the discontinued LLVM Phabricator instance.

[XRay][compiler-rt] Coalesce calls to mprotect to reduce patching overhead
ClosedPublic

Authored by dberris on Dec 12 2017, 8:36 PM.

Details

Summary

Before this change, XRay would conservatively patch sections of the code
one sled at a time. Upon testing/profiling, this turns out to take an
inordinate amount of time and cycles. For an instrumented clang binary,
the cycles spent both in the patching/unpatching routine constituted 4%
of the cycles -- this didn't count the time spent in the kernel while
performing the mprotect calls in quick succession.

With this change, we're coalescing the number of calls to mprotect from
being linear to the number of instrumentation points, to now being a
lower constant when patching all the sleds through __xray_patch() or
__xray_unpatch(). In the case of calling __xray_patch_function() or
__xray_unpatch_function() we're now doing an mprotect call once for
all the sleds for that function (reduction of at least 2x calls to
mprotect).

Event Timeline

dberris created this revision.Dec 12 2017, 8:36 PM
dberris updated this revision to Diff 126673.Dec 12 2017, 10:59 PM
  • fixup: only update the patching atomic when the patching is actually done
kpw added a comment.Dec 13 2017, 12:38 PM

Suggestion for future changes:

  1. We could decouple mprotect and __xray_patch_function so that callers that want to mprotect once and then patch individual functions have their cake and eat it too.
  2. Should xray make an effort to restore the mprotect status of pages after it's done? Probably not worth the complexity.
  3. If these regions never lose write permissions, why even call mprotect for unpatching?
compiler-rt/lib/xray/xray_flags.inc
22–23

It's a bit unfortunate that there is now an undocumented requirement that code #including this file must have already imported sanitizer_internal_defs.h.

Perhaps you could add an #ifdef check for SANITIZER_DEFS_H or the uptr typedef itself with an #error assertion.

Is uptr even the right type? Page size doesn't represent a pointer, it's more like a size_t (although we might have to use it as an offset and add it to an address).

compiler-rt/lib/xray/xray_interface.cc
211–213

What evidence or reasoning leads us to conclude that this assumption holds (ignoring DSO)?

221

const auto &Sled?

228

Usage suggests defining the flag as size_t.

232

Consider listing this requirement in the flag description.

320

This function name is very non-descriptive.

__mprotect_and_patch_function()?

dberris updated this revision to Diff 126858.Dec 13 2017, 3:41 PM
dberris marked 4 inline comments as done.
  • fixup: Address review comments, refactor things a bit
In D41153#954307, @kpw wrote:

Suggestion for future changes:

  1. We could decouple mprotect and __xray_patch_function so that callers that want to mprotect once and then patch individual functions have their cake and eat it too.

Yeah, I'd like to think about that a little more. I suspect the call to mprotect is an implementation detail of patching/unpatching.

  1. Should xray make an effort to restore the mprotect status of pages after it's done? Probably not worth the complexity.

I think we do that already, no? MProtectHelper already does that in the destructor, to make the pages read+execute only as opposed to read+write+execute.

  1. If these regions never lose write permissions, why even call mprotect for unpatching?

They do, MProtectHelper does RAII on the mprotect status of pages. It does make the assumption that the pages being mprotected are r+x to begin with.

compiler-rt/lib/xray/xray_flags.inc
22–23

Yeah, this file could only really be included from xray_flags.h and xray_flags.cc which should include the types. Fixed it from that side instead.

Unfortunately the flag parser doesn't support an overload for size_t reliably, and of the options we have there, uptr seems to fit the best.

compiler-rt/lib/xray/xray_interface.cc
211–213

This is the convention used by ELF and thus the Linux kernel to map the .text segment as contiguous virtual memory. If this is somehow not true, then our calls to mprotect will fail (i.e. if .text was somehow further segmented or non-contiguous). Is there a way to somehow improve this comment to make that clearer?

228

Unfortunately the flag parser may not know how to parse size_t but does know how to parse uptr.

320

Good point. I've moved this into the __xray namespace, put it in an anonymous namespace in there, and used a more descriptive name mprotectAndPatchFunction.

kpw accepted this revision.Dec 13 2017, 5:48 PM

Thanks for pointing out the scope cleanup from MProtectHelper and making some improvements/clarifications.

This revision is now accepted and ready to land.Dec 13 2017, 5:48 PM
This revision was automatically updated to reflect the committed changes.