This is an archive of the discontinued LLVM Phabricator instance.

[XRay][compiler-rt] Support TSC emulation even for x86_64
ClosedPublic

Authored by dberris on Mar 6 2017, 9:44 PM.

Details

Summary

Use TSC emulation in cases where RDTSCP isn't available on the host
running an XRay instrumented binary. We can then fall back into
emulation instead of not even installing XRay's runtime functionality.
We only do this for now in the naive/basic logging implementation, but
should be useful in even FDR mode.

Should fix http://llvm.org/PR32148.

Event Timeline

dberris created this revision.Mar 6 2017, 9:44 PM
pelikan added inline comments.Mar 6 2017, 10:20 PM
lib/xray/xray_inmemory_log.cc
141–150

This lambda is already in xray_tsc.h as ALWAYS_INLINE. I remember the discussion about how important it is to have it inlined, and here it's a copy+paste lambda with unclear inliner results. Why can't we just use the readTSC function pointer here?

In that case we'll have to rename it to something like emulateTSC and move out of the #elif, as (in this case) readTSC is a real thing reading the real TSC. And then even the other architectures' code would look clearer, because they *emulate* TSC, not *read* it. Intel and Power would call "read" and others would call "emulate".

dberris added inline comments.Mar 6 2017, 10:45 PM
lib/xray/xray_inmemory_log.cc
141–150

Considered it, and came to the conclusion that the alternative is a much more invasive change. I also don't want to preclude the implementation currently used in arm and mips from being able to use platform-specific features if/when we find out how to do them later (thereby imposing the burden on refactoring that to whoever is making that change later). The repetition here is fine and correct for x86_64 only. I'm attempting to localise the change here with a potentially more invasive refactoring to follow later.

The inlining question is easy to answer here -- __xray::readTSC is always inlined, so no function pointers (and indirect calls) were gotten in the making of this change. The template ensures that since all the definitions are inline and visible, there' no reason for the compiler to ignore those and not inline. It's the same rationale the standard library uses function templates for stuff that relies on inlining for performance purposes.

bcain added a subscriber: bcain.Mar 9 2017, 7:21 AM

I tested this patch. It was effective in addressing this bug for me. I have an ivy bridge i7 CPU, SR0PK (E1).

@pelikan -- can we do the suggested refactoring in a subsequent change?

pelikan accepted this revision.Mar 14 2017, 7:20 PM

OK then. Let's proceed and fix it up later.

This revision is now accepted and ready to land.Mar 14 2017, 7:20 PM
This revision was automatically updated to reflect the committed changes.