Details

Reviewers

kcc
majnemer
rSerge
echristo

Commits

rG1b09aae82aba: [compiler-rt][XRay] Support tail call sleds
rCRT284178: [compiler-rt][XRay] Support tail call sleds
rL284178: [compiler-rt][XRay] Support tail call sleds

Summary

This change depends on D23986, which adds tail call-specific sleds. For
now we treat them as normal exits, leaving room to implement them as a
different kind of log entry in the future.

The reason for deferring that change is so that we can keep the naive
logging implementation more accurate without adding complexity to
reading the log. The accuracy comes from how we interpret call stacks
like:

A()
  B()
    C()

Which, when tail-call merged, will end up with no exit entries for A()
or B(), but can in effect be reasoned about as:

A()
B()
C()

Although the naive approach loses the fact that A() called B(), which
then called C(), a later iteration that adds explicit tail call entries
would change the log format and thus necessitate a version change for
the header. We can do this later, so that we have a chance to release
some tools (in D21987) that handle the naive log format first, and then
support higher version numbers of the log format too.
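
As a rough illustration of the flattening described above, the following sketch replays two hypothetical event sequences through a simple depth counter. The `Record` type, `render`, and both sample logs are invented for illustration and are not the XRay log format; the only thing carried over from the summary is the assumption that a tail call sled is logged as a normal exit just before the jump.

```cpp
#include <string>
#include <vector>

// Hypothetical record type for illustration only; not the XRay log format.
enum class Kind { Entry, Exit };
struct Record {
  std::string fn;
  Kind kind;
};

// Render a log as an indented call tree: an Entry prints the function at the
// current depth and descends, an Exit ascends again.
std::string render(const std::vector<Record> &log) {
  std::string out;
  int depth = 0;
  for (const Record &r : log) {
    if (r.kind == Kind::Entry) {
      out += std::string(2 * depth, ' ') + r.fn + "()\n";
      ++depth;
    } else if (depth > 0) {
      --depth;
    }
  }
  return out;
}

// Ordinary (non-tail) calls: each function exits after its callee returns.
std::vector<Record> nestedLog() {
  return {{"A", Kind::Entry}, {"B", Kind::Entry}, {"C", Kind::Entry},
          {"C", Kind::Exit},  {"B", Kind::Exit},  {"A", Kind::Exit}};
}

// Tail-call merged, with each tail call sled logged as a normal exit that
// fires just before the jump into the next function.
std::vector<Record> naiveTailCallLog() {
  return {{"A", Kind::Entry}, {"A", Kind::Exit}, {"B", Kind::Entry},
          {"B", Kind::Exit},  {"C", Kind::Entry}, {"C", Kind::Exit}};
}
```

Rendering `nestedLog()` gives the indented A/B/C tree, while `naiveTailCallLog()` gives the three functions at the same depth, which is the information loss the summary accepts for now.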

Diff Detail

Event Timeline

dberris updated this revision to Diff 69552. Aug 29 2016, 3:41 AM

dberris retitled this revision to [compiler-rt][XRay] Support tail call sleds.

dberris updated this object.

dberris added reviewers: echristo, kcc, rSerge, majnemer.

dberris added a parent revision: D23986: [XRay] Detect and emit sleds for sibling/tail calls.

dberris added a subscriber: llvm-commits.

Herald added subscribers: dberris, mehdi_amini. Aug 29 2016, 3:41 AM

Ping?

In D23988#548335, @dberris wrote:

Ping?

I'm reviewing it, but I'm not ready to comment yet because I haven't fully understood the implementation for tail calls.

I have another question in https://reviews.llvm.org/D23986 .

lib/xray/xray_interface.cc
Line 266: The comment talks about a jump to the exit sled, but the code below uses `CallOpCode`. Which one is wrong?

This revision now requires changes to proceed. Sep 22 2016, 2:13 PM

So in this implementation the handler doesn't receive an XRayEntryType::TAIL event, but instead receives XRayEntryType::EXIT, right? Is that also intended for now?

I've tried seemingly the same approach on ARM with the following program:

#include <cassert>
#include <cstdint>
#include <cstdio>
#include <xray/xray_interface.h>

[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fC() {
  std::printf("In fC()\n");
}

[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fB() {
  std::printf("In fB()\n");
  fC();
}

[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fA() {
  std::printf("In fA()\n");
  fB();
}

// The handler must never be instrumented: otherwise it would trigger itself
// on entry and recurse infinitely.
[[clang::xray_never_instrument]] void simplyPrint(int32_t functionId, XRayEntryType xret)
{
  std::printf("XRay: functionId=%d type=%d.\n", int(functionId), int(xret));
}

int main(int argc, char* argv[]) {
  // Install the handler first, then patch the sleds so it starts firing.
  __xray_set_handler(simplyPrint);

  std::printf("Patching...\n");
  __xray_patch();
  fA();

  // After unpatching, the sleds are no-ops again and no events are reported.
  std::printf("Unpatching...\n");
  __xray_unpatch();
  fA();

  return 0;
}

For me it gives the following results:

Patching...
XRay: functionId=3 type=0.
In fA()
XRay: functionId=3 type=1.
XRay: functionId=2 type=0.
In fB()
XRay: functionId=2 type=1.
XRay: functionId=1 type=0.
XRay: functionId=1 type=1.
In fC()
Unpatching...
In fA()
In fB()
In fC()

So for function fC() the exit sled seems to run too early, well before the function actually exits: it fires before In fC() is printed. I compiled with the following generic flags: -O3 -g -fxray-instrument -Wall -std=c++14 -ffunction-sections -fdata-sections (this list doesn't include my target-specific flags, such as --target=armv7-linux-gnueabihf).

Does the code snippet work correctly for you on x86_64? Or is the above output expected?
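
For reading the numeric `type=` values in the output above, a small decoding helper can make handler output easier to follow. This is only a sketch: `entryTypeName` and `formatEvent` are hypothetical helpers, and the numeric values (ENTRY = 0, EXIT = 1, TAIL = 2) are assumed from the `XRayEntryType` enum in xray/xray_interface.h rather than taken from this thread, so check them against the header in your tree.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper. The mapping below is an assumption about the values of
// XRayEntryType (ENTRY = 0, EXIT = 1, TAIL = 2); verify against the header.
const char *entryTypeName(int32_t type) {
  switch (type) {
  case 0:
    return "ENTRY";
  case 1:
    return "EXIT";
  case 2:
    return "TAIL";
  default:
    return "UNKNOWN";
  }
}

// Build the same line the test program prints, but with a readable type name.
std::string formatEvent(int32_t functionId, int32_t type) {
  return "XRay: functionId=" + std::to_string(functionId) +
         " type=" + entryTypeName(type);
}
```

Under that assumption, the pair `functionId=1 type=0` followed by `functionId=1 type=1` before "In fC()" reads as an entry and an exit for fC() that both fire before its output appears.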

In D23988#553965, @rSerge wrote:

So in this implementation the handler doesn't receive an XRayEntryType::TAIL event, but instead receives XRayEntryType::EXIT, right? Is that also intended for now?

Yes, this is intended for now.

In D23988#554292, @rSerge wrote:

I've tried seemingly the same approach on ARM with the following program:

[snipped]

So for function fC() the exit sled seems to run too early, well before the function actually exits: it fires before In fC() is printed. I compiled with the following generic flags: -O3 -g -fxray-instrument -Wall -std=c++14 -ffunction-sections -fdata-sections (this list doesn't include my target-specific flags, such as --target=armv7-linux-gnueabihf).

Does the code snippet work correctly for you on x86_64? Or is the above output expected?

It's certainly not expected, but I think I can explain this.

Since printf isn't instrumented, the implementation of fC() could be tail-calling into printf (which is a perfectly good optimisation to make). Of course this misattributes the time, but we can't do anything about that yet unless we know that we're tail-calling into something else, which will come later when we have distinct records for tail calls. We should probably do that as part of a more substantial format change, once some of the work I'm doing to support richer records in the log goes upstream, and increment the file version number at the same time.

Does that make sense?

Debugging has shown that the above happens because the printf in fC is also called as a tail call. So the exit sled of fC is executed first, and only then is printf jumped into. So it seems we can't do anything about this with the current approach.

In D23988#555375, @rSerge wrote:

Debugging has shown that the above happens because the printf in fC is also called as a tail call. So the exit sled of fC is executed first, and only then is printf jumped into. So it seems we can't do anything about this with the current approach.

This isn't such a bad thing though, given the following construction:

f1()
  f2()
    f3()

If we get f2() tail-calling into f3() and f3() tail-calling into something else that isn't instrumented (not shown here), then we will still see:

f1()
  f2()
  f3()
  ... missing ...

However, since f1() is instrumented, we will see that time spent outside of f3() will be attributed to f1().

This is predicated, though, on having an enclosing function that is instrumented. To see this in your example, you can wrap the call to fA() with another function that will not tail-call into fA().
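
One possible shape for such a wrapper is sketched below. The names `callFA`, `callsCompleted`, and the `XRAY_ALWAYS` macro are hypothetical; the idea is simply that observable work after the call keeps fA() out of tail position, so the wrapper's exit sled should only run once everything beneath it has returned. Whether a particular compiler and flag set actually avoids the tail call is an assumption to confirm in the generated assembly.

```cpp
#include <cstdio>

// The XRay attribute is Clang-specific, so it is hidden behind a hypothetical
// macro that expands to nothing on other compilers.
#if defined(__clang__)
#define XRAY_ALWAYS [[clang::xray_always_instrument]]
#else
#define XRAY_ALWAYS
#endif

// Stand-in for fA() from the test program; it may still tail-call printf.
XRAY_ALWAYS void __attribute__((noinline)) fA() { std::printf("In fA()\n"); }

// Written after fA() returns, so the call to fA() is not the last action.
volatile int callsCompleted = 0;

// Hypothetical enclosing function: the volatile store after the call is
// intended to stop the compiler from turning the call into a jump.
XRAY_ALWAYS void __attribute__((noinline)) callFA() {
  fA();
  callsCompleted = callsCompleted + 1;
}
```

Calling `callFA()` from main() in place of `fA()` should then bracket the whole chain with an entry and an exit for the wrapper, so time spent in uninstrumented tail-called code is attributed to it.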

dberris added inline comments. Sep 28 2016, 8:19 PM

lib/xray/xray_interface.cc
Line 266: The comment is inaccurate; it should say "call to the exit sled" instead. We need to call the exit sled so that it returns to the calling function just before the jump to the tail-called function. Let me update that.

Rebased and redone

Rebase

Fix bad merge

Ping?

Ping again? @echristo or @rSerge?

rSerge accepted this revision. Oct 13 2016, 12:10 PM

rSerge edited edge metadata.

This revision is now accepted and ready to land. Oct 13 2016, 12:10 PM

Sorry, it's apparently too easy to miss activity on Phabricator. The emails get filtered together with LLVM mailing list piles.

Closed by commit rL284178: [compiler-rt][XRay] Support tail call sleds (authored by dberris). Oct 13 2016, 5:06 PM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt][XRay] Support tail call sleds
Closed, Public

Revision Contents

Diff 69552

include/xray/xray_interface.h

lib/xray/xray_interface.cc
