Page MenuHomePhabricator

[XRay] [compiler-rt] FDR logging arg1 handler

Authored by pelikan on May 3 2017, 6:42 PM.



Write out records about logged function call first arguments. D32840
implements the reading of this in llvm-xray.

There are some refactoring options and bugs discovered along the way, this is
only a commit to get things working.

Depends on D32840.

Diff Detail


Event Timeline

pelikan created this revision.May 3 2017, 6:42 PM
dberris added a reviewer: kpw.May 3 2017, 7:10 PM
dberris edited edge metadata.
dberris added a subscriber: kpw.

Adding in @kpw who also has more context on this implementation.

163 ↗(On Diff #97762)

Because of ABI reasons, we really shouldn't change this struct. The comments say that if implementations need to add their arg1 handler, they should do it in the init implementation.

196 ↗(On Diff #97762)


Also now you probably want to mark this always inline.

It would also make it more efficient to just return a 16-byte struct, so the inliner doesn't have to guess whether TSC and/or CPU alias.

224–225 ↗(On Diff #97762)

I'm ambivalent about this API change. Let me try to explain why.

First, while the compiler might tail-call into the specified function, it was already using 7 arguments. In SysV64, that means the first 6 arguments that fit in registers will be in the registers already. Adding an argument spills more stuff onto the stack. That's one.

Second, we're paying this cost in the path where we know we don't have an argument. That seems like a waste to me.

Maybe it's better to refactor this differently, so we have the innards of processFunctionHook extracted appropriately so we can handle the "we are logging arguments" case and "we are not logging arguments" better?

Or at least be able to compose either of these functions in an easier manner so while there will be a little redundancy, it's going to be easier to read and reason about. Packing on more arguments to the already overloaded processFunctionHook function seems like a bad way to go especially if we intend to support multiple argument logging in the future.

pelikan updated this revision to Diff 104091.Jun 26 2017, 10:07 PM
pelikan marked an inline comment as done.

after a long discussion and a long time on a different project, update to reflect comments

163 ↗(On Diff #97762)

After our discussions about why we can't change this ABI (despite the fact LLVM itself is incredibly progressive in this regard), I've changed this to describe the proposed behaviour for anyone reading this code later.

196 ↗(On Diff #97762)

It was never there, fixed. I don't see the point of "inline" since it's already static and because processFunctionHook is so huge, we obviously don't care about the function call overhead here.

Can we change this later, when that FIXME is being fixed properly?

224–225 ↗(On Diff #97762)

While I agree with you on the problems with the code structure, I tried making the change the least intrusive possible, and refactor later. I'll take your comment as a permission to tear processFunctionHook apart.

dberris accepted this revision.Jun 27 2017, 11:01 PM

LGTM (with a couple of suggestions)

613 ↗(On Diff #104091)

nit: writeTSCForThisCPU -> writeCurrentCPUTSC

623 ↗(On Diff #104091)

Consider using std::numeric_limits<int32_t>::max() instead.

This revision is now accepted and ready to land.Jun 27 2017, 11:01 PM
pelikan updated this revision to Diff 115324.Sep 14 2017, 5:23 PM

Rebased to HEAD.

pelikan updated this revision to Diff 115327.Sep 14 2017, 5:33 PM
pelikan marked 2 inline comments as done.

address nitpicks

pelikan edited the summary of this revision. (Show Details)Sep 17 2017, 9:54 PM
dberris requested changes to this revision.Sep 17 2017, 11:23 PM

Okay, now I think this needs a rebase given the latest changes.

This revision now requires changes to proceed.Sep 17 2017, 11:23 PM
dberris added inline comments.Sep 18 2017, 12:23 AM
293 ↗(On Diff #115327)

Actually, you want to install the arg1 logging handler first, so that all the functions that have arg1 logging enabled don't sometimes use the arg0 then switch to the arg1 handler.

670 ↗(On Diff #115327)

Why make a copy of the shared_ptr here? I don't think we actually need a copy, do we?

747 ↗(On Diff #115327)

You want to use the global instead of a function-local here. See what we do in the logArg0 case.

776 ↗(On Diff #115327)

No validation here on the type of the entry? Nothing like what we do in the logArg0 handler to see that the EntryType isn't one of the unexpected values?

92 ↗(On Diff #115327)

I think you want 'TRACE-DAG' still, but also need to look for the actual arguments too. Aren't we writing the list of arguments yet?

pelikan updated this revision to Diff 116761.Sep 26 2017, 10:54 PM
pelikan edited edge metadata.
pelikan marked an inline comment as done.

rebase to HEAD

pelikan added inline comments.Sep 26 2017, 10:55 PM
293 ↗(On Diff #115327)

Good point. Fixed and documented.

pelikan updated this revision to Diff 116913.Sep 27 2017, 5:46 PM
pelikan marked 2 inline comments as done.

address yesterday's discussion

dberris added inline comments.Sep 27 2017, 5:53 PM
586–595 ↗(On Diff #116913)

You can turn this into an early return:

if (CPU != TLD.CurrentCPU) {
  return RecordTSCDelta;

// rest of code that's in else branch.
721–722 ↗(On Diff #116913)

Why isn't this just in the cases above?

pelikan updated this revision to Diff 116920.Sep 27 2017, 8:32 PM
pelikan marked an inline comment as done.

address comment

dberris accepted this revision.Sep 27 2017, 10:06 PM


100–101 ↗(On Diff #116920)

Probably use UNWRITE-DAG here instead, similar to how we use TRACE-DAG above.

This revision is now accepted and ready to land.Sep 27 2017, 10:06 PM
pelikan marked an inline comment as done.Sep 27 2017, 10:30 PM
pelikan added inline comments.
670 ↗(On Diff #115327)

Good point. I haven't gotten used to using shared_ptr properly yet. (C programmer learning.)

586–595 ↗(On Diff #116913)

Makes sense.

721–722 ↗(On Diff #116913)

Because of the ordering. The metadata need to be after the function record as that means it belongs to the function.

100–101 ↗(On Diff #116920)

As discussed, we can't do that here since the point of UNWRITE is testing the other calls will *not* fire. The previous version of the test made sure of that which means putting UNWRITE-DAG here would actually destroy the tested behaviour.

pelikan updated this revision to Diff 116922.Sep 27 2017, 10:30 PM
pelikan marked an inline comment as done.

latest update before commit

This revision was automatically updated to reflect the committed changes.