This is an archive of the discontinued LLVM Phabricator instance.

[XRay][llvm] Load XRay Profiles
ClosedPublic

Authored by dberris on Jun 20 2018, 8:11 AM.

Details

Summary

This change implements the profile loading functionality in LLVM to
support XRay's profiling mode in compiler-rt.

We introduce a type named llvm::xray::Profile which allows building a
profile representation. We can load an XRay profile from a file to build
Profile instances, or do it manually through the Profile type's API.

The intent is to get the llvm-xray tool to generate Profile
instances and use that as the common abstraction through which all
conversion and analysis can be done. In the future we can generate
Profile instances from Trace instances as well, through conversion
functions.

Some of the key operations supported by the Profile API are:

  • Path interning (Profile::internPath(...)) which returns a unique path identifier.
  • Block appending (Profile::addBlock(...)) to add thread-associated profile information.
  • Path ID to Path lookup (Profile::expandPath(...)) to look up a PathID and return the original interned path.
  • Block iteration.

A 'Path' in this context represents the function call stack in
leaf-to-root order. This is represented as a path in an internally
managed prefix tree in the Profile instance. Having a handle (PathID)
to identify the unique Paths we encounter for a particular Profile
allows us to reduce the amount of memory required to associate profile
data to a particular Path.

This is the first of a series of patches to migrate the llvm-stacks
tool towards using a single profile representation.

Depends on D48653.

Diff Detail

Repository
rL LLVM

Event Timeline

dberris created this revision.Jun 20 2018, 8:11 AM
dberris updated this revision to Diff 152246.Jun 21 2018, 3:17 AM
  • fixup: Implement Profile merging
dberris updated this revision to Diff 154161.Jul 4 2018, 6:03 PM
dberris edited the summary of this revision. (Show Details)

Rebase, update description.

kpw added a comment.Jul 10 2018, 9:53 AM

Is this still a work in progress patch as noted? I'll make some time to look at this today.

In D48370#1157578, @kpw wrote:

Is this still a work in progress patch as noted? I'll make some time to look at this today.

Yes, still in progress. However we can start making progress on issues with the current state if you find any.

I suspect we can land changes on this incrementally, but I'd at least like to get to a point where we can convert Trace objects to Profile objects. That's next on my list of things to do, so that we can use the Profile type in the llvm-xray stacks tool at least.

Thoughts?

kpw added a comment.Jul 19 2018, 5:19 PM

Publishing existing comments. Haven't reviewed the whole patch.

llvm/include/llvm/XRay/Profile.h
55–56 ↗(On Diff #154161)

Maybe call this expand and elaborate on what type of error conditions are possible. Is the Expected just for an invalid id?

59 ↗(On Diff #154161)

What happens when a caller provides an argument that has already been interned? Will they receive a new ID or the existing one?

62 ↗(On Diff #154161)

The header should document the error cases. Is it intended semantics to add a block that has a ThreadId and a PathId pair that already exist in another block? Will this accumulate the call count and cumulative time for the path? What about cases where some of the PathIds in the block have been interned and some haven't. Is there partial success?

dberris updated this revision to Diff 156461.Jul 20 2018, 3:14 AM
dberris marked 3 inline comments as done.
  • fixup: Handle profile loading better in llvm-xray
  • fixup: profileFromTrace(Trace) -> Expected<Profile>
  • fixup: Update some comments
  • fixup: FIXME markers, additional tests, rename path to expandPath
dberris planned changes to this revision.Jul 20 2018, 3:14 AM
dberris added inline comments.
llvm/include/llvm/XRay/Profile.h
55–56 ↗(On Diff #154161)

Yes, error is for an unknown PathID -- called it expandPath instead.

59 ↗(On Diff #154161)

Interning implies that it will return the same ID for the same path. This is very similar to string interning. I've made that a bit more explicit in the comment.

dberris updated this revision to Diff 158175.Jul 30 2018, 11:22 PM
dberris edited the summary of this revision. (Show Details)

Reduced the scope of the patch. We'll work on follow-on changes to implement more of the functionality, working towards moving more of the implementation in llvm-stacks to libraries.

kpw added a comment.Jul 31 2018, 11:19 PM

Haven't read the test yet. Logic looks sound.

llvm/include/llvm/XRay/Profile.h
38 ↗(On Diff #158175)

Do we have a better way of passing files around as streams of bytes or a DataExtractor so that we can more easily work with sockets or other data streams later?

llvm/lib/XRay/Profile.cpp
69–72 ↗(On Diff #158175)

This pattern is repeated several times. Set the current offset, read something, check to see that the offset moved, and update the current offset. Did you think about abstracting to avoid mistakes like forgetting to update CurrentOffset?

130 ↗(On Diff #158175)

Since this function is pretty long, it might help to call this RootToLeafPath.

187–188 ↗(On Diff #158175)

There is a bit of the cascading auto here where the compiler is fine, but a reader might get a bit lost.

It is not too verbose and might help to make these:
Profile::PathID &PathID and
Profile::Data &Data.

194–196 ↗(On Diff #158175)

It might be worth noting here that you intentionally disregard whether the insert added a new element.

204–208 ↗(On Diff #158175)

Since this is the only difference between the two loops, it could make the code more simple at a small performance cost (boolean check) to simply run the same loop twice even though Inserted will always be true the first time.

222 ↗(On Diff #158175)

Another idea for code re-use. mergeProfilesByStack and mergeProfilesByThread could both call a routine like mergeProfilesByThread but with an additional parameter that is a callable that picks out the thread id from a function. It would simply return zero for mergeProfilesByStack. The code could be identical and create a map with a single element.

If you prefer to specialize that's your call, but this is less performance critical IIUC since it's not in compiler-rt.

376 ↗(On Diff #158175)

on Block per thread -> one Block per thread

dberris updated this revision to Diff 158683.Aug 1 2018, 10:43 PM
dberris marked 6 inline comments as done.

Address comments by @kpw.

llvm/include/llvm/XRay/Profile.h
38 ↗(On Diff #158175)

Ideally we'll take a DataExtractor instance. We can add that later, at this point we don't quite need it yet.

llvm/lib/XRay/Profile.cpp
69–72 ↗(On Diff #158175)

Looks like this isn't something easily automated. :(

This is very much an artefact of the data extractor interface.

187–188 ↗(On Diff #158175)

It actually turns out to be more cumbersome, because of the repetition of the type and the identifier being spelled the same:

const Profile::PathID &PathID = ...;
const Profile::Data &Data = ...;
194–196 ↗(On Diff #158175)

Used std::tie(It, std::ignore) = ... instead.

222 ↗(On Diff #158175)

The algorithms are very different, since the additional data structure for grouping by threads is by definition unnecessary in the merging across stacks losing information about threads.

kpw accepted this revision.Aug 2 2018, 9:42 PM

Please consider my comments about the tests before pushing.

llvm/lib/XRay/Profile.cpp
187–188 ↗(On Diff #158175)

In the words of the dude: "That's just like... your opinion man."

222 ↗(On Diff #158175)

That's fair.

llvm/unittests/XRay/ProfileTest.cpp
49–56 ↗(On Diff #158683)

Could you add a comment explaining each assertion. If these fail, the aggregate initializers require a lot from an unfamiliar reader.

Suggestions:

Empty struct should fail
...
Thread ID with no data should fail
...
// Thread ID with data should succeed

Die hard style reviewers would argue that this should be three tests! ;P

60–72 ↗(On Diff #158683)

Maybe choose some values that are different for the different stacks so that if the test will fail if the merge combines an incorrect permutation pairing up the stacks?

Maybe also add a unique stack to one or both profiles to check that the implementation is robust to stacks that don't have a match.

101 ↗(On Diff #158683)

Use a different thread ID here to verify we're not matching on threads?

118 ↗(On Diff #158683)

Should there be MergeProfilesByThreadAccumulate as well?

121–127 ↗(On Diff #158683)

Use different thread IDs?

This revision is now accepted and ready to land.Aug 2 2018, 9:42 PM
dberris updated this revision to Diff 158932.Aug 3 2018, 12:17 AM
dberris marked 8 inline comments as done.

Address more comments.

Thanks!

llvm/lib/XRay/Profile.cpp
187–188 ↗(On Diff #158175)

Duuuuude... :)

This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Aug 3 2018, 5:23 AM

(The cmakelists.txt changes added tabs. I fixed in r338880, but maybe check your editor settings. also, phab doesn't do a great job displaying tabs in diffs :-/)

dberris added a subscriber: kpw.Aug 3 2018, 6:04 AM

Oops -- sorry about that, Nico!

/me goes checking editor config settings for cmake files to expand tabs always.

dberris reopened this revision.Aug 6 2018, 9:29 PM

Re-opening since this has been reverted.

This revision is now accepted and ready to land.Aug 6 2018, 9:29 PM
dberris planned changes to this revision.Aug 6 2018, 9:30 PM

Turns out that a move-only type and how we're using this is problematic in some older non-clang compiler. Instead of relying on defaults, I'll implement the semantics that we need for copy to work even if it's much more expensive.

dberris updated this revision to Diff 161667.Aug 21 2018, 1:45 AM

Updated to implement a copy constructor, a move constructor, and associated assignment operations.

This revision is now accepted and ready to land.Aug 21 2018, 1:45 AM
dberris updated this revision to Diff 161669.Aug 21 2018, 1:52 AM
  • fixup: add test for move assignment

Updated, anybody care to review (again)?

Am going to land this now, since the changes introduced have been very minor (mostly to appease older compilers).

This revision was automatically updated to reflect the committed changes.