This is an archive of the discontinued LLVM Phabricator instance.

[llvm-profdata] Support JSON as as an output-only format
ClosedPublic

Authored by kazu on Aug 1 2022, 2:34 PM.

Details

Summary

This patch teaches llvm-profdata to output the sample profile in the
JSON format. The new option is intended to be used for research and
development purposes. For example, one can write a Python script to
take a JSON file and analyze how similar different inline instances of
a given function are to each other.

I've chosen JSON because Python can parse it reasonably fast, and it
just takes a couple of lines to read the whole data:

import json
with open ('profile.json') as f:
  profile = json.load(f)

Diff Detail

Event Timeline

kazu created this revision.Aug 1 2022, 2:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 2:34 PM
kazu requested review of this revision.Aug 1 2022, 2:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 2:34 PM
hoy added a comment.Aug 2 2022, 10:43 PM

Perhaps make it explicit in the title that this is json an output-only format so far?

llvm/lib/ProfileData/SampleProfWriter.cpp
539

Wondering if this is necessary since S.getBodySamples() returns a BodySampleMap object which is already ordered.

Similarly with getCallsiteSamples below.

llvm/test/tools/llvm-profdata/sample-profile-json.test
3

nit: use JSON-next for this line and below

Do you absolutely need this to be a supported format and generated natively in llvm tools?

For a profile format, we'd expect full reader/writer support, in both llvm-profdata and llvm-profgen. Having a partially supported output only format seems less than ideal.

For research purpose, would it be possible to generate json from a text profile through offline scripts?

Another option is to leverage llvm-profdata show to have an extra flag to output in json form. show output doesn't need to be a valid profile format and it doesn't need to be loadable by profile reader.

Re-reply in Phab: +1 on the show command direction. If we have a reader in the future, consider promote this into a supported format.

kazu updated this revision to Diff 451251.Aug 9 2022, 1:23 PM

Updated to move the JSON format support to "llvm-profdata show".

kazu marked 2 inline comments as done.Aug 9 2022, 1:35 PM

Do you absolutely need this to be a supported format and generated natively in llvm tools?

For a profile format, we'd expect full reader/writer support, in both llvm-profdata and llvm-profgen. Having a partially supported output only format seems less than ideal.

For research purpose, would it be possible to generate json from a text profile through offline scripts?

Another option is to leverage llvm-profdata show to have an extra flag to output in json form. show output doesn't need to be a valid profile format and it doesn't need to be loadable by profile reader.

The main reason for adding the JSON support in an llvm tool is that I can leverage the parser.

llvm-profdata show makes sense.

I wrote another "dumper" just for the JSON format in SampleProfileReader. It's harder than I thought to wire SampleProfileWriter* to llvm-profdata show. Specifically, I had trouble casting raw_fd_ostream to raw_ostream for json::OStream purposes.

kazu retitled this revision from [llvm-profdata] Add --json to [llvm-profdata] Support JSON as as an output-only format.Aug 9 2022, 1:37 PM
kazu edited the summary of this revision. (Show Details)

I've manually updated the patch description as that's not part of the diff in Phabricator.

davidxl added inline comments.Aug 9 2022, 2:06 PM
llvm/lib/ProfileData/SampleProfReader.cpp
87 ↗(On Diff #451251)

too many nesting levels. Perhaps split out the lamda?

115 ↗(On Diff #451251)

split out the lamda?

kazu updated this revision to Diff 451298.Aug 9 2022, 4:01 PM

Broke up the big lambda into smaller pieces.

kazu marked 2 inline comments as done.Aug 9 2022, 4:04 PM

I've broken up the big lambda into smaller pieces. It should be easier to see that the output has two major parts, namely`BodySamples` and CallsiteSamples, with the implementation details factored out elsewhere.

PTAL. Thanks!

davidxl accepted this revision.Aug 9 2022, 4:22 PM

lgtm

This revision is now accepted and ready to land.Aug 9 2022, 4:22 PM
This revision was landed with ongoing or failed builds.Aug 9 2022, 4:25 PM
This revision was automatically updated to reflect the committed changes.