This is an archive of the discontinued LLVM Phabricator instance.

Add a utility for converting between different types of remarks
ClosedPublic

Authored by paquette on Sep 10 2022, 11:27 AM.

Details

Summary

This adds llvm-remarkutil. This is intended to be a general tool for doing stuff with/to remark files.

This patch gives it the following powers:

  • bitstream2yaml - To convert bitstream remarks to YAML
  • yaml2bitstream - To convert YAML remarks to bitstream remarks

These are both implemented as subcommands, like

llvm-remarkutil bitstream2yaml <input_file> -o -

I ran into an issue where I had some bitstream remarks coming from CI, and I wanted to be able to do stuff with them (e.g. visualize them) But then I noticed we didn't have any tooling for doing that, so I decided to write this thing.

Being able to output YAML as a start seemed like a good idea, since it would allow people to reuse any tooling they may have written based around YAML remarks.

Hopefully it can grow into a more featureful remark utility. :)

Currently there are is an outstanding performance issue (see the TODO) with the bitstream2yaml case. I decided that I'd keep the tool small to start with and have the yaml2bitstream and bitstream2yaml cases be symmetric.

Also I moved the remarks documentation to its own header because it seems a little out of place with "basic commands" and "developer tools"; it's really kind of its own thing.

Diff Detail

Event Timeline

paquette created this revision.Sep 10 2022, 11:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2022, 11:27 AM
paquette requested review of this revision.Sep 10 2022, 11:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2022, 11:27 AM
thegameg accepted this revision.Sep 10 2022, 1:28 PM

Looks straightforward and pretty good to me. Thanks for working on this.

One suggestion would be to have a convert subcommand that takes bitstream2yaml or yaml2bitstream, assuming this tool will be extended to do more (for example, one thing I want to add is extract from a mach-o object file). I don't have strong opinions though so if everyone is fine with this I'm also fine.

llvm/tools/llvm-remarkutil/RemarkUtil.cpp
50

Does this need a static/const/constexpr?

This revision is now accepted and ready to land.Sep 10 2022, 1:28 PM
paquette updated this revision to Diff 459314.Sep 10 2022, 4:48 PM

Make some stuff static that should have been static

Looks straightforward and pretty good to me. Thanks for working on this.

One suggestion would be to have a convert subcommand that takes bitstream2yaml or yaml2bitstream, assuming this tool will be extended to do more (for example, one thing I want to add is extract from a mach-o object file). I don't have strong opinions though so if everyone is fine with this I'm also fine.

What would this use-case look like in the (proposed) UI?

One suggestion would be to have a convert subcommand that takes bitstream2yaml or yaml2bitstream, assuming this tool will be extended to do more (for example, one thing I want to add is extract from a mach-o object file). I don't have strong opinions though so if everyone is fine with this I'm also fine.

What would this use-case look like in the (proposed) UI?

$ llvm-remarkutil -convert bitstream2yaml a.opt.bitstream -o a.opt.yaml
$ llvm-remarkutil -convert yaml2bitstream a.opt.bitstream -o a.opt.yaml
$ llvm-remarkutil -extract a.o -o a.opt.bitstream`

is what I had in mind. Maybe having -input-format=bitstream -output-format=yaml is better for other use cases?

Also maybe something you would want to do in a separate patch, adding a -dump option that only calls llvm::remarks::Remark::dump() on all the parsed remarks would be very helpful (with maybe a -filter too)

jroelofs accepted this revision.Sep 12 2022, 8:07 AM

LGTM

llvm/tools/llvm-remarkutil/RemarkUtil.cpp
129

const?

One suggestion would be to have a convert subcommand that takes bitstream2yaml or yaml2bitstream, assuming this tool will be extended to do more (for example, one thing I want to add is extract from a mach-o object file). I don't have strong opinions though so if everyone is fine with this I'm also fine.

What would this use-case look like in the (proposed) UI?

$ llvm-remarkutil -convert bitstream2yaml a.opt.bitstream -o a.opt.yaml
$ llvm-remarkutil -convert yaml2bitstream a.opt.bitstream -o a.opt.yaml
$ llvm-remarkutil -extract a.o -o a.opt.bitstream`

is what I had in mind. Maybe having -input-format=bitstream -output-format=yaml is better for other use cases?

I don't think you'd actually need either of those because you could just pipe in/out of the tool, or am I missing something?

I was thinking it'd also be useful to have options like

llvm-remarkutil stats --input=yaml <file>.yaml
llvm-remarkutil stats --input=bitstream <file>.bitstream

or

llvm-remarkutil stats -y <file>.yaml
llvm-remarkutil stats -b <file>.bitstream

which could print out some stats/facts about the remarks as well.

That'd be pretty separate from conversion, which is why I'm somewhat inclined to keep these as suboptions versus regular flags.

thegameg accepted this revision.Sep 12 2022, 10:35 AM

That'd be pretty separate from conversion, which is why I'm somewhat inclined to keep these as suboptions versus regular flags.

👍

paquette updated this revision to Diff 459520.Sep 12 2022, 10:57 AM
  • Make vector const
  • Disable bitstream conversion test on non-Darwin platforms (attempting to fix pre-merge checks before pushing)
paquette updated this revision to Diff 459524.Sep 12 2022, 11:13 AM

Francis pointed out to me that the Darwin assumption for bitstream only holds when embedding metadata in Separate mode.

  • Drop the Darwin check
  • Try to modify the test to see if we can make Debian and Windows happy in the pre-merge checks.

Current assumption is that piping binary data is weird in the tests, although it Works On My Machine™

paquette updated this revision to Diff 459567.Sep 12 2022, 3:03 PM

Add missing std::move when returning an Error. That was the only issue that showed up when I built the patch inside a Debian Docker container.

It looks like the tests fail cause applying the patch won't actually create the correct binary files. Going to push now.

This revision was landed with ongoing or failed builds.Sep 12 2022, 3:04 PM
This revision was automatically updated to reflect the committed changes.
MatzeB added a subscriber: MatzeB.Sep 12 2022, 4:32 PM

On a somewhat random note: I have a remarks2sql utility here that converts remarks into an sqlite3 database which I found works great when analyzing bigger codebases where you pretty much always want to do various amount of filtering, sorting and aggregation... I should probably clean that up and add it to llvm/utils if there's demand from others... Unfortunately it's written in python, so can't easily be integrated into this tool here...

On a somewhat random note: I have a remarks2sql utility here that converts remarks into an sqlite3 database which I found works great when analyzing bigger codebases where you pretty much always want to do various amount of filtering, sorting and aggregation... I should probably clean that up and add it to llvm/utils if there's demand from others... Unfortunately it's written in python, so can't easily be integrated into this tool here...

I think that would be extremely handy to have. I thought something like a

llvm-remarkutil query <file> <query>

option would be really handy, but an actual database would probably be better-suited to that task than basically anything else...