Page MenuHomePhabricator

[optview] added C++ version of opt-viewer
Needs ReviewPublic

Authored by inglorion on Nov 15 2016, 5:44 PM.



Adds a C++ implementation of opt-viewer. It works similarly to the Python implementation, with the following differences:

  • Speed. The C++ implementation is faster, allowing it to be used on larger projects.
  • The output directory is specified using the -o flag, before the inputs. This makes it easier to provide input filenames using xargs.
  • When given a directory as an input path, this implementation crawls the directory and processes any *.opt.yaml files it finds.
  • Percentages are rounded rather than truncated, and '<', '>', and '&' are replaced by HTML entities in the generated output.
  • Uses LLVM to demangle C++ names, which sometimes generates output different from c++filt's.
  • Only depends on LLVM - no dependencies outside the code base.

    This implementation does not currently copy the style.css file to the output directory - I can add that in a follow-up if the rest of the code here looks good.

Event Timeline

inglorion updated this revision to Diff 78115.Nov 15 2016, 5:44 PM
inglorion retitled this revision from to [optview] added C++ version of opt-viewer.
inglorion updated this object.
inglorion added reviewers: anemet, hfinkel, rnk, zturner.
inglorion added subscribers: amccarth, ruiu.
ruiu added a comment.Nov 15 2016, 6:01 PM

Function names should start with lowercase letters (I think that's the most recent policy).


You want to run clang-format.


Can you use unique_ptr instead of free?

inglorion updated this revision to Diff 78117.Nov 15 2016, 6:08 PM

Removed an extra level of HTML encoding that I had accidentally introduced

inglorion added inline comments.Nov 15 2016, 6:12 PM

Hmm, odd. It came out of clang-format this way. Let me reconfigure things. Thanks for pointing that out!

inglorion updated this revision to Diff 78123.Nov 15 2016, 6:22 PM

@ruiu made a good point about the use of malloc and free: This isn't how we normally want to do things. In this case, we need to, because itaniumDemangle calls realloc on the buffer. I've added a comment to the code to explain this.

inglorion updated this revision to Diff 78126.Nov 15 2016, 6:34 PM

fixed the indentation of the switch statements

zturner added inline comments.Nov 15 2016, 7:52 PM

else if?

anemet edited edge metadata.Nov 15 2016, 8:55 PM

It does not compile for me:

../utils/opt-viewer/YAMLProcessor.cpp:130:19: error: conversion function from 'llvm::ErrorSuccess' to 'Expected<llvm::opt_viewer::Remark>' invokes a deleted function
  if (!MN) return Error::success();
../include/llvm/Support/Error.h:658:3: note: 'Expected' has been explicitly marked deleted here
  Expected(ErrorSuccess) = delete;
1 error generated.
inglorion updated this revision to Diff 78220.Nov 16 2016, 10:40 AM
inglorion edited edge metadata.

Fixed compile error (thanks, @anemet !) and inserted missing elses (thanks, @zturner !)

inglorion updated this revision to Diff 78261.Nov 16 2016, 2:02 PM

changed function names to start in lowercase letters

ruiu added a comment.Nov 16 2016, 3:52 PM

LGTM, but I'm not sure if I can sign off. Who is the owner of the tool?


It looks weird to mix << and literal string concatenation. Make it consistent.

There is a lot fiddly HTML crafting going on. Have you tried running some samples through tidy to make sure everything is properly matched up and escaped? The browsers will accept a lot of badly-formed HTML.


"descending" seems wrong. A normal comparison function returns true if the left value comes strictly "before" the right value. When such a function is used with something like std::sort, that gives you an ascending ordering.


"ascending" seems wrong here for the same reason as the previous comment.


The implicit conversion from unsigned long long to double can lose precision, which is probably not important here, but it could trigger a compiler warning.


This assumes two's complement, right?

0 is a (signed) int, likely 32 bits. ~0 sets all 32 bits to 1. Implicitly converting that to a std::size_t (which is unsigned and likely 64 bits) seems difficult to predict. I believe this works because the signed int representation is two's complement.

The common advice is to use -1 which should be safe even if the representation is signed magnitude.


How about case '"': Stream << "&quot"; break;?

You're using writeHTML to output href links (see writeLocationLinkPrologue). It's not clear if any quotes in those URIs are already escaped.


This is portable nowadays, right? The original ANSI C standard made return EXIT_SUCCESS from main do exactly the wrong thing on VMS.

For the record, there is an llvm-dev discussion that wasn't referenced here. In the thread it was raised whether it makes sense to rewrite this in C++ without proper profiling of the existing Python implementation. The idea of this review was so that people interested could give this a try not to go in as is . (For example, the patch has obvious issues duplicating the code from llvm-opt-report.)

inglorion marked an inline comment as done.Nov 16 2016, 6:10 PM
inglorion added inline comments.

It's a bit confusing. priority_queue orders elements so that the "greatest" one is popped first. The operator() returns true if R->Hotness is greater than L->Hotness, causing the queue to sort R before L. As you note, this is the opposite of what std::sort does.


I'm converting to double on purpose, because it avoids the problem of running out of bits in an unsigned long long before performing the division (or losing all the bits by performing the division first). The conversion to unsigned long long should be fine, too, because surely it has enough bits to represent values >= 0 and <= 100. Good point about the compiler warning - what's a good way to avoid that?


Yep, there was supposed to be a << there. Thanks!


I actually want whatever the maximum value of the type is, so maybe I should just write std::numeric_limits<size_t>::max().


I'd be happy to change it if you want me to. On the other hand, we're using it in utils/unittest/googletest in a couple of places, so I imagine it's ok.

There is a lot fiddly HTML crafting going on. Have you tried running some samples through tidy to make sure everything is properly matched up and escaped? The browsers will accept a lot of badly-formed HTML.

The HTML language lawyer in me doesn't immediately want to pop the champagne. I'm certain it won't meet all the requirements of the validator. That wasn't the goal - I've intentionally kept the output mostly the same as @anemet's original so that it's easier to compare what the tools generate, instead of going for full HTML-some-version-compliance. I expect that it will render just fine in any browser that supports tables. If we wanted it to pass the W3 validator, I'd say we should do that in a follow-up.

amccarth added inline comments.Nov 17 2016, 8:23 AM

Making the conversion explicit (e.g., static_cast<double>()) will suppress any "loss of precision" warning from the compiler. As a bonus, it also makes it more obvious to the reader what's going on.

I totally understand why you need double, and I'm not opposed to that.




No change necessary. The standards folks have fixed the glitch.

inglorion updated this revision to Diff 78377.Nov 17 2016, 9:56 AM

added static casts and comment to relativeHotness()