Page MenuHomePhabricator

Add new hidden option -print-changed which only reports changes to IR
ClosedPublic

Authored by jamieschmeiser on Aug 21 2020, 11:10 AM.

Details

Summary

A new hidden option -print-changed is added along with code to support
printing the IR as it passes through the opt pipeline in the new pass
manager. Only those passes that change the IR are reported, with others
only having the banner reported, indicating that they did not change the
IR, were filtered out or ignored. Filtering of output via the
-filter-print-funcs is supported and a new supporting hidden option
-filter-passes is added. The latter takes a comma separated list of pass
names and filters the output to only show those passes in the list that
change the IR. The output of both of these functions can be modified
via the -print-module-scope function.

The code introduces a template base class that generalizes the comparison
of IRs that takes an IR representation as template parameter. The
constructor takes a series of lambdas that provide an event based API
for generalized printing of IRs as they are changed in the opt pipeline
through the new pass manager.

The first of several instantiations is provided that prints the IR
in a form similar to that produced by -print-after-all with the above
mentioned filtering capabilities. This version, and the others to
follow will be introduced at the upcoming developer's conference.
See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ychen added inline comments.Sep 3 2020, 11:53 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
93

It would be nice to use IRUnitT instead of T.

llvm/lib/Passes/StandardInstrumentations.cpp
337

Why is the round-trip of defining these here and passing them into the ctor of IRChangePrinter? Why do we need private callbacks such as ChangePrinter::HandleInitialIR. It seems just a trivial wrapper of handleInitialIR.

llvm/test/Other/change-printer.ll
7

Instead of check0, check1, check2, could you use some short meaningful string. Say check0->check-simple, check1->check-changed-pass, etc..

jamieschmeiser reopened this revision.Sep 3 2020, 1:31 PM
jamieschmeiser added inline comments.
llvm/lib/Passes/StandardInstrumentations.cpp
249

I'm sorry but I still do not understand... This unordered_set is just a list of pass names that are specified using the hidden option -filter-passes to allow filtering of output for passes. So, if you do not want output for a particular pass, you specify it with -filter-passes and that pass will be listed as being filtered out even if it makes changes. This is similar to -filter-print-funcs and this function is similar to llvm::isFuntionInPrintList. The output for a module is generated using unwrapAndPrint so I think the change that you are implying would be made in printIR that takes a Module (near line 139) and would also affect other users of that code (such as print-[before|after]-all). I think such a change should be handled separately as it doesn't just affect this new code and I am not sure of the implications.

337

The reason for the round-trip structure is that there are more instantiations of these templates that report changes in different forms still to come. I will be subsequently posting reviews for at least 2 different change reporters based on these base classes. By defining these templates in this fashion, the functionality of comparing different IR representations and determining when changes have occurred and should be filtered, reported or ignored is all in common in the base classes. The derived classes will be called when interesting events happen and only need to handle presenting the events without having to determine when the events occur.

This revision is now accepted and ready to land.Sep 3 2020, 1:31 PM

Respond to review comments: Change template parameter name, use
llvm::isSpecialPass, make space between template end delimiters optional
and use more meaningful check prefixes in tests.

aeubanks accepted this revision.Sep 6 2020, 5:58 PM

lgtm after ychen's comments are addressed

ychen added inline comments.Sep 9 2020, 11:05 AM
llvm/lib/Passes/StandardInstrumentations.cpp
249

Sorry. The position of this comment is a bit misplaced. What I meant is that uselist order is also part of IR (https://llvm.org/docs/LangRef.html#use-list-order-directives) but it is not printed by default. When we comparing IR here, it would be helpful to make sure uselist order changes are also captured by this patch.

337

Maybe I missed something. This sounds like either virtual inheritance or CRTP are better alternatives than many function objects as class members. I would prefer CRTP.

jamieschmeiser added inline comments.Sep 14 2020, 7:07 AM
llvm/lib/Passes/StandardInstrumentations.cpp
337

@ychen Sorry for the delay in response but I had to concentrate on preparing my tutorial for the meeting as the recording is due today. Anyway, I don't really think CRTP is really applicable here because there really aren't polymorphic calls being made. The calls are registered with the PIC and at that point, everything about the calls is essentially known so there isn't polymorphism in the calls. There are 3 ways of handling the callbacks: lambdas (as I've done), virtual overrides and leaving the functions to be defined by the template instantiation. Pre-C++11, I would have used virtual overrides and I think this is a clean way of handling this situation. The bases would be abstract and the derived classes would just provide the overrides. However, it seems that the llvm community shies away from virtual hierarchies in favour of CRTP which is appropriate for the way it is mostly used, but not in this situation IMHO. Virtuals have a bad rap for being unefficient but would be fine here. I don't like leaving functions that have to be provided by template instantiations as the errors provided when they are incorrect often come from the linking stage and it can be very difficult to determine how to properly provide the necessary functions in the instantiation. Lambdas seem to be preferred over virtuals so I chose them. I would be fine with changing the hierarchy to use virtual functions but I would shy away from requiring the instantiation from providing missing functions even though it would be a more efficient way of calling the functions since it makes it harder to extend the hierarchy. Efficiency isn't the main concern here since this is for aiding in code development/understanding and the overhead of virtual calls or calling through function pointers (lambdas) is small in comparison to the overhead involved in saving the representation of the IR. Would you prefer this to be expressed with virtual functions? If so, I will change it but I would rather avoid leaving functions to be provided by instantiations. I did that in one of the subsequent PRs and found it cumbersome and difficult.

Respond to review comments: Set ShouldPreserveUseListOrder to true
when generating the output for a module.

Respond to review comments: change lambdas to virtual function calls and
place code directly in functions.

ychen accepted this revision.Sep 15 2020, 10:45 AM

One more nit. Thank you!

llvm/include/llvm/Passes/StandardInstrumentations.h
110

Please remove mentioning callback here and after.

[NFC] Respond to reviewers comments: Update comments.

thakis added a subscriber: thakis.Sep 16 2020, 10:44 AM

This seems to break tests on most bots, e.g. http://45.33.8.238/linux/27972/step_12.txt

On other bots it doesn't build: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/14540/steps/build%20stage%201/logs/stdio

PTAL, and revert for now if it's not a quick fix.

mkitzan added inline comments.
llvm/lib/Passes/StandardInstrumentations.cpp
421

@thakis and @jamieschmeiser this extra semi-colon appears to be the cause of the recent build breaks.

@mkitzan, Thanks for fixing the build by removing the extra semicolon. Also, removing the call to the base dtor is fine.
commit 6859d95ea2d0f3fe0de2923a3f642170e66a1a14
Author: Michael Liao <michael.hliao@gmail.com>
Date: Wed Sep 16 14:43:08 2020 -0400

mkitzan added a subscriber: hliao.Sep 16 2020, 1:30 PM

I can't take credit for that. It was @hliao who committed the fix (thanks btw!).

This commit seems to cause a test failure on some bots, e.g. here.

This commit seems to cause a test failure on some bots, e.g. here.

I think it was already fixed, the bot has been green for the last 2 runs.

dyung added a subscriber: dyung.Sep 17 2020, 12:17 AM

This commit seems to cause a test failure on some bots, e.g. here.

I think it was already fixed, the bot has been green for the last 2 runs.

This is still failing on Windows bots:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/1269

kuhnel added a subscriber: kuhnel.EditedSep 17 2020, 12:35 AM

It also broke pre-merge testing for everyone: https://buildkite.com/llvm-project/llvm-master-build/builds/1071#7bfa9e1f-a2fe-47aa-8263-d2646ac0a0f5

I tried to revert, but it causes a merge conflict :(
Please fix ASAP.

This comment was removed by goncharov.
dyung added a comment.Sep 17 2020, 1:34 AM

I have reverted this change and the follow-up commit in b03c2b8395ba94fb53f1e73a6473faedf628bbd9. Hopefully this should get the bots green again.

jamieschmeiser reopened this revision.Sep 17 2020, 8:50 AM

It appears that the failing tests are due to differences in the PassID that is passed in the PassInstrumentationCallbacks, which is presumably build compiler/platform dependent. I know that the name sent is language dependent because it is the mangled name for C++ but all of these tests are based on input that was originally C so that should be fine. I will alter the tests to have a regex accepting anything for the PassID and just verify the constructed part of the banner and the name, which should make these tests robust and still test the functionality.

This revision is now accepted and ready to land.Sep 17 2020, 8:50 AM

Respond to build/test problems: Remove extra ;, remove call to base dtor,
change tests to have regex for PassID and ModuleID

MaskRay requested changes to this revision.Sep 17 2020, 9:43 PM
MaskRay added a subscriber: MaskRay.
MaskRay added inline comments.
llvm/include/llvm/Passes/StandardInstrumentations.h
132

Prefer member initializer

InitialIR = true

llvm/lib/Passes/StandardInstrumentations.cpp
55

No need to specifically to call out "hidden". Other options don't document themselves "hidden".

60

The relations with -filter-passes and print-module-scope probably should be moved to a separate graph, because several other -print-* (e..g -print-after-all) share the documentation.

285
302
316

Was

317

same has a trivial definition. Consider deleting the member function

367
418

Before == After

This revision now requires changes to proceed.Sep 17 2020, 9:43 PM
MaskRay added inline comments.Sep 17 2020, 9:45 PM
llvm/include/llvm/Passes/StandardInstrumentations.h
109

Capitalize comments and end with full stops.

ditto below

llvm/test/Other/change-printer.ll
58

The current tested form is too loose. Worth adding some CHECK-NEXT: with the first line of the dump, e.g. CHECK-NEXT: define i32 @g()

ychen added a comment.Sep 17 2020, 9:58 PM

Respond to build/test problems: Remove extra ;, remove call to base dtor,
change tests to have regex for PassID and ModuleID

The regex checking is not sufficient. Could you please check the common sub-string of GCC's and MSVC's output?

llvm/lib/IR/LegacyPassManager.cpp
90

Add a comma before "always"

llvm/test/Other/change-printer.ll
7

For check prefixes, using dash is much more common than underscore.

MaskRay added inline comments.Sep 17 2020, 10:01 PM
llvm/test/Other/change-printer.ll
7

I am fine with underscores because several directives use dashes and underscores are more distinguishable.

I think my biggest question is about the necessity of the ChangePrinter base class... The paragraph in the summary hints on that it is probably necessary....... but the link is unreadable and I think we need other concrete derived classes to justify the base class design.

The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information.

@ychen, Why are the regexs not sufficient for testing? I would rather not "check the common sub-string of GCC's and MSVC's output" because it has already been sufficiently demonstrated that this is fragile in the test. Referring to the code creating one of the banners:
SmallString<20> Banner = formatv("* IR Dump After {0} *", PassID);
We see that the test checks everything except the PassID (the name is added after) and the rest of the banner is distinct in all cases. I have changed the regexs to {{.+}} which will ensure that there is actually something there but checking anything else will make the test susceptible to failing based on the compiler. The important part of the test is captured in that the banner shows that the dump is either filtered, ignored, output, etc and that the portion of IR is correct. This is the intent of the tests, rather than testing how different compilers construct a PassID.
@MaskRay From earlier comments:
"The reason for the round-trip structure is that there are more instantiations of these templates that report changes in different forms still to come. I will be subsequently posting reviews for at least 2 different change reporters based on these base classes. By defining these templates in this fashion, the functionality of comparing different IR representations and determining when changes have occurred and should be filtered, reported or ignored is all in common in the base classes. The derived classes will be called when interesting events happen and only need to handle presenting the events without having to determine when the events occur."
There are 2 other derived change reporters currently up for review based on these base classes, although the PRs need updating based on changes made in this PR. This makes at least 3 derived classes. See https://reviews.llvm.org/D87000 and https://reviews.llvm.org/D87202

llvm/lib/Passes/StandardInstrumentations.cpp
60

-filter-passes only supports this option and no other options. It is new with this code and is described here because it is easier to understand its usage when described in this context.

317

Same is a virtual override and must exist. See discussion of base class in response to another of your comments.

Respond to review comments: change comments, few code revisions, notably with
std::string. Tighten up test cases with more -NEXT checks of the IR.

Fixed a bug so that the initial IR reported is always the module.

Detecting changes of IR transformations is pain and I am really exited to see -print-changed, the diff version -print-changes and -dot-cfg-changed . Added @fedor.sergeev who is the original implementer of -print-* in the new PM.

Can you mark the dependency relation of D87000 clearly? Why does D87000 delete lots of code added in this patch? If you are organizing patches in a series, please make sure adjacent patches don't cancel each other locally.

@MaskRay, https://reviews.llvm.org/D87000 moves a bunch of the code into a separate file, rather than deleting it. -print-changed, by itself, doesn't add enough to justify it's own file but once you add in -print-changes and -dot-cfg-changes, it grows to a significant amount of code, so I moved it to a separate file. I had already marked the other PRs as children in a chain, which can be seen in the stack tab of the revision contents. Is there another way that I should indicate the relationship between the PRs? Thanks for your comments about these 3 PRs. I have found these ways of looking at the changes very useful, particularly -dot-cfg-changes. (Shameless promotion alert:) I will be presenting these three ways of looking at the IR in my tutorial at the upcoming virtual developers meeting, including a demo of -dot-cfg-changes.

@ychen, Why are the regexs not sufficient for testing? I would rather not "check the common sub-string of GCC's and MSVC's output" because it has already been sufficiently demonstrated that this is fragile in the test.

"check the common sub-string of GCC's and MSVC's output" is not fragile because we only support GCC/Clang/MSVC as the host compiler. For PassID, GCC and Clang produce similar if not identical results. Checking PassID for only one compiler is fragile yes.

The important part of the test is captured in that the banner shows that the dump is either filtered, ignored, output, etc and that the portion of IR is correct. This is the intent of the tests, rather than testing how different compilers construct a PassID.

That covers " dump is either filtered, ignored, output," functionality but it does not make sure that these functionalies are applied to the right pass and in the right order. Besides, {{.+}} makes it hard to understand the test.

Fair enough, I'll come up with a regex that works for all three. I suspect that this will need to change in response to https://reviews.llvm.org/D87216

Put text in tests that indicates the name of the pass in the banner,
using regex for template arguments.

@ychen, I changed the tests so that the banners check the names of the passes, based on other tests I found that also rely on pass ids. The other tests use regex to handle template arguments and specify other pass ids directly, so that is what I changed the tests to do.
@MaskRay I believe your concerns have been addressed; if not, please let me know.

MaskRay accepted this revision.Sep 30 2020, 1:29 PM
This revision is now accepted and ready to land.Sep 30 2020, 1:29 PM
ychen accepted this revision.Sep 30 2020, 1:35 PM

@ychen, I changed the tests so that the banners check the names of the passes, based on other tests I found that also rely on pass ids. The other tests use regex to handle template arguments and specify other pass ids directly, so that is what I changed the tests to do.

Thanks. LGTM.