This is an archive of the discontinued LLVM Phabricator instance.

Add new hidden option -print-changed which only reports changes to IR
ClosedPublic

Authored by jamieschmeiser on Aug 21 2020, 11:10 AM.

Details

Summary

A new hidden option -print-changed is added along with code to support
printing the IR as it passes through the opt pipeline in the new pass
manager. Only those passes that change the IR are reported, with others
only having the banner reported, indicating that they did not change the
IR, were filtered out or ignored. Filtering of output via the
-filter-print-funcs is supported and a new supporting hidden option
-filter-passes is added. The latter takes a comma separated list of pass
names and filters the output to only show those passes in the list that
change the IR. The output of both of these functions can be modified
via the -print-module-scope function.

The code introduces a template base class that generalizes the comparison
of IRs that takes an IR representation as template parameter. The
constructor takes a series of lambdas that provide an event based API
for generalized printing of IRs as they are changed in the opt pipeline
through the new pass manager.

The first of several instantiations is provided that prints the IR
in a form similar to that produced by -print-after-all with the above
mentioned filtering capabilities. This version, and the others to
follow will be introduced at the upcoming developer's conference.
See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2020, 11:10 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
jamieschmeiser requested review of this revision.Aug 21 2020, 11:10 AM

Change function names based on suggestions from clang-tidy

yrouban added inline comments.Aug 23 2020, 9:08 PM
llvm/lib/Passes/StandardInstrumentations.cpp
461

static cannot be used for a multi-threaded compiler.
Please create IRChangePrinter instance for every instance of StandardInstrumentations.

The paper link is not accessible. Could you please make it available somewhere else?

The overall approach seems very general. What's the motivation.

The current patch misses a lot the diffs. Looks like diff 1 is missed.

llvm/lib/Passes/StandardInstrumentations.cpp
69

Maybe use a pass's PassRegistery.def name instead of its textual class name? It would be consistent.

220–252

Use early return

220–252

This could be put into the lambda capture list. Maybe use shared_ptr for the IRChangePrinter instance.

220–252

Delete?

235

Make sure we consider uselist order changes for comparison.

565

For consistency, make registerChangePrinters a member function.

llvm/test/Other/change-printer.ll
17

It has some non-trivial maintenance costs. I would suggest to use unit test for this patch altogether.

Always stack an entry on the stack before each pass in change reporters.

There was an inconsistency when handling invalidated passes in that they
always popped the stack even if the pass was filtered out or if it was for a
function that was filtered out. However, for invalidated passes, no IR
is given in the callback from the pass manager so it is difficult to tell
if an invalidated pass was for a filtered function (which would not have
a stack entry previously). Only the banner is printed for filtered,
invalidated or ignored passes so just print the banner message that the pass
is invalidated rather than being more specific and pop the stack (which will
always now have an entry).

I noticed a problem in the code and posted the fix separately as I did not want to confuse it with the changes based on the review comments.

Make IRChangePrinter a member of StandardInstrumentations to control
object lifespan.

Move template base class for change printers to header file and make
IRChangePrinter a member of StandardInstrumentations. This gives the
object the appropriate life-span rather than using a static object.
This also makes registerCallbacks a member function for consistency.
This is in response to comments made by reviewers yrouban and ychen.

Thanks for the review comments. I have tried to address all of your concerns above and in the last set of changes posted.

llvm/lib/Passes/StandardInstrumentations.cpp
69

While it may be more consistent, I think it would also be more inconvenient for the user and would require the user to either know the PassRegistry.def id or look it up. However, with the textual class name, it can be determined by looking at the output itself. So, if the output is being examined and it is determined that a pass can be filtered out, the name of the pass is there in the output. Also, to use the PassRegistry.def id would require a map to get back to the textual class name as that is what the callbacks provide and would require updating the mapping function each time a pass name changed or a new pass was added or deleted.

220–252

I don't understand these comments. All 3 callbacks are registered and there is no testing so there can be no early return. Regarding the shared_ptr and delete comments, the object was made a member of StandardInstrumentations, which ovviates the need for the static member and control of life-span.

235

I am not sure I understand what you mean... If you mean that changes in order should be reported as no change, that does not happen with this option. As the comparison exists, if the order of functions change in a module pass or the order of blocks change in a function pass, this will report it as changed as it is just comparing strings. This may or may not be what the user desires but a more complex difference mechanism would be required to not report changes due to ordering changes only and would be a different change reporter. It would be possible to create such a reporting system using these base classes and a subsequent change reporter that I will be putting up for review when this has landed will provide more functionality that would ease creating such a change reporter.

461

Note that the life-span of the object extends beyond this function (which is why it was static). I changed it to be a member of StandardInstrumentations object, which gives it the appropriate life-span and there is no longer a static member.

565

Done as part of the code changes.

llvm/test/Other/change-printer.ll
17

I'm sorry but I do not understand what you mean.
I tried to choose 2 passes that currently and would likely continue to make changes to the source in the test and the tests only check that changes are made and not what the changes actually are, to avoid churn. I think the only way to completely avoid the possibility of churn would be to not test it.

jamieschmeiser added inline comments.Aug 25 2020, 3:42 PM
llvm/test/Other/change-printer.ll
17

I now know what you mean and will look into the possibility.

See comments regarding the unit tests

llvm/test/Other/change-printer.ll
17

I do not think it is possible to capture the output from dbgs() within the unit test framework. The only information I could find about capturing std::out or std::err involved using non-public APIs in the test framework. Also, llvm::dbgs() does not actually use std::out but uses its own buffer. This buffer can be sized but that would require altering the hidden option controlling the size to make it non-static so that it could be accessed within the test, which doesn't seem like a good plan. I think these tests are the best that can be done and if the concern is that they are fragile, then I will remove them if you prefer, seeing as this is for debugging information anyway.

Create 2 dummy passes, one that changes the IR by adding a dummy attribute
to a function, the other which does not change the IR. Neither are
added to any pipeline. Change the -print-changed unit test to use these
2 passes to make the unit test robust.

aeubanks added inline comments.Aug 31 2020, 9:24 PM
llvm/test/Other/change-printer.ll
17

The IR seems a bit heavy. Could you do something simple like

define i32 @f() {
  %a = add i32 2, 3
  ret i32 %a
}

and run
opt -passes='no-op-function,instsimplify'?
no-op-function serves as make-no-change and instsimplify will simplify the IR to just ret i32 5, serving as make-a-change.
That way there's no need to add two more passes to LLVM.

Respond to review comments: remove dummy passes created in last
changeset and change tests to use no-op-function and instsimplify.

yrouban added inline comments.Sep 1 2020, 9:06 PM
llvm/include/llvm/Passes/StandardInstrumentations.h
94

minor: it would be more convenient to have meaningful names inplace of F1 - F8.

123–144

minor comments:

swap 1st and 2nd parameters of HandleAfter for consistency with the other callbacks.

consider const for Before and After parameters:

std::function<bool(const T &Before, const T &After)> Same;
std::function<void(StringRef PassID, std::string &Name, const T &Before, const T &After, Any)>   HandleAfter;

[NFC] Respond to review comments: change parameter order, change variable
names, add const to parameters

yrouban accepted this revision.Sep 2 2020, 7:59 PM

No further complains from me.
lgtm if the other reviewers' comments are addressed.

This revision is now accepted and ready to land.Sep 2 2020, 7:59 PM

I'm actually very excited for this, it makes tracking down a bad transform easier and would have helped in earlier investigations!

llvm/lib/Passes/StandardInstrumentations.cpp
59
69

I'm also in favor of using the short name like "instsimplify" that matches --print-before/after, and -passes=, it seems more consistent. If you're looking at IR, you probably are familiar with how to run passes under opt, which requires knowing the short name.

However, that doesn't work under the NPM yet for --print-before/after, see https://bugs.llvm.org/show_bug.cgi?id=47370.

So for now we can proceed with this, but I'll probably try and fix this up alongside --print-before/after later.

229

I don't think this is necessary since the ModuleToFunctionPassAdaptor already filters declarations out.

Respond to review comments: fix comment and remove unnecessary test
from isInterestingFunction. Also, mark some functions inline.

jamieschmeiser marked an inline comment as done.Sep 3 2020, 7:18 AM

Thanks for doing the reviews.

llvm/lib/Passes/StandardInstrumentations.cpp
69

I am fine with this changing when 47370 is fixed. I think it will be changed automatically if the pass id that is sent on the callback changes. It will also change the name that it printed out so it will be consistent and the cut/paste aspect of my earlier comments would remain. My main concern was having to do a mapping function that would need to be kept up to date.

229

I have removed it and marked these functions inline. I think I put that test in while working on some of the follow-on reporters that report for modules so I suspect I will have to add it back in later.

Merge of last 2 changesets for committing.

This revision was landed with ongoing or failed builds.Sep 3 2020, 8:52 AM
This revision was automatically updated to reflect the committed changes.
rupprecht added inline comments.
llvm/test/Other/change-printer.ll
57

I'm seeing test failures that this should actually be: ModuleToFunctionPassAdaptor<llvm::PassManager<llvm::Function>> (note the lack of whitespace at the end in >>).

Is it safe to change it to that, or is this test somehow sensitive to the build/runtime environment? > > being required is a legacy of older versions of C++.

jamieschmeiser added inline comments.Sep 3 2020, 10:44 AM
llvm/test/Other/change-printer.ll
57

@rupprecht I see this and can fix this by removing the tail part of the CHECK. There are probably others that need the same fix and I will do this fix right away. Thanks for pointing this out.

jamieschmeiser added inline comments.Sep 3 2020, 10:59 AM
llvm/test/Other/change-printer.ll
57

Actually, the better fix is to just use a reg-ex to make the space optional, which I will do.

ychen added a comment.Sep 3 2020, 11:53 AM

Sorry to get around to this late. Some comments. If possible, please revert and recommit the patch after we solve these remaining issues.

llvm/lib/Passes/StandardInstrumentations.cpp
222

There is llvm::isSpecialPass for this.

235

Uselist order

235

(Maybe as a followup work) Please check the declaration of Module::print. Makes sure *ShouldPreserveUseListOrder* is true when printing IR for comparison. It should work out-of-box for Module, but needs some work for other IR units. This is helpful for tracking passes that may transform IR differently depending on the uselist order.

ychen added inline comments.Sep 3 2020, 11:53 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
92

It would be nice to use IRUnitT instead of T.

llvm/lib/Passes/StandardInstrumentations.cpp
322

Why is the round-trip of defining these here and passing them into the ctor of IRChangePrinter? Why do we need private callbacks such as ChangePrinter::HandleInitialIR. It seems just a trivial wrapper of handleInitialIR.

llvm/test/Other/change-printer.ll
6

Instead of check0, check1, check2, could you use some short meaningful string. Say check0->check-simple, check1->check-changed-pass, etc..

jamieschmeiser reopened this revision.Sep 3 2020, 1:31 PM
jamieschmeiser added inline comments.
llvm/lib/Passes/StandardInstrumentations.cpp
235

I'm sorry but I still do not understand... This unordered_set is just a list of pass names that are specified using the hidden option -filter-passes to allow filtering of output for passes. So, if you do not want output for a particular pass, you specify it with -filter-passes and that pass will be listed as being filtered out even if it makes changes. This is similar to -filter-print-funcs and this function is similar to llvm::isFuntionInPrintList. The output for a module is generated using unwrapAndPrint so I think the change that you are implying would be made in printIR that takes a Module (near line 139) and would also affect other users of that code (such as print-[before|after]-all). I think such a change should be handled separately as it doesn't just affect this new code and I am not sure of the implications.

322

The reason for the round-trip structure is that there are more instantiations of these templates that report changes in different forms still to come. I will be subsequently posting reviews for at least 2 different change reporters based on these base classes. By defining these templates in this fashion, the functionality of comparing different IR representations and determining when changes have occurred and should be filtered, reported or ignored is all in common in the base classes. The derived classes will be called when interesting events happen and only need to handle presenting the events without having to determine when the events occur.

This revision is now accepted and ready to land.Sep 3 2020, 1:31 PM

Respond to review comments: Change template parameter name, use
llvm::isSpecialPass, make space between template end delimiters optional
and use more meaningful check prefixes in tests.

aeubanks accepted this revision.Sep 6 2020, 5:58 PM

lgtm after ychen's comments are addressed

ychen added inline comments.Sep 9 2020, 11:05 AM
llvm/lib/Passes/StandardInstrumentations.cpp
235

Sorry. The position of this comment is a bit misplaced. What I meant is that uselist order is also part of IR (https://llvm.org/docs/LangRef.html#use-list-order-directives) but it is not printed by default. When we comparing IR here, it would be helpful to make sure uselist order changes are also captured by this patch.

322

Maybe I missed something. This sounds like either virtual inheritance or CRTP are better alternatives than many function objects as class members. I would prefer CRTP.

jamieschmeiser added inline comments.Sep 14 2020, 7:07 AM
llvm/lib/Passes/StandardInstrumentations.cpp
322

@ychen Sorry for the delay in response but I had to concentrate on preparing my tutorial for the meeting as the recording is due today. Anyway, I don't really think CRTP is really applicable here because there really aren't polymorphic calls being made. The calls are registered with the PIC and at that point, everything about the calls is essentially known so there isn't polymorphism in the calls. There are 3 ways of handling the callbacks: lambdas (as I've done), virtual overrides and leaving the functions to be defined by the template instantiation. Pre-C++11, I would have used virtual overrides and I think this is a clean way of handling this situation. The bases would be abstract and the derived classes would just provide the overrides. However, it seems that the llvm community shies away from virtual hierarchies in favour of CRTP which is appropriate for the way it is mostly used, but not in this situation IMHO. Virtuals have a bad rap for being unefficient but would be fine here. I don't like leaving functions that have to be provided by template instantiations as the errors provided when they are incorrect often come from the linking stage and it can be very difficult to determine how to properly provide the necessary functions in the instantiation. Lambdas seem to be preferred over virtuals so I chose them. I would be fine with changing the hierarchy to use virtual functions but I would shy away from requiring the instantiation from providing missing functions even though it would be a more efficient way of calling the functions since it makes it harder to extend the hierarchy. Efficiency isn't the main concern here since this is for aiding in code development/understanding and the overhead of virtual calls or calling through function pointers (lambdas) is small in comparison to the overhead involved in saving the representation of the IR. Would you prefer this to be expressed with virtual functions? If so, I will change it but I would rather avoid leaving functions to be provided by instantiations. I did that in one of the subsequent PRs and found it cumbersome and difficult.

Respond to review comments: Set ShouldPreserveUseListOrder to true
when generating the output for a module.

Respond to review comments: change lambdas to virtual function calls and
place code directly in functions.

ychen accepted this revision.Sep 15 2020, 10:45 AM

One more nit. Thank you!

llvm/include/llvm/Passes/StandardInstrumentations.h
109

Please remove mentioning callback here and after.

[NFC] Respond to reviewers comments: Update comments.

thakis added a subscriber: thakis.Sep 16 2020, 10:44 AM

This seems to break tests on most bots, e.g. http://45.33.8.238/linux/27972/step_12.txt

On other bots it doesn't build: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/14540/steps/build%20stage%201/logs/stdio

PTAL, and revert for now if it's not a quick fix.

mkitzan added inline comments.
llvm/lib/Passes/StandardInstrumentations.cpp
406

@thakis and @jamieschmeiser this extra semi-colon appears to be the cause of the recent build breaks.

@mkitzan, Thanks for fixing the build by removing the extra semicolon. Also, removing the call to the base dtor is fine.
commit 6859d95ea2d0f3fe0de2923a3f642170e66a1a14
Author: Michael Liao <michael.hliao@gmail.com>
Date: Wed Sep 16 14:43:08 2020 -0400

mkitzan added a subscriber: hliao.Sep 16 2020, 1:30 PM

I can't take credit for that. It was @hliao who committed the fix (thanks btw!).

This commit seems to cause a test failure on some bots, e.g. here.

This commit seems to cause a test failure on some bots, e.g. here.

I think it was already fixed, the bot has been green for the last 2 runs.

dyung added a subscriber: dyung.Sep 17 2020, 12:17 AM

This commit seems to cause a test failure on some bots, e.g. here.

I think it was already fixed, the bot has been green for the last 2 runs.

This is still failing on Windows bots:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/1269

kuhnel added a subscriber: kuhnel.EditedSep 17 2020, 12:35 AM

It also broke pre-merge testing for everyone: https://buildkite.com/llvm-project/llvm-master-build/builds/1071#7bfa9e1f-a2fe-47aa-8263-d2646ac0a0f5

I tried to revert, but it causes a merge conflict :(
Please fix ASAP.

This comment was removed by goncharov.
dyung added a comment.Sep 17 2020, 1:34 AM

I have reverted this change and the follow-up commit in b03c2b8395ba94fb53f1e73a6473faedf628bbd9. Hopefully this should get the bots green again.

jamieschmeiser reopened this revision.Sep 17 2020, 8:50 AM

It appears that the failing tests are due to differences in the PassID that is passed in the PassInstrumentationCallbacks, which is presumably build compiler/platform dependent. I know that the name sent is language dependent because it is the mangled name for C++ but all of these tests are based on input that was originally C so that should be fine. I will alter the tests to have a regex accepting anything for the PassID and just verify the constructed part of the banner and the name, which should make these tests robust and still test the functionality.

This revision is now accepted and ready to land.Sep 17 2020, 8:50 AM

Respond to build/test problems: Remove extra ;, remove call to base dtor,
change tests to have regex for PassID and ModuleID

MaskRay requested changes to this revision.Sep 17 2020, 9:43 PM
MaskRay added a subscriber: MaskRay.
MaskRay added inline comments.
llvm/include/llvm/Passes/StandardInstrumentations.h
131

Prefer member initializer

InitialIR = true

llvm/lib/Passes/StandardInstrumentations.cpp
47

No need to specifically to call out "hidden". Other options don't document themselves "hidden".

52

The relations with -filter-passes and print-module-scope probably should be moved to a separate graph, because several other -print-* (e..g -print-after-all) share the documentation.

270
287
301

Was

302

same has a trivial definition. Consider deleting the member function

352
403

Before == After

This revision now requires changes to proceed.Sep 17 2020, 9:43 PM
MaskRay added inline comments.Sep 17 2020, 9:45 PM
llvm/include/llvm/Passes/StandardInstrumentations.h
108

Capitalize comments and end with full stops.

ditto below

llvm/test/Other/change-printer.ll
57

The current tested form is too loose. Worth adding some CHECK-NEXT: with the first line of the dump, e.g. CHECK-NEXT: define i32 @g()

ychen added a comment.Sep 17 2020, 9:58 PM

Respond to build/test problems: Remove extra ;, remove call to base dtor,
change tests to have regex for PassID and ModuleID

The regex checking is not sufficient. Could you please check the common sub-string of GCC's and MSVC's output?

llvm/lib/IR/LegacyPassManager.cpp
90

Add a comma before "always"

llvm/test/Other/change-printer.ll
7

For check prefixes, using dash is much more common than underscore.

MaskRay added inline comments.Sep 17 2020, 10:01 PM
llvm/test/Other/change-printer.ll
7

I am fine with underscores because several directives use dashes and underscores are more distinguishable.

I think my biggest question is about the necessity of the ChangePrinter base class... The paragraph in the summary hints on that it is probably necessary....... but the link is unreadable and I think we need other concrete derived classes to justify the base class design.

The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information.

@ychen, Why are the regexs not sufficient for testing? I would rather not "check the common sub-string of GCC's and MSVC's output" because it has already been sufficiently demonstrated that this is fragile in the test. Referring to the code creating one of the banners:
SmallString<20> Banner = formatv("* IR Dump After {0} *", PassID);
We see that the test checks everything except the PassID (the name is added after) and the rest of the banner is distinct in all cases. I have changed the regexs to {{.+}} which will ensure that there is actually something there but checking anything else will make the test susceptible to failing based on the compiler. The important part of the test is captured in that the banner shows that the dump is either filtered, ignored, output, etc and that the portion of IR is correct. This is the intent of the tests, rather than testing how different compilers construct a PassID.
@MaskRay From earlier comments:
"The reason for the round-trip structure is that there are more instantiations of these templates that report changes in different forms still to come. I will be subsequently posting reviews for at least 2 different change reporters based on these base classes. By defining these templates in this fashion, the functionality of comparing different IR representations and determining when changes have occurred and should be filtered, reported or ignored is all in common in the base classes. The derived classes will be called when interesting events happen and only need to handle presenting the events without having to determine when the events occur."
There are 2 other derived change reporters currently up for review based on these base classes, although the PRs need updating based on changes made in this PR. This makes at least 3 derived classes. See https://reviews.llvm.org/D87000 and https://reviews.llvm.org/D87202

llvm/lib/Passes/StandardInstrumentations.cpp
52

-filter-passes only supports this option and no other options. It is new with this code and is described here because it is easier to understand its usage when described in this context.

302

Same is a virtual override and must exist. See discussion of base class in response to another of your comments.

Respond to review comments: change comments, few code revisions, notably with
std::string. Tighten up test cases with more -NEXT checks of the IR.

Fixed a bug so that the initial IR reported is always the module.

Detecting changes of IR transformations is pain and I am really exited to see -print-changed, the diff version -print-changes and -dot-cfg-changed . Added @fedor.sergeev who is the original implementer of -print-* in the new PM.

Can you mark the dependency relation of D87000 clearly? Why does D87000 delete lots of code added in this patch? If you are organizing patches in a series, please make sure adjacent patches don't cancel each other locally.

@MaskRay, https://reviews.llvm.org/D87000 moves a bunch of the code into a separate file, rather than deleting it. -print-changed, by itself, doesn't add enough to justify it's own file but once you add in -print-changes and -dot-cfg-changes, it grows to a significant amount of code, so I moved it to a separate file. I had already marked the other PRs as children in a chain, which can be seen in the stack tab of the revision contents. Is there another way that I should indicate the relationship between the PRs? Thanks for your comments about these 3 PRs. I have found these ways of looking at the changes very useful, particularly -dot-cfg-changes. (Shameless promotion alert:) I will be presenting these three ways of looking at the IR in my tutorial at the upcoming virtual developers meeting, including a demo of -dot-cfg-changes.

@ychen, Why are the regexs not sufficient for testing? I would rather not "check the common sub-string of GCC's and MSVC's output" because it has already been sufficiently demonstrated that this is fragile in the test.

"check the common sub-string of GCC's and MSVC's output" is not fragile because we only support GCC/Clang/MSVC as the host compiler. For PassID, GCC and Clang produce similar if not identical results. Checking PassID for only one compiler is fragile yes.

The important part of the test is captured in that the banner shows that the dump is either filtered, ignored, output, etc and that the portion of IR is correct. This is the intent of the tests, rather than testing how different compilers construct a PassID.

That covers " dump is either filtered, ignored, output," functionality but it does not make sure that these functionalies are applied to the right pass and in the right order. Besides, {{.+}} makes it hard to understand the test.

Fair enough, I'll come up with a regex that works for all three. I suspect that this will need to change in response to https://reviews.llvm.org/D87216

Put text in tests that indicates the name of the pass in the banner,
using regex for template arguments.

@ychen, I changed the tests so that the banners check the names of the passes, based on other tests I found that also rely on pass ids. The other tests use regex to handle template arguments and specify other pass ids directly, so that is what I changed the tests to do.
@MaskRay I believe your concerns have been addressed; if not, please let me know.

MaskRay accepted this revision.Sep 30 2020, 1:29 PM
This revision is now accepted and ready to land.Sep 30 2020, 1:29 PM
ychen accepted this revision.Sep 30 2020, 1:35 PM

@ychen, I changed the tests so that the banners check the names of the passes, based on other tests I found that also rely on pass ids. The other tests use regex to handle template arguments and specify other pass ids directly, so that is what I changed the tests to do.

Thanks. LGTM.