This is an archive of the discontinued LLVM Phabricator instance.

Add remarks describing when a pass changes the IR instruction count of a module
ClosedPublic

Authored by paquette on Oct 10 2017, 3:07 PM.

Details

Summary

This patch adds a remark which tells the user when a pass changes the number of IR instructions in a module. It can be enabled by using -Rpass-analysis=size-info. The point of this is to make it easier to collect statistics on how passes modify programs in terms of code size. This is similar in concept to timing reports, but using a remark-based interface makes it easy to diff changes over multiple compilations of the same program. By adding functionality like this, we can see

  • Which passes impact code size the most
  • How passes impact code size at different optimization levels
  • Which pass might have contributed the most to an overall code size regression

The patch lives in the legacy pass manager, but since it's simply emitting remarks, it shouldn't be too difficult to adapt the functionality to the new pass manager as well. This can also be adapted to handle MachineInstr counts in the later Machine* passes.

Diff Detail

Event Timeline

paquette created this revision.Oct 10 2017, 3:07 PM
paquette updated this revision to Diff 118495.Oct 10 2017, 3:11 PM

Accidentally added some useless cruft in CGSCCPassManager.h in the first diff. Removed it.

davide edited edge metadata.Oct 10 2017, 9:05 PM

Thanks for this, I think it's going to be a useful feature (in some form), as we need to track code size more effectively.
I have some minor comments inline, and Chandler should take a look as well as he fought many battles in the PM area.

lib/IR/LegacyPassManager.cpp
167–197

This doesn't really belong to PMDataManager, if we want something like this it could probably live in Module.

I'm also slightly concerned as this seems to recompute the number of instruction for the whole module every time you run a function pass.
If your TU gets large (e.g. LTO) the overhead will be substantial.
We might strive a more incremental approach, but it's not immediately clear to me how (I'll think about it)

183–184

I'm not sure I understand this one, can you please give an example?

188

we have an early return, so maybe this is redundant. I'm fine either way.

1587

opt-remarks is a function pass and here you're considering as unit of IR the whole module. I think the function pass manager *should* only operate on functions. This bit seems a little odd, to me.

I'd love to hear @chandlerc's thoughts on this,

Also, side note, analyses aren't really supposed to change the IR (at least this is the guiding principle) so you might just skip them as well.
It needs a bunch more tests, the one you have only covers the Module pass manager, Thanks again for working on this!

anemet added inline comments.Oct 11 2017, 11:52 AM
lib/IR/LegacyPassManager.cpp
1587

This is not using the ORE pass for layering reasons. Instead, it works directly with the LLVMContext::diagnose interface.

paquette marked an inline comment as done.Oct 13 2017, 11:38 AM
paquette added inline comments.
lib/IR/LegacyPassManager.cpp
167–197

If we only keep track of deltas, then we might be able to do better. Then the question is how much the actual size of the module at every step matters.

For example, if a function pass modifies the size of the program, then we know how many instructions it changed simply by getting the size of the function it acted on before and after the pass. We can then emit a remark with that delta. That would still give us a measure of how much passes change code size. You would still be in trouble with module passes, but it'd at least reduce the overhead for FunctionPasses and their ilk.

183–184

Only pass managers return non-null on getAsPMDataManager. Thus, we can tell whether or not a pass is a pass manager by checking this. It's also used in the same way later in the file in the TimingInfo class. I should probably elaborate more in the comment as well.

It seems more reasonable to use the PassKind that each pass carries around. However, that doesn't work for the inliner's pass manager, which is categorized as a PT_Module. Thus, you end up getting a remark for both the inliner and its pass manager.

paquette updated this revision to Diff 118984.Oct 13 2017, 4:17 PM

Updated diff to reflect current comments on the review.

Highlights:

  • Instruction counts are now only updated when there has *definitely* been a change.
  • Instruction counts are now computed using deltas on smaller IR unit sizes whenever possible instead of on the entire module size. This will allow us to avoid recomputing module sizes whenever possible.
  • Moved getModuleInstrCount into Module and introduced an equivalent function into Function.
  • Expanded test so that it checks remarks for Module, Function, CGSCC, and BasicBlock passes.

Now that there's discussion on making the new pass manager the default pass manager, I'm wondering if it'd make sense to just go ahead and implement this functionality there entirely? I was hoping to get it done with the LegacyPassManager first then port it over after.

Also, the test is pretty big now to cover all of the pass manager types. Does anyone think there's a more succinct way to do this?

It largely depends on what's your use case, but it's not unreasonable to provide this feature only for the new PM.

Now that there's discussion on making the new pass manager the default pass manager, I'm wondering if it'd make sense to just go ahead and implement this functionality there entirely? I was hoping to get it done with the LegacyPassManager first then port it over after.

Also, the test is pretty big now to cover all of the pass manager types. Does anyone think there's a more succinct way to do this?

Sorry, I lost part of your question.
I'm not immediately concerned with the size of the test. If you want you can probably split in multiple files but I'm not sure if that buys us something.

Reviving this...

I'd personally find analysis like this useful, especially for quickly diagnosing unintuitive code size changes and building small test-cases. If it's okay for this just to be implemented in the legacy pass manager for now, I'd like to move forward with this. However, I'd also like to know the best path forward for implementing it in the new pass manager.

paquette updated this revision to Diff 145572.May 7 2018, 3:44 PM

A bit of cleanup. Updated diff for current pass manager, changed get(IRType)IRCount to a general getIRCount() function.

anemet added a comment.May 7 2018, 4:32 PM

Mostly small things except for the question on whether we should only compute this when the remark is actually enabled.

include/llvm/IR/Function.h
144–145

getInstructionCount is probably a better name? Later to you have more IR->Instruction.

int -> unsigned?

Remove \p since this it's not a parameter?

lib/Analysis/CallGraphSCCPass.cpp
137

How expensive is this in practice (I know that it's linear but we have many passes)? You may want to do this only if the remark is requested.

You should be able to check this with the DiagnosticHandler in the LLVMContext.

lib/IR/LegacyPassManager.cpp
168–169

Just pass by value?

184

DiagnosticInfoOptimizationBase::Argument -> NV (see many examples elsewhere).

192

Add a comment that this is not using ORE for layering reasons.

paquette updated this revision to Diff 146428.May 11 2018, 3:40 PM
paquette marked 4 inline comments as done.

Updated diff to address review comments.

  • Since we can't use namespace ore here, I kept the DiagnosticInfoOptimizationBase::Argument parameters instead of switching them to NV. I'll submit a follow-up patch to resolve this. I added a FIXME explaining what to do here in the meantime.
  • All computation for the remarks is now contained in emitIRSizeChangedRemark, aside from the initial computation of InstrCount. This ensures that we don't count instructions unless we need to.
  • General cleanup, etc.
paquette updated this revision to Diff 146927.May 15 2018, 3:16 PM
  • Updated the test to use globaldce for the module pass
  • Ensured that we don't *ever* get the module's instruction count unless the remark has been requested but the user. This is handled in a new helper function, initSizeRemarkInfo
  • Improved the test. It only runs opt once per pass type we want to handle. It now no longer has hard-coded values for sizes, which should make sure it isn't terribly flakey
paquette updated this revision to Diff 146947.May 15 2018, 3:46 PM

A couple more changes:

  • int -> unsigned for getInstructionCount and friends.
  • Change names of things that include "IR" to "Instruction".
  • Add a few more comments.
paquette updated this revision to Diff 147204.May 16 2018, 3:38 PM

Fixed a typo in initSizeRemarkInfo.

paquette updated this revision to Diff 147357.May 17 2018, 11:26 AM

A few changes here after running this on some more complicated code.

  • I had entirely forgotten to implement it for loop passes in this patch. Added that. ("Why is instcombine making this module 50% bigger? Oh; it's loop unrolling.")
  • Changed and LocalChanged aren't reliable for this sort of task. Removed the reliance on that from emitting remarks.
  • Getting the initial size of the module at the entry to runOnBlah is unreliable due to missing initialization steps/passes. Moved this down to near where the remark is emitted.
  • Added documentation to the test to explain what it's doing, what behaviour it tests, why/why not.
  • Made the test enforce that instruction counts and deltas must be within a certain range, and not just any old number.
anemet accepted this revision.May 17 2018, 4:18 PM

Mostly nits. LGTM with the requested changes.

(Nice test!)

lib/IR/Function.cpp
198

unsigned

lib/IR/LegacyPassManager.cpp
166–182

Please make this logic more tight. E.g. the initial check on F is a constant true and this last one is a constant false.

test/Other/size-remarks.ll
11

Can you please file a bugzilla that we should support remarks with modules as the code region (Product:tools, Component:opt-viewer).

The tricky part is that Function is pretty deeply baked into the remarks so this may need some work.

This revision is now accepted and ready to land.May 17 2018, 4:18 PM
davide accepted this revision.May 17 2018, 4:26 PM

happy to see this going in when Adam is.

xbolva00 added inline comments.
lib/Analysis/LoopPass.cpp
154 ↗(On Diff #147357)

You also use
Module &M = *F.getParent();

2x Module &M = *(F.getParent())
1x Module &M = *F.getParent()

Please check and make it same

efriedma added inline comments.
lib/IR/Function.cpp
200

Should we use std::distance(BB.instructionsWithoutDebug.begin(), BB.instructionsWithoutDebug().end()) instead, so the numbers are consistent whether or not debug info is enabled?

paquette updated this revision to Diff 147543.May 18 2018, 10:08 AM
paquette marked 4 inline comments as done.

Updated patch to address review.

  • Fixed stylistic issues
  • Changed the basic block instruction count as per @efriedma's suggestion
  • Tightened up the logic in emitInstrCountChangedRemark
  • Fixed up the leftover int in Function::getInstructionCount
  • Small personal nit on the test. [[DELTA:-*[1-9][0-9]*]]->[[DELTA:-?[1-9][0-9]*]]. The negative sign should have 0 or 1 repetitions, not 0 or more.
paquette marked 7 inline comments as done.May 18 2018, 10:09 AM
paquette added inline comments.
lib/Analysis/LoopPass.cpp
154 ↗(On Diff #147357)

Changed them all to *F.getParent()

lib/IR/LegacyPassManager.cpp
184

This should be done in a follow-up patch.

test/Other/size-remarks.ll
11

I'll do that shortly. I'm waiting on a password reset email from bugzilla. :)

xbolva00 added inline comments.May 18 2018, 11:22 AM
lib/IR/LegacyPassManager.cpp
170

static_cast?

Ah, you’re right.

Changed in r332758.