Changeset View
Standalone View
lib/Transforms/IPO/PassManagerBuilder.cpp
Context not available. | |||||
"enable-loop-versioning-licm", cl::init(false), cl::Hidden, | "enable-loop-versioning-licm", cl::init(false), cl::Hidden, | ||||
cl::desc("Enable the experimental Loop Versioning LICM pass")); | cl::desc("Enable the experimental Loop Versioning LICM pass")); | ||||
static cl::opt<bool> | |||||
mehdi_amini: Why two options? | |||||
Not Done ReplyInline Actionsthis is adopted from the earlier implementation where we have a separated inline passes. Since now we are using the standard inliner with different thresholds. I think we can get rid of preinline option and keep disable-preinline option. xur: this is adopted from the earlier implementation where we have a separated inline passes. Since… | |||||
DisablePreInliner("disable-preinline", cl::init(false), cl::Hidden, | |||||
cl::desc("Disable pre-instrumentation inliner")); | |||||
static cl::opt<int> PreInlineThreshold( | |||||
"preinline-threshold", cl::Hidden, cl::init(75), cl::ZeroOrMore, | |||||
cl::desc("Control the amount of inlining in pre-instrumentation inliner " | |||||
"(default = 75)")); | |||||
PassManagerBuilder::PassManagerBuilder() { | PassManagerBuilder::PassManagerBuilder() { | ||||
OptLevel = 2; | OptLevel = 2; | ||||
SizeLevel = 0; | SizeLevel = 0; | ||||
Not Done ReplyInline ActionsThese seem pretty arbitrary numbers? mehdi_amini: These seem pretty arbitrary numbers? | |||||
Not Done ReplyInline ActionsNo. They are from current Os and Oz thresholds xur: No. They are from current Os and Oz thresholds | |||||
Not Done ReplyInline ActionsUh? Since we have Os/Oz as function attribute, it seems strange to me to have this hard-coded on a global level. mehdi_amini: Uh? Since we have Os/Oz as function attribute, it seems strange to me to have this hard-coded… | |||||
Not Done ReplyInline ActionsAlso is it duplicating llvm::computeThresholdFromOptLevels ? mehdi_amini: Also is it duplicating `llvm::computeThresholdFromOptLevels` ? | |||||
Not Done ReplyInline ActionsIt does not. The pre-inline's role is to do pre-profiling cleanup, but to do full scale inlining, thus it needs to be more conservative (no profile guidance is available at this point). It should not controlled in the same way as the regular inline phase. davidxl: It does not. The pre-inline's role is to do pre-profiling cleanup, but to do full scale… | |||||
Not Done ReplyInline ActionsThere is no mechanizm to do threshold tuning on a function granularity. Even there is one, using it may not help the goal of reducing the final overall size. davidxl: There is no mechanizm to do threshold tuning on a function granularity. Even there is one… | |||||
Not Done ReplyInline ActionsOK. mehdi_amini: OK. | |||||
Not Done ReplyInline ActionsDo you have text size comparison (for profile-use phase) with and without pre-cleanup? If pre-cleanup always reduces final size, there is no need to special case here. If not, we should disable pre-cleanup with Os or Oz is specified. davidxl: Do you have text size comparison (for profile-use phase) with and without pre-cleanup? If pre… | |||||
Not Done ReplyInline ActionsNeeds to have an option to control the value. davidxl: Needs to have an option to control the value. | |||||
Not Done ReplyInline ActionsApparently -inline-threshold will override. mehdi_amini: Apparently `-inline-threshold` will override. | |||||
Not Done ReplyInline ActionsI don't think that is the right option to use. The -inline-threshold option is used to control the regular inliner -- the tuning option of the pre-cleanup should not affect it. davidxl: I don't think that is the right option to use. The -inline-threshold option is used to control… | |||||
Not Done ReplyInline ActionsOK, make sense. mehdi_amini: OK, make sense. | |||||
Context not available. | |||||
// Do PGO instrumentation generation or use pass as the option specified. | // Do PGO instrumentation generation or use pass as the option specified. | ||||
Not Done ReplyInline ActionsA more effective cleanup should probably include SROA + EarlyCSE davidxl: A more effective cleanup should probably include SROA + EarlyCSE | |||||
void PassManagerBuilder::addPGOInstrPasses(legacy::PassManagerBase &MPM) { | void PassManagerBuilder::addPGOInstrPasses(legacy::PassManagerBase &MPM) { | ||||
Not Done ReplyInline ActionsI doubt JumpThreading pass will make any difference here. Can you experiment removing it? davidxl: I doubt JumpThreading pass will make any difference here. Can you experiment removing it? | |||||
if (PGOInstrGen.empty() && PGOInstrUse.empty()) | |||||
Not Done ReplyInline ActionsExperimenting removing this pass too? davidxl: Experimenting removing this pass too? | |||||
return; | |||||
// Perform the preinline and cleanup passes for O1 and above. | |||||
Not Done ReplyInline ActionsCFG simplification and inst combine look ok davidxl: CFG simplification and inst combine look ok | |||||
// And avoid doing them if optimizing for size. | |||||
Not Done ReplyInline ActionsRemove GVN pass? davidxl: Remove GVN pass? | |||||
if (OptLevel > 0 && SizeLevel == 0 && !DisablePreInliner) { | |||||
// Create preinline pass. | |||||
Not Done ReplyInline ActionsWhere is this list coming from? How is it computed? mehdi_amini: Where is this list coming from? How is it computed? | |||||
Not Done ReplyInline ActionsI came up this list. I'm open to add most passes. xur: I came up this list. I'm open to add most passes. | |||||
Not Done ReplyInline ActionsIn testing our games, we did not observe a significant reduction in instrumentation overhead from adding passes after pre-inlining. For example, we started with the set of passes initially proposed in http://reviews.llvm.org/D15828. Then we removed individual passes until there was a significant change in performance of the instrumented build. We were able to remove all of them except for the inlining pass. I.e. only the inlining pass contributed significantly to reducing overhead of the instrumented build. I would be interested to see the results of this experiment on a subset of your benchmarks. I believe the reason for this is that in the codebases we were testing (PS4 games), there are a very large number of calls to trivial single-BB functions (e.g. Vec4::operator+). On one codebase I analyzed, over 80% of all counts in the profile are from single-BB functions. silvas: In testing our games, we did not observe a significant reduction in instrumentation overhead… | |||||
MPM.add(createFunctionInliningPass(PreInlineThreshold)); | |||||
MPM.add(createSROAPass()); | |||||
MPM.add(createEarlyCSEPass()); // Catch trivial redundancies | |||||
MPM.add(createCFGSimplificationPass()); // Merge & remove BBs | |||||
MPM.add(createInstructionCombiningPass()); // Combine silly seq's | |||||
addExtensionsToPM(EP_Peephole, MPM); | |||||
} | |||||
if (!PGOInstrGen.empty()) { | if (!PGOInstrGen.empty()) { | ||||
MPM.add(createPGOInstrumentationGenLegacyPass()); | MPM.add(createPGOInstrumentationGenLegacyPass()); | ||||
// Add the profile lowering pass. | // Add the profile lowering pass. | ||||
Context not available. |
Why two options?