The original ThinLTO pipeline was derived from some
work I did tuning FullLTO on the test suite and SPEC. This
patch reduces the amount of work done in the "link phase" of
the build, and extends the function simplification passes
performed during the "compile phase". This helps build time
by reducing the IR as much as possible during the compile phase
and limiting the work to be performed during the link phase,
while keeping performance on par with the existing pipeline.
Details
- Reviewers: tejohnson
- Commits: rG31407ba009c8: Tweak the ThinLTO pass pipeline
Event Timeline
I compared the performance of all the C/C++ benchmarks in SPEC CPU2006 on an Ivy Bridge x86 machine with gold+ThinLTO, with 3 runs each. It looks performance-neutral (some run-to-run variation that looks like noise, nothing significant).
lib/Transforms/IPO/PassManagerBuilder.cpp
Line | Comment
---|---
428 | Is there any significance to the move of the globalopt invocation here from its original location below LoopVersioningLICM?
448 | Why are the globalDCE before and after globalopt no longer needed (for the reasons stated in the comments)?
lib/Transforms/IPO/PassManagerBuilder.cpp
Line | Comment
---|---
428 | I didn't really design the original position to be after LICM. I wrote this back in October, before ReversePostOrderFunctionAttrs and LoopVersioningLICM existed.
448 | I think my original comments were buggy: GlobalOptimizer performs DCE as well.