This is an archive of the discontinued LLVM Phabricator instance.

Tweak the ThinLTO pass pipeline
ClosedPublic

Authored by mehdi_amini on Apr 30 2016, 5:56 PM.

Details

Summary

The original ThinLTO pipeline was derived from some
work I did tuning FullLTO on the test suite and SPEC. This
patch reduces the amount of work done in the "link phase" of
the build, and extends the function simplification passes
performed during the "compile phase". This helps build time
by reducing the IR as much as possible during the compile phase
and limiting the work to be performed during the link phase,
while keeping performance on par with the existing pipeline.

Event Timeline

mehdi_amini retitled this revision to Tweak the ThinLTO pass pipeline.
mehdi_amini updated this object.
mehdi_amini added a reviewer: tejohnson.
mehdi_amini added a subscriber: llvm-commits.

Fix typo check for LTO

Rebase after splitting other changes

tejohnson edited edge metadata. May 5 2016, 6:39 PM

I compared performance of all the C/C++ benchmarks in SPEC CPU2006 on an Ivy Bridge x86 machine with gold+ThinLTO, with 3 runs each. It looks performance neutral (some run-to-run variation that looks like noise, nothing significant).

lib/Transforms/IPO/PassManagerBuilder.cpp
428

Is there any significance to the move of the globalopt invocation here from its original location below LoopVersioningLICM?

448

Why are the globalDCE before and after globalopt no longer needed (for the reasons stated in the comments)?

mehdi_amini added inline comments. May 6 2016, 9:08 AM
lib/Transforms/IPO/PassManagerBuilder.cpp
428

I didn't really design the original position to be after LICM. I wrote this back in October, before ReversePostOrderFunctionAttrs and LoopVersioningLICM existed.
I originally inserted it right after EliminateAvailableExternally.
In the meantime, LoopVersioningLICM was inserted before EliminateAvailableExternally. In last week's version of this patch, I reorganized ReversePostOrderFunctionAttrs, EliminateAvailableExternally, and GlobalOpt to run before LICM (which belongs more to the function optimization pipeline that follows).
Then I was asked to split the patch, so only this part of the move remains.
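
The ordering described in this comment can be sketched as a toy model. This is only an illustration, not the actual code in PassManagerBuilder.cpp: passes are modeled as plain strings instead of real LLVM pass objects, the helper names (buildThinLTOOrderSketch, indexOf, runsBefore) are hypothetical, and the relative order of the first three passes is an assumption taken from the order in which the comment lists them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: pass names are plain strings, not real LLVM pass objects.
using Pipeline = std::vector<std::string>;

// Hypothetical helper: the ordering as described in the comment above.
// The three module-level passes are scheduled before LoopVersioningLICM.
// The order among the first three is an assumption based on the comment.
Pipeline buildThinLTOOrderSketch() {
  return {
      "ReversePostOrderFunctionAttrs",
      "EliminateAvailableExternally",
      "GlobalOpt",
      "LoopVersioningLICM",
  };
}

// Position of a pass in the pipeline; the pass must be present.
std::size_t indexOf(const Pipeline &P, const std::string &Name) {
  auto It = std::find(P.begin(), P.end(), Name);
  assert(It != P.end() && "pass not found in pipeline");
  return static_cast<std::size_t>(It - P.begin());
}

// True when pass A is scheduled before pass B.
bool runsBefore(const Pipeline &P, const std::string &A,
                const std::string &B) {
  return indexOf(P, A) < indexOf(P, B);
}
```

With this model, runsBefore(buildThinLTOOrderSketch(), "GlobalOpt", "LoopVersioningLICM") holds, which is the property the question about line 428 is asking about.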

448

I think my original comments were wrong: GlobalOptimizer performs DCE as well, so the separate GlobalDCE runs around it are redundant.

tejohnson accepted this revision. May 6 2016, 9:30 AM
tejohnson edited edge metadata.
This revision is now accepted and ready to land. May 6 2016, 9:30 AM