GlobalCleanupPM is mostly unnecessary given that we're about to run the function simplification pipeline anyway.
A couple of remarks, some of which concern not regressing PhaseOrdering tests:
The main issue with just removing GlobalCleanupPM is that we no longer run InstCombine before the inliner.
Another issue is that CGSCC passes before the function simplification pipeline may have less precise IR to work with.
SROA -> EarlyCSE -> InstCombine -> SimplifyCFG should be the passes at the top of the function simplification pipeline.
InstCombine/SimplifyCFG need to run before JumpThreading, so move JumpThreading after those.
This also adds an extra EarlyCSE in the -O1 pipeline to match where GVN runs in the other pipelines; otherwise we would never run EarlyCSE after InstCombine at -O1.
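As a rough illustration of the ordering described above, here is a small standalone C++ sketch. It is not PassBuilder code: the helper names simplificationPrefix and runsBefore are hypothetical, the pass names use the spelling accepted by opt's -passes option, and any passes that sit between these in the real pipeline are omitted.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of the passes at the top of the function
// simplification pipeline, in the order described in this change.
// Intervening passes in the real pipeline are intentionally left out.
std::vector<std::string> simplificationPrefix() {
  return {"sroa", "early-cse", "instcombine", "simplifycfg", "jump-threading"};
}

// Returns true if both passes are present and `earlier` is scheduled
// before `later` in the given pipeline.
bool runsBefore(const std::vector<std::string> &pipeline,
                const std::string &earlier, const std::string &later) {
  auto e = std::find(pipeline.begin(), pipeline.end(), earlier);
  auto l = std::find(pipeline.begin(), pipeline.end(), later);
  return e != pipeline.end() && l != pipeline.end() && e < l;
}
```

With this model, runsBefore(simplificationPrefix(), "instcombine", "jump-threading") holds, which is the constraint that motivates moving JumpThreading.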
Fairly substantial compile time wins:
https://llvm-compile-time-tracker.com/compare.php?from=a5595e9f1feb5960132ffcef539ab425abbe97cc&to=2baa33baa9bfa722329a728a100610daba60c559&stat=instructions:u
IR diffs on llvm-test-suite:
https://github.com/aeubanks/llvm-ir-diff/commit/f833481abce7d2310c84029a5df08d4d9bd943e5
The diffs mostly look neutral except for some gcc torture suite files which are arguably unreasonable.
This regresses D117091 because we no longer run InstCombine before the inliner, losing return value alignment info.
This also regresses pr52289.ll, which was a fuzzed test case to begin with (and was never properly fixed).
Why the extra EarlyCSE pass in the O1 pipeline?