Add an explicit flag to require MSSA when LoopRotate is known to join a
loop pass pipeline with other passes that will require the anlysis.
This undoes the pipeline split from D74574, but keep the analysis from
being run when LoopRotate is run alone in the loop pipeline.
Details
- Reviewers
dmgreen fedor.sergeev nikic fhahn
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Scalar/LoopRotation.cpp | ||
---|---|---|
80 | nit: initialise directly in the initialiser list? | |
88 | Would it be slightly simpler to just have RequireMSSA == true mean that we require and preserve it and RequireMSSA == false means neither require nor preserve it? I think both approaches are fine, but in the current pipeline the case we can preserve MSSA without requiring it is not going to happen I guess? |
This re-regresses PR44408 and PR44889 at ToT today.
I am planning to resolve the underlying cause (a new DT being built when doing MSSA updates) before coming back to this. The regression will not occur when that work is done (I have tested this already).
Rebased after the fix to remove DT recomputation in MSSA. I'm still seeing a small regression on one of the tests in the original PRs (0.6% in instructions on mogrify.i), but it's nowhere near the regression that motivated the pipeline split.
I'm not sure if it's worth pushing this forward if the only motivation is re-merging the pipeline in the LPM, unless it's entirely performance neutral.
@nikic: Would you mind checking the compile-time impact for this patch?
Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=553833958fdea48e41a11ee7e9c104c903deadf5&to=cfbffacf4188264ec40c0fcb7dbd92d0e23b1b74&stat=instructions Numbers are mixed with some improvements, some regressions. Largest individual regression is on libclamav_unsp.c with 1.8% at O3 and 4.2% for ThinLTO (pre-link).
Thank you for the results! Looking at libclamav_unsp.c, I'm seeing 3.96% in instructions for ThinLTO, which I'm guessing matches the results you see.
The fluctuations may make sense, since with the additional MSSA updates in LoopRotate it can do more or less work, depending on the updates. For instructions per cycle I'm seeing it neutral results (0.5%), and for wall time (average over 20 run), also neutral.
Again, I'm not convinced this is worth pushing forward due to it only affecting the LPM, but since the change I'm undoing here was a previous large release regression in clang9, I'm glad to see the DomTree work payed off and this is now showing mixed results.
I'm fine with whichever decisions the reviewers lean towards.
I believe this can be abandoned now, it's no longer relevant now that LICM runs before LoopRotate, so MSSA is required at that point anyway.
nit: initialise directly in the initialiser list?