This patch uses a trick similar to the one in D113947 to run the extra
passes after vectorization only on functions where loops have been
vectorized.
The reason for running the 'extra vector passes' is
simplification/unswitching of the runtime checks created by LV, so there
should be no need to run them if nothing got vectorized.
To do that, a new dummy analysis ShouldRunExtraVectorPasses has been
added. If loops have been vectorized for a function, LV will cache the
analysis. At the moment it uses MadeCFGChanges as a proxy for 'a loop
was vectorized', which isn't perfect (it could be too aggressive, e.g.
when vectorization added no runtime checks), but should be good enough
for now.
The extra passes are then managed by a new FunctionPassManager that
collects 2 sets of passes: one to run unconditionally and one to run
only if ShouldRunExtraVectorPasses has been cached.
The reason it manages 2 sets of passes is that there are some
unconditional passes between LV and the 'extra' passes. Having the pass
manager manage both allows us to query the cache before the
unconditional passes might invalidate it.
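A minimal, self-contained sketch of the ordering described above, in plain C++ rather than against the real LLVM pass manager API. All names here (ToyFunction, ToyExtraVectorPassManager, toyLoopVectorize, runToyPipeline, and the pass names in the log) are hypothetical stand-ins for illustration, not the identifiers used by the patch:

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a function plus its analysis cache (not LLVM's types).
struct ToyFunction {
  bool HasVectorizableLoop = false;
  bool MarkerCached = false;     // models "ShouldRunExtraVectorPasses is cached"
  std::vector<std::string> Log;  // names of the passes that ran, in order
};

using ToyPass = std::function<void(ToyFunction &)>;

// Models LV: it caches the marker only if it actually changed the function.
void toyLoopVectorize(ToyFunction &F) {
  F.Log.push_back("loop-vectorize");
  if (F.HasVectorizableLoop)
    F.MarkerCached = true;
}

// Collects two sets of passes: one unconditional, one conditional.
class ToyExtraVectorPassManager {
  std::vector<ToyPass> Unconditional;
  std::vector<ToyPass> Conditional;

public:
  void addPass(ToyPass P) { Unconditional.push_back(std::move(P)); }
  void addConditionalPass(ToyPass P) { Conditional.push_back(std::move(P)); }

  void run(ToyFunction &F) {
    // Query the cache first: the unconditional passes below may invalidate it.
    const bool RunExtra = F.MarkerCached;
    for (auto &P : Unconditional)
      P(F);
    if (RunExtra)
      for (auto &P : Conditional)
        P(F);
  }
};

// Runs the toy pipeline and returns the names of the passes that executed.
std::vector<std::string> runToyPipeline(bool HasVectorizableLoop) {
  ToyFunction F;
  F.HasVectorizableLoop = HasVectorizableLoop;
  toyLoopVectorize(F);

  ToyExtraVectorPassManager PM;
  PM.addPass([](ToyFunction &Fn) {
    Fn.Log.push_back("unconditional-cleanup");
    Fn.MarkerCached = false;  // models a pass invalidating the cached analysis
  });
  PM.addConditionalPass(
      [](ToyFunction &Fn) { Fn.Log.push_back("extra-simplify"); });
  PM.run(F);
  return F.Log;
}
```

In this sketch the unconditional pass clears the marker, yet the conditional pass still runs when a loop was vectorized, because the marker was read before any pass executed. If the two sets lived in separate managers, the second manager would see the marker already cleared.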
Without this patch, -extra-vectorizer-passes has the following
compile-time impact:
NewPM-O3: +4.86%
NewPM-ReleaseThinLTO: +3.56%
NewPM-ReleaseLTO-g: +7.17%
With this patch, that is reduced to:
NewPM-O3: +1.43%
NewPM-ReleaseThinLTO: +1.00%
NewPM-ReleaseLTO-g: +1.58%
It is probably still too high to enable by default, but much better.
Can ExtraVectorPassManager contain only ConditionalPasses, with a separate FPM for the normal passes? For separation of concerns: this seems to be doing two things that could be separated.