Ensure that all state that depends on a module or context is cleared, so that there are no issues when the same instance of the pass is used to process different modules.
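For illustration, a minimal sketch of the intended pattern (the pass below is made up, not part of this change): any member state tied to a particular module or context is dropped at the start of run(), so the same instance can later process another module.

```
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"

namespace {
// Hypothetical pass with per-module state.
class ExampleCountingPass : public llvm::PassInfoMixin<ExampleCountingPass> {
  // Only meaningful for the module currently being processed.
  llvm::DenseMap<const llvm::Function *, unsigned> BlockCounts;

public:
  llvm::PreservedAnalyses run(llvm::Module &M, llvm::ModuleAnalysisManager &) {
    // Drop anything left over from a previous run() on another module,
    // so this instance can safely be reused.
    BlockCounts.clear();
    for (llvm::Function &F : M)
      BlockCounts[&F] = F.size(); // number of basic blocks
    return llvm::PreservedAnalyses::all();
  }
};
} // namespace
```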
Repository: rG LLVM Github Monorepo
If we're committing to making this work for all passes, we should have a test for it, perhaps something like making opt's -run-twice flag work with the new pass manager. Hopefully we can eventually make opt -passes='default<O3>' -run-twice not crash.
Is there an RFC (or other context) for this patch series? It's not entirely obvious to me why we want to go out of our way to allow multiple run() calls.
I would think that a pass manager instance can be reused for processing different modules (that's why there might be multiple calls to run). This is something that has so far worked fine overall with the legacy pass manager. Is there something fundamentally different in the new pass manager infrastructure that prevents the same pass manager from being used for more than one module?
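For concreteness, the kind of reuse I have in mind looks roughly like this (a sketch only, not code from this series; the file loop and the choice of the default O2 pipeline are just placeholders):

```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Support/SourceMgr.h"
#include <memory>
#include <string>

using namespace llvm;

void optimizeAll(ArrayRef<std::string> Files) {
  LLVMContext Ctx;
  LoopAnalysisManager LAM;
  FunctionAnalysisManager FAM;
  CGSCCAnalysisManager CGAM;
  ModuleAnalysisManager MAM;

  PassBuilder PB;
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  // Built once, then reused for every input module below.
  ModulePassManager MPM =
      PB.buildPerModuleDefaultPipeline(OptimizationLevel::O2);

  for (const std::string &File : Files) {
    SMDiagnostic Err;
    std::unique_ptr<Module> M = parseIRFile(File, Err, Ctx);
    if (!M)
      continue;
    // Only safe if every pass in MPM resets any state that depends on the
    // previously processed module -- which is what these patches are about.
    MPM.run(*M, MAM);
  }
}
```

Whether the analysis managers can also be kept across modules, or need to be cleared between them, is part of the same question.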
Supporting this adds complexity, even if only a little.
Is the overhead of creating multiple pass managers measurable with your use case?
With the legacy pass manager, one of the concerns was the extra latency caused by constructing a pass manager for every input being processed, in particular for small input modules. As mentioned in https://reviews.llvm.org/D130952#3693697, I'm trying to port the existing logic to the new pass manager infrastructure, and while doing so I've found a number of issues.
If this use case is not explicitly supported/recommended, then I will check whether it is acceptable, with the new pass manager infrastructure, to rebuild the optimization pipelines every time.
If you can show that doing this gives measurable compile-time improvements for your use case then I think this is reasonable, but it'd be good to confirm that it actually helps before proceeding.
I did a quick comparison on three classes of inputs (trivial(*), small, large) for a custom pipeline based on the default O2 one.
I measured the time needed to build the pass manager for every input and compared it with the time needed to run the pass manager on each input.
- Trivial inputs: rebuilding the (new) pass manager for every module takes ~8.4% of the combined time (66.4% with the legacy pass manager).
- Small inputs: ~1.4% of the combined time (legacy: 27.4%).
- Large inputs: ~0.2% of the combined time (legacy: 3.7%).
In the overall comparison between legacy and new, the new pass manager (even rebuilding it for every input) led to better overall running time on all three classes of inputs.
Based on these preliminary results, for now I can drop the requirement of reusing pass managers when adopting the new pass manager infrastructure.
(*) The class of trivial inputs (although quite unrealistic) represents the worst case for this experiment -- basically there is nothing to optimize, and it roughly measures the pass manager infrastructure overhead.
I understand that to allow the reuse of pass managers, each pass must ensure that no state is maintained between run invocations, which is a very subtle issue, not easy to reproduce and debug, and it requires pass authors to be more careful. This was also true with the legacy pass manager, and in the past bugs of this nature have been fixed (e.g. what led to the -run-twice option in opt).
If for the new pass manager there is no expectation to support reusing the same pass manager instance for processing different modules, then I think it would be better to have a note about this in the pass manager documentation.
Are these three patches the only ones needed to support this? The main argument I see in favor of allowing this is that function passes *can* be reused (after all, we call them for each function), so not allowing reuse of module passes is "inconsistent".
Short answer: it depends on what "supporting this" means.
Here's a description of how I discovered these issues after I encountered the problem with ModuleInlinerWrapperPass.
I used a modified version of opt to do the following (with some additional hacks (*)):
- serialize to bitcode (preserving use-list order) the input module before any pass is being run
- allocate a new context
- materialize the module from bitcode in the new context
- run the pass manager (excluding the output-writing pass) on the newly materialized module
- destroy module and context
- continue with the regular flow of opt and run the original module through the same pass manager
(*) To reduce the number of false failures, for the first run of the pass manager on the cloned module I used a custom error handler in the context to drop errors and prevent exit(1) from the default handler, and I swapped the underlying file descriptors used by llvm::outs() and llvm::errs() with new ones pointing to files, so that messages from the first run do not pollute the output stream and cause FileCheck failures. Moreover, after the first run I reset the statistics, the debug counters (I locally added a resetCounters function to DebugCounter), all the analysis managers, and the data structures in DebugInfoBeforePass.
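In code, the cloning and double-run step looks roughly like this (heavily simplified; the function name is made up, and the error-handler and stream-redirection hacks described above are elided):

```
#include "llvm/ADT/SmallVector.h"
#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

void runTwiceOnCopies(Module &M, ModulePassManager &MPM,
                      ModuleAnalysisManager &MAM) {
  // Serialize the original module, preserving use-list order so the clone
  // behaves like the original.
  SmallVector<char, 0> Buffer;
  raw_svector_ostream OS(Buffer);
  WriteBitcodeToFile(M, OS, /*ShouldPreserveUseListOrder=*/true);

  {
    // Materialize the clone in a fresh context and run the pipeline on it.
    LLVMContext FreshCtx;
    Expected<std::unique_ptr<Module>> Clone = parseBitcodeFile(
        MemoryBufferRef(StringRef(Buffer.data(), Buffer.size()), "<clone>"),
        FreshCtx);
    if (Clone)
      MPM.run(**Clone, MAM);
    else
      consumeError(Clone.takeError());
    // The clone and its context are destroyed here.
  }

  // In the actual experiment the analysis managers, statistics, debug
  // counters, etc. are reset at this point (mostly elided here).
  MAM.clear();

  // Second run: the original module with the same pass manager instance.
  // Any state a pass kept from the first run shows up as a failure here.
  MPM.run(M, MAM);
}
```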
With the patches D130952, D130954, and D130955 applied, there are still some failures:
1. Some tests in Analysis/BasicAA and Analysis/CFLAliasAnalysis/Steensgaard -- the AAEvaluator counters persist across invocations of run, and the report is printed when the pass object is destroyed.
2. Transforms/FunctionImport/funcimport_cutoff.ll -- FunctionImport has a global counter (a function-scope static variable ImportCount) that is used for debugging purposes (the cutoff option used in some tests to control how many functions should be imported).
3. Transforms/Inline/cgscc-inline-replay.ll fails in one case because an error is raised when the replay file cannot be found, but the custom diagnostic handler does not stop execution, and a nullptr is dereferenced after that.
4. Transforms/Inline/inline_stats.ll fails because the inlining statistics object is not reset.
5. Transforms/WholeProgramDevirt/devirt-single-impl2.ll fails because when DevirtModule fails to read the summary from the file specified via the command line option -wholeprogramdevirt-read-summary it exits with an error message, but during the first run of the PM llvm::errs() has been redirected, hence the FileCheck failure.
6. Some of the CodeGen/BPF and CodeGen/BPF/CORE tests fail because the BPFAbstractMemberAccess function pass has a static map referencing GlobalVariable objects which is never cleared, so once the module containing such globals is gone there are dangling references that might be accessed in subsequent runs.
7. Other/ChangePrinters/print-changed-diff.ll and Other/change-printer.ll -- the InitialIR state of ChangeReporter cannot be reset, so we miss some expected lines from the second run.
8. Other/print-on-crash.ll -- crash during the first run, where llvm::errs() and llvm::outs() have been redirected.
9. Other/opt-bisect-new-pass-manager.ll -- the bisect limit cannot be reset.
For (6), BPFAbstractMemberAccess is a function pass that can insert global variables into a module (and caches them in some static global map). I believe function passes are allowed to insert global variables, so I wonder whether BPFAbstractMemberAccess should be changed into a module pass, which might allow removing the static global map entirely.
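To make the hazard concrete, here is an illustrative reduction of that pattern (made-up names, not the actual BPF code): a function-scope static cache keeps pointers into a module alive after the module and its context are destroyed.

```
#include "llvm/ADT/StringMap.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"

using namespace llvm;

GlobalVariable *getOrCreateHelperGV(Module &M, StringRef Name) {
  // Problematic: the cache outlives any single module, so its entries
  // dangle once that module (and its context) are destroyed.
  static StringMap<GlobalVariable *> Cache;
  auto It = Cache.find(Name);
  if (It != Cache.end())
    return It->second; // may point into an already-destroyed module
  auto *GV = new GlobalVariable(M, Type::getInt32Ty(M.getContext()),
                                /*isConstant=*/false,
                                GlobalValue::ExternalLinkage,
                                /*Initializer=*/nullptr, Name);
  Cache[Name] = GV;
  return GV;
}
```

Keeping such a cache as a member of a module pass that is cleared at the start of run() (or keyed per module) would avoid both the static state and the dangling pointers.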
I believe the other failures are all related to debugging features.
The patches D130952, D130954, and D130955 address problems that could occur in a non-debug scenario.
I don't know whether it is acceptable to have exceptions for the debug-related features, or whether the issues I described should be addressed somehow.