Ensure that all state that depends on a module or context is cleared, so that there are no issues when the same instance of the pass is used to process different modules.
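For illustration, a minimal sketch of the intended pattern (the pass below is made up, not part of this change): any member state tied to a particular module or context is dropped at the start of run(), so the same instance can later process another module.

```
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"

namespace {
// Hypothetical pass with per-module state.
class ExampleCountingPass : public llvm::PassInfoMixin<ExampleCountingPass> {
  // Only meaningful for the module currently being processed.
  llvm::DenseMap<const llvm::Function *, unsigned> BlockCounts;

public:
  llvm::PreservedAnalyses run(llvm::Module &M, llvm::ModuleAnalysisManager &) {
    // Drop anything left over from a previous run() on another module,
    // so this instance can safely be reused.
    BlockCounts.clear();
    for (llvm::Function &F : M)
      BlockCounts[&F] = F.size(); // number of basic blocks
    return llvm::PreservedAnalyses::all();
  }
};
} // namespace
```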
Repository: rG LLVM Github Monorepo
If we're committing to making this work for all passes, we should have a test for it, perhaps something like making opt's -run-twice flag work with the new pass manager. Hopefully we can eventually make opt -passes='default<O3>' -run-twice not crash.
Is there an RFC (or other context) for this patch series? It's not entirely obvious to me why we want to go out of our way to allow multiple run() calls.
I would think that a pass manager instance can be reused for processing different modules (that's why there might be multiple calls to run). This is something that has so far worked fine overall with the legacy pass manager. Is there something fundamentally different in the new pass manager infrastructure that prevents the same pass manager from being used for more than one module?
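For concreteness, the kind of reuse I have in mind looks roughly like this (a sketch only, not code from this series; the file loop and the choice of the default O2 pipeline are just placeholders):

```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Support/SourceMgr.h"
#include <memory>
#include <string>

using namespace llvm;

void optimizeAll(ArrayRef<std::string> Files) {
  LLVMContext Ctx;
  LoopAnalysisManager LAM;
  FunctionAnalysisManager FAM;
  CGSCCAnalysisManager CGAM;
  ModuleAnalysisManager MAM;

  PassBuilder PB;
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  // Built once, then reused for every input module below.
  ModulePassManager MPM =
      PB.buildPerModuleDefaultPipeline(OptimizationLevel::O2);

  for (const std::string &File : Files) {
    SMDiagnostic Err;
    std::unique_ptr<Module> M = parseIRFile(File, Err, Ctx);
    if (!M)
      continue;
    // Only safe if every pass in MPM resets any state that depends on the
    // previously processed module -- which is what these patches are about.
    MPM.run(*M, MAM);
  }
}
```

Whether the analysis managers can also be kept across modules, or need to be cleared between them, is part of the same question.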
Supporting this adds complexity, even if only a little.
Is the overhead of creating multiple pass managers measurable with your use case?
With the legacy pass manager, one of the concerns was the extra latency caused by constructing a pass manager for every input being processed, in particular for small input modules. As mentioned in https://reviews.llvm.org/D130952#3693697, I'm trying to port the existing logic to the new pass manager infrastructure, and while doing so I've found a number of issues.
If this use case is not explicitly supported/recommended, then I will check whether it is acceptable, with the new pass manager infrastructure, to rebuild the optimization pipelines every time.
If you can show that doing this gives measurable compile-time improvements for your use case then I think this is reasonable, but it'd be good to confirm that it actually helps before proceeding.
I did a quick comparison on three classes of inputs (trivial(*), small, large) for a custom pipeline based on the default O2 one.
I measured the time needed to build the pass manager for every input and compared it with the time needed to run the pass manager on each input.
- Trivial inputs: rebuilding the (new) pass manager for every module takes ~8.4% of the combined time (66.4% with the legacy pass manager).
- Small inputs: ~1.4% of the combined time (legacy: 27.4%).
- Large inputs: ~0.2% of the combined time (legacy: 3.7%).
In the overall comparison between legacy and new, the new pass manager (even rebuilding it for every input) led to better overall running time on all three classes of inputs.
Based on these preliminary results, for now I can drop the requirement of reusing pass managers when adopting the new pass manager infrastructure.
(*) The class of trivial inputs (although quite unrealistic) represents the worst case for this experiment -- basically there is nothing to optimize, and it roughly measures the pass manager infrastructure overhead.
I understand that to allow the reuse of pass managers, each pass must ensure that no state is maintained between run invocations, which is a very subtle issue, not easy to reproduce and debug, and it requires pass authors to be more careful. This was also true with the legacy pass manager, and in the past bugs of this nature have been fixed (e.g. what led to the -run-twice option in opt).
If for the new pass manager there is no expectation to support reusing the same pass manager instance for processing different modules, then I think it would be better to have a note about this in the pass manager documentation.
Are these three patches the only ones needed to support this? The main argument I see in favor of allowing this is that function passes *can* be reused (after all, we call them for each function), so not allowing reuse of module passes is "inconsistent".
Short answer: it depends on what "supporting this" means.
Here's a description of how I discovered these issues after I encountered the problem with ModuleInlinerWrapperPass.
I used a modified version of opt to do the following (with some additional hacks (*)):
- serialize to bitcode (preserving use-list order) the input module before any pass is being run
- allocate a new context
- materialize the module from bitcode in the new context
- run the pass manager (excluding the output-writing pass) on the newly materialized module
- destroy module and context
- continue with the regular flow of opt and run the original module through the same pass manager
(*) To reduce the number of false failures, for the first run of the pass manager on the cloned module I used a custom error handler in the context to drop errors and prevent exit(1) from the default handler, and I swapped the underlying file descriptors used by llvm::outs() and llvm::errs() with new ones pointing to files, so that messages from the first run do not pollute the output stream and cause FileCheck failures. Moreover, after the first run I reset the statistics, the debug counters (I locally added a resetCounters function to DebugCounter), all the analysis managers, and the data structures in DebugInfoBeforePass.
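In code, the cloning and double-run step looks roughly like this (heavily simplified; the function name is made up, and the error-handler and stream-redirection hacks described above are elided):

```
#include "llvm/ADT/SmallVector.h"
#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

void runTwiceOnCopies(Module &M, ModulePassManager &MPM,
                      ModuleAnalysisManager &MAM) {
  // Serialize the original module, preserving use-list order so the clone
  // behaves like the original.
  SmallVector<char, 0> Buffer;
  raw_svector_ostream OS(Buffer);
  WriteBitcodeToFile(M, OS, /*ShouldPreserveUseListOrder=*/true);

  {
    // Materialize the clone in a fresh context and run the pipeline on it.
    LLVMContext FreshCtx;
    Expected<std::unique_ptr<Module>> Clone = parseBitcodeFile(
        MemoryBufferRef(StringRef(Buffer.data(), Buffer.size()), "<clone>"),
        FreshCtx);
    if (Clone)
      MPM.run(**Clone, MAM);
    else
      consumeError(Clone.takeError());
    // The clone and its context are destroyed here.
  }

  // In the actual experiment the analysis managers, statistics, debug
  // counters, etc. are reset at this point (mostly elided here).
  MAM.clear();

  // Second run: the original module with the same pass manager instance.
  // Any state a pass kept from the first run shows up as a failure here.
  MPM.run(M, MAM);
}
```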
With the patches D130952, D130954, and D130955 applied, there are still some failures:
1. Some tests in Analysis/BasicAA and Analysis/CFLAliasAnalysis/Steensgaard -- the AAEvaluator counters persist across invocations of run, and the report is printed when the pass object is destroyed.
2. Transforms/FunctionImport/funcimport_cutoff.ll -- FunctionImport has a global counter (a function-scope static variable ImportCount) that is used for debugging purposes (the cutoff option used in some tests to control how many functions should be imported).
3. Transforms/Inline/cgscc-inline-replay.ll fails in one case because an error is raised when the replay file cannot be found, but the custom diagnostic handler does not stop execution, and a nullptr is dereferenced after that.
4. Transforms/Inline/inline_stats.ll fails because the inlining statistics object is not reset.
5. Transforms/WholeProgramDevirt/devirt-single-impl2.ll fails because when DevirtModule fails to read the summary from the file specified via the command line option -wholeprogramdevirt-read-summary it exits with an error message, but during the first run of the PM llvm::errs() has been redirected, hence the FileCheck failure.
6. Some of the CodeGen/BPF and CodeGen/BPF/CORE tests fail because the BPFAbstractMemberAccess function pass has a static map referencing GlobalVariable objects which is never cleared, so once the module containing such globals is gone there are dangling references that might be accessed in subsequent runs.
7. Other/ChangePrinters/print-changed-diff.ll and Other/change-printer.ll -- the InitialIR state of ChangeReporter cannot be reset, so we miss some expected lines from the second run.
8. Other/print-on-crash.ll -- crash during the first run, where llvm::errs() and llvm::outs() have been redirected.
9. Other/opt-bisect-new-pass-manager.ll -- the bisect limit cannot be reset.
For (6), BPFAbstractMemberAccess is a function pass that can insert global variables into a module (and caches them in some static global map). I believe function passes are allowed to insert global variables, so I wonder whether BPFAbstractMemberAccess should be changed into a module pass, which might allow removing the static global map entirely.
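To make the hazard concrete, here is an illustrative reduction of that pattern (made-up names, not the actual BPF code): a function-scope static cache keeps pointers into a module alive after the module and its context are destroyed.

```
#include "llvm/ADT/StringMap.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"

using namespace llvm;

GlobalVariable *getOrCreateHelperGV(Module &M, StringRef Name) {
  // Problematic: the cache outlives any single module, so its entries
  // dangle once that module (and its context) are destroyed.
  static StringMap<GlobalVariable *> Cache;
  auto It = Cache.find(Name);
  if (It != Cache.end())
    return It->second; // may point into an already-destroyed module
  auto *GV = new GlobalVariable(M, Type::getInt32Ty(M.getContext()),
                                /*isConstant=*/false,
                                GlobalValue::ExternalLinkage,
                                /*Initializer=*/nullptr, Name);
  Cache[Name] = GV;
  return GV;
}
```

Keeping such a cache as a member of a module pass that is cleared at the start of run() (or keyed per module) would avoid both the static state and the dangling pointers.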
I believe the other failures are all related to debugging features.
The patches D130952, D130954, and D130955 address problems that could occur in a non-debug scenario.
I don't know whether it is acceptable to have exceptions for the debug-related features, or whether the issues I described should be addressed somehow.