This is an archive of the discontinued LLVM Phabricator instance.

[LegacyPassManager] Reduce memory usage for AnalysisUsage
ClosedPublic

Authored by reames on Nov 13 2015, 7:28 PM.

Details

Summary

The LegacyPassManager was storing an instance of AnalysisUsage for each instance of each pass. In practice, most instances of a single pass class share the same dependencies. We can't rely on this because passes can (and some do) have dynamic dependencies based on instance options.

We can exploit the likely commonality by uniqueing the usage information after querying the pass, but before storing it into the pass manager. This greatly reduces memory consumption by the AnalysisUsage objects. For a long pass pipeline, I measured a decrease in memory consumption for this storage of about 50%. I have not measured on the default O3 pipeline, but I suspect it will see some benefit as well since many passes are repeated (e.g. InstCombine).

I believe this enhancement does not apply to the new pass manager; please correct me if I'm wrong.

Once this is in, I also plan to re-examine the use of SmallVectors in AnalysisUsage and their default size of 32 elements. Given most passes have only a few direct dependencies, this seems likely to be rather wasteful of memory.

Diff Detail

Repository
rL LLVM

Event Timeline

reames updated this revision to Diff 40200.Nov 13 2015, 7:28 PM
reames retitled this revision from to [LegacyPassManager] Reduce memory usage for AnalysisUsage.
reames updated this object.
reames added a reviewer: chandlerc.
reames added a subscriber: llvm-commits.
chandlerc edited edge metadata.Nov 13 2015, 8:49 PM

This is a really nice improvement. Minor adjustment below is my only real suggestion.

lib/IR/LegacyPassManager.cpp
582–587 ↗(On Diff #40200)

Since we build this completely each time, the folding set machinery isn't really hepling much. Maybe just provide a hash_value overload and use a SmallSet? Should be able to either delegate to the vector's hash_value or use hash_combine_range.

(This isn't because they're that much simpler, I'd just rather move everything away from folding sets where reasonable.)

594 ↗(On Diff #40200)

As sympathetic as I am to using multiple returns like this in an immediately called lamda, I find it quite confusing. Especially with the auto type on the variable.

Fortunately, I think with a set, the code to do this without a lambda shouldn't be onerous at all.

reames added inline comments.Nov 16 2015, 2:12 PM
lib/IR/LegacyPassManager.cpp
582–587 ↗(On Diff #40200)

I actually started with trying to use DenseSet, but the choice of empty and tombstone keys was non-obvious. I'm not familiar enough with the AU data structure to identify values which could never occur.

SmallSet is probably a bad choice. I have no evidence the set of AU values is small, only that its elements are heavily reused. Picking a good small size would be tricky. In particular, the small size which works for O2 is probably not the one which works for my out of tree pipeline and vice versa.

I would prefer to use the folding set here. I believe it is the appropriate data structure choice and the usage pattern closely maps usages elsewhere (e.g. SCEV).

594 ↗(On Diff #40200)

I slightly disagree, but minor. Will fix in an update version shortly.

reames updated this revision to Diff 40342.Nov 16 2015, 2:19 PM
reames edited edge metadata.

Remove lambda per Chandler's comment.

reames added a comment.Dec 2 2015, 4:00 PM

ping - Chandler?

chandlerc accepted this revision.Dec 2 2015, 8:56 PM
chandlerc edited edge metadata.

Mostly minor changes. I'm not thrilled with the folding set stuff still, but i don't want to spend more time tweaking the legacy pass manager code honestly. I'm not even thrilled by this complexity going in, but I understand why it's so significant for your use case.

include/llvm/IR/LegacyPassManagers.h
257–258 ↗(On Diff #40342)

nit: Just make this a struct to avoid needing to specify "public" everywhere.

259–260 ↗(On Diff #40342)

This would make a bit more sense below, where we process the arrays?

273–274 ↗(On Diff #40342)

I would prefer to not add to the Profile overload set since that gets called at a distance in the folding set code. Instead, I'd rather make it clearly an implementation detail. Unless I missed something and it's actually used elsewhere?

ProfileImpl would be fine. Maybe even a local lambda?

Actually, what about writing the above as:

for (auto &Vec : {AU.getRequiredSet(), AU.getRequiredTransitiveSet(), ...}) {
  ID.AddInteger(Vec.size());
  for (AnalysisID AID : Vec)
    ID.AddPointer(AID);
}

I think we're finally at a point where MSVC's initializer list support is up to this style of usage.

lib/IR/LegacyPassManager.cpp
595 ↗(On Diff #40342)

Add a message here? Maybe "found a null pointer in cached analysis usages" or somesuch?

This revision is now accepted and ready to land.Dec 2 2015, 8:56 PM
This revision was automatically updated to reflect the committed changes.