This is an archive of the discontinued LLVM Phabricator instance.

[MemorySSA] Use BatchAA for clobber walker
ClosedPublic

Authored by nikic on Oct 18 2022, 6:07 AM.

Details

Summary

While MemorySSA use optimization was already using BatchAA, the publicly exposed MSSA walkers were still using plain AAResults. This is not great, because clobber walking is expected to make many repeated AA queries, which BatchAA can cache.

This patch makes the clobber API accept a BatchAAResults instance. The plain APIs are kept as wrappers and will create a BatchAAResults instance for the duration of the query. In the future, the explicit BatchAAResults arguments will be used to share AA results across queries, not just within one query.
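
To illustrate the shape of the change described above, here is a minimal, self-contained sketch of the wrapper pattern; the types and signatures are simplified stand-ins for illustration, not the actual LLVM declarations.

```cpp
#include <cstdio>

// Minimal stand-ins for the LLVM types involved (AAResults, BatchAAResults,
// MemoryAccess); these are not the actual LLVM declarations.
struct AAResults {};

struct BatchAAResults {
  explicit BatchAAResults(AAResults &AA) : AA(AA) {}
  AAResults &AA;
  // The real BatchAAResults caches alias() results made through one instance.
};

struct MemoryAccess {};

struct Walker {
  AAResults &AA;

  // BatchAA-taking API: the caller supplies the BatchAAResults, so repeated
  // alias queries during the clobber walk share one cache, and the caller may
  // reuse the same instance for several walks.
  MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA, BatchAAResults &BAA) {
    (void)BAA; // ... walk upwards, issuing alias queries through BAA ...
    return MA;
  }

  // Plain API kept as a wrapper: it creates a BatchAAResults that lives only
  // for the duration of this one query.
  MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA) {
    BatchAAResults BAA(AA);
    return getClobberingMemoryAccess(MA, BAA);
  }
};

int main() {
  AAResults AA;
  Walker W{AA};
  MemoryAccess M;
  W.getClobberingMemoryAccess(&M);       // per-query BatchAA, created internally
  BatchAAResults BAA(AA);
  W.getClobberingMemoryAccess(&M, BAA);  // caller-owned BatchAA, reusable
  std::puts("done");
}
```

In this shape, the plain overload keeps existing callers working unchanged, while the BatchAA-taking overload is the hook that later patches can use to share one cache across multiple queries.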

Diff Detail

Event Timeline

nikic created this revision. Oct 18 2022, 6:07 AM
Herald added a project: Restricted Project. · View Herald Transcript · Oct 18 2022, 6:07 AM
nikic requested review of this revision. Oct 18 2022, 6:07 AM

E.g. for NewGVN, I think we should be able to use a single BatchAA instance for all queries.

@kmitropoulou it sounds like this could help with NewGVN compile time.
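
As a hedged illustration of that suggestion (not actual NewGVN code), a pass that issues many clobber queries could build one BatchAAResults up front and hand it to every walk, so alias results are cached across all of them. The stand-in types and the analyzeAll helper below are hypothetical.

```cpp
#include <vector>

// Simplified stand-in types, not the actual LLVM classes.
struct AAResults {};
struct BatchAAResults { explicit BatchAAResults(AAResults &) {} };
struct MemoryAccess {};
struct Walker {
  MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA, BatchAAResults &) {
    return MA; // stand-in for the real clobber walk
  }
};

// Hypothetical NewGVN-style analysis loop. Assumed precondition: the analysis
// phase does not modify the IR, so cached alias results remain valid.
void analyzeAll(Walker &W, AAResults &AA,
                const std::vector<MemoryAccess *> &Accesses) {
  BatchAAResults BAA(AA); // one cache shared by every query below
  for (MemoryAccess *MA : Accesses)
    (void)W.getClobberingMemoryAccess(MA, BAA);
}

int main() { return 0; }
```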

nikic added a comment. Oct 25 2022, 7:44 AM

Ping :)

Without looking too deeply into it, it's likely that the changes proposed by @kmitropoulou would actually prevent sharing BatchAA across all queries, because they introduce IR changes during the analysis phase, while NewGVN is generally designed to separate analysis from transformation.

Load coercion does something similar to the phi-of-ops transformation. Both transformations find optimization opportunities during analysis, and both emit new code after the analysis; the only difference is the point at which the new code is emitted. In the case of phi-of-ops, the new phi nodes are emitted in eliminateInstructions(). In the case of load coercion, the new code is emitted just before eliminateInstructions(), for the reasons mentioned here: https://reviews.llvm.org/D124071#inline-1188926

kmitropoulou resigned from this revision. Oct 25 2022, 2:57 PM

It seems unnecessary to force all uses to instantiate and pass the BatchAA.
Can this remain part of the ClobberWalkerBase instead? By default, calls to getClobberingMemoryAccess would instantiate a BatchAA internally and pass it to the ClobberWalker.
Then an additional API can be added to override that behavior when it is useful (and known to be safe): either passing in a given BatchAA as in the current patch, or keeping a BatchAA in the ClobberWalker itself instead of the AA, with a way to "set" it so it is not re-initialized on each call (and perhaps "reset" to go back to resetting it each time).
What do you think?
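
A minimal sketch of the "set"/"reset" alternative described in the comment above, assuming simplified stand-in types rather than the actual MemorySSA classes; the setBatchAA/resetBatchAA names are hypothetical.

```cpp
#include <optional>

// Simplified stand-in types, not the actual LLVM classes.
struct AAResults {};
struct BatchAAResults { explicit BatchAAResults(AAResults &) {} };
struct MemoryAccess {};

class ClobberWalkerBase {
  AAResults &AA;
  // When present, reused across calls; when empty, each call gets a fresh
  // BatchAA that lives only for that query.
  std::optional<BatchAAResults> SharedBAA;

  MemoryAccess *walk(MemoryAccess *MA, BatchAAResults &) { return MA; }

public:
  explicit ClobberWalkerBase(AAResults &AA) : AA(AA) {}

  void setBatchAA() { SharedBAA.emplace(AA); } // caller promises no IR changes
  void resetBatchAA() { SharedBAA.reset(); }   // back to per-call batching

  MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA) {
    if (SharedBAA)
      return walk(MA, *SharedBAA);
    BatchAAResults LocalBAA(AA);
    return walk(MA, LocalBAA);
  }
};

int main() {
  AAResults AA;
  ClobberWalkerBase W(AA);
  MemoryAccess M;
  W.getClobberingMemoryAccess(&M); // fresh BatchAA per call by default
  W.setBatchAA();                  // opt in to sharing when known to be safe
  W.getClobberingMemoryAccess(&M);
  W.getClobberingMemoryAccess(&M); // these two calls share one cache
  W.resetBatchAA();
}
```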

nikic added a comment. Nov 2 2022, 1:25 AM

We currently have 17 calls to getClobberingMemoryAccess() and if my analysis is right, I believe 15 of those can reuse the BatchAA instance in some way (though this patch mostly doesn't do it). Given this, I think it makes more sense to always require BatchAA -- it makes reuse opportunities more obvious, and is not a particularly large burden if reuse is not possible. I'd agree if it were the other way around and BatchAA reuse was not possible in most cases.

nikic added a comment. Nov 14 2022, 1:56 AM

@asbirlea Any thoughts on my last comment?

Given that usage, it makes sense to use BAA at the call sites!

Regarding the internal walker APIs, please see the inline comments below (and the sketch that follows them). I don't think we need to pass the BAA instance everywhere; that is the purpose of the class fields.

llvm/lib/Analysis/MemorySSA.cpp

Line 520: Keep a pointer to a BatchAA instance - set when the Query is created. Prevents having to pass it through all the APIs.

Line 560: Use the class BAA.

Line 627: Use the class BAA.

Line 770: Use the class BAA.

Line 938: Set the class BAA here, the same as the Query.
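
The sketch referenced above: a minimal illustration of the class-field approach from these inline comments, where the query record carries a BatchAA pointer so inner walker methods don't need it threaded through as a parameter. The types are simplified stand-ins loosely named after those in MemorySSA.cpp, not the actual code.

```cpp
// Simplified stand-in types, not the actual LLVM classes.
struct AAResults {};
struct BatchAAResults { explicit BatchAAResults(AAResults &) {} };
struct MemoryAccess {};

struct UpwardsMemoryQuery {       // simplified stand-in for the query record
  MemoryAccess *Start = nullptr;
  BatchAAResults *BAA = nullptr;  // set when the Query is created
};

class ClobberWalker {
  UpwardsMemoryQuery *Query = nullptr;

  MemoryAccess *walkToPhiOrClobber(MemoryAccess *MA) {
    // Inner steps read Query->BAA instead of taking it as a parameter.
    BatchAAResults &BAA = *Query->BAA;
    (void)BAA; // ... alias queries against BAA while walking ...
    return MA;
  }

public:
  MemoryAccess *findClobber(MemoryAccess *Start, UpwardsMemoryQuery &Q,
                            BatchAAResults &BAA) {
    Q.BAA = &BAA;   // set the Query's BAA once, at the start of the walk
    Query = &Q;
    return walkToPhiOrClobber(Start);
  }
};

int main() {
  AAResults AA;
  BatchAAResults BAA(AA);
  MemoryAccess M;
  UpwardsMemoryQuery Q{&M};
  ClobberWalker W;
  W.findClobber(&M, Q, BAA);
}
```

This is the direction the updated diff below takes: the BatchAA is set once per query, and the plain wrapper APIs create one internally when the caller doesn't supply it.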

nikic updated this revision to Diff 480099. Dec 5 2022, 7:16 AM
nikic marked 5 inline comments as done.
nikic edited the summary of this revision.

Use a member in ClobberWalker, and add wrapper APIs that don't require BatchAAResults (they just create one internally). Something I previously missed is that the unit tests also need updating, and creating the BatchAA instances there gets pretty annoying for no benefit.

This patch now only ensures that individual queries use BatchAA, but doesn't yet share BatchAA across multiple queries anywhere. I'll land that part separately.

asbirlea accepted this revision. Dec 5 2022, 12:36 PM

lgtm.

This revision is now accepted and ready to land. Dec 5 2022, 12:36 PM
This revision was automatically updated to reflect the committed changes.