Page MenuHomePhabricator

[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.

Authored by asbirlea on Mar 13 2019, 12:33 PM.



Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.

AA changes:

  • This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
  • All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
  • AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.

MemorySSA changes:

  • All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
  • At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
  • All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
  • The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
  • All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.

Diff Detail


Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
chandlerc added inline comments.
1158–1163 ↗(On Diff #190475)

I think using an object to model a batch of queries is going to be significantly cleaner long term. As mentioned in the discussion, in part this is because I think we'll want to have this as an actual analysis that simply is invalidated *much* more eagerly than the "normal" AA analyses (by any change to the IR at all).

To address the move issue -- my suggestion would be to have it *wrap* the AAs in place rather than moving them. I'm not seeing why this won't work? Maybe just need more explanation here.

1164–1169 ↗(On Diff #190475)

This seems a bit weird to me... I would hope that there is a better way to plumb the batched AA wrapper into these APIs, but maybe I don't understand why its difficult?

However, it does raise a specifically interesting point: the walkers should use the batched interface here, but likely *not* use the batched interface after MemorySSA has been built as at that point the IR is likely changing and the batched results are no longer valid.

This seems like an important distinction that should be made more explicit. Maybe the walkers that the rest of the code use should be distinct from those used to build MemorySSA initially?

asbirlea added inline comments.Mar 13 2019, 2:25 PM
1158–1163 ↗(On Diff #190475)

For wrapping, do you have something like this in mind?:

class BatchAAResults : public SomeNewNameAAResultsInterface { //new
     BatchAAResults(AAResults &aar) : AAR(aar), AAQI(false) {}
     alias(...) {
         AAR.alias(..., AAQI);
     // wrap all APIs and pass local AAQI.
     AAResults &AAR;
     AAQueryInfo AAQI;

class AAResults : public SomeNewNameAAResultsInterface {
    .... // keep the same

using AliasAnalysis = SomeNewNameAAResultsInterface;
// currently: using AliasAnalysis = AAResults;


class AAResults {  //existing, updates below
     AAResults(AAResults *aar) : TLI(aar->TLI), AAR(aar), AAQI(false) {}
     alias(...) {
         if (Inner)
            return Inner->alias(..., AAQI);
         // same as before
     // update all APIs to delegate to Inner (if it exists), with local AAQI.

     AAResults *Inner = nullptr; // add to existing AAResults
     AAQueryInfo AAQI;

I played with the second option before sending this out, and it works fine.
The first option probably needs additional changes to replace uses of AAResults with new name assigned to AliasAnalysis.

Cases that may require more extensive code changes, when the batch AA becomes another pass, happen where the same code needs to use an AA pass at some point, then use another later. This is the case for the comment below.
Having the objects/passes have a common SomeNewNameAAResultsInterface (first option) or AAResults (second option) would enable swapping a new object in to build and the old back in after.
Does this make sense?

1164–1169 ↗(On Diff #190475)

That is exactly right. The batched results are no longer valid after MemorySSA is built, and should no longer be used.
There is shared code between build-time and after-build.
This code remains the same and it's stateless, other than storing a reference to AA, DT etc. So either the walker needs updated to the non-batched AA results or re-initialized, as you suggest.
Not a big deal as far as plumbing, just something that needs fixing with the current code.

chandlerc added inline comments.Mar 13 2019, 3:57 PM
1158–1163 ↗(On Diff #190475)

The first, but without the using AliasAnalysis = ... change.

Essentially, it seems like the best result is that the code that is doing batched queries says so explicitly (by using the batched wrapper type to query).

Is the issue that MemorySSA (and/or the walker) don't do all of the queries directly, but instead delegate some (most?) of the queries to generic library code that expects a (non-batched) type today? How often does this happen?

asbirlea added inline comments.Mar 13 2019, 4:22 PM
1158–1163 ↗(On Diff #190475)

MemorySSA is just keeping an AliasAnalysis* AA; and doing AA->alias() or AA->getModRefInfo() calls.

The change would be to take in argument AliasAnalysis* AA; (like it is already doing), but store it internally as an SomeNewNameAAResultsInterface *InternalAA.
Building code becomes:

BatchAAResults  BatchAA(AA);
InternalAA = &BatchAA;
InternalAA = AA;
// BatchAA dies now. Assume the walker gets the AA by querying InternalAA.

As long as SomeNewNameAAResultsInterface has the same APIs, all will be fine. The reason for going to the "top-level type" is that the same code is using different AAs based on when it's being called (at build time or after).

Is this what you had in mind? Or actually changing all callsites to alias/getModRef in MemorySSA to check if you're in build stage or not, and make the call on a different object?

asbirlea updated this revision to Diff 190786.Mar 14 2019, 11:50 PM

Here goes, this is pretty big change based on Chandler's feedback.

  • Created BatchAAResults as a wrapper to AAResults. Keeps the AAResults instance and a QueryInfo instantiated to batch mode.
  • Add all the wrapper APIs needs to wrap/hide the query argument in AAResults (lots more needed than the first iteration).
  • All MemorySSA walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
  • In MemorySSA, at build time, we optimize uses. Create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
  • All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.
  • The walker we use for queries after the build is instantiated on AliasAnalysis and can be built when building MemorySSA or not (commented out currently).
  • In MemorySSA all static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during built method now only takes a BatchAAResults. If used only after build, method takes an AliasAnalysis.

Not done yet:

  • Did not teach AliasSetTracker to use BatchAAResults.
  • Did not remove setAAToBatch and setAAToSingle yet (called in the AST user (LICM)). Their usage is removed for building MemorySSA.

All changes related to AA and MemorySSA should be done. Sending for feedback while I make the remaining changes in LICM/AliasSetTracker. I can also split these out.

asbirlea marked 6 inline comments as done.Mar 14 2019, 11:50 PM
asbirlea updated this revision to Diff 190913.Mar 15 2019, 3:27 PM

Cleaned up un-needed methods.
Added AliasSetTracking changes in patch: for ease in reviewing (can be merged into this).

asbirlea updated this revision to Diff 190916.Mar 15 2019, 3:43 PM

Minor cleanups.

asbirlea edited the summary of this revision. (Show Details)Mar 15 2019, 3:54 PM

Overall approach LGTM for this selective caching behavior.

CFLAA bits LGTM. Just a few nits about MSSA

1165 ↗(On Diff #190916)

nit: can this be simplified to AA(nullptr) above, then setting this->AA = AA; after we're done with buildMemorySSA?

1169 ↗(On Diff #190916)

nit: this comment seems redundant to me. If you feel like it's useful, please go into a bit more detail about how we only intend to use BatchAA while building, etc.

1480 ↗(On Diff #190916)

nit: can these both just be stack-allocated?

1487 ↗(On Diff #190916)

If you want to do this, please do it as a part of this CL (looks like it'll require plumbing AA through). Otherwise, please remove the note

1803 ↗(On Diff #190916)

(I'd be equally happy with getWalker()->verify(this), personally, since I agree that this is a tricky edge case :) )

2337 ↗(On Diff #190916)

nit: this is becoming a lot of ceremony for 1-line methods. I know there are subtleties here about declaration order that require us to define some things out-of-line, but if that isn't the case, can you please inline these methods into their class definitions?

asbirlea updated this revision to Diff 191412.Mar 19 2019, 3:55 PM
asbirlea marked 7 inline comments as done.

Address comments.

asbirlea marked an inline comment as done.Mar 19 2019, 3:55 PM
asbirlea added inline comments.
1487 ↗(On Diff #190916)

Added the walker init in the constructor, after setting AA.

1803 ↗(On Diff #190916)

getWalker() is not const, and I wanted to keep the const qualifier on the verifyMemorySSA(). Initializing the walker when building, so I removed the check.

asbirlea updated this revision to Diff 191413.Mar 19 2019, 4:04 PM

Missed nit.

MSSA bits LGTM (to reiterate, as do the CFLAA bits). Leaving the rest of the patch to the other reviewers


1176 ↗(On Diff #191413)

Please add a comment saying that we're ensuring the walker exists because [...]

Might also be good to refactor getWalker() to remove the creation logic/checks, since creation is now always done here. Probably better done as a later CL, though.

asbirlea updated this revision to Diff 191550.Mar 20 2019, 11:55 AM

Add comment regarding building the walker.
Thank you for the review!

asbirlea marked an inline comment as done.Mar 20 2019, 11:59 AM
chandlerc requested changes to this revision.Mar 20 2019, 6:37 PM

FWIW, I really like the direction and design here. Thanks so much for pushing all of this through, I think it is a really nice generalization of the caching done by BasicAA that lets us do more and more general things of this nature.

There are several places where it seemed like clear should be called and wasn't. I marked all of the ones I saw just to avoid missing it in future iterations. But all of that may be obviated by my comment about the member query info object.

289 ↗(On Diff #191550)

Need comments for the class.

295–299 ↗(On Diff #191550)

nitpick that should probably be a future cleanup and not part of this patch: I'd love to move these to use FooT instead of FooTy. The former is more commonly used for normal type aliases while the latter is more commonly referencing an LLVM Type.

My assumption is that these need to retain the spelling used by basic-aa to avoid unrelated churn, so happy for this to happen separately / later.

305–306 ↗(On Diff #191550)

Do we actually need this boolean? Is this just for the case of AST and this goes away in the future?

770–771 ↗(On Diff #191550)

You're using shrink_and_clear above to clear this between queries. As a consequence, it isn't clear to me why this being a member is still important. Maybe I missed it in the code using this, if so just point it out and sorry for the noise.

But it seems like you might be able to create this on the stack in overload of alias that doesn't accept it as a parameter?

149 ↗(On Diff #191550)

No clear?

434 ↗(On Diff #191550)

no clear?

458 ↗(On Diff #191550)

No clear?

489 ↗(On Diff #191550)

No clear?

504 ↗(On Diff #191550)

No clear?

533 ↗(On Diff #191550)

No clear?

552 ↗(On Diff #191550)

No clear?

571 ↗(On Diff #191550)

No clear?

598 ↗(On Diff #191550)

No clear?

1803 ↗(On Diff #190916)

Would it be cleaner for the verify to build its *own* walker? It isn't in the fast path and that would insulate it from one or the other decision?

(There probably is a good reason to not do this, mostly asking to learn more about MemorySSA and the walker design.)

1173 ↗(On Diff #191550)

I'd add some comments about *why* we want a batch AA result when building typically.

1175 ↗(On Diff #191550)

Add a comment explaining that we intentionally leave AA null above to ensure the build phase uses the batch AA and not accidentally this one?

Basically, a comment for the reader who comes along and thinks "i'll fix this code to just use the member initializer above" about why the unusual pattern is unusual. I like my Chesterton's fence with signs on it. ;]

This revision now requires changes to proceed.Mar 20 2019, 6:37 PM
1803 ↗(On Diff #190916)

The reason is that MemorySSA has to be moveable, and Walkers store a pointer to their MemorySSA instance. Walker::verify is "make sure our internal pointer to MemorySSA is the same thing as our parameter and hasn't been moved, because oh geez if it's pointing somewhere else..."

Constructing a new one just to do that check is sort of redundant.

In general, I don't see how one can use MSSA without the Walker very effectively, so I don't think it's a bad idea to just have the ctor build it Once And For All. We can do a cleaner job with it, but again, that's probably best suited for a follow-up CL, since this one's pretty large as-is.

Since you asked for history, though: :)

It used to be that the walker carried a huge cache with it, so keeping it around was necessary. IIRC, it's now kept around primarily for convenience and microoptimizations: it's a relatively heavy alloc (1KBish?) on its own, since it carries with it a few large Small containers. For functions large enough to make those Small containers go to the heap, it's likely those containers will need that extra space again, so dropping them ASAP seems counterproductive.

The reason is that MemorySSA has to be moveable

After looking through logs, I take this back -- the verification is that MSSA is *never* relocated, and the verification was added in (maybe it was the Updater that we made move-able? I don't remember). In any case, it'd probably be a good idea to explicitly delete MSSA's move ctor to help ensure this move doesn't happen. I'll commit that shortly if it gives me no build errors.

asbirlea edited the summary of this revision. (Show Details)Mar 20 2019, 10:08 PM

Thank you for the comments!
Quick answer before addressing the rest of the comments: the missing clear() calls were intentional, since they're redundant. The calls that don't call clear() are just wrappers over the methods that do call clear(). These wrappers are over a single call so there wouldn't be any benefit from not clearing inside the wrapped method, even if we could.
I'm not sure if it's preferable to add the redundant calls for consistency or safety if things change in the future, or add comments explaining why they're not needed today.

Nicola added a subscriber: Nicola.Mar 21 2019, 2:09 AM
asbirlea updated this revision to Diff 191756.Mar 21 2019, 11:53 AM
asbirlea marked 17 inline comments as done.
asbirlea edited the summary of this revision. (Show Details)

Creating the QueryInfo on the stack and removing the clear method.

Still creating the walker when building MemorySSA. Prefer to clean it up in a follow-up patch (i.e. refactor getWalker() as George suggested).

Just documentation tweaks....

289–290 ↗(On Diff #191756)

I might suggest a slight rewording here to avoid confusion -- the analyses themselves may still be stateful. It's that the *queries* are stateless.


This class stores info we want to provide to or retain within an alias query.
By default, the root query is stateless and starts with a freshly constructed
info object. Specific alias analyses can use this query info to store
per-query state that is important for recursive or nested queries to avoid
recomputing. To enable preserving this state across multiple queries where
safe (due to the IR not changing), use a `BatchAAResults` wrapper.

Then I would move the more detailed discussion of how BatchAAResults works to that class's comments.

772 ↗(On Diff #191756)

See above, but generally need comments here too. Especially important to describe the invariants under which this is safe to use.

asbirlea updated this revision to Diff 191798.Mar 21 2019, 4:38 PM

Rewrite some of the documentation.

asbirlea marked 2 inline comments as done.Mar 21 2019, 4:39 PM
chandlerc accepted this revision.Mar 21 2019, 4:55 PM

LGTM, thanks for all the plumbing here.

774–776 ↗(On Diff #191798)

FWIW, I could imagine adding a clear method in the future that nukes the AAQI to make it easier for passes to reset their batch when needed.

But I like the current design and adding this only once we see it become awkward to create a distinct wrapper for each batch.

This revision is now accepted and ready to land.Mar 21 2019, 4:55 PM
asbirlea marked 2 inline comments as done.Mar 21 2019, 5:01 PM
asbirlea added inline comments.
774–776 ↗(On Diff #191798)

SGTM, thank you for the review!

This revision was automatically updated to reflect the committed changes.
asbirlea marked an inline comment as done.