Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[AA] Hoist the logic to reformulate various AA queries in terms of other parts of the AA interface out of the base class of every single AA result object.

Authored by chandlerc on Feb 17 2016, 2:41 AM.



Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Diff Detail


Event Timeline

chandlerc updated this revision to Diff 48166.Feb 17 2016, 2:41 AM
chandlerc retitled this revision from to [AA] Hoist the logic to reformulate various AA queries in terms of other parts of the AA interface out of the base class of every single AA result object..
chandlerc updated this object.
chandlerc added reviewers: hfinkel, reames.
chandlerc added subscribers: llvm-commits, hans.

Some "peephole" review comments inline:

757 ↗(On Diff #48166)

Is explicit needed here?

35 ↗(On Diff #48166)

Is explicit needed?

30 ↗(On Diff #48166)

Ditto about explicit

Chandler: Later tonight I am going to figure out what I need to do to my out of tree project to make this work. Hopefully everything is smooth.

Gerolf added a subscriber: Gerolf.Feb 17 2016, 4:49 PM
Gerolf added inline comments.
139 ↗(On Diff #48166)

This routine queries AA again but this time for behavior rather than ModRefInfo, but the result (no mod ref info) could be the same. Can this be combined in some meta routine so that no mod ref is spit out early? The AA loops can be very compile-time expensive so we should avoid/shorten them when possible.

149 ↗(On Diff #48166)

Could you refactor into a function?

171 ↗(On Diff #48166)

Why could Mod be set in this case? Wouldn't that be a bug?

194 ↗(On Diff #48166)

See above

244 ↗(On Diff #48166)

Separate function?

chandlerc added inline comments.Feb 17 2016, 6:10 PM
139 ↗(On Diff #48166)

This is interesting, but seems really orthogonal to this change. I don't want to try to fix everything in one go, I'm trying to fix a specific problem with the existing implementation.

149 ↗(On Diff #48166)

This is directly moving code from one place to another here. I don't really want to refactor all of it to look different at the same time. This applies to most of the remaining comments as well -- all of this code existed pretty much exactly as-is before. But if there is a desire to refactor this, I think that should be a separate patch.

171 ↗(On Diff #48166)

If this is a bug, it is a bug in existing code, so you should be able to find a test case and fix it separately from this patch?

Any other thoughts? If we'd like to get this into 3.8, it probably needs to land pretty soon...

Gerolf added inline comments.Feb 19 2016, 4:30 PM
139 ↗(On Diff #48166)

My concern here is compile-time. The AA loops are compile-time intense and this code could make it worse. For a change like this it would be great if you could collect compile-time data before you commit.

149 ↗(On Diff #48166)

My concern is compile-time. The AA loops are compile-time intense and I suspect this code makes it worse. Do your data show that is not the case?

149 ↗(On Diff #48166)

Makes sense. Also digging a bit more into the code this is not as straightforward as I thought it was.

171 ↗(On Diff #48166)

Why does the code check for this condition rather asserting? I'm wondering how the code above could diagnose a Mod for a constant location.

chandlerc added inline comments.Feb 19 2016, 6:16 PM
139 ↗(On Diff #48166)

I'm confused...

As far as I can tell, this change *cannot* make the compile time worse. Previously, this exact code existed in *every* base class for an AAResults, and would have been reached by the loop above for each one of them. That means it would have been run at least once (via BasicAA) and in many cases N times for the number of AAs active. In common Clang compiles that would be at least 4 times. So we're running the same code 1/4th as often with my change.

Unless I've missed something?

149 ↗(On Diff #48166)

See above for why this too cannot (AFAICT) be making the compile time worse.

171 ↗(On Diff #48166)

I understand, but this code is *not new* in this patch. I am moving it from one location to another, I just had to change the name of a variable.

Gerolf added inline comments.Feb 22 2016, 9:00 PM
139 ↗(On Diff #48166)

Hm, now I"m confused, although I can now see your ct argument.

Correct me wrong: The new code hoists the loop over the AA's hidden in AA->getModRef(CS, Loc) out of the loop. This makes the code more efficient. But isn't that function now just returning MRI_ModRef which renders loop above dead?

149 ↗(On Diff #48166)

Sure, but I was not concerned about ct here anyway. No issue. Thanks for your clarifications!

chandlerc added inline comments.Feb 22 2016, 9:21 PM
139 ↗(On Diff #48166)

Er, the above loop calls through the AAResultConcept. It dynamically dispatches to each AA's implementation of getModRef. Some AAs implement this to provide more precise results, but certainly if *no* AA provides those precise results each call will end up at the base class returning MRI_ModRef, and we'll fall through that loop to here. But the whole point is to *allow* an AA result to handle this if desired.

Gerolf added inline comments.Feb 24 2016, 10:07 AM
139 ↗(On Diff #48166)

Yep, now I'm in the boat and all makes sense. And it definitely helps a library compile-time problem I was looking at by cutting the time spent in AA time by about half(!!) in O3. So it all looks great to me . :-)

Still actually looking for an LGTM here...

hfinkel accepted this revision.Mar 2 2016, 7:50 AM
hfinkel edited edge metadata.


This revision is now accepted and ready to land.Mar 2 2016, 7:50 AM
This revision was automatically updated to reflect the committed changes.