This is an archive of the discontinued LLVM Phabricator instance.

[AliasSetTracker] Degrade AliasSetTracker results when may-alias sets get too large.
ClosedPublic

Authored by mkuper on Aug 11 2016, 3:44 PM.

Details

Summary

The cause of PR28832 is the quadratic behavior of repeated inserts into AliasSetTracker - inserting a pointer into AST is linear, since it requires walking over all "may" alias sets, and running an alias check vs. each pointer in such a set.
This patch tracks (more or less) the total number of pointers in all "may" sets, and when that number exceeds a threshold, declares the tracker "saturated" - and lumps all pointers into a single "may" set.

For the example in PR28832, without this patch, I have:

 ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
43.4251 ( 47.4%)   0.0440 ( 33.2%)  43.4691 ( 47.4%)  43.4471 ( 47.4%)  Loop Invariant Code Motion
19.6565 ( 21.5%)   0.0040 (  3.0%)  19.6605 ( 21.5%)  19.6512 ( 21.5%)  Loop Invariant Code Motion
17.5580 ( 19.2%)   0.0040 (  3.0%)  17.5620 ( 19.2%)  17.5534 ( 19.2%)  Loop Invariant Code Motion
 3.0498 (  3.3%)   0.0080 (  6.0%)   3.0578 (  3.3%)   3.0563 (  3.3%)  Value Propagation

And with this patch, with a threshold of 250:

---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
5.6446 ( 29.1%)   0.0160 (  5.6%)   5.6606 ( 28.8%)   5.6570 ( 28.8%)  Value Propagation
2.2423 ( 11.6%)   0.0040 (  1.4%)   2.2463 ( 11.4%)   2.2448 ( 11.4%)  Loop Invariant Code Motion
1.6510 (  8.5%)   0.0039 (  1.4%)   1.6549 (  8.4%)   1.6526 (  8.4%)  Loop Invariant Code Motion
1.5398 (  8.0%)   0.0040 (  1.4%)   1.5438 (  7.9%)   1.5429 (  7.9%)  Loop Invariant Code Motion

A threshold of 1000 produces a significant slowdown:

 ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
10.0644 ( 25.7%)   0.0080 (  2.4%)  10.0724 ( 25.5%)  10.0689 ( 25.5%)  Loop Invariant Code Motion
 7.5912 ( 19.4%)   0.0159 (  4.8%)   7.6071 ( 19.3%)   7.6063 ( 19.3%)  Loop Invariant Code Motion
 7.4471 ( 19.0%)   0.0120 (  3.6%)   7.4591 ( 18.9%)   7.4571 ( 18.9%)  Loop Invariant Code Motion
 5.5252 ( 14.1%)   0.0200 (  6.0%)   5.5452 ( 14.1%)   5.5454 ( 14.1%)  Value Propagation

This isn't in a committable state - it needs some cleanup, more testing, and some substantive fixes (per the TODOs).
So no point in reviewing it in detail yet, but I'd like to get a direction check.

Danny, w.r.t to our earlier discussion - I'm not a fan of the idea of having a saturated AST be unsafe to query, so this is required anyway. We can add isSaturated() checks to LICM to speed it up further, as a separate patch.

Diff Detail

Event Timeline

mkuper updated this revision to Diff 67759.Aug 11 2016, 3:44 PM
mkuper retitled this revision from to [AliasSetTracker] Degrade AliasSetTracker results when may-alias sets get too large..
mkuper updated this object.
mkuper added reviewers: davidxl, dberlin.
mkuper added a subscriber: llvm-commits.

There might not be anything we can do in general that is better than this, but could we degrade to using summaries of some kind instead of N^2 queries. For example, imagine that, in the degraded mode, we represented each set using the result of GetUnderlyingObject and the merged AA metadata. Then we queried using <Ptr, Size, AAMD> vs <UO, UnknownSize, MergedAAMD>, but only once per set, not once per pointer in each set? I'm trying to think of a way to keep getting the easy cases, such as noalias function parameters, even if we degrade on the hard things.

mkuper updated this revision to Diff 67933.Aug 12 2016, 4:19 PM

A more committable version of the patch.
(With full degradation)

davidxl added inline comments.Aug 12 2016, 5:58 PM
include/llvm/Analysis/AliasSetTracker.h
130

Singifies --> Signifies

132–133

How about it be called 'AliasAny' ?

418

nit: AliasAnyAS may sound better

436

Add brief documentation.

lib/Analysis/AliasSetTracker.cpp
29

add option description.

85

AS.SetSize --> AS.size() to be consistent

297

can this logic of checking SaturatedAS be folded into 'mergeAliasSetsForPointer such that there is no need to make any change in this method.

566

Use SaturatedAS->MergeSetIn(Cur ...) ?

602

The AST is now ..

mkuper marked 5 inline comments as done.Aug 15 2016, 11:29 AM

Thanks, David!

include/llvm/Analysis/AliasSetTracker.h
132–133

Sounds good.

418

Sure, wasn't happy with the "saturated" name to begin with.
Do you have any name suggestions for the option, too? (after this change, that's the only place "saturation" will appear in, which may be a bit confusing).

lib/Analysis/AliasSetTracker.cpp
29

Ugh, right, thanks!

297

It can, but since this is a major behavior difference between a saturated and a non-saturated AST, I think it would be clearer to leave it here.

mergeAliasSetsForPointer used to be called findAliasSetsForPointer. I changed the name in r266381 - the idea was to more clearly separate between the "merging" behavior, and the "get/find the right set" behavior. See discussion in https://reviews.llvm.org/D18939?id=53163#inline-159392

Do you see a specific reason for this to live in merge()?

566

Right, was going to change this between the two versions, and it slipped my mind. I think I can't use MergeSetIn() as is, but I can factor out the common logic.

mkuper updated this revision to Diff 68066.Aug 15 2016, 1:31 PM
mkuper marked 2 inline comments as done.

Updated per David's comments, except for the getAliasSetForPointer() change.

(Re mergeAllAliasSets - I was wrong, it can use mergeSetIn() as is, it does the right thing.)

davidxl added inline comments.Aug 16 2016, 11:00 AM
lib/Analysis/AliasSetTracker.cpp
297

Ok. Please add a little more comments here: something like 'AST is saturated at this point, .."

301

Add assert that aliasSet of Entry is indeed AliasAnyAS.

537

assert AliasAnyAS is null and may alias set size has reached threshold.

mkuper added inline comments.Aug 16 2016, 11:36 AM
lib/Analysis/AliasSetTracker.cpp
297

Sure.

301

Good idea, thanks!

537

We're already checking the exact same pre-condition before the call (line 570), but I don't mind an assert.

mkuper updated this revision to Diff 68236.Aug 16 2016, 11:37 AM
davidxl accepted this revision.Aug 17 2016, 9:31 AM
davidxl edited edge metadata.

lgtm.

Please also check with Danny and Hal.

This revision is now accepted and ready to land.Aug 17 2016, 9:31 AM
This revision was automatically updated to reflect the committed changes.
reames added a subscriber: reames.Aug 20 2016, 12:24 PM

I am not actively objecting to this patch, but I really don't like the overall direction here. Having a threshold where our ability to optimize falls off a cliff just seems really undesirable. As Hal pointed out, there are likely options for summarizing alias sets to allow quicker AA queries. How much have we explored that design space?

I am not actively objecting to this patch, but I really don't like the overall direction here. Having a threshold where our ability to optimize falls off a cliff just seems really undesirable. As Hal pointed out, there are likely options for summarizing alias sets to allow quicker AA queries. How much have we explored that design space?

My understanding from the discussion is that all uses of the ASTracker are going to be replaced with MemorySSA-based algorithms; that is why I was okay with this (for now). If we still need an AST concept, then we'll want to do something more sophisticated.