This is an archive of the discontinued LLVM Phabricator instance.

[MemorySSA] Overhaul CachingMemorySSAWalker's cache to allow more selective cache invalidation
AbandonedPublic

Authored by george.burgess.iv on Apr 25 2016, 3:12 PM.

Details

Summary

This patch introduces a new cache for MemorySSA that allows us to invalidate cache entries directly related to a given MemoryAccess in constant time.

We do this by making cache a directed graph. Each node is a MemoryAccess, and each edge is a cache entry. In the case of calls, each node can only have one edge to another node. In the case of Values, nodes may have multiple edges, each tagged by a MemoryLocation.

The only case in which we need to invalidate anything (at the moment) is when a MemoryAccess has been removed from the graph. We handle this by flipping an invalid bit in the corresponding node, and keeping the node alive. Edges to invalid nodes are removed as they are discovered during cache lookups. This means we don't basically have to roll our own Use-lists for this cache.

The moving of CachingMemorySSAWalker was just because I didn't feel that this cache needed to be in MemorySSA.h. If we feel that the caching walker's definition should remain in MemorySSA.h, I'm happy to refactor things so that's possible.

Running this across clang+LLVM with MemorySSA's verifier enabled yields 0 assertion failures.

I had to disable a part of a test for this to work. This is because, given the following code:

define void foo(i32* %i) {
entry:
  br i1 undef, label %if.then, label %if.end
if.then:
  ; MemoryDef
  store i32 0, i32* %i
  br label %if.end
if.end:
  ; MemoryPhi
  ; MemoryUse(Phi)
  load i32 %i
  ret void
}

...When the MemoryDef is removed, the test asserts that we should be able to invalidate the MemoryUse's cache entry (which points to the Phi) because the Def was removed. I think this is sane, but in order for it to work well, it currently requires basically invalidating the cache entries of all MemoryAccess reachable from the clobber we're deleting, unless we implement either:

  • Automatic Phi deletion when all of a Phi's upward defs point to the same place (wouldn't be a full fix, but would probably be a simple one), or
  • More information sharing between the walker and cache, so we know what each cache entry is *actually* dependent on.

I think this is necessary because, given the following CFG:

     A
   /   \
  B     |
 / \    |
C   D   E
 \ /    |
  E    /
    \ /
     F

...If we're trying to optimize a use in F that would be liveOnEntry if not for a clobber in C, there's really no way I can see to signal that the removal of the clobber in C should invalidate the use in F without basically invalidating every MemoryAccess that is reachable from C. Presumably, most of the things invalidated by that wouldn't care about that particular clobber, so we'd likely end up doing a ton of recomputation for not much (if any) gain.

We may be able to solve this problem by making the walker explicitly note that the Def in C caused optimization of the use in F to fail, but that's probably a much more involved change to the walker. Given that "overhaul the walker" is item #3 on my MemorySSA-related to-do list, I think that can be done later if we believe it would be valuable.

Diff Detail

Event Timeline

george.burgess.iv retitled this revision from to [MemorySSA] Overhaul CachingMemorySSAWalker's cache to allow more selective cache invalidation.
george.burgess.iv updated this object.
george.burgess.iv added a subscriber: llvm-commits.

A few minor cleanups

...And restore some code that I accidentally removed. Sorry for the spam :)

reames requested changes to this revision.May 5 2016, 6:07 PM
reames edited edge metadata.

Just to make sure I'm understand the context correctly, we're removing a memorydef which has been determined to be dead (i.e. dse) and as a result, leaving the memory phi in place pointing at the definition before the one we removed is correct but not necessarily optimally precise? That seems reasonable. Possibly part of the API should be the precision required?

Also, the presence of both code motion and semantic changes makes this essentially impossible to review quickly. Please make the code motion changes separately, land them, and then update the patch. If we disagree about where the code lives, we can move it again later. Your suggestion seems reasonable enough and I doubt we'll care too much. :)

This revision now requires changes to proceed.May 5 2016, 6:07 PM
dberlin edited edge metadata.May 6 2016, 1:45 PM

Just to make sure I'm understand the context correctly, we're removing a memorydef which has been determined to be dead (i.e. dse) and as a result, leaving the memory phi in place pointing at the definition before the one we removed is correct but not necessarily optimally precise? That seems reasonable. Possibly part of the API should be the precision required?

This is actually a violation of the documentation we have, which states that memoryuses will point to the thing that *actually* clobbers them.

If we can't make this work in the new caching scheme, we should fix the new caching scheme :)

However, i don't think this is that hardto make work.

Figuring out the set of invalidations is easy, even if the number may be large.

These are all Value's now, so they have Users and Uses.

For a memoryuse invalidation, you do nothing.
for a memorydef invalidation, invalidate the cache for all uses, recursively (IE recurse on memorydefs).

That is the set of things it may have changed.

This seems like it would be bad, but it won't be that bad, since most people don't ask about stores (memorydefs), and loads already point to their clobbering definitions (and thus, fwiw, caching them outside of the memoryssa builder is pointless)

Also, the presence of both code motion and semantic changes makes this essentially impossible to review quickly. Please make the code motion changes separately, land them, and then update the patch. If we disagree about where the code lives, we can move it again later. Your suggestion seems reasonable enough and I doubt we'll care too much. :)

+1 to this.

george.burgess.iv abandoned this revision.May 9 2016, 5:29 PM

Also, the presence of both code motion and semantic changes makes this essentially impossible to review quickly. Please make the code motion changes separately, land them, and then update the patch. If we disagree about where the code lives, we can move it again later. Your suggestion seems reasonable enough and I doubt we'll care too much

Will do (in the future).

If we can't make this work in the new caching scheme, we should fix the new caching scheme

That sounds like it's going to be a lot of invalidation, which this cache probably won't like very much. I'll see if I can do something that's more invalidation-friendly, then. :)

Thanks for the feedback!