This is an archive of the discontinued LLVM Phabricator instance.

[MemorySSA] Add invariant.group handling
AbandonedPublic

Authored by Prazek on Jan 23 2017, 6:03 PM.

Details

Summary

Now loads with invariant group will have the same
MemoryUse as dominating store/load with the same
invariant.group and pointer operand.

The algorithm is O(#invariant.group loads/stores in function),
which is much better compared to O(n^2) in MemDep.

NewGVN now handles some devirtualization cases, but
unfortunatelly not all of them.

Event Timeline

Prazek created this revision.Jan 23 2017, 6:03 PM
Prazek updated this revision to Diff 85509.Jan 23 2017, 6:07 PM

Added back removed empty lines

Thanks for this!

Haven't looked over the patch in great detail, but I do have a few high-level questions:

  • Is it possible for an invariant group barrier to ever be removed from/added to a function? If so, then we're either going to have to treat it as Def, or we're going to have to add a byTheWayThisInvariantBarrierIsNowDead and/or a weHaveANewInvariantBarrier function to MemorySSA.
  • Is it possible to add/remove/move an operation with invariant group metadata on it? If so, we'll probably need to change the update API to handle those cases sanely (otherwise, looks like PerBlockInvariantGroupAccesses can end up with dangling pointers?)
lib/Transforms/Utils/MemorySSA.cpp
1337

VersionStack.push_back(MSSA->getLiveOnEntryDef()); below takes care of that; liveOnEntryDef exists in the entry block, so it always dominates everything. :)

If you'd like to add commentary/asserts, feel free (though preferably as a separate commit).

dberlin edited edge metadata.Jan 23 2017, 7:55 PM

A few things

  1. I'm curious why this isn't just added to the use optimizer more directly, it looks like you are computing info about most-dominating accesses, etc, which the stack already has.
  2. If this isn't done in the walker, it'll never work in the face of updates.

(IE the use optimizer is a faster version of the walker, it's not better.
You are making something that is better, but if you don't add that to the walker, newly built memoryssa will support it but updated won't)

Thanks for this!

Haven't looked over the patch in great detail, but I do have a few high-level questions:

  • Is it possible for an invariant group barrier to ever be removed from/added to a function? If so, then we're either going to have to treat it as Def, or we're going to have to add a byTheWayThisInvariantBarrierIsNowDead and/or a weHaveANewInvariantBarrier function to MemorySSA.

Yes, it is sometimes possible to remove barrier. Case like

%unused = i8* @llvm.invariant.group.barrier(%x)

So when the result is unused. I think this could be achieved by giving some attributes to barrier like readonly, so that optimizer will
know that this can be removed if return value is unused. I haven't looked into that in the detail, but I am sure there is, or could
exist attribute that does this.

Other case is when barrier has only
one use, which is barrier, then we could remove it

%b = @llvm.invariant.group.barrier(%p)
%b2 = @llvm.invariant.group.barrier(%b)

In this case %b could be removed.

Hovewer I am thinking about some other method. I would like to have all the optimizations
working the same as if I would remove all the barriers and invariant groups, but without removing them.
One of the case is

store 42, %p
%a = @llvm.invariant.group.barrier(%p)
load %a

In this case the load will have the correct MemoryUse, but the optimizer won't be able to figure out the value of load
or the case like

store 42, %x
%a = @llvm.invariant.group.barrier(%x)
store 41, %a

In this case first store is dead.
So I don't yet know how to handle these cases, but I will at DSE and newGVN to figure it out.

  • Is it possible to add/remove/move an operation with invariant group metadata on it? If so, we'll probably need to change the update API to handle those cases sanely (otherwise, looks like PerBlockInvariantGroupAccesses can end up with dangling pointers?)

Yes, PerBlockInvariantGroupAccess will have dangling pointers, but I was planning to only use it in OptimizeUses, which is used only ones where we still have all the instructions. Hovewer this might change because I have to take a look at the walker as Daniel said.

A few things

  1. I'm curious why this isn't just added to the use optimizer more directly, it looks like you are computing info about most-dominating accesses, etc, which the stack already has.

do you mean optimizeUsesInBlock? I think this functions is already big enough with a lot of logic, so by having it in the different functions it is much simpler (and also the information about the stack is narrowed only to the things that I need)

  1. If this isn't done in the walker, it'll never work in the face of updates.

(IE the use optimizer is a faster version of the walker, it's not better.
You are making something that is better, but if you don't add that to the walker, newly built memoryssa will support it but updated won't)

I haven't noticed that. I will fix the walker soon

Prazek added inline comments.Jan 24 2017, 1:37 AM
lib/Transforms/Utils/MemorySSA.cpp
1337

That's what I thought, which means that !VersionStack.empty() is always true.
I will rewrite this loop later to make it more clear.

davide requested changes to this revision.Jan 28 2017, 12:45 AM

This probably needs rebasing after the recent Danny's change. First round of comments inline.

include/llvm/Transforms/Utils/MemorySSA.h
632–636

comment this function? (as the one above)

lib/Transforms/Utils/MemorySSA.cpp
1507

s/pesimize/pessimize.

1513–1516

Ternary op here maybe?

1536

if it's not used, can you remove that separately?

1578–1581

instead of a static function here you can probably make this a lambda inside buildMemorySSA() ?

This revision now requires changes to proceed.Jan 28 2017, 12:45 AM

(and thanks for taking care of this).

(and thanks for taking care of this).

Sure, I will finish this after exams. I need to think more about the walkers and updater

Prazek abandoned this revision.Sep 15 2021, 12:37 PM
Prazek marked 4 inline comments as done.

I implemented this properly with Arthur in https://reviews.llvm.org/D109134