This is an archive of the discontinued LLVM Phabricator instance.

Handle invariant.group.barrier in BasicAA
ClosedPublic

Authored by Prazek on Apr 2 2017, 12:57 PM.

Details

Summary

llvm.invariant.group.barrier returns a pointer that must-aliases the
pointer it takes. It can't be marked with the returned attribute,
because it would then be removed too easily. The other reason is that
only Alias Analysis should know about this: if any other
pass knew it, the result would be replaced with its
argument, which would be invalid.

We can think of the returned pointer as something that must-aliases the
argument, but it doesn't have to be bitwise identical to it.
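
For concreteness, a minimal sketch (assuming the 2017-era typed-pointer syntax and the i8* intrinsic signature) of the relationship this patch teaches BasicAA:

declare i8* @llvm.invariant.group.barrier(i8*)

define void @example(i8* %p) {
  %q = call i8* @llvm.invariant.group.barrier(i8* %p)
  ; With this patch BasicAA answers MustAlias for %p and %q: they point at the
  ; same memory, even though %q must not simply be replaced by %p in the IR.
  ret void
}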

Diff Detail

Repository
rL LLVM

Event Timeline

Prazek created this revision.Apr 2 2017, 12:57 PM
Prazek updated this revision to Diff 93809.Apr 2 2017, 12:59 PM

Removed empty lines

Prazek updated this revision to Diff 93810.Apr 2 2017, 1:00 PM

Add Krzysztof to subscribers

Prazek added a subscriber: amharc.Apr 2 2017, 1:00 PM
dberlin edited edge metadata.EditedApr 2 2017, 1:08 PM

Note: Nothing currently removes things with the returned attribute, so it wouldn't be removed because of that.

" because if any other pass would know it, then the result would be replaced with it's
argument, which would be invalid."

Invalid or just lose optimization/have to drop the attribute?
If invalid, this sounds, honestly, like a serious design flaw.
SSA values are meant to be propagated, and don't die once created.
(Memory does, but ssa doesn't).

I should always be able to propagate an ssa value past anything i like.
If you've created an intrinsic where ssa values die when passed to it, that seems very broken.

sanjoy edited edge metadata.Apr 2 2017, 1:21 PM

Hi,

I'm not sure if this is correct. What if we transform:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = load ptr_1, !invariant.group

to (since ptr_0 must-alias ptr_1 and, by the rules of invariant.group, the loads through them are not clobbered by the store to ptr_0)

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = old_val

In other words, given what you're teaching BasicAA, even though barrier(x) cannot be replaced with x in general, since x must-alias barrier(x) we can do the replacement for load and store operations, which can break the data dependency you're trying to preserve.

This seems pretty broken.

It seems if i add an assume that the beforeptr == afterptr
it will propagate it
if i put it in a loop with an if (beforeptr == afterptr) condition, it'll propagate it, etc.

You can't prevent the compiler from propagating your beforeptr to afterptr, and it's only ever going to become smarter about proving they are the same.
Thus, things need to lose optimization, not break, when it does that.

Prazek added a comment.Apr 2 2017, 1:50 PM

Hi,

I'm not sure if this is correct. What if we transform:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = load ptr_1, !invariant.group

to (since ptr_0 must-alias ptr_1 and, by the rules of invariant.group, the loads through them are not clobbered by the store to ptr_0)

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = old_val

In other words, given what you're teaching BasicAA, even though barrier(x) cannot be replaced with x in general, since x must-alias barrier(x) we can do the replacement for load and store operations, which can break the data dependency you're trying to preserve.

It can't, because the optimizations based on invariant.group will not strip barriers; they only work on SSA values.

The outcome of all the optimizations will be:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = vptr

Which is totally valid, because barrier doesn't change any memory.

Note: Nothing currently removes things with the returned attribute, so it wouldn't be removed because of that.

" because if any other pass would know it, then the result would be replaced with it's
argument, which would be invalid."

Invalid or just lose optimization/have to drop the attribute?
If invalid, this sounds, honestly, like a serious design flaw.
SSA values are meant to be propagated, and don't die once created.
(Memory does, but ssa doesn't).

I should always be able to propagate an ssa value past anything i like.
If you've created an intrinsic where ssa values die when passed to it, that seems very broken.

This patch is meant to solve the "barriers stopping optimizations" problem.
llvm.invariant.group.barrier is a readonly argmemonly intrinsic, which means that if I marked it with the returned attribute, then it would be
removed as quickly as llvm.ssa.copy.

Check out this: https://reviews.llvm.org/D31531
There is also another way you can think about barriers (Chandler's idea):

Let's say that the lower 32 bits of a pointer are the physical address, and the high 32 bits encode the dynamic type.
The barrier (or std::launder, if that helps) can read the pointer and, based on the dynamic type of the object (read from the vptr),
return a pointer with the same lower 32 bits, but whose high 32 bits can be different from the pointer it took.

Another problem that needs to be mentioned is skipping the barrier based on comparisons:

Base *a = new Base();
a->foo();
Base *b = new(a) Derived();  
assert(a == b);
b->foo();

So after the assert, LLVM could replace the SSA value of b with a,
which would skip the barrier and cause wrong devirtualization.
That is another reason why we need to add barriers before comparisons, like:
barrier(a) == barrier(b)

but it would not be enough if we knew that barrier(a) produces the same value
as barrier(b).
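
A rough sketch (hypothetical IR, not necessarily the exact form the frontend would emit) of what "barriers before the comparison" could look like:

%ba = call i8* @llvm.invariant.group.barrier(i8* %a)
%bb = call i8* @llvm.invariant.group.barrier(i8* %b)
%eq = icmp eq i8* %ba, %bb
; Only the "after barrier" values are compared, so proving %ba == %bb does not
; by itself let a pass substitute the "before barrier" values for each other.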

I should always be able to propagate an ssa value past anything i like.
If you've created an intrinsic where ssa values die when passed to it, that seems very broken.

This patch is meant to solve the "barriers stopping optimizations" problem.

I think this needs to be solved another way.

llvm.invariant.group.barrier is a readonly argmemonly intrinsic, which means that if I marked it with the returned attribute, then it would be
removed as quickly as llvm.ssa.copy.

Only one pass removes llvm.ssa.copy :)

Check out this: https://reviews.llvm.org/D31531
There is also another argument on how you can think about barriers (Chandler's idea):

Let's say that the lower 32 bits of a pointer are the physical address, and the high 32 bits encode the dynamic type.
The barrier (or std::launder, if that helps) can read the pointer and, based on the dynamic type of the object (read from the vptr),
return a pointer with the same lower 32 bits, but whose high 32 bits can be different from the pointer it took.

but again, if i add comparisons, assumes, or asserts that prove otherwise, it'll propagate.
You simply aren't going to be able to stop this propagation, nor should you be able to.
It violates a fundamental property of ssa - values never die :)

My problem with your barrier is simply that *when propagation happens*, it generates wrong code, instead of "losing optimization"

Compare to the noalias intrinsic:
If i propagate the argument of llvm.noalias to its result, i lose optimization, but i don't break the program.

Another problem that needs to be mentioned is skipping the barrier based on comparisons:

Base *a = new Base();
a->foo();
Base *b = new(a) Derived();  
assert(a == b);
b->foo();

So after the assert, LLVM could replace the SSA value of b with a,

Yes, and it will.

which would skip the barrier and cause wrong devirtualization.
That is another reason why we need to add barriers before comparisons, like:
barrier(a) == barrier(b)

but it would not be enough if we knew that barrier(a) produces the same value
as barrier(b).

I don't think this is going to work out.
Honestly.
It is always possible to find ways to break it based on value equivalence.

You can't stop all forms of proving value equivalence.

Prazek added a comment.Apr 2 2017, 2:15 PM

I should always be able to propagate an ssa value past anything i like.
If you've created an intrinsic where ssa values die when passed to it, that seems very broken.

This patch is meant to solve the "barriers stopping optimizations" problem.

I think this needs to be solved another way.

llvm.invariant.group.barrier is a readonly argmemonly intrinsic, which means that if I marked it with the returned attribute, then it would be
removed as quickly as llvm.ssa.copy.

Only one pass removes llvm.ssa.copy :)

Check out this: https://reviews.llvm.org/D31531
There is also another argument on how you can think about barriers (Chandler's idea):

Let's say that the lower 32 bits of a pointer are the physical address, and the high 32 bits encode the dynamic type.
The barrier (or std::launder, if that helps) can read the pointer and, based on the dynamic type of the object (read from the vptr),
return a pointer with the same lower 32 bits, but whose high 32 bits can be different from the pointer it took.

but again, if i add comparisons, assumes, or asserts that prove otherwise, it'll propagate.
You simply aren't going to be able to stop this propagation, nor should you be able to.
It violates a fundamental property of ssa - values never die :)

My problem with your barrier is simply that *when propagation happens*, it generates wrong code, instead of "losing optimization"

Compare to the noalias intrinsic:
If i propagate the argument of llvm.noalias to its result, i lose optimization, but i don't break the program.

Another problem that needs to be mentioned is skipping the barrier based on comparisons:

Base *a = new Base();
a->foo();
Base *b = new(a) Derived();  
assert(a == b);
b->foo();

So after the assert, LLVM could replace the SSA value of b with a,

Yes, and it will.

which would skip the barrier and cause wrong devirtualization.
That is another reason why we need to add barriers before comparisons, like:
barrier(a) == barrier(b)

but it would not be enough if we knew that barrier(a) produces the same value
as barrier(b).

I don't think this is going to work out.
Honestly.
It is always possible to find ways to break it based on value equivalence.

You can't stop all forms of proving value equivalence.

I've talked with Chandler and we think that adding the barrier in:

  • any cast from a dynamic type to a non-dynamic one (void*, int64, etc.)
  • any place where a pointer to a dynamic object is compared in any way
  • the standard places (ctor, dtor, placement new, union)

will be enough to not propagate the value. So as long as you change A to B, and both A and B are after the barrier, then we will never skip the barrier (replace an SSA value from "after the barrier" with one from "before the barrier"); a sketch of the first case follows.
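
As a sketch of the first bullet (hypothetical IR; %struct.Derived and @use are made-up names), a cast away from the dynamic type would go through the barrier:

%d = alloca %struct.Derived
%raw = bitcast %struct.Derived* %d to i8*
%laundered = call i8* @llvm.invariant.group.barrier(i8* %raw)
; every "non-dynamic" use (comparison, hashing, printing the address, ...)
; only ever sees the value from after the barrier
call void @use(i8* %laundered)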

I'll make you a deal:
If you are willing to fix this when it inevitably breaks, without hacking up the optimizations to do so, i'm happy to remove my objection :)

Prazek added a comment.Apr 2 2017, 2:32 PM

I'll make you a deal:
If you are willing to fix this when it inevitably breaks, without hacking up the optimizations to do so, i'm happy to remove my objection :)

invariant.group is my baby, ofc I would fix it :) I am also very happy to learn a better way to model this. So far, after a couple of years, I don't see any better solution - we can't
mark this intrinsic with the returned attribute, because we can't expose it to SSA. The only place that should know about it is AA. This knowledge shouldn't break anything, because
we won't replace the returned value based on MustAlias, and we want to perform the same optimizations as if we stripped all the invariant.group metadata and barriers.

The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
I was also planning to add a "mustalias" attribute to the barrier after figuring it out with Hal and Chandler, so I would look for functions with this attribute instead of the barriers, but
it doesn't really matter that much.
You can check out the proposal, which specifies the problem and a bunch of optimizations that I care about, here:
https://docs.google.com/document/d/1bY9edrxgEMnqo0VxlF8cBmzu9Avrue7RqC5S3HISBiQ/edit?usp=sharing

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

Prazek updated this revision to Diff 93816.Apr 2 2017, 2:39 PM

reorder PSK enum and perform clang-format

Prazek added a comment.EditedApr 2 2017, 2:46 PM

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places, like the DSE case you saw, which doesn't use AA and relies on SSA values only.

edit: I know that there is also GlobalOpt that I have to take a look at, but it is not as important as GVN or DSE

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

Prazek added a comment.Apr 2 2017, 3:00 PM

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

It is not about correctness, it is only about optimizations. If I remove this patch, then it will be as correct as it was before.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

The only bad thing that could happen is if someone used stripPointerCastsWithBarriers() outside of AA, which would be invalid. Besides this, I don't see
how the information that two pointers must-alias could break anything. Do you have some thoughts on what could break?
Anyway, as I said, I feel responsible for the invariant.group topics, so I will always be happy to fix them.

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

It is not about correctness, it is only about optimizations. If I remove this patch, then it will be as correct as it was before.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

The only bad thing that could happen is if someone would use stripPointerCastsWithBarriers() outside of AA, which would be invalid. Beside this, I don't see
how the information that 2 pointers mustalias could break anything. Do you have some thoughts on what could break?

I suspect nicer gvn's will break.
There are way too many ways to use mustalias.
While we can say mustalias does not prove value equivalence directly, it can be used indirectly.
For example (and remember, i can ignore whatever metadata i like :P)
store q
a = load b
d = b
c = load d
if b == d, and b mustalias q, then d mustalias q.

IE in:
store q
a = load b
c = load d
if b mustalias q, at least in our system, we expect b == d
(TBAA is the canonical example of what breaks this, but we still rely on it in some cases)

This travels backwards in time, too
store 5, b
c = load b
store 5, a
d = load a
if a mustalias b, the first store and load are dead, etc.

So chalk my feeling up to "general unease having gone down this path before".

Prazek added a comment.Apr 2 2017, 3:39 PM

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

"The only optimizations that invariant.group.barrier stops are the ones optimizing memory.
"
These are literally our most important optimizations.
Really.
Scalar optimizations just about don't matter anymore.
We can perform billions of wasted scalar calculations per second, but not so much on the memory front.

I agree, but I hope that this patch will fix most of that :) There might be some missing places like the DSE that you saw that doesn't use AA and rely on SSA values only.

again, they shouldn't have to rely on AA to get *conservatively correct answers about the values*, only to optimize *the memory it touches* better.

It is not about correctness, it is only about optimizations. If I remove this patch, then it will be as correct as it was before.

edit: I know that there is also GlobalOpt that I have to take a look, but it is not as important as GVN or DSE

I'm fine with this for now, i just suspect it's going to break in highly creative ways as we get better about optimization.

The only bad thing that could happen is if someone would use stripPointerCastsWithBarriers() outside of AA, which would be invalid. Beside this, I don't see
how the information that 2 pointers mustalias could break anything. Do you have some thoughts on what could break?

I suspect nicer gvn's will break.
There are way too many ways to use mustalias.
While we can say mustalias does not prove value equivalence directly, it can be used indirectly.
For example (and remember, i can ignore whatever metadata i like :P)
store q
a = load b
d = b
c = load d
if b == d, and b mustalias q, then d mustalias q.

Yep, this is totally valid reasoning, but as long as we go from pointer equality to mustalias, the barriers should not break :)

IE in:
store q
a = load b
c = load d
if b mustalias q, at least in our system, we expect b == d
(TBAA is the canonical example of what breaks this, but we still rely on it in some cases)

I don't think we should go from mustalias to equality. It is valid in some sense, but imho we should treat it as a property that we can't derive from mustalias.

This travels backwards in time, too
store 5, b
c = load b
store 5, a
d = load a
if a mustalias b, the first store and load are dead, etc.

yep, this is exactly what I am looking for, and it should not break the barriers :)

So chalk my feeling up to "general unease having gone down this path before".

"I don't think we should go from mustalias to equality. It is valid in some sense, but imho we should treat it as a property, that we can't derive from mustalias.

"

I don't actually disagree, of course.
aliasing is a property of abstract memory locations
value equivalence is a property of bits.
B implies A but A does not imply B :)

However, outside of memoryssa, which makes this very explicit (IE SSA form for abstract memory locations), the distinction has a strong tendency to bleed through optimizations.

Prazek added a comment.Apr 2 2017, 3:55 PM

"I don't think we should go from mustalias to equality. It is valid in some sense, but imho we should treat it as a property, that we can't derive from mustalias.

"

I don't actually disagree, of course.
aliasing is a property of abstract memory locations
value equivalence is a property of bits.
B implies A but A does not imply B :)

However, outside of memoryssa, which makes this very explicit (IE SSA form form for abstract memory locations), the distinction has a strong tendency to bleed through optimizations .

"I don't think we should go from mustalias to equality. It is valid in some sense, but imho we should treat it as a property, that we can't derive from mustalias.

"

I don't actually disagree, of course.
aliasing is a property of abstract memory locations
value equivalence is a property of bits.
B implies A but A does not imply B :)

Cool, I am happy we agree on this :)

However, outside of memoryssa, which makes this very explicit (IE SSA form for abstract memory locations), the distinction has a strong tendency to bleed through optimizations.

I guess we just have to be careful not to get from mustalias to equality in this case.
What about adding a test like:

; Run with -O3 and other passes that are not enabled by default that could break it
%b = call i8* @barrier(i8* %a)
; Check here that it wasn't constant folded
%r = icmp eq i8* %b, %a
call void @use(i1 %r)

This will not prove someone broke it, but it should help keep the semantics right.
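
Something along these lines, perhaps (a hypothetical FileCheck sketch; the exact pipeline and the helper names are assumptions, not part of this patch):

; RUN: opt -O3 -S < %s | FileCheck %s
declare i8* @llvm.invariant.group.barrier(i8*)
declare void @use(i1)

define void @barrier_not_folded(i8* %a) {
; CHECK-LABEL: @barrier_not_folded(
; CHECK: call i8* @llvm.invariant.group.barrier
; CHECK: icmp eq
  %b = call i8* @llvm.invariant.group.barrier(i8* %a)
  %r = icmp eq i8* %b, %a
  call void @use(i1 %r)
  ret void
}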

sanjoy added a comment.Apr 2 2017, 4:38 PM

It can't, because the optimizations based on invariant.group will not strip barriers; they only work on SSA values.

The outcome of all the optimizations will be:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = vptr

Which is totally valid, because barrier doesn't change any memory.

Are you saying that if I have (forget about barrier for a second):

%v0 = load %X
...
%v1 = load %Y

then even if %X must-alias %Y I cannot transform the above to:

%v0 = load %X
...
%v1 = load %X

which seems pretty weird. What does "must alias" even mean then?

The other thing here is that http://llvm.org/docs/AliasAnalysis.html#the-alias-method says: "A MustAlias response implies that the pointers compare equal." However, I think this is a bug in our AA specification, and the documentation should be fixed to say that the pointers may not be equal.

sanjoy added a comment.Apr 2 2017, 6:01 PM

(This is devolving into a general conversation about invariant.group; so please let me know if I should move this elsewhere.)

One more thing to document is that with invariant.group in the picture, certain kinds of "specializations" are no longer correct. For instance, I cannot transform:

call @f(i8* ptr)

to

if (ptr == X)
  call @f(i8* ptr)
else
  call @f(i8* ptr)

since it is possible that ptr == barrier(X) (in a way that isn't visible), and therefore the above can be GVN'ed into

if (ptr == X)
  call @f(i8* X)
else
  call @f(i8* ptr)

which is a problem if @f loads the vtable of X.

This needs to be carefully documented somewhere, though I'm not sure where.

It can't, because the optimizations based on invariant.group will not strip barriers; they only work on SSA values.

The outcome of all the optimizations will be:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = vptr

Which is totally valid, because barrier doesn't change any memory.

Are you saying that if I have (forget about barrier for a second):

%v0 = load %X
...
%v1 = load %Y

then even if %X must-alias %Y I cannot transform the above to:

%v0 = load %X
...
%v1 = load %X

Yes.
If you can prove them value equivalent, you can transform them, but not just because they refer to the same memory location.

which seems pretty weird. What does "must alias" even mean then?

It means the language says the memory pointed to by these two locations is the same memory.
That does not imply the pointers are equal.

IE it's about the pointed-to memory, *not* about the pointer to that memory.

IE i could have a language where different areas of memory are considered the exact same, and carefully manage them so it's not user-visible.
They are must-alias. You can't necessarily replace one with the other.

For example,

load %x
I move area pointed to by %x to where %y is
load %y

These are must-alias, but the pointers are not value equivalent.

It is certainly possible to have an IR where must-alias implies value equivalence.
It's not even uncommon.
So if we want to go that route, we certainly can.
But it's definitely also valid to not do so.

Note you can go further if you want.

If you have must-alias != pointer value equivalence, then in a world where they are different, realistically the only thing i have to prove is: at any point in the program where something accesses memory, they access the same memory.

Thus, the following could be legal (i'm going to do it in non-ssa, just imagine it's two-level pointers)

%x = a place
load %x
%x = nonsense
%x = %z
load %z

where %z and %x could still be must-alias.

(Note this is pretty aggressive, i don't think we would go this far)

The converse is true, FWIW:
If they must be pointer-equal, situations like the above can't be must-alias; if they are, you may transform the program incorrectly, particularly if you allow that equivalence to go back in time and affect earlier statements.

sanjoy added a comment.Apr 2 2017, 6:21 PM

One more thing to document is that with invariant.group in the picture, certain kinds of "specializations" are no longer correct. For instance, I cannot transform:

Okay, false alarm (for some definition of "alarm" :) ), this is not new to the @llvm.invariant.group.barrier intrinsic -- we already can't do this kind of transform because of realloc.

Say we have a program that prints 20:

new = realloc(old);
*new = 20;
print *new;

to

new = realloc(old);
if (new == old) {
  *new = 20;
  print *new;
} else {
  *new = 20;
  print *new;
}

to

new = realloc(old);
if (new == old) {
  *old = 20;
  print *new;
} else {
  *new = 20;
  print *new;
}

to (since realloc returns noalias)

new = realloc(old);
if (new == old) {
  print *new;
  *old = 20;
} else {
  *new = 20;
  print *new;
}

and now if realloc manages to reallocate in place, we'll print garbage instead of 20.

sanjoy added a comment.Apr 2 2017, 7:08 PM

It can't, because the optimizations based on invariant.group will not strip barriers; they only work on SSA values.

The outcome of all the optimizations will be:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = vptr

Which is totally valid, because barrier doesn't change any memory.

Are you saying that if I have (forget about barrier for a second):

%v0 = load %X
...
%v1 = load %Y

then even if %X must-alias %Y I cannot transform the above to:

%v0 = load %X
...
%v1 = load %X

Yes.
If you can prove them value equivalent, you can transform them, but not just because they refer to the same memory location.

which seems pretty weird. What does "must alias" even mean then?

It means the language says the memory pointed to by these two locations is the same memory.
That does not imply the pointers are equal.

IE it's about the pointed-to memory, *not* about the pointer to that memory.

I was not using pointer equality in the above example. I meant to say that if the two SSA values %X and %Y (of type i32*, say) are MustAlias, then they must refer to the same memory location by definition (whether they're pointer equal or not is irrelevant). Which means loading from one should be the same as loading from the other (otherwise what does "referring to the same memory location" even mean?).

IE i could have language where different areas of memory are considered the exact same, and carefully manage them so it's not user-visible.
They are must-alias. You can't necessarily replace one with the other.

I understand that I can't replace one with the other in arbitrary computation (like pointer comparisons). But it seems to me that we should be able to replace them in contexts that only load / store through those pointers.

For example,

load %x
I move area pointed to by %x to where %y is
load %y

These are must-alias, but the pointers are not value equivalent.

True, but this isn't contradictory with my previous example.

It is certainly possible to have an IR where must-alias implies value equivalence.
It's not even uncommon.
So if we want to go that route, we certainly can.

As I mentioned on IRC, in http://llvm.org/docs/AliasAnalysis.html#must-may-and-no-alias-responses we specifically say "A MustAlias response implies that the pointers compare equal." However, I agree that that's just a specification bug.

But it's definitely also valid to not do so.

Note you can go further if you want.

If you have must-alias != pointer value equivalence, then in a world where they are different, realistically the only thing i have to prove is: at any point in the program where something accesses memory, they access the same memory.

Sure - that makes sense. I imagine multi-mapping scenarios where bitwise inequal pointers may be accessing the same memory.

Thus, the following could be legal (i'm going to do it in non-ssa, just imagine it's two-level pointers)

%x = a place
load %x
%x = nonsense
%x = %z
load %z

where %z and %x could still be must-alias.

The example wasn't clear to me -- when you say %x and %z are must-alias, do you mean the value stored in %x, or the value %x itself (of type int** presumably)?

(Note this is pretty aggressive, i don't think we would go this far)

The converse is true, FWIW:
If they must be pointer-equal, situations like the above can't be must-alias, if they are, you may transform the program incorrectly, particularly if you allow that equivalence to go back in time and affect earlier statements.

Prazek added a comment.Apr 3 2017, 2:48 AM

It can't, because the optimizations based on invariant.group will not strip barriers; they only work on SSA values.

The outcome of all the optimizations will be:

i8* ptr_0 = ...
i8 old_val = load ptr_0, !invariant.group
store vptr to ptr_0
i8* ptr_1 = barrier(ptr_0)
i8 val = vptr

Which is totally valid, because barrier doesn't change any memory.

Are you saying that if I have (forget about barrier for a second):

%v0 = load %X
...
%v1 = load %Y

then even if %X must-alias %Y I cannot transform the above to:

%v0 = load %X
...
%v1 = load %X

Yes.
If you can prove them value equivalent, you can transform them, but not just because they refer to the same memory location.

which seems pretty weird. What does "must alias" even mean then?

It means the language says the memory pointed to by these two locations is the same memory.
That does not imply the pointers are equal.

IE it's about the pointed-to memory, *not* about the pointer to that memory.

I was not using pointer equality in the above example. I meant to say that if the two SSA values %X and %Y (of type i32*, say) are MustAlias, then they must refer to the same memory location by definition (whether they're pointer equal or not is irrelevant). Which means loading from one should be the same as loading from the other (otherwise what does "referring to the same memory location" even mean?).

I would say that if you have these two loads of different SSA pointers, but you know they must-alias, then you know that the load will produce the same value as a load through the second pointer.
With that information you could (sometimes) remove the load, but you can't replace %X with %Y.
Actually it would probably be safe to replace only this one use, but if you are not planning to remove the load, then it should not make any difference for optimizations.
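
For example, a sketch (hypothetical IR) of the transform that stays legal under this model -- the redundant load is removed and its value reused, but the pointer SSA values themselves are left alone:

%v0 = load i8, i8* %X
%v1 = load i8, i8* %Y        ; %X and %Y are MustAlias, no clobber in between
call void @use(i8 %v1)

; =>

%v0 = load i8, i8* %X
call void @use(i8 %v0)       ; %Y is not rewritten to %X anywhere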

sanjoy added a comment.Apr 3 2017, 9:07 AM

I was not using pointer equality in the above example. I meant to say that if the two SSA values %X and %Y (of type i32*, say) are MustAlias, then they must refer to the same memory location by definition (whether they're pointer equal or not is irrelevant). Which means loading from one should be the same as loading from the other (otherwise what does "referring to the same memory location" even mean?).

I would say that if you have these two loads of different SSA pointers, but you know they must-alias, then you know that the load will produce the same value as a load through the second pointer.
With that information you could (sometimes) remove the load, but you can't replace %X with %Y.

That seems a bit counter-intuitive to me, but I'd be happy with an edit to docs/AliasAnalysis.rst clearly explaining this.

Btw, I realized my example was misleading -- there is no need for the first load. I could have just said:

%t = load %Y

to (with %X must-alias %Y)

%t = load %X

Actually it would be probably safe to replace only this one use, but if you are not planning to remove the load, then it should not make any difference for optimizations.

I presume you'll at least need to drop the invariant.group metadata from the load whose pointer you're transforming. But, as I said, I won't be unhappy with disallowing the above transform altogether.

One more thing -- I suppose you can't use this MustAlias together with the properties conferred by the invariant.group metadata, right? E.g. in

vtbl_0 = load ptr0, !invariant.group
// placement new
*ptr0 = new_vtable
ptr1 = barrier(ptr0)
vtbl_1 = load ptr1, !invariant.group

you can't apply both of these rules

a. ptr0 must-alias ptr1
b. vtbl_1 is an invariant.group, so the store of the new vtable can be ignored

since that will let you forward vtbl_0 to vtbl_1, which isn't what you want.

If that's true, I think it too is subtle enough to put in writing (probably as a comment).

I was not using pointer equality in the above example. I meant to say that if the two SSA values %X and %Y (of type i32*, say) are MustAlias, then they must refer to the same memory location by definition (whether they're pointer equal or not is irrelevant). Which means loading from one should be the same as loading from the other (otherwise what does "referring to the same memory location" even mean?).

I would say that if you have these two loads of different SSA pointers, but you know they must-alias, then you know that the load will produce the same value as a load through the second pointer.
With that information you could (sometimes) remove the load, but you can't replace %X with %Y.

That seems a bit counter-intuitive to me, but I'd be happy with an edit to docs/AliasAnalysis.rst clearly explaining this.

Btw, I realized my example was misleading -- there is no need for the first load. I could have just said:

%t = load %Y

to (with %X must-alias %Y)

%t = load %X

Actually it would be probably safe to replace only this one use, but if you are not planning to remove the load, then it should not make any difference for optimizations.

I presume you'll at least need to drop the invariant.group metadata from the load whose pointer you're transforming. But, as I said, I won't be unhappy with disallowing the above transform altogether.

One more thing -- I suppose you can't use this MustAlias together with the properties conferred by the invariant.group metadata, right? E.g. in

vtbl_0 = load ptr0, !invariant.group
// placement new
*ptr0 = new_vtable
ptr1 = barrier(ptr0)
vtbl_1 = load ptr1, !invariant.group

you can't apply both of these rules

a. ptr0 must-alias ptr1
b. vtbl_1 is an invariant.group, so the store of the new vtable can be ignored

since that will let you forward vtbl_0 to vtbl_1, which isn't what you want.

If that's true, I think it too is subtle enough to put in writing (probably as a comment).

But what advantage does it give us that we can replace one value with another in loads/stores? I don't think it gives any, because if a pass wanted to take advantage of it, it could do the same thing by just using AA. It adds unnecessary complexity and we would lose some optimizations (like the dropping of invariant.group).
I think we should keep the model we have right now, in which optimizations can know that loads of different pointers that must-alias produce the same value, but they can't replace the SSA values.

But what advantage does it give us that we can replace one value with another in loads/stores?

The same redundancy elimination advantage you get anywhere else.

I don't think it gives any, because if a pass wanted to take advantage of it, it could do the same thing by just using AA.

Err, if the address is expensive to compute, it definitely gives an advantage. It also helps prove conditionals, and a whole host of things.
We canonicalize and redundancy eliminate not just for speed, but to make the IR look simpler for other things.

It adds unnecessary complexity

To what?
Invariant.group?
It's metadata, vs regular IR semantics :)
I'm okay with adding complexity to metadata if it makes regular IR simpler to reason about.
It's complexity to say "it's okay to infer this but not that".
Complexity cuts both ways here

and we would lose some optimizations (like the dropping of invariant.group).

Honestly, i think you are a bit blinded here.
We lose optimization either way. You just happen to like the optimizations it helps with. But you also assume we'll lose them if we don't do it that way.
That's probably wrong. We can almost always invent better ways to represent things.
:)

I think we should keep the model we have right now, in which optimizations can know that loads of different pointers that must-alias produce the same value, but they can't replace the SSA values.

Maybe. But i'm going to be honest. I don't think your argument for why we should do that seems very strong so far, and sanjoy is making a pretty cogent argument.

Does that mean we should do it? No, but we need to have a real discussion of the benefits/tradeoffs.
The benefits are not "we can devirtualize", because there are lots of ways to make that happen. The main benefit is it's already done. But that doesn't mean we shouldn't redesign it if we need to.
In the meantime, if it means invariant.group can't be as useful in as many places, we'll get over it.
We always find better ways to express things.

We are rarely getting metadata exactly right on the first try, and that's okay.
We shouldn't stick with things just because we did it.
(we shouldn't change things just because we can, either)

I don't think there is any advantage to canonicalizing like this based on mustalias - it is possible, but if there were a real advantage, someone would have already figured it out and we would be doing it. My suspicion is that every intrinsic that returns its argument, and doesn't care about keeping the value, is already marked with the returned attribute, and we do already replace those.

There is another argument for why we should not replace the values.

%x = call i32 @foo(i8* %ptr)

If we know that @foo will produce the same value if we gave it another pointer, %ptr2 (where we don't know whether %ptr == %ptr2), is that a valid reason to replace it with %ptr2?

%x = call i32 @foo(i8* %ptr2)

In general I don't think it is. Even if @foo were readonly or readnone, that shouldn't be a good enough reason to do so. But the knowledge is still very useful if we can eliminate calls to @foo based
on it, like:

%x = call i32 @foo(i8* %ptr)
%t = call i32 @foo(i8* %ptr2)

=>
%x = call i32 @foo(i8* %ptr)
%t = %x

I am really open to other ideas and I hope Chandler will join the discussion.

Hi Piotr,

My point was less that "doing the replacement is super important for performance" and more that "I'm worried that if we cannot do that transform we must have changed something fundamental".

It is the same as, hypothetically, not being able to transform %t = add i32 5, 10 to %t = add i32 10, 5. Obviously, the transformation is not very useful (since you'll normally just fold %t to 15) but the fact that we (hypothetically) cannot do that is surprising.

In any case, I think the replacement is fine as long as you also drop the !invariant.group associated with the load, if any. That is, the following is fine I think:

load i32 %X, !invariant.group

(%X must-alias %Y) =>

load i32 %Y

However, that needs to be documented, and part of the spec for !invariant.group.
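
Concretely, a minimal before/after sketch (assuming typed pointers and some group metadata node !0, both hypothetical here) of that rule:

; before (%X must-alias %Y):
%v = load i32, i32* %X, !invariant.group !0

; after: the pointer operand is changed, and the !invariant.group is dropped
%v = load i32, i32* %Y

!0 = !{!"group"}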

Prazek added a comment.Apr 3 2017, 1:53 PM

Hi Piotr,

My point was less that "doing the replacement is super important for performance", but that "I'm worried that if we cannot do that transform we must have changed something fundamental".

It is the same as, hypothetically, not being able to transform %t = add i32 5, 10 to %t = add i32 10, 5. Obviously, the transformation is not very useful (since you'll normally just fold %t to 15) but the fact that we (hypothetically) cannot do that is surprising.

In any case, I think the replacement is fine as long as you also drop the !invariant.group associated with the load, if any. That is, the following is fine I think:

load i32 %X, !invariant.group

(%X must-alias %Y) =>

load i32 %Y

However, that needs to be documented, and part of the spec for !invariant.group.

Ok, if this is the case then I totally agree - the invariant.group is associated with the SSA value, so we need to drop it if we performed such a transformation. I will try to send a patch for review about the documentation tomorrow.

Ok, the documentation has been fixed in https://reviews.llvm.org/D31758. Is there anything I should fix before submitting this patch?

ping, it is blocking 2 other patches

sanjoy accepted this revision.Apr 24 2017, 10:58 AM

While I'm still not 100% sure about the approach, at this point I'm okay with this going in.

One thing you can do to assuage my concerns quite a bit is specify the invariant.group feature as "experimental" (after discussing on llvm-dev). That way if we find issues with the design later on we can change it in backward incompatible ways without breaking users already using it for things other than devirtualization.

include/llvm/IR/Value.h
490 ↗(On Diff #93816)

I'd s/stripPointerCastsWithBarriers/stripPointerCastsAndBarriers/

This revision is now accepted and ready to land.Apr 24 2017, 10:58 AM

While I'm still not 100% sure about the approach, at this point I'm okay with this going in.

One thing you can do to assuage my concerns quite a bit is specify the invariant.group feature as "experimental" (after discussing on llvm-dev). That way if we find issues with the design later on we can change it in backward incompatible ways without breaking users already using it for things other than devirtualization.

I think it is a good idea. I will do so

This revision was automatically updated to reflect the committed changes.
Prazek marked an inline comment as done.