This is an archive of the discontinued LLVM Phabricator instance.

Simplify recursive launder.invariant.group and strip
ClosedPublic

Authored by Prazek on May 27 2018, 6:28 AM.

Details

Summary

This patch is crucial for proving equality laundered/stripped
pointers. eg:

bool foo(A *a) {
  return a == std::launder(a);
}

Clang with -fstrict-vtable-pointers will emit something like:

define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
entry:
  %c = bitcast %struct.A* %a to i8*
  %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
  %0 = bitcast %struct.A* %a to i8*
  %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
  %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
  %cmp = icmp eq i8* %1, %2
  ret i1 %cmp
}

and because %2 can be replaced with @llvm.strip.invariant.group(%0)
and that %2 and %1 will produce the same value (because strip is readnone)
we can replace compare with true.

Diff Detail

Repository
rL LLVM

Event Timeline

Prazek created this revision.May 27 2018, 6:28 AM

Otherwise looks good to me. Obviously, you should wait for someone more competent than me to approve it ;)

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1212 ↗(On Diff #148751)

nit: you use II.getArgOperand(0) in the line above and II.getOperand(0) here. Those are obviously equivalent, but maybe stick to one of those (possibly extracting it into auto *Arg = II.getArgOperand(0);)?

Prazek updated this revision to Diff 148756.May 27 2018, 9:56 AM
Prazek marked an inline comment as done.

fixed nit

kuhar added inline comments.May 30 2018, 9:49 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1223 ↗(On Diff #148756)

llvm_unreachable would be preferred here

Prazek updated this revision to Diff 150704.Jun 11 2018, 4:02 AM
Prazek marked an inline comment as done.

unreachable

Prazek updated this revision to Diff 150706.Jun 11 2018, 4:21 AM

updated test

rsmith added inline comments.Jun 28 2018, 2:56 PM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1220–1221 ↗(On Diff #150706)

I think we can do a little better for the strip(strip(p)) case. If we have:

T *p = ...;
T *q = strip(p);
T *r = strip(q);

... it would seem preferable to rewrite that as:

T *r = q;

rather than as:

T *r = strip(p);

(Will GVN rewrite the latter as the former anyway?)

1227 ↗(On Diff #150706)

If we stripped an addrspacecast, we would need to recreate it here; a bitcast might mean something else.

Maybe it would be simpler to only deal with bitcasts here, and have separate instcombine transforms for moving strip/launder past GEPs and addrspacecasts? You should decide which is canonical: addrspacecast(launder) vs launder(addrspacecast).

Whatever you do here should also ideally be able to handle non-zero-index GEPs.

Prazek added inline comments.Jul 1 2018, 10:21 PM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1220–1221 ↗(On Diff #150706)

No, none of the optimization can remove strips. I don't think it is legal in our model and I don't see it how it could work

1227 ↗(On Diff #150706)

Can you give example of transformation of non-zero index GEPs?
Aggree with the addrspacecast, I had it in the back of my head but somehow missed that.
launder(addrspacecast) should be better.

Prazek updated this revision to Diff 153671.Jul 1 2018, 10:33 PM

fixed addrspace cast

amharc added inline comments.Jul 3 2018, 2:15 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1220–1221 ↗(On Diff #150706)

Isn't strip supposed to be pure, and hence two calls to strip(p) should be CSE'd by GVN?

Prazek added inline comments.Jul 3 2018, 10:28 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1220–1221 ↗(On Diff #150706)

Oh sorry Richard, I thought you mean

T *r = p;

The outcome of this transformation will be

T *q = strip(p);
T* r = strip(p);

And then GVN will see that strip(p) is always the same and will do

T* q = strip(p);
T* r = q;

So that is exactly what you said.

rsmith added inline comments.Jul 9 2018, 11:59 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1294–1295 ↗(On Diff #153671)

I think this is still not right from the bitcast / addrspacecast perspective: if we previously had a bitcast from one address space to another, this will convert it into an addrspacecast, which may mean something else. As a pathological example, if we had:

%a = call launder(...)   ; i8 addrspace(0)*
%b = bitcast %a to i8 addrspace(1)*
%c = addrspacecast %b to i8 addrspace(2)*
%d = bitcast %c to i8 addrspacecast(3)*
%e = call launder(%d)

... then you need to rebuild an addrspacecast from address space 1 to address space 2, even though your source is in address space 0 and your destination is in address space 3. (And in general, you may need to rebuild more than one address space cast.)

1227 ↗(On Diff #150706)

For a non-zero-index GEP, I was thinking of a case like:

struct Y { virtual void f(); };
struct X { int q; Y y; };

void f(X *p) {
  Y *a = &launder(p)->y;
  Y *b = launder(&p->y);
  // ...
}

... for which I thought it would be nice if a and b could be GVN'd to the same thing. But I think now that I was wrong: a points to an object whose static type is Y and whose vptr is Y's vptr, whereas b could point to an object derived from Y that has been placement-new'd at the right address. In particular, a->f() can be devirtualized to a call to Y::f, but b->f() cannot.

That said, the issue here is not the non-zero GEP; the issue is that the member access can be thought of as adding information to the invariant group that says 'a vptr load would load this value' (but does not say that it's actually valid to synthesize such a load through the pointer). But this does mean that I don't have any examples where transformation of a non-zero-index GEP would be useful, so it doesn't seem worth working on right now :)

Prazek marked 3 inline comments as done.Jul 10 2018, 10:35 AM
Prazek added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1294–1295 ↗(On Diff #153671)

AFAIK the addrspace casts are just like bitcasts, so code like this:

define i8 addrspace(3)* @foo(i8* %a) {

  %b = addrspacecast i8* %a to i8 addrspace(1)*
  %c = addrspacecast i8 addrspace(1)* %b to i8 addrspace(2)*
  %d = addrspacecast i8 addrspace(2)* %c to i8 addrspace(3)*
  
  ret i8 addrspace(3)* %d
}

Is optimized by opt to:

define i8 addrspace(3)* @foo(i8* readnone %a) local_unnamed_addr #0 {
  %d = addrspacecast i8* %a to i8 addrspace(3)*
  ret i8 addrspace(3)* %d
}

I didn't use bitcasts because if I am not wrong, you cant cast between addrspace with bitcast. Do you think that with this assumption the transformation is correct?

rsmith added inline comments.Jul 10 2018, 12:29 PM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1294–1295 ↗(On Diff #153671)

I can't find any restriction that would prevent bitcasting between pointers to different address spaces; the only restriction on pointer bitcast I can find in the LangRef is that the source and destination pointer types be the same size. In practice, LLVM appears to produce inttoptr(ptrtoint(p)) instead of a cross-address-space bitcast, though, so maybe it's not actually permitted.

Feedback from someone who knows the IR rules better than I do would be useful :)

Prazek added inline comments.Jul 10 2018, 5:31 PM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1294–1295 ↗(On Diff #153671)

I tried the bitcasting that you showed and it failed like:

opt-5.0: test.ll:3:18: error: invalid cast opcode for cast from 'i8*' to 'i8 addrspace(1)*'

So I guess the LangRef should be more precise about it.

But anyway, Even if it would be ok, it does not change anything - we can still get rid of the bitcasts/addrspacecasts when going throught multiple conversions.

rsmith accepted this revision.Jul 12 2018, 4:16 PM
rsmith added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1294–1295 ↗(On Diff #153671)

OK, if the verifier rejects the bitcasts then I'm happy assuming they're not valid IR, and the inttoptr(ptrtoint(X)) construction is the only way to get that effect.

This revision is now accepted and ready to land.Jul 12 2018, 4:16 PM
This revision was automatically updated to reflect the committed changes.
Prazek marked 5 inline comments as done.