This is an archive of the discontinued LLVM Phabricator instance.

[ConstantFolding] Look through local aliases when simplify GEPs
AbandonedPublic

Authored by aeubanks on Mar 24 2021, 12:15 AM.

Details

Reviewers
rnk
pcc
MaskRay
Summary

Only consider aliases that themselves aren't interposable and alias to
something that isn't interposable.

MSVC-style RTTI uses GEPs through local aliases for vtables.

With this patch, we can simplify the following C++ code into just one
function with a constant return value:

namespace {
        struct A {
                virtual int f() = 0;
                virtual ~A() = default;
        };
        struct B: A {
                virtual int f() override { return 41; }
                virtual ~B() = default;
        };
}

int foo() {
        std::unique_ptr<A> a = std::make_unique<B>();
        return a->f();
}

Diff Detail

Event Timeline

aeubanks created this revision.Mar 24 2021, 12:15 AM
aeubanks requested review of this revision.Mar 24 2021, 12:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2021, 12:15 AM
llvm/lib/IR/Value.cpp
612–613

I wonder if it would be safe to power up all of the other versions of pointer stripping to look through local aliases.

So I guess I don't understand what this does differently than stripPointerCastsAndAliases? Is limiting this to local aliases that important?

pcc requested changes to this revision.Mar 24 2021, 8:28 PM

Can you make it so that the load ends up being replaced with the loaded value, without replacing the GEP itself? That would seem to address the MSVC issue.

llvm/lib/IR/Value.cpp
612–613

I don't think we should. It wouldn't be safe to allow passes to replace @alias with @aliasee with this IR:

@alias = internal alias @aliasee
@aliasee = linkonce_odr global i32 42

We should also avoid the QoI issue (missing symbol table entry) that would result from such replacement given this IR:

@alias = internal alias @aliasee
@aliasee = private global i32 42
This revision now requires changes to proceed.Mar 24 2021, 8:28 PM
In D99240#2649567, @pcc wrote:

Can you make it so that the load ends up being replaced with the loaded value, without replacing the GEP itself? That would seem to address the MSVC issue.

What's the benefit of that over this approach? We'd have to potentially end up doing the same thing for stores and maybe other places.

llvm/lib/IR/Value.cpp
612–613

Why is the first transform not safe?

So I guess I don't understand what this does differently than stripPointerCastsAndAliases? Is limiting this to local aliases that important?

I initially tried using stripPointerCastsAndAliases(), but gep-alias.ll failed because it was specifically checking that we don't look through non-local aliases. The original comments in the test explain it better.

nikic added a subscriber: nikic.Mar 25 2021, 10:31 AM

I would expect those internal aliases to be replaced by the aliasee in the first place -- is the problem that this happens too late?

pcc added a comment.Mar 25 2021, 10:53 AM
In D99240#2649567, @pcc wrote:

Can you make it so that the load ends up being replaced with the loaded value, without replacing the GEP itself? That would seem to address the MSVC issue.

What's the benefit of that over this approach?

In general, the use of an alias is deliberate. We shouldn't be indiscriminately replacing them with aliasees as that may break semantics or lead to QoI issues. Instead, we should be making targeted replacements where they are known to be safe. In particular, replacing the load with the loaded value is safe because linkonce_odr (or external, etc.) on the alias guarantees that the value loaded from any copy of the aliasee is the same. So it doesn't matter whether we (a compile time) read from this translation unit's copy of the global or some other copy.

We'd have to potentially end up doing the same thing for stores and maybe other places.

Can you show an example of a beneficial transformation that would apply to stores?

llvm/lib/IR/Value.cpp
612–613

Because it would replace a reference to this object file's copy of @aliasee with a reference to the copy of @aliasee selected by the linker, and those could potentially be different (and have different addresses, which is observable).

I would expect those internal aliases to be replaced by the aliasee in the first place -- is the problem that this happens too late?

Which pass does this?

The original snippet had a GEP of an alias which itself is another GEP into a global.

I would expect those internal aliases to be replaced by the aliasee in the first place -- is the problem that this happens too late?

Which pass does this?

The original snippet had a GEP of an alias which itself is another GEP into a global.

At least -globalopt does that. Not sure if any other passes do.

Ah I see, https://github.com/llvm/llvm-project/blob/61a55c8812e790842799ba1de5bd81fe8afb3b16/llvm/lib/Transforms/IPO/GlobalOpt.cpp#L2953. It only works if the aliasee is a global value. Extending globalopt to handle aliasees that may be a constant expression containing a global value, rather than just a global value, would probably work as well. Either way is fine by me.

globalopt specifically deletes aliases when possible, so it seems fair to do that here as well, even if that may remove entries from the symbol table.

llvm/lib/IR/Value.cpp
612–613

It doesn't end up using the copy of @aliasee selected by the linker?

globalopt actually performs this transform:

$ cat /tmp/y.ll
@alias = internal alias i32, i32* @aliasee
@aliasee = linkonce_odr global i32 42

define i32 @foo() {
  %i = load i32, i32* @alias
  ret i32 %i
}
$ opt -passes=globalopt /tmp/y.ll -S
@aliasee = linkonce_odr local_unnamed_addr global i32 42

define i32 @foo() local_unnamed_addr {
  %i = load i32, i32* @aliasee, align 4
  ret i32 %i
}
pcc added a comment.Mar 25 2021, 12:30 PM

Ah I see, https://github.com/llvm/llvm-project/blob/61a55c8812e790842799ba1de5bd81fe8afb3b16/llvm/lib/Transforms/IPO/GlobalOpt.cpp#L2953. It only works if the aliasee is a global value. Extending globalopt to handle aliasees that may be a constant expression containing a global value, rather than just a global value, would probably work as well. Either way is fine by me.

globalopt specifically deletes aliases when possible, so it seems fair to do that here as well, even if that may remove entries from the symbol table.

The presence of a bug in globalopt doesn't justify introducing a bug here as well.

llvm/lib/IR/Value.cpp
612–613

It doesn't end up using the copy of @aliasee selected by the linker?

No, it doesn't. @alias is specifically a reference to this object file's copy of the data defined by the @aliasee global.

globalopt actually performs this transform:

I think that's a bug then.

Ah I see, https://github.com/llvm/llvm-project/blob/61a55c8812e790842799ba1de5bd81fe8afb3b16/llvm/lib/Transforms/IPO/GlobalOpt.cpp#L2953. It only works if the aliasee is a global value. Extending globalopt to handle aliasees that may be a constant expression containing a global value, rather than just a global value, would probably work as well. Either way is fine by me.

Could you share (or maybe add to the test) the actual alias/gep structure you want to optimize? It would make sense to me to always replace aliases with local linkage, even if they don't directly point to a global value. For non-local (but non-interposable) linkage it may not be profitable to do this. (I guess it might not be universally profitable for local linkage either in that an expensive constant expression could be duplicated, but I think we usually assume that constant expressions are cheap.)

llvm/lib/IR/Value.cpp
612–613

Because it would replace a reference to this object file's copy of @aliasee with a reference to the copy of @aliasee selected by the linker, and those could potentially be different (and have different addresses, which is observable).

I don't think I understand the problem. Ignoring unnamed_addr, @alias and @aliasee are required to have the same address. If another copy of @aliasee is chosen by the linker, then @alias is required to point to that other copy, which is compatible with replacing uses of @alias with @aliasee in the first place.

pcc added inline comments.Mar 25 2021, 1:02 PM
llvm/lib/IR/Value.cpp
612–613

Sorry, but that's not how aliases work. An alias is just another symbol that points to the same section as the aliasee. There's nothing linking them at the object file level. So they are each independently subject to symbol resolution by the linker.

nikic added inline comments.Mar 25 2021, 1:13 PM
llvm/lib/IR/Value.cpp
612–613

This is how aliases are specified by LangRef:

Aliases that are not unnamed_addr are guaranteed to have the same address as the aliasee expression. unnamed_addr ones are only guaranteed to point to the same content.

If this doesn't match your understanding of how aliases work, please submit a patch to adjust the specification.

The (relevant part of the) code in the description after -O2 is:

@global = private unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* bitcast (%struct.wibble* @global.1 to i8*), i8* bitcast (i32 (%struct.bar*)* @spam.12 to i8*), i8* bitcast (i8* (%struct.bar*, i32)* @blam to i8*)] }, comdat($"??_7B@?A0x8E5408E2@@6B@")

@alias = internal unnamed_addr alias i8*, getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @global, i32 0, i32 0, i32 1)

define dso_local i32 @foo() local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @spam to i8*) {
bb:
  %tmp = tail call noalias nonnull dereferenceable(8) i8* @"??2@YAPEAX_K@Z"(i64 8) #5, !noalias !8
  %tmp1 = bitcast i8* %tmp to %struct.bar*
  %tmp2 = getelementptr inbounds %struct.bar, %struct.bar* %tmp1, i64 0, i32 0, i32 0
  store i32 (...)** bitcast (i8** @alias to i32 (...)**), i32 (...)*** %tmp2, align 8, !tbaa !11, !noalias !8
  %tmp3 = getelementptr %struct.bar, %struct.bar* %tmp1, i64 0, i32 0
  %tmp4 = load i8* (%struct.eggs*, i32)*, i8* (%struct.eggs*, i32)** bitcast (i8** getelementptr inbounds (i8*, i8** @alias, i64 1) to i8* (%struct.eggs*, i32)**), align 16
  %tmp5 = tail call i8* %tmp4(%struct.eggs* nonnull dereferenceable(8) %tmp3, i32 1) #6
  ret i32 41
}

%tmp4's operand gets simplified to the proper function pointer entry in @global with this change.

pcc added inline comments.Mar 25 2021, 1:25 PM
llvm/lib/IR/Value.cpp
612–613

I think that's talking about prior to linker symbol resolution. The LangRef could certainly be clarified on that point though. I'll see if I can get a chance to update it.

rnk added a comment.Mar 25 2021, 1:34 PM

To add some context to the IR fragment, in the MSVC C++ ABI, the vtable symbol (??_7...) points to the first virtual method slot in the table. If RTTI is enabled, a pointer to the RTTI data precedes the vtable, giving you this table (assuming 64-bit pointers):

  • rtti data (offset -8)
  • vtable symbol (??_7) -->
  • virtual method 1 (offset 0)
  • virtual method 2 (offset 8)
  • ...

LLVM does not have a concept of globals where the symbol points to the middle of the data, so we came up with this wacky alias representation, where we have private data, and then an alias into the private data with some offset.

We considered the alternative of adding some kind of "prefix data" to the IR. In the end, we rejected it, although I forget why exactly.

In contrast, the Itanium C++ ABI has fewer vtable symbols, and the address points, the addresses that are actually installed into object vptr slots, are offset from the main vtable symbol.

aeubanks updated this revision to Diff 335694.Apr 6 2021, 6:45 PM

only consider non-interposable aliases with non-interposable aliasees

aeubanks edited the summary of this revision. (Show Details)Apr 6 2021, 6:45 PM
rnk added a comment.Apr 12 2021, 2:42 PM

I think Arthur is right, it should be safe to look through local aliases to local linkage globals. I think we can actually do this in the "for alias analysis" variant of strip pointer casts, and we should consider that.

pcc added a comment.Apr 12 2021, 2:56 PM
In D99240#2684236, @rnk wrote:

I think Arthur is right, it should be safe to look through local aliases to local linkage globals. I think we can actually do this in the "for alias analysis" variant of strip pointer casts, and we should consider that.

Again, safe doesn't mean it's a good idea.

aeubanks planned changes to this revision.Apr 12 2021, 3:04 PM

I'll see if I can do this just for load instructions

llvm/lib/IR/Value.cpp
612–613

We can power down

rnk added a comment.Apr 12 2021, 7:32 PM

To add some specificity, we wouldn't want to go around ripping out all of our ??_7 vtable aliases and replacing them with references to the underlying private global. Consider:

@"??_7Foo@@6B@" = unnamed_addr alias i8*, getelementptr inbounds ({ [2 x i8*] }, { [2 x i8*] }* @0, i32 0, i32 0, i32 1)
define dso_local nonnull %struct.Foo* @"??0Foo@@QEAA@XZ"(%struct.Foo* nonnull returned align 8 dereferenceable(8) %this) unnamed_addr #0 align 2 {
entry:
  %0 = getelementptr inbounds %struct.Foo, %struct.Foo* %this, i64 0, i32 0
  store i32 (...)** bitcast (i8** @"??_7Foo@@6B@" to i32 (...)**), i32 (...)*** %0, align 8, !tbaa !3
  ret %struct.Foo* %this
}

This compiles to:

"??0Foo@@QEAA@XZ":                      # @"??0Foo@@QEAA@XZ"
# %bb.0:                                # %entry                
        movq    %rcx, %rax
        leaq    "??_7Foo@@6B@"(%rip), %rcx
        movq    %rcx, (%rax)
        retq

If you do the replacement here:

define dso_local nonnull %struct.Foo* @"??0Foo@@QEAA@XZ"(%struct.Foo* nonnull returned align 8 dereferenceable(8) %this) unnamed_addr #0 align 2 {
entry:
  %0 = getelementptr inbounds %struct.Foo, %struct.Foo* %this, i64 0, i32 0
  store i32 (...)** bitcast (i8** getelementptr inbounds ({ [2 x i8*] }, { [2 x i8*] }* @0, i32 0, i32 0, i32 1) to i32 (...)**), i32 (...)*** %0, align 8, !tbaa !3
  ret %struct.Foo* %this
}

You get this code, which is less good:

"??0Foo@@QEAA@XZ":                      # @"??0Foo@@QEAA@XZ"
# %bb.0:                                # %entry
        movq    %rcx, %rax
        leaq    .L__unnamed_1+8(%rip), %rcx
        movq    %rcx, (%rax)
        retq

That creates a relocation to the section instead of the symbol.

This alias happens to be external, so it wouldn't be stripped by this change, but even if it were internal, it's nicer to refer to the internal alias than the private symbol.

aeubanks abandoned this revision.Apr 21 2021, 12:21 PM

abandoning in favor of D100923