Page MenuHomePhabricator

Analysis: Don't look through aliases when simplifying GEPs.
AcceptedPublic

Authored by pcc on Jul 22 2019, 2:51 PM.

Details

Summary

It is not safe in general to replace an alias in a GEP with its aliasee
if the alias can be replaced with another definition (i.e. via strong/weak
resolution (linkonce_odr) or via symbol interposition (default visibility
in ELF)) while the aliasee cannot. An example of how this can go wrong is
in the included test case.

I was concerned that this might be a load-bearing misoptimization (it's
possible for us to use aliases to share vtables between base and derived
classes, and on Windows, vtable symbols will always be aliases in RTTI
mode, so this change could theoretically inhibit trivial devirtualization
in some cases), so I built Chromium for Linux and Windows with and without
this change. The file sizes of the resulting binaries were identical, so it
doesn't look like this is going to be a problem.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Jul 22 2019, 2:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2019, 2:51 PM
hfinkel accepted this revision.Jul 22 2019, 3:07 PM
hfinkel added a subscriber: hfinkel.

LGTM

This revision is now accepted and ready to land.Jul 22 2019, 3:07 PM
This revision was automatically updated to reflect the committed changes.
plotfi added a subscriber: plotfi.Jul 22 2019, 4:26 PM

I'm seeing a lit failure for gep-alias.ll on my end (Fedora 64bit). The output is:

$ /home/plotfi/Projects/llvm-project/build/bin/opt -instcombine -S -o - /home/plotfi/Projects/llvm-project/llvm/test/Analysis/ConstantFolding/gep-alias.ll
; ModuleID = '/home/plotfi/Projects/llvm-project/llvm/test/Analysis/ConstantFolding/gep-alias.ll'
source_filename = "/home/plotfi/Projects/llvm-project/llvm/test/Analysis/ConstantFolding/gep-alias.ll"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@a = internal global [3 x i8*] zeroinitializer

@b = linkonce_odr alias [3 x i8*], [3 x i8*]* @a

define i8** @f() {
  ret i8** getelementptr ([3 x i8*], [3 x i8*]* @b, i64 0, i64 1)
}

But you do a check for " ; CHECK: ret i8** getelementptr ([3 x i8*], [3 x i8*]* @b, i32 0, i32 1)" This seems odd.

pcc added a comment.Jul 22 2019, 4:33 PM

I'm seeing a lit failure for gep-alias.ll on my end (Fedora 64bit). The output is:

$ /home/plotfi/Projects/llvm-project/build/bin/opt -instcombine -S -o - /home/plotfi/Projects/llvm-project/llvm/test/Analysis/ConstantFolding/gep-alias.ll
; ModuleID = '/home/plotfi/Projects/llvm-project/llvm/test/Analysis/ConstantFolding/gep-alias.ll'
source_filename = "/home/plotfi/Projects/llvm-project/llvm/test/Analysis/ConstantFolding/gep-alias.ll"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@a = internal global [3 x i8*] zeroinitializer

@b = linkonce_odr alias [3 x i8*], [3 x i8*]* @a

define i8** @f() {
  ret i8** getelementptr ([3 x i8*], [3 x i8*]* @b, i64 0, i64 1)
}

But you do a check for " ; CHECK: ret i8** getelementptr ([3 x i8*], [3 x i8*]* @b, i32 0, i32 1)" This seems odd.

I could have sworn that I ran this test before I committed, but now I'm seeing the same behaviour as you. r366764 should fix this.

Excellent! :-)

jdoerfert reopened this revision.Jul 22 2019, 7:34 PM
jdoerfert added a subscriber: jdoerfert.

I saw this and I had a thought, please let me know what you think:

I would assume this issue is present in many more contexts. Maybe I am mistaken though.
If I am not, it is probably because we look through *any* alias by default without considering
the linker. Maybe, the right "fix" is to look through "exact definition" aliases only (by default).

(I reopened this because I do not know if Phab will otherwise send out emails).

This revision is now accepted and ready to land.Jul 22 2019, 7:34 PM
pcc added a comment.Jul 22 2019, 8:12 PM

Here is my brain dump on alias optimizations:

  • The only case in which we can soundly replace an alias with the aliasee is when we know that both have the same value.
  • This means that the transformation is only valid when both alias and aliasee are not preemptible (either at link time or runtime).
  • But if we know this, there's no difference between taking the address of the alias and taking the address of the aliasee. Replacing one with the other will result in generating the exact same code.
  • In fact, looking past aliases even when we are allowed to could be harmful from a QoI perspective:
    • Let's say that a compiler pass RAUWs an internal linkage global with an internal linkage alias pointing to a private linkage global (LowerTypeTests, WholeProgramDevirt and soon HWASAN all do this).
    • A later optimization pass that replaces references to the alias with references to the global will allow GlobalDCE to drop the alias, which means that the symbol is no longer in the symbol table, which could hurt debuggability.
  • However, there's still a benefit in looking past aliases, but only in order to look at the *contents* (not the address) of the global.
  • For example, the following is valid:
@a = private constant i8* @x
@b = linkonce_odr alias @a

define void @foo() {
  %x = load i8*, i8** @b
  ret i8* %x ; may be replaced with ret i8* @x
}
  • So I think that the ideal final state is that looking past aliases should be opt-in and only allowed when a pass introduces no dependencies on the address of the aliasee.
In D65118#1596838, @pcc wrote:
  • The only case in which we can soundly replace an alias with the aliasee is when we know that both have the same value.

That is why I thought the "FollowAliases" versions could still look through aliases if they have an "exact definition".

  • However, there's still a benefit in looking past aliases, but only in order to look at the *contents* (not the address) of the global.
  • For example, the following is valid:
@a = private constant i8* @x
@b = linkonce_odr alias @a

define void @foo() {
  %x = load i8*, i8** @b
  ret i8* %x ; may be replaced with ret i8* @x
}

Mh, this is not what I expected to be valid. To me, this looks like the classical "non-exact-definition" problem that breaks inter-procedural reasoning (@sanjoy, @reames).
What happens if I modify the example a bit:

@a = private i8 @x
@b = linkonce_odr alias @a

define i1 @always_true() {
  store i8 0, i8* @a
  %x = load i8, i8* @b
  ... ; @b and @a are not modified here but other globals might be
  %y = load i8, i8* @b
  ... ; @b and @a are not modified here but other globals might be
  %cmp = icmp eq %x, %y
  ret i1 %cmp
}

I would assume always_true to return "always" true, no matter the value of @b.
However, if we look through @b once (say for %x) and then replace it at link time
with a different alias, couldn't %y result in a different value?

[...]

  • So I think that the ideal final state is that looking past aliases should be opt-in and only allowed when a pass introduces no dependencies on the address of the aliasee.

So, should we make "NoFollowAliases" the default and add the option/variants that "FollowAliases" instead?

pcc added a comment.Jul 23 2019, 8:35 AM
In D65118#1596838, @pcc wrote:
  • The only case in which we can soundly replace an alias with the aliasee is when we know that both have the same value.

That is why I thought the "FollowAliases" versions could still look through aliases if they have an "exact definition".

  • However, there's still a benefit in looking past aliases, but only in order to look at the *contents* (not the address) of the global.
  • For example, the following is valid:
@a = private constant i8* @x
@b = linkonce_odr alias @a

define void @foo() {
  %x = load i8*, i8** @b
  ret i8* %x ; may be replaced with ret i8* @x
}

Mh, this is not what I expected to be valid. To me, this looks like the classical "non-exact-definition" problem that breaks inter-procedural reasoning (@sanjoy, @reames).
What happens if I modify the example a bit:

@a = private i8 @x
@b = linkonce_odr alias @a

define i1 @always_true() {
  store i8 0, i8* @a
  %x = load i8, i8* @b
  ... ; @b and @a are not modified here but other globals might be
  %y = load i8, i8* @b
  ... ; @b and @a are not modified here but other globals might be
  %cmp = icmp eq %x, %y
  ret i1 %cmp
}

I would assume always_true to return "always" true, no matter the value of @b.
However, if we look through @b once (say for %x) and then replace it at link time
with a different alias, couldn't %y result in a different value?

Right, and that's why I agree with you that in this case we can't replace @b with @a. In your example, you're taking an implicit dependency on the global's address by transforming a load from one global into a load from a potentially different global with different contents. When I say the "contents" of a global, what I really mean is the initializer, and only in the case where the global is constant. This, along with the linkonce_odr on the alias, guarantees that all copies of the global will have the same contents, which is what allows us to replace the load with the initializer in my example.

[...]

  • So I think that the ideal final state is that looking past aliases should be opt-in and only allowed when a pass introduces no dependencies on the address of the aliasee.

So, should we make "NoFollowAliases" the default and add the option/variants that "FollowAliases" instead?

Yes, I think that should be the goal.

pcc added a comment.Jul 23 2019, 9:02 AM

Another way to look at it is that this:

@a = private constant i8* @x
@b = linkonce_odr alias @a

should be equivalent to:

@b = linkonce_odr constant i8* @x

except that we also have a way of referencing this module's copy of @b via @a. See also https://llvm.org/bugs/show_bug.cgi?id=27866 where this was discussed as a "canonical" form of aliases.

In D65118#1597600, @pcc wrote:

Another way to look at it is that this:

@a = private constant i8* @x
@b = linkonce_odr alias @a

should be equivalent to:

@b = linkonce_odr constant i8* @x

except that we also have a way of referencing this module's copy of @b via @a. See also https://llvm.org/bugs/show_bug.cgi?id=27866 where this was discussed as a "canonical" form of aliases.

So I looked into this a bit and the code in stripPointerXXX currently checks for:

/// Return true if this global's definition can be substituted with an
/// *arbitrary* definition at link time.  We cannot do any IPO or inlinining
/// across interposable call edges, since the callee can be replaced with
/// something arbitrary at link time.
bool isInterposable() const;

not

/// Returns true if the definition of this global may be replaced by a
   /// differently optimized variant of the same source level function at link
   /// time.
   bool mayBeDerefined() const;

which is !isDefinitionExact().

What do you think?

pcc added a comment.Jul 23 2019, 12:03 PM

For the opt-in behaviour it seems that we should have the strip*() ignore the linkage and leave it up to the caller to make the correct determination based on the linkage of the alias. For example, the inliner or global const prop would check isInterposable(), and attribute-based optimizations would check isDefinitionExact().

In D65118#1597868, @pcc wrote:

For the opt-in behaviour it seems that we should have the strip*() ignore the linkage and leave it up to the caller to make the correct determination based on the linkage of the alias. For example, the inliner or global const prop would check isInterposable(), and attribute-based optimizations would check isDefinitionExact().

Mh, I think that would change how the strip*() family currently works. What about: Looking through isDefinitionExact by default for strip*() except for "NoFollowAliases". That way, users have to manually strip non-interposable aliases that do not have an exact definition and they get a somehow sane/sound result if they strip aliases. If needed, we could add a "FollowNonInterposableAliases" later as well.

pcc added a comment.Jul 23 2019, 5:06 PM
In D65118#1597868, @pcc wrote:

For the opt-in behaviour it seems that we should have the strip*() ignore the linkage and leave it up to the caller to make the correct determination based on the linkage of the alias. For example, the inliner or global const prop would check isInterposable(), and attribute-based optimizations would check isDefinitionExact().

Mh, I think that would change how the strip*() family currently works.

That's fine I think. Making strip*() more restrictive shouldn't break existing users, it should just make them optimize less (and for any in-tree users we would measure the impact and switch to a new API if necessary before doing so).

What about: Looking through isDefinitionExact by default for strip*() except for "NoFollowAliases". That way, users have to manually strip non-interposable aliases that do not have an exact definition and they get a somehow sane/sound result if they strip aliases. If needed, we could add a "FollowNonInterposableAliases" later as well.

We shouldn't look through aliases at all by default for the QoI reasons already mentioned. I wouldn't add too many different variants across this dimension since it's already growing into a bit of a mess. Just the always-strip-aliases variant should be enough in combination with the checks in the client. That way things are kept orthogonal.