This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
ClosedPublic

Authored by MaskRay on Aug 1 2021, 11:00 PM.

Details

Summary

Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. isInterposable is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the abose issues. With
the change, alias = 345 will not be rewritten to use aliasee in a -fpic
compile.

int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy

While here, refine the condition for the alias as well.

For some binary formats like COFF, isInterposable is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the Triple(M.getTargetTriple()).isOSBinFormatELF() or
getPICLevel/getPIELevel complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or is preemptible. However, the following is
supported and the IR needs to handle such cases.

int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Diff Detail

Event Timeline

MaskRay created this revision.Aug 1 2021, 11:00 PM
MaskRay requested review of this revision.Aug 1 2021, 11:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2021, 11:00 PM

This seems awfully confusing, and almost completely undocumented in LangRef.

If I'm understanding correctly, when you have an alias that refers to a global, the alias doesn't refer to a global symbol. It instead refers to the content of the global, as emitted into the object file. So resolving an alias that refers to a symbol that has weak linkage, or default visibility, can have weird effects: the linker might end up resolving the symbol differently.

So I guess I have two questions:

  1. If we have a definition that isn't weak or interposable, but has default linkage, would it make sense to teach the backend to use an alias to refer to the global? So we don't have to worry about it going through the PLT or whatever. Maybe this would explode somehow?
  2. Can we forbid aliases that refer to globals that are interposable, or use weak linkage? In theory, this prohibition doesn't cause any loss of generality in IR: we can turn a weak definition into an alias pointing to an internal definition, which should be essentially equivalent.

I'm not too concerned about high-quality optimizations for aliases. I'm mostly just concerned that it isn't clear what optimizations are legal.

aeubanks added a reviewer: pcc.Aug 2 2021, 9:32 AM

+1, having all of this actually documented would be great
I'm still very confused about alias semantics

ormris removed a subscriber: ormris.Jan 24 2022, 11:49 AM
MaskRay updated this revision to Diff 404298.Jan 29 2022, 12:35 PM
MaskRay retitled this revision from [GlobalOpt] Don't optimize out non-preemptible alias if aliasee may be preemptible to [GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible.
MaskRay edited the summary of this revision. (Show Details)

Add a C example this patch will fix.

Update LangRef.

MaskRay updated this revision to Diff 404299.Jan 29 2022, 12:37 PM

remove a stale comment.

rnk added a comment.Jan 31 2022, 1:03 PM

How does the new code handle the weak linkages? Can strong symbol definitions from native object files prevail over weak alias definitions in a partial LTO situation?

MaskRay updated this revision to Diff 404715.Jan 31 2022, 1:36 PM

Check both !GlobalValue::isInterposableLinkage(GV.getLinkage()) and dso_local.

(Spot by rnk): The previous diff might incorrectly optimize a WeakAny alias, which was wrong since the alias might be replaced with another in the same linked unit.

Do we need to fix other places? I tried grepping for GlobalAlias, and at least BasicAliasAnalysis.cpp and MemoryBuiltins.cpp and StackSafetyAnalysis.cpp contain similar isInterposable() checks.

  1. Can we forbid aliases that refer to globals that are interposable, or use weak linkage? In theory, this prohibition doesn't cause any loss of generality in IR: we can turn a weak definition into an alias pointing to an internal definition, which should be essentially equivalent.

Any comment on this?

Do we need to fix other places? I tried grepping for GlobalAlias, and at least BasicAliasAnalysis.cpp and MemoryBuiltins.cpp and StackSafetyAnalysis.cpp contain similar isInterposable() checks.

Maybe. Perhaps cross that bridge when we come to it.

  1. Can we forbid aliases that refer to globals that are interposable, or use weak linkage? In theory, this prohibition doesn't cause any loss of generality in IR: we can turn a weak definition into an alias pointing to an internal definition, which should be essentially equivalent.

Any comment on this?

We cannot reject such aliases. GCC supports:

int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));

Perhaps LLVM's instrumentations are better of not creating such aliases, but the IR needs to be handled.

rnk accepted this revision.Feb 1 2022, 9:27 AM

lgtm

llvm/lib/Transforms/IPO/GlobalOpt.cpp
2479

nit: this captures nothing, right? You can drop the &.

This revision is now accepted and ready to land.Feb 1 2022, 9:27 AM
MaskRay updated this revision to Diff 404991.Feb 1 2022, 10:24 AM
MaskRay marked an inline comment as done.

fic comment and remove unneeded & capture

MaskRay edited the summary of this revision. (Show Details)Feb 1 2022, 10:28 AM
This revision was landed with ongoing or failed builds.Feb 1 2022, 10:41 AM
This revision was automatically updated to reflect the committed changes.

Hmm I'm seeing functions being visited a bunch of times during the simplification pipeline in the big module I'm looking at.

Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 3:26 PM
MaskRay added a subscriber: zatrazz.EditedMar 12 2022, 4:20 PM

@zatrazz https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/clang the bootstrap issue you saw can be fixed by applying this patch (but it was reverted for a timeout issue).

In elf/rtld.c,

// Without the patch, Clang may convert _rtld_local access to the unexpected _rtld_global access, leading to GOT indirection
extern struct rtld_global _rtld_local
    __attribute__ ((alias ("_rtld_global"), visibility ("hidden")));

In bootstrap_map.l_addr = elf_machine_load_address ();,
using Clang without this patch, the statement compiles to use R_X86_64_REX_GOTPCRELX _rtld_global-0x4, which may load a zero address (GOT relocations haven't been applied yet)

With this patch or with GCC, the PC-relative R_X86_64_PC32 _rtld_local-0x4 is used, which works.

Sorry, still trying to figure out why this causes compile times to blow up in some cases. In the case I was looking at it went from ~150s to ~5700s.

the module I'm looking at has a huge RefSCC (>20000 SCCs)

this change is somehow causing us to hit the case where we optimize the RefSCC, cause a function in one of the contained SCCs to now be in a child SCC (perhaps because a call/reference to it was eliminated in another function), bail out from the current RefSCC to go and optimize the new child RefSCC/SCC (since the pipeline is bottom-up), then revisit the original now slightly smaller RefSCC, and repeat
commenting out https://github.com/llvm/llvm-project/blob/74cf8575f74ab6b0d9641e0e3acf3e9328a5e365/llvm/lib/Analysis/CGSCCPassManager.cpp#L233 (and the corresponding assert below) fixes the issue, although I'm not sure that's the proper fix

I looked at the RefSCC visit order without this patch and it actually does something similar, but nowhere near as bad as with this patch. I still haven't figured out which pass is causing the call graph to move around, investigating that

D121953 should fix the compile time issues, I'll reland this after that goes in

aeubanks reopened this revision.Mar 17 2022, 2:55 PM
This revision is now accepted and ready to land.Mar 17 2022, 2:55 PM