This is an archive of the discontinued LLVM Phabricator instance.

Add alias canonicalization pass when preparing for ThinLTO
Needs ReviewPublic

Authored by tejohnson on Feb 9 2017, 12:30 PM.

Details

Summary

This patch implements an alias canonicalization pass to convert aliases
to the form discussed in https://llvm.org/bugs/show_bug.cgi?id=27866.
This changes aliases and aliasees into aliases with a private anonymous
value.

It was also discussed on the mailing list (see "Weak symbol/alias
semantics" thread here:

http://lists.llvm.org/pipermail/llvm-dev/2017-January/109038.html),

as it simplifies needed fixes for ThinLTO weak symbol/alias handling.

Therefore, to start with I have only enabled the canonicalization pass
when preparing for ThinLTO. After this, I plan to add a bitcode upgrade
mechanism and use it for ThinLTO, and then enable the pass and upgrade
mechanism in other cases. Eventually IR producers can be changed, but
that is a longer term solution.

Event Timeline

tejohnson created this revision.Feb 9 2017, 12:30 PM
pcc added a reviewer: pcc.Feb 9 2017, 12:37 PM
mehdi_amini added inline comments.Feb 13 2017, 6:22 PM
lib/Transforms/Utils/CanonicalizeAliases.cpp
35

Why materialized_uses?

36

That deserves a comment. I'd even desugarize the loop with an assert that there is only one use as "documentation" of the invariant you're relying on here. (alternatively, or at the same time, add a comment)

99

Instead of reimplementing a custom RAUW (the duplication isn't great here), would it be possible to perform the regular one, and then reattach the aliases?

110

New PM version of the pass?

tejohnson added inline comments.Feb 15 2017, 1:32 PM
lib/Transforms/Utils/CanonicalizeAliases.cpp
35

IIRC this was required when I was doing via a bitcode upgrade, because uses() asserts that the module is materialized and I believe it isn't on some paths (maybe when we are importing and the whole module isn't materialized?). It isn't strictly necessary with this patch, but will probably need to go back in when I add a follow on bitcode upgrade interface. I will remove it for now though.

36

Will do

99

Hmm, maybe. Will take a stab at it.

110

Good idea! Will add

I should also add that I have been investigating how hard it would be to do this during GlobalAlias::create. It turns out to be much more complicated, unfortunately (a few notes below). It seems like it would be more straightforward to add a pass as done here and then a bitcode upgrade path for now, and eventually teach producers such as clang to do the right thing.

BTW I will be on vacation for a week+ so might not have time to do follow-up work here until I am back Feb 27.

Complications with canonicalization during GlobalAlias::create:

  • When we canonicalize an alias being created, where the aliasee is another alias that has already been canonicalized (since we are converting each alias as we read it in from the LLVM assembly) - means having to repeatedly look through uses of the already-canonicalized alias to find the one in the alias being canonicalized in this call, so that we can rename it to the anonymous private.
  • To do the above, need to inspect the aliasee's GV for each alias being created - but there isn't already a way to find the aliasee's GV while stripping both pointer casts and inbound indices, but not follow aliases (i.e. normally getBaseObject() will look through aliases).

I have a fix for the first one, but it seems inefficient to have to re-scan uses of the aliasee that was a converted alias (not sure how common that is though). The second one I don't have a complete fix for. It probably needs an additional facility on stripPointerCastsAndOffsets.

espindola edited reviewers, added: espindola; removed: rafael.Mar 15 2018, 10:08 AM
pcc added a comment.Dec 12 2018, 1:35 PM

For canonicalization this seems like a better approach. It walks the alias operands so you don't hit the issue with alias vs non-alias users. I wrote it as a function inside LTO but it could easily be moved into a pass.

diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index 926c419e34a..bbca8f5b6e7 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -439,8 +439,49 @@ Error lto::backend(Config &C, AddStreamFn AddStream,
   return finalizeOptimizationRemarks(std::move(DiagnosticOutputFile));
 }
 
-static void dropDeadSymbols(Module &Mod, const GVSummaryMapTy &DefinedGlobals,
+static Constant *resolveAliases(Constant *C) {
+  if (auto *GA = dyn_cast<GlobalAlias>(C))
+    return resolveAliases(GA->getAliasee());
+
+  if (auto *GO = dyn_cast<GlobalObject>(C)) {
+    if (!GO->hasLocalLinkage()) {
+      // Replace GO with an alias that points to it.
+      auto *GA = GlobalAlias::create("", GO);
+      GA->setLinkage(GO->getLinkage());
+      GA->copyAttributesFrom(GO);
+      GA->takeName(GO);
+      GO->replaceAllUsesWith(GA);
+      GA->setAliasee(GO);
+      GO->setLinkage(GlobalValue::PrivateLinkage);
+
+      // This will end up giving a suffixed name to GO.
+      GO->setName(GA->getName());
+    }
+    return GO;
+  }
+
+  auto *CE = dyn_cast<ConstantExpr>(C);
+  if (!CE)
+    return C;
+
+  std::vector<Constant *> Ops;
+  for (Use &U : CE->operands())
+    Ops.push_back(resolveAliases(cast<Constant>(U)));
+  return CE->getWithOperands(Ops);
+}
+
+static void dropDeadSymbols(Module &Mod, const Triple &TT,
+                            const GVSummaryMapTy &DefinedGlobals,
                             const ModuleSummaryIndex &Index) {
+  for (GlobalAlias &GA : Mod.aliases()) {
+    // Don't handle COFF weak externals, since unlike regular aliases they
+    // operate on the symbol level.
+    if (GA.hasWeakLinkage() && TT.isOSBinFormatCOFF())
+      continue;
+
+    GA.setAliasee(resolveAliases(GA.getAliasee()));
+  }
+
   std::vector<GlobalValue*> DeadGVs;
   for (auto &GV : Mod.global_values())
     if (GlobalValueSummary *GVS = DefinedGlobals.lookup(GV.getGUID()))
@@ -488,7 +529,7 @@ Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream,
 
   renameModuleForThinLTO(Mod, CombinedIndex);
 
-  dropDeadSymbols(Mod, DefinedGlobals, CombinedIndex);
+  dropDeadSymbols(Mod, TM->getTargetTriple(), DefinedGlobals, CombinedIndex);
 
   thinLTOResolvePrevailingInModule(Mod, DefinedGlobals);
lib/Transforms/Utils/CanonicalizeAliases.cpp
58

I don't think this is entirely correct. A bitcast can have both alias and non-alias uses, so you would need to create different ConstantExprs for each. It actually seems simpler to visit the alias's operands than the use list.