This is an archive of the discontinued LLVM Phabricator instance.

Ensure that prefix data is preserved with subsections-via-symbols
ClosedPublic

Authored by angerman on Mar 8 2017, 7:36 PM.

Details

Summary

On MachO platforms that use subsections-via-symbols dead code stripping will
drop prefix data. Unfortunately there is no great way to convey the relationship
between a function and its prefix data to the linker. We are forced to use a bit
of a hack: we give the prefix data it’s own symbol, and mark the actual function
entry an .alt_entry.

Event Timeline

angerman created this revision.Mar 8 2017, 7:36 PM
pcc edited edge metadata.Mar 8 2017, 7:39 PM

Test case?

In D30770#696262, @pcc wrote:

Test case?

I guess I need some guidance here, in putting the testfile together properly:

Using the following dont-dead_strip-prefix-data.ll

define i32 @main() prefix i32 123 {
 %main = bitcast i32 ()* @main to i32*
 %prefix_ptr = getelementptr inbounds i32, i32* %main, i32 -1
 %prefix_val = load i32, i32* %prefix_ptr
 %1 = icmp eq i32 %prefix_val, 123
 %2 = xor i1 %1, true
 %3 = zext i1 %2 to i32
 ret i32 %3
}
$ clang dont-dead_strip-prefix-data.ll -o dont-dead_strip-prefix-data -dead_strip && ./dont-dead_strip-prefix-data

will exit with 1 without the patch, and 0 with the patch.

pcc added a comment.Mar 8 2017, 10:12 PM

I would modify the test test/CodeGen/X86/prefixdata.ll to also test that the backend emits the correct directives on Darwin. See test/CodeGen/X86/alias-gep.ll for an example of how to test two target triples in one file.

angerman updated this revision to Diff 91135.Mar 8 2017, 11:33 PM
  • Adds tests
In D30770#696322, @pcc wrote:

I would modify the test test/CodeGen/X86/prefixdata.ll to also test that the backend emits the correct directives on Darwin. See test/CodeGen/X86/alias-gep.ll for an example of how to test two target triples in one file.

Thanks for the pointes. I've modified the x86 test accordingly, and added a test for aarch64 as well.

@pcc Thanks for all your help so far! What else do I need to make to get this in?

pcc added inline comments.Mar 10 2017, 10:39 AM
lib/CodeGen/AsmPrinter/AsmPrinter.cpp
655

I think this will need to call OutContext.createLinkerPrivateTempSymbol() because the symbols that createTempSymbol() creates will not be emitted to the object file.

test/CodeGen/AArch64/prefixdata.ll
30 ↗(On Diff #91135)

Add newline, same with your other test

angerman updated this revision to Diff 91450.Mar 10 2017, 9:06 PM
  • adds review suggestions
angerman marked 2 inline comments as done.Mar 10 2017, 9:08 PM

Thank you for taking the time to review this @pcc! Also the suggestions. The updated diff should reflect the implementation of those.

pcc accepted this revision.Mar 13 2017, 4:37 PM

LGTM

Do you have commit access?

This revision is now accepted and ready to land.Mar 13 2017, 4:37 PM
In D30770#700029, @pcc wrote:

LGTM

Do you have commit access?

Thanks @pcc. No I do not have commit access. I would either need to get commit access or have someone land this on my behalf.

This revision was automatically updated to reflect the committed changes.
pcc added a comment.Mar 14 2017, 9:30 PM

I committed this for you in r297804.

In D30770#701356, @pcc wrote:

I committed this for you in r297804.

Thank you!

angerman reopened this revision.Jul 28 2017, 5:50 AM

So I've been playing with LLVM5 and GHC, and it seems like this solution is insufficient :-/ I've boiled this down to the following:

let lib.ll

define void @f() prefix i32 123 {
 ret void
}

and dont-dead_strip-prefix-data.ll be

declare void @f();

define i32 @main() {
 %f = bitcast void ()* @f to i32*
 %prefix_ptr = getelementptr inbounds i32, i32* %f, i32 -1
 %prefix_val = load i32, i32* %prefix_ptr
 %1 = icmp eq i32 %prefix_val, 123
 %2 = xor i1 %1, true
 %3 = zext i1 %2 to i32
 ret i32 %3
}

This is almost identical to the example from before, only now we split this over two objects.

clang -c lib.ll -o lib.o
clang dont-dead_strip-prefix-data.ll lib.o -o dont-dead_strip-prefix-data -dead_stip && ./dont-dead_strip-prefix-data

Thus I believe we need a temporary global symbol instead.

This revision is now accepted and ready to land.Jul 28 2017, 5:50 AM

The following diff, seems to at least resolve this:

diff --git a/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index ff427c9a0d7..0aca2478cc5 100644
--- a/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -655,25 +655,6 @@ void AsmPrinter::EmitFunctionHeader() {
     OutStreamer->GetCommentOS() << '\n';
   }

-  // Emit the prefix data.
-  if (F->hasPrefixData()) {
-    if (MAI->hasSubsectionsViaSymbols()) {
-      // Preserving prefix data on platforms which use subsections-via-symbols
-      // is a bit tricky. Here we introduce a symbol for the prefix data
-      // and use the .alt_entry attribute to mark the function's real entry point
-      // as an alternative entry point to the prefix-data symbol.
-      MCSymbol *PrefixSym = OutContext.createLinkerPrivateTempSymbol();
-      OutStreamer->EmitLabel(PrefixSym);
-
-      EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
-
-      // Emit an .alt_entry directive for the actual function symbol.
-      OutStreamer->EmitSymbolAttribute(CurrentFnSym, MCSA_AltEntry);
-    } else {
-      EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
-    }
-  }
-
   // Emit the CurrentFnSym.  This is a virtual function to allow targets to
   // do their wild and crazy things as required.
   EmitFunctionEntryLabel();
@@ -725,6 +706,30 @@ void AsmPrinter::EmitFunctionEntryLabel() {
     report_fatal_error("'" + Twine(CurrentFnSym->getName()) +
                        "' label emitted multiple times to assembly file");

+  // Emit the prefix data.
+  const Function *F = MF->getFunction();
+  if (F->hasPrefixData()) {
+    if (MAI->hasSubsectionsViaSymbols()) {
+      // Preserving prefix data on platforms which use subsections-via-symbols
+      // is a bit tricky. Here we introduce a symbol for the prefix data
+      // and use the .alt_entry attribute to mark the function's real entry point
+      // as an alternative entry point to the prefix-data symbol.
+
+      MCSymbol *PrefixSym = OutContext.getOrCreateSymbol(
+          Twine(CurrentFnSym->getName()) + "$prefix");
+      PrefixSym->setPrivateExtern(true);
+
+      OutStreamer->EmitLabel(PrefixSym);
+
+      EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
+
+      // Emit an .alt_entry directive for the actual function symbol.
+      OutStreamer->EmitSymbolAttribute(CurrentFnSym, MCSA_AltEntry);
+    } else {
+      EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
+    }
+  }
+
   return OutStreamer->EmitLabel(CurrentFnSym);
 }

There are however a few items I'm a bit unhappy about:

  • the prefix data is moved into the EmitFunctionEntryLabel from the EmitFunctionHeader function; this feels a little off.
  • this however ensures that CurrentFnSym->redefineIfPossible(); is called before CurrentFnSym->getName() is used.
  • a $prefix is appended, which might if one decides to define @f with prefix data and @f$prefix clash.
  • The symbol is now set to private extern, and I'm not sure if this is the sweet spot we are looking for.

@pcc if you could spare some feedback on this, that would be great. I'm truly sorry for having missed this case initially.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 7 2019, 6:24 AM