On MachO platforms that use subsections-via-symbols dead code stripping will
drop prefix data. Unfortunately there is no great way to convey the relationship
between a function and its prefix data to the linker. We are forced to use a bit
of a hack: we give the prefix data it’s own symbol, and mark the actual function
entry an .alt_entry.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I guess I need some guidance here, in putting the testfile together properly:
Using the following dont-dead_strip-prefix-data.ll
define i32 @main() prefix i32 123 { %main = bitcast i32 ()* @main to i32* %prefix_ptr = getelementptr inbounds i32, i32* %main, i32 -1 %prefix_val = load i32, i32* %prefix_ptr %1 = icmp eq i32 %prefix_val, 123 %2 = xor i1 %1, true %3 = zext i1 %2 to i32 ret i32 %3 }
$ clang dont-dead_strip-prefix-data.ll -o dont-dead_strip-prefix-data -dead_strip && ./dont-dead_strip-prefix-data
will exit with 1 without the patch, and 0 with the patch.
I would modify the test test/CodeGen/X86/prefixdata.ll to also test that the backend emits the correct directives on Darwin. See test/CodeGen/X86/alias-gep.ll for an example of how to test two target triples in one file.
Thanks for the pointes. I've modified the x86 test accordingly, and added a test for aarch64 as well.
lib/CodeGen/AsmPrinter/AsmPrinter.cpp | ||
---|---|---|
655 ↗ | (On Diff #91135) | I think this will need to call OutContext.createLinkerPrivateTempSymbol() because the symbols that createTempSymbol() creates will not be emitted to the object file. |
test/CodeGen/AArch64/prefixdata.ll | ||
30 ↗ | (On Diff #91135) | Add newline, same with your other test |
Thank you for taking the time to review this @pcc! Also the suggestions. The updated diff should reflect the implementation of those.
Thanks @pcc. No I do not have commit access. I would either need to get commit access or have someone land this on my behalf.
So I've been playing with LLVM5 and GHC, and it seems like this solution is insufficient :-/ I've boiled this down to the following:
let lib.ll
define void @f() prefix i32 123 { ret void }
and dont-dead_strip-prefix-data.ll be
declare void @f(); define i32 @main() { %f = bitcast void ()* @f to i32* %prefix_ptr = getelementptr inbounds i32, i32* %f, i32 -1 %prefix_val = load i32, i32* %prefix_ptr %1 = icmp eq i32 %prefix_val, 123 %2 = xor i1 %1, true %3 = zext i1 %2 to i32 ret i32 %3 }
This is almost identical to the example from before, only now we split this over two objects.
clang -c lib.ll -o lib.o clang dont-dead_strip-prefix-data.ll lib.o -o dont-dead_strip-prefix-data -dead_stip && ./dont-dead_strip-prefix-data
Thus I believe we need a temporary global symbol instead.
The following diff, seems to at least resolve this:
diff --git a/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/lib/CodeGen/AsmPrinter/AsmPrinter.cpp index ff427c9a0d7..0aca2478cc5 100644 --- a/lib/CodeGen/AsmPrinter/AsmPrinter.cpp +++ b/lib/CodeGen/AsmPrinter/AsmPrinter.cpp @@ -655,25 +655,6 @@ void AsmPrinter::EmitFunctionHeader() { OutStreamer->GetCommentOS() << '\n'; } - // Emit the prefix data. - if (F->hasPrefixData()) { - if (MAI->hasSubsectionsViaSymbols()) { - // Preserving prefix data on platforms which use subsections-via-symbols - // is a bit tricky. Here we introduce a symbol for the prefix data - // and use the .alt_entry attribute to mark the function's real entry point - // as an alternative entry point to the prefix-data symbol. - MCSymbol *PrefixSym = OutContext.createLinkerPrivateTempSymbol(); - OutStreamer->EmitLabel(PrefixSym); - - EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData()); - - // Emit an .alt_entry directive for the actual function symbol. - OutStreamer->EmitSymbolAttribute(CurrentFnSym, MCSA_AltEntry); - } else { - EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData()); - } - } - // Emit the CurrentFnSym. This is a virtual function to allow targets to // do their wild and crazy things as required. EmitFunctionEntryLabel(); @@ -725,6 +706,30 @@ void AsmPrinter::EmitFunctionEntryLabel() { report_fatal_error("'" + Twine(CurrentFnSym->getName()) + "' label emitted multiple times to assembly file"); + // Emit the prefix data. + const Function *F = MF->getFunction(); + if (F->hasPrefixData()) { + if (MAI->hasSubsectionsViaSymbols()) { + // Preserving prefix data on platforms which use subsections-via-symbols + // is a bit tricky. Here we introduce a symbol for the prefix data + // and use the .alt_entry attribute to mark the function's real entry point + // as an alternative entry point to the prefix-data symbol. + + MCSymbol *PrefixSym = OutContext.getOrCreateSymbol( + Twine(CurrentFnSym->getName()) + "$prefix"); + PrefixSym->setPrivateExtern(true); + + OutStreamer->EmitLabel(PrefixSym); + + EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData()); + + // Emit an .alt_entry directive for the actual function symbol. + OutStreamer->EmitSymbolAttribute(CurrentFnSym, MCSA_AltEntry); + } else { + EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData()); + } + } + return OutStreamer->EmitLabel(CurrentFnSym); }
There are however a few items I'm a bit unhappy about:
- the prefix data is moved into the EmitFunctionEntryLabel from the EmitFunctionHeader function; this feels a little off.
- this however ensures that CurrentFnSym->redefineIfPossible(); is called before CurrentFnSym->getName() is used.
- a $prefix is appended, which might if one decides to define @f with prefix data and @f$prefix clash.
- The symbol is now set to private extern, and I'm not sure if this is the sweet spot we are looking for.
@pcc if you could spare some feedback on this, that would be great. I'm truly sorry for having missed this case initially.