Page MenuHomePhabricator

[RuntimeDyld] Implemented relocation of TLS symbols in ELF
ClosedPublic

Authored by MoritzS on Jul 6 2021, 2:10 AM.

Details

Summary

The tests use a GOTPC32 relocation which is implemented in D95512. So, they
will fail for now.

Diff Detail

Unit TestsFailed

TimeTest
30 msx64 debian > LLVM.ExecutionEngine/RuntimeDyld/X86::TLS.s
Script: -- : 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/test/ExecutionEngine/RuntimeDyld/X86/Output/TLS.s.tmp && mkdir -p /var/lib/buildkite-agent/builds/llvm-project/build/test/ExecutionEngine/RuntimeDyld/X86/Output/TLS.s.tmp
120 msx64 windows > LLVM.ExecutionEngine/RuntimeDyld/X86::TLS.s
Script: -- : 'RUN: at line 1'; rm -rf C:\ws\w2\llvm-project\premerge-checks\build\test\ExecutionEngine\RuntimeDyld\X86\Output\TLS.s.tmp && mkdir -p C:\ws\w2\llvm-project\premerge-checks\build\test\ExecutionEngine\RuntimeDyld\X86\Output\TLS.s.tmp

Event Timeline

MoritzS created this revision.Jul 6 2021, 2:10 AM
MoritzS requested review of this revision.Jul 6 2021, 2:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2021, 2:10 AM

Hi Moritz,

I'm very sorry that I did not see this earlier -- this is really interesting stuff!

Have you looked at JITLink for x86-64 yet? I think it might be better suited to this. Between JITLink and the just-landed (e256445bfff1) ELFNixPlatform code I think that you could get full cross-process (and even cross-architecture) TLV working, similar to what we have on Darwin / MachO.

We can still discuss this going in to RuntimeDyld in the short term if you particularly need / want it, but the aim is to deprecate and eventually remove RuntimeDyld, so implementing this in JITLink would be the strongly preferred approach if possible.

Thanks for looking at this!

I'm aware that RuntimeDyld will eventually be replaced by JITLink and I agree that this should definitely be implemented in JITLink eventually.
Since I'm using this (and my other patch for GNU IFUNCs D105465) in a research project where we currently make heavy use of RuntimeDyld/MCJIT, I would like to get this into main, even if it will be deprecated soon.
However, I think JITLink in general has a much better design and I will try to implement the patches for JITLink as well, soon.

What was the previous behavior when we encountered these relocations?

My only concern here is avoiding regressing existing users if we turn a previously unsupported (but silently accepted) relocation into a potential link-time error. It's perverse, but this kind of improvement to functionality and error checking has caused problems for clients before, and the main aim for MCJIT / RuntimeDyld right now is stability.

If the new support could cause a problem then I think the best solution would be to add a 'bool supportsTLV' method to the memory manager that we can query before running any of the TLV code.

llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
1861

This function is pretty big already. Could the TLV code be split out into separate functions for readability?

However, I think JITLink in general has a much better design and I will try to implement the patches for JITLink as well, soon.

That would be very cool!

What was the previous behavior when we encountered these relocations?

My only concern here is avoiding regressing existing users if we turn a previously unsupported (but silently accepted) relocation into a potential link-time error. It's perverse, but this kind of improvement to functionality and error checking has caused problems for clients before, and the main aim for MCJIT / RuntimeDyld right now is stability.

If the new support could cause a problem then I think the best solution would be to add a 'bool supportsTLV' method to the memory manager that we can query before running any of the TLV code.

All relocations that are not implemented will eventually lead to report_fatal_error("Relocation type not implemented yet!"); in resolveX86_64Relocation(). Whereas with my changes it will reach report_fatal_error("allocation of TLS not implemented"); in MemoryManager::allocateTLSSection(). So in both cases the execution crashes, it just prints a different error message now.

MoritzS added inline comments.Aug 27 2021, 6:34 AM
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
1861

That's a good idea. Maybe this will also make it easier to port this to JITLink in the future.

MoritzS updated this revision to Diff 369109.Aug 27 2021, 8:51 AM

Refactored most of the relocation logic into separate functions.

I added the changes from D95512 so that the tests won't fail.

MoritzS marked an inline comment as done.Aug 27 2021, 8:53 AM
MoritzS updated this revision to Diff 369244.Aug 28 2021, 3:39 AM

Require x86_64-linux for the TLS test as the execution in llvm-rtdyld is
ifdef'd only for x86_64 and ELF.

MoritzS updated this revision to Diff 369380.Aug 29 2021, 11:51 PM

Added NOLINT comments to avoid clang-tidy warning for the functions called that
have X86_64 in their name.

lhames accepted this revision.Sep 5 2021, 3:23 PM

What was the previous behavior when we encountered these relocations?

My only concern here is avoiding regressing existing users if we turn a previously unsupported (but silently accepted) relocation into a potential link-time error. It's perverse, but this kind of improvement to functionality and error checking has caused problems for clients before, and the main aim for MCJIT / RuntimeDyld right now is stability.

If the new support could cause a problem then I think the best solution would be to add a 'bool supportsTLV' method to the memory manager that we can query before running any of the TLV code.

All relocations that are not implemented will eventually lead to report_fatal_error("Relocation type not implemented yet!"); in resolveX86_64Relocation(). Whereas with my changes it will reach report_fatal_error("allocation of TLS not implemented"); in MemoryManager::allocateTLSSection(). So in both cases the execution crashes, it just prints a different error message now.

Ok. I've just realized that there's also a danger on the flip-side too: code containing TLVs will be accepted, but this only support accessing them on one thread, which will be surprising to some clients. I think it would be worth adding an "enablePartialTLVSupport" method to the memory manager (which should default to false), and gating this new feature on that. If you're happy to do that before the code lands then that's great, otherwise I'm happy to do it in-tree after it lands.

Thank you again for all your work on this Moritz -- it's really appreciated!

This revision is now accepted and ready to land.Sep 5 2021, 3:23 PM

Side note: @StephenFan just posted a review for initial JITLink / ELFNixPlatform TLV support in https://reviews.llvm.org/D109293, if you're interested in getting involved in that.

This revision was automatically updated to reflect the committed changes.

Ok. I've just realized that there's also a danger on the flip-side too: code containing TLVs will be accepted, but this only support accessing them on one thread, which will be surprising to some clients. I think it would be worth adding an "enablePartialTLVSupport" method to the memory manager (which should default to false), and gating this new feature on that. If you're happy to do that before the code lands then that's great, otherwise I'm happy to do it in-tree after it lands.

I implemented the single-thread only handling of TLS only in llvm-rtdyld.cpp. The default implementation in the memory manager is to always crash with the fatal error. So this should only be a problem if someone uses llvm-rtdyld -execute with an object file that has TLS relocations which previously crashed but now silently succeeds but will lead to incorrect execution when multiple threads are used.

brad added a subscriber: brad.Nov 3 2021, 5:47 PM

This commit appears to have broken the build on OpenBSD..

[23/210] Linking CXX executable bin/llvm-rtdyld
FAILED: bin/llvm-rtdyld
: && /usr/bin/c++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -g -Wl,--color-diagnostics tools/llvm-rtdyld/CMakeFiles/llvm-rtdyld.dir/llvm-rtdyld.cpp.o -o bin/llvm-rtdyld -L/usr/lib -Wl,-z,origin,-rpath,"\$ORIGIN/../lib"  lib/libLLVMAArch64Disassembler.a  lib/libLLVMAMDGPUDisassembler.a  lib/libLLVMARMDisassembler.a  lib/libLLVMAVRDisassembler.a  lib/libLLVMBPFDisassembler.a  lib/libLLVMHexagonDisassembler.a  lib/libLLVMLanaiDisassembler.a  lib/libLLVMMipsDisassembler.a  lib/libLLVMMSP430Disassembler.a  lib/libLLVMPowerPCDisassembler.a  lib/libLLVMRISCVDisassembler.a  lib/libLLVMSparcDisassembler.a  lib/libLLVMSystemZDisassembler.a  lib/libLLVMWebAssemblyDisassembler.a  lib/libLLVMX86Disassembler.a  lib/libLLVMXCoreDisassembler.a  lib/libLLVMAArch64Desc.a  lib/libLLVMAMDGPUDesc.a  lib/libLLVMARMDesc.a  lib/libLLVMAVRDesc.a  lib/libLLVMBPFDesc.a  lib/libLLVMHexagonDesc.a  lib/libLLVMLanaiDesc.a  lib/libLLVMMipsDesc.a  lib/libLLVMMSP430Desc.a  lib/libLLVMNVPTXDesc.a  lib/libLLVMPowerPCDesc.a  lib/libLLVMRISCVDesc.a  lib/libLLVMSparcDesc.a  lib/libLLVMSystemZDesc.a  lib/libLLVMWebAssemblyDesc.a  lib/libLLVMX86Desc.a  lib/libLLVMXCoreDesc.a  lib/libLLVMAArch64Info.a  lib/libLLVMAMDGPUInfo.a  lib/libLLVMARMInfo.a  lib/libLLVMAVRInfo.a  lib/libLLVMBPFInfo.a  lib/libLLVMHexagonInfo.a  lib/libLLVMLanaiInfo.a  lib/libLLVMMipsInfo.a  lib/libLLVMMSP430Info.a  lib/libLLVMNVPTXInfo.a  lib/libLLVMPowerPCInfo.a  lib/libLLVMRISCVInfo.a  lib/libLLVMSparcInfo.a  lib/libLLVMSystemZInfo.a  lib/libLLVMWebAssemblyInfo.a  lib/libLLVMX86Info.a  lib/libLLVMXCoreInfo.a  lib/libLLVMDebugInfoDWARF.a  lib/libLLVMExecutionEngine.a  lib/libLLVMMC.a  lib/libLLVMObject.a  lib/libLLVMRuntimeDyld.a  lib/libLLVMSupport.a  -lpthread  lib/libLLVMAArch64Utils.a  lib/libLLVMAMDGPUUtils.a  lib/libLLVMARMUtils.a  lib/libLLVMWebAssemblyUtils.a  lib/libLLVMCodeGen.a  lib/libLLVMBitWriter.a  lib/libLLVMScalarOpts.a  lib/libLLVMAggressiveInstCombine.a  lib/libLLVMInstCombine.a  lib/libLLVMTransformUtils.a  lib/libLLVMMCDisassembler.a  lib/libLLVMOrcTargetProcess.a  lib/libLLVMOrcShared.a  lib/libLLVMTarget.a  lib/libLLVMAnalysis.a  lib/libLLVMObject.a  lib/libLLVMBitReader.a  lib/libLLVMMCParser.a  lib/libLLVMMC.a  lib/libLLVMDebugInfoCodeView.a  lib/libLLVMTextAPI.a  lib/libLLVMProfileData.a  lib/libLLVMCore.a  lib/libLLVMBinaryFormat.a  lib/libLLVMRemarks.a  lib/libLLVMBitstreamReader.a  lib/libLLVMSupport.a  -lexecinfo  -lpthread  -lm  /usr/lib/libz.so.6.0  -lcurses  lib/libLLVMDemangle.a  -Wl,-rpath-link,/usr/X11R6/lib:/usr/local/lib && :
ld: error: undefined symbol: LLVMRTDyldTLSSpace
>>> referenced by llvm-rtdyld.cpp:372 (/home/brad/llvm/llvm-new/llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp:372)
>>>               tools/llvm-rtdyld/CMakeFiles/llvm-rtdyld.dir/llvm-rtdyld.cpp.o:(TrivialMemoryManager::allocateTLSSection(unsigned long, unsigned int, unsigned int, llvm::StringRef))
c++: error: linker command failed with exit code 1 (use -v to see invocation)

This commit appears to have broken the build on OpenBSD..

ld: error: undefined symbol: LLVMRTDyldTLSSpace

referenced by llvm-rtdyld.cpp:372 (/home/brad/llvm/llvm-new/llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp:372)

This symbol is only used when defined(__x86_64__) && defined(__ELF__) is true. The definition looks like this:

extern "C" {
alignas(16) __attribute__((visibility("hidden"), tls_model("initial-exec"),
                           used)) thread_local char LLVMRTDyldTLSSpace[16];
}

And the symbol is then directly referenced from inline assembly like this:

asm("leaq LLVMRTDyldTLSSpace@tpoff, %0" : "=r"(TLSOffset));

Does thread-local storage in ELF work differently on OpenBSD than it does on Linux? Is there an easy way to fix it or should I try to ifdef this out for OpenBSD?