This patch performs little optimization for eh_frame section.
If all FDE`s that referenced CIE are removed then CIE is also removed from output.
That can happen for example when dropping FDEs that point to dropped sections. Testcase showing that is included.
The same optimization was added to ld about 14 years ago: https://sourceware.org/ml/binutils/2001-12/msg00144.html, gold does not do that it seems.
Details
- Reviewers
ruiu • rafael - Commits
- rG147747ab0bbd: Revert or r256638. I`ve lost a little piece of code when resolved conflicts…
rG57286644f7c2: [ELF] - Optimize .eh_frame section: remove CIE if all FDEs referencing it were…
rLLD256693: Revert or r256638. I`ve lost a little piece of code when resolved conflicts…
rLLD256638: [ELF] - Optimize .eh_frame section: remove CIE if all FDEs referencing it were…
rL256693: Revert or r256638. I`ve lost a little piece of code when resolved conflicts…
rL256638: [ELF] - Optimize .eh_frame section: remove CIE if all FDEs referencing it…
Diff Detail
- Repository
- rL LLVM
Event Timeline
Lgtm assuming there is no big time regression when linking a real world
application with --gc-sections (clang is a good candidate).
Cheers,
Rafael
I am using QT creator + ninja/cmake. Was unable to find how to output link time in milliseconds :( Its always less than a second for me when linking clang. Aslo if I link not only clang than anyways since I can see only seconds, test is not accurate.
Do you know how can I calculate link time properly ?
Do anybody know how to get link time in milliseconds for performing link speed test ?
Well, I finally was able to estimate link time correctly I think. I used perf stat --repeat 50 for that
I ran it twice for original code and for patched one, linked clang with --gc-sections.
For original it was: 0.310183332 seconds and 0.306395272 seconds, after applying patch it become 0.298682542 seconds and 0.300793454 seconds. So either patch accelerated link time a bit or what is more looks like a truth for me: it just has no any visible negative affect on speed.
To be more clear, I think what you posted is a little bit strange. It's at least counter intuitive that doing more work makes linking faster, so I'd rather think that's just noise.
Also, how did you get those numbers? Which OS, preconditions (cache, disk) etc.. ? How many trials did you run? Your comment says twice, do you mean you run each experiment two times, or you run once with patch and once without (in any case I'll suggest to increase the number of trials)? What's variance, average, stddev ?
I'm not necessarily asking you to revert this until we get more detailed analysis, but I'll appreciate if you can gather informations and make them public.
In the history of LLD (the old one) some changes were made "in the name of performance" without really understanding the effects. I'm not exactly looking forward to see this again.
Thanks!
Actually I dont understand why so many attention is about this patch since the only thing its do is delays extending this->Header.sh_size until moment of any FDE appear. It does not do more work if we dont count one added comparsion as "work" of course. That actually less than any common patch usually do.
So I also think results is it just a noise and I see no anything strange here. Thats why I wrote "what is more looks like a truth for me: it just has no any visible negative affect on speed".
As I said in comment I used "perf stat --repeat 50", so I ran it 2x50 times for code without patch and 2x50 times for code with one.
First run, w/o patch, 50 repeats had 0.310183332 average link time.
Second run, w/o patch, 50 repeats had 0.306395272 average link time.
Third run, with patch, 50 repeats had 0.298682542 average link time.
Fourth run, with patch, 50 repeats had 0.300793454 average link time.
I am using 64 bit ubuntu 15.04 VM installed on SSD drive, CPU is i7-4970K@4.00
Command line was (I took it from ninja build script):
perf stat --repeat 50 /home/umb/LLVM/build/bin/clang++ -std=c++11 -fuse-ld=lld -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wnon-virtual-dtor -Wno-comment -std=c++11 -fcolor-diagnostics -fno-common -Woverloaded-virtual -O3 -DNDEBUG -Wl,-allow-shlib-undefined -Wl,--export-dynamic -Wl,-O3, -Wl,--gc-sections tools/clang/tools/driver/CMakeFiles/clang.dir/driver.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1as_main.cpp.o -o bin/clang-3.8 lib/libLLVMAArch64CodeGen.a lib/libLLVMAArch64AsmPrinter.a lib/libLLVMAArch64AsmParser.a lib/libLLVMAArch64Desc.a lib/libLLVMAArch64Info.a lib/libLLVMAArch64Disassembler.a lib/libLLVMAMDGPUCodeGen.a lib/libLLVMAMDGPUAsmPrinter.a lib/libLLVMAMDGPUAsmParser.a lib/libLLVMAMDGPUDesc.a lib/libLLVMAMDGPUInfo.a lib/libLLVMARMCodeGen.a lib/libLLVMARMAsmPrinter.a lib/libLLVMARMAsmParser.a lib/libLLVMARMDesc.a lib/libLLVMARMInfo.a lib/libLLVMARMDisassembler.a lib/libLLVMBPFCodeGen.a lib/libLLVMBPFAsmPrinter.a lib/libLLVMBPFDesc.a lib/libLLVMBPFInfo.a lib/libLLVMCppBackendCodeGen.a lib/libLLVMCppBackendInfo.a lib/libLLVMHexagonCodeGen.a lib/libLLVMHexagonAsmParser.a lib/libLLVMHexagonDesc.a lib/libLLVMHexagonInfo.a lib/libLLVMHexagonDisassembler.a lib/libLLVMMipsCodeGen.a lib/libLLVMMipsAsmPrinter.a lib/libLLVMMipsAsmParser.a lib/libLLVMMipsDesc.a lib/libLLVMMipsInfo.a lib/libLLVMMipsDisassembler.a lib/libLLVMMSP430CodeGen.a lib/libLLVMMSP430AsmPrinter.a lib/libLLVMMSP430Desc.a lib/libLLVMMSP430Info.a lib/libLLVMNVPTXCodeGen.a lib/libLLVMNVPTXAsmPrinter.a lib/libLLVMNVPTXDesc.a lib/libLLVMNVPTXInfo.a lib/libLLVMPowerPCCodeGen.a lib/libLLVMPowerPCAsmPrinter.a lib/libLLVMPowerPCAsmParser.a lib/libLLVMPowerPCDesc.a lib/libLLVMPowerPCInfo.a lib/libLLVMPowerPCDisassembler.a lib/libLLVMSparcCodeGen.a lib/libLLVMSparcAsmPrinter.a lib/libLLVMSparcAsmParser.a lib/libLLVMSparcDesc.a lib/libLLVMSparcInfo.a lib/libLLVMSparcDisassembler.a lib/libLLVMSystemZCodeGen.a lib/libLLVMSystemZAsmPrinter.a lib/libLLVMSystemZAsmParser.a lib/libLLVMSystemZDesc.a lib/libLLVMSystemZInfo.a lib/libLLVMSystemZDisassembler.a lib/libLLVMX86CodeGen.a lib/libLLVMX86AsmPrinter.a lib/libLLVMX86AsmParser.a lib/libLLVMX86Desc.a lib/libLLVMX86Info.a lib/libLLVMX86Disassembler.a lib/libLLVMXCoreCodeGen.a lib/libLLVMXCoreAsmPrinter.a lib/libLLVMXCoreDesc.a lib/libLLVMXCoreInfo.a lib/libLLVMXCoreDisassembler.a lib/libLLVMAnalysis.a lib/libLLVMCodeGen.a lib/libLLVMCore.a lib/libLLVMipo.a lib/libLLVMInstCombine.a lib/libLLVMInstrumentation.a lib/libLLVMMC.a lib/libLLVMMCParser.a lib/libLLVMObjCARCOpts.a lib/libLLVMOption.a lib/libLLVMScalarOpts.a lib/libLLVMSupport.a lib/libLLVMTransformUtils.a lib/libLLVMVectorize.a lib/libclangBasic.a lib/libclangCodeGen.a lib/libclangDriver.a lib/libclangFrontend.a lib/libclangFrontendTool.a lib/libLLVMAArch64Desc.a lib/libLLVMAArch64AsmPrinter.a lib/libLLVMAArch64Info.a lib/libLLVMAArch64Utils.a lib/libLLVMAMDGPUAsmPrinter.a lib/libLLVMAMDGPUUtils.a lib/libLLVMARMDesc.a lib/libLLVMARMAsmPrinter.a lib/libLLVMARMInfo.a lib/libLLVMBPFAsmPrinter.a lib/libLLVMHexagonDesc.a lib/libLLVMHexagonInfo.a lib/libLLVMMipsAsmPrinter.a lib/libLLVMMipsInfo.a lib/libLLVMMSP430AsmPrinter.a lib/libLLVMNVPTXAsmPrinter.a lib/libLLVMPowerPCAsmPrinter.a lib/libLLVMPowerPCInfo.a lib/libLLVMSparcAsmPrinter.a lib/libLLVMSparcInfo.a lib/libLLVMSystemZDesc.a lib/libLLVMSystemZAsmPrinter.a lib/libLLVMSystemZInfo.a lib/libLLVMX86CodeGen.a lib/libLLVMX86Desc.a lib/libLLVMX86AsmPrinter.a lib/libLLVMX86Utils.a lib/libLLVMX86Info.a lib/libLLVMXCoreAsmPrinter.a lib/libLLVMAsmPrinter.a lib/libLLVMSelectionDAG.a lib/libLLVMCodeGen.a lib/libLLVMXCoreInfo.a lib/libLLVMMCDisassembler.a lib/libclangCodeGen.a lib/libLLVMipo.a lib/libLLVMVectorize.a lib/libLLVMInstrumentation.a lib/libLLVMObjCARCOpts.a lib/libLLVMScalarOpts.a lib/libLLVMInstCombine.a lib/libLLVMTarget.a lib/libLLVMBitWriter.a lib/libLLVMIRReader.a lib/libLLVMAsmParser.a lib/libLLVMLinker.a lib/libLLVMTransformUtils.a lib/libLLVMAnalysis.a lib/libLLVMProfileData.a lib/libLLVMObject.a lib/libclangRewriteFrontend.a lib/libclangARCMigrate.a lib/libclangStaticAnalyzerFrontend.a lib/libclangFrontend.a lib/libclangDriver.a lib/libLLVMOption.a lib/libclangParse.a lib/libLLVMMCParser.a lib/libclangSerialization.a lib/libLLVMBitReader.a lib/libclangSema.a lib/libclangEdit.a lib/libclangStaticAnalyzerCheckers.a lib/libclangStaticAnalyzerCore.a lib/libclangAnalysis.a lib/libclangAST.a lib/libclangRewrite.a lib/libclangLex.a lib/libclangBasic.a lib/libLLVMCore.a lib/libLLVMMC.a lib/libLLVMSupport.a -lrt -ldl -ltinfo -lpthread -lz -lm -Wl,-rpath,"\$ORIGIN/../lib"
I'm not necessarily asking you to revert this until we get more detailed analysis, but I'll appreciate if you can gather informations and make them public.
In the history of LLD (the old one) some changes were made "in the name of performance" without really understanding the effects. I'm not exactly looking forward to see this again.Thanks!
To be more clear, when I say "noise" I mean estimation error. Speed boost/slowdown cant be estimated because of timing error noise because with or wothout this patch results are equal.
I had to revert it in r256693. Unfortunately I lost changes in void EHOutputSection<ELFT>::writeTo() when resolved conflicts during commit. Since test didn`t catch that, I think it also need to be improved. I`ll improve the test and recommit later.
That lost change should not affect on perfomance anyhow though.
This patch does not particularly seem that it has a negative impact on
performance. It is reasonable to see only noises when comparing
performances before and after this patch. I believe Rafael left the note
for "just in case".
Thats probably true. And I still have plan to improve the test before re-commiting this.