Page MenuHomePhabricator

[ELF] - Optimize .eh_frame section: remove CIE if all FDEs referencing it were removed.
ClosedPublic

Authored by grimar on Dec 16 2015, 5:16 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar updated this revision to Diff 42991.Dec 16 2015, 5:16 AM
grimar retitled this revision from to [ELF] - Optimize .eh_frame section: remove CIE if all FDEs referencing it were removed..
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: llvm-commits, grimar.
emaste added a subscriber: emaste.Dec 16 2015, 6:41 AM
rafael edited edge metadata.Dec 23 2015, 10:23 AM

Lgtm assuming there is no big time regression when linking a real world
application with --gc-sections (clang is a good candidate).

Cheers,
Rafael

Lgtm assuming there is no big time regression when linking a real world
application with --gc-sections (clang is a good candidate).

Cheers,
Rafael

I am using QT creator + ninja/cmake. Was unable to find how to output link time in milliseconds :( Its always less than a second for me when linking clang. Aslo if I link not only clang than anyways since I can see only seconds, test is not accurate.
Do you know how can I calculate link time properly ?

Do anybody know how to get link time in milliseconds for performing link speed test ?

This revision was automatically updated to reflect the committed changes.
grimar added a comment.EditedDec 30 2015, 3:44 AM

Lgtm assuming there is no big time regression when linking a real world
application with --gc-sections (clang is a good candidate).

Cheers,
Rafael

Well, I finally was able to estimate link time correctly I think. I used perf stat --repeat 50 for that
I ran it twice for original code and for patched one, linked clang with --gc-sections.
For original it was: 0.310183332 seconds and 0.306395272 seconds, after applying patch it become 0.298682542 seconds and 0.300793454 seconds. So either patch accelerated link time a bit or what is more looks like a truth for me: it just has no any visible negative affect on speed.

davide added a subscriber: davide.Jan 1 2016, 11:41 AM

Did you end up getting the numbers? I haven't seen them.

To be more clear, I think what you posted is a little bit strange. It's at least counter intuitive that doing more work makes linking faster, so I'd rather think that's just noise.
Also, how did you get those numbers? Which OS, preconditions (cache, disk) etc.. ? How many trials did you run? Your comment says twice, do you mean you run each experiment two times, or you run once with patch and once without (in any case I'll suggest to increase the number of trials)? What's variance, average, stddev ?

I'm not necessarily asking you to revert this until we get more detailed analysis, but I'll appreciate if you can gather informations and make them public.
In the history of LLD (the old one) some changes were made "in the name of performance" without really understanding the effects. I'm not exactly looking forward to see this again.

Thanks!

grimar added a comment.Jan 2 2016, 5:28 AM

To be more clear, I think what you posted is a little bit strange. It's at least counter intuitive that doing more work makes linking faster, so I'd rather think that's just noise.
Also, how did you get those numbers? Which OS, preconditions (cache, disk) etc.. ? How many trials did you run? Your comment says twice, do you mean you run each experiment two times, or you run once with patch and once without (in any case I'll suggest to increase the number of trials)? What's variance, average, stddev ?

Actually I dont understand why so many attention is about this patch since the only thing its do is delays extending this->Header.sh_size until moment of any FDE appear. It does not do more work if we dont count one added comparsion as "work" of course. That actually less than any common patch usually do.
So I also think results is it just a noise and I see no anything strange here. Thats why I wrote "what is more looks like a truth for me: it just has no any visible negative affect on speed".

As I said in comment I used "perf stat --repeat 50", so I ran it 2x50 times for code without patch and 2x50 times for code with one.
First run, w/o patch, 50 repeats had 0.310183332 average link time.
Second run, w/o patch, 50 repeats had 0.306395272 average link time.
Third run, with patch, 50 repeats had 0.298682542 average link time.
Fourth run, with patch, 50 repeats had 0.300793454 average link time.

I am using 64 bit ubuntu 15.04 VM installed on SSD drive, CPU is i7-4970K@4.00

Command line was (I took it from ninja build script):
perf stat --repeat 50 /home/umb/LLVM/build/bin/clang++ -std=c++11 -fuse-ld=lld -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wnon-virtual-dtor -Wno-comment -std=c++11 -fcolor-diagnostics -fno-common -Woverloaded-virtual -O3 -DNDEBUG -Wl,-allow-shlib-undefined -Wl,--export-dynamic -Wl,-O3, -Wl,--gc-sections tools/clang/tools/driver/CMakeFiles/clang.dir/driver.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1as_main.cpp.o -o bin/clang-3.8 lib/libLLVMAArch64CodeGen.a lib/libLLVMAArch64AsmPrinter.a lib/libLLVMAArch64AsmParser.a lib/libLLVMAArch64Desc.a lib/libLLVMAArch64Info.a lib/libLLVMAArch64Disassembler.a lib/libLLVMAMDGPUCodeGen.a lib/libLLVMAMDGPUAsmPrinter.a lib/libLLVMAMDGPUAsmParser.a lib/libLLVMAMDGPUDesc.a lib/libLLVMAMDGPUInfo.a lib/libLLVMARMCodeGen.a lib/libLLVMARMAsmPrinter.a lib/libLLVMARMAsmParser.a lib/libLLVMARMDesc.a lib/libLLVMARMInfo.a lib/libLLVMARMDisassembler.a lib/libLLVMBPFCodeGen.a lib/libLLVMBPFAsmPrinter.a lib/libLLVMBPFDesc.a lib/libLLVMBPFInfo.a lib/libLLVMCppBackendCodeGen.a lib/libLLVMCppBackendInfo.a lib/libLLVMHexagonCodeGen.a lib/libLLVMHexagonAsmParser.a lib/libLLVMHexagonDesc.a lib/libLLVMHexagonInfo.a lib/libLLVMHexagonDisassembler.a lib/libLLVMMipsCodeGen.a lib/libLLVMMipsAsmPrinter.a lib/libLLVMMipsAsmParser.a lib/libLLVMMipsDesc.a lib/libLLVMMipsInfo.a lib/libLLVMMipsDisassembler.a lib/libLLVMMSP430CodeGen.a lib/libLLVMMSP430AsmPrinter.a lib/libLLVMMSP430Desc.a lib/libLLVMMSP430Info.a lib/libLLVMNVPTXCodeGen.a lib/libLLVMNVPTXAsmPrinter.a lib/libLLVMNVPTXDesc.a lib/libLLVMNVPTXInfo.a lib/libLLVMPowerPCCodeGen.a lib/libLLVMPowerPCAsmPrinter.a lib/libLLVMPowerPCAsmParser.a lib/libLLVMPowerPCDesc.a lib/libLLVMPowerPCInfo.a lib/libLLVMPowerPCDisassembler.a lib/libLLVMSparcCodeGen.a lib/libLLVMSparcAsmPrinter.a lib/libLLVMSparcAsmParser.a lib/libLLVMSparcDesc.a lib/libLLVMSparcInfo.a lib/libLLVMSparcDisassembler.a lib/libLLVMSystemZCodeGen.a lib/libLLVMSystemZAsmPrinter.a lib/libLLVMSystemZAsmParser.a lib/libLLVMSystemZDesc.a lib/libLLVMSystemZInfo.a lib/libLLVMSystemZDisassembler.a lib/libLLVMX86CodeGen.a lib/libLLVMX86AsmPrinter.a lib/libLLVMX86AsmParser.a lib/libLLVMX86Desc.a lib/libLLVMX86Info.a lib/libLLVMX86Disassembler.a lib/libLLVMXCoreCodeGen.a lib/libLLVMXCoreAsmPrinter.a lib/libLLVMXCoreDesc.a lib/libLLVMXCoreInfo.a lib/libLLVMXCoreDisassembler.a lib/libLLVMAnalysis.a lib/libLLVMCodeGen.a lib/libLLVMCore.a lib/libLLVMipo.a lib/libLLVMInstCombine.a lib/libLLVMInstrumentation.a lib/libLLVMMC.a lib/libLLVMMCParser.a lib/libLLVMObjCARCOpts.a lib/libLLVMOption.a lib/libLLVMScalarOpts.a lib/libLLVMSupport.a lib/libLLVMTransformUtils.a lib/libLLVMVectorize.a lib/libclangBasic.a lib/libclangCodeGen.a lib/libclangDriver.a lib/libclangFrontend.a lib/libclangFrontendTool.a lib/libLLVMAArch64Desc.a lib/libLLVMAArch64AsmPrinter.a lib/libLLVMAArch64Info.a lib/libLLVMAArch64Utils.a lib/libLLVMAMDGPUAsmPrinter.a lib/libLLVMAMDGPUUtils.a lib/libLLVMARMDesc.a lib/libLLVMARMAsmPrinter.a lib/libLLVMARMInfo.a lib/libLLVMBPFAsmPrinter.a lib/libLLVMHexagonDesc.a lib/libLLVMHexagonInfo.a lib/libLLVMMipsAsmPrinter.a lib/libLLVMMipsInfo.a lib/libLLVMMSP430AsmPrinter.a lib/libLLVMNVPTXAsmPrinter.a lib/libLLVMPowerPCAsmPrinter.a lib/libLLVMPowerPCInfo.a lib/libLLVMSparcAsmPrinter.a lib/libLLVMSparcInfo.a lib/libLLVMSystemZDesc.a lib/libLLVMSystemZAsmPrinter.a lib/libLLVMSystemZInfo.a lib/libLLVMX86CodeGen.a lib/libLLVMX86Desc.a lib/libLLVMX86AsmPrinter.a lib/libLLVMX86Utils.a lib/libLLVMX86Info.a lib/libLLVMXCoreAsmPrinter.a lib/libLLVMAsmPrinter.a lib/libLLVMSelectionDAG.a lib/libLLVMCodeGen.a lib/libLLVMXCoreInfo.a lib/libLLVMMCDisassembler.a lib/libclangCodeGen.a lib/libLLVMipo.a lib/libLLVMVectorize.a lib/libLLVMInstrumentation.a lib/libLLVMObjCARCOpts.a lib/libLLVMScalarOpts.a lib/libLLVMInstCombine.a lib/libLLVMTarget.a lib/libLLVMBitWriter.a lib/libLLVMIRReader.a lib/libLLVMAsmParser.a lib/libLLVMLinker.a lib/libLLVMTransformUtils.a lib/libLLVMAnalysis.a lib/libLLVMProfileData.a lib/libLLVMObject.a lib/libclangRewriteFrontend.a lib/libclangARCMigrate.a lib/libclangStaticAnalyzerFrontend.a lib/libclangFrontend.a lib/libclangDriver.a lib/libLLVMOption.a lib/libclangParse.a lib/libLLVMMCParser.a lib/libclangSerialization.a lib/libLLVMBitReader.a lib/libclangSema.a lib/libclangEdit.a lib/libclangStaticAnalyzerCheckers.a lib/libclangStaticAnalyzerCore.a lib/libclangAnalysis.a lib/libclangAST.a lib/libclangRewrite.a lib/libclangLex.a lib/libclangBasic.a lib/libLLVMCore.a lib/libLLVMMC.a lib/libLLVMSupport.a -lrt -ldl -ltinfo -lpthread -lz -lm -Wl,-rpath,"\$ORIGIN/../lib"

I'm not necessarily asking you to revert this until we get more detailed analysis, but I'll appreciate if you can gather informations and make them public.
In the history of LLD (the old one) some changes were made "in the name of performance" without really understanding the effects. I'm not exactly looking forward to see this again.

Thanks!

grimar added a comment.Jan 2 2016, 5:59 AM

To be more clear, when I say "noise" I mean estimation error. Speed boost/slowdown cant be estimated because of timing error noise because with or wothout this patch results are equal.

grimar added a comment.Jan 2 2016, 9:02 AM

I had to revert it in r256693. Unfortunately I lost changes in void EHOutputSection<ELFT>::writeTo() when resolved conflicts during commit. Since test didn`t catch that, I think it also need to be improved. I`ll improve the test and recommit later.
That lost change should not affect on perfomance anyhow though.

This patch does not particularly seem that it has a negative impact on
performance. It is reasonable to see only noises when comparing
performances before and after this patch. I believe Rafael left the note
for "just in case".

This patch does not particularly seem that it has a negative impact on
performance. It is reasonable to see only noises when comparing
performances before and after this patch. I believe Rafael left the note
for "just in case".

Thats probably true. And I still have plan to improve the test before re-commiting this.