This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Set DF_STATIC_TLS flag for i386 target.
ClosedPublic

Authored by grimar on Apr 21 2017, 8:38 AM.

Details

Summary

This is PR32437.

DF_STATIC_TLS
If set in a shared object or executable, this flag instructs the dynamic linker to reject attempts to load this file dynamically. It indicates that the shared object or executable contains code using a static thread-local storage scheme. Implementations need not support any form of thread-local storage.

Patch checks if IE/LE relocations were used to check if code uses static model. If so it sets the DF_STATIC_TLS flag.
Patch contains testcases for i386 target for now.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Apr 21 2017, 8:38 AM
ruiu added inline comments.Apr 24 2017, 10:59 AM
ELF/Relocations.cpp
207 ↗(On Diff #96149)

I don't want to bury this code deep inside the relocation processing like this. Can you create a different for loop that visits each input section's Relocations vector elements to see if there's a TLS relocation?

208 ↗(On Diff #96149)

No other members of Target is writable, so you want to treat a Target as an immutable object.

silvas added a subscriber: silvas.Apr 24 2017, 11:51 AM
silvas added inline comments.
ELF/Relocations.cpp
207 ↗(On Diff #96149)

Need to measure performance if it is a separate for loop though (like for any change adding O(number of relocations) work). Putting it here does a very small amount of extra work (since it guarded by e.g. Body.isTLS()), though it is less clean.

grimar updated this revision to Diff 96776.Apr 26 2017, 10:05 AM

Review comments addressed. Perfomance difference seems varies.
I observed from no difference to about 2%. Each run has 25 iterations.

Chromium:
Before patch:

3024,298420      task-clock (msec)         #    1,000 CPUs utilized            ( +-  0,23% )
          1      context-switches          #    0,000 K/sec                    ( +- 11,47% )
          1      cpu-migrations            #    0,000 K/sec                    ( +- 16,67% )
    168 021      page-faults               #    0,056 M/sec                    ( +-  0,15% )

3,024534666 seconds time elapsed                                          ( +-  0,23% )

After patch:

3028,392936      task-clock (msec)         #    1,000 CPUs utilized            ( +-  0,17% )
          1      context-switches          #    0,000 K/sec                    ( +- 11,47% )
          1      cpu-migrations            #    0,000 K/sec                    ( +- 11,47% )
    167 843      page-faults               #    0,055 M/sec                    ( +-  0,14% )

3,028637012 seconds time elapsed                                          ( +-  0,17% )

3,028637012 / 3,024534666 == 1.001x

Clang:
Before patch:

 444,407019      task-clock (msec)         #    0,998 CPUs utilized            ( +-  0,23% )
          0      context-switches          #    0,000 K/sec                    ( +- 46,77% )
          0      cpu-migrations            #    0,000 K/sec                    ( +- 55,28% )
     54 471      page-faults               #    0,123 M/sec                    ( +-  0,08% )

0,445156259 seconds time elapsed                                          ( +-  0,26% )

After patch:

 447,810554      task-clock (msec)         #    0,999 CPUs utilized            ( +-  0,24% )
          0      context-switches          #    0,000 K/sec                    ( +- 40,82% )
          0      cpu-migrations            #    0,000 K/sec                    ( +- 69,22% )
     54 492      page-faults               #    0,122 M/sec                    ( +-  0,04% )
0,448175370 seconds time elapsed                                          ( +-  0,25% )

0,448175370 / 0,445156259 == 1.003x

ScyllaDB:
Before patch:

2301,014053      task-clock (msec)         #    0,999 CPUs utilized            ( +-  0,24% )
          1      context-switches          #    0,000 K/sec                    ( +- 21,08% )
          1      cpu-migrations            #    0,000 K/sec                    ( +- 18,09% )
    288 479      page-faults               #    0,125 M/sec                    ( +-  0,06% )
2,302234392 seconds time elapsed                                          ( +-  0,24% )

After patch:

2347,685157      task-clock (msec)         #    1,000 CPUs utilized            ( +-  0,23% )
          1      context-switches          #    0,000 K/sec                    ( +- 17,46% )
          1      cpu-migrations            #    0,000 K/sec                    ( +- 18,09% )
    288 911      page-faults               #    0,123 M/sec                    ( +-  0,04% )
2,348484871 seconds time elapsed                                          ( +-  0,23% )

2,348484871 / 2,302234392 == 1.02x

Not sure how much that is acceptable. Should we use approach from previous diff which never
affected on perfomance ?

ruiu edited edge metadata.Apr 26 2017, 10:09 AM

0.3% is negligible but 2% isn't. Why this patch impact only ScyllaDB?

In D32354#738306, @ruiu wrote:

0.3% is negligible but 2% isn't. Why this patch impact only ScyllaDB?

I dont know. I can debug it to check this, if you think it worth to do.

ruiu added a comment.Apr 26 2017, 10:15 AM

I think it's worth to do. I'm actually curious how that can happen, if the observation is correct.

I also can suggest as an option not to burry logic inside tls handling method like first diff do, but also do not introduce a new loop.
And do all things somewhere at the begining of scanRelocs:

template <class ELFT, class RelTy>
static void scanRelocs(InputSectionBase &Sec, ArrayRef<RelTy> Rels) {
  OffsetGetter GetOffset(Sec);
  for (auto I = Rels.begin(), End = Rels.end(); I != End; ++I) {
    const RelTy &Rel = *I;

    <DO LOGIC HERE>

That will avoid second loop and should be effective.

In D32354#738320, @ruiu wrote:

I think it's worth to do. I'm actually curious how that can happen, if the observation is correct.

I checked few times, I believe them are. I'll debug it then.

ruiu added a comment.Apr 26 2017, 10:27 AM

The other way of doing this is to add code to relocateOne, to record whether we handle TLS IE/LD relocations or not. This is probably the least expensive way as it doesn't involve any virtual function calls.

grimar added a comment.May 2 2017, 6:53 AM
In D32354#738320, @ruiu wrote:

I think it's worth to do. I'm actually curious how that can happen, if the observation is correct.

I really don't know why that happened, but today I was unable to reproduce the difference when linked ScyllaDB observed earlier.
Moreover after debugging I think there should be no reasons for that.
Scylla has 661610 relocations in total for me, and after 19248 HasStaticTlsModel flag is set, what stops scanning for TLS model.
In compare, clang has 1541964 relocations and flag is never set.
So scan of relocations for clang should take much more time, though that does not happen, it does not affect on a link time for me.

Results from 100 runs:

linking clang, before patch:
Performance counter stats for './../lld -flavor gnu --hash-style=gnu --no-add-needed --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o t crt1.o crti.o crtbegin.o --strip-all -allow-shlib-undefined --export-dynamic -O3 --gc-sections tools/clang/tools/driver/CMakeFiles/clang.dir/driver.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1as_main.cpp.o lib/libLLVMAArch64CodeGen.a lib/libLLVMAArch64AsmPrinter.a lib/libLLVMAArch64AsmParser.a lib/libLLVMAArch64Desc.a lib/libLLVMAArch64Info.a lib/libLLVMAArch64Disassembler.a lib/libLLVMAMDGPUCodeGen.a lib/libLLVMAMDGPUAsmPrinter.a lib/libLLVMAMDGPUAsmParser.a lib/libLLVMAMDGPUDesc.a lib/libLLVMAMDGPUInfo.a lib/libLLVMAMDGPUDisassembler.a lib/libLLVMARMCodeGen.a lib/libLLVMARMAsmPrinter.a lib/libLLVMARMAsmParser.a lib/libLLVMARMDesc.a lib/libLLVMARMInfo.a lib/libLLVMARMDisassembler.a lib/libLLVMBPFCodeGen.a lib/libLLVMBPFAsmPrinter.a lib/libLLVMBPFDesc.a lib/libLLVMBPFInfo.a lib/libLLVMCppBackendCodeGen.a lib/libLLVMCppBackendInfo.a lib/libLLVMHexagonCodeGen.a lib/libLLVMHexagonAsmParser.a lib/libLLVMHexagonDesc.a lib/libLLVMHexagonInfo.a lib/libLLVMHexagonDisassembler.a lib/libLLVMMipsCodeGen.a lib/libLLVMMipsAsmPrinter.a lib/libLLVMMipsAsmParser.a lib/libLLVMMipsDesc.a lib/libLLVMMipsInfo.a lib/libLLVMMipsDisassembler.a lib/libLLVMMSP430CodeGen.a lib/libLLVMMSP430AsmPrinter.a lib/libLLVMMSP430Desc.a lib/libLLVMMSP430Info.a lib/libLLVMNVPTXCodeGen.a lib/libLLVMNVPTXAsmPrinter.a lib/libLLVMNVPTXDesc.a lib/libLLVMNVPTXInfo.a lib/libLLVMPowerPCCodeGen.a lib/libLLVMPowerPCAsmPrinter.a lib/libLLVMPowerPCAsmParser.a lib/libLLVMPowerPCDesc.a lib/libLLVMPowerPCInfo.a lib/libLLVMPowerPCDisassembler.a lib/libLLVMSparcCodeGen.a lib/libLLVMSparcAsmPrinter.a lib/libLLVMSparcAsmParser.a lib/libLLVMSparcDesc.a lib/libLLVMSparcInfo.a lib/libLLVMSparcDisassembler.a lib/libLLVMSystemZCodeGen.a lib/libLLVMSystemZAsmPrinter.a lib/libLLVMSystemZAsmParser.a lib/libLLVMSystemZDesc.a lib/libLLVMSystemZInfo.a lib/libLLVMSystemZDisassembler.a lib/libLLVMX86CodeGen.a lib/libLLVMX86AsmPrinter.a lib/libLLVMX86AsmParser.a lib/libLLVMX86Desc.a lib/libLLVMX86Info.a lib/libLLVMX86Disassembler.a lib/libLLVMXCoreCodeGen.a lib/libLLVMXCoreAsmPrinter.a lib/libLLVMXCoreDesc.a lib/libLLVMXCoreInfo.a lib/libLLVMXCoreDisassembler.a lib/libLLVMAnalysis.a lib/libLLVMCodeGen.a lib/libLLVMCore.a lib/libLLVMipo.a lib/libLLVMInstCombine.a lib/libLLVMInstrumentation.a lib/libLLVMMC.a lib/libLLVMMCParser.a lib/libLLVMObjCARCOpts.a lib/libLLVMOption.a lib/libLLVMScalarOpts.a lib/libLLVMSupport.a lib/libLLVMTransformUtils.a lib/libLLVMVectorize.a lib/libclangBasic.a lib/libclangCodeGen.a lib/libclangDriver.a lib/libclangFrontend.a lib/libclangFrontendTool.a lib/libLLVMGlobalISel.a lib/libLLVMAArch64Desc.a lib/libLLVMAArch64AsmPrinter.a lib/libLLVMAArch64Info.a lib/libLLVMAArch64Utils.a lib/libLLVMAMDGPUDesc.a lib/libLLVMAMDGPUAsmPrinter.a lib/libLLVMAMDGPUInfo.a lib/libLLVMAMDGPUUtils.a lib/libLLVMARMDesc.a lib/libLLVMARMAsmPrinter.a lib/libLLVMARMInfo.a lib/libLLVMBPFAsmPrinter.a lib/libLLVMHexagonDesc.a lib/libLLVMHexagonInfo.a lib/libLLVMMipsAsmPrinter.a lib/libLLVMMipsInfo.a lib/libLLVMMSP430AsmPrinter.a lib/libLLVMNVPTXAsmPrinter.a lib/libLLVMPowerPCAsmPrinter.a lib/libLLVMPowerPCInfo.a lib/libLLVMSparcAsmPrinter.a lib/libLLVMSparcInfo.a lib/libLLVMSystemZDesc.a lib/libLLVMSystemZAsmPrinter.a lib/libLLVMSystemZInfo.a lib/libLLVMX86AsmPrinter.a lib/libLLVMX86Utils.a lib/libLLVMX86Info.a lib/libLLVMXCoreAsmPrinter.a lib/libLLVMAsmPrinter.a lib/libLLVMDebugInfoCodeView.a lib/libLLVMSelectionDAG.a lib/libLLVMCodeGen.a lib/libLLVMXCoreInfo.a lib/libLLVMMCDisassembler.a lib/libclangCodeGen.a lib/libLLVMipo.a lib/libLLVMVectorize.a lib/libLLVMInstrumentation.a lib/libLLVMObjCARCOpts.a lib/libLLVMScalarOpts.a lib/libLLVMInstCombine.a lib/libLLVMTarget.a lib/libLLVMBitWriter.a lib/libLLVMIRReader.a lib/libLLVMAsmParser.a lib/libLLVMLinker.a lib/libLLVMTransformUtils.a lib/libLLVMAnalysis.a lib/libclangRewriteFrontend.a lib/libclangARCMigrate.a lib/libclangStaticAnalyzerFrontend.a lib/libclangFrontend.a lib/libclangDriver.a lib/libLLVMOption.a lib/libLLVMProfileData.a lib/libLLVMObject.a lib/libclangParse.a lib/libLLVMMCParser.a lib/libclangSerialization.a lib/libLLVMBitReader.a lib/libclangSema.a lib/libclangEdit.a lib/libclangStaticAnalyzerCheckers.a lib/libclangStaticAnalyzerCore.a lib/libclangAnalysis.a lib/libclangAST.a lib/libclangRewrite.a lib/libclangLex.a lib/libclangBasic.a lib/libLLVMCore.a lib/libLLVMMC.a lib/libLLVMSupport.a librt.so libdl.so libtinfo.so libpthread.so.0 libz.so libm.so.6 -rpath $ORIGIN/../lib libstdc++.so libgcc_s.so libc.so.6 libc_nonshared.a --as-needed ld-linux-x86-64.so.2 --no-as-needed libgcc_s.so crtend.o crtn.o' (100 runs):

     467,482313      task-clock (msec)         #    0,999 CPUs utilized            ( +-  0,14% )
            134      context-switches          #    0,286 K/sec                    ( +-  0,19% )
              0      cpu-migrations            #    0,000 K/sec                  
         55 282      page-faults               #    0,118 M/sec                    ( +-  0,05% )
<not supported>      cycles                   
<not supported>      stalled-cycles-frontend  
<not supported>      stalled-cycles-backend   
<not supported>      instructions             
<not supported>      branches                 
<not supported>      branch-misses            

    0,467917575 seconds time elapsed                                          ( +-  0,14% )

linking clang after patch:

     470,285320      task-clock (msec)         #    0,999 CPUs utilized            ( +-  0,14% )
            134      context-switches          #    0,284 K/sec                    ( +-  0,19% )
              0      cpu-migrations            #    0,000 K/sec                  
         55 225      page-faults               #    0,117 M/sec                    ( +-  0,06% )
<not supported>      cycles                   
<not supported>      stalled-cycles-frontend  
<not supported>      stalled-cycles-backend   
<not supported>      instructions             
<not supported>      branches                 
<not supported>      branch-misses            

    0,470744480 seconds time elapsed                                          ( +-  0,14% )

Difference is still minimal here: 0,470744480 / 0,467917575 == 1.00604

linking scylladb before patch:
Performance counter stats for './../lld -flavor gnu -build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o t crt1.o crti.o crtbegin.o --whole-archive seastar/build/release/libseastar.a --no-whole-archive --no-as-needed libaio.so libboost_program_options.so libboost_system.so libboost_filesystem.so libstdc++.so libboost_unit_test_framework.so libboost_thread.so.1.58.0 libcryptopp.so librt.so libgnutls.so libgnutlsxx.so libhwloc.so libnuma.so libpciaccess.so libxml2.so libz.so build/release/main.o build/release/database.o build/release/schema.o build/release/frozen_schema.o build/release/schema_registry.o build/release/bytes.o build/release/mutation.o build/release/row_cache.o build/release/canonical_mutation.o build/release/frozen_mutation.o build/release/memtable.o build/release/schema_mutations.o build/release/release.o build/release/utils/logalloc.o build/release/utils/large_bitset.o build/release/mutation_partition.o build/release/mutation_partition_view.o build/release/mutation_partition_serializer.o build/release/mutation_reader.o build/release/mutation_query.o build/release/key_reader.o build/release/keys.o build/release/sstables/sstables.o build/release/sstables/compress.o build/release/sstables/row.o build/release/sstables/key.o build/release/sstables/partition.o build/release/sstables/filter.o build/release/sstables/compaction.o build/release/sstables/compaction_manager.o build/release/log.o build/release/transport/event.o build/release/transport/event_notifier.o build/release/transport/server.o build/release/cql3/abstract_marker.o build/release/cql3/attributes.o build/release/cql3/cf_name.o build/release/cql3/cql3_type.o build/release/cql3/operation.o build/release/cql3/index_name.o build/release/cql3/keyspace_element_name.o build/release/cql3/lists.o build/release/cql3/sets.o build/release/cql3/maps.o build/release/cql3/functions/functions.o build/release/cql3/statements/cf_prop_defs.o build/release/cql3/statements/cf_statement.o build/release/cql3/statements/create_keyspace_statement.o build/release/cql3/statements/create_table_statement.o build/release/cql3/statements/drop_keyspace_statement.o build/release/cql3/statements/drop_table_statement.o build/release/cql3/statements/schema_altering_statement.o build/release/cql3/statements/ks_prop_defs.o build/release/cql3/statements/modification_statement.o build/release/cql3/statements/parsed_statement.o build/release/cql3/statements/property_definitions.o build/release/cql3/statements/update_statement.o build/release/cql3/statements/delete_statement.o build/release/cql3/statements/batch_statement.o build/release/cql3/statements/select_statement.o build/release/cql3/statements/use_statement.o build/release/cql3/statements/index_prop_defs.o build/release/cql3/statements/index_target.o build/release/cql3/statements/create_index_statement.o build/release/cql3/statements/truncate_statement.o build/release/cql3/statements/alter_table_statement.o build/release/cql3/update_parameters.o build/release/cql3/ut_name.o build/release/thrift/handler.o build/release/thrift/server.o build/release/thrift/thrift_validation.o build/release/utils/runtime.o build/release/utils/murmur_hash.o build/release/utils/uuid.o build/release/utils/big_decimal.o build/release/types.o build/release/validation.o build/release/service/priority_manager.o build/release/service/migration_manager.o build/release/service/storage_proxy.o build/release/cql3/operator.o build/release/cql3/relation.o build/release/cql3/column_identifier.o build/release/cql3/constants.o build/release/cql3/query_processor.o build/release/cql3/query_options.o build/release/cql3/single_column_relation.o build/release/cql3/token_relation.o build/release/cql3/column_condition.o build/release/cql3/user_types.o build/release/cql3/untyped_result_set.o build/release/cql3/selection/abstract_function_selector.o build/release/cql3/selection/simple_selector.o build/release/cql3/selection/selectable.o build/release/cql3/selection/selector_factories.o build/release/cql3/selection/selection.o build/release/cql3/selection/selector.o build/release/cql3/restrictions/statement_restrictions.o build/release/db/consistency_level.o build/release/db/system_keyspace.o build/release/db/schema_tables.o build/release/db/commitlog/commitlog.o build/release/db/commitlog/commitlog_replayer.o build/release/db/serializer.o build/release/db/config.o build/release/db/index/secondary_index.o build/release/db/marshal/type_parser.o build/release/db/batchlog_manager.o build/release/io/io.o build/release/utils/utils.o build/release/utils/UUID_gen.o build/release/utils/i_filter.o build/release/utils/bloom_filter.o build/release/utils/bloom_calculations.o build/release/utils/rate_limiter.o build/release/utils/file_lock.o build/release/utils/dynamic_bitset.o build/release/gms/version_generator.o build/release/gms/versioned_value.o build/release/gms/gossiper.o build/release/gms/failure_detector.o build/release/gms/gossip_digest_syn.o build/release/gms/gossip_digest_ack.o build/release/gms/gossip_digest_ack2.o build/release/gms/endpoint_state.o build/release/gms/application_state.o build/release/dht/i_partitioner.o build/release/dht/murmur3_partitioner.o build/release/dht/byte_ordered_partitioner.o build/release/dht/boot_strapper.o build/release/dht/range_streamer.o build/release/unimplemented.o build/release/query.o build/release/query-result-set.o build/release/locator/abstract_replication_strategy.o build/release/locator/simple_strategy.o build/release/locator/local_strategy.o build/release/locator/network_topology_strategy.o build/release/locator/token_metadata.o build/release/locator/locator.o build/release/locator/snitch_base.o build/release/locator/simple_snitch.o build/release/locator/rack_inferring_snitch.o build/release/locator/gossiping_property_file_snitch.o build/release/locator/production_snitch_base.o build/release/locator/ec2_snitch.o build/release/locator/ec2_multi_region_snitch.o build/release/message/messaging_service.o build/release/service/client_state.o build/release/service/migration_task.o build/release/service/storage_service.o build/release/service/pending_range_calculator_service.o build/release/service/load_broadcaster.o build/release/service/pager/paging_state.o build/release/service/pager/query_pagers.o build/release/streaming/stream_task.o build/release/streaming/stream_session.o build/release/streaming/stream_request.o build/release/streaming/stream_summary.o build/release/streaming/stream_transfer_task.o build/release/streaming/stream_receive_task.o build/release/streaming/stream_plan.o build/release/streaming/progress_info.o build/release/streaming/session_info.o build/release/streaming/stream_coordinator.o build/release/streaming/stream_manager.o build/release/streaming/stream_result_future.o build/release/streaming/stream_session_state.o build/release/gc_clock.o build/release/partition_slice_builder.o build/release/init.o build/release/repair/repair.o build/release/exceptions/exceptions.o build/release/dns.o build/release/auth/auth.o build/release/auth/authenticated_user.o build/release/auth/authenticator.o build/release/auth/data_resource.o build/release/auth/password_authenticator.o build/release/auth/permission.o build/release/api/api.o build/release/api/storage_service.o build/release/api/commitlog.o build/release/api/gossiper.o build/release/api/failure_detector.o build/release/api/column_family.o build/release/api/messaging_service.o build/release/api/storage_proxy.o build/release/api/cache_service.o build/release/api/collectd.o build/release/api/endpoint_snitch.o build/release/api/compaction_manager.o build/release/api/hinted_handoff.o build/release/api/lsa.o build/release/api/stream_manager.o build/release/api/system.o build/release/gen/cql3/CqlLexer.o build/release/gen/cql3/CqlParser.o build/release/gen/cassandra_types.o build/release/gen/cassandra_constants.o build/release/gen/Cassandra.o seastar/build/release/libseastar.a libthrift.so libboost_system.so libyaml-cpp.so liblz4.so libz.so libsnappy.so libjsoncpp.so libboost_filesystem.so libcrypt.so libstdc++.so libm.so.6 libgcc_s.so libpthread.so.0 libc.so.6 libc_nonshared.a libgcc_s.so crtend.o crtn.o' (100 runs):

    2364,001445      task-clock (msec)         #    1,000 CPUs utilized            ( +-  0,22% )
            378      context-switches          #    0,160 K/sec                    ( +- 10,91% )
              0      cpu-migrations            #    0,000 K/sec                  
        291 093      page-faults               #    0,123 M/sec                    ( +-  0,02% )
<not supported>      cycles                   
<not supported>      stalled-cycles-frontend  
<not supported>      stalled-cycles-backend   
<not supported>      instructions             
<not supported>      branches                 
<not supported>      branch-misses            

    2,364667169 seconds time elapsed                                            ( +-  0,22% )

linking scylladb after patch:

    2359,664993      task-clock (msec)         #    0,998 CPUs utilized            ( +-  0,20% )
            367      context-switches          #    0,156 K/sec                    ( +- 10,72% )
              0      cpu-migrations            #    0,000 K/sec                  
        291 161      page-faults               #    0,123 M/sec                    ( +-  0,02% )
<not supported>      cycles                   
<not supported>      stalled-cycles-frontend  
<not supported>      stalled-cycles-backend   
<not supported>      instructions             
<not supported>      branches                 
<not supported>      branch-misses            

    2,364471647 seconds time elapsed                                          ( +-  0,26% )

2,364471647 / 2,364667169 == 0.999917

grimar added a comment.May 2 2017, 6:57 AM
In D32354#738340, @ruiu wrote:

The other way of doing this is to add code to relocateOne, to record whether we handle TLS IE/LD relocations or not. This is probably the least expensive way as it doesn't involve any virtual function calls.

relocateOne is called really late, during writeTo. That is much later than finalizeContents() where we finish updating the DT_* entries. It does not look clean to do it there, probably ?

ruiu added inline comments.May 2 2017, 3:47 PM
ELF/Relocations.cpp
842 ↗(On Diff #96776)

This is not beautiful. Since setting DF_STATIC_TLS is not actually that important, I don't want to mess the code up for that feature. What if you add it to getRelExpr? That is also ugly, but it might be acceptable.

grimar updated this revision to Diff 97578.May 3 2017, 3:04 AM
  • Addressed review comment.
ELF/Relocations.cpp
842 ↗(On Diff #96776)

Placing in getRelExpr probably looks better and does not conflict with D32758. Done.

ruiu accepted this revision.May 5 2017, 2:25 PM

LGTM

Adding this piece of code to getRelExpr is tricky as getRelExpr is now not only do get but also has side effect. But this seems to be most less intrusive way of doing this.

ELF/Target.cpp
363 ↗(On Diff #97578)

I don't think that comment is useful. When you describe it, you want to make it useful for those who are not familiar with details about what you are dealing with. I'd write

// There are 4 different TLS variable models with varying degrees of flexibility and performance. LocalExec and InitialExec models are fast but less-flexible models. They cannot be used for dlopen(). If they are in use, we set DF_STATIC_TLS in the ELF header so that the runtime can reject such DSOs.

This revision is now accepted and ready to land.May 5 2017, 2:25 PM
This revision was automatically updated to reflect the committed changes.
grimar added inline comments.May 8 2017, 3:38 AM
ELF/Target.cpp
363 ↗(On Diff #97578)

Ok, thanks !