This is an archive of the discontinued LLVM Phabricator instance.

ELF: Move sections referred to by __start_/__stop_ symbols into the main partition.
ClosedPublic

Authored by pcc on Aug 7 2019, 3:20 PM.

Details

Summary

In the case where C identifier sections have SHF_LINK_ORDER they will most
likely be placed in the same partition as the section that they are associated
with. But unless this happens to be the main partition, this will cause them
to be excluded from the range covered by the start_ and stop_ symbols,
which may lead to incorrect program behaviour. So we need to move them
all into the main partition so that they will be covered by the start_
and
stop_ symbols.

We may want to refine this approach later and allow different start_/stop_
symbol values for different partitions. This would only make sense for
relocations from SHT_NOTE sections since they are duplicated into each
partition.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Aug 7 2019, 3:20 PM
Herald added a project: Restricted Project. · View Herald Transcript
pcc updated this revision to Diff 214019.Aug 7 2019, 3:29 PM
  • Add test

Makes sense. A couple of nits:

lld/ELF/MarkLive.cpp
299 ↗(On Diff #214019)

DenseSet<StringRef> can avoid allocation for strings.

308 ↗(On Diff #214019)
if (name.consume_front("__start_") || name.consume_front("__stop_"))
  startStopSymbolNames.insert(name);
lld/test/ELF/partition-move-to-main-startstop.s
17 ↗(On Diff #214019)

You can also use FileCheck %s --implicit-check-not=has_startstop and delete CHECK-NOT:

ruiu added inline comments.Aug 8 2019, 12:57 AM
lld/ELF/MarkLive.cpp
299 ↗(On Diff #214019)

In this patch you scan over all input files and their symbol tables to see if there's a symbol starting with "start_" or "stop_", but do you have to do this that way? I mean

  1. most section names are not a valid C identifier in the first place,
  2. if section whose name "foo" exists, look up the symbol table with "start_foo" and "end_foo" to see if there's a symbol that refer to the start/end of the sectoin

Am I missing somehting?

MaskRay added inline comments.Aug 8 2019, 2:08 AM
lld/ELF/MarkLive.cpp
299 ↗(On Diff #214019)

The

for (InputFile *file : objectFiles)
  for (Symbol *s : file->getSymbols())

loop already exists because STT_GNU_IFUNC and STT_TLS are special cased. The loop gets reused to check __start_ and __stop_. I am not sure an alternative approach will optimize it.

ruiu added inline comments.Aug 8 2019, 2:14 AM
lld/ELF/MarkLive.cpp
299 ↗(On Diff #214019)

Yeah, performance-wise I don't know which is better, but I feel like looking up the symbol table is more natural than scanning all symbols if we are handling symbol names.

pcc updated this revision to Diff 214230.Aug 8 2019, 2:17 PM
pcc marked 3 inline comments as done.
  • Switch to a symbol table lookup
  • Use --implicit-check-not
lld/ELF/MarkLive.cpp
299 ↗(On Diff #214019)

With regard to point 1, in certain sanitizer builds every global will have an associated C identifier named section. So depending on your build mode the number may not be that small.

That said, I couldn't measure a significant difference in link speed between the two options even in a HWASAN build of Chromium (if anything, the symbol lookup seems to be a little faster), so I'd be fine with either option. The symbol lookup turns out to be less code, so let's go with that.

MaskRay accepted this revision.Aug 8 2019, 5:41 PM
This revision is now accepted and ready to land.Aug 8 2019, 5:41 PM
This revision was automatically updated to reflect the committed changes.