This is an archive of the discontinued LLVM Phabricator instance.

ELF2: Implement __start_SECNAME and __stop_SECNAME.
ClosedPublic

Authored by ruiu on Oct 14 2015, 7:15 PM.

Details

Reviewers
davide
rafael
Summary

If a section name is valid as a C identifier (which is rare because of
the leading '.'), linkers are expected to define __start_<secname> and
__stop_<secname> symbols. They mark the beginning and end of the section,
respectively. This is not required by the ELF standard, but GNU ld and
gold provide this feature.
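
For illustration only (this is not code from the patch), the feature is typically consumed from C as sketched below. The section name `my_set`, the macro `IN_MY_SET`, and the helper functions are hypothetical names chosen for this example. The sketch assumes an ELF target, a compiler that supports GNU-style `__attribute__((section(...)))`, and a linker that defines the two symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Put objects into a section whose name is a valid C identifier.
   `used` keeps the compiler from dropping the otherwise unreferenced
   variables. "my_set" is a made-up name for this sketch. */
#define IN_MY_SET __attribute__((used, section("my_set")))

static const int first  IN_MY_SET = 10;
static const int second IN_MY_SET = 20;
static const int third  IN_MY_SET = 30;

/* Not defined anywhere in the source: the linker is expected to
   provide these at the start and end of the "my_set" output section. */
extern const int __start_my_set[];
extern const int __stop_my_set[];

/* Number of int entries the linker collected into the section. */
size_t my_set_count(void) {
    assert(__start_my_set <= __stop_my_set);
    return (size_t)(__stop_my_set - __start_my_set);
}

/* Walk the section from its start symbol to its stop symbol. */
int my_set_sum(void) {
    int sum = 0;
    for (const int *p = __start_my_set; p < __stop_my_set; ++p)
        sum += *p;
    return sum;
}
```

The order of entries inside the section is not something this sketch relies on, which is why it only counts and sums them.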

Event Timeline

ruiu updated this revision to Diff 37438. Oct 14 2015, 7:15 PM
ruiu retitled this revision to ELF2: Implement __start_SECNAME and __stop_SECNAME.
ruiu updated this object.
ruiu added a reviewer: rafael.
ruiu added subscribers: llvm-commits, silvas.
davide added a subscriber: davide. Oct 14 2015, 10:34 PM

Rare, maybe, but definitely very useful. The FreeBSD kernel definitely relies on start/stop, and I think the Linux kernel does too, although I wouldn't bet on that. The code looks good to me. I would also dump the symbol table to double-check that the addresses of the symbols are the same. I also want to say a few words on this.
In the old linker we had an entire (unfinished) pass taking care of start/stop. I know because I spent a fair amount of time trying to understand and finish the feature, without success. It could be that I didn't know better, but it looked overly complicated to me at the time. It's nice to see the same feature implemented (without comments) in roughly 10 lines of code.

majnemer added inline comments.
ELF/Writer.cpp
513–526

$ can be used in an identifier in clang:

echo 'int x$a;' | clang -x c -fsyntax-only - ; echo $?
0

gcc as well:

echo 'int x$a;' | gcc -x c -fsyntax-only - ; echo $?
0

I'm not familiar enough with this part of the standard but I always thought C identifiers were only:

  • upper/lower case a-z
  • digits 0-9
  • underscore (_)

I checked the docs and they seem to confirm this: https://msdn.microsoft.com/en-us/library/e7f8y25b.aspx
I also noticed that this is exactly what gold does. Whether accepting '$' is an extension or clang is wrong here, I don't know, but I don't think we should accept '$' as part of a valid C identifier in this case.

Update: This seems to be a GNU C extension, and even there it's not guaranteed to be supported on every target: https://gcc.gnu.org/onlinedocs/gcc/Dollar-Signs.html

ruiu added a comment. Oct 15 2015, 9:07 AM

I'm familiar with this topic because I once wrote a C compiler from scratch myself. That is a GNU extension to the C language. I don't think we want to allow '$' for this linker feature because I don't see the need for it. If the feature were standardized and the spec required us to handle '$', I would, but that is not the case here.
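
As a rough sketch of the rule the thread settles on (not the code from ELF/Writer.cpp), a check that accepts only ASCII letters, digits, and underscore, and rejects '$', might look like the following. The function names are hypothetical. The extra rule that an identifier cannot start with a digit is standard C and is assumed here even though the list above does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* First character: ASCII letter or underscore. */
static bool is_ident_start(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

/* Later characters: the above, plus ASCII digits. '$' is rejected. */
static bool is_ident_char(char c) {
    return is_ident_start(c) || (c >= '0' && c <= '9');
}

/* Returns true if `name` could be spelled as a plain C identifier,
   i.e. a section with this name would get start/stop symbols. */
bool is_valid_c_identifier(const char *name) {
    assert(name != NULL);
    if (!is_ident_start(name[0]))   /* also rejects the empty string */
        return false;
    for (const char *p = name + 1; *p != '\0'; ++p)
        if (!is_ident_char(*p))
            return false;
    return true;
}
```

Under this check a name such as `my_set` qualifies, while `.text` (leading '.') and `x$a` (contains '$') do not.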

ruiu updated this revision to Diff 37494. Oct 15 2015, 9:46 AM

Add a test for the symbol table

davide accepted this revision. Oct 15 2015, 9:58 AM
davide added a reviewer: davide.
This revision is now accepted and ready to land. Oct 15 2015, 9:58 AM
davide closed this revision. Oct 16 2015, 3:14 PM