This is an archive of the discontinued LLVM Phabricator instance.

[ELF2] Symbol Versioning: part 1, VERSION() directive parsing
AbandonedPublic

Authored by davide on Oct 21 2015, 3:02 PM.

Details

Reviewers
ruiu
rafael
Summary

So, this is a WIP. But it's enough that I can parse non-trivial trees, e.g.

VERSION {

VERS_1.1 {
  global:
    x;
    *;
  local:
    zelda;
};
VERS_1.2 {
  local:
    z;
};
VERS_1.3 {
  global:
    y;
} VERS_1.1;
VERS_1.4 {
  x;
} VERS_1.3;

}
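
To make the shape of such a tree concrete, here is a minimal sketch of a recursive-descent pass over the nodes inside VERSION { ... }. It is not the code from this patch: VersionNode and parseVersionNodes are hypothetical names, the input is assumed to be already tokenized with ':' as a separate token, and symbols that appear before any label are assumed to be global.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical representation of one version node (not lld's real type).
struct VersionNode {
  std::string name;
  std::vector<std::string> globals;
  std::vector<std::string> locals;
  std::string parent; // empty when the node names no predecessor
};

// Parses a sequence of: NAME '{' { LABEL ':' | PATTERN ';' } '}' [PARENT] ';'
// The outer "VERSION { ... }" wrapper is assumed to be stripped already.
std::vector<VersionNode>
parseVersionNodes(const std::vector<std::string> &toks) {
  std::vector<VersionNode> nodes;
  size_t i = 0;
  auto next = [&]() -> const std::string & {
    if (i >= toks.size())
      throw std::runtime_error("unexpected end of input");
    return toks[i++];
  };
  auto expect = [&](const std::string &want) {
    if (next() != want)
      throw std::runtime_error("expected '" + want + "'");
  };

  while (i < toks.size()) {
    VersionNode node;
    node.name = next();
    expect("{");
    bool isLocal = false; // assumption: unlabeled symbols are global
    while (true) {
      const std::string &tok = next();
      if (tok == "}")
        break;
      // "global" / "local" act as labels only when followed by ':'.
      if ((tok == "global" || tok == "local") && i < toks.size() &&
          toks[i] == ":") {
        isLocal = (tok == "local");
        ++i;
        continue;
      }
      (isLocal ? node.locals : node.globals).push_back(tok);
      expect(";");
    }
    // An optional predecessor name may sit between '}' and ';'.
    if (i < toks.size() && toks[i] != ";")
      node.parent = next();
    expect(";");
    nodes.push_back(node);
  }
  return nodes;
}
```

Fed the tokens of the VERS_1.3 node above, this yields one node with y in globals and VERS_1.1 as parent. The sketch does not handle 'extern' blocks or quoted patterns.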

There are a couple of issues with the current code:

  1. It can't parse 'extern' blocks yet, e.g. extern "C++" { ns::*; "f(int, double)"; };
  2. It currently treats ':' as a token, which I'm not completely sure is right. As a result, parsing namespaced patterns such as ns::* (see the example above) fails.
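
The second issue can be reproduced with a small tokenizer sketch. This is a hypothetical stand-in, not the lexer from the patch: tokenize and its colonIsToken flag are invented for illustration, and quoted strings are not handled.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical tokenizer. When colonIsToken is true, every ':' becomes
// its own token, which is what splits "ns::*" apart.
std::vector<std::string> tokenize(const std::string &s, bool colonIsToken) {
  auto isPunct = [&](char c) {
    return c == '{' || c == '}' || c == ';' || (colonIsToken && c == ':');
  };
  std::vector<std::string> toks;
  size_t i = 0;
  while (i < s.size()) {
    if (std::isspace(static_cast<unsigned char>(s[i]))) {
      ++i;
      continue;
    }
    // Punctuation always stands alone.
    if (isPunct(s[i])) {
      toks.push_back(std::string(1, s[i]));
      ++i;
      continue;
    }
    // Otherwise read a word up to the next space or punctuation.
    size_t start = i;
    while (i < s.size() && !std::isspace(static_cast<unsigned char>(s[i])) &&
           !isPunct(s[i]))
      ++i;
    toks.push_back(s.substr(start, i - start));
  }
  return toks;
}
```

With colonIsToken set, "global: ns::*;" comes out as global, :, ns, :, :, *, ; so the pattern ns::* is no longer a single token. With it cleared, the same input gives global:, ns::*, ; and the label has to be recognized with its trailing colon attached.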

Still, this is expressive enough to parse almost every version script the FreeBSD base system ships with (modulo bugs). I hope to collect feedback, implement the semantic actions, and only then commit.

Event Timeline

davide updated this revision to Diff 38054.Oct 21 2015, 3:02 PM
davide retitled this revision from to [ELF2] Symbol Versioning: part 1, VERSION() directive parsing.
davide updated this object.
davide added reviewers: ruiu, rafael.
davide added a subscriber: llvm-commits.
ruiu edited edge metadata.Oct 21 2015, 3:23 PM

Does the GNU linker accept both 'global :' and 'global:' as a label? If not, we can handle ':' as part of the name rather than as an independent token.

In D13960#272578, @ruiu wrote:

Does the GNU linker accept both 'global :' and 'global:' as a label? If not, we can handle ':' as part of the name rather than as an independent token.

I originally thought about this but, alas, they do accept both.

% cat version3.script
VERSION {

VERS_1.1 {
    global :
     x;
};

}

% /usr/local/bin/ld.bfd -T version3.script
/usr/local/bin/ld.bfd: no input files

% /usr/local/bin/ld.gold -T version3.script
/usr/local/bin/ld.gold: fatal error: no input files

They also support things like:
global:

global;

for added fun. It looks like for GNU ld "global" is a keyword but they have special cases recognizing "global" (and "local" and "extern") in the yacc grammar.

FWIW, I noticed that GNU gold rejects the example above (but ld.bfd accepts it). After a conversation with Ian Taylor, he agreed this is a gold bug, but he also pointed out that in some cases there's no right or wrong answer for linker scripts and version scripts because there's no "serious" documentation.

ruiu added a comment.Oct 21 2015, 3:53 PM

Oh yeah, that's very unfortunate, but that's true.

I took a look at gold's source code and found that their lexer behaves differently depending on the context -- if it's outside of version script, ':' is a token character, but inside version script, only '::' are handled as part of a token. I don't know if we really want to have this craziness.
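
For illustration, the context-dependent behavior described above could look roughly like the sketch below. It is written from the description in this comment, not from gold's source: LexMode and lex are hypothetical names, and quoted strings are ignored.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical lexer modes, following the description above.
enum class LexMode { LinkerScript, VersionScript };

std::vector<std::string> lex(const std::string &s, LexMode mode) {
  auto isPunct = [](char c) { return c == '{' || c == '}' || c == ';'; };
  auto isSpace = [](char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  };
  std::vector<std::string> toks;
  size_t i = 0;
  while (i < s.size()) {
    if (isSpace(s[i])) {
      ++i;
      continue;
    }
    if (isPunct(s[i])) {
      toks.push_back(std::string(1, s[i]));
      ++i;
      continue;
    }
    size_t start = i;
    while (i < s.size() && !isSpace(s[i]) && !isPunct(s[i])) {
      if (s[i] == ':' && mode == LexMode::VersionScript) {
        // Inside a version script only "::" may continue a word.
        if (i + 1 < s.size() && s[i + 1] == ':') {
          i += 2;
          continue;
        }
        break; // a single ':' ends the word
      }
      ++i; // outside a version script ':' is an ordinary word character
    }
    if (i == start) {
      // A lone ':' in version-script mode becomes its own token.
      toks.push_back(std::string(1, s[i]));
      ++i;
      continue;
    }
    toks.push_back(s.substr(start, i - start));
  }
  return toks;
}
```

In VersionScript mode, "global: ns::*;" lexes as global, :, ns::*, ; while in LinkerScript mode it lexes as global:, ns::*, ;. The cost is that the lexer has to know which grammar it is currently inside.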

I'd really like to always handle ':' as a token character, and let users update their linker scripts if they are not compatible with that behavior. Davide, could you grep all linker scripts in FreeBSD that have a space between 'global' or 'local' and the following colon and count them?

davide edited edge metadata.Oct 21 2015, 4:15 PM
davide added a subscriber: emaste.

Davide, could you grep all linker scripts in FreeBSD that have a space between 'global' or 'local' and the following colon and count them?

I'm not sure that's a very useful metric since the FreeBSD base system version scripts are bespoke and likely copied from each other. These aren't representative of 3rd-party scripts. We should be able to update these in FreeBSD pretty trivially.

davide updated this revision to Diff 38146.Oct 22 2015, 10:44 AM

New patch after discussion with Rui/Ed.
Also, test added.

ruiu added a comment.Oct 22 2015, 12:19 PM

You want to plug this into something so that the code added here actually does the task it is supposed to do.

davide abandoned this revision.Jun 5 2016, 5:21 PM

Superseded, close.