This is an archive of the discontinued LLVM Phabricator instance.

[LLD] [COFF] Add support for automatically exporting all symbols
ClosedPublic

Authored by mstorsjo on Oct 10 2017, 1:47 PM.

Details

Summary

GNU ld automatically exports all symbols if no symbols have been chosen to export via either def files or dllexport attributes. The same behaviour can also be enabled via the GNU ld option --export-all-symbols, in case some symbols are marked for export via a def file or dllexport attribute.

The list of excluded symbols is from GNU ld, minus the cygwin specific symbols.

Also add support for outputting the actual list of exported symbols in a def file, as in the GNU ld option --output-def.

This currently exports all symbols from object files pulled in from libmingwex.

These options in GNU ld are documented in https://sourceware.org/binutils/docs/ld/WIN32.html.

Diff Detail

Event Timeline

mstorsjo created this revision.Oct 10 2017, 1:47 PM
ruiu added inline comments.Oct 10 2017, 2:16 PM
COFF/Driver.cpp
1258

Is it doable for the mingw driver to always add -export-all-symbols if -export is empty? If you can do this, you can remove Config->MinGW && Config->Exports.empty() from this expression.

1258

This needs comment to make it clear that this is for MinGW, and in MinGW, all symbols are automatically exported if no symbols are chosen to be exported.

1260–1290

Can you make this a separate function which returns a std::set?

1294

We generally prefer early returns, so please flip the condition and return here.

1295

count is more appropriate than find.

Thanks for the review - one inline response to a comment, will do the other suggested changes tomorrow.

COFF/Driver.cpp
1258

No, the mingw driver doesn't know whether the object files will contain any dllexport directives, so it can't take care of setting that flag.

Isn't there a hard limit on the number of files that can be exported in COFF?

COFF/Driver.cpp
1260

Can you make this a vector<StringRef>, but add an additional _ to the beginning of each item?

Then, below you can write:

if (Config->Machine != I386) {
  for (auto &Str : ExcludeSymbols) {
    Str = Str.drop_front().substr(0, Str.find('@'));
  }
}

saving unnecessary allocations on all code paths.

1289–1290

Since you've explicitly defined this set above, and there is no way to get additional items into it, why do we need a set? Seems like you can just std::sort the original vector and then std::binary_search for symbols.

mstorsjo added inline comments.Oct 11 2017, 1:15 AM
COFF/Driver.cpp
1258

Ok, will add such a comment.

1260

Ok, that should be doable.

1289–1290

Ok, changing this into a separate function that returns a sorted vector.

1294

Ok, will do

mstorsjo updated this revision to Diff 118555.Oct 11 2017, 1:17 AM

Changed the exclude list as suggested by Zachary (write all entries in fully decorated form and just substr the others), moved it to a separate function and using std::sort and std::binary_search instead of making a set of it. Added a comment and changed a condition into an early return, as suggested by Rui.

mstorsjo updated this revision to Diff 118582.Oct 11 2017, 3:33 AM

Print DATA in the output def for data symbols, adjusted the stdcall decoration for DllEntryPoint.

I did notice however that this also outputs everything from libmingwex.a that gets linked in; in practice this should probably be restricted to the object files passed directly on the command line, not libraries. Any suggestions on how to implement that? (GNU ld also has got other options like --exclude-libs that probably relates to that.) Or is this sensible as a first implementation already without that?

Isn't there a hard limit on the number of files that can be exported in COFF?

Did you mean files, or perhaps symbols? In either case, I don't really know. But as GNU ld does this, we probably should try to replicate it, at least for the simpler default cases. (E.g. any autotools project that doesn't override what to export.)

ruiu added inline comments.Oct 11 2017, 11:32 AM
COFF/Driver.cpp
580

Please mention that this is MinGW specific.

1260

Actually, I prefer a simple std::set because the number of items in the set is really small. I wouldn't avoid string copy at all because it's cost is negligible. I'd put emphasis simplicity over efficiency unless it is a performance-critical part of the linker.

1285

Please mention that this is MinGW specific.

zturner added inline comments.Oct 11 2017, 11:35 AM
COFF/Driver.cpp
1260

Fair enough, but even in that case I prefer llvm::StringSet over std::set. I rarely find a valid use case for std::set in normal code anymore when various LLVM data structures are almost always better.

mstorsjo updated this revision to Diff 118668.Oct 11 2017, 12:29 PM
mstorsjo edited the summary of this revision. (Show Details)

Added more comments about this being mingw specific, changed to use a StringSet instead of a sorted vector.

Updated the commit message to indicate that this currently exports more functions than what is ideal, in case you think it's sensible to commit it in this form. GNU ld by default ignores object files in libs named libgcc, libmingw32, libmingwex, libstdc++ and a few other similar ones, and more libs to ignore can be specified with --exclude-libs.

ruiu edited edge metadata.Oct 11 2017, 12:31 PM

By the way, why does GNU ld export all symbols by default? Is it to emulate the ELF-ism on Windows?

In D38760#895075, @ruiu wrote:

By the way, why does GNU ld export all symbols by default? Is it to emulate the ELF-ism on Windows?

Yes, I guess so. With MinGW tools, you can build e.g. quite a number of generic unix autotools based libraries without any modifications (as long as the library is linked with -no-undefined); it obviously isn't ideal windows packaging, but in many cases good enough.

ruiu added inline comments.Oct 11 2017, 12:41 PM
COFF/Driver.cpp
552

Since what this function constructs is a set containing predetermined elements, it is probably better to construct it directly, as it is more obvious what we are doing.

static StringSet<> getExportExcludeSymbols() {
  if (Config->Machine == I386)
    return {"__NULL_IMPORT_DESCRIPTOR",
            "__pei386_runtime_relocator",
            "_do_pseudo_reloc",
            "_impure_ptr",
            "__impure_ptr",
            "__fmode",
            "_environ",
            "___dso_handle",
            // These are the MinGW names that differ from the standard
            // ones (lacking an extra underscore).
            "_DllMain@12",
            "_DllEntryPoint@12",
            "_DllMainCRTStartup@12"};

  return {"_NULL_IMPORT_DESCRIPTOR",
          "_pei386_runtime_relocator",
          "do_pseudo_reloc",
          "impure_ptr",
          "_impure_ptr",
          "_fmode",
          "environ",
          "__dso_handle",
          "DllMain",
          "DllEntryPoint",
          "DllMainCRTStartup"};
}
mstorsjo updated this revision to Diff 118671.Oct 11 2017, 12:49 PM

Constructing the exclude string sets directly instead of doing string transformations.

ruiu accepted this revision.Oct 11 2017, 12:53 PM

LGTM

COFF/Driver.cpp
598–599

You can combine the two lines to

if (auto *Def = dyn_cast_or_null<Defined>(E.Sym))
This revision is now accepted and ready to land.Oct 11 2017, 12:53 PM
This revision was automatically updated to reflect the committed changes.