[LLD] [COFF] Support linking directly against DLLs in MinGW mode

Authored by mstorsjo on Jun 18 2021, 6:06 AM.



GNU ld.bfd supports linking directly against DLLs without using an
import library, and some projects have picked up on this habit.
(Using import libraries instead poses no insurmountable problem,
but the lack of this feature comes up regularly.)

As long as one is linking by name (instead of by ordinal), the DLL
export table contains most of the information needed. (One can
inspect what section a symbol points at, to see if it's a function
or data symbol. The practical implementation of this loops over all
sections for each symbol, but as long as there aren't very many
sections, that should hopefully be tolerable performance-wise.)
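
As a rough illustration of the section lookup described above, here is a minimal sketch. The helper name isRVACode matches the one split out later in this review, but the SectionInfo struct and the function signature are assumptions made for this example and are not the patch's actual code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a PE section header; only the
// fields needed for this illustration are included.
struct SectionInfo {
  uint32_t virtualAddress;  // RVA at which the section starts
  uint32_t virtualSize;     // size of the section in memory
  uint32_t characteristics; // IMAGE_SCN_* flags
};

// IMAGE_SCN_CNT_CODE as defined in the PE/COFF specification.
constexpr uint32_t kImageScnCntCode = 0x00000020;

// Returns true if the RVA falls inside a section flagged as containing
// code. This is a linear scan per symbol, so it is O(sections) per call.
bool isRVACode(const std::vector<SectionInfo> &sections, uint32_t rva) {
  for (const SectionInfo &sec : sections) {
    if (rva >= sec.virtualAddress &&
        rva - sec.virtualAddress < sec.virtualSize)
      return (sec.characteristics & kImageScnCntCode) != 0;
  }
  return false; // not inside any section; treat it as data
}
```

With only a handful of sections per DLL, the linear scan keeps the sketch simple without needing an auxiliary lookup structure.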

One exception where the information in the DLL isn't entirely enough
is on i386 with stdcall functions; depending on how they're exported,
the exported function name can be a plain undecorated name, while
the import library would contain the full decorated symbol name. This
issue is addressed separately in a different patch.
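
To make the name mismatch concrete: on i386 a stdcall function is decorated as _name@N, where N is the size of its arguments in bytes. The sketch below only illustrates the relationship between the two spellings. The helper undecorateStdcall is hypothetical, and this is not necessarily how the separate patch resolves the issue.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: strips i386 stdcall decoration ("_name@N" -> "name").
// Names that do not match the pattern are returned unchanged.
std::string undecorateStdcall(const std::string &sym) {
  std::size_t at = sym.rfind('@');
  // Require a leading underscore, a non-empty name, and a non-empty suffix.
  if (sym.empty() || sym[0] != '_' || at == std::string::npos || at < 2 ||
      at + 1 >= sym.size())
    return sym;
  // Everything after '@' must be decimal digits.
  for (std::size_t i = at + 1; i < sym.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(sym[i])))
      return sym;
  return sym.substr(1, at - 1);
}
```

For example, undecorateStdcall("_MyFunc@8") yields "MyFunc", which is the kind of plain name a DLL export table may contain while an import library would carry the decorated form.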

This is implemented by mimicking the structure of a regular import library,
with one InputFile corresponding to the static archive that just adds
lazy symbols, which then are fetched when they are needed. When such
a symbol is fetched, we synthesize a coff_import_header structure
in memory and create a regular ImportFile out of it.
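
For readers unfamiliar with the short import format, the following sketch shows roughly what synthesizing such a header in memory could look like. The 20-byte field layout follows the import header described in the PE/COFF specification. The function makeShortImport and its parameters are assumptions for illustration, not the code in this patch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Values from the PE/COFF short import library format.
constexpr uint16_t kImportCode = 0; // IMPORT_CODE
constexpr uint16_t kImportData = 1; // IMPORT_DATA
constexpr uint16_t kImportName = 1; // IMPORT_NAME (import by name)

static void putLE16(std::vector<uint8_t> &b, uint16_t v) {
  b.push_back(static_cast<uint8_t>(v & 0xFF));
  b.push_back(static_cast<uint8_t>(v >> 8));
}

static void putLE32(std::vector<uint8_t> &b, uint32_t v) {
  for (int i = 0; i < 4; ++i)
    b.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xFF));
}

// Hypothetical helper: builds a short import object in memory, consisting
// of the header followed by the NUL-terminated symbol name and DLL name.
std::vector<uint8_t> makeShortImport(uint16_t machine,
                                     const std::string &symbolName,
                                     const std::string &dllName,
                                     bool isCode) {
  std::vector<uint8_t> buf;
  putLE16(buf, 0x0000);  // Sig1: IMAGE_FILE_MACHINE_UNKNOWN
  putLE16(buf, 0xFFFF);  // Sig2
  putLE16(buf, 0);       // Version
  putLE16(buf, machine); // Machine
  putLE32(buf, 0);       // TimeDateStamp
  putLE32(buf, static_cast<uint32_t>(symbolName.size() + 1 +
                                     dllName.size() + 1)); // SizeOfData
  putLE16(buf, 0);       // OrdinalHint (unused when importing by name)
  uint16_t type = isCode ? kImportCode : kImportData;
  putLE16(buf, static_cast<uint16_t>(type | (kImportName << 2))); // TypeInfo
  buf.insert(buf.end(), symbolName.begin(), symbolName.end());
  buf.push_back(0);
  buf.insert(buf.end(), dllName.begin(), dllName.end());
  buf.push_back(0);
  return buf;
}
```

The sketch writes each field explicitly in little-endian order so that it does not depend on host endianness or struct padding.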

The implementation could be even smaller by just creating ImportFiles
for every symbol available immediately, but that would have the
drawback of actually ending up importing all symbols unless running
with GC enabled (and mingw mode defaults to having it disabled for
historical reasons).
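
The difference between the two strategies can be sketched with a toy model. Both functions below are hypothetical and only model which names end up imported. They do not reflect LLD's actual symbol table.

```cpp
#include <set>
#include <string>
#include <vector>

// Eager strategy: every export of the DLL becomes an import immediately.
std::vector<std::string> importEagerly(const std::set<std::string> &exports) {
  return std::vector<std::string>(exports.begin(), exports.end());
}

// Lazy strategy: an export is only imported when it satisfies an
// undefined reference, and each name is imported at most once.
std::vector<std::string>
importLazily(const std::set<std::string> &exports,
             const std::vector<std::string> &undefinedRefs) {
  std::set<std::string> seen;
  std::vector<std::string> imported;
  for (const std::string &ref : undefinedRefs)
    if (exports.count(ref) && seen.insert(ref).second)
      imported.push_back(ref);
  return imported;
}
```

In this model, a DLL exporting many names contributes only the referenced ones under the lazy strategy, whereas the eager strategy would rely on GC to drop the rest.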

Event Timeline

mstorsjo requested review of this revision. Jun 18 2021, 6:06 AM
mstorsjo created this revision.
Herald added a project: Restricted Project. Jun 18 2021, 6:06 AM
mstorsjo updated this revision to Diff 353171. Jun 18 2021, 11:42 PM

Updated missed bits in one testcase.

@rnk - Do you think you can have a look at this patch (plus D104532)? There aren’t that many other active reviewers who know the COFF linker in detail… Hoping to get them in by the 13.x branch if they aren’t too controversial.

rnk added inline comments. Jul 1 2021, 2:16 PM

Factoring out the computation of this boolean would help readability. Initially I thought "this should be a data structure", but there are so few PE sections that it really doesn't matter.


Instead of copying these strings into a newly allocated data structure, is there some way to put the export directory index into the LazyDLLSymbol? Most of the fields of DLLFile::Symbol seem like they can be calculated later after symbol resolution. This would avoid copying all the exported strings once.

mstorsjo added inline comments. Jul 1 2021, 2:37 PM

Yep; this does end up with something like O(n*m) efficiency too, but as the number of sections is only half a dozen normally, it's probably not a biggie.

I can split it out to a separate function.


I iterated a bit back and forth on this while making the patch...

In practice, we actually do need most of the fields from the get-go; we do need to know whether to insert the __imp_-less symbol into the symbol table (so we need to find the bool code for each symbol). Technically we don't need to store the original symbolName (for a data import), but once we synthesize the import struct we do need the original symbolName again (and I'd prefer to avoid working with heuristics for stripping out __imp_ from the symbol name when we can avoid it). The dllName isn't needed until the end, but a StringRef for it is pretty much as cheap as a reference to the COFFObjectFile for fetching it later, assuming the operation for fetching it isn't too costly.

Secondly, ExportDirectoryEntryRef is kinda opaque - it's a bit large in itself and it doesn't expose the index publicly. I did experiment with storing the whole ExportDirectoryEntryRef in LazyDLLSymbol, but I needed to have the code flag separately anyway, and then it exceeded the size for the symbol union.

So in practice, it feels a bit excessive to fetch all the info up front, but there's surprisingly little to shave off...

mstorsjo updated this revision to Diff 356019. Jul 1 2021, 2:47 PM

Split out an isRVACode() helper function.

rnk accepted this revision. Jul 1 2021, 8:05 PM


This revision is now accepted and ready to land. Jul 1 2021, 8:05 PM
thakis added inline comments.

Can you say "MinGW only" in the comment?



mstorsjo added inline comments. Jul 20 2021, 1:58 PM

Sure thing, done (for both instances) in e0e09481eef2602f14523e30a612e7c9fc941936.