This is an archive of the discontinued LLVM Phabricator instance.

[LLD] [COFF] Support linking directly against DLLs in MinGW mode
ClosedPublic

Authored by mstorsjo on Jun 18 2021, 6:06 AM.

Details

Summary

GNU ld.bfd supports linking directly against DLLs without using an
import library, and some projects have picked up on this habit.
(There is no single insurmountable issue with using import
libraries, but this is a missing feature that surfaces regularly.)

As long as one is linking by name (instead of by ordinal), the DLL
export table contains most of the information needed. (One can
inspect what section a symbol points at, to see if it's a function
or data symbol. The practical implementation of this loops over all
sections for each symbol, but as long as there aren't very many
sections, that should hopefully be tolerable performance-wise.)
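
A minimal sketch of the section check described above. The name isRVACode follows the helper that is split out later in this review, but the signature, the SectionInfo record, and the choice to treat an address outside every section as data are assumptions made for illustration, not the code in the patch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a PE section header.
struct SectionInfo {
  uint32_t virtualAddress; // RVA at which the section starts
  uint32_t virtualSize;    // size of the section in memory
  bool containsCode;       // true if the section is marked as code
};

// Returns true if `rva` falls inside a section marked as code.
// This is a linear scan, so classifying n exports against m sections
// costs O(n * m) in total.
bool isRVACode(const std::vector<SectionInfo> &sections, uint32_t rva) {
  for (const SectionInfo &s : sections) {
    if (rva >= s.virtualAddress && rva - s.virtualAddress < s.virtualSize)
      return s.containsCode;
  }
  return false; // assumption: an RVA outside every section counts as data
}
```

With only a handful of sections per DLL, the scan stays cheap even when it is repeated for every exported symbol.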

One exception where the information in the DLL isn't entirely enough
is on i386 with stdcall functions; depending on how they're done,
the exported function name can be a plain undecorated name, while
the import library would contain the full decorated symbol name. This
issue is addressed separately in a different patch.
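
To illustrate the mismatch only (not how the separate patch resolves it): on i386, a stdcall function is decorated with a leading underscore and a trailing "@N", where N is the size of its arguments in bytes. The helper names below are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Builds the i386 stdcall-decorated form of `name`, e.g. "foo" with
// 8 bytes of arguments becomes "_foo@8".
std::string decorateStdcall(const std::string &name, unsigned argBytes) {
  return "_" + name + "@" + std::to_string(argBytes);
}

// Strips the decoration again; names that don't look decorated are
// returned unchanged.
std::string undecorateStdcall(const std::string &sym) {
  std::size_t at = sym.rfind('@');
  if (sym.empty() || sym[0] != '_' || at == std::string::npos ||
      at + 1 >= sym.size())
    return sym;
  for (std::size_t i = at + 1; i < sym.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(sym[i])))
      return sym;
  return sym.substr(1, at - 1);
}
```

An object file would reference the decorated form, while the DLL export table may list only the plain name, so a lookup by exact name can miss.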

This is implemented mimicking the structure of a regular import library,
with one InputFile corresponding to the static archive that just adds
lazy symbols, which then are fetched when they are needed. When such
a symbol is fetched, we synthesize a coff_import_header structure
in memory and create a regular ImportFile out of it.
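
A rough sketch of what synthesizing such a header in memory could look like. ImportHeader is a hypothetical mirror of the 20-byte short import header from the PE/COFF specification (the structure LLVM calls coff_import_header), followed by the NUL-terminated symbol name and DLL name. The function name, the constants, the choice of the import-by-name type, and the little-endian host assumption are all illustrative.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical mirror of the short import header layout.
struct ImportHeader {
  uint16_t sig1;          // 0 (IMAGE_FILE_MACHINE_UNKNOWN)
  uint16_t sig2;          // 0xFFFF
  uint16_t version;
  uint16_t machine;
  uint32_t timeDateStamp;
  uint32_t sizeOfData;    // size of the strings following the header
  uint16_t ordinalHint;
  uint16_t typeInfo;      // bits 0-1: import type, bits 2-4: name type
};
static_assert(sizeof(ImportHeader) == 20, "unexpected padding");

constexpr uint16_t kImportCode = 0; // import type: code
constexpr uint16_t kImportData = 1; // import type: data
constexpr uint16_t kImportName = 1; // name type: import by name

// Builds header + "symbol\0" + "dll\0" in one buffer.
// Assumes a little-endian host, since the header is copied byte for byte.
std::vector<char> makeImportBuffer(const std::string &symbolName,
                                   const std::string &dllName,
                                   uint16_t machine, bool isCode) {
  ImportHeader h{};
  h.sig2 = 0xFFFF;
  h.machine = machine;
  h.sizeOfData =
      static_cast<uint32_t>(symbolName.size() + 1 + dllName.size() + 1);
  h.typeInfo = static_cast<uint16_t>(
      (kImportName << 2) | (isCode ? kImportCode : kImportData));

  std::vector<char> buf(sizeof(h) + h.sizeOfData);
  std::memcpy(buf.data(), &h, sizeof(h));
  char *p = buf.data() + sizeof(h);
  std::memcpy(p, symbolName.c_str(), symbolName.size() + 1);
  p += symbolName.size() + 1;
  std::memcpy(p, dllName.c_str(), dllName.size() + 1);
  return buf;
}
```

The resulting buffer has the same shape as one member of a short-format import library, which is why it can be handed to the existing import file handling.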

The implementation could be even smaller by just creating ImportFiles
for every symbol available immediately, but that would have the
drawback of actually ending up importing all symbols unless running
with GC enabled (and mingw mode defaults to having it disabled for
historical reasons).
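
The trade-off can be sketched with a toy model in which exports are registered as lazy entries and an import is only created when a symbol is actually referenced. LazyDllExports and its members are hypothetical and much simpler than LLD's symbol table.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model: every export starts out lazy; an import is materialised
// only when fetch() is called for it.
class LazyDllExports {
public:
  explicit LazyDllExports(const std::vector<std::string> &exports) {
    for (const std::string &name : exports)
      fetched_[name] = false;
  }

  // Called when an undefined reference to `name` is found.
  // Returns false if the DLL does not export `name`.
  bool fetch(const std::string &name) {
    auto it = fetched_.find(name);
    if (it == fetched_.end())
      return false;
    if (!it->second) {
      it->second = true;
      imported_.push_back(name); // stands in for creating an import
    }
    return true;
  }

  // The symbols that actually ended up imported.
  const std::vector<std::string> &imported() const { return imported_; }

private:
  std::map<std::string, bool> fetched_;
  std::vector<std::string> imported_;
};
```

In this model, a DLL with many exports contributes only the referenced ones to the output, without relying on GC to discard the rest.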

Event Timeline

mstorsjo requested review of this revision. Jun 18 2021, 6:06 AM
mstorsjo created this revision.
Herald added a project: Restricted Project. Jun 18 2021, 6:06 AM
mstorsjo updated this revision to Diff 353171. Jun 18 2021, 11:42 PM

Updated missed bits in one testcase.

@rnk - Do you think you can have a look at this patch (plus D104532)? There aren't that many other active reviewers who know the COFF linker in detail… Hoping to get them in by the 13.x branch if they aren't too controversial.

rnk added inline comments. Jul 1 2021, 2:16 PM
lld/COFF/InputFiles.cpp
1170

Factoring out the computation of this boolean would help readability. Initially I thought "this should be a data structure", but there are so few PE sections that it really doesn't matter.

1181–1184

Instead of copying these strings into a newly allocated data structure, is there some way to put the export directory index into the LazyDLLSymbol? Most of the fields of DLLFile::Symbol seem like they can be calculated later after symbol resolution. This would avoid copying all the exported strings once.

mstorsjo added inline comments. Jul 1 2021, 2:37 PM
lld/COFF/InputFiles.cpp
1170

Yep; this does end up with something like O(n*m) efficiency too, but as the number of sections is only half a dozen normally, it's probably not a biggie.

I can split it out to a separate function.

1181–1184

I iterated a bit back and forth on this while making the patch...

In practice, we actually do need most of the fields from the get go; we do need to know whether to insert the __imp_-less symbol into the symbol table (so we need to find the bool code for each symbol). Technically we don't need to store the original symbolName (for a data import), but once we synthesize the import struct we do need the original symbolName again (and I'd prefer to avoid working with heuristics for stripping out __imp_ from the symbol name when we can avoid it). The dllName isn't needed until the end, but a StringRef for it is pretty much as cheap as a reference to the COFFObjectFile for fetching it later, assuming the operation for fetching it isn't too costly.

Secondly, ExportDirectoryEntryRef is kinda opaque - it's a bit large in itself and it doesn't expose the index publicly. I did experiment with storing the whole ExportDirectoryEntryRef in LazyDLLSymbol, but I needed to have the code flag separately anyway, and then it exceeded the size for the symbol union.

So in practice, it feels a bit excessive to fetch all the info up front, but there's surprisingly little to shave off...

mstorsjo updated this revision to Diff 356019. Jul 1 2021, 2:47 PM

Split out an isRVACode() helper function.

rnk accepted this revision. Jul 1 2021, 8:05 PM

lgtm

This revision is now accepted and ready to land. Jul 1 2021, 8:05 PM
thakis added a subscriber: thakis. Jul 19 2021, 5:42 PM
thakis added inline comments.
lld/COFF/InputFiles.h
398

Can you say "MinGW only" in the comment?

lld/COFF/Symbols.h
314

Likewise

mstorsjo added inline comments. Jul 20 2021, 1:58 PM
lld/COFF/InputFiles.h
398

Sure thing, done (for both instances) in e0e09481eef2602f14523e30a612e7c9fc941936.