[LLD] [RFC] [COFF] Add support for GNU binutils import libraries
Needs Review, Public

Authored by mstorsjo on Tue, Oct 3, 1:25 PM.

Details

Reviewers
ruiu
compnerd
Summary

Currently, LLD fails to link to import libraries produced by GNU binutils.

GNU binutils import libraries aren't short import libraries of the kind that link.exe and LLD produce; they are plain static libraries containing .idata section chunks. MSVC link.exe can link against them successfully (although prior to MSVC 2012, linking failed if -opt:ref was used).

Currently, the main issue is that the whole .idata section is synthesized in one go by the IdataContents class, which doesn't allow it to be mixed with .idata section chunks coming from linked-in object files. This patch makes the IdataContents class produce section chunks that are sorted together with the chunks from input object files.

However, since the first object file in the static import library to be included is the function itself, that object file gets pulled in before the object file that contains the import directory header. That causes the import directory header to point at the wrong point in the IAT and import name table. If these object files are included in the same order as they are in the static library (or alphabetically), this works out fine though.

I've tested handling that by sorting object files alphabetically within each static library included, but that turned out to break linking msvcrt.lib. I've yet to figure out what the correct algorithm for doing this would be - apparently link.exe manages to do it though.

mstorsjo created this revision. Tue, Oct 3, 1:25 PM
ruiu added a comment. Tue, Oct 3, 1:39 PM

Import library files are tricky, and I don't fully understand how it works in the MSVC linker. It seems that because import libraries contain sections such as .idata$2, .idata$3, etc., they would be "naturally" processed and sorted in the MSVC linker, and as a result a correct output would be created. But I couldn't figure that out. So, lld implements a special logic for the import library.

I have two questions:

  1. Do you think you can figure out what MSVC does for the import library?
  2. Why does GNU binutils create an import library that is different from MSVC's?

@ruiu I'm not sure that I understand the first question. Might be easier to discuss this on IRC or at the social.

As to why binutils generates different import libraries: a long time ago, import libraries were complete archives of object files. That format was larger but simpler. binutils based its behaviour on it, but added some extensions to be able to sort the sections correctly (which in my experience is still insufficient, as there have been cases where it would inject the null terminator too early, or swap things earlier). To make things worse, binutils doesn't get the ordering correct with the short import library format (which is what MSVC currently generates). In order to work with binutils, you need to generate the binutils-specific format of the import library.

In D38513#887477, @ruiu wrote:

> Import library files are tricky, and I don't fully understand how it works in the MSVC linker. It seems that because import libraries contain sections such as .idata$2, .idata$3, etc., they would be "naturally" processed and sorted in the MSVC linker, and as a result a correct output would be created. But I couldn't figure that out. So, lld implements a special logic for the import library.

Ok, thanks for the honest comment on this part!

> I have two questions:
>
>   1. Do you think you can figure out what MSVC does for the import library?

I guess I'd have to try to reverse-engineer the logic somehow, alternatively try to look at what GNU binutils does in case that gives any clues.

This whole patch turned out to be a much larger task than I had anticipated, so I posted it early in case any of you had good insights on how to implement it. I'll probably continue with this at a somewhat lower priority and focus on other, easier features in between though.

Could you elaborate on "the first object file in the static import library to be included is the function itself"? Is the function in question the linker-provided import thunk, or something else?

It's the import library provided import thunk.

To elaborate:
Given an import library for a dll testlib.dll, it contains a number of object files, d000000.o, d000001.o etc.

The first one, d000000.o, contains (among others) the symbols __IMPORT_DESCRIPTOR_testlib and __head_testlib_dll. It also contains references to local zero-byte .idata$4 and .idata$5 sections, intended to give a pointer to the start of the IAT for this DLL, and a reference to _testlib_dll_iname.

A later one, d000010.o in my test case, contains the symbols func and __imp_func, and contains 4 bytes of data for .idata$4 and .idata$5 (filling in one entry in the IAT), and a reference to __head_testlib_dll.

Finally, the last one in the library, d000011.o in my test, contains _testlib_dll_iname and a 4-byte null terminator for each of .idata$4 and .idata$5.

Now while linking the module that might reference this import library, initially no object files from the import library are included. When encountering an undefined reference to func or __imp_func, it will pull in d000010.o which defines those. This object has an undefined reference to __head_testlib_dll which pulls in d000000.o, which in turn has got an undefined reference to _testlib_dll_iname which pulls in d000011.o.
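The pull-in chain described above can be sketched as a small simulation. This is only an illustrative model, not LLD's or link.exe's actual symbol resolution: the struct, the table and the function names are hypothetical, and the symbol lists contain only the symbols named in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one archive member: the symbols it defines and the
 * symbols it leaves undefined. Only symbols named in the thread are listed. */
struct member {
    const char *name;
    const char *defines[3];   /* NULL-terminated */
    const char *undefined[3]; /* NULL-terminated */
};

static const struct member testlib[] = {
    { "d000000.o", { "__IMPORT_DESCRIPTOR_testlib", "__head_testlib_dll", NULL },
                   { "_testlib_dll_iname", NULL } },
    { "d000010.o", { "func", "__imp_func", NULL },
                   { "__head_testlib_dll", NULL } },
    { "d000011.o", { "_testlib_dll_iname", NULL }, { NULL } },
};
enum { NMEMBERS = sizeof testlib / sizeof testlib[0] };

/* Returns the index of the member defining sym, or -1 if none does. */
static int find_definer(const char *sym) {
    for (int i = 0; i < NMEMBERS; i++)
        for (int j = 0; j < 3 && testlib[i].defines[j]; j++)
            if (strcmp(testlib[i].defines[j], sym) == 0)
                return i;
    return -1;
}

/* Pulls in the member defining sym, then whatever its undefined symbols
 * require. Member indices are appended to order[] in extraction order. */
static void pull(const char *sym, int order[], int *count, int loaded[]) {
    int m = find_definer(sym);
    if (m < 0 || loaded[m])
        return;
    assert(*count < NMEMBERS);
    loaded[m] = 1;
    order[(*count)++] = m;
    for (int j = 0; j < 3 && testlib[m].undefined[j]; j++)
        pull(testlib[m].undefined[j], order, count, loaded);
}

/* Fills order[] (capacity NMEMBERS) and returns how many members were pulled. */
int extraction_order(const char *root, int order[]) {
    int loaded[NMEMBERS] = {0};
    int count = 0;
    pull(root, order, &count, loaded);
    return count;
}
```

Calling extraction_order("func", order) yields the member indices 1, 0, 2, that is d000010.o, d000000.o, d000011.o, which differs from the order in which the members are stored in the archive.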

This means that the object files, ordered by the order in which they are referenced (as I think lld does it right now?), are d000010.o, d000000.o and d000011.o. The .idata$5 section group will then be: [func entry 4 bytes] [0 bytes section chunk, referenced from the import descriptor] [4 null bytes terminator]. As a result, the import descriptor actually points at the terminator. And if any other functions from the same import library were to be linked later, they would end up after the terminator. Or worse, if imports are done from multiple import libraries, they end up intermixed.

By enforcing the object files to be ordered alphabetically (or e.g. by their respective order in the import library archive), I make sure that the .idata$5 section group starts with [0 bytes section chunk referenced from the import descriptor], then [func entry 4 bytes] for every function referenced in the library, finishing with [4 null bytes terminator].
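To make the difference between the two orderings concrete, here is a minimal sketch that lays out the .idata$5 chunks in a given order and reports where the import descriptor's zero-byte chunk and the terminator end up. The type and function names are hypothetical, the sizes are the 32-bit ones quoted above, and the model assumes exactly one head chunk and one terminator.

```c
#include <assert.h>
#include <stddef.h>

/* The three kinds of .idata$5 contribution described in the thread. */
enum chunk_kind { HEAD, ENTRY, TERMINATOR };

/* 32-bit sizes: the head chunk is empty, entries and terminator are 4 bytes. */
static size_t chunk_size(enum chunk_kind k) {
    return k == HEAD ? 0 : 4;
}

struct iat_layout {
    size_t head_offset;       /* where the import descriptor's IAT pointer lands */
    size_t terminator_offset; /* where the null terminator lands */
    size_t entries_in_range;  /* entries between head and terminator */
};

/* Lays the chunks out back to back in the given order. */
struct iat_layout lay_out(const enum chunk_kind *chunks, size_t n) {
    struct iat_layout l = {0, 0, 0};
    size_t off = 0;
    assert(chunks != NULL);
    /* First pass: record where the head and the terminator are placed. */
    for (size_t i = 0; i < n; i++) {
        if (chunks[i] == HEAD)
            l.head_offset = off;
        if (chunks[i] == TERMINATOR)
            l.terminator_offset = off;
        off += chunk_size(chunks[i]);
    }
    /* Second pass: count the entries a reader starting at the head would see
     * before reaching the terminator. */
    off = 0;
    for (size_t i = 0; i < n; i++) {
        if (chunks[i] == ENTRY && off >= l.head_offset && off < l.terminator_offset)
            l.entries_in_range++;
        off += chunk_size(chunks[i]);
    }
    return l;
}
```

With the reference order { ENTRY, HEAD, TERMINATOR } the head and the terminator share offset 4 and no entry lies between them. With the archive order { HEAD, ENTRY, TERMINATOR } the head is at offset 0, the terminator at offset 4, and the single func entry sits in between.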

But this hack then turned out to break other things when linking msvcrt.lib (which contains a number of actual object files as well, in addition to the normal import library entries). Exactly what it broke is as yet undiagnosed.

Thanks for the detailed explanation! I'm pretty familiar with how short import libraries work, but I hadn't looked into MinGW-style import libraries too much before.

I played around with them a bit, and judging from the output of running link with /verbose, link also pulls libraries in order of reference, so I'm not sure how the ordering works out there. My hypothesis is that link might special-case zero-sized sections, but I have no evidence to back that up yet. I'm planning to play around with custom object files with zero-sized sections constructed via yaml2obj to see if that hypothesis holds up.

Also adding more Windows people in case they have any ideas.

> My hypothesis is that link might special-case zero-sized sections, but I have no evidence to back that up yet.

I don't think that's it. Consider linking two separate GNU import libraries in the same module; if just piling up section chunks into .idata$5 as they are referenced, you'd have an IAT with entries mixed from both. So some sort of grouping per import library is needed, but the exact conditions for it aren't really known. Perhaps only ordering .idata$* chunks like this, but not other ones?

One could try to read the source of GNU ld, but I'm not sure how easy it is to find the answer to the question by sifting through that source...

You're right; there appears to be no special-casing for zero-sized sections in general.

I don't think it's a library ordering thing either though. I extracted the individual object files out of the import library and linked against those rather than the import library (in the order in which they would have been referenced, i.e. the object file containing the func entry was before the object file containing the zero byte section), and the imports of the linked image were still correct.

In general, link.exe is definitely doing some special-casing for .idata$* chunks though:

  • They end up in the .rdata section in the final image (whereas LLD produces a separate .idata output section).
  • The import address table chunks (.idata$5) end up at the start of the .rdata section, so regular grouping order isn't respected there.

I looked into this some more. MinGW import libraries don't appear to contain the null directory table entry that's supposed to terminate the import directory table. This doesn't usually cause problems for link.exe, because you'll probably be linking against a non-MinGW import library, which provides the null entry (as .idata$3). Even if you're only linking against a MinGW import library, it looks like the loader doesn't require the entire terminating entry to be null; if either the Name RVA or the IAT RVA is null, the entry is considered to be a terminator. You can craft examples which cause problems though. All of the below discussion assumes 32-bit VS tools and MinGW, for reasons that will become apparent later.
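As a rough sketch of the two terminator rules being contrasted, assuming the standard 20-byte import directory entry layout: the struct and function names below are hypothetical, and the second predicate encodes only the loader behaviour observed in this comment, not a documented rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field order of one PE import directory table entry (5 DWORDs, 20 bytes). */
struct import_descriptor {
    uint32_t ilt_rva;         /* import lookup table RVA */
    uint32_t time_date_stamp;
    uint32_t forwarder_chain;
    uint32_t name_rva;        /* RVA of the DLL name string */
    uint32_t iat_rva;         /* import address table RVA */
};

/* The terminator a linker is supposed to emit: every field is zero. */
int is_all_null(const struct import_descriptor *d) {
    assert(d != NULL);
    return d->ilt_rva == 0 && d->time_date_stamp == 0 &&
           d->forwarder_chain == 0 && d->name_rva == 0 && d->iat_rva == 0;
}

/* The looser rule the loader appears to apply, per the observation above:
 * a null Name RVA or a null IAT RVA is enough to end the table. */
int observed_terminator(const struct import_descriptor *d) {
    assert(d != NULL);
    return d->name_rva == 0 || d->iat_rva == 0;
}
```

An entry such as { 0x2000, 0, 0, 0x2100, 0 } is not all-null, yet under the observed rule it still ends the table because its IAT RVA is zero.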

Create a file fs.c:

__declspec(noreturn) void __stdcall ExitProcess(unsigned exitCode);

__declspec(dllexport) int f() { return 1; }
__declspec(dllexport) int g() { return 2; }
__declspec(dllexport) int h() { return 3; }
__declspec(dllexport) int i() { return 4; }
__declspec(dllexport noreturn) void ex(unsigned it) { ExitProcess(it); }

Create a library from it with MSVC:

cl /LD /MD /O1 fs.c

Create a file fsmain.c:

__declspec(dllimport) int f();
__declspec(dllimport) int g();
__declspec(dllimport) int h();
__declspec(dllimport) int i();
__declspec(dllimport noreturn) void ex();

void main() { ex(f() + g() + h() + i()); }

We're gonna wanna only link this with a single import library, so let's do that with the cl-generated import library first and convince ourselves that it works:

cl /O1 /Zl fsmain.c fs.lib /link /entry:main
fsmain
echo %ERRORLEVEL%
(outputs 10)

Now create a def file fs.def:

LIBRARY fs
EXPORTS
  f
  g
  h
  i
  ex

Create an import library with MinGW and see our executable fail to start properly when linked against that import library:

dlltool -d fs.def -l fs.lib
cl /O1 /Zl fsmain.c fs.lib /link /entry:main
fsmain
(you should get an "application was unable to start correctly" error)

What's happening here is that the ILT is placed immediately after the first (and only) import directory table entry, so the loader tries to read the ILT as if it were a directory entry. We have enough ILT entries that both the Name RVA and IAT RVA fields are non-null, so the directory entry is considered non-null, and you get a crash from the malformed entry.

You can alter the number of exports in the def file (and adjust the expression in main accordingly) to see how the loader is behaving. Only exporting ex should work because the IAT RVA (but not the Name RVA) will end up being null. Exporting ex and f will crash because both fields will end up non-null. Exporting ex, f, and g will work because the Name RVA (but not the IAT RVA) will end up being null. Exporting ex, f, g, and h will work because the IAT RVA (but not the Name RVA) will end up being null.

This is also why it's important to use 32-bit tools. With 64-bit tools, unless you have a huge import library, the upper 4 bytes of each ILT entry will be 0, so you'll always end up with a Name RVA of 0, and the problem will be masked.
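The overlap between the ILT and the phantom directory entry can be worked through mechanically. The sketch below is a simplified model with hypothetical names: it assumes the ILT starts right after the single real directory entry, that every by-name ILT entry has a non-zero low DWORD and (for 64-bit) a zero high DWORD, and it treats whatever follows the ILT's own null terminator as unknown rather than guessing at it.

```c
#include <assert.h>
#include <stddef.h>

enum dword_state { UNKNOWN, ZERO, NONZERO };

enum verdict {
    TERMINATOR_VIA_NAME,  /* DWORD 3 (Name RVA) is known to be zero */
    TERMINATOR_VIA_IAT,   /* DWORD 4 (IAT RVA) is known to be zero */
    MALFORMED_ENTRY,      /* both fields are known to be non-zero */
    DEPENDS_ON_LATER_DATA /* decided by bytes that follow the ILT */
};

/* Overlays n_exports ILT entries of entry_bytes (4 or 8) bytes each, plus the
 * ILT's null terminator, onto the 5 DWORDs the loader reads as the second
 * directory entry, then classifies the Name RVA and IAT RVA fields. */
enum verdict classify(size_t n_exports, size_t entry_bytes) {
    enum dword_state d[5] = { UNKNOWN, UNKNOWN, UNKNOWN, UNKNOWN, UNKNOWN };
    size_t dwords_per_entry;

    assert(entry_bytes == 4 || entry_bytes == 8);
    dwords_per_entry = entry_bytes / 4;

    for (size_t i = 0; i < 5; i++) {
        size_t entry = i / dwords_per_entry;
        size_t part = i % dwords_per_entry; /* 0 = low DWORD, 1 = high DWORD */
        if (entry < n_exports)
            d[i] = (part == 0) ? NONZERO : ZERO;
        else if (entry == n_exports)
            d[i] = ZERO; /* the ILT's own null terminator */
    }

    if (d[3] == ZERO)
        return TERMINATOR_VIA_NAME;
    if (d[4] == ZERO)
        return TERMINATOR_VIA_IAT;
    if (d[3] == NONZERO && d[4] == NONZERO)
        return MALFORMED_ENTRY;
    return DEPENDS_ON_LATER_DATA;
}
```

For 4-byte entries, classify(3, 4) and classify(4, 4) reproduce the three-export and four-export cases above, and classify(5, 4) reproduces the crash with all five exports. The one-export and two-export cases come out as DEPENDS_ON_LATER_DATA, since their outcome is decided by data after the ILT that this model does not know. For 8-byte entries, any non-zero number of exports gives TERMINATOR_VIA_NAME, which is the masking effect described for 64-bit tools.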