This is an archive of the discontinued LLVM Phabricator instance.

[LLD] [COFF] Add support for GNU binutils import libraries
ClosedPublic

Authored by mstorsjo on Oct 3 2017, 1:25 PM.

Details

Summary

GNU binutils import libraries aren't the same kind of short import libraries as link.exe and LLD produce, but are a plain static library containing .idata section chunks. MSVC link.exe can successfully link to them.

In order for imports from GNU import libraries to mix properly with the normal import chunks, the chunks from the existing mechanism needs to be added into named sections like .idata$2.

These GNU import libraries consist of one header object, a number of object files, one for each imported function/variable, and one trailer. Within the import libraries, the object files are ordered alphabetically in this order. The chunks stemming from these libraries have to be grouped by what library they originate from and sorted, to make sure the section chunks for headers and trailers for the lists are ordered as intended. This is done on all sections named .idata$*, before adding the synthesized chunks to them.

Diff Detail

Event Timeline

mstorsjo created this revision.Oct 3 2017, 1:25 PM
ruiu edited edge metadata.Oct 3 2017, 1:39 PM

Import library files are tricky, and I don't fully understand how it works in the MSVC linker. It seems that because import libraries contain sections such as .idata$2, .idata$3, etc., they would be "naturally" processed and sorted in the MSVC linker, and as a result a correct output would be created. But I couldn't figure that out. So, lld implements a special logic for the import library.

I have two questions:

  1. Do you think you can figure out what MSVC does for the import libary?
  2. Why does GNU binutils create an import library that is different from MSVC's?
compnerd edited edge metadata.Oct 3 2017, 1:45 PM

@ruiu I'm not sure that I understand the first question. Might be easier to discuss this on IRC or at the social.

As to why binutils generates different import libraries, is because a long time ago, import libraries used to be complete archives with object files. This was larger but simpler. They based the behaviour on that, but added some extensions to actually be able to sort the sections correctly (which in my experience is still insufficient as there have been cases where it would inject the null terminator too early, or swap things earlier). To make things worse, they actually don't get the ordering correct with the short import library format (which is what MSVC currently generates). In order to work with binutils, you need to generate the binutils specific format of the import library.

In D38513#887477, @ruiu wrote:

Import library files are tricky, and I don't fully understand how it works in the MSVC linker. It seems that because import libraries contain sections such as .idata$2, .idata$3, etc., they would be "naturally" processed and sorted in the MSVC linker, and as a result a correct output would be created. But I couldn't figure that out. So, lld implements a special logic for the import library.

Ok, thanks for the honest comment on this part!

I have two questions:

  1. Do you think you can figure out what MSVC does for the import libary?

I guess I'd have to try to reverse-engineer the logic somehow, alternatively try to look at what GNU binutils does in case that gives any clues.

This whole patch turned out to be a much larger task than I had anticipated, so I posted it early in case anyone of you would have had good insights on how to implement it. I'll probably continue with this with a bit lower priority and focus on other easier features inbetween though.

Could you elaborate on "the first object file in the static import library to be included is the function itself"? Is the function in question the linker-provided import thunk, or something else?

Could you elaborate on "the first object file in the static import library to be included is the function itself"? Is the function in question the linker-provided import thunk, or something else?

It's the import library provided import thunk.

To elaborate:
Given an import library for a dll testlib.dll, it contains a number of object files, d000000.o, d000001.o etc.

The first one, d000000.o, contains (among others) the symbols __IMPORT_DESCRIPTOR_testlib and __head_testlib_dll, which among others contains references to local zero bytes sections .idata$4 and .idata$5, intended to get a pointer to the start of the IAT for this DLL, and a reference to _testlib_dll_iname.

A later one, d000010.o in my test case, contains the symbols func and __imp_func, and contains 4 bytes of data for .idata$4 and .idata$5 (filling in one entry in the IAT), and a reference to __head_testlib_dll.

Finally, the last one in the library, d000011.o in my test, contains _testlib_dll_iname and 4 null byte terminators for .idata$4 and .idata$5.

Now while linking the module that might reference this import library, initially no object files from the import library are included. When encountering an undefined reference to func or __imp_func, it will pull in d000010.o which defines those. This object has an undefined reference to __head_testlib_dll which pulls in d000000.o, which in turn has got an undefined reference to _testlib_dll_iname which pulls in d000011.o.

This means that the object files, ordered by the order they are referenced (as I think lld does it right now?) is d000010.o, d000000.o and d000011.o. This means that the .idata$5 section group will be: [func entry 4 bytes] [0 bytes section chunk, referenced from the import descriptor] [4 null bytes terminator]. This means that the import descriptor actually points at the terminator. And if any other functions from the same import library were to be linked later, they would end up after the terminator. Or worse, if imports are done from multiple import libraries, they end up intermixed.

By enforcing the object files to be ordered alphabetically (or e.g. by their respective order in the import library archive), I make sure that the .idata$5 section group starts with [0 bytes section chunk referenced from the import descriptor], then [func entry 4 bytes] for every function referenced in the library, finishing with [4 null bytes terminator].

But this hack then turned out to break other things when linking msvcrt.lib (which contains a number of actual object files as well, in addition to the normal import library entries), exactly what it broke is yet undiagnosed.

Thanks for the detailed explanation! I'm pretty familiar with how short import libraries work, but I hadn't looked into MinGW-style import libraries too much before.

I played around with them a bit, and judging from the output of running link with /verbose, link also pulls libraries in order of reference, so I'm not sure how the ordering works out there. My hypothesis is that link might special-case zero-sized sections, but I have no evidence to back that up yet. I'm planning to play around with custom object files with zero-sized sections constructed via yaml2obj to see if that hypothesis holds up.

Also adding more Windows people in case they have any ideas.

My hypothesis is that link might special-case zero-sized sections, but I have no evidence to back that up yet.

I don't think that's it. Consider linking two separate GNU import libraries in the same module; if just piling up section chunks into .idata$5 as they are referenced, you'd have an IAT with entries mixed from both. So some sort of grouping per import library is needed, but the exact conditions for it aren't really known. Perhaps only ordering .idata$* chunks like this, but not other ones?

One could try to read the source of GNU ld, but i'm not sure how easy it is to find the answer to the question by sifting through that source...

My hypothesis is that link might special-case zero-sized sections, but I have no evidence to back that up yet.

I don't think that's it. Consider linking two separate GNU import libraries in the same module; if just piling up section chunks into .idata$5 as they are referenced, you'd have an IAT with entries mixed from both. So some sort of grouping per import library is needed, but the exact conditions for it aren't really known. Perhaps only ordering .idata$* chunks like this, but not other ones?

You're right; there appears to be no special-casing for zero-sized sections in general.

I don't think it's a library ordering thing either though. I extracted the individual object files out of the import library and linked against those rather the import library (in the order in which they would have been referenced, i.e. the object file containing the func entry was before the object file containing the zero byte section), and the imports of the linked image were still correct.

In general, link.exe is definitely doing some special-casing for .idata$* chunks though:

  • They end up in the .rdata section in the final image (whereas LLD produces a separate .idata output section).
  • The import address table chunks (.idata$5) end up at the start of the .rdata section, so regular grouping order isn't respected there.

I looked into this some more. MinGW import libraries don't appear to contain the null directory table entry that's supposed to terminate the import directory table. This doesn't cause usually cause problems for link.exe, because you'll probably be linking against a non-MinGW import library, which provides the null entry (as .idata$3). Even if you're only linking against a MinGW import library, it looks like the loader doesn't require the entire terminating entry to be null; if either the Name RVA or the IAT RVA is null, the entry is considered to be a terminator. You can craft examples which cause problems though. All of the below discussion assumes 32-bit VS tools and MinGW, for reasons that will become apparent later.

Create a file fs.c:

__declspec(noreturn) void __stdcall ExitProcess(unsigned exitCode);

__declspec(dllexport) int f() { return 1; }
__declspec(dllexport) int g() { return 2; }
__declspec(dllexport) int h() { return 3; }
__declspec(dllexport) int i() { return 4; }
__declspec(dllexport noreturn) void ex(unsigned it) { ExitProcess(it); }

Create a library from it with MSVC:

cl /LD /MD /O1 fs.c

Create a file fsmain.c:

__declspec(dllimport) int f();
__declspec(dllimport) int g();
__declspec(dllimport) int h();
__declspec(dllimport) int i();
__declspec(dllimport noreturn) void ex();

void main() { ex(f() + g() + h() + i()); }

We're gonna wanna only link this with a single import library, so let's do that with the cl-generated import library first and convince ourselves that it works:

cl /O1 /Zl fsmain.c fs.lib /link /entry:main
fsmain
echo %ERRORLEVEL%
(outputs 10)

Now create a def file fs.def:

LIBRARY fs
EXPORTS
  f
  g
  h
  i
  ex

Create an import library with MinGW and see our executable fail to start properly when linked against that import library:

dlltool -d fs.def -l fs.lib
cl /O1 /Zl fsmain.c fs.lib /link /entry:main
fsmain
(you should get an "application was unable to start correctly" error)

What's happening here is that the ILT is placed immediately after the first (and only) import directory table entry, so the loader tries to read the ILT as if it were a directory entry, and we have enough ILT entries that both the Name RVA and IAT RVA fields are non-null, so the directory entry is considered non-null, and you get a crash from the malformed entry.

You can alter the number of exports in the def file (and adjust the expression in main accordingly) to see how the loader is behaving. Only exporting ex should work because the IAT RVA (but not the Name RVA) will end up being null. Exporting ex and f will crash because both fields will end up non-null. Exporting ex, f, and g will work because the Name RVA (but not the IAT RVA) will end up being null. Exporting ex, f, g, and h will work because the IAT RVA (but not the Name RVA) will end up being null.

This is also why it's important to use 32-bit tools. With 64-bit tools, unless you have a huge import library, the upper 4 bytes of each ILT entry will be 0, so you'll always end up with a Name RVA of 0, and the problem will be masked.

FWIW, I noticed this detail when linking to a GNU import library using link.exe; if linking with -verbose, link.exe prints a list of object files at the end:

testlib.lib(d000004.o)
testlib.lib(d000005.o)
testlib.lib(d000006.o)
testlib.lib(d000000.o)
testlib.lib(d000011.o)

These clearly are grouped by import library, which confirms one detail of it. But this also is more or less the same order as lld would end up linking them right now, and that doesn't work for lld, since d000000.o would have a reference to the end of the IAT instead of the start.

The grouping by import library is definitely important for both Microsoft-style and binutils-style import libraries. I'm pretty sure the second issue (how to get the correct ILT and IAT start addresses inside d000000.o) is solved by them special casing imports handling, as I mentioned before.

In general, link.exe is definitely doing some special-casing for .idata$* chunks though:

  • They end up in the .rdata section in the final image (whereas LLD produces a separate .idata output section).
  • The import address table chunks (.idata$5) end up at the start of the .rdata section, so regular grouping order isn't respected there.
mstorsjo updated this revision to Diff 165934.Sep 18 2018, 4:33 AM
mstorsjo retitled this revision from [LLD] [RFC] [COFF] Add support for GNU binutils import libraries to [LLD] [COFF] Add support for GNU binutils import libraries.
mstorsjo edited the summary of this revision. (Show Details)
mstorsjo added reviewers: rnk, pcc.

Finally updated this patch to a pretty solid, not all too messy implementation (or much less messy than before at least).

With this in place, lld can finally use a normal MinGW sysroot built with GCC and GNU binutils, making lld a proper drop-in replacement for ld.bfd in these environments. (There still are cases with more advanced C++ compiled by GCC which doesn't work properly with lld though.)

ruiu added a comment.Sep 19 2018, 3:22 PM

Wow, this patch is almost one year old.

COFF/Writer.cpp
563

Perhaps this is a good place to explain what is GNU import libraries? We shouldn't assume that code readers know what that is when reading this code.

820

Flip the condition and continue early.

Wow, this patch is almost one year old.

Yup, I put it on hold last year as it seemed complex, and I had more important things to focus on first. Now after reading a bit of binutils source, I think I understand how it's supposed to be handled (and my previous attempt didn't seem to be too far off). And after looking at the code a second time (with more experience with LLD), I managed to clean it up more so it's very lean and pretty nonintrusive now, and no longer with an RFC tag, but this should be good to go now.

COFF/Writer.cpp
563

Ok, I'll expand the comment.

820

Ok, will do.

mstorsjo updated this revision to Diff 166233.Sep 20 2018, 1:27 AM

Expanded a code comment to explain what it does and why, changed a condition to use early continue as suggested.

ruiu added a comment.Sep 20 2018, 9:17 AM

Yeah, I agree with you that this patch is lean. It actually looks pretty good.

COFF/Writer.cpp
575

nit: we omit const if it is obviously a const.

580

Can you assign Pair.first.first and Pair.first.second to local variables for the sake of documentation?

594

Ditto

610

toString is written for constructing a debug or log message, so it is better to avoid using it as part of the actual logic.

666

nit: AddChunk -> Add since the scope of this variable is very narrow.

828

I wonder if you should use assert. If this line is unreachable unless there's a bug in lld, you should use assert.

zturner added inline comments.Sep 20 2018, 9:22 AM
COFF/Writer.cpp
575

Do we? It doesn't hurt anything and prevents the user from modifying the value later. I don't see a good reason to omit it.

Yeah, I agree with you that this patch is lean. It actually looks pretty good.

Thanks! I'll update it according to your other comments, except for the one I'd prefer avoid changing here.

COFF/Writer.cpp
828

This check is just moved, so I'd avoid changing it within this patch.

I think it might theoretically be possible to trigger it with very pathological input, but I don't remember exactly right now...

ruiu added inline comments.Sep 20 2018, 9:27 AM
COFF/Writer.cpp
575

At least in many places in lld, we don't to keep it terse.

mstorsjo updated this revision to Diff 166356.Sep 20 2018, 1:27 PM

Applied @ruiu's suggestions.

ruiu added inline comments.Sep 21 2018, 9:16 AM
COFF/Writer.cpp
580

StringRef is a small object and we usually copy them instead of creating references or pointers to them.

596

Ditto

608

This function got a bit too long. Do you think there's a way to refactor?

mstorsjo updated this revision to Diff 166511.Sep 21 2018, 11:12 AM

Split two functions out of createSections, changed const StringRef& into plain StringRef.

ruiu accepted this revision.Sep 21 2018, 1:27 PM

LGTM. Beautiful!

COFF/Writer.cpp
623

Add a blank line.

637

Add a blank line before a multiline comment.

645–646

Please write a function comment.

This revision is now accepted and ready to land.Sep 21 2018, 1:27 PM
This revision was automatically updated to reflect the committed changes.