Page MenuHomePhabricator

ar: add llvm-dlltool support
ClosedPublic

Authored by martell on Feb 13 2017, 7:27 AM.

Details

Summary

A PE COFF spec compliant import library generator.
Intended to be used with mingw-w64.

Supports:
PE COFF spec (section 8, Import Library Format)
PE COFF spec (Aux Format 3: Weak Externals)

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
martell updated this revision to Diff 105055.Jul 3 2017, 5:12 AM

updated to master.
addressed comments.
fixed tests.
added clearer comment about indirect

martell added inline comments.Jul 3 2017, 5:19 AM
include/llvm/Object/COFFModuleDefinition.h
45 ↗(On Diff #98637)

mingw and wine def files does not mangle _ which is important for x86.
It also does some weird things in the mingw crt like prepending _ to all x86 crt functions so that they end up the same as x64 at link time.
I would like to not do this but this is a legacy problem with binutils dlltool.
I would have to update binutils to stop doing this before I could get mingw-w64 to change creating an abi breakage.
Some time in the future we could have this behind some flag like --legacy-mode and encourage mingw-w64 and wine to recreate all their .def files.

ruiu added inline comments.Jul 10 2017, 1:23 PM
include/llvm/Object/COFFModuleDefinition.h
45 ↗(On Diff #98637)

Please add that as a comment, including the fact that what mingw does is weird and should be removed.

Just an additional comment (hope it doesn't delay adding this tool); this functionality is available in lib.exe in the MSVC universe, and via "link.exe -lib" (which iirc is what lib.exe calls). Is it possible to hook this up into lld later as well, in case most of the actual functionality is in libs and not in the dlltool frontend? That at least seems to be the case, so that's good. (I guess I can try to do that myself once the rest of this has landed as well.)

martell updated this revision to Diff 105987.Jul 11 2017, 3:17 AM

added a comment per request of rui

Just an additional comment (hope it doesn't delay adding this tool); this functionality is available in lib.exe in the MSVC universe, and via "link.exe -lib" (which iirc is what lib.exe calls). Is it possible to hook this up into lld later as well, in case most of the actual functionality is in libs and not in the dlltool frontend? That at least seems to be the case, so that's good. (I guess I can try to do that myself once the rest of this has landed as well.)

Code was already migrated from LLD to LLVM so that LLD, llvm-lib and llvm-dlltool can share the same code base.
I purposefully done this with llvm-lib in mind, you just need to hook up the frontend api of llvm-lib to COFFModuleDefinition and the Archiver to achieve this.

martell updated this revision to Diff 105991.Jul 11 2017, 3:39 AM

updated comment to mention we should remove it in future.

ruiu added inline comments.Jul 11 2017, 3:58 PM
lib/Object/ArchiveWriter.cpp
322 ↗(On Diff #105991)

Indentation. Please use clang-format-diff to format your patch before requesting code review.

lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

Why do you have to pass buffers in the first place? Their lifetime is till the end of this block, so it seems unnecessary.

lib/Object/COFFModuleDefinition.cpp
59 ↗(On Diff #105991)

... because mingw doesn't prepend "_".

87 ↗(On Diff #105991)

Please format comment. This comment has two space characters. Please add a full stop at end of a sentence.

222–227 ↗(On Diff #105991)

Do you think you can just do this?

if (Machine == IMAGE_FILE_MACHINE_I386 && !MingwDef) {
  if (!isDecorated(E.Name))
    E.Name = (std::string("_").append(E.Name));
  if (!E.ExtName.empty() && !isDecorated(E.ExtName))
    E.ExtName = (std::string("_").append(E.ExtName));
}
lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
53 ↗(On Diff #105991)

There's a off-by-one error. X6 should be X7 and so on.

112–115 ↗(On Diff #105991)

Let's simplify this.

llvm::errs() << Args.getArgString(MissingIndex) + ": missing argument\n";
119–123 ↗(On Diff #105991)

Merge this with the following if so that it just prints out the help message.

128 ↗(On Diff #105991)

It is easier to just add i386, i386:x86-64, arm at end of this line

134 ↗(On Diff #105991)

No -> no

135 ↗(On Diff #105991)

and this line.

141 ↗(On Diff #105991)

Remove llvm:: as you did using namespace llvm.

166 ↗(On Diff #105991)

Why?

173–175 ↗(On Diff #105991)

Reverse the condition and return 0 at end of main.

martell updated this revision to Diff 106150.Jul 12 2017, 1:12 AM
martell marked 10 inline comments as done.

address comments

martell updated this revision to Diff 106152.Jul 12 2017, 1:43 AM
martell marked an inline comment as done.
martell added inline comments.
lib/Object/ArchiveWriter.cpp
322 ↗(On Diff #105991)

Sorry I thought the indentation of 2 spaces is correct already.
It actually wraps it on the next line to the external bracket.

lib/Object/COFFModuleDefinition.cpp
222–227 ↗(On Diff #105991)

No because @ and ? are are still valid decorators for mingw

lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
53 ↗(On Diff #105991)

Look at lib/ToolDrivers/llvm-lib/LibDriver.cpp#L44
I can change it but every infoTable in the llvm source tree has this off by 1 for the #define

112–115 ↗(On Diff #105991)

assuming << and not +

128 ↗(On Diff #105991)

Thought it would be better to have that list in once place but that works also.

141 ↗(On Diff #105991)

This is all based on the rest of the source code of llvm.
llvm::errs seems to be used throughout for clarity over errs()
Just looking at lib/ToolDrivers/llvm-lib/LibDriver.cpp or any other source file in llvm I can see it follows this pattern.
We shouldn't be too pedantic about this.

ruiu added inline comments.Jul 12 2017, 11:24 AM
lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

Can you address this comment?

lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
53 ↗(On Diff #105991)

This is an obvious mistake, and it is not a good idea to blindly copy mistakes.

martell updated this revision to Diff 106264.Jul 12 2017, 11:40 AM
martell marked an inline comment as done.
martell marked an inline comment as done.
martell added inline comments.Jul 12 2017, 11:45 AM
lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

In general I am just following the style of the rest of the function.

std::vector<uint8_t> ImportDescriptor; -> createImportDescriptor
std::vector<uint8_t> NullImportDescriptor; -> createNullImportDescriptor
std::vector<uint8_t> NullThunk; -> createNullThunk

std::vector<uint8_t> WeakBuffer1;
std::vector<uint8_t> WeakBuffer2;

Maybe a suggestion of a better name?
Unless you want a change to the signature of createWeakExternal?

martell added inline comments.Jul 12 2017, 11:49 AM
lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

I would assume the others use std::vector<uint8_t> in the case that someone wants to use this function on a premade buffer of their own. Possible Public API???
I am happy to put the buffer inside the createWeakExternal function itself if you want?

ruiu added inline comments.Jul 12 2017, 1:59 PM
lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

I don't think it's about style. Why do you have to pass WeakBuffer1 and WeakBuffer2? If there's no need, please don't do that.

martell added inline comments.Jul 13 2017, 7:28 AM
lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

Gotcha, It has been awhile since I wrote this so I had to take some time to re read everything.
You have to pass WeakBuffer1 and WeakBuffer2 because you need to have the buffers in memory up until the writeArchive function is called.
By having the buffer in the function itself combined with the fact we call the function twice means we are using the same memory space for both buffers which obviously creates problems like having the same import created over and over.

md5sum e15c2aae621caebae2f1cc91a040cf62 /martell/llvm/usr/x86_64-w64-mingw32/lib/libmsvcrt.a
vs
md5sum 6cfae6b5ba015594e4b3ba8f3ba9bde0 /martell/llvm/usr/x86_64-w64-mingw32/lib/libmsvcrt.a

I will clarify what I mean by style.
ObjectFactory has a BumpPtrAllocator so we could create the buffer from within the function like createShortImport but all the other functions that create full sections like createNullImportDescriptor createNullImportDescriptor and createNullThunk all use std::vector<uint8_t>
For the record all of these functions could also use the BumpPtrAllocator I just assume they don't because it is cleaner and more readable which is why I done the same for createWeakExternal

ruiu added inline comments.Jul 13 2017, 5:35 PM
lib/Object/COFFImportFile.cpp
584–585 ↗(On Diff #101330)

When writeArchive is called, both WeakBuffer1 and WeakBuffer2 are dead because their lifetime ends at the end of their enclosing innermost block. (i.e. they are invalid after line 591.) That means this patch has a use-after-free bug.

martell updated this revision to Diff 106640.Jul 14 2017, 7:17 AM

use the allocator

ruiu added inline comments.Jul 14 2017, 1:48 PM
lib/Object/COFFImportFile.cpp
559–560 ↗(On Diff #106640)

This must not be correct. You are memcpy'ing to some memory region and returning something different.

martell updated this revision to Diff 106717.Jul 14 2017, 2:59 PM
martell added inline comments.
lib/Object/COFFImportFile.cpp
559–560 ↗(On Diff #106640)

In a cleanup I removed char *Buf so simplify not even realising I needed that reference.
Should be good now. Sorry for the confusion.

ruiu added inline comments.Jul 14 2017, 3:08 PM
lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
72 ↗(On Diff #106717)

Remove llvm::.

99–100 ↗(On Diff #106717)

These variables are dead. Please remove. (Please take a look at your code more carefully so that you won't add dead code.)

101 ↗(On Diff #106717)

Move this to Line 134 so that the variable definition is closer to the use of the variable.

103 ↗(On Diff #106717)

Move this to line 134.

104 ↗(On Diff #106717)

Move this to line 131.

158 ↗(On Diff #106717)

Use empty() instead.

martell updated this revision to Diff 106738.Jul 14 2017, 5:30 PM
martell marked 6 inline comments as done.
ruiu added a comment.Jul 14 2017, 6:58 PM

Please read your patch throughly again and again to fix all errors of the same kind until you think your patch is perfect. I think I'm pointing out the same errors probably too many times.

lib/Object/COFFModuleDefinition.cpp
87–89 ↗(On Diff #106738)

This is not safe because Buf may contain just one character (and if that's the case, accessing Buf[1] can cause a runtime error.) You want to do this.

Buf = Buf.drop_front();
// GNU dlltool accepts both = and ==.
if (Buf.startswith('='))
  Buf = Buf.drop_front();
return Token(Equal, "=");
lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
101 ↗(On Diff #106738)

Remove llvm::.

129–131 ↗(On Diff #106738)
std::string Path = Args.getLastArgValue(OPT_l);
143 ↗(On Diff #106738)

You want to do this only when MB has contents, no?

155–156 ↗(On Diff #106738)

OK, so delete the above definition of Path and write everything here.

std::string Path = Args.getLastArgValue(OPT_l);
if (Path.empty())
  Path = getImplibPath(Def->OutputFile);
martell updated this revision to Diff 106768.Jul 15 2017, 5:08 AM
martell marked 3 inline comments as done.
martell added inline comments.
lib/Object/COFFModuleDefinition.cpp
87–89 ↗(On Diff #106738)

I suppose we could have a bad user that does something like hellofunc = without finishing the line.
Will do startswith but with a NULL terminated StringRef instead of a char.

lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
101 ↗(On Diff #106738)

This is not consistent with the rest of the llvm project.
Anywhere opt is used regardless of using namespace llvm; it is displayed as llvm::opt
Similar to llvm::errs and llvm::outs
For example llvm::opt::OptTable is mentioned in this file twice and every other driver.
Just looking at the lib driver I can see llvm::InputArgList in main.

143 ↗(On Diff #106738)

Added a check after the creation of MB to say definition file is empty in that case.

removed all the unneeded headers in a cleanup.

check-llvm looks good

ninja check-llvm
  Expected Passes    : 14861
  Expected Failures  : 61
  Unsupported Tests  : 6273

rebuilt mingw-w64 crt with this and tested with LLD

/Users/martell/llvm/usr/bin/lld -flavor link -OUT:a.exe /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/crt2.o /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/crtbegin.o hello.o /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libmingw32.a /Users/martell/llvm/usr/lib/clang/5.0.0/lib/windows/libclang_rt.builtins-x86_64.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libmoldname.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libmingwex.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libmsvcrt.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libadvapi32.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libshell32.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libuser32.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/libkernel32.a /Users/martell/llvm/usr/x86_64-w64-mingw32/lib/crtend.o /alternatename:__image_base__=__ImageBase
wine a.exe
hello world

Can we get a LGTM now? :)

ruiu added inline comments.Jul 17 2017, 3:21 PM
lib/Object/COFFImportFile.cpp
167 ↗(On Diff #106768)

LLVM coding style: imp -> Imp

(As I said, I really appreciate if you carefully review your patch yourself before sending it so that I don't need to point out all these style errors. It seems we are spending too much time on these kind of stuff and am honestly a bit tired of pointing them out.)

483 ↗(On Diff #106768)

imp -> Imp

484–486 ↗(On Diff #106768)

Why static?

martell marked 2 inline comments as done.Jul 18 2017, 4:10 AM
martell added inline comments.
lib/Object/COFFImportFile.cpp
167 ↗(On Diff #106768)

Damn it, I really thought we had everything that time.
I don't know how we missed that in the previous iterations.
I assume it just became more apparent when std::vector<uint8_t> &Buffer was removed.
I just read through everything extra carefully and I ran clang-format again.
Updating now.

484–486 ↗(On Diff #106768)

The reason for doing this was to not have to initialise the variables on every call to the function.

createNullImportDescriptor has
static const uint32_t NumberOfSections = 1;
static const uint32_t NumberOfSymbols = 1;

createImportDescriptor has
static const uint32_t NumberOfSections = 2;
static const uint32_t NumberOfSymbols = 7;
static const uint32_t NumberOfRelocations = 3;

createNullThunk has
static const uint32_t NumberOfSections = 2;
static const uint32_t NumberOfSymbols = 1;

We do not need the variables for the lifetime of the program but if I am to remove static here it should also be done for the other functions.

Which option will I go with?

martell updated this revision to Diff 107054.Jul 18 2017, 4:11 AM
martell marked an inline comment as done.

rebased on master.
ran clang-format on the files even fixing style errors in files that were not my own.

I would REALLY like to make the 5.0 window with this. :)

lib/Object/ArchiveWriter.cpp
380 ↗(On Diff #107054)

clang-format picked this up previously.
I just never added it to the diff.
Adding it here to show that I am actually using clang-format.

ruiu added inline comments.Jul 18 2017, 7:37 AM
lib/Object/ArchiveWriter.cpp
380 ↗(On Diff #107054)

Well, what you want to use is actually clang-format-diff, not clang-format, as I said before. clang-format-diff formats only the part you modified while clang-format formats entire file. Pleaser revert this part of change so that you don't change unrelated code.

lib/Object/COFFImportFile.cpp
549–550 ↗(On Diff #107054)

Are these the same as

"__imp_" + Sym

and

"__imp_" + Weak

? If so, write that way.

484–486 ↗(On Diff #106768)

Compilers can optimize them out, so remove static.

lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
144–146 ↗(On Diff #107054)

Is OutputFile guaranteed to have some non-empty value at this point?

martell marked 3 inline comments as done.Jul 18 2017, 9:39 AM
martell added inline comments.
lib/Object/COFFImportFile.cpp
549–550 ↗(On Diff #107054)

writeStringTable(Buffer, {"__imp_" + Sym, "__imp_" + Weak});
error: no matching function for call to 'writeStringTable'
note: candidate function not viable: cannot convert initializer list argument to 'ArrayRef<const std::string>' (aka 'ArrayRef<const basic_string<char, char_traits<char>, allocator<char> > >')

Using Twine fails here also and converting is pointless as we might as well just use std::string at that point.

E.Name and ENameExt use this same format E.Name = (std::string("_").append(E.Name));
Seems appropriate considering the type.

lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
144–146 ↗(On Diff #107054)

parseOne in parseCOFFModuleDefinition sets it if Name is in the definition.
Is possible if -D is not passed and no name is in the def file.
I was letting writeArchive error out on this but we can exit sooner here.

martell updated this revision to Diff 107120.Jul 18 2017, 9:40 AM
ruiu accepted this revision.Jul 18 2017, 9:49 AM

LGTM

This revision is now accepted and ready to land.Jul 18 2017, 9:49 AM
This revision was automatically updated to reflect the committed changes.
pcc added inline comments.Feb 21 2018, 12:59 PM
llvm/trunk/lib/Object/COFFImportFile.cpp
540

I would use IMAGE_WEAK_EXTERN_SEARCH_ALIAS here instead of the magic number 3.

martell added inline comments.Feb 21 2018, 4:53 PM
llvm/trunk/lib/Object/COFFImportFile.cpp
540

I will get a patch ready for that and stop setting SF_Undefined in the correct scenario.
Standby for an update. :)