This is an archive of the discontinued LLVM Phabricator instance.

Add auto-exporting of symbols from tools so that plugins work on Windows
ClosedPublic

Authored by john.brawn on Apr 6 2016, 7:59 AM.

Details

Summary

The problem with plugins on Windows is that when building a plugin DLL it needs to explicitly link against something (an exe or DLL) if it uses symbols from that thing, and that thing must explicitly export those symbols. Also there's a limit of 65535 symbols that can be exported. This means that currently plugins only work on Windows when using BUILD_SHARED_LIBS, and that doesn't work with MSVC.

This patch adds an LLVM_EXPORT_SYMBOLS_FOR_PLUGINS option, which when enabled automatically exports from all LLVM tools the symbols that a plugin could want to use so that a plugin can link against a tool directly. Plugins can specify what tool they link against by using PLUGIN_TOOL argument to llvm_add_library. The option can also be enabled on Linux, though there all it should do is restrict the set of symbols that are exported as by default all symbols are exported.

This option is currently OFF by default, as while I've verified that it works with MSVC, linux gcc, and cygwin gcc, I haven't tried mingw gcc and I have no idea what will happen on OSX. Also unfortunately we can't turn on LLVM_ENABLE_PLUGINS when the option is ON as bugpoint-passes needs to be loaded by both bugpoint.exe and opt.exe which is incompatible with this approach. Also currently clang plugins don't work with this approach, which will be fixed in future patches.

Diff Detail

Repository
rL LLVM

Event Timeline

john.brawn updated this revision to Diff 52796.Apr 6 2016, 7:59 AM
john.brawn retitled this revision from to Add auto-exporting of symbols from tools so that plugins work on Windows.
john.brawn updated this object.
john.brawn added reviewers: chandlerc, rnk, chapuni.
john.brawn set the repository for this revision to rL LLVM.
john.brawn added a subscriber: llvm-commits.
rnk edited edge metadata.Apr 6 2016, 9:45 AM

This Python script is a wonderful, horrible hack. I submit it to the community as evidence that we should have real API export annotations in our public headers and build with -fvisibility=hidden on Unix. :)

utils/extract_symbols.py
164 ↗(On Diff #52796)

You should probably test the tools in a determinstic order. We also might want to prefer llvm-readobj to dumpbin /symbols. dumpbin is pretty slow.

john.brawn added inline comments.Apr 8 2016, 10:04 AM
utils/extract_symbols.py
164 ↗(On Diff #52796)

From some experiments dumpbin seems to be ~2x the speed of llvm-readobj. Looking at some of the largest libraries I get (building a release build, time in seconds running extract_symbols.py on the library):

Libraryllvm-readobjdumpbin
LLVMCodeGen6.93.2
LLVMCore4.62.1
LLVMAnalysis4.62.5
LLVMScalarOpts4.12.2

I do agree on using a deterministic order though.

john.brawn updated this revision to Diff 53036.Apr 8 2016, 10:21 AM
john.brawn edited edge metadata.

Use a deterministic order for checking tools, instead of whatever dict.keys() happens to return.

rnk added a reviewer: beanz.Apr 26 2016, 2:31 PM
rnk added a comment.Apr 26 2016, 2:36 PM

Added Chris, pinging other reviewers. Personally, I think we should go the other direction: build a DSO for LLVM, a DSO for Clang, and have opt plugins link against that. This, however, requires us to get serious about annotating our public API boundary.

chandlerc edited edge metadata.Apr 27 2016, 12:42 PM
In D18826#412821, @rnk wrote:

Added Chris, pinging other reviewers. Personally, I think we should go the other direction: build a DSO for LLVM, a DSO for Clang, and have opt plugins link against that. This, however, requires us to get serious about annotating our public API boundary.

I don't really disagree with you, but I don't know who is going to step up to do the (really huge) work here. Are you? If not, what do we do in the interim?

In D18826#412821, @rnk wrote:

Added Chris, pinging other reviewers. Personally, I think we should go the other direction: build a DSO for LLVM, a DSO for Clang, and have opt plugins link against that. This, however, requires us to get serious about annotating our public API boundary.

I don't really disagree with you, but I don't know who is going to step up to do the (really huge) work here. Are you? If not, what do we do in the interim?

I had a look at doing this approach, but didn't as it does look like a huge amount of work (though looking at the amount of time I've sunk into getting this auto-export approach to work maybe it would have been better to just do the work instead of trying to be clever). There's also another problem, and that's global data symbols - if you build LLVM.dll using LLVM_LINK_LLVM_DYLIB and link opt.exe against it there are a number of global variables used across the dll-exe boundary that need to be arranged to be exported from the dll and explicitly imported by the exe (for function symbols you can get away with not explicity importing them, but that doesn't work for variables). Alternatively those uses could be changed to not use global variables e.g. have a function that returns the address of the variable and export that, but in some cases that looks impractical e.g. APFloat has several static fltSemantics members that are used all over the place, and passes have static PassID members used in inline constructors which would all need to be changed.

In unrelated news I've found some problems with the template handling in extract_symbols.py (it looks at the outermost name in the namespace but it should be looking at the innermost name) which makes it throw away some symbols that should be kept. I expect to have an updated patch sometime next week.

john.brawn updated this revision to Diff 56271.May 5 2016, 5:46 AM
john.brawn edited edge metadata.

Updated extract_symbols.py to handle templates by looking at the innermost template outwards, which requires an unfortunate amount of demangling but is necessary as previously different member functions of the same template would be counted as the same thing which isn't what we want. Also fixes the handling of calling convention symbol decoration on 32-bit windows.

beanz added a subscriber: silvas.

I agree with @rnk's comments about philosophical design, but we shouldn't force this to wait on a giant header annotation pass.

The CMake looks good. I have a few comments about the Python. Also might be nice to get someone like @silvas to take a look at the Python too. He has provided excellent review feedback on Python code in the past.

cmake/modules/AddLLVM.cmake
725 ↗(On Diff #56271)

Nit: Since you have this in a python script it would be cleaner to have a -o option rather than a pipe here.

utils/extract_symbols.py
29 ↗(On Diff #56271)

Rather than calling subprocess.Popen you might consider using subprocess.check_output. It would significantly reduce the amount of boiler plate code in this python script.

334 ↗(On Diff #56271)

This is not a safe assumption. Apple ships nm but not objdump. In place of objdump we historically shipped otool. The most recent Xcode ships llvm-objdump, but you can't rely on that being present.

silvas added inline comments.May 13 2016, 5:36 PM
utils/extract_symbols.py
29 ↗(On Diff #56271)

Yeah. That's my main comment here too.

john.brawn added inline comments.May 16 2016, 8:45 AM
utils/extract_symbols.py
29 ↗(On Diff #56271)

Waiting for subprocess.check_output to have all of the output ready adds a significant delay. I did a quick experiment and with a debug build on Linux it increased time from ~25 seconds to ~45 seconds to run extract_symbols on all the libraries, and on Windows it increased from ~1 minutes to ~4 minutes. Using check_output in objdump_is_32bit_windows and readobj_is_32bit_windows is fine though, as the output is much smaller, so I'll do that.

334 ↗(On Diff #56271)

Ack. I'm not expecting this to work on OSX, but there's no reason not to fix that at least.

john.brawn updated this revision to Diff 57354.May 16 2016, 8:57 AM
john.brawn edited edge metadata.

Address review comments.

rnk accepted this revision.May 25 2016, 4:48 PM
rnk edited edge metadata.

Looks good to me. Based on other feedback, I think this is ready to land.

This revision is now accepted and ready to land.May 25 2016, 4:48 PM
This revision was automatically updated to reflect the committed changes.