This is an archive of the discontinued LLVM Phabricator instance.

Add an LTO interface to parse metadata and extract linker options
ClosedPublic

Authored by ygao on Dec 5 2013, 4:51 PM.

Details

Summary

Hi,
This patch adds the following LTO interface to parse metadata nodes and extract linker options and dependent
libraries from a bitcode module. The new API functions are:

unsigned int lto_module_get_num_deplibs(lto_module_t mod) : Returns the number of dependent libraries in the object module.
const char* lto_module_get_deplib(lto_module_t mod, unsigned int index) : Returns the ith dependent library in the module.
unsigned int lto_module_get_num_linkeropts(lto_module_t mod) : Returns the number of linker options in the object module.
const char* lto_module_get_linkeropt(lto_module_t mod, unsigned int index) : Returns the ith linker option in the module.

These new APIs are very similar to the existing APIs for symbols.

Could someone take a look whether this is good to go in? Does anyone have strong opinion on this proposed interface?

Many thanks,

  • Gao.

Diff Detail

Event Timeline

ygao updated this revision to Unknown Object (????).Dec 5 2013, 5:03 PM
ygao added a comment.Dec 16 2013, 11:03 PM

On Sun, Dec 15, 2013 at 11:43 AM, "Duncan P. N. Exon Smith" <dexonsmith@apple.com> wrote:

Hi Gao,

The implementation looks correct, but given that it adds to the C API (which needs to be stable),
can someone else chime in?

Also, would you please explain the use case for embedding linker options in the object file metadata?
Why wouldn’t they be passed directly to the linker? Sorry if I missed an earlier thread on this, or if
I’m missing something obvious.

Duncan

Hi Duncan,

Many thanks for reviewing my patch!

Embedding linker options in the object file metadata has been the way to support
MSVC-style auto linking since r181426. I guess your real question is why we need
to add new APIs instead of compiling to native object files and then letting the
linker extract the linker options from the object files. Please clarify if that
is not the case.

I have to discuss with our linker team for any specific use cases, but I think
there are a few linker options which affect symbol references used by LTO,
and they may be needed before the symbol resolution phase. For example,
/include:<symbol> tells the linker to pretend that a symbol is referenced. Also,
if /dll and /def:<module-definition file> are used, then the linker need to read
the input .def file and preserve the exported symbols specified in that file from
being deleted by LTO passes.

Reference: http://llvm.org/docs/LinkTimeOptimization.html

  • Gao.
ygao updated this revision to Unknown Object (????).Dec 20 2013, 7:07 PM

Rebased the patch to trunk r197851.

After discussing with the linker team, I believe we have two important use
cases that need to be solved by extending the LTO plugin APIs.

First, we have game codes that use bitcode archives as argument to pragma
comment libs. In order to have knowledge of all symbol references, the linker
needs to add bitcode files included in the dependent libraries to its input
queue. And these bitcode files could in turn refer to more dependent libraries.
So there is an iterative process of scanning bitcode files for dependencies.

To be able to support bitcode archives simplifies the process of adding LTO to
existing build system. For example, say, there is a Makefile that generates an
intermediate library libfoo.a that is compiled out of source files liba.cpp and
libb.cpp and is referenced in a source file a.cpp with

#pragma comment(lib, "foo").

Without LTO, running "make all" will put native object files liba.o and libb.o
into libfoo.a. With LTO, running "make all CFLAGS=-flto" will put bitcode files
liba.o and libb.o into libfoo.a; there would be no need to change the Makefile.

Secondly, the libraries referenced in pragma comment lib need to be added to
the input queue to be scanned immediately after the current file. So, for
example, the link order would be different if the library dependencies are
extracted from a merged, post-LTO, object.

rnk added a comment.Dec 26 2013, 11:48 AM

Interesting. This is essentially exposing the info that we would have put into the object file if we had generated one.

include/llvm/Target/TargetLoweringObjectFile.h
60 ↗(On Diff #6220)

This should return a StringRef.

lib/LTO/LTOModule.cpp
817 ↗(On Diff #6220)

Now _deplibs and _linkeropts hold pointers into global metadata. Is that OK? Is that metadata guaranteed to outlive LTOModule? Might be better to copy here.

ygao updated this revision to Unknown Object (????).Jan 7 2014, 3:30 PM

On Sun, Jan 5, 2014 at 4:08 PM, "Duncan P. N. Exon Smith" <dexonsmith@apple.com> wrote:

Thanks for the explanation; I understand the use case now.
LGTM, once you’ve satisfied Reid’s comments about string lifetime.
Duncan
P.S. Sorry for the long delay!

Hi Duncan, thanks again for the review! I only just came back from my vacation yesterday.

On Thu, Dec 26, 2013 at 11:48 AM, "Reid Kleckner" <rnk@google.com> wrote:

Interesting. This is essentially exposing the info that we would have put into the
object file if we had generated one.
================
Comment at: include/llvm/Target/TargetLoweringObjectFile.h:60
@@ +59,3 @@
+ /// option string. Returns NULL if the option does not specify a library.
+ virtual const char *getDepLibFromLinkerOpt(StringRef LinkerOption) const {

+ return NULL;

This should return a StringRef.

Comment at: lib/LTO/LTOModule.cpp:817
@@ +816,3 @@
+ getObjFileLowering().getDepLibFromLinkerOpt(Op))
+ _deplibs.push_back(DepLibName);

+ else if (!Op.empty())

Now _deplibs and _linkeropts hold pointers into global metadata. Is that OK? Is that
metadata guaranteed to outlive LTOModule? Might be better to copy here.

I updated the patch to address both of these points.

I would like to make a somewhat related observation since you mentioned string
lifetime. If you take a look at the _symbols vector inside struct LTOModule, each
NameAndAttributes entry includes both a pointer to the symbol name and a pointer
to the symbol itself (as a GlobalValue). The symbol name referenced from
NameAndAttributes is a copy with a StringSet. The symbol referenced from
the same NameAndAttributes struct does not appear to be a copy; it is just part
of the input llvm::Module. If the symbol is known to have a long lifetime, it
would indicate that the copy of the symbol name is redundant. If the symbol's
lifetime is not guaranteed, then maybe the reference to it could become invalidated
at some point.

rnk added a comment.Jan 8 2014, 9:14 AM

Nick, can you look at this, since it touches the LTO interface? IIRC you had previous objections to having flags in the LTO interface.

The basic issue is that we support embedding linker options in MachO and COFF, but without this change, they aren't exposed in LTO.

lib/CodeGen/TargetLoweringObjectFileImpl.cpp
8 ↗(On Diff #6379)

Is that dash "one" or "ell"? I would assume you want -l.

ygao updated this revision to Unknown Object (????).Jan 8 2014, 3:27 PM

It should be dash-ell. Fixed. Thanks!

The libLTO API here is fine.

I'm surprised to see the return of dependent libraries in modules. For history,
see:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157053.html

Secondly, I don't think this needs to be metadata. It doesn't need to hold weak
references to IR values. I *think* I'd rather they were first-class properties
of llvm::Module? This would require some general agreement on llvm-dev.

A Module more or less corresponds to a single .o file, which makes storing
properties like linker flags a bit unusual. (This is true regardless of whether
you make it a first class property or use named metadata.) Could you explain
why it belongs here?

lib/CodeGen/TargetLoweringObjectFileImpl.cpp
407 ↗(On Diff #6390)

Hmm. You want to stick a linker option parser inside codegen? Am I correct in understanding that COFF object files have linker flags in them?

Who fills in these linker options and dependent libraries? Is there any reason we couldn't hoist the linker option parsing into there?

kledzik added inline comments.Jan 15 2014, 1:12 PM
include/llvm-c/lto.h
201 ↗(On Diff #6390)

When you add new APIs to <llvm-c/lto.h> be sure to bump up LTO_API_VERSION, so that clients to do compiler configuration checks if the API exists.

ygao added inline comments.Jan 15 2014, 8:15 PM
include/llvm-c/lto.h
201 ↗(On Diff #6390)

Hi Nick,

I saw your discussion with Sean on the mailing list on this topic. I will bump up LTO_API_VERSION as you requested.

  • Gao.
lib/CodeGen/TargetLoweringObjectFileImpl.cpp
407 ↗(On Diff #6390)

Hi Nick,

Am I correct in understanding that COFF object files have linker flags in them?

I assume you mean the COFF .drectve section.

The translation from IR's "Linker Options" named metadata to .drectve entries happens in
TargetLoweringObjectFileCOFF::emitModuleFlags() for COFF; and in
TargetLoweringObjectFileMachO::emitModuleFlags() for MachO. That is why I thought
TargetLoweringObjectFile{COFF,MachO} are good places to add these new functions. These
emitModuleFlags() routines do not appear to parse the linker options; they just output the
metadata strings as-is into the relevant directive sections.

The generation of these "Linker Options" named metadata happens in the clang
front end. For example, in cfe/trunk/lib/CodeGen/TargetInfo.cpp, look in
WinX86_32TargetCodeGenInfo::getDependentLibraryOption() and
TargetCodeGenInfo::getDependentLibraryOption(). The metadata strings received by the
backend are already "packaged" and I was trying to extract the library names back from there.

I would argue that by adding these routines to extract the library names, the results returned
by the LTO APIs could stay independent from how the linker options are packaged in the IR
metadata, although I should admit that the "/DEFAULTLIB:<library_name>" format used by
COFF has been around for quite a while and probably will not change.

ygao updated this revision to Unknown Object (????).Jan 17 2014, 11:48 AM

Rebased to trunk r199503 and added the documentation similar to r199429.

ygao closed this revision.Jan 21 2014, 10:37 AM

Closed by commit rL199759 (authored by @ygao).