This is an archive of the discontinued LLVM Phabricator instance.

llvm-link: Add support for archive files as inputs.
ClosedPublic

Authored by jsjodin on Jun 3 2020, 10:17 AM.

Details

Summary

This patch adds support for archive files as inputs to llvm-link. One of the use-cases is for OpenMP where device specific libraries need to be extracted from libraries containing bundled object files.
The clang-offload-bundler will support extracting these archives, which will be passed into llvm-link, see https://reviews.llvm.org/D80816.

Diff Detail

Event Timeline

jsjodin created this revision.Jun 3 2020, 10:17 AM

Can you add a test for a broken archive.

Can you add a test for a broken archive.

Sure, I can use a non-archive file as an archive. I should also see what happens if .ll files are used inside an archive, I remember something bad happened when I tried it at some point.

Can you add a test for a broken archive.

Sure, I can use a non-archive file as an archive. I should also see what happens if .ll files are used inside an archive, I remember something bad happened when I tried it at some point.

A test and, if possible, an error message is probably preferable to "something bad" ;)

jsjodin added a comment.EditedJun 8 2020, 6:59 AM

Can you add a test for a broken archive.

Sure, I can use a non-archive file as an archive. I should also see what happens if .ll files are used inside an archive, I remember something bad happened when I tried it at some point.

A test and, if possible, an error message is probably preferable to "something bad" ;)

XXX Yes, I want it to work with .ll files as well, but there is an issue with memory buffers not being null terminated when they are supposed to be. I'm looking into this. XXX

[Edit]. I take that back, if it is not a bitcode file in the archive it is better to error out, otherwise a copy of the buffer will have to be created just to null terminate it. That can be fixed in a separate patch, so this does not grow too large. I will add this as a negative test.

I will add this as a negative test.

Sounds good to me.

jsjodin updated this revision to Diff 269565.Jun 9 2020, 9:05 AM

Added check that archive members are bitcode files, with a negative test. Added a negative test for a broken archive. Note: The error message "file too small to be an archive" is given regardless of the size of the input file. The error is reported if the archive library does not recognize the file.

If you think it's not too much work you could add negative test cases for other broken modules, it seems some errors are not hit so far. I also left a high level comment below.

All in all I think this is fine though. I wait a bit for anyone to chime in before I accept this.

llvm/tools/llvm-link/llvm-link.cpp
209

High level: The code is pretty "unqualified auto" heavy. While I guess this is neither performance critical nor too important, I would recommend to either specify some types where it might be helpful and use qualifiers, const, *, & where applicable.

jsjodin added inline comments.Jun 11 2020, 5:52 PM
llvm/tools/llvm-link/llvm-link.cpp
209

I had specified some types before submitting the patch, but changed them to auto, since it looked like it was the preferred way of doing things. I can specify some types to make the code easier to follow.

mehdi_amini added inline comments.Jun 11 2020, 8:37 PM
llvm/tools/llvm-link/llvm-link.cpp
209

The LLVM style is supposed to avoid auto when the type isn't obvious from the context or it helps readability.
https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable : "Use auto if and only if it makes the code more readable or easier to maintain."

jsjodin marked an inline comment as done.Jun 12 2020, 8:46 AM
jsjodin added inline comments.
llvm/tools/llvm-link/llvm-link.cpp
209

Okay, thanks for pointing it out. I will update the patch.

gregrodgers accepted this revision.Jun 12 2020, 8:56 AM

LGTM, only one new auto in an existing code with 11 previous autos. Looks like consistent usage.

This revision is now accepted and ready to land.Jun 12 2020, 8:56 AM

I really like this. I've wanted to pass archives of bitcode to llvm-link before and assume others have experienced the same disappointment.

The patch is straightforward, conservatively aborts if the archive has a mix of files in it, uses the existing parsing machinery. Understanding .ll files as well would be a nice future extension.

llvm-link raises reasonable error messages usually, I think this should hit the same handling. Maybe add a test case where the two bitcode files have different triples and check we see the usual error reported?

llvm/tools/llvm-link/llvm-link.cpp
14

Double //, should be llvm/Object

162

double negative - perhaps " failed to read name of archive member"?

354

May be better to read the first few bytes, looking for '!<arch>', instead of going by extension. Don't have strong preferences though.

ABataev added inline comments.
llvm/tools/llvm-link/llvm-link.cpp
143

Argv0. Also, better to use StringRef, if possible

144

const std::string &

157

Better to use the real type here rather than auto

158–159

Same here about autos

jsjodin updated this revision to Diff 270437.Jun 12 2020, 9:54 AM
jsjodin marked an inline comment as not done.

Removed some auto declarations to specific types.

jsjodin marked an inline comment as done.Jun 12 2020, 9:55 AM
jsjodin updated this revision to Diff 270486.Jun 12 2020, 12:20 PM

Fixed "//" in #include, fixed the parameters to loadArFile, fixed error message.

jsjodin marked 13 inline comments as done.Jun 12 2020, 12:24 PM
jsjodin added inline comments.
llvm/tools/llvm-link/llvm-link.cpp
14

Not sure how that got there. Fixed.

143

StringRef didn't simplify anything, so I kept the type, but fixed the name.

354

Maybe, but for now I will keep it as-is. If we find this to be a problem in the future it can be fixed then.

ABataev added inline comments.Jun 12 2020, 12:47 PM
llvm/tools/llvm-link/llvm-link.cpp
143

StringRef is not about simplification, this is common practice in LLVM to use it instead of const char *.

202–203

Just

if (L.linkModules(*Result, std::move(M), ApplicableFlags))
  return nullptr;
207

Does it really worth using std::move here and in other places?

jsjodin updated this revision to Diff 270515.Jun 12 2020, 2:17 PM
jsjodin marked 3 inline comments as done.

Removed local Err variable.

jsjodin marked 2 inline comments as done.Jun 12 2020, 2:17 PM
jsjodin added inline comments.
llvm/tools/llvm-link/llvm-link.cpp
207

Yes, otherwise you get a compilation error.

jsjodin requested review of this revision.Jun 16 2020, 8:25 AM

If Mehdi or someone else outside AMD could approve (or suggest more changes) that would be helpful. Thanks!

ABataev added inline comments.Jun 16 2020, 8:56 AM
llvm/tools/llvm-link/llvm-link.cpp
167

Why not StringRef?

LGTM, only one new auto in an existing code with 11 previous autos. Looks like consistent usage.

Some previous reviewers not catching deviation from the coding standard does not seem like a justification to continue this route, but rather an indication that the rest of the code could be updated.
Also the sheer number of uses of auto does not say anything: it is all about *how* it is used.

LGTM (after remaining minor inline comments)

llvm/tools/llvm-link/llvm-link.cpp
14

Includes are sorted alphabetically, clang format should do this for you (run git clang-format HEAD~...HEAD to format only your diff)

176

Nit: you're missing a space between "failed" and '

187

Nit: in other places the error message include the Archive name, I don't know if it was intentional to omit it here.

196

(Same missing space here)

jdoerfert accepted this revision.Jun 16 2020, 11:46 AM

FWIW, LGTM too.

This revision is now accepted and ready to land.Jun 16 2020, 11:46 AM

Where's the right place to document this new feature?

It's occurred to me that people may want to use llvm-link as a drop in replacement for a partial link, and this patch implements the equivalent to --whole-archive. That is, we extract all the objects, not only the ones that match an unresolved symbol earlier in the link order.

That'll surprise some users so I think we should write down that this is the behaviour they'll get, and I'm not sure where.

Where's the right place to document this new feature?

It's occurred to me that people may want to use llvm-link as a drop in replacement for a partial link, and this patch implements the equivalent to --whole-archive. That is, we extract all the objects, not only the ones that match an unresolved symbol earlier in the link order.

That is how llvm-link always worked, I mean the "whole-archive" way, isn't it?

That'll surprise some users so I think we should write down that this is the behaviour they'll get, and I'm not sure where.

LLVM-link documentation + Change Logs
@jsjodin ^

This revision was automatically updated to reflect the committed changes.