This is an archive of the discontinued LLVM Phabricator instance.

[lld] - Ignore non-elf files in archives with --whole-archive
AbandonedPublic

Authored by gbreynoo on Mar 2 2017, 10:45 AM.

Details

Summary

A patch to stop the use of the “--whole-archive” argument causing a crash if an archive contains non-elf files.

Details
This code change and test originates from a patch by rafael.
Using the argument “--whole-archive” forces the load of all members in a static library, however the link fails if the archive contains any other kinds of file. This patch means these files are instead skipped over.

Diff Detail

Event Timeline

gbreynoo created this revision.Mar 2 2017, 10:45 AM
gbreynoo added a subscriber: llvm-commits.
ruiu edited edge metadata.Mar 2 2017, 10:48 AM

Interesting. I wonder in what situation archive files contain non-ELF/bitcode files.

grimar edited edge metadata.Mar 3 2017, 1:58 AM

Its not crashing for me, it shows fatal error "error : non-elf-archive.s.tmp.foo: invalid data encoding".
lf.bfd 2.27 also errors out this case: "non-elf-archive.s.tmp.a: member non-elf-archive.s.tmp.a(non-elf-archive.s.tmp.foo) in archive is not an object".
ld.gold 2.27 shows: error: non-elf-archive.s.tmp.a: member at 656 is not an ELF object

So I guess there are probably no reason to accept such inputs. Though I think we want to have testcase for that for checking what LLD do.

Thanks for the correction grimar you are right, it is a fatal error and not a crash. However I think the change is still useful, I’ll clarify the reasons I submitted this patch below:

  1. llvm-ar allows for the creation of archives with a mixture of ELF and other arbitrary files.
  2. Historically our customers use archives with a mixture of ELF and other arbitrary files.
  3. A fatal error does not seem appropriate when the file type can be discerned at this point and not interrupt the linking process. Could a warning be output instead if required?
ruiu added a comment.Mar 3 2017, 3:58 PM

I'd think we should continue handling it as an error. There are a few reasons:

  1. The GNU linkers don't allow that (thank you George for investigating!). That means almost all Unix projects don't add non-object files to .a files, so in practice the current behavior is desirable for most people.
  2. Assume that foo.a contains x.o, y.o and z.o. The current semantics of --whole-archive foo.a --no-whole-archive is the same as x.o y.o z.o, which is a pretty simple rule. This patch changes that semantics. I think that's not a good thing to do.
  3. It is easy to relax this error check, but tightening it later is hard. I don't want to relax this unless a strong reason to do is presented.
silvas added a subscriber: silvas.Mar 6 2017, 12:18 AM
In D30544#692075, @ruiu wrote:

I'd think we should continue handling it as an error. There are a few reasons:

  1. The GNU linkers don't allow that (thank you George for investigating!). That means almost all Unix projects don't add non-object files to .a files, so in practice the current behavior is desirable for most people.

It sounds like this would be for compatibility with the PlayStation proprietary Unix linker, not with the GNU linkers. Last I heard, PlayStation has officially committed to LLD, so we should take their concerns seriously (as with any platform that wants to use LLD as their system linker).

That being said, it's not clear just how critical this is. Owen, can you quantify or otherwise clarify the impact of this feature on your customers? For example, would an excellent error diagnostic be acceptable instead of a warning?

  1. Assume that foo.a contains x.o, y.o and z.o. The current semantics of --whole-archive foo.a --no-whole-archive is the same as x.o y.o z.o, which is a pretty simple rule. This patch changes that semantics. I think that's not a good thing to do.

I think this would be a really good reason to keep the current behavior. However, it does not appear to be the case:

sean:~/pg/llvm/release % ./bin/llvm-ar rcs foo.a build.ninja                  
sean:~/pg/llvm/release % ./bin/ld.lld --whole-archive foo.a --no-whole-archive
./bin/ld.lld: error: build.ninja: invalid data encoding
zsh: exit 1     ./bin/ld.lld --whole-archive foo.a --no-whole-archive
sean:~/pg/llvm/release % ./bin/ld.lld build.ninja                             
./bin/ld.lld: error: build.ninja:21: malformed number: 1.5
./bin/ld.lld: error: build.ninja:21: ninja_required_version = 1.5
./bin/ld.lld: error: build.ninja:21:                          ^
./bin/ld.lld: error: target emulation unknown: -m or at least one .o file required
zsh: exit 1     ./bin/ld.lld build.ninja
  1. It is easy to relax this error check, but tightening it later is hard. I don't want to relax this unless a strong reason to do is presented.

This is a good reason. I think the providing an excellent diagnostic might be a good middleground.

The problem we see from a customer perspective is an inconsistency between including an archive with or without the --whole-archive argument.

When --whole-archive is used the archive is parsed on a file by file basis. When --whole-archive is not used, the archive is parsed by looking at the symbol table and files not in the symbol table are ignored. This means from a user perspective adding the --whole-archive argument can make a formally legal archive unexpectedly cause an error.

ruiu added a comment.Mar 6 2017, 9:44 AM

The problem we see from a customer perspective is an inconsistency between including an archive with or without the --whole-archive argument.

When --whole-archive is used the archive is parsed on a file by file basis. When --whole-archive is not used, the archive is parsed by looking at the symbol table and files not in the symbol table are ignored. This means from a user perspective adding the --whole-archive argument can make a formally legal archive unexpectedly cause an error.

I agree that in some cases that is true, but in general adding --whole-archive can change valid combinations of file into invalid combinations, no? For example, assume that there is a file foo.o which contains undefined symbols that cannot be resolved by any other file. As long as foo.o is in an archive file and you are not using that file, foo.o doesn't cause any rpoblem. But if you add --whole-archive, the file will cause an undefined symbol error. So, adding --whole-archive can cause an error even if all files in an archive are valid ELF files.

I have to agree with you Rui, particularly regarding the transparency of the current semantics. Thanks for the clarification everyone, I’ll abandon this revision.

gbreynoo abandoned this revision.Mar 7 2017, 6:27 AM