We currently just use GNU format for llvm-lib. This mostly works, but ARM64EC needs an additional section that does not really fit GNU format. This patch implements writing in COFF format (as in, it's what archive reader considers as K_COFF). This mostly requires symbol emitting symbol map. Note that, just like in case of MSVC, symbols are de-duplicated in both usual symbol table and the new symbol map.
Details
Diff Detail
Unit Tests
Event Timeline
So - I'm not very familiar with the internals of static archives, can you elaborate a bit more on what K_COFF vs K_GNU means here. Is this a format flag which is written to the static archives themselves (I believe not?), or just different conventions for what details are written and found in the archives? Currently, when reading such archives, are they identified as K_GNU or K_COFF? And currently, before this patch, which mode is used? If I read the patch correctly (I just eyed it quickly) - llvm-ar will continue to write things like it used to, while llvm-lib will pick K_COFF (unless it's a thin library)?
llvm/test/tools/llvm-lib/duplicate.test | ||
---|---|---|
20 | Before this change, would --print-armap not have printed anything at all, or what's the change in externally visible functionality here? |
It's about how details are written to the archives, the meaning in this patch it analogical to how reader uses them, see Archive class constructor. Essentially, if the second member is another symbol table, then the archive treated as K_COFF. The other subtle difference is that strings in string table are null-terminated, see ArchiveMemberHeader::getName. This patch makes it possible to write archives that will be recognized as K_COFF.
Currently, when reading such archives, are they identified as K_GNU or K_COFF? And currently, before this patch, which mode is used?
Libraries generated by current llvm-lib are identified as K_GNU. Libraries generated by llvm-lib with this patch are identified as K_COFF.
If I read the patch correctly (I just eyed it quickly) - llvm-ar will continue to write things like it used to, while llvm-lib will pick K_COFF (unless it's a thin library)?
Yes, nothing changes for llvm-ar yet. I guess it would make sense to add new llvm-ar --format argument for that, but we probably shouldn't change the default. If we changed the default for Windows host, it would require explicit --format when cross compiling from Windows, which is currently not required.
llvm/test/tools/llvm-lib/duplicate.test | ||
---|---|---|
20 | Without this patch, it would print duplicated and unsorted symbols: Archive map b in bar.o c in abc.o a in abc.o b in abc.o a in foo.o |
Ok, thanks for the explanation! (So far in my endeavours, I've diven into most of the internals of PE/COFF, but never had to worry about the details of static archives.)
If I read the patch correctly (I just eyed it quickly) - llvm-ar will continue to write things like it used to, while llvm-lib will pick K_COFF (unless it's a thin library)?
Yes, nothing changes for llvm-ar yet. I guess it would make sense to add new llvm-ar --format argument for that, but we probably shouldn't change the default. If we changed the default for Windows host, it would require explicit --format when cross compiling from Windows, which is currently not required.
Yeah we shouldn't make such differences based on the host where the tool is running. There's some predecent for autodetecting the mode based on other context (preexisting archive or archive members, or maybe also objects passed on the command line), which mostly should avoid the cross compilation issues. See https://github.com/llvm/llvm-project/commit/a0d42c86dbcb150c2a7a3d8f2a8500dcfb98a5d1 for the original implementation of that (I believe it has evolved a bit still through the years since).
llvm/test/tools/llvm-lib/duplicate.test | ||
---|---|---|
20 | I see, thanks! |
Interesting. Actually, that change was broken around K_COFF handling. If you try to modify an archive recognized as K_COFF, it would hit llvm_unreachable in isBSDLike when trying to write an archive. I verified that it's fixed by this patch series and llvm-ar writes K_COFF archive.
Please explicitly note in the commit message that this doesn't affect other archive-writing tools, like llvm-ar.
Otherwise LGTM
llvm/lib/Object/ArchiveWriter.cpp | ||
---|---|---|
769 | I'm curious, what does MSVC lib do here? Refuse to construct the archive? Skip creating the symbol table? Use some other output format? |
llvm/lib/Object/ArchiveWriter.cpp | ||
---|---|---|
769 | According to my testing, it has a limit of 4090 object files. When I try to pass more, I get an ambiguous error: |
llvm/lib/Object/ArchiveWriter.cpp | ||
---|---|---|
769 | That's strange... maybe accidentally running into the C library's file handle limit. What happens if you try adding additional files to a library that already contains 4090 object files? |
llvm/lib/Object/ArchiveWriter.cpp | ||
---|---|---|
769 | Oh, right, I can get larger libraries this way. I can get up to 65535 object files by merging multiple libraries until I get an error: LINK : fatal error LNK1189: library limit of 65535 objects exceeded |
llvm/lib/Object/ArchiveWriter.cpp | ||
---|---|---|
769 | I guess we don't need to worry about that for compatibility, then. |
I'm curious, what does MSVC lib do here? Refuse to construct the archive? Skip creating the symbol table? Use some other output format?