https://llvm.org/bugs/show_bug.cgi?id=30665
See the comment in the code && the PR for a detailed description.
Details
Diff Detail
Event Timeline
It seems that this does not entirely solve the problem -- you will still have collisions if two files happen to have the same size.
The gold plugin solves this problem by incorporating the byte offset within the archive file into the filename it passes to the LTO library. Would it be possible to do something similar here?
Will it solve the problem?
If you have two archives with the same size, and the two member with the same name are the first one in the archive, they'll likely start at the same offset, wouldn't you get a collision again?
This patch is already including the archive name as part of the filename, which would seem to be enough to avoid a collision in that case.
Oh sure, as long as the full path to the archive is kept and not only the archive name itself, that should be fine!
ELF/InputFiles.cpp | ||
---|---|---|
718 | s/size of the archive/size of the member buffer/ |
Thanks for clarifying, Peter. Here's a new version of the patch. I decided to store the information about the offset in the archive in class InputFile as we do for the archive name already.
ELF/InputFiles.cpp | ||
---|---|---|
726–727 | Can you use the object address as a unique identifier instead? I mean utostr((uint64_t)&MB) is always unique. |
ELF/InputFiles.cpp | ||
---|---|---|
726–727 | Maybe, but I'd like to double check with Peter to make sure it's fine on lib/LTO side (probably it is). |
ELF/InputFiles.cpp | ||
---|---|---|
726–727 | I'd prefer not to do that, for a couple of reasons:
|
ELF/InputFiles.cpp | ||
---|---|---|
726–727 | For 2), that would tie us to a "file based" API, I'll look closely at the benefit before going this route. |
If we put aside (2) for now, all the plumbing of OffsetInArchive is just to create a unique filename, which I think I don't like. There are other ways to generate a deterministic unique filename. For example, you could add static int Counter = 0 to BitcodeFile and use the counter value as part of a filename (and increment it every time you use it).
Isn't ArchiveName a full path? If you concatenate a full path archive name and a filename inside of the archive, does it become a unique identifier?
I'm not sure I like exposing a global with static lifetime to avoid threading informations properly into the linker. If it would be a awful lot of work, I'd agree with you, but it seems it's not.
No, it does not, because you can have a valid archive which contains two member functions with the same name.
No, it does not, because you can have a valid archive which contains two member functions with the same name.
That's true. That means when we print out an error message, we have the same problem that users can't identify which file is wrong. I'm wondering if InputFile::getName() should return a unique name, such as "some/path/foobar.a:123:x.o" where 123 is an offset in the archive.
That might be something we could do in lld::elf::getFilename(). It could be confusing in the normal case, so maybe we could do it only if the archive contains duplicate names?
ELF/InputFiles.cpp | ||
---|---|---|
491 | If this is a thin archive, I don't think getChildOffset will give us the right value (looking at the implementation, it appears that it will return a garbage value for thin archive members). Maybe return 0 if thin? |
Maybe, but I still think the change belongs to a different patch
ELF/InputFiles.cpp | ||
---|---|---|
491 | Oh yeah, good catch. |
If this is a thin archive, I don't think getChildOffset will give us the right value (looking at the implementation, it appears that it will return a garbage value for thin archive members). Maybe return 0 if thin?