This isn't quite pretty, but it seems to work at least. Suggestions
for how to do things better are very welcome. I'll make a few inline
comments about a few surprising bits.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
lld/COFF/Driver.cpp | ||
---|---|---|
1223 | It turns out we can't actually call def->getChunk() on a DefinedRegular from a bitcode object file, because that symbol has got the SectionChunk **data; member set to nullptr, and getChunk() dereferences *data, so we have to do this dance so we won't crash on calling def->getChunk(). | |
1232 | Here we end up iterating over all symbols in the input bitcode file, to find the one we're looking at, to figure out whether it's data or a function. Not pretty... | |
2125 | Here we need to decide what symbols to export before we run addCombinedLTOObjects(). | |
lld/COFF/MinGW.cpp | ||
126 ↗ | (On Diff #341632) | The ->getChunk() call from here stemmed from the original iplementation of this feature, where it was added just as a guesstimate of what would be a decent condition - for LTO symbols we must tolerate quite different symbols. |
lld/COFF/Driver.cpp | ||
---|---|---|
1223 | Exciting. I took a look at LTO support a while back, and I wanted to rearrange how some of this stuff worked, but I never got around to it. I think I wanted to make some kind of DefinedBitcode symbol or something like that so that the fields would be less overloaded and more uniform. | |
1232 | Yeah, it would be nice to figure out how to avoid this, since it's O(n^2)-ish. One way would be to store this bit directly on the DefinedRegular symbol object. Another way would be to start making fake, synthetic SectionChunk objects for symbols from bitcode files. We could have the standard text, data, rdata, bss sections, for example. This would make the previous conditional unnecessary. That seems better. The least invasive way would be to separately iterate over the bitcode input files and look their symbols up in the symbol table and use that to create exports. However, symbol resolution is already expensive, and this is basically a second symbol resolution phase. I kind of lean towards the second option. It leads to less null checks in the long run. | |
2125 | Yes, that makes sense: we can't really do effective LTO without knowing what the entry points are going to be. |
lld/COFF/Driver.cpp | ||
---|---|---|
1232 | You mean your preference is synthetic section chunks? That does sound nice as it'd avoid null checks, but wouldn't it be a huge waste (both mem and time) to do that for all bitcode object file symbols, for the odd chance of wanting to inspect some of them for exporting - just for the short period of time until we replace them with the LTO compiled output? Or should there be like 1 common fake section chunk used for all LTO symbols, for each symbol type (func/data)? |
lld/COFF/Driver.cpp | ||
---|---|---|
1232 | I was thinking we could have one fake section chunk for the standard types of LLVM IR GlobalValues: .text for Function, .data for GlobalVariable, .rdata for Constant, etc. Something like that. I think it's just a matter of passing extra parameters to addRegular in BitcodeFile::parse. |
Simplified (somewhat) by adding fake SectionChunks for LTO symbols until they're compiled.
There's still some amount of ugliness here; the SectionChunk ctor needs to be made to cope with a null ObjFile, and that this runs before config is initialized. Any ideas for how to make that bit neater?
Nice, I think this is a reasonable direction.
lld/COFF/InputFiles.cpp | ||
---|---|---|
1045 | Please put this in an anonymous namespace and add a comment that this is just adding a constructor to help stamp out COFF section headers. For some reason I thought it was adding fields. Actually, maybe it would be less surprising if the coff_section was a plain field, and then we can take its address to construct the SectionChunk objects. | |
1047 | Can this be marked constexpr? We want to avoid any dynamic initialization if possible. | |
1052 | Please mark the globals static. | |
1054 | The SectionChunk constructor is non-trivial, so this will definitely create dynamic initialization. I guess that's OK, LLD isn't really structured as a library anyway. |
lld/COFF/InputFiles.cpp | ||
---|---|---|
1047 | Doesn't seem like it's doable to make it constexpr in this form, it'd have to be e.g. constexpr FakeSection(int c) : section(<initializer expression>) {}, it seems (at least in C++11). The most straightforward would be C99 style designated intializers, i.e coff_section fakeSection = { .Characteristics = 42; };, but that's only available in C++20. And I'd rather not enumerate the members by hand to create something like coff_section fakeSection = { 0, 0, 0, 0, 0, 0, importantValue }. Also, coff_section consists of support::ulittle16_t, which don't seem to be constexpr'able, so I'm not sure if it'd even work like that in C++20? (This is all kinda roundabout, when what we really want just is a couple static bytes laid out in a specific way, but accessible for the caller via the coff_section class.) |
Added a comment about the fake section class, made the coff_section a member instead of subclassing it, wrapped the globals in an anonymous namespace.
lld/COFF/InputFiles.cpp | ||
---|---|---|
1050 | I guess that would work just as well, and be clearer. Will change before pushing. |
It turns out we can't actually call def->getChunk() on a DefinedRegular from a bitcode object file, because that symbol has got the SectionChunk **data; member set to nullptr, and getChunk() dereferences *data, so we have to do this dance so we won't crash on calling def->getChunk().