The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker.
Details
Diff Detail
- Build Status
Buildable 3464 Build 3464: arc lint + arc unit
Event Timeline
COFF/Config.h | ||
---|---|---|
90 | You don't seem to set a value to that variable in this patch. Does it work? | |
COFF/Driver.cpp | ||
817 | Do you actually need to run the entire loop after LTO? I was thinking that, once LTO is done, all symbols are essentially resolved, so no need to run that loop. | |
COFF/LTO.cpp | ||
1 | This file seems very similar to ELF/LTO.cpp. Do you think there is a way to merge the two? |
I've added some inline notes to explain the changes. Essentially, I copied over LTO.cpp and LTO.h from the ELF linker and adapted the code to use those instead of the legacy LTO API. I've aimed to keep the changes (both compared to the ELF code and compared to the original code in the COFF linker) to a minimum to make reviewing easier. There are a couple of features that are present in the ELF linker which we're currently not using in the COFF linker, which is why createLTO() has a couple of hardcoded values. I can change that in a follow-up, or in this change if that's preferred. With these changes, all tests pass, and I can build Chrome with LTO.
COFF/Driver.cpp | ||
---|---|---|
817 | I noticed that the linker left some work to be done and then crashed further on, which I fixed by running untin run() returns false. Since we also do that just above here, I factored that out into a separate function. | |
COFF/LTO.cpp | ||
1 | Based on ELF/LTO.cpp | |
COFF/LTO.h | ||
2 | Slightly modified version of ELF/LTO.h | |
test/COFF/lto-comdat.ll | ||
1 | The new implementation generates code here that isn't bit-for-bit identical to what the old implementation generated, but it looks semantically equivalent. Basically, functions are emitted at different addresses, which causes some int3s are replaced by nopws. I've changed the test a bit so that it checks that we don't generate any unexpected calls, instead of specifically checking for int3 or nop instructions. | |
test/COFF/lto-parallel.ll | ||
8 | "lto.tmp" is what I carried over from the ELF code, so that the COFF linker and the ELF linker now use the same string. I can probably change it to be "<lto object>" if anyone strongly prefers that. |
COFF/LTO.cpp | ||
---|---|---|
1 | Yes. I noticed that there is a bit of very similar code in the ELF linker and the COFF linker. I would like to take a look at what can be merged. I'm actually not sure at this point if we should merge the LTO code, but I think for example Error.{h,cpp} can probably be unified. |
COFF/InputFiles.cpp | ||
---|---|---|
350โ351 | If we're switching to the new LTO API, it isn't quite right to keep the Replaceable flag, which mixes together the three different ways that symbols can be merged in COFF (common, comdat, weak external). The reason is that the legacy LTO API relied on the IR linker to resolve conflicts between these three types of symbol correctly (so if we had two replaceable BitcodeSymbols, it wouldn't matter which one wins), but the new API needs to know exactly which symbol prevails. Without correct symbol resolution, the wrong symbol may appear in the final object file depending on the order in which we read object files (for example, with this code, when we see a weak external followed by a comdat member, the weak external would win, whereas the comdat member needs to win). That is why I suggested offline that we could remove the BitcodeSymbol class, and extend DefinedRegular to handle bitcode symbols, as we do in the ELF linker. | |
372 | We don't really want to parse the module twice just to get the linker directives (this also won't work with multi-module bitcode files). As I mentioned on your other change, there should be an accessor function on InputFile that retrieves the directives. | |
COFF/LTO.cpp | ||
73 | This line doesn't make sense in a COFF linker. | |
75 | Non-relocatable COFF objects are rare enough that I think we can always use PIC here (which is the default), so you can remove this line. | |
80โ82 | These don't need to be initialized. | |
125 | Symbols can also be exported with /export. I think you just need to check IsUsedInRegularObj because we handle /export by (among other things) setting that flag on the symbol. | |
COFF/SymbolTable.cpp | ||
366 | I think this part is unnecessary because we are already doing it in LTO.cpp (see line 128). |
COFF/LTO.cpp | ||
---|---|---|
149 | May want to make the same change as r293138 here. |
Another thing I'd like to see is a test showing that ThinLTO works, something similar to lld/test/ELF/lto/thinlto.ll.
Another thing I'd like to see is a test showing that ThinLTO
I'll add that in a follow-up. The end goal of this refactor is making thinlto work with COFF, but the goal for this diff is just to get the existing functionality to work with the new API.
So I'm not very familiar with LTO. I want to avoid this much code duplication in LLD, but I'm not sure if it is doable. What do you think, Peter?
It seems that BitcodeCompiler::add is the only actual linker-specific part (the constructor and compile are basically the same except for a few details). If we changed the two linkers to share error handling functions, and moved BitcodeCompiler::add to a linker-specific location, I think we'd be able to share BitcodeCompiler between the linkers.
COFF/LTO.h | ||
---|---|---|
47 | Remove the word "template". I'm actually a little surprised that this compiles. |
COFF/InputFiles.cpp | ||
---|---|---|
395 | This will only ever be a native object file, I think. |
Changes requested by @pcc and @ruiu.
The major change here is that I removed DefinedBitcode. Instead, we now use DefinedRegular and DefinedCommon, regardless of whether the symbols come from a COFF file or a bitcode file. This also gets rid of IsReplaceable, and we now handle common, comdat, and weak individually. I also renamed DefinedCOFF to DefinedByFile to reflect that this is a common base class for symbols that are defined by a file, not necessarily a COFF file.
I moved the parsing of linker options in object files from COFF/InputFiles.cpp to LLVM's LTO.
Other, smaller changes:
- removed redundant loop to undefine bitcode symbols
- removed some things that we aren't actually using, e.g. SaveTemps and some ELF specifics
COFF/LTO.cpp | ||
---|---|---|
125 | /export will be correctly handled without this, but symbols annotated in the source code with dllexport will not. Specifically, COFF/dll.test fails with $ "C:/src/clang/./bin\FileCheck.EXE" "-check-prefix=EXPORT-LTO" "C:\src\llvm\tools\lld\test\COFF\dll.test" # command stderr: C:\src\llvm\tools\lld\test\COFF\dll.test:40:18: error: expected string not found in input EXPORT-LTO-NEXT: 3 0x1030 exportfn3 ^ <stdin>:11:1: note: scanning from here ^ when I remove the check for R.Prevailing && ObjSym.hasDLLExportStorageClass(). That test has two symbols for which it passes /export, and one for which it doesn't pass /export but has dllexport in the source. That last symbol is the one that fails. | |
149 | Thanks! I actually removed this code, because SaveTemps isn't currently set by the COFF linker anyway. |
COFF/InputFiles.cpp | ||
---|---|---|
350โ351 | I think your IsWeak flag would just be used for weak externals (COFF has no direct concept of weak defined symbols). Would it be possible to represent these with Undefined in a similar way to weak externals in regular objects? | |
389 | You never supply a non-default argument to ArchiveName or OffsetInArchive. I would just create the ObjectFile directly in LTO.cpp. | |
COFF/InputFiles.h | ||
71โ77 | It doesn't look like you ever set these fields. Regarding ArchiveName we also already have ParentName above. We might want to fix the issue that required us to add OffsetInArchive in the ELF linker (see D25495) but I think that can be done separately. | |
COFF/LTO.cpp | ||
125 | What if you change getLinkerOpts to emit /export flags? (see suggestion on D29207) I would also try undoing the runLoop change after making that change. | |
COFF/SymbolTable.cpp | ||
64 | Do we need to do this just yet? | |
240 | In general I would try to have just one add* function per symbol type. The input file readers should extract any field values, not the symbol table. | |
242โ243 | This will never be true, we create Undefined symbols for COFF weak externals. | |
252 | This isn't correct, we need to compare sizes as below. | |
COFF/Symbols.cpp | ||
32โ46 | Why change this? I think we still have the invariant that if the name is empty, the file is a COFF object, no? | |
COFF/Symbols.h | ||
128 | You could move this into SymbolBody to save space (that is if you don't take my suggestion to use Undefined for weak undef). | |
179 | Move into SymbolBody. | |
COFF/Writer.cpp | ||
449โ450 โ | (On Diff #85993) | It looks like you are only using fields Defined::Type and Defined::StorageClass here. Instead of storing them in Defined, you could change createSymbolAndStringTable to walk the original object's symbol table, extract the type and storage class from the symbol and pass them to this function. You could also do something similar for isSectionDefinition in MapFile.cpp if you were so inclined. |
Thanks for the review, @pcc! I'm making the changes you suggested. Meanwhile, I had a couple of questions, which I've asked inline.
COFF/SymbolTable.cpp | ||
---|---|---|
64 | The lto-comdat test will fail without this. Specifically, the test that uses a COFF main with LTO objects ends up picking the definition of comdat in main, resulting in calls to that being emitted for f1 and f2. The test checks that these calls are not present, which is the result we get when picking the comdat definition in the bitcode and then optimizing it out in LTO. I thought about this for some time, and my conclusion is that preferring a symbol defined in bitcode when we have no overriding concerns makes sense. We can optimize out calls in LTO when the definition is available, which picking the bitcode symbol accomplishes. We can't optimize away calls in objects that don't participate in LTO no matter which definition we pick. So it would seem that picking the definition in bitcode is sometimes a better choice and never a worse choice (again, assuming they're otherwise equivalent). That said, if there is a better way to do this, I'm all ears. | |
252 | The code here says "if there wasn't a defined symbol with this name, use the newly discovered symbol". I think that's correct. What I think you're referring to is the case where there is a DefinedCommon already, we still might want to replace that with the new symbol. Do we have a way to determine the size of a common defined in bitcode? Currently, that size defaults to 0, meaning the bitcode definition would always lose against a COFF definition, and different bitcode definitions are equivalent. This is consistent with the existing behavior (if there is a DefinedCommon, use that instead of the newly discovered bitcode symbol). | |
COFF/Symbols.cpp | ||
32โ46 | In the process of unifying bitcode symbols with COFF symbols, I wanted to get rid of DefinedCOFF storing a coff_symbol_generic. There are a couple of simple values we extracted from that, which I now just store directly. The only tricky one is Name, which, as the comments here mention, is supposed to be extracted only when needed. I didn't feel good about having to carry around a coff_symbol_generic, the special case for DefinedCOFF here, and add a special case for when it's actually a bitcode symbol and doesn't have a coff_symbol_generic, just for the one case where we did not already extract the name prior to constructing a DefinedRegular (see the calls to getSymbolName in InputFiles.cpp). Looking at it again, it seems that the lazy name extraction may help us with internal symbols, whose names we may not be interested in. If lazy name extraction is a feature we need to keep, I'll bring it back. |
COFF/SymbolTable.cpp | ||
---|---|---|
64 | I think we should try to avoid biasing the symbol resolution logic like that. To optimize the case where no prevailing bitcode symbol is chosen, we can teach the LTO API implementation to link non-prevailing symbol definitions with *_odr linkage if there is not already an implementation available, and set the linkage type to available_externally. That will make the function body available for inlining without it being emitted in the object file. Implementing the optimization that way will benefit all linkers no matter how they implement their symbol resolution logic. | |
252 | We get the correct behaviour with the existing code because the IR linker chooses the largest IR common definition, lld chooses the largest native common definition and lld chooses the larger of the two when it adds the combined LTO object to the link. With this code, if you have multiple common symbols in bitcode with different sizes, I think you would end up choosing the first one instead of the largest one. You can determine the size by calling getCommonSize() on the LTO symbol. | |
COFF/Symbols.cpp | ||
32โ46 | Ah, I didn't notice that this relies on the Sym field that you removed. I think for now the simplest thing to do would be to keep Sym and call getType/getStorageClass/isSectionDefinition directly via that pointer in the rest of the code. We can think about removing it in a separate change. |
Changes requested by @pcc (thanks!):
- Removed the special case for bitcode symbols in compareDefined.
- Use Undefined symbols instead of DefinedRegular symbols for weak symbols, so we don't need an IsWeak flag.
- Reverted the change from DefinedCOFF to DefinedByFile - DefinedCOFF now carries a coff_symbol_generic again, which we use to lazily resolve the symbol name.
- Use getCommonSize to determine the size of commons defined by bitcode.
- Simplified COFF/LTO.cpp by moving the check for dllexport to COFF/InputFiles.cpp.
- addCommon and addRegular now each have only one definition, which handles both regular and bitcode symbols.
I tried making the change @pcc has been suggesting to make getLinkerOpts() return /EXPORT directives for symbols with dllexport storage so that we wouldn't need hasDLLExportStorage anymore. However, this ran into an issue: implementing that requires iterating over all symbols. We actually do that in addRegularLTO in LTO/LTO.cpp, but that's after we need to determine symbol resolutions. So I would either have to implement an extra loop, or move the check for dllexport into a loop that runs earlier. The solution I came up with is to move it to BitcodeFile::parse in InputFiles.cpp, which already iterates over the symbols. This makes the fact that the symbol is used by regular objects available immediately after we create the symbol, which seems like a good place for it to live.
I tried making the change @pcc has been suggesting to make getLinkerOpts() return /EXPORT directives for symbols with dllexport storage so that we wouldn't need hasDLLExportStorage anymore. However, this ran into an issue: implementing that requires iterating over all symbols. We actually do that in addRegularLTO in LTO/LTO.cpp, but that's after we need to determine symbol resolutions. So I would either have to implement an extra loop, or move the check for dllexport into a loop that runs earlier.
I'd just add the extra loop to be honest. I would be surprised if we could measure any performance impact at all (and the performance impact will be zero once the directives are moved into the bitcode symbol table).
COFF/InputFiles.cpp | ||
---|---|---|
202 | http://llvm-cs.pcc.me.uk/include/llvm/Object/COFF.h#344 Common symbols are external by definition, you can also remove IsExternal here. | |
254 | You can remove the IsExternal argument, it will always be true. | |
353 | I think it would be more accurate to test that both SF_Weak and SF_Indirect are set here. | |
356 | Can you please also upload the change that adds this function? | |
359 | In the regular object reader we diagnose undefined symbols with different weak aliases. Could you factor that out into a function and call it from here? | |
COFF/InputFiles.h | ||
195 | I think you are no longer using this LLVMContext object, so you can remove it. | |
COFF/SymbolTable.cpp | ||
36 | Now that this function has only one caller, it no longer needs to be a separate function. Also, the logic in this function doesn't seem to match what was there before (e.g. you are no longer reporting conflicts between comdat and non-comdat). I think you just need the logic on lines 215-226 in the old code with the bitcode part removed. |
COFF/InputFiles.cpp | ||
---|---|---|
204 | Instead of setting IsUsedInRegularObj here, can you change addRegular and addCommon to set it based on the type of the input file? See what we're doing in addUndefined for example. |
COFF/InputFiles.cpp | ||
---|---|---|
204 | Yeah, I thought about that but it felt like a more magical way to do things. I decided to make add* work exactly the same for bitcode and object files, and set IsUsedInRegularObj explicitly. If you prefer, I can change add* to set the flag automatically. |
COFF/InputFiles.cpp | ||
---|---|---|
204 | I think I would slightly prefer it because setting IsUsedInRegularObj is really the symbol table's job, and passing in a flag would be redundant. (It's also consistent with the ELF linker.) |
- removed unnecessary LLVMContext objects
- removed unnecessary IsExternal parameters
- test for SF_Indirect in addition to SF_Weak when looking for weak externals
- removed isSectionDefinition from DefinedCOFF
- fixed symbol resolution logic
- use cast instead of dyn_cast and report_fatal in getCOFFSymbol
- factored out check for conflicting definitions of weak aliases
COFF/InputFiles.cpp | ||
---|---|---|
204 | I will make that change, and also synthesize "/EXPORT" flags instead of hasDLLExportStorageClass. | |
COFF/SymbolTable.cpp | ||
36 | I actually like this as a separate function. I feel it makes the logic of how symbol preference and conflicts are determined easier to read. Good call on changing the logic for comdat vs. non-comdat here. I had misread the original code. That should now be fixed. |
Generally looking pretty good at this point.
COFF/InputFiles.cpp | ||
---|---|---|
259 | Same here, Sym.isExternal() will always be false (also Name will always be empty). | |
353 | You no longer need the !IsCOMDAT part; I don't think weak externals can be comdat members. | |
COFF/SymbolTable.cpp | ||
36 |
I don't share that opinion (maybe I'm biased by being familiar with the code), but I have no strong objections to doing it this way. I guess if I felt it necessary to make this change, I would do it in a separate commit, but up to you. | |
COFF/Symbols.cpp | ||
42 | We should never see bitcode symbols with empty names, so you can cast here. | |
COFF/Symbols.h | ||
92 | *symbol | |
103 | Can you remove the explicit and the default argument? | |
128 | Mention that only internal symbol names are lazily loaded. |
Implemented the improvements suggested by @pcc. The big ones are:
- set IsUsedInRegularObject in SymbolTable instead of InputFiles
- no longer call hasDLLExportStorageClass
- removed IsCOMDAT check for weak external bitcode symbols
COFF/InputFiles.cpp | ||
---|---|---|
259 | What about this? | |
350โ351 | Use } else if (...) {. | |
COFF/SymbolTable.cpp | ||
363 | I noticed that there is a functional change here in how we're reading the combined LTO objects. The old code was adding the new objects to ObjectFiles and then calling parse, while the new code is calling addFile which calls parse and then reads the linker directives. Now that we're reading all linker directives up front, if you change this code to just add to ObjectFiles and parse, can you undo the runLoop change? | |
COFF/Symbols.cpp | ||
42 | You can also cast this. |
COFF/InputFiles.cpp | ||
---|---|---|
259 | Ah, I forgot to respond. This is calling the constructor, which expects us to pass values for these parameters. We have a couple of options:
Other options? What did you have in mind? | |
COFF/SymbolTable.cpp | ||
363 | Let me check that. I'm calling addFile here specifically so that we do process the linker directives, but perhaps that's actually the wrong thing to do. |
COFF/InputFiles.cpp | ||
---|---|---|
259 | I meant option 1. We normally add comments alongside the arguments where it is unclear from context what they mean, i.e: /*Name=*/"" and /*IsExternal=*/false. |
- don't process linker directives from object files produced by LTO; undo Driver loop change
- changed else { if (...) to else if (...) {
- replaced named parameters with constants and comments
Mostly looks good, except for one last thing.
COFF/Symbols.cpp | ||
---|---|---|
42 | What about this? | |
test/COFF/lto-parallel.ll | ||
8 | Sorry, I didn't notice this before. This test expects there to be 2 parallel LTO codegen paritions. Your change changes the semantics of /opt:lldltojobs to mean the number of ThinLTO jobs rather than the number of parallel LTO codegen partitions. I think the change in semantics is fine, it's what we've adopted in the other linkers. But it means that we aren't testing the parallel LTO codegen feature correctly here. Can you introduce a new flag /opt:lldltopartitions which controls Config->LTOPartitions, and then change this test to use it? Once you've done that, the CHECK lines should look roughly like they did before (except for the name change). |
COFF/LTO.h | ||
---|---|---|
11โ17 | Nit: s/ELF/COFF/g here. |
COFF/InputFiles.cpp | ||
---|---|---|
357 | Instead of collecting the weak aliases into a vector and looping over them, you can call checkAndSetWeakAlias directly from here I think. |
- added lldltopartitions option
- use cast instead of dyn_cast for file in getName
- changed to work with updated D29365
moved calls to checkAndSetWeakAlias out of the loop in BitcodeFile::parse and changed "COFF" to "ELF" in LTO.h
You don't seem to set a value to that variable in this patch. Does it work?