Share the ValueMap for different mapping context to avoid scheduling the
same value twice.
This doesn't work for comdat yet.
Differential D99546
[llvm][ValueMap] Share the value map between MappingContext steven_wu on Mar 29 2021, 3:35 PM. Authored by
Details
Share the ValueMap for different mapping context to avoid scheduling the This doesn't work for comdat yet.
Diff Detail
Unit Tests Event TimelineComment Actions @pcc @tejohnson I tried to fix the assertion failure shown in the test case (reduced from actual code generated from swift compiler) by simplifying the ValueMapper but I couldn't quite get the comdat tests to pass. I think comdat is relying on some of the indirect symbol that doesn't appear in the ValueMap for the normal context but I couldn't quite figure out how that works. Any suggestion? Comment Actions Note this somewhat convoluted mapping context code was added in f0d73f95c15f909c6034f1735632695248bb75a8 to deal with comdats, but it has been long enough I don't remember precisely why the comdat tests failed without it. It'd be great to simplify this if we can...
Comment Actions Ping. I still couldn't figure the algorithm used here and what semantics does the comdat have. Need some help here. Comment Actions @pcc or @tejohnson (or maybe @MaskRay from the lld side), any chance you could help explain why the testcases llvm/test/Linker/comdat16.ll and llvm/test/LTO/Resolution/X86/comdat16.ll rely on a separate value map? Or do you have a suggestion for someone else that could help? Restating the context, maybe with a bit more detail:
void Mapper::scheduleMapGlobalInitializer(GlobalVariable &GV, Constant &Init, unsigned MCID) { assert(AlreadyScheduled.insert(&GV).second && "Should not reschedule");
I don't remember much about this code, although I'm trying to page it back in; IIRC, when I broke the recursion I tried removing the second value map; that broke these same testcases, and I created this "mapping context" ID in order to make progress, but I'm not sure I fully investigated what it was for. I'm hoping someone that knows more about comdats can help us piece together the motivation here: why do comdats need a separate value map for aliases? (Maybe there's even a simpler/cleaner way to model the requirement.) Comment Actions A bit more information is that the GV that triggers the assertion now is being materialized from both Global and Local (Indirect?) Materializer because of the complexity of the GV and Alias in the module. On the other side, a non-assert build of the linker doesn't produce any invalid or broken module just because how the materializer is written right now will ignore a GV that is already initialized, even though it is scheduled to be initialized twice.
Comment Actions @steven_wu , I think this bug is a dual of http://reviews.llvm.org/D20586, and the correct fix is something like: --- a/llvm/lib/Linker/IRMover.cpp +++ b/llvm/lib/Linker/IRMover.cpp @@ -602,20 +602,24 @@ Value *IRLinker::materialize(Value *V, bool ForIndirectSymbol) { return New; } - // When linking a global for an indirect symbol, it will always be linked. - // However we need to check if it was not already scheduled to satisfy a - // reference from a regular global value initializer. We know if it has been - // schedule if the "New" GlobalValue that is mapped here for the indirect - // symbol is the same as the one already mapped. If there is an entry in the - // ValueMap but the value is different, it means that the value already had a - // definition in the destination module (linkonce for instance), but we need a - // new definition for the indirect symbol ("New" will be different. - if (ForIndirectSymbol && ValueMap.lookup(SGV) == New) + // Decide whether the body should be linked, which is always true when + // linking a global for an indirect symbol. + if (!ForIndirectSymbol && !shouldLink(New, *SGV)) return New; - if (ForIndirectSymbol || shouldLink(New, *SGV)) - setError(linkGlobalValueBody(*New, *SGV)); + // If the global is being linked for an indirect symbol, it may have already + // been scheduled to satisfy a regular symbol. Similarly, a global being linked + // for a regular symbol may have already been scheduled for an indirect + // symbol. Check for these cases by looking in the dual value map and + // confirming the same value has been scheduled. If there is an entry in the + // ValueMap but the value is different, it means that the value already had a + // definition in the destination module (linkonce for instance), but we need + // a new definition for the indirect symbol ("New" will be different). + if ((ForIndirectSymbol && ValueMap.lookup(SGV) == New) || + (!ForIndirectSymbol && IndirectSymbolValueMap.lookup(SGV) == New)) + return New; + setError(linkGlobalValueBody(*New, *SGV)); return New; } I'll post an alternate patch in a moment. Others: it'd still be useful to better understand why two value maps are needed for comdats. It'd be great to get that documented in the code. Comment Actions I'm afraid I'm not going to be much help, at least not without more investigation. The 2 comdat tests mentioned above (I think the second one is supposed to be llvm/test/LTO/Resolution/X86/comdat.ll?) were added after the separate value map for aliases already existed, and weren't related to any value mapper changes afaict. I just did some archeology and found that the separate map exists as far back as when the IRMover code was introduced by @respindola in 2015, being split from older IR linking code, which also seems to have had a separate value map for aliases (didn't look any further back though, the code was structured quite differently). In what way do they break with this change?
Comment Actions Yeah, I noticed that too — this test seems to incidentally catch a problem. The breakage is that comdat $c2 does not get linked in. Comment Actions I agree abandoning the change makes sense, but it would be great if we could somehow collectively understand (and document) why we need a second ValueMap for comdats... |
This seems like enough to reproduce: