This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Change ODR resolution and internalization to be index-based
ClosedPublic

Authored by tejohnson on May 16 2016, 10:09 AM.

Details

Summary

This patch changes the ODR resolution and internalization to be based on
updates to the Index, which are consumed by the backend portion of the
transformations.

It will be followed by an NFC change to move these out of libLTO's
ThinLTOCodeGenerator so that it can be used by other linkers
(gold and lld) and by ThinLTO distributed backends.

The global summary-based portions use callbacks so that the client can
determine the prevailing copy and other information in a client-specific
way. Eventually, with the API being developed in D20268, these may be
modified to use information such as symbol resolutions, supplied by the
clients to the API.

Diff Detail

Repository
rL LLVM

Event Timeline

tejohnson updated this revision to Diff 57365.May 16 2016, 10:09 AM
tejohnson retitled this revision from to [ThinLTO] Refactor ODR resolution and internalization (NFC).
tejohnson updated this object.
tejohnson added subscribers: llvm-commits, pcc.
mehdi_amini edited edge metadata.May 16 2016, 4:53 PM

A few first comments to begin with.

Thanks for doing this, I really planned to do it and you're cleaning behind me actually, wasn't it too painful to follow what was going on? I had the impression that things went a bit out-of-control with time and I was afraid the logic would be hard to refactor.

lib/LTO/LTO.cpp
55 ↗(On Diff #57365)

s/function/functions/

lib/LTO/ThinLTOCodeGenerator.cpp
366 ↗(On Diff #57365)

This call disappeared?
I can buy that you don't need it, but I rather have a first "NFC" commit that removes the call here and move the behavior in MustPreserveGV, and then rebase this patch on top of it so the refactoring here would be easier to follow.

793 ↗(On Diff #57365)

It seems to me that the 30 lines block above should be refactored with the block starting line 533.

It is not clear to me why you need to have this done as a global step, compared to the existing per-thread ResolveODR?

lib/Transforms/IPO/FunctionImport.cpp
536 ↗(On Diff #57365)

It it not clear if we still need to use the internalize util, is it temporary? What does it do right now other than iterating over the globals and calling the call-back?

In D20290#431538, @joker.eph wrote:

A few first comments to begin with.

Thanks for doing this, I really planned to do it and you're cleaning behind me actually, wasn't it too painful to follow what was going on? I had the impression that things went a bit out-of-control with time and I was afraid the logic would be hard to refactor.

No problem, it helps me make forward progress on some other ThinLTO improvements. Wasn't too difficult to follow. Anyway, I reviewed the code, so I had better be able to follow it. =)

lib/LTO/LTO.cpp
55 ↗(On Diff #57365)

ok

lib/LTO/ThinLTOCodeGenerator.cpp
366 ↗(On Diff #57365)

We don't need it since the AsmUndefinedRefs is checked by the new MustPreserveGV callback. But the change was actually instigated because UpdateCompilerUsed is in lib/LTO, and I was moving this function to lib/Transforms/IPO, which would have caused a circular dependence.

793 ↗(On Diff #57365)

Yeah, let me refactor those together.

Needs to be a global step so that distributed builds can use the same approach, communicating the results of this analysis to backends via the index.

lib/Transforms/IPO/FunctionImport.cpp
536 ↗(On Diff #57365)

Perhaps we could just set the linkage directly for internalized symbols (that's what gold-plugin does for LTO internalization). I was trying to stick closer to the mechanism being used by ThinLTOCodeGenerator.

mehdi_amini added inline comments.May 16 2016, 10:15 PM
lib/Transforms/IPO/FunctionImport.cpp
536 ↗(On Diff #57365)

Yeah I'm really in favor of moving towards a unified model (i.e. we don't necessarily need the internalize pass, but when I looked at it a few months ago it seemed to me that it was doing some tricky stuff).
That said I'm in favor of moving step-by-step and keeping NFC refactoring separated. For this particular case, if the direction is to get rid of it, a FIXME would seem appropriate to me.

tejohnson added inline comments.May 17 2016, 8:03 AM
lib/Transforms/IPO/FunctionImport.cpp
536 ↗(On Diff #57365)

Right, it is doing a bunch of checking, so I'd rather not muck with that in this patch. Added a FIXME.

tejohnson updated this revision to Diff 57481.May 17 2016, 8:03 AM
tejohnson edited edge metadata.
  • Address comments
mehdi_amini added inline comments.May 19 2016, 8:27 AM
lib/LTO/LTO.cpp
90 ↗(On Diff #57481)

This does not sound correct to me.

tejohnson added inline comments.May 19 2016, 8:32 AM
lib/LTO/LTO.cpp
90 ↗(On Diff #57481)

This is needed so that it is not internalized again in thinLTOInternalizeModule (which runs after promotion).

mehdi_amini added inline comments.May 19 2016, 8:41 AM
lib/LTO/LTO.cpp
90 ↗(On Diff #57481)

I see...
The issue is that this function is called *internalize* and is doing *promotion*.

tejohnson added inline comments.May 19 2016, 8:48 AM
lib/LTO/LTO.cpp
90 ↗(On Diff #57481)

Good point - how about if I change the name of this and the caller to something like "thinLTOHandleExported(GUID|InIndex)" ?

mehdi_amini added inline comments.May 19 2016, 8:52 AM
lib/LTO/LTO.cpp
90 ↗(On Diff #57481)

changing the name would be OK to me, whatever you figure is best.
HandleExported does not sound very explicit to me, what about something like resolveLinkage or InternalizeAndPromote?

Per IRC discussion, I am going to move the functions back into ThinLTOCodeGenerator.cpp for this patch. It will be followed by an NFC commit to move the functions to the new locations.

lib/LTO/LTO.cpp
90 ↗(On Diff #57481)

"resolveLinkage" is too general, especially since there is separate handling for linkonce/weak linkages. I like "InternalizeAndPromote". Will make that change.

tejohnson updated this revision to Diff 57817.May 19 2016, 10:27 AM
  • s/thinLTOInternalize/thinLTOInternalizeAndPromote/(GUID|InIndex)
  • Move functions back into ThinLTOCodeGenerator.cpp
tejohnson retitled this revision from [ThinLTO] Refactor ODR resolution and internalization (NFC) to [ThinLTO] Change ODR resolution and internalization to be index-based.May 19 2016, 10:31 AM
tejohnson updated this object.

Cannot link llvm-tblgen after this change, I get some strange errors like:

ld: warning: direct access in function 'void std::__1::vector<llvm::CGIOperandList::OperandInfo, std::__1::allocator<llvm::CGIOperandList::OperandInfo> >::__push_back_slow_path<llvm::CGIOperandList::OperandInfo const&>(llvm::CGIOperandList::OperandInfo const&&&)' from file '/tmp/thinlto.o/21.o' to global weak symbol 'std::__1::__split_buffer<llvm::CGIOperandList::OperandInfo, std::__1::allocator<llvm::CGIOperandList::OperandInfo>&>::~__split_buffer()' from file 'utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/CodeGenInstruction.cpp.o' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.

and

ld: internal error: atom not found in symbolIndex(__ZNSt3__1plIcNS_11char_traitsIcEENS_9allocatorIcEEEENS_12basic_stringIT_T0_T1_EEPKS6_RKS9_) for architecture x86_64

Something is not NFC here...

Not sure I totally figured, but here are two things to fix

lib/LTO/ThinLTOCodeGenerator.cpp
387 ↗(On Diff #57817)

I think one thing is here it should be return Linkage != GlobalValue::InternalLinkage;

(but that's not enough to make the link passing)

539 ↗(On Diff #57817)

ModuleIdentifier is ignored here, this is causing an issue for thinLTOInternalizeAndPromoteInIndex.

thinLTOInternalizeAndPromoteInIndex should not process the full index but only DefinedGlobals

In D20290#435241, @joker.eph wrote:

Cannot link llvm-tblgen after this change, I get some strange errors like:

ld: warning: direct access in function 'void std::__1::vector<llvm::CGIOperandList::OperandInfo, std::__1::allocator<llvm::CGIOperandList::OperandInfo> >::__push_back_slow_path<llvm::CGIOperandList::OperandInfo const&>(llvm::CGIOperandList::OperandInfo const&&&)' from file '/tmp/thinlto.o/21.o' to global weak symbol 'std::__1::__split_buffer<llvm::CGIOperandList::OperandInfo, std::__1::allocator<llvm::CGIOperandList::OperandInfo>&>::~__split_buffer()' from file 'utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/CodeGenInstruction.cpp.o' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.

and

ld: internal error: atom not found in symbolIndex(__ZNSt3__1plIcNS_11char_traitsIcEENS_9allocatorIcEEEENS_12basic_stringIT_T0_T1_EEPKS6_RKS9_) for architecture x86_64

Something is not NFC here...

The weak issue seems like it must be related to the linkonce/weak resolution. Let me see if I can repro this here. Meanwhile I will also fix the issues you found, see responses below.

lib/LTO/ThinLTOCodeGenerator.cpp
387 ↗(On Diff #57817)

Hmm, yes. I think I was assuming that this would only be called for things that are currently external, but that doesn't seem to be the case. Should actually be:

return !GlobalValue::isLocalLinkage(Linkage);

to catch private as well.

539 ↗(On Diff #57817)

Actually we do want to process the whole index - I can't recall now why I put this here rather than in the caller where it also calls resolveWeakForLinkerInIndex for the entire index. This is bad because multiple threads can be modifying the same index!

tejohnson updated this revision to Diff 57967.May 20 2016, 12:32 PM

Fix GV preservation check and move internalization/promotion for global
analysis.

In D20290#435241, @joker.eph wrote:

Cannot link llvm-tblgen after this change, I get some strange errors like:

ld: warning: direct access in function 'void std::__1::vector<llvm::CGIOperandList::OperandInfo, std::__1::allocator<llvm::CGIOperandList::OperandInfo> >::__push_back_slow_path<llvm::CGIOperandList::OperandInfo const&>(llvm::CGIOperandList::OperandInfo const&&&)' from file '/tmp/thinlto.o/21.o' to global weak symbol 'std::__1::__split_buffer<llvm::CGIOperandList::OperandInfo, std::__1::allocator<llvm::CGIOperandList::OperandInfo>&>::~__split_buffer()' from file 'utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/CodeGenInstruction.cpp.o' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.

and

ld: internal error: atom not found in symbolIndex(__ZNSt3__1plIcNS_11char_traitsIcEENS_9allocatorIcEEEENS_12basic_stringIT_T0_T1_EEPKS6_RKS9_) for architecture x86_64

Something is not NFC here...

The weak issue seems like it must be related to the linkonce/weak resolution. Let me see if I can repro this here. Meanwhile I will also fix the issues you found, see responses below.

I couldn't reproduce this locally with gold. I did fix the issues you pointed out. Can you give it a try? If it doesn't fix the issue, could you send me more info about the linkage types we started and ended with for the flagged symbols?

Can you rebase the patch as well?

lib/LTO/ThinLTOCodeGenerator.cpp
129 ↗(On Diff #57967)

I hit this assertion, extern template are emitted with available externally apparently.

Removing the assert, I can link clang

In D20290#436122, @joker.eph wrote:

Removing the assert, I can link clang

Great. I assume you have it return null instead? I guess it doesn't matter that we would put null or whatever value in the PrevailingCopies array since it won't be looked at (no copies are weak for linker). Will make that change and rebase.

tejohnson updated this revision to Diff 58038.May 21 2016, 7:24 AM
  • Remove assert that affected available_externally extern templates in ld64 clang bootstrap. (Rebase coming next)
mehdi_amini accepted this revision.May 23 2016, 2:07 PM
mehdi_amini edited edge metadata.

LGTM, with a few minor comments.
(llvm-tblgen is linked bit-to-bit identical with this patch, still building clang for safety)

lib/LTO/ThinLTOCodeGenerator.cpp
337 ↗(On Diff #58039)

Minor preference for else if

339 ↗(On Diff #58039)

Can we internalize other than ExternalLinkage?

845 ↗(On Diff #58039)

The fact that the index is now modified as a side effect should be mentioned in the function description. I'm not sure: can it be reused for internalizing another module?

874 ↗(On Diff #58039)

Should we move this check a few lines earlier and early return?

962 ↗(On Diff #58039)

Add in the comment that the *only* reason to have this map is to support the cache.
And I'd also add FIXME: we should be able to compute the caching hash for the entry based on the index, and nuke this map.

This revision is now accepted and ready to land.May 23 2016, 2:07 PM

See question below about linkage types for internalization, will hold off on submit for now.

lib/LTO/ThinLTOCodeGenerator.cpp
337 ↗(On Diff #58039)

Done.

339 ↗(On Diff #58039)

Good question. Looking at the rest of the linkage types they are either available externally (skipped by internalizeModule), various types of weak (including linkonce), and other special types such as common and appending.

I'm not sure about the weak/linkonce types - I don't see anything in internalizeModule or elsewhere in HEAD that prevents internalizing these symbols, but I would have thought that internalizing them could be problematic since they have weak linkage and some can be overridden by external definitions (isInterposableLinkage). I assume that with ld64 any that need to be preserved because they are overridden are added to the preserved set? In gold we know which one is prevailing (which is used to guide current internalization decisions in gold-plugin) and can use that info to guide the internalization via the callback when I call this from there.

I removed the "if (GlobalValue::isExternalLinkage(S->linkage()))" guard and the regression tests run fine - do you want to try your bootstrap with that change?

845 ↗(On Diff #58039)

Done here (and in the description of promote() which modifies the index for resolved weak symbols).

It would be reusable for another module - since the analysis is global on the index the decisions should be the same for all modules. The index updates would happen the first time through here.

874 ↗(On Diff #58039)

Yes, done.

962 ↗(On Diff #58039)

Done

tejohnson added inline comments.May 24 2016, 8:58 AM
lib/LTO/ThinLTOCodeGenerator.cpp
339 ↗(On Diff #58039)

Actually, changed this to "if (!GlobalValue::isLocalLinkage(S->linkage()))" so we don't change private -> internal.

tejohnson updated this revision to Diff 58242.May 24 2016, 8:59 AM
tejohnson edited edge metadata.
  • Address comments.
mehdi_amini added inline comments.May 24 2016, 9:13 AM
lib/LTO/ThinLTOCodeGenerator.cpp
339 ↗(On Diff #58242)

Yes in ld64 and regular monolithic LTO, we are internalizing linkonce_ODR if they are not exported.
Can weak functions defined in the main binary be overriden at runtime?
It may be a platform specific question...

tejohnson added inline comments.May 24 2016, 9:23 AM
lib/LTO/ThinLTOCodeGenerator.cpp
339 ↗(On Diff #58242)

Agree it is going to be platform specific. So then I think that the new version of the patch is better, and each linker will need to handle appropriately (e.g. here by adding to preserveSymbol(), in gold by using the callback to query gold's resolution, and eventually with pcc's API via the resolution info provided to the API).

Do you want to retest this new version before I commit?

mehdi_amini added inline comments.May 24 2016, 9:33 AM
lib/LTO/ThinLTOCodeGenerator.cpp
339 ↗(On Diff #58242)

No it should be fine (and I have a bot that will bootstrap clang immediately anyway)

This revision was automatically updated to reflect the committed changes.