The linker has some implicit contract on some sections, and
promoting/renaming is breaking it.
Details
Diff Detail
Event Timeline
I think this is reasonable.
If it is ever a significant limitation we should investigate changing the representation or special casing some sections.
lib/Transforms/Utils/FunctionImportUtils.cpp | ||
---|---|---|
67 | What happens when a reference to this GVar is imported to another module? Note that this change will prevent the renaming and promotion in the exporting module, but will still do so when importing the symbol (e.g. see getName() and other callsites below). I don't think that it can be addressed by simply preventing the renaming/promotion on the importing side as well. For one, isn't it possible that there will be name clashes between same-named local variables imported from different modules when they have assigned sections? Second, while perhaps unlikely, the local could presumably be eliminated in its original module if it is no longer needed there through some type of optimization, unless that same optimization happened on the imported copy. (As an aside, I think there is a related issue with the preceding handling for local const variables due to the change not to import GVar defs - we expect to use the imported copy but won't have one, so the non-promoted, non-renamed imported ref won't be satisfied.) |
Just like with functions that have inline assembly, functions with the opt_none attribute, functions with the opt_size attributes (etc.), we need to represent this information in the summary.
We talked about having a "flags" fields that could be compressed with the existing linkage field. (for section I plan on having a single bit "hasSection" right now).
Thoughts?
I'm hitting a wall: even with adding the "hasSection" flag to the summary: no renaming a function breaks the model.
Module A { define internal foo() { ... }; define bar() { foobar() }; } Module B { define foo() { ... }; define foobar() { foo() }; }
If we leave internal foo() internal in module A and we don't rename it, we have a problem when we want to import foobar() because of the collision on the call to foo() in module B.
There is nothing in the index that can help to address this situation. We reach a point were the only conservative strategy I can imagine is to detect the collision "late" (i.e. during the FunctionImporter, in the "backend" job) and do not perform the import.
This is annoying because it prevents future optimizations (like "moving" a global for instance).
Something that can be done is an intermediate solution: allow to rename globals that have a section, but not allow to change their linkage. This should work for me.
However it won't help for functions in llvm.used for example, since these can't be renamed right now.
How frequent/important are the modules where specific sections are used? Another conservative possibility is to simply prevent importing into any module that contains a GV with local linkage and the hasSection flag set (which is info available in the index).
Also, I assume the plan is to use the proposed hasSection flag to prevent importing any function containing a reference to a local with that flag set (to avoid introducing a cross-module reference requiring promotion). If renaming is ok, but not the promotion, then a local function with hasSection can itself be imported, if we keep the imported copy local (in which case functions containing references to it can also be imported).
How frequent/important are the modules where specific sections are used?
Very frequent: NSConstantString (objc) is using a specific section all the time.
(I can't bootstrap clang on OS X without the current diff applied for instance, the ThinLTO link does not succeed)
Also, I assume the plan is to use the proposed hasSection flag to prevent importing any function containing a reference to a local with that flag set (to avoid introducing a cross-module reference requiring promotion).
Yes.
If renaming is ok, but not the promotion, then a local function with hasSection can itself be imported, if we keep the imported copy local (in which case functions containing references to it can also be imported).
Probably OK most of the time.
So the only way I could foresee to make it work requires collision detection in names based on the summary. Of course it will require to some changes in the naming scheme (the complication comes from using hashes...).
The hash function is currently hash(symbol_name, linkage, module_id) ; this is the GUID we store in the bitcode *and* we use to index in the global table in the combined index.
We can instead compute another GUID this way: hash(hash(symbol_name), hash(linkage, module_id)) .
During the FE phase, we generate in the summary only this part: hash(symbol_name) ; this is what we store in the bitcode file. We already have module_id and linkage propagated somehow.
In the combined index in memory, we store hash(symbol_name) in the summary itself (GlobalValueSummary), and compute the finale GUID hash(hash(symbol_name), hash(linkage, module_id)) for the key in the global table.
That way we can always lookup the summary of all the symbol in a module and compare hash(symbol_name) to detect the collision.
I think a variant of this should work. The current hashing scheme is actually (IsLocalLinkage)?hash(SourceFileName+":"+symbol_name):hash(symbol_name)
We don't currently have the SourceFileName in the combined index. We could however augment the module path string table to include that for use in computing the GUID/hash key into the combined index table for locals. Then use your idea of keeping a hash of just the symbol name in the combined index summary for comparison, although we only really need to do that in the case where IsLocalLinkage==true (if !IsLocalLinkage then its GUID/hash used as key in the index can be used for the comparison).
Peter was thinking about changing the naming scheme right? Instead of using the SourceFileName we would hash the names of all the public globals in a module IIRC. With a minor variant, by hashing the hash of each global name we don't need any other information in the bitcode to build the module_hash.
although we only really need to do that in the case where IsLocalLinkage==true (if !IsLocalLinkage then its GUID/hash used as key in the index can be used for the comparison).
Yes but any suggestion on how to achieve having this field optional? Unless having an extra layer in the hierarchy LocalFunctionSummary inheriting from FunctionSummary and LocalAliasSummary inheriting from AliasSummary etc. But this does not seem like the right design (won't scale with other fields).
Another way may be to have a variable size class that we malloc we extra size for the optional fields.
(Following up from IRC discussion). This is different than the way we do the promotion naming. I don't think this change affects or is related to the promotion naming scheme.
although we only really need to do that in the case where IsLocalLinkage==true (if !IsLocalLinkage then its GUID/hash used as key in the index can be used for the comparison).
Yes but any suggestion on how to achieve having this field optional? Unless having an extra layer in the hierarchy LocalFunctionSummary inheriting from FunctionSummary and LocalAliasSummary inheriting from AliasSummary etc. But this does not seem like the right design (won't scale with other fields).
Another way may be to have a variable size class that we malloc we extra size for the optional fields.
I was thinking more about the combined index format it bitcode, where we would want to emit as an optional record.
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
86 ↗ | (On Diff #54779) | s/globals/global/ |
99 ↗ | (On Diff #54779) | DefinedGVSummaries only contains entries for the module we are importing into, not from. So I don't think this is the right check. In fact, won't this cause this routine to always return true for anything not already in this module and therefore exactly when we would be introducing a cross-module reference? Looks like your new test is checking that we don't import a function that has a section, but needs to be extended to check that we don't import a function containing references to something that has a section. |
106 ↗ | (On Diff #54779) | Empty comment. |
113 ↗ | (On Diff #54779) | s/globals/global/ Can you add a FIXME here that we could import this if we do so as a copy (e.g. don't promote an imported local function). And that if we do so we could also import functions containing references to it? |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | You're right that the logic seems wrong. But I'm not sure your are describing what the test is doing: it is checking that we don't import reference_gv_with_section which is a function that does not have a section attached, but reference an internal variable with a section. |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | You're right, I looked too quickly at the test to see why it didn't fail. I'm wondering why it doesn't import reference_gv_with_section (incorrectly), since I think valueCanBeExternallyReferencedFromModule should return true since var_with_section wouldn't be in the DefinedGVSummaries for the importing module. |
Hitting a wall because I was trying to plumb `const StringMap<std::map<GlobalValue::GUID, GlobalValueSummary *>> &
ModuleToDefinedGVSummaries()` to get the definedGV for the module we import from, but could not because of `void llvm::ComputeCrossModuleImportForModule`.
The latter was introduced in D18945, for which we had some discussions.
I think we're moving in a direction where we can remove this anyway right?
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | Many tests were in fact broken with this patch as it was before. |
Yes with the changes I am working on to use individual index files for distributed backend compilations, and a change to compute the imports from the gold-plugin and pass them down to the in-process threads, we shouldn't need that anymore. However, I don't think you really need the defined GV info for the module we import from, see below.
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | Can't you simply reverse this condition? I.e. if it is in the DefinedGVSummaries for the module doing the importing, you aren't going to be externally referencing it by importing the referring function. ISTM that it is sufficient to check if var_with_section is not already defined in the importing module, because if not then you are adding a cross-module reference into the importing module. If the symbol in question is not defined in the same module as the function being imported, then presumably it is already externally referenced and canBeExternallyReferenced() should return true anyway. So I think this can just be "if (Summary != DefinedGVSummaries.end()) return true;", and perhaps the name of the routine should change to something like "valueCanBeExternallyReferencedFromImportingModule". |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | I'm not sure to understand (but haven't had my coffee this morning yet...). If you look at the test, we want to import function @reference_gv_with_section which references @var_with_section which is internal (so can't be externally referenced). And this is what I'm checking. Can you illustrate the logic you're explaining above using the test as an example? |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | Sure. When we invoke eligibleForImport on @reference_gv_with_section, it will first check whether its own summary canBeExternallyReferenced() which is true (@reference_gv_with_section is not itself a local that has a section), so then it invokes this routine to see whether the referenced @var_with_section can be externally referenced. Because @var_with_section is not defined in the module we are importing into, it is not in the DefinedGVSummaries list. So currently with your patch, valueCanBeExternallyReferencedFromModule will incorrectly return true since it isn't in the list. With my proposed change, which is: if (Summary != DefinedGVSummaries.end()) return true; we will not return true here, since the summary for @var_with_section is not in the list (not defined in the importing module). Then it will fall through to check canBeExternallyReferenced(), which should return false given the that it is local and therefore needsRenaming(), and because it hasSection flag. |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | Interestingly it seemed obvious when I reached the first parenthesis of the first sentence now. I don't know why I went with something so complicated... |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | Somehow you convinced me above, but I'm confused again now. What you write is almost what I attempted originally until I wrote that I hit a wall and went for the current patch. It is not clear to me how you check anything on var_with_section without its summary. And you will *not* find it if DefinedGVSummaries is about the importing module and not the exporting one. |
lib/Transforms/IPO/FunctionImport.cpp | ||
---|---|---|
99 ↗ | (On Diff #54779) | Oh I see your concern - because that is how you are getting the summary to check. Why don't you just get it the same way selectCallee gets it for a candidate call? I.e. access it from the Index via Index.findGlobalValueInfoList(GUID). If there is more than one then you could look for the one in the same module as the referring summary (which would need to be passed in here). That's where you could return false if none is found in the referring summary's module (since it already must be an external access). |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
377 ↗ | (On Diff #55092) | Note: I don't use this method anymore, but you need it in your patch. I guess you should take it in your patch and I'll remove it from here. |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
377 ↗ | (On Diff #55092) | Will do |
What happens when a reference to this GVar is imported to another module? Note that this change will prevent the renaming and promotion in the exporting module, but will still do so when importing the symbol (e.g. see getName() and other callsites below).
I don't think that it can be addressed by simply preventing the renaming/promotion on the importing side as well. For one, isn't it possible that there will be name clashes between same-named local variables imported from different modules when they have assigned sections? Second, while perhaps unlikely, the local could presumably be eliminated in its original module if it is no longer needed there through some type of optimization, unless that same optimization happened on the imported copy.
(As an aside, I think there is a related issue with the preceding handling for local const variables due to the change not to import GVar defs - we expect to use the imported copy but won't have one, so the non-promoted, non-renamed imported ref won't be satisfied.)