This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Handle empty summaries during internalization
Needs ReviewPublic

Authored by tejohnson on Oct 6 2016, 1:12 PM.

Details

Reviewers
mehdi_amini
Summary

Fix a problem I discovered when testing a fix for PR30610. The
callback MustPreserveGV in llvm::thinLTOInternalizeModule will assert if
there is no summary for a GV. We were handling this in the LTO API by
guarding the invocation of thinLTOInternalizeModule, but it was exposed
by llvm-lto which uses the legacy libLTO interface. Moved the guard into
thinLTOInternalizeModule and add a test.

Event Timeline

tejohnson updated this revision to Diff 73842.Oct 6 2016, 1:12 PM
tejohnson retitled this revision from to [ThinLTO] Handle empty summaries during internalization.
tejohnson updated this object.
tejohnson added a reviewer: mehdi_amini.
tejohnson added a subscriber: llvm-commits.

Actually, the description of the test is wrong - the test I added ensures that we still get this correct for llvm-lto2 (the new LTO API). Let me add one for llvm-lto which probably needs a -thinlto-action=internalize case.

tejohnson updated this revision to Diff 73844.Oct 6 2016, 1:28 PM

Add an llvm-lto based test as well

mehdi_amini edited edge metadata.Oct 7 2016, 1:39 PM

The model I have in mind with the thin-link is that we operate on the combined index which represent all the files we're processing accurately. The linker provides us with the external references (among other informations).

As I mentioned to you in person the other day, this model is broken by modules with empty summaries: some of the combined index transformations may be wrong (miscompile) because of that.

We need to rework a bit the model for these modules...

lib/Transforms/IPO/FunctionImport.cpp
554

Can you get there with an empty module? (As a module that does not define any IR Global?)

The model I have in mind with the thin-link is that we operate on the combined index which represent all the files we're processing accurately. The linker provides us with the external references (among other informations).

As I mentioned to you in person the other day, this model is broken by modules with empty summaries: some of the combined index transformations may be wrong (miscompile) because of that.

We need to rework a bit the model for these modules...

Yes I was thinking about this some more after our conversation Wed. I think the right way to go is to do something like we do now with the HasSection flag which prevents importing any references - we can either add another bit or generalize that one - then set it for all GVs. That way they can be in the reference graph but not be imported.

But this change is needed for now to keep the current model working and to be able to commit D25359 which depends on this.

lib/Transforms/IPO/FunctionImport.cpp
554

Sure, but then we wouldn't call back into MustPreserveGV so it is fine to exit here early.

But this change is needed for now to keep the current model working

"Keep the current model working" isn't exact: I believe it is broken since the recent addition of "false empty summaries" for module.
So have the current model working again now, you would need to revert this.

But this change is needed for now to keep the current model working

"Keep the current model working" isn't exact: I believe it is broken since the recent addition of "false empty summaries" for module.
So have the current model working again now, you would need to revert this.

I was referring to the model of preventing importing and renaming by suppressing the summary. Which has other issues as discussed, but this change at least makes the current handling uniform.

Ping. The fix in D25359 (which you already LGTM'd) depends on this change to make the handling of empty summaries consistent between llvm-lto and llvm-lto2 (i.e. ThinLTOCodeGenerator and the new LTO API).

The other alternative is to commit D25359 without any llvm-lto based tests, but that seems less than ideal.

Sorry, forgot about this patch. My main concern last week was that I fear we're piling-up on top of a fragile foundation: that'd be building technical debt. So I was trying to narrow down how to make sure we're not on a slippery slope with the current situation.

I was referring to the model of preventing importing and renaming by suppressing the summary.

I don't know what you referred to?

Sorry, forgot about this patch. My main concern last week was that I fear we're piling-up on top of a fragile foundation: that'd be building technical debt. So I was trying to narrow down how to make sure we're not on a slippery slope with the current situation.

I was referring to the model of preventing importing and renaming by suppressing the summary.

I don't know what you referred to?

I'm referring to the model where we invoke llvm::moduleCanBeRenamedForThinLTO() to guard building a summary when we can't rename (e.g. due to inline assembly, or with D25359, due to module level assembly)

But my issue is not what moduleCanBeRenamedForThinLTO() does right now, but rather what do we do when it returns "false". These modules previously didn't go through the ThinLTO pipeline, so it was fine. Now they have an empty summary and they go to the thin-link. This seems an issue to me, and I rather have a proper answer design-wise rather than patching over-and-over as bugs are reported.

But my issue is not what moduleCanBeRenamedForThinLTO() does right now, but rather what do we do when it returns "false". These modules previously didn't go through the ThinLTO pipeline, so it was fine. Now they have an empty summary and they go to the thin-link. This seems an issue to me, and I rather have a proper answer design-wise rather than patching over-and-over as bugs are reported.

Note that the bug that is being fixes is not due to this model. We are just using the same model to address it as with inline assembly.

Agree that a better long term solution is needed soon, but in the meantime it would be good to be able to fix the module assembly issue now, since we already use this approach for inline assembly, and also address the difference in how this case is being handled here in internalization.