This is an archive of the discontinued LLVM Phabricator instance.

Move summary dead stripping before regular LTO and record results in the combined summary
ClosedPublic

Authored by eugenis on May 26 2017, 4:52 PM.

Details

Summary

Add a bit to GVFlags to tell ThinLTO backends which values are dead.
Ignored by backends for now.

Diff Detail

Repository
rL LLVM

Event Timeline

eugenis created this revision.May 26 2017, 4:52 PM
pcc edited edge metadata.May 26 2017, 5:20 PM

Can we repurpose the LiveRoot bit to mean Live?

It would also be nice to change computeDeadSymbols to use the Live bit instead of creating a set, but I guess that can be done separately.

In D33615#766324, @pcc wrote:

Can we repurpose the LiveRoot bit to mean Live?

As in give it a different meaning in combined summary? Looks like we can do that. We'll still need to increase the version number because existing objects have LiveRoot==false for most symbols.

pcc added a comment.May 26 2017, 5:37 PM
In D33615#766324, @pcc wrote:

Can we repurpose the LiveRoot bit to mean Live?

As in give it a different meaning in combined summary? Looks like we can do that. We'll still need to increase the version number because existing objects have LiveRoot==false for most symbols.

Right, and upgrade old summaries to Live == true.

mehdi_amini edited edge metadata.May 26 2017, 10:27 PM

Add a bit to GVFlags to tell ThinLTO backends which values are dead.

Why?

pcc added a comment.May 26 2017, 10:32 PM

Add a bit to GVFlags to tell ThinLTO backends which values are dead.

Why?

See Evgeniy's RFC thread, in particular
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113298.html

In D33615#766506, @pcc wrote:

Add a bit to GVFlags to tell ThinLTO backends which values are dead.

Why?

See Evgeniy's RFC thread, in particular
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113298.html

I see in this thread why you want to run the dead-stripping before regular LTO, but I don't immediately understand why you're adding these to the combined index and the serialization of it?

eugenis planned changes to this revision.May 27 2017, 12:16 AM

Ouch. Thank you for bringing this to my attention.
Looking at the current cfi-icall patch, this combined summary / serialization code is dead indeed.
The patch somehow evolved to make the isLive() check in ThinLTO backend redundant without me noticing.
I'll revert the summary parts of this change, probably next week.

eugenis updated this revision to Diff 100769.May 30 2017, 2:15 PM

In fact, we already have a bit in GVFlags that's not used in ThinLTO backend: LiveRoot, and we need a way to pass this data to RegularLTO pass anyway, so let's use that. I've renamed it to just "Live"; in per-module summaries it is used to mark liveroot globals, and in combined summary it marks all live globals.

It does not have to be serialized, but since we serialize LiveRoot already, there is no regression. I can remove serialization in a separate change, but that would create a situation where part of GVFlags is lost when passed through bitcode, and it could be confusing. Perhaps that should be part of a larger refactoring that would separate data that must be passed from thin link to thin backend from module summary data.

WDYT?

mehdi_amini added inline comments.May 30 2017, 2:22 PM
include/llvm/IR/ModuleSummaryIndex.h
141 ↗(On Diff #100769)

Still unsure why do we need this? How does it relate to the need to move the dead-stripping earlier in the process?

To recap: this needs to be in combined summary, because that's the only way to pass information to LowerTypeTests in regularLTO.

Serialization is extra; I can remove it, or rather, make LiveRoot in bitcode always 0. This can be done without bumping bitcode version. Or even remove LiveRoot from bitcode and bump the version number.

To recap: this needs to be in combined summary, because that's the only way to pass information to LowerTypeTests in regularLTO.

Well you can also pass it as a set just like it is done elsewhere with DeadSymbols.
However, if we go the route of "caching" this analysis result in the summaries, then we should likely kill DeadSymbols because ComputeCrossModuleImport should be able to use the information from the summary directly.

eugenis updated this revision to Diff 100920.May 31 2017, 2:10 PM

I've killed DeadSymbols.
ComputeCrossModuleImport may be called w/o computeDeadSymbols. In that case most of the symbols will be marked as dead (at that stage Live works like LiveRoot). I've added a flag in ModuleSummaryIndex to detect this situation.

This flag is not serialized. We don't need it to be serialized, but that same is true for Live / LiveRoot. Actually, LiveRoot serialization was added in D23488 along with bitcode version bump, and, as far as I can see, it was unused from the start. I can kill the whole thing, and maybe even downgrade the version, if no one minds.

pcc added inline comments.May 31 2017, 2:20 PM
lib/Transforms/IPO/FunctionImport.cpp
454 ↗(On Diff #100920)

Do we need the LiveSymbols set any more? Can we use isLive and setLive to mark summaries as live as we go?

This flag is not serialized. We don't need it to be serialized, but that same is true for Live / LiveRoot. Actually, LiveRoot serialization was added in D23488 along with bitcode version bump, and, as far as I can see, it was unused from the start. I can kill the whole thing, and maybe even downgrade the version, if no one minds.

I'm not sure I understand this? LiveRoot was added to track things like globals marked as "llvm.used", how would this work without serialization from the compile phase to the thinlink?

include/llvm/IR/ModuleSummaryIndex.h
536 ↗(On Diff #100920)

Do not use "GC": let's not switch terminology between multiple places I think we used "Dead-Stripping" till now.

561 ↗(On Diff #100920)

If you want to go this route of making the index explicitly modal, then GlobalValueSummary::isLive() should be private and the ModuleSummaryIndex can be friended.

lib/LTO/LTO.cpp
636 ↗(On Diff #100920)

We reached a point where this patch is complex enough that I rather see it split between the change to make the live flags in the summary replacing all the sets (NFC changes) and actually moving the place where it is performed.

eugenis updated this revision to Diff 101094.Jun 1 2017, 1:22 PM

Rebased on (NFC) D33743. Added YAML serialization of GVFlags for use in tests. Added use of new flags in LowerTypeTests and WholeProgramDevirt.

Thanks for splitting out a large part of the refactoring!

include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

The addition of Linkage and NotEligibleToImport are unrelated to this change right? I feel you could commit them separately now.
(even the serialization of "Live" here isn't clear to me if it isn't something that could just be split out).

pcc added a comment.Jun 1 2017, 1:44 PM

It looks like you

include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

Would we be using Linkage and NotEligibleToImport for anything though? It seems a little odd to add fields here without any test coverage. (That's not to say that we shouldn't add the fields eventually, but I think it should be accompanied by tests, i.e. we should add a JSON summary dumper to llvm-lto2 or something and convert some tests over to use it.)

I think the serialization of Live was part of D33702, and I'd prefer it to be done as part of that change.

pcc added a comment.Jun 1 2017, 1:45 PM
In D33615#770592, @pcc wrote:

It looks like you

(sorry, disregard)

mehdi_amini added inline comments.Jun 1 2017, 1:48 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

Would we be using Linkage and NotEligibleToImport for anything though? It seems a little odd to add fields here without any test coverage.

What are we using the Yaml serialization for? I thought it was intended to be the "primary textual way of dumping summaries"? My comment was in this context: I was seeing this change as "fix missing serialization" and nothing else.

(That's not to say that we shouldn't add the fields eventually, but I think it should be accompanied by tests, i.e. we should add a JSON summary dumper to llvm-lto2 or something and convert some tests over to use it.)

Of course I expect a test for these, but don't we have a test for yaml dump? I was just thinking about using opt to generate a bitcode file with summaries and pretty-print to yaml. (haven't looked at the existing testing right now)

eugenis updated this revision to Diff 101098.Jun 1 2017, 1:55 PM

Let's start with a simpler change. We don't need YAML at all to move dead stripping before regular lto and test it.

pcc added inline comments.Jun 1 2017, 2:03 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

What are we using the Yaml serialization for? I thought it was intended to be the "primary textual way of dumping summaries"?

That is indeed the end goal, but right now it is only used as a mechanism to test that the lowertypetests and wholeprogramdevirt passes manipulate the summary index correctly.

Of course I expect a test for these, but don't we have a test for yaml dump? I was just thinking about using opt to generate a bitcode file with summaries and pretty-print to yaml. (haven't looked at the existing testing right now)

There's nothing right now that reads bitcode files and then dumps their summaries as YAML. We'd have to add it somewhere, e.g. to llvm-lto2.

eugenis added inline comments.Jun 1 2017, 2:07 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

There's nothing right now that reads bitcode files and then dumps their summaries as YAML. We'd have to add it somewhere, e.g. to llvm-lto2

Or llvm-bcanalyzer.

Do you want me to remove serialization of Linkage and NotEligibleToImport? It just feels strange to serialize only part of GVFlags. When this serialization is added later, it will require updating all the same tests once again.

pcc added inline comments.Jun 1 2017, 2:12 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

Do you want me to remove serialization of Linkage and NotEligibleToImport?

I'm fine with leaving it in.

pcc accepted this revision.Jun 1 2017, 3:30 PM

LGTM

This revision is now accepted and ready to land.Jun 1 2017, 3:30 PM
mehdi_amini added inline comments.Jun 1 2017, 5:16 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

FWIW: I disagree with mixing changes that are unrelated with each other: like changing the serialization and moving the position of where we run a pass.

pcc added inline comments.Jun 1 2017, 5:19 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
132 ↗(On Diff #101094)

The serialization changes were moved to D33702, but it looks like they were never reverted from this patch. @eugenis can you please take care of that before committing?

There are no serialization changes here. Are you looking at an older diff by any chance?

pcc added a comment.Jun 1 2017, 5:41 PM

There are no serialization changes here. Are you looking at an older diff by any chance?

Sorry, yes, it must have been an older diff.

This revision was automatically updated to reflect the committed changes.