This is an archive of the discontinued LLVM Phabricator instance.

[LTO] Perform DSOLocal propagation in combined index
ClosedPublic

Authored by weiwang on Feb 9 2021, 10:43 PM.

Details

Summary

Perform DSOLocal propagation within summary list of every GV. This
avoids the repeated query of this information during function
importing

Diff Detail

Event Timeline

weiwang created this revision.Feb 9 2021, 10:43 PM
weiwang requested review of this revision.Feb 9 2021, 10:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2021, 10:43 PM

The overhead of repeatedly scanning all summaries could be quite high in the current ValueInfo::isDSOLocal implementation, especially in large projects. This change helped reduce overall thinLTO link time of a major internal workload by 20 mins.

The overhead of repeatedly scanning all summaries could be quite high in the current ValueInfo::isDSOLocal implementation, especially in large projects. This change helped reduce overall thinLTO link time of a major internal workload by 20 mins.

Great! I have some suggestions to improve the approach further.

llvm/include/llvm/IR/ModuleSummaryIndex.h
157

Instead of adding the memory overhead of an Optional per GUID in the index, can you just have the populate method turn into a propagation method, and just reset all IsDSOLocal flags if any are false? Then the isDSOLocal method only needs to check the first summary entry. You can add a global flag to the index that indicates that the propagation has been done, and assert that it has been set when isDSOLocal() is called. See for example the global set when we do dead code analysis and attribute propagation (WithGlobalValueDeadStripping and WithAttributePropagation).

Actually, for the most efficiency, I wonder if you can just merge this into propagateAttributes (and use a single global flag set on the index when it has been propagated). Like propagateAttributes, you presumably don't need to do this on anything that is dead, and it wouldn't need to be invoked if importing is disabled.

llvm/lib/LTO/LTO.cpp
894

This adds a second hash table lookup for GUID. Better to do the new call to findSummaryInfo(GUID) first, then there is another findSummaryInModule that takes a ValueInfo that can be used instead of the GUID based one used a few lines above.

But I'm not sure I understand why this is needed in the first place (ditto for the reset invoked when adding a global value summary) - since the populate method is not called until we have the full combined index.

weiwang added inline comments.Feb 11 2021, 10:58 AM
llvm/include/llvm/IR/ModuleSummaryIndex.h
157

This is a really great suggestion, Thanks!

Memory overhead was one concern I had. I'll see if I can do something similar as propagateAttributes. One question though, when you say "reset all IsDSOLocal flags if any are false. Then the isDSOLocal method only needs to check the first summary entry", do you mean if the flag in one of the summaries is false, then set all others to false as well?

llvm/lib/LTO/LTO.cpp
894

This is more of a bookkeeping to make sure the populated value is up-to-date. But with the propagation approach you suggested, I think this is no longer needed.

tejohnson added inline comments.Feb 11 2021, 11:38 AM
llvm/include/llvm/IR/ModuleSummaryIndex.h
157

Yes - I suggested that since isDSOLocal() currently returns true only if all IsDSOLocal flags are true, and false otherwise. So it seems like you would just propagate the flags across all summaries in the list such that if any are false, all become false (or leave them all true if all are true) - then you get the same behavior by simply querying the first summary in the list's IsDSOLocal flag.

wenlei added inline comments.Feb 11 2021, 2:07 PM
llvm/include/llvm/IR/ModuleSummaryIndex.h
157

Thanks for the suggestion, @tejohnson. That'd be a smart and clean way of doing it.

weiwang updated this revision to Diff 323175.Feb 11 2021, 3:41 PM

Use the propagation approach.

weiwang marked an inline comment as done.Feb 11 2021, 3:41 PM
weiwang retitled this revision from [LTO] Pre-populate IsDSOLocal result in combined index to [LTO] Perform DSOLocal propagation in combined index.Feb 11 2021, 3:44 PM
weiwang edited the summary of this revision. (Show Details)
tejohnson accepted this revision.Feb 11 2021, 4:50 PM

lgtm, thanks for the efficiency improvement! Some minor comments below.

llvm/lib/IR/ModuleSummaryIndex.cpp
245

I don't know that it is worth changing the name, but please add a note to the comment here and in the header that it is also propagating the IsDSOLocal flag.

291

Super nit: would it be more efficient to just do:

IsDSOLocal &= S->isDSOLocal();

i.e. without the if condition? Either way is probably ok though.

301

We could presumably get by with one merged flag. Not sure it is worth it, in case we decide to decouple these at some point...I don't have a strong feeling so up to you.

This revision is now accepted and ready to land.Feb 11 2021, 4:50 PM
tejohnson added inline comments.Feb 11 2021, 5:14 PM
llvm/lib/IR/ModuleSummaryIndex.cpp
301

Actually, probably best to leave the flags decoupled. In case we eventually want to be able to disable them independently for debugging.

weiwang updated this revision to Diff 323191.Feb 11 2021, 5:22 PM

update according to comment.

weiwang marked 4 inline comments as done.Feb 11 2021, 5:23 PM
weiwang added inline comments.Feb 11 2021, 5:29 PM
llvm/lib/IR/ModuleSummaryIndex.cpp
301

Sure, I'll leave them decoupled. Side question: I see that withAttributePropagation flag is preserved across bitcode file read/write. If the flag is set when index is materialized, does it mean the propagation was done previously on the index? If that's the case, can we skip it?

tejohnson added inline comments.Feb 11 2021, 5:49 PM
llvm/lib/IR/ModuleSummaryIndex.cpp
301

Oh that's a good point, I forgot that gets serialized. Hmm, that is actually going to be important for distributed ThinLTO backends, where we serialize out the combined index, read it back in for each backend, and invoke the function importing from there. So please add that for the new flag as well. Looks like this test should fail without an update to the expected flags: llvm/test/Assembler/summary-flags.ll.

I don't think it is worth adding a check to this code as to whether it has been set already though. I don't see any way we will ever invoke this propagation on a combined index that was serialized out after invoking it already.

weiwang updated this revision to Diff 323395.Feb 12 2021, 10:44 AM
  1. serialize and de-serialize the flag.
  2. update related test cases.
weiwang marked 2 inline comments as done.Feb 12 2021, 10:44 AM
weiwang added inline comments.
llvm/lib/IR/ModuleSummaryIndex.cpp
301

Thanks. I've updated the patch to include serialization and de-serialization of the flag. Since this is non-trivial, I'd like to get approval again.

weiwang updated this revision to Diff 323470.Feb 12 2021, 1:56 PM

fixed test failures due to bit check

Looks good except for the BitstreamReader change - I'm not sure why this would affect that?

llvm/include/llvm/Bitstream/BitstreamReader.h
196

Why this change?

llvm/lib/IR/ModuleSummaryIndex.cpp
218

Thanks, can you update the header declaration's comment as well?

weiwang updated this revision to Diff 323526.Feb 12 2021, 10:52 PM

update comment and remove one unintended change.

weiwang marked an inline comment as done.Feb 12 2021, 10:53 PM
weiwang marked an inline comment as done.Feb 12 2021, 10:56 PM
weiwang added inline comments.
llvm/include/llvm/Bitstream/BitstreamReader.h
196

Sorry, this is unintended. I was sweeping through places with "0x3f", and didn't look close enough for this one. Reverted back.

This revision was landed with ongoing or failed builds.Feb 12 2021, 11:10 PM
This revision was automatically updated to reflect the committed changes.
weiwang marked an inline comment as done.