This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Ensure we always select the same function copy to import
ClosedPublic

Authored by tejohnson on Jun 27 2018, 1:43 PM.

Details

Summary

[ThinLTO] Ensure we always select the same function copy to import

In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

Diff Detail

Repository
rL LLVM

Event Timeline

tejohnson created this revision.Jun 27 2018, 1:43 PM
davidxl added inline comments.Jul 2 2018, 9:43 AM
lib/Transforms/IPO/FunctionImport.cpp
224 ↗(On Diff #153168)

This changes the behavior in the way that a callee that could be found before this patch may no longer be found. Is this expected?

tejohnson added inline comments.Jul 2 2018, 9:52 AM
lib/Transforms/IPO/FunctionImport.cpp
224 ↗(On Diff #153168)

Yes it is a side-effect. We could potentially also change it to only check the prevailing copy (which is typically the first copy anyway), but that would also leave us sometimes without a copy to import whereas we might have imported a different smaller copy previously.

Another possibility would be to keep a set of functions already imported in to the module and check it first. We do keep the imports currently, but it is by source module, so this would likely need another set to be faster.

davidxl added inline comments.Jul 2 2018, 10:08 AM
lib/Transforms/IPO/FunctionImport.cpp
224 ↗(On Diff #153168)

How about fetch all the eligible candidates into a list, sort them by size and pick the first (smallest) one if it is below threshold. This increases the chances an import will occur.

tejohnson added inline comments.Jul 2 2018, 10:44 AM
lib/Transforms/IPO/FunctionImport.cpp
224 ↗(On Diff #153168)

I'm concerned about the efficiency of doing so - adding the copy and sort to every selectCallee invocation for very large thin links will likely be noticeable.

I'd lean towards the following, which should fix this issue and also speed up the rejection of callees we have already scanned (after being reached via another caller) and determined were too large. Keep a DenseMap of GUID to processed threshold. Also revert the change here to selectCallee so that we scan all looking for any that is under the threshold. If we decide to import, map the GUID to -1 in the map. If we decide not to import, map GUID to the current threshold in the map. When we encounter a callee, first check the map. If in the map and the threshold is -1, skip as we have already imported it (which fixes the same problem the current version of the patch fixes). If in the map and the mapped threshold is <= the current threshold, don't even bother to scan the summaries (which should add some efficiency on top of the current algorithm).

This does cost some extra memory, which I can benchmark on a large app. If the memory overhead is noticeable, I can reduce by only doing this this for summary lists that have >1 entry (which means we would do some unnecessary correctness checks on the summary entry if it was already rejected at the current threshold).

davidxl added inline comments.Jul 2 2018, 10:56 AM
lib/Transforms/IPO/FunctionImport.cpp
224 ↗(On Diff #153168)

ok.

tejohnson updated this revision to Diff 154876.Jul 10 2018, 2:02 PM

Rework based on discussion

tejohnson edited the summary of this revision. (Show Details)Jul 10 2018, 2:03 PM
tejohnson marked an inline comment as done.
davidxl added inline comments.Jul 12 2018, 11:46 AM
include/llvm/Transforms/IPO/FunctionImport.h
36 ↗(On Diff #154876)

Nit: fix stale comment. It is not a map any more.

43 ↗(On Diff #154876)

maybe make it clearer that when it is rejected for importing, the summary field is nullptr.

test/Transforms/FunctionImport/funcimport_resolved.ll
32 ↗(On Diff #154876)

Is or Not importing here?

tejohnson marked 2 inline comments as done.Jul 13 2018, 12:47 PM
tejohnson added inline comments.
test/Transforms/FunctionImport/funcimport_resolved.ll
32 ↗(On Diff #154876)

The comments in the test case were not correct. What we want to check is that we only import the copy from funcimport_resolved1, and not *also* the copy from funcimport_resolved2. Without this patch we decide to import both. The reason is that without the patch we track the import thresholds for each module we are importing from, so we don't properly detect that we have already imported a copy from funcimport_resolved1 when we try to import again at the lower threshold.

Address comments: fix header and test comments.

Remove extraneous change to test that snuck in

davidxl accepted this revision.Jul 13 2018, 1:54 PM

lgtm

This revision is now accepted and ready to land.Jul 13 2018, 1:54 PM