Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache.
Details
Diff Detail
Event Timeline
@sdmitriev thanks for the patch!
lib/Transforms/IPO/HotColdSplitting.cpp | ||
---|---|---|
202 | The splitting pass does not explicitly preserve AssumptionCacheTracker. Why does this not invalidate AssumptionCacheTracker correctly? As the splitting pass does not "really" need AssumptionCacheTracker, it seems preferable to just not require it here. |
lib/Transforms/IPO/HotColdSplitting.cpp | ||
---|---|---|
202 | AssumptionCacheTracker is implemented as immutable pass, so, as I understand, it cannot be invalidated. |
LGTM, but, please wait for a +1 from someone more familiar with AssumptionCache before committing.
lib/Transforms/IPO/HotColdSplitting.cpp | ||
---|---|---|
202 | I see, thanks for explaining. It looks like it's well-understood that AssumptionCache may need to be invalidated (there's already a clear() method). As most passes do not invalidate it, making AssumptionCacheTracker an ImmutablePass might be the right tradeoff. @hfinkel does that sound right? | |
lib/Transforms/IPO/PartialInlining.cpp | ||
184 | nit, as this seems unrelated to the main change, perhaps it could be a followup? |
lib/Transforms/IPO/PartialInlining.cpp | ||
---|---|---|
184 | Right, this is not related to the main change. I just noticed that Region is passed by value which is not very efficient. I will remove this change from this patch. |
Removed unrelated change (lib/Transforms/IPO/PartialInlining.cpp, line 184) from the patch.
lib/Transforms/IPO/HotColdSplitting.cpp | ||
---|---|---|
202 | The original idea was that the only passes that added new assumptions needed to explicitly update the cache, and for everything else it should be self-updating (and this made sense under the assumption that adding assumptions is rare). |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
1407 | Can we clear here less than everything? |
Revised changes to remove extracted llvm.assume calls from the old function's assumption cache instead of clearing it.
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
1407 | I have updated changes to remove extracted llvm.assume calls from the old function's assumption cache instead of clearing it. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
1407 | But then they'll be missing and it won't rescan? Why don't we add an interface like movedToNewFunction(CallInst *CI), and let the cache actually update itself properly? |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
1407 |
Rescanning for the old function (where the blocks are extracted from) is not really needed since I am removing all llvm.assume calls that were moved to a new function from its cache. Therefore its cache is supposed to be up-to-date. And the new function (that is created by the extractor) does not yet have assumption cache created at this point, so it will be scanned on demand when its assumption cache is requested.
OK, let me check if I understand your suggestion correctly. You propose to add such interface to the AssumptionCache class, and it is supposed to remove CI’s assumptions from ‘this’ and add CI to the new function’s cache if it exists (otherwise new function will be rescanned anyway). Is that right? I guess that would require adding a reference to the AssumptionCacheTracker to the AssumptionCache class since it does not currently have access to the other function’s cache. Would that be OK? |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
1407 | Okay, that makes sense. This LGTM. |
The splitting pass does not explicitly preserve AssumptionCacheTracker. Why does this not invalidate AssumptionCacheTracker correctly? As the splitting pass does not "really" need AssumptionCacheTracker, it seems preferable to just not require it here.