This is an archive of the discontinued LLVM Phabricator instance.

[LoopCacheAnalysis] Improve cost heuristic.
Needs ReviewPublic

Authored by etiotto on Mar 10 2020, 6:45 AM.

Details

Summary

Improve the cache cost estimation for indexed references that are not consecutive. For example, given the indexed reference 'A[i][j][k]', and assuming the j-loop is in the innermost position in the loop nest containing the array reference, the cost would be equal to the RefCost cost times the iterations of the j-loop (rather than just the RefCost).

Diff Detail

Event Timeline

etiotto created this revision.Mar 10 2020, 6:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2020, 6:45 AM
etiotto updated this revision to Diff 249946.Mar 12 2020, 8:19 AM
bmahjour added inline comments.
llvm/include/llvm/Analysis/LoopCacheAnalysis.h
107

would be useful to mention that the index is zero-based. ie: "Retrive the zero-based \p Index of the subscript ..."

llvm/lib/Analysis/LoopCacheAnalysis.cpp
310

but in the code below we multiply the trip count for all indexes starting from the given loop up to the loop for the second last subscript.

320

why getNumSubscripts() - 1 instead of just getNumSubscripts() ?

etiotto retitled this revision from [LoopCacheAnalysis}: Improve cost heuristic. to [LoopCacheAnalysis] Improve cost heuristic..Mar 13 2020, 11:28 AM
Meinersbur added inline comments.Mar 13 2020, 2:36 PM
llvm/lib/Analysis/LoopCacheAnalysis.cpp
106–119

[sugggestion]

if (auto *TripCount = dyn_cast<SCEVConstant>(BackedgeTakenCount)) 
  return SE.getAddExpr(BackedgeTakenCount, SE.getOne(BackedgeTakenCount->getType())

LLVM_DEBUG(dbgs() << "Trip count of loop " << L.getName() << " could not be computed, using DefaultTripCount\n");
return SE.getConstant(ElemSize.getType(), DefaultTripCount);
107

isa<SCEVCouldNotCompute> is redundant: the object cannot be SCEVCouldNotCompute and SCEVConstant at the same time

452

What if:

  • Multiple subscripts are using the loop variables?
  • A subscript uses multiple loop variables (eg. i+j)?
  • SCEVAddRecExpr is processed further, e.g. negated?