This is an archive of the discontinued LLVM Phabricator instance.

Memory intrinsic value profile optimization: Avoid divide by 0
ClosedPublic

Authored by tejohnson on Apr 27 2017, 4:24 PM.

Details

Summary

Skip memops if the total value profiled count is 0: we can't correctly
scale up the counts, and there is no point in optimizing anyway.
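The shape of the fix can be sketched as follows. This is a hypothetical, simplified illustration (the function name, signature, and counts are made up, not the exact code in PGOMemOPSizeOpt): the value-profiled counts are scaled by ActualCount / TotalCount before use, so a TotalCount of 0 must be rejected up front.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the guard: scale each value-profiled count by
// ActualCount / TotalCount, but bail out when TotalCount is 0, since the
// scale would be a divide by 0 and there is nothing worth optimizing.
bool scaleMemOpCounts(uint64_t ActualCount, uint64_t TotalCount,
                      std::vector<uint64_t> &Counts) {
  if (TotalCount == 0)
    return false; // nothing profiled; scaling would divide by 0
  for (uint64_t &C : Counts)
    C = C * ActualCount / TotalCount; // scale to the block's actual count
  return true;
}
```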

Event Timeline

tejohnson created this revision. Apr 27 2017, 4:24 PM
davidxl added inline comments. Apr 27 2017, 5:13 PM
lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
877

Or change it to

if (TotalCount < MemOpCountThreshold)
return false;

and move this check before the actual count check.

tejohnson added inline comments. Apr 27 2017, 5:39 PM
lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
877

I could move this up, but note that the counts will be scaled up by ActualCount/SavedTotalCount (where SavedTotalCount == TotalCount at this point). So it is possible for TotalCount to be below MemOpCountThreshold while ActualCount, and therefore the scaled counts, are not. That check seems overly conservative (and would have an effect beyond just preventing divides by 0).

davidxl added inline comments. Apr 27 2017, 6:59 PM
lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
877

The scale actually scales down in most cases -- this is because the BB's count is updated after inlining, while the value profiling total count is not. In other words, it won't cause conservative behavior for hot sites.

tejohnson added inline comments. Apr 28 2017, 6:24 AM
lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
877

I checked one of our large apps, and there are over 44K memory intrinsics where we scale up the counts. So at the least, I would like to consider that change separately from this bugfix.

davidxl accepted this revision. Apr 28 2017, 7:32 AM

lgtm

This revision is now accepted and ready to land. Apr 28 2017, 7:32 AM
This revision was automatically updated to reflect the committed changes.