This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Value profile for size of memory intrinsic calls
ClosedPublic

Authored by xur on Jan 20 2017, 1:23 PM.

Details

Summary

This patch adds the value profile support to profile the size parameter of
memory intrinsic calls: memcpy, memcmp, and memmove.

We have two adjustable parameters: SmallVal and LargeVal size less than or equal to SmallVal will be precisely profiled. size greater than SmallVal but less than LargeVal will have one count size greater than or equal to LargeVale will have one count.

The patch has dependency on the profile value range support in compiler_rt:
https://reviews.llvm.org/D28964.

Diff Detail

Event Timeline

xur created this revision.Jan 20 2017, 1:23 PM
davidxl edited edge metadata.Jan 24 2017, 10:36 AM

See my comment in https://reviews.llvm.org/D28964. There is no need to introduce runtime change nor new clang intrinsics. The instrumentation lower can do the job of mapping the ranges. Another choice is to let the instrumenter does the range mapping work and keep the lowerer unchanged.

xur updated this revision to Diff 85665.Jan 24 2017, 5:21 PM

Update the patch based on David's review comments.

(1) remove instrprof_value_rang_profile intrinsic and use existing instrprof_value_profile and used valuekind to distinguish different value profile.
(2) enclose most of the implementation details in value profile lowering.

llvm-profdata needs test cases for profile reading and writing

  1. show command
  2. merge command in binary and in text format
lib/Transforms/Instrumentation/InstrProfiling.cpp
79

Perhaps make it a string option with both start and end values specified:

MemOpSizeValueRange

85

The description is not accurate.

tools/llvm-profdata/llvm-profdata.cpp
482

Add a test case to show text format dump works expected.

542

Collect summary statistics and dump it at the end -- similar to IC statistics.

631

This needs to be documented as other user facing options.

vsk added a subscriber: vsk.Jan 30 2017, 4:56 PM
xur marked 2 inline comments as done.Feb 1 2017, 12:44 PM
xur added inline comments.
lib/Transforms/Instrumentation/InstrProfiling.cpp
85

Fixed

xur updated this revision to Diff 86697.Feb 1 2017, 12:47 PM

Integrated David's review comments.

Also fixed a bug in InstrProfReader when reading text format value profile.

davidxl added a reviewer: vsk.Feb 3 2017, 9:45 AM

How to make the llvm-profdata 'show' command dump the range information as well?

include/llvm/Transforms/InstrProfiling.h
63 ↗(On Diff #86697)

should it be 1?

66 ↗(On Diff #86697)

Why 8? why not 64?

lib/ProfileData/InstrProfReader.cpp
185 ↗(On Diff #86697)

Can you submit this separately as a bug fix?

lib/Transforms/Instrumentation/InstrProfiling.cpp
178

extract this into an option parsing helper.

526

Larger range does not mean on average the number of values tracked will be proportionally larger. Perhaps using a different heuristic for allocation size? Besides, TotalNS tracks the number of value sites, overloading it for number of values does not make sense

533

Here is the counter # estimation.

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
1301

unrelated change.

test/tools/llvm-profdata/memop-size-prof.proftext
3 ↗(On Diff #86697)

also test 'merge' with -text option.

18 ↗(On Diff #86697)

perhaps add a test with 2 value kinds

tools/llvm-profdata/llvm-profdata.cpp
469

TotalNumValueSitesICall

Or perhaps defined an array

TotalNumValueSites[..] and use value kind index to access.

470

TotalNumValuesSitesWithValueProfileICall

473

Make the var name consistent -- TotalNumValueSitesMemOpSize

506

NumMemOpCalls

508

Number of memory intrinsic calls :

543

extract this into to a common helper function? There are many duplications.

xur marked 10 inline comments as done.Mar 3 2017, 10:23 AM
xur added inline comments.
include/llvm/Transforms/InstrProfiling.h
63 ↗(On Diff #86697)

there are calls to memory intrinsics with size 0. There are quite many actually.

66 ↗(On Diff #86697)

I think 8 is enough. In the optimization, I multi-version up to value of 8.
This is a tunable. 64 might be too big.

lib/Transforms/Instrumentation/InstrProfiling.cpp
526

I removed this change. Now MemOPSize profile uses the same heuristic as indirect-call profiling.

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
1301

reverted.

tools/llvm-profdata/llvm-profdata.cpp
469

Used a vector here.

xur updated this revision to Diff 90506.Mar 3 2017, 10:24 AM
xur marked an inline comment as done.

Updated the patch according on David's review comments.

The profile annotation part of another patch should be merged in here.

include/llvm/Transforms/InstrProfiling.h
63 ↗(On Diff #86697)

The value transformation patch does not seem to handle it . Do you see cases where value 0 is a frequent value in hot stringop sites?

lib/Transforms/Instrumentation/InstrProfiling.cpp
91

document the case when some value is missing in the option.

davidxl added inline comments.Mar 8 2017, 11:28 AM
lib/Transforms/Instrumentation/InstrProfiling.cpp
526

With this default heuristics, do you see any dropped value profiling at runtime?

tools/llvm-profdata/llvm-profdata.cpp
449

Can you split out this part as a refactoring patch?

xur updated this revision to Diff 91213.Mar 9 2017, 1:44 PM

This is the patch that after splitting out the llvm-profdata refactor change.

xur updated this revision to Diff 91900.Mar 15 2017, 10:50 AM

Update the patch: remove some unrelated changes.

After the refactoring code lands, do you plan to add the site annotation part in this patch?

xur added a comment.Mar 15 2017, 11:23 AM

After the refactoring code lands, do you plan to add the site annotation part in this patch?

I plan to do that part if a separated patch.

This revision is now accepted and ready to land.Mar 15 2017, 11:27 AM
This revision was automatically updated to reflect the committed changes.