This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Enable common VP format in profile runtime
ClosedPublic

Authored by davidxl on Nov 29 2015, 10:03 AM.

Details

Summary

In this patch, the following changes are done

  1. Enable emission of VP data in ValueProfData format using common API (now made available through InstrProfData.inc)
  1. Removed the update of TotalValueDataSize
    • this reduced the number of atomtic operations (slow with high latency) executed during profile update
    • allows MIPS arch to enable value profiling -- which lacks the syn fetch-add support
    • Using TotalValueDataSize also has a fundamental problem that it always penalize the last function that gets profile collected -- if new values were added after buffer is created, last function's existing vp data (can be hot) may get dropped.
    • With the new approach in the patch, only the new values for any function are dropped (if any).
  1. Added an extensive value profiling mock test. The test declares > 100 dummy caller functions and synthesize up to 128 value profile sites for each caller. Up to 8 values are added to each value site using profiler runtime VP interfaces.

Diff Detail

Repository
rL LLVM

Event Timeline

davidxl updated this revision to Diff 41352.Nov 29 2015, 10:03 AM
davidxl retitled this revision from to [PGO] Enable common VP format in profile runtime .
davidxl updated this object.
davidxl added a reviewer: betulb.
davidxl added a subscriber: llvm-commits.
vsk added a subscriber: vsk.Dec 1 2015, 2:22 PM
vsk added inline comments.
lib/profile/InstrProfiling.c
66 ↗(On Diff #41352)

nit, method

117 ↗(On Diff #41352)

Is Data->Values supposed to be free'd at some point?

175 ↗(On Diff #41352)

NULL?

test/profile/instrprof-value-prof.c
4 ↗(On Diff #41352)

I might be doing this incorrectly but I get a syntax error here. Doesn't this need to be:

// RUN: llvm-profdata show --all-functions -ic-targets  %t.profdata | FileCheck \
// RUN: %s
51 ↗(On Diff #41352)

return (intptr_t)p2 - (intptr_t)p1?

95 ↗(On Diff #41352)

Will the CHECK lines break if the assumed mapping between function names and addresses changes?

davidxl marked 3 inline comments as done.Dec 1 2015, 3:54 PM
davidxl added inline comments.
lib/profile/InstrProfiling.c
117 ↗(On Diff #41352)

Fixed.

test/profile/instrprof-value-prof.c
4 ↗(On Diff #41352)

Clang format did this, I think. Fixed.

95 ↗(On Diff #41352)

Right - the callee address sorting should be removed. This is fixed now -- as the callee targets are properly ordered according to how the initializer for calleeAddr array is defined.

davidxl updated this revision to Diff 41576.Dec 1 2015, 3:57 PM

Updated patch with vsk's comments.

betulb edited edge metadata.Dec 2 2015, 12:21 AM

One side comment on the commit message: Using TotalValueDataSize would only cause a drop of the last N function(s)' value data if/when file writing is to be triggered through means other than atexit() and calls to __llvm_profile_instrument_target are allowed to execute concurrently w/ file write. Neither is happening at this time.

lib/profile/InstrProfiling.c
70 ↗(On Diff #41576)

Is the casting really needed? What's passed in is a __llvm_profile_data pointer, so
shouldn't Data->NumValueSites[ValueKind] be OK?

188 ↗(On Diff #41576)

Nothing to free. So, just "return 0;" ?

224 ↗(On Diff #41576)

Shouldn't line 223 check for !getValueProfDataSizeRT(R) and not for !I->Values? The new check also should happen after line 225.

betulb added a subscriber: betulb.Dec 2 2015, 7:08 AM

betulb added a comment.

One side comment on the commit message: Using TotalValueDataSize would
only
cause a drop of the last N function(s)' value data if/when file writing is
to be triggered through means other than atexit() and calls to
__llvm_profile_instrument_target are allowed to execute concurrently w/
file
write. Neither is happening at this time.

Comment at: lib/profile/InstrProfiling.c:70
@@ +69,3 @@
+ uint32_t ValueKind, uint16_t
NumValueSites) {
+ *((uint16_t *)&Data->NumValueSites[ValueKind]) = NumValueSites;

+}

Is the casting really needed? What's passed in is a __llvm_profile_data
pointer, so
shouldn't Data->NumValueSites[ValueKind] be OK?

I see. It's needed to remove the const qualifier.

Comment at: lib/profile/InstrProfiling.c:188
@@ +187,3 @@
+ if (!Records)
+ goto Return;

+ /*

Nothing to free. So, just "return 0;" ?

Comment at: lib/profile/InstrProfiling.c:224
@@ +223,3 @@
+ if (!I->Values)
+ continue;

+ ValueProfRuntimeRecord *R = &Records[I - DataBegin];

Shouldn't line 223 check for !getValueProfDataSizeRT(R) and not for
!I->Values? The new check also should happen after line 225.

http://reviews.llvm.org/D15057

davidxl marked an inline comment as done.Dec 2 2015, 10:10 AM
davidxl added inline comments.
lib/profile/InstrProfiling.c
70 ↗(On Diff #41576)

the const qualifier needs to be removed.

224 ↗(On Diff #41576)

You are right. I->Values is lazily initialized -- so I->Values == NULL does not mean we can skip it (only when there is zero value site statically). I need to enhance the test case to cover it.

betulb added inline comments.Dec 2 2015, 10:57 AM
lib/profile/InstrProfiling.c
224 ↗(On Diff #41576)

I'm not sure if I'm following your concern about lazy initialization for I->Values.
The earlier loop initializes the ValueProfRuntimeRecord.

It did seemed to me that you're concerned about value data being dropped at file write time. If there is any scenario which can cause dropping of the value data in the previous implementation, then this new implementation could cause a buffer overwrite under the same circumstances, if a NULL entry for I->Values gets updated after the null check on line 194 and before line 223. But, I do not see any allowance for that at the moment.

davidxl added inline comments.Dec 2 2015, 11:11 AM
lib/profile/InstrProfiling.c
224 ↗(On Diff #41576)

No -- I don't have a concern about I->Values being lazily initialized -- it is perfectly fine to do it -- just the writer code needs to handle it properly -- and your suggestion of not checking !I->Values is correct (also avoid the overwrite problem you mentioned here).

The value dropping problem as you mentioned can happen for the last 'N' function -- the main problem is that it can potentially lead to hot values gets truncated.

davidxl updated this revision to Diff 41674.Dec 2 2015, 1:57 PM
davidxl edited edge metadata.

Updated the patch according to Betul's feedback.

  1. uses getNumValueKindsRT to determine if a function can be skipped (the size method will always return >0 size for the header due to indexed format restriction)
  2. enhanced test case to cover the case when no value data is collected -- the resulting profile data should be mergable with the case when value profile data exist

The data race problem to I->Values still exist -- I will have a follow up patch to deal with it.

davidxl updated this revision to Diff 41788.Dec 3 2015, 12:21 PM

Update the patch:

  1. instead of pre-allocate RuntimeRecord which wastes heap space, we now just precomputes the base buffer size
  2. an environment variable is introduced to allow user to allocate no default extra bytes to accommodate the case when concurrent profile update form other threads causes profile data to be dropped (due to to capacity limit)
  3. verbose messages are added to diagnose various error conditions.

Patch to completely remove data races on Values field will follow.

My LGTM was for Diff 41674.

davidxl added a subscriber: davidxl.Dec 3 2015, 4:03 PM

The new version is functional equivalent to the previous diff -- but
more user friendly (in terms of runtime behavior). I will check in
this version (to remove one of the unwanted atomic operations) for now

  • more comments are welcome after this.

David

This revision was automatically updated to reflect the committed changes.