Page MenuHomePhabricator

[Profile] backward propagate profile data in jump-threading

Authored by davidxl on Aug 17 2017, 11:15 PM.



If a branch condition is defined by a function call result, the profile information for the branch may be (effectively) lost after inlinling.

For instance,

 bool cond() {
      if (a)
        return true;
      if (b)
        return true;
     return false;

void test() {
    if (__builtin_expect(cond(), false) ) {
         bar();  // cold

After inlining and jump threading, the first branch in the inline instance does not have profile data -- this will make the block for bar() look hot.

In this patch, the problem is fixed in jump threading by 'backward' propagate the BP to the predecessor given the conditional probability discovered.

Diff Detail

Event Timeline

davidxl created this revision.Aug 17 2017, 11:15 PM
chandlerc edited edge metadata.Aug 18 2017, 1:40 AM

Really, really nice find here. A bunch of comments below, but they're all pretty minor on the whole.


Would be good to connect this comment to the arguments and behavior of this function.

(Don't get me wrong, this comment is *excellent* for explaining why we can do this back propagation and how to compute it! I just also want comments that explain how this routine fits into the model.)


Update -> update


You can use return {IncomingBB, PhiBB}; here and elsewhere to return. You'll have to specify the return type in the lambda, but then you won't even need to do that for nullptrs...


Why just one? You could do:

auto *PredBB = IncomingBB;
while (auto *SinglePredBB = PredBB->getSinglePredecessor())

PredBB = SinglePredBB;

at the top fo the function, and have a single set of logic to handle the whole thing?


Can simplify as:

if (!CI || !CI->getType()->isIntegerTy(1))

This code only handle i1 PHIs that directly feed conditional branches, right?

Provided that's so, I would assert that CI is a 1 bit integer here, and then test whether it the null value or not rather than zexting and extracting it.


No need to compare with nullptr, pointers are testable directly:

if (PredOutEdge.first)

"we" -> "We"

I wonder, since what we have is fundamentally an upper bound, would it make sense to go ahead and apply this even when there is existing profile info?

Even for static hints, I can see this happening. For example, the pre-inlining code might have some relatively weak hint that is, after inlining, bounded by a powerful hint.

Or is your concern doing this without measuring the impact for PGO? If that's why, I would make that more clear -- we *intend* to always use this to refine weights, but PGO requires more careful tuning to make sure this doesn't regress anything.

Otherwise, I'd just go ahead and implement this as clamping any existing weights.


"With PGO, this can be turned on too to refine the profile with" -> "With PGO, this can be used to refine even existing profile data with"


I would remind the reader that this is because all we have is an upper bound.

davidxl marked 6 inline comments as done.Aug 18 2017, 10:32 AM
davidxl added inline comments.

Annotated the comments with variable names in the code.


There is an early check of one bit type. I am not clear what is the suggestion about (regarding extracting value).


Yes, the latter . I have seen issues with PGO so without more testing, I will leave this for nonPGO case for now.

davidxl updated this revision to Diff 111700.Aug 18 2017, 10:33 AM

Addressed Chandler's review feedbacks

chandlerc accepted this revision.Aug 18 2017, 2:50 PM

Awesome. LGTM with the minor things below addressed. Let me know if any of this isn't make sense.


nit: spurious blank line...


nit: clang-format to fix line length



assert(CI.getBitWidth() == 1 && "We only handle i1 constants here.");
BP = CI->isOne() ? ... : ...;

You could also use CI->isZero(), but one seems more clear.


This wording improvement for the comment would still be helpful I think.

This revision is now accepted and ready to land.Aug 18 2017, 2:50 PM
davidxl updated this revision to Diff 112064.Aug 21 2017, 2:59 PM

Fix an obvious bug.

Also improved test cases to cover the path that exposed the bug.

chandlerc accepted this revision.Aug 21 2017, 8:00 PM

Cool! With the changes below, LGTM.

Any further testing you think this needs before relanding it? (I don't really have good ideas other than to patch it in internally and build some large things with PGO and ThinLTO)


I think this will be easier to read inverted:

auto *SinglePredBB = PredBB->getSinglePredecessor();
if (!SinglePredBB)
  return {nullptr, nullptr};

SuccBB = PredBB;
PredBB = SinglePredBB;

nit: When infinitely looping, I find for (;;) { a shorter and still easy to read idiom.


No need for this once the loop above returns directly when exiting.

davidxl updated this revision to Diff 112111.Aug 21 2017, 9:40 PM

Addressed review feedback.

Tested with clang self build with PGO.