r298270 added profile update logic for branch_weights. This patch implements profile update logic for VP prof metadata too.
Details
Diff Detail
- Build Status
Buildable 6081 Build 6081: arc lint + arc unit
Event Timeline
lib/IR/Instruction.cpp | ||
---|---|---|
637 | This is not easy to read. Why are only odd indices pushed into Vals? I suppose you want to scale only operand 2. I suggest you separate the two cases (VP aand branch_weights) and handle them separately. |
I found that not all calls with VP metadata were being scaled after this patch. Specifically, this was VP metadata on a memcpy intrinsic (so profiling the memcpy size values not an indirect call), but it still makes sense to scale them the same way. I confirmed that the call was not in the value map iterated in the caller of this function (updateCallProfile), which is what is iterated to update the inlined instructions in the caller. Are we guaranteed that every inlined instruction would be in there?
This may be a false alarm. It looks like we might simply not bed doing the update, which would happen if the entry count of the callee was 0. My guess is that this happened because I was only redoing the ThinLTO backend with the patch, and we probably needed to do scaling in the compile step to avoid importing and inlining a callee with 0 entry count and unscaled VP metadata...confirming with a full rebuild, which will take some time.
Confirmed that this was a false alarm. In the source file I was looking at the expected functions are marked cold. So LGTM, but please wait for Easwaran to see if he has any more concerns.
LGTM after hoisting the APInt outside the loop (see inlined comment).
lib/IR/Instruction.cpp | ||
---|---|---|
643 | Creating APInt objects > 64 bits in length requires heap allocation. So the loop invariant APInt objects (S and T) should be hoisted out. I know this is pre-existing, but it is better to fix this since similar code exists for VP (and where the loop likely executes more iterations). |
lib/IR/Instruction.cpp | ||
---|---|---|
635 | APInt(128, T) should also be hoisted here. |
APInt(128, T) should also be hoisted here.