Current value profiler supports value tracking with variable length list, but capped at a given max value. When the max entry count is reached, the new target will be permanently dropped no matter how hot it is. This patch implements an efficient way to evict cold entries and ensures hot targets can survive. With this in place, we can choose to reduce the max number entries tracked to reduce memory usage without reducing VP quality.
Details
Diff Detail
Event Timeline
This looks good overall. I have a few nits and one corner case I'm concerned about.
lib/profile/InstrProfilingValue.c | ||
---|---|---|
102 | nit, UINT64_MAX? | |
108 | If CurrentVNode == MinCountVNode before this increment, should we check if MinCountVNode needs to be replaced before we return? It's not a common case, but I'm concerned about doing --MinCountVNode->Count on the wrong node in subsequent calls. | |
111 | nit, clang-format | |
120 | nit, "min count node's count"? | |
test/profile/Inputs/instrprof-value-prof-evict.c | ||
49 | nit, could you pull this macro into main()? That makes it a bit clearer where the I comes from. | |
test/profile/Inputs/instrprof-value-prof-real.c | ||
312 | I don't follow why [0, foo_..., N] changes to [0, foo. Could you explain why this happens? |
lib/profile/InstrProfilingValue.c | ||
---|---|---|
108 | This is by design. This allows the mincount node tracking to be adaptive (can correct previous mistakes) such that it can eventually be locked on the cold one that is in stable state. | |
test/profile/Inputs/instrprof-value-prof-real.c | ||
312 | The test case is a stress test that have > 256 values per site. WIth this change, the last 256 cold targets are competing for this one slot, so it does not matter who survives. |
The eviction method is good. My concerns is we don't using lock in this function, there are chances of corrupted data in threaded programs. This eviction mechanism seems to increase the chances. And previously the corruption will only be count values. Now it could be targets. I would suggest to use atomic in setting the target when replacing.
Another related question: The counter allocation uses atomic operation. Why not the counter updates also using atomic?
-Rong
lib/profile/InstrProfilingValue.c | ||
---|---|---|
108 | done. |
nit, UINT64_MAX?