Tuning vp-counters-per-site generally works for small apps (for SPEC), but gobmk is an outlier: it has ~40 sites with ~800 values collected. Bumping the per-site ratio up to 20x would be excessive for the general case. A very low-overhead solution here is to pre-allocate a shared pool, which works well.
I can make it work such that the overhead is only paid when value profiling is on and dynamic allocation is off. However, given that the overhead (24k bytes) is tiny, I would rather avoid that extra complexity.
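To illustrate the shared-pool idea, here is a minimal sketch, not the patch itself. All names (`vp_node`, `POOL_NODES`, `pool_alloc`, `vp_record`) and the pool size of 1024 are hypothetical; the node layout is assumed to be three 8-byte fields, which on a typical 64-bit target would put a pool of that size at roughly 24 KB. The point is that sites with many distinct values (like gobmk's) draw from one statically sized pool instead of each site reserving a large per-site quota.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool size; an assumption for illustration only. */
#define POOL_NODES 1024

/* Assumed node layout: one tracked value, its hit count, and a link. */
typedef struct vp_node {
    uint64_t value;
    uint64_t count;
    struct vp_node *next;
} vp_node;

/* Statically pre-allocated pool shared by all value sites. */
static vp_node pool[POOL_NODES];
static atomic_size_t pool_next;

/* Bump allocator: hand out the next free node, or NULL once exhausted. */
static vp_node *pool_alloc(void) {
    size_t i = atomic_fetch_add(&pool_next, 1);
    return i < POOL_NODES ? &pool[i] : NULL;
}

/* Record `value` at a site. Returns 1 if counted, 0 if the pool is
 * exhausted and the value had to be dropped. List updates are not
 * synchronized here; a real runtime would need to handle that. */
static int vp_record(vp_node **site, uint64_t value) {
    for (vp_node *n = *site; n != NULL; n = n->next) {
        if (n->value == value) {
            n->count++;
            return 1;
        }
    }
    vp_node *n = pool_alloc();
    if (n == NULL)
        return 0;
    n->value = value;
    n->count = 1;
    n->next = *site;
    *site = n;
    return 1;
}
```

In this sketch no site has a fixed quota, so a few hot sites can consume most of the pool while the many sites that see only one or two values cost almost nothing.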