SizeClassForTransferBatch is expensive and is called for every CreateBatch
and DestroyBatch. Caching its result per size class means kNumClasses calls
in InitCache instead. Since each batch costs two calls (one on creation, one
on destruction), this should be a performance gain if more than
kNumClasses / 2 batches are created and destroyed during the lifetime of the
local cache.
I have chosen to remove the function entirely and put its code in InitCache,
which is a debatable choice.
In single-threaded benchmarks exercising primary-backed allocations, this turns
out to be a sizeable performance gain (greater than 5%). In multithreaded
benchmarks exercising everything, it is less significant but still an
improvement (about 1%).
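
The caching scheme described above can be sketched roughly as follows. This is
an illustration only, not the actual allocator code: the value of kNumClasses,
the PerClass layout, the lazy `initialized` flag, the call counter and the
ExpensiveSizeClassForTransferBatch rule are all hypothetical stand-ins, and
CreateBatch/DestroyBatch here only return the class id they would use instead
of allocating or freeing anything.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value; the real constant comes from the size class map.
constexpr std::size_t kNumClasses = 8;

// Counts calls to the expensive lookup so the saving can be observed.
static std::size_t g_expensive_calls = 0;

// Stand-in for the costly computation. The rule is arbitrary: the lower half
// of the classes stores batches in another class, the upper half returns 0.
std::size_t ExpensiveSizeClassForTransferBatch(std::size_t class_id) {
  ++g_expensive_calls;
  return class_id < kNumClasses / 2 ? class_id + kNumClasses / 2 : 0;
}

struct PerClass {
  std::size_t batch_class_id = 0;  // cached result for this size class
};

struct LocalCache {
  PerClass per_class[kNumClasses];
  bool initialized = false;

  // Runs the expensive lookup exactly once per size class.
  void InitCache() {
    if (initialized) return;
    for (std::size_t i = 0; i < kNumClasses; ++i)
      per_class[i].batch_class_id = ExpensiveSizeClassForTransferBatch(i);
    initialized = true;
  }

  // Both paths read the cached value instead of recomputing it.
  std::size_t CreateBatch(std::size_t class_id) {
    assert(class_id < kNumClasses);
    InitCache();
    return per_class[class_id].batch_class_id;
  }

  std::size_t DestroyBatch(std::size_t class_id) {
    assert(class_id < kNumClasses);
    InitCache();
    return per_class[class_id].batch_class_id;
  }
};
```

With this shape, any number of CreateBatch/DestroyBatch pairs costs kNumClasses
expensive lookups in total, which is where the kNumClasses / 2 break-even point
comes from.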