GetActuallyAllocatedSize is actually expensive. In order to avoid calling this
function in the malloc/free fast path, we change the Scudo chunk header to
store the size of the chunk, if from the Primary, or the amount of unused
bytes if from the Secondary. This way, we only have to call the culprit
function for Secondary backed allocations (and still in realloc).
The performance gain on a singly threaded pure malloc/free benchmark exercising
the Primary allocator is above 5%.