Use PackedBitVector instead of a raw char array. This simplifies the implementation, but it also allocates on the heap instead of the stack, which is not ideal.
We could specialize PackedBitVector so that it takes a SmallBitVector instead of a BitVector, but looking at the implementation, SmallBitVector is really a TinyBitVector that can only pack a handful of bits.
At this point I'm not clear on the benefits of this patch. Feel free to push back.
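For reference, here is a rough sketch of what keeping the packed storage on the stack could look like. `InlinePackedFields` is a hypothetical name, not an existing class, and the sketch assumes the element count is known at compile time and that the field width divides 64 so no field straddles a word. It is only meant to illustrate the trade-off, not to propose an implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an existing class): packs N fields of BitsPerField
// bits each into 64-bit words stored inside the object, so a local instance
// needs no heap allocation.
template <unsigned BitsPerField, std::size_t N>
class InlinePackedFields {
  static_assert(BitsPerField >= 1 && BitsPerField <= 8,
                "fields are at most one byte wide");
  static_assert(64 % BitsPerField == 0,
                "field width must divide 64 so no field straddles a word");

  static constexpr std::size_t FieldsPerWord = 64 / BitsPerField;
  static constexpr std::size_t NumWords =
      (N + FieldsPerWord - 1) / FieldsPerWord;
  static constexpr std::uint64_t Mask =
      (std::uint64_t{1} << BitsPerField) - 1;

  // Zero-initialized inline storage.
  std::array<std::uint64_t, NumWords> Words{};

public:
  static constexpr std::size_t size() { return N; }

  // Read field I.
  std::uint8_t get(std::size_t I) const {
    assert(I < N && "index out of range");
    const std::size_t Shift = (I % FieldsPerWord) * BitsPerField;
    return static_cast<std::uint8_t>((Words[I / FieldsPerWord] >> Shift) &
                                     Mask);
  }

  // Overwrite field I without touching its neighbours.
  void set(std::size_t I, std::uint8_t V) {
    assert(I < N && "index out of range");
    assert(V <= Mask && "value does not fit in the field");
    const std::size_t Shift = (I % FieldsPerWord) * BitsPerField;
    std::uint64_t &W = Words[I / FieldsPerWord];
    W = (W & ~(Mask << Shift)) | (std::uint64_t{V} << Shift);
  }
};
```

The catch is the same one as with the raw char array: the capacity has to be a compile-time constant, which is exactly the flexibility the heap-backed vector buys us.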
Do that in a separate patch?