The general idea is to introduce a variadic template function which can deserialize arbitrary sequences of items from a byte stream. For example, if you write:
const Layout *L; StringRef S; uint64_t N; consume(Data, L, S, N);
it will deserialize each of these in sequence, returning an error at any point if there is a problem.
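A minimal sketch of how such a variadic function could be structured follows. It is not the code in this patch: Bytes, consumeOne, and the use of std::error_code are assumptions made for illustration, and it only handles trivially copyable values in host byte order.

```cpp
#include <cstdint>
#include <cstring>
#include <string_view>
#include <system_error>
#include <type_traits>

// Hypothetical stand-in for the byte stream: a view that shrinks as items
// are read from its front.
using Bytes = std::string_view;

// Deserialize a single trivially copyable value (host byte order assumed).
template <typename T> std::error_code consumeOne(Bytes &Data, T &Item) {
  static_assert(std::is_trivially_copyable<T>::value, "raw copy only");
  if (Data.size() < sizeof(T))
    return std::make_error_code(std::errc::message_size);
  std::memcpy(&Item, Data.data(), sizeof(T));
  Data.remove_prefix(sizeof(T));
  return {};
}

// Base case: no items left to deserialize.
inline std::error_code consume(Bytes &) { return {}; }

// Recursive case: read the first item, stop on the first error, and
// otherwise continue with the remaining items.
template <typename T, typename... Rest>
std::error_code consume(Bytes &Data, T &Item, Rest &...Items) {
  if (std::error_code EC = consumeOne(Data, Item))
    return EC;
  return consume(Data, Items...);
}
```

Under these assumptions, consume(Data, A, B) reads A, then B, and the first short read ends the chain with an error.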
There are a few cases where things get more complicated.
- Sometimes fields might only be present if a particular condition is true. For example, if a function is an introducing pure virtual function, then a VFTableOffset will be present, otherwise it won't. To handle this we introduce the notion of conditional fields, specified with the CONDITIONAL_FIELD macro.
- A field might hold a count N, followed by an N-element array. To handle this we introduce the ARRAY_FIELD_N macro.
- There are occasions where what follows is an array whose length is not encoded, but which runs until the end of the buffer. To handle this we introduce the ARRAY_FIELD_TAIL macro.
These macros work by wrapping the given condition or count expression in a lambda, so that it is evaluated lazily. This means that if you write
uint64_t Count; ArrayRef<uint32_t> Items; consume(Data, Count, ARRAY_FIELD_N(Items, Count));
it will work, because the Count expression inside the macro is not evaluated until consume reaches that item, by which point Count has already been deserialized.
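The lazy-evaluation idea can be sketched as below. This is an illustration under stated assumptions, not the macro from the patch: ARRAY_FIELD_N_SKETCH, ArrayFieldN, makeArrayFieldN, Bytes, and consumeOne are hypothetical names, std::vector stands in for ArrayRef, and values are copied in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

using Bytes = std::string_view; // hypothetical stand-in for the byte stream

// Pairs an output array with a callable that yields its length on demand.
template <typename T, typename CountFn> struct ArrayFieldN {
  std::vector<T> &Out;
  CountFn Count;
};

template <typename T, typename CountFn>
ArrayFieldN<T, CountFn> makeArrayFieldN(std::vector<T> &Out, CountFn Count) {
  return {Out, Count};
}

// The count expression is wrapped in a by-reference lambda, so it is not
// evaluated where the macro appears, only when the lambda is later called.
#define ARRAY_FIELD_N_SKETCH(Items, CountExpr)                                 \
  makeArrayFieldN(Items, [&]() { return static_cast<std::size_t>(CountExpr); })

// Plain value: copy sizeof(T) bytes off the front of the stream.
template <typename T> std::error_code consumeOne(Bytes &Data, T &Item) {
  if (Data.size() < sizeof(T))
    return std::make_error_code(std::errc::message_size);
  std::memcpy(&Item, Data.data(), sizeof(T));
  Data.remove_prefix(sizeof(T));
  return {};
}

// Counted array: the length is computed only now, after earlier items
// (including the count field itself) have been deserialized.
template <typename T, typename CountFn>
std::error_code consumeOne(Bytes &Data, ArrayFieldN<T, CountFn> Field) {
  const std::size_t N = Field.Count();
  if (N > Data.size() / sizeof(T))
    return std::make_error_code(std::errc::message_size);
  Field.Out.resize(N);
  if (N != 0)
    std::memcpy(Field.Out.data(), Data.data(), N * sizeof(T));
  Data.remove_prefix(N * sizeof(T));
  return {};
}

inline std::error_code consume(Bytes &) { return {}; }

// Items are processed strictly left to right, stopping at the first error.
template <typename T, typename... Rest>
std::error_code consume(Bytes &Data, T &&Item, Rest &&...Items) {
  if (std::error_code EC = consumeOne(Data, std::forward<T>(Item)))
    return EC;
  return consume(Data, std::forward<Rest>(Items)...);
}
```

In this sketch the lambda captures Count by reference, so consume(Data, Count, ARRAY_FIELD_N_SKETCH(Items, Count)) sees the deserialized value of Count rather than whatever it held at the call site.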
Two cases remain where I was not able to come up with a good way to use this variadic template. One involves a nested "sub field" which must be deserialized, and the other involves a complex nibble decoding algorithm.
In any case, this should handle 90% of cases, and greatly reduces the amount of boilerplate needed to deserialize records.
Seems kind of awkward that uint64_t is a leaf numeric, but uint32_t is fixed length.