The visitor abstraction is useful for plugging lots of somewhat related components together, but abstraction comes at a cost in performance. Type merging is easily the slowest part of a link, so it makes sense to consider whether it's worth sacrificing some purity of abstraction in exchange for speed here.
This patch swings completely in the opposite direction. It bypasses the TempSerializer, the FieldListRecordBuilder, the TypeRecordMapping, and the visitation switch. Instead, it re-implements the record deserialization switch as a hand-rolled algorithm that doesn't care about deserializing at all, only about knowing which offsets within a record's byte sequence contain type indices.
For field list records in particular, this algorithm also implements the logic of skipping over a member record to find the next one. This lets us avoid deserializing field list records just to determine which indices to remap, and then re-serializing them. Furthermore, computing "how many bytes is this field list member record" is somewhat faster than actually pulling out all the fields of the record.
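As a rough illustration of the skipping logic described above, here is a minimal sketch over a toy field list. The member kinds, their layouts, and the names `toyMemberSize` and `toyFieldListTypeIndexOffsets` are all assumptions made up for this example. They are not the real CodeView leaf kinds or the functions added by this patch. The point is only that the walker computes each member's size and records where its type index sits, without extracting any other field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical member kinds for a toy field list (NOT real CodeView values).
enum ToyMemberKind : uint16_t {
  ToyFixedMember = 1, // kind(2) + type index(4) + padding(2) = 8 bytes
  ToyNamedMember = 2, // kind(2) + type index(4) + NUL-terminated name,
                      // padded up to a multiple of 4 bytes
};

// Reads a little-endian 16-bit value; the caller guarantees Off + 2 is in range.
static uint16_t toyReadU16LE(const std::vector<uint8_t> &Data, size_t Off) {
  return static_cast<uint16_t>(Data[Off] | (Data[Off + 1] << 8));
}

// Returns the size in bytes of the member starting at Off, or 0 if the
// member is unknown or malformed. No fields are decoded beyond the kind.
static size_t toyMemberSize(const std::vector<uint8_t> &Data, size_t Off) {
  if (Off + 2 > Data.size())
    return 0;
  switch (toyReadU16LE(Data, Off)) {
  case ToyFixedMember:
    return 8;
  case ToyNamedMember: {
    size_t End = Off + 6; // the name starts after kind and type index
    while (End < Data.size() && Data[End] != 0)
      ++End;
    if (End >= Data.size())
      return 0; // missing NUL terminator
    size_t Len = End + 1 - Off;      // include the terminator
    return (Len + 3) & ~size_t(3);   // round up to 4-byte alignment
  }
  default:
    return 0;
  }
}

// Walks the members and returns the byte offset of each type index.
std::vector<uint32_t>
toyFieldListTypeIndexOffsets(const std::vector<uint8_t> &Data) {
  std::vector<uint32_t> Offsets;
  size_t Off = 0;
  while (Off < Data.size()) {
    size_t Size = toyMemberSize(Data, Off);
    if (Size == 0 || Off + Size > Data.size())
      break; // stop at the first member we cannot size
    // In both toy layouts the type index directly follows the 2-byte kind.
    Offsets.push_back(static_cast<uint32_t>(Off + 2));
    Off += Size;
  }
  return Offsets;
}
```

In this sketch a malformed member simply ends the walk; a real implementation would presumably need to report that as an error.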
The algorithm here simply takes a sequence of bytes as input and returns a list of the offsets that we need to remap.
Performance-wise, this is a huge win. The timings below are for linking lld from clang-cl generated objects and library inputs (so /Z7): using MSVC, before this patch, and after this patch.
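To show what a caller can do with such a list of offsets, here is a minimal sketch of the remapping step. The function name `remapToyTypeIndices` and the use of a plain vector as the old-to-new index map are assumptions for illustration, not the API in this patch. It patches the bytes in place, so nothing is deserialized or re-serialized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: rewrites the 4-byte little-endian type index found at
// each offset in Record, using IndexMap[Old] as the new index. Returns false
// if an offset runs past the record or an index has no mapping. Offsets
// processed before a failure have already been rewritten.
bool remapToyTypeIndices(std::vector<uint8_t> &Record,
                         const std::vector<uint32_t> &Offsets,
                         const std::vector<uint32_t> &IndexMap) {
  for (uint32_t Off : Offsets) {
    if (static_cast<size_t>(Off) + 4 > Record.size())
      return false;
    // Assemble the old index byte by byte so host endianness does not matter.
    uint32_t Old = static_cast<uint32_t>(Record[Off]) |
                   (static_cast<uint32_t>(Record[Off + 1]) << 8) |
                   (static_cast<uint32_t>(Record[Off + 2]) << 16) |
                   (static_cast<uint32_t>(Record[Off + 3]) << 24);
    if (Old >= IndexMap.size())
      return false;
    uint32_t New = IndexMap[Old];
    // Write the new index back in little-endian order.
    Record[Off] = static_cast<uint8_t>(New & 0xFF);
    Record[Off + 1] = static_cast<uint8_t>((New >> 8) & 0xFF);
    Record[Off + 2] = static_cast<uint8_t>((New >> 16) & 0xFF);
    Record[Off + 3] = static_cast<uint8_t>((New >> 24) & 0xFF);
  }
  return true;
}
```

Keeping discovery (bytes in, offsets out) separate from patching means the discovery step has no side effects and can be tested on raw byte arrays alone.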
MSVC: 25.67s
Before Patch: 18.59s
After Patch: 8.92s
This basically builds a SmallVector. It's... sort of like an iterator. Do we still want to call it one? I feel like "Iterator" has a specific meaning in C++, and this isn't it. Maybe TypeIndexFinder.h, to go with findTypeIndexOffsets or something?
Alternatively, do you want to just mush all this functionality into TypeSerializer.h/cpp?