Debug Subsections (Checksums, Line Info, Frame Data, etc.) present several challenges when it comes to reading and writing, especially to and from YAML:
- Depending on whether the CodeView container is a PDB or an Object File, these show up in slightly different ways. For example, in an Object File, string tables appear in the .debug$S section as a "debug subsection", while in a PDB file the string table is global to the entire file and exists
- They can appear in any order, even though some subsections reference other subsections. For example, you might have a Lines subsection followed by a Checksums subsection followed by a String Table subsection. But the Lines subsection references the Checksums subsection, and the Checksums subsection references the String Table subsection. So you can't interpret the data until you've read everything.
When going from Obj to Yaml, you might read (for example) a File Checksums subsection, which references a String Table subsection which you haven't actually read yet. But in Pdb to Yaml, you'll already have the string table section because it's global to the file.
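The ordering problem described above can be handled with two passes: index the referenced subsections first, then resolve references. The following is a minimal sketch of that idea, not the code in this patch. The `Subsection` type, the `resolve_line_files` helper, and the payload layouts are hypothetical simplifications (for instance, the Lines payload here is a plain index into the checksums list).

```python
from dataclasses import dataclass


@dataclass
class Subsection:
    kind: str        # hypothetical tags: "lines", "checksums", "strings"
    payload: object  # simplified stand-in for the real record data


def resolve_line_files(subsections):
    """Return the file name for each Lines subsection, in stream order."""
    # Pass 1: find the subsections that others refer to, wherever they sit.
    strings = None
    checksums = None
    for sub in subsections:
        if sub.kind == "strings":
            strings = sub.payload    # dict: string table offset -> file name
        elif sub.kind == "checksums":
            checksums = sub.payload  # list of string table offsets
    if strings is None or checksums is None:
        raise ValueError("missing string table or checksums subsection")

    # Pass 2: every reference can now be followed, regardless of order.
    resolved = []
    for sub in subsections:
        if sub.kind == "lines":
            checksum_index = sub.payload  # index into the checksums list
            resolved.append(strings[checksums[checksum_index]])
    return resolved


# Lines comes first even though it depends on the two subsections after it.
stream = [
    Subsection("lines", 1),
    Subsection("checksums", [0, 6]),
    Subsection("strings", {0: "a.cpp", 6: "b.cpp"}),
]
```

In the PDB case the first pass would not need to find a string table subsection, since the string table is already available globally.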
Prior to this patch, we handled this in a fairly crude way. When going from PDB to YAML, we would find the checksums subsection and write it to YAML first, then write subsequent sections later, in effect mandating a topological sorting similar to type record streams. But this has a drawback. It means that we can't easily test our ability to work with arbitrary files, because the only files we can produce are laid out in one specific way. Furthermore, it means that round-tripping will end up producing a different PDB than what you started with, because some of the subsections will be in a different order.
Finally, it makes things even more complicated when we start trying to do Obj -> Yaml -> Obj, because object files have a String Table subsection and various other subsections with cross-subsection dependencies, and maintaining the topological sorting becomes cumbersome. So I consider this work a precursor to getting CodeView Obj <-> Yaml support working.
In short, if it is a valid object (or PDB) file, we would like to be able to produce it.
This patch allows subsections to be specified in any order. While this complicates some of the internal Yaml <-> Native data structure conversion logic, it greatly simplifies the portion of the library which actually writes subsections out, because you no longer have to have complicated sanitization checks to ensure things are written out in a specified order. You can just write whatever you want and it will just work.
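As an illustration of why the writing side gets simpler, here is a minimal sketch of the Yaml -> native direction under stated assumptions. The dictionary layout, the `build_string_table` and `lower` helpers, and the choice to reserve offset 0 for the empty string are all hypothetical and are not taken from this patch. The conversion step scans every subsection up front to assign string offsets, so the output step can emit subsections in exactly the order given.

```python
def build_string_table(subsections):
    """Assign an offset to every file name named by any checksums subsection."""
    offsets = {}
    next_offset = 1  # assumption: offset 0 is reserved for the empty string
    for sub in subsections:
        if sub["kind"] == "checksums":
            for name in sub["files"]:
                if name not in offsets:
                    offsets[name] = next_offset
                    # Each string is followed by a NUL terminator.
                    next_offset += len(name.encode("utf-8")) + 1
    return offsets


def lower(subsections):
    """Convert YAML-like subsections to native-like tuples, keeping their order."""
    offsets = build_string_table(subsections)  # dependencies resolved up front
    out = []
    for sub in subsections:
        if sub["kind"] == "checksums":
            out.append(("checksums", [offsets[n] for n in sub["files"]]))
        elif sub["kind"] == "lines":
            out.append(("lines", sub["file_index"]))
        else:
            out.append((sub["kind"], None))
    return out


# Lines is specified before Checksums; the output keeps that order.
spec = [
    {"kind": "lines", "file_index": 1},
    {"kind": "checksums", "files": ["a.cpp", "b.cpp"]},
]
native = lower(spec)
```

The complexity moves into the conversion step (`build_string_table` here), while the loop that emits subsections needs no ordering checks.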