This is an archive of the discontinued LLVM Phabricator instance.

Dump TPI records to Yaml
Closed, Public

Authored by zturner on Aug 5 2016, 1:57 PM.

Details

Summary

Note that this does not yet convert the YAML back into a serialized TPI stream or write it to the file. That's a bit more work, so I figured it would be better to get this in first.

Event Timeline

zturner updated this revision to Diff 67014. Aug 5 2016, 1:57 PM
zturner retitled this revision from to Dump TPI records to Yaml.
zturner updated this object.
zturner added reviewers: rnk, ruiu.
ruiu added inline comments. Aug 5 2016, 2:04 PM
test/DebugInfo/PDB/pdbdump-yaml-types.test
34

Value is an integer type, so it's odd that the value is wrapped in single quotes. Can you print it without quotes?

zturner updated this revision to Diff 67017. Aug 5 2016, 2:16 PM

Print integral values without quotes.

ruiu accepted this revision. Aug 5 2016, 2:48 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land. Aug 5 2016, 2:48 PM
zturner updated this revision to Diff 67596. Edited Aug 10 2016, 1:49 PM
zturner edited edge metadata.

I found some problems with the previous version of the diff. Namely, we weren't printing the TypeLeafKind for field list members, because they still went through a special code path. Although the information was technically still there, it wasn't in a format that would have allowed the YAML to re-extract it. This is because it would print something like

Enumerator {
}

So sure, if you look at the file in an editor, it looks like there's enough info to determine what kind of member field this is. But the YAML API doesn't expose this value, so we need to print it like this:

Field {
   Kind: LF_ENUMERATE
   Enumerator {
   }
}

The logical way to do this is to re-use the same visitor code for visiting the members of the field list as we do for visiting top-level members of the TPI stream, since this code already prints everything in exactly the format we need. But this is complicated by the fact that field lists and field list members have always had special handling in the visitor dispatch.

This patch tries to solve all this by merging the field list member and top-level member code paths. A FieldListRecord is introduced which contains a vector<CVType>. When a FieldList is encountered in the TPI stream, it is decomposed into a vector<CVType> and then passed to a single call to visitKnownRecord(FieldList&). This method, being part of the visitor, can then do whatever it wants, including iterating over and re-visiting the members of the field list, or throwing them away and doing nothing.
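
For readers skimming the thread, here is a minimal sketch of the merged dispatch idea described above. It is not the LLVM code: Record, Kind, and DumpVisitor are hypothetical stand-ins invented for illustration, and the output format is only loosely modeled on the "Kind: LF_ENUMERATE" example. The point it tries to show is that field list members go through the same visit() entry point as top-level records, so each member gets its Kind printed by the same code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the LLVM CodeView types.
enum class Kind { Enumerate, Member, FieldList };

struct Record {
  Kind kind;
  std::string name;
  std::vector<Record> members; // only populated when kind == Kind::FieldList
};

class DumpVisitor {
public:
  // Single entry point used for both top-level records and field list members.
  void visit(const Record &r) {
    out.push_back(indent() + "Kind: " + kindName(r.kind));
    ++depth;
    if (r.kind == Kind::FieldList) {
      // Re-enter the same dispatch for every member, so no special path exists.
      for (const Record &m : r.members)
        visit(m);
    } else {
      out.push_back(indent() + "Name: " + r.name);
    }
    --depth;
    assert(depth >= 0 && "unbalanced nesting");
  }

  const std::vector<std::string> &lines() const { return out; }

private:
  static const char *kindName(Kind k) {
    switch (k) {
    case Kind::Enumerate: return "LF_ENUMERATE";
    case Kind::Member:    return "LF_MEMBER";
    case Kind::FieldList: return "LF_FIELDLIST";
    }
    return "UNKNOWN";
  }

  std::string indent() const {
    return std::string(static_cast<std::size_t>(depth) * 2, ' ');
  }

  int depth = 0;
  std::vector<std::string> out;
};
```

A visitor that does not care about members could simply skip the loop in the FieldList branch, which corresponds to the "throwing them away and doing nothing" option mentioned above.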

The only real downside of this is that we have to deserialize the field list members twice. If that turns out to be important, we can probably solve it by making FieldListRecord a little smarter. But for simply dumping records it doesn't seem to be an issue.

zturner updated this revision to Diff 67604. Aug 10 2016, 2:26 PM

I managed to get it working so that it's not deserializing twice. Basically, FieldListRecord just stores an ArrayRef<uint8_t> covering the entire list of FieldList bytes. The visitor then calls back into itself to visit all the members.
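
A rough sketch of the "store the bytes, decode on visit" approach follows, under loud assumptions: ByteView is a hypothetical non-owning view standing in for ArrayRef<uint8_t>, and the member encoding (one kind byte, one length byte, then the name) is a toy format made up for this example, not the CodeView layout. It only illustrates that the record keeps a view of the raw bytes and each member is decoded exactly once, when the callback walks the list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical non-owning view, standing in for something like ArrayRef<uint8_t>.
struct ByteView {
  const uint8_t *data;
  std::size_t size;
};

// The record keeps only the raw bytes; members are not decoded up front.
struct FieldListRecord {
  ByteView bytes;
};

using MemberCallback = std::function<void(uint8_t kind, const std::string &name)>;

// Toy encoding assumed for this sketch: [kind:1][len:1][name:len], repeated.
// Returns false if the data is truncated, true once every member was visited.
bool visitMembers(const FieldListRecord &fl, const MemberCallback &cb) {
  std::size_t pos = 0;
  while (pos < fl.bytes.size) {
    if (fl.bytes.size - pos < 2)
      return false; // not enough bytes for the kind/length header
    uint8_t kind = fl.bytes.data[pos];
    std::size_t len = fl.bytes.data[pos + 1];
    pos += 2;
    if (fl.bytes.size - pos < len)
      return false; // name runs past the end of the buffer
    std::string name(reinterpret_cast<const char *>(fl.bytes.data + pos), len);
    cb(kind, name); // each member is decoded once, at visit time
    pos += len;
  }
  return true;
}
```

Because the view is non-owning, the underlying buffer has to outlive the FieldListRecord. The boolean return is just one way to surface a malformed list to the caller in this sketch.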

rnk added inline comments. Aug 10 2016, 4:08 PM
tools/llvm-pdbdump/PdbYaml.cpp
208

These both seem like they can fail in interesting ways. What do you think we should do with the error code, assert? The yaml validation method for error handling seems inappropriate.

rnk added inline comments. Aug 16 2016, 10:43 AM
include/llvm/DebugInfo/CodeView/TypeRecord.h
436

This could be a StreamRef, to avoid the need to copy the entire field list record into contiguous memory.

This revision was automatically updated to reflect the committed changes.