This is an archive of the discontinued LLVM Phabricator instance.

Refactor CodeView TypeRecord visitors to use common structures
ClosedPublic

Authored by zturner on May 10 2016, 3:00 PM.

Details

Summary

If we don't want to go with the approach because it involves too much boilerplate, another approach would be to go back to what we were doing before, where we raise the Layout class outside of the memory-representation class and put the serialization on the Layout class. One of the problems now is that there are some cases where the deserialization is conditional, or we have to extract some bitfields to convert to the in memory format, so there's always a little bit of custom marshalling involved.

If we wanted to more easily generalize the deserialization, limiting it to the layout classes could provide better code reuse since you may be able to declare the layout classes declaratively with macros and then have a single deserialization function that is templatized and works for everything due to the declarative nature of the layout classes. Then you could have the in memory representation classes take a layout class to the constructor or something.

Either way, I don't think this is a bad first step, and I plan to generalize as much as possible using variadic templates in a followup patch.

Diff Detail

Event Timeline

zturner updated this revision to Diff 56822.May 10 2016, 3:00 PM
zturner retitled this revision from to Refactor CodeView TypeRecord visitors to use common structures.
zturner updated this object.
zturner added reviewers: rnk, amccarth.
zturner added a subscriber: llvm-commits.
rnk edited edge metadata.May 10 2016, 3:33 PM

There's some common patterns we should fix up for safety. I'll re-review all the .slice call sites again after the common patterns are fixed up.

include/llvm/DebugInfo/CodeView/CVTypeVisitor.h
73

This should be return parseError();, and if we were really cool we'd propagate out the Error. I'm fine with the existing bool.

include/llvm/DebugInfo/CodeView/CodeView.h
429

Eventually we should unify this with CVLeafTypes, but today is not that day.

include/llvm/DebugInfo/CodeView/TypeRecord.h
18

lib/DebugInfo/CodeView shouldn't depend on lib/Object, maybe go for std::errc::illegal_byte_sequence or something for now.

115–118

Is there a reason to not use consumeObject here?

164

ditto, consumeObject?

299

This needs handling to avoid buffer overrun when the string is not null terminated. I think you should add a helper like this and then use it everywhere you use reinterpret_cast<const char *>:

StringRef consumeCString(ArrayRef<uint8_t> &Data) {
  StringRef Str, Rest;
  std::tie(Str, Rest) = getBytesAsCharacters(Data).split('\0');
  Data = ArrayRef<uint8_t>(reinterpret_cast<const uint8_t*>(Rest.data()), Rest.size());
  return Str;
}

Alternatively, we could treat lack of null termination as an error, but at least truncating is better than running over.

337

This slice should be checked against Data.size() or we'll abort

480

getBytesAsCString

519

getBytesAsCString

582

I agree, splitting this seems useful.

751

Woo, nibble decoding. I think the dumper doesn't even print these because they are so uninteresting in a post 16 bit world.

You should check up front that there are at least (VFEntryCount +1) / 2 bytes in Data, or the slices will abort.

zturner updated this revision to Diff 56836.May 10 2016, 4:27 PM
zturner edited edge metadata.
zturner marked 8 inline comments as done.

Fix review suggestions, now with full context.

rnk added inline comments.May 10 2016, 5:08 PM
include/llvm/DebugInfo/CodeView/TypeRecord.h
74

Given that you wired up error_code, this should return an error if Data.size() == Str.size(), which implies that there was no null terminator.

83

I think you meant Data.size() < Size :)

278

We probably shouldn't inline all these deserialization methods. We should probably make a TypeRecord.cpp for these.

633

consumeCString

697

consumeCString

794

You should check up front that there are at least (VFEntryCount +1) / 2 bytes in Data, or the slices will abort.

This comment is still relevant. Maybe you could handle it by doing this:

ArrayRef<uint8_t> SlotBytes;
if (auto EC = consumeArray(SlotBytes, Data, (L->VFEntryCount + 1) / 2))
  return EC;
if (L->VFEntryCount % 2 == 1)
  SlotBytes = SlotBytes.drop_back(1); // plus other special handling for that last byte
for (uint8_t Byte : SlotBytes) {
  // handle two nibbles at a time
}
zturner updated this revision to Diff 56842.May 10 2016, 5:27 PM

I'm thinking I can move the deserialize methods to a cpp file in a followup, because I want to try reducing the amount of boilerplate with some templates, and it's not clear how that will turn out. If i'm lucky I can get a lot of the deserialize methods to disappear.

BTW, latest patch should address all the concerns.

rnk accepted this revision.May 11 2016, 8:36 AM
rnk edited edge metadata.

lgtm with one last fix

include/llvm/DebugInfo/CodeView/TypeRecord.h
697

What about this though?

This revision is now accepted and ready to land.May 11 2016, 8:36 AM
zturner closed this revision.May 11 2016, 3:04 PM