This is an archive of the discontinued LLVM Phabricator instance.

[codeview] Add complex record type translation
ClosedPublic

Authored by aaboud on Jun 5 2016, 8:32 PM.

Details

Summary

Improved the support of records (classes and structures) to include virtual and non-virtual methods, virtual and non-virtual inheritance, bitfield members.

Major changes included in the patch

  1. Introduced "ClassInfo" container that is used to collect record info from current debug info metadata that is needed to emit CodeView records.
  2. Extended the TypeIndices map to have another dimension, i.e. from {DINode -> TypeIndex} to {DIType -> {DINode -> TypeIndex} }.
    • This is needed as DISubroutine might be shared for static methods from different classes as well as regular functions, e.g. all three functions bellow (f1, f2, and f3) will have same DISubroutine.
void f1() {}
class A {
  static void f2() {}
};
class B {
  static void f3() {}
};

Diff Detail

Repository
rL LLVM

Event Timeline

aaboud updated this revision to Diff 59686.Jun 5 2016, 8:32 PM
aaboud retitled this revision from to [codeview] Add complex record type translation.
aaboud updated this object.
aaboud added reviewers: rnk, majnemer.
aaboud added a subscriber: llvm-commits.
aaboud added subscribers: bwyma, smerritt.
aaboud updated this revision to Diff 59688.Jun 5 2016, 8:35 PM

Fixed some comments.

rnk edited edge metadata.Jun 6 2016, 12:10 PM

Can we separate the instance member function type changes from the class info changes? We should already get method types from member functions.

include/llvm/DebugInfo/CodeView/TypeRecord.h
1362 ↗(On Diff #59688)

It would be more consistent to directly take the TypeRecordKind as a parameter here.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
876–878 ↗(On Diff #59688)

We should assert if this happens. I'd be interested in a test case that triggered this scenario.

927 ↗(On Diff #59688)

I would really prefer it if we formed a new key into our type index map and called the generic getTypeIndex method here. Calling 'lowerTypeMemberFunction' unconditionally is a good way to end up lowering a type twice by accident.

964–966 ↗(On Diff #59688)

Ditto, this is bad.

1220 ↗(On Diff #59688)

Ah, yeah, OK, when generating complete type info we will need to recurse onto base types. Your comment from before about not caching the hash table lookup makes sense now.

1301 ↗(On Diff #59688)

Please don't commit this code if we can just fix it in clang. It's really easy to change our name printing code, we have a printing option that detects if codeview is enabled.

lib/CodeGen/AsmPrinter/CodeViewDebug.h
145–150 ↗(On Diff #59688)

Having DenseMaps inside DenseMaps will be terribly inefficient. A better way to do this would be to extend the key to have more data, perhaps with a PointerUnion between DINode* and MyCustomSubroutineKey.

Actually, I think it might be better to change DISubroutineType to have a scope operand. We're going to need to add a callingconv field anyway, might as well go all the way so we can simplify this. I need to look at Duncan's recent ODR changes to understand if it's OK to have this cycle in the metadata.

rnk edited edge metadata.Jun 6 2016, 12:10 PM
rnk added a subscriber: dexonsmith.
majnemer added inline comments.Jun 6 2016, 4:37 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1239–1243 ↗(On Diff #59688)

Please use / 8 here and elsewhere.

1255 ↗(On Diff #59688)

Please use the LLVM naming convention.

1591 ↗(On Diff #59688)

Please parenthesize the subexpression.

aaboud marked 6 inline comments as done.Jun 8 2016, 7:29 AM

Thanks for the feedback, I will apply fixes in next patch.
Please see my response on some comments below.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
876–878 ↗(On Diff #59688)

No need to add assertion, here is an example:

struct A {
  A* next;
};

void foo() {
 A *p;
 A a;
}
964–966 ↗(On Diff #59688)

And once again, an example :)

struct A {
  const A* next;
  A();
};

void foo() {
 const A a;
}
1301 ↗(On Diff #59688)

Actually, I think we can change the way clang generate DISubprogram names: http://reviews.llvm.org/rL255744
This commit was not the right thing to do, because when generating function inside FieldList we need to give it the naked name (without scoping), however in all other places we will need the full name with the scope.

If you ask me, I prefer that Clang do not add the scope name prefix to the function name, and lit this to the be done by the CodeViewDebug component in the Backend.

aaboud updated this revision to Diff 60036.Jun 8 2016, 7:36 AM

Updated patch according to comments from Reid and David.

rnk added inline comments.Jun 8 2016, 10:59 AM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
879–881 ↗(On Diff #60036)

This example works fine today without this extra lookup, though. The only way I could imagine this happening is if translating 'A*' triggers translation of the complete A type, and we can avoid that. The graph of types should be acyclic if you don't follow edges through record types, which is what I was trying to do.

Instead, I think we should queue all complete record types that we've seen during type translation in a separate list, and lower them later. We'll need this anyway to implement S_UDT.

1065 ↗(On Diff #60036)

Hm, maybe this thing should just take (unsigned RecordTag, unsigned Flags) so we can avoid the duplication.

1293 ↗(On Diff #60036)

Sure, we can do that. Just modify the metadata in your test case to have the names you expect and we'll fix clang later.

Thanks Reid for the comments.
Please just let me know what you think about the double check for Modifier and Pointer record generation (see my comment below).
I will upload a new patch with all the fixes accordingly.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
879–881 ↗(On Diff #60036)

The only reason why this is working today without this check, is that we have another check in MemoryTypeTableBuilder for the hashed record.
But this is still inefficient as we are creating the Record and building the String using the TypeRecordBuilder, just to find out that we already have the record hashed!

If you think this is good enough, I do not mind removing all these checks (we can add them later anyway).

1065 ↗(On Diff #60036)

Will fix in next uploaded patch.

1293 ↗(On Diff #60036)

Will do that in next uploaded patch.

aaboud updated this revision to Diff 60107.Jun 8 2016, 3:07 PM
aaboud marked an inline comment as done.

Addressing two of Reid comments.

rnk added a comment.Jun 8 2016, 6:23 PM

I still feel like this patch is too large and is trying to do too much. Can we limit it to just handling non-virtual methods and their overloads? I don't want to review the VBPtr offset computation code at the same time as bitfields and virtual bases and etc etc. Each of those things requires a certain amount of care.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1029 ↗(On Diff #60107)

You should rebase your patch to pick up the calling convention changes, this TODO should be done.

1132–1137 ↗(On Diff #60107)

You should be able to key these on MDString* instead of StringRef, because MDStrings are uniqued. That will be more efficient.

1209 ↗(On Diff #60107)

All this code below here scares me. It feels like an ad-hoc reimplementation of inheritance logic that should really be in clang. I think it would really simplify things if we ignored virtual base classes for now, and didn't try to emit them just yet. It would make it easier to review this patch.

1323 ↗(On Diff #60107)

This patch uses too much auto in for loops. The LLVM guidelines say to use it in "places where the type is already obvious from the context". As a reader, Methods is some other data structure that I don't know much about, so it would help me to have the type here.

aaboud added a comment.Jun 9 2016, 6:39 AM
In D21011#453066, @rnk wrote:

I still feel like this patch is too large and is trying to do too much. Can we limit it to just handling non-virtual methods and their overloads? I don't want to review the VBPtr offset computation code at the same time as bitfields and virtual bases and etc etc. Each of those things requires a certain amount of care.

OK, this is what I am going to do:

  1. Update (rebase) this patch to top of trunk (no other changes will be made) and will upload the patch.
  2. I will remove remove from the patch the handling of bitfield, VBPtr, virtual inheritance, etc. and will apply relevant comments.

I am doing (1) so we still have a reference for this code (updated to top of trunk).
I will upload these patches soon today.

aaboud updated this revision to Diff 60245.Jun 9 2016, 2:55 PM

Updated patch to top of trunk.
Next will split patch into small patches.

aaboud updated this revision to Diff 60418.Jun 10 2016, 4:21 PM

Reduced the support to only methods.

rnk added a comment.Jun 15 2016, 10:30 AM

This needs some test, I'd add something like types-method-overloads.ll, and keep it focused to non-virtual methods so that we don't have to update it too much as we handle more complex virtual inheritance cases.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1185–1192 ↗(On Diff #60418)

This does 3 lookups, it only needs to do one, something like this:

auto Insertion = ClassInfoMap.insert({Ty, std::unique_ptr<ClassInfo>()});
std::unique_ptr<ClassInfo> &Info = *Insertion.first->second;
if (!Insertion.second)
  return Info;
Info.reset(new ClassInfo());
...
1202 ↗(On Diff #60418)

Should we claim that all methods are introduced virtual methods, or just the virtual ones?

aaboud marked an inline comment as done.Jun 16 2016, 10:53 AM

Thanks Reid for the comments, I will upload a new version with a LIT tests and some fixes.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1202 ↗(On Diff #60418)

Introduce is only for virtual methods, however the "Methods" container holds all methods of the class (virtual and non-virtual).
I am using the same container to be able to output the methods in the right order, though I am not sure if the order is important!

See "translateMethodKindFlags" at line 1091, where it consider the "Introduced" variable only for virtual functions.

rnk added inline comments.Jun 16 2016, 11:07 AM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1202 ↗(On Diff #60418)

I guess I'm questioning the usefulness of MethodInfo. The 'Introduced' boolean is always set to true in this code.

How do you propose to calculate whether a method was introduced? Remember, we can't actually walk the class hierarchy because the frontend may not emit complete debug information about base classes, unless -fstandalone-debug-info is on. I think this is something that we will need the frontend to tell us directly, see https://llvm.org/bugs/show_bug.cgi?id=28150.

Anyway, this doesn't need to hold up the patch. With tests I think this is ready and we can iterate on it from there.

aaboud marked an inline comment as done.Jun 16 2016, 11:12 AM
aaboud added inline comments.
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1202 ↗(On Diff #60418)

Actually, I planed to travel the hierarchy, though you might have a point with clang not emitting the complete debug info (but need to produce an example for that first).

Anyway, I added a comment and switched the value to false, as we are handling non-virtual methods for now, and they should not be marked as introduced.

aaboud updated this revision to Diff 61009.Jun 16 2016, 12:29 PM
aaboud marked an inline comment as done.

Added LIT test, and applied some minor changes to the code according to Reid comments.

majnemer added inline comments.Jun 16 2016, 12:37 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
751–752 ↗(On Diff #61009)

else after return is discouraged by the coding standards: http://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return

1132 ↗(On Diff #61009)

Does this need to be in the LLVM namespace? Why not stick it in an anonymous namespace?

1263 ↗(On Diff #61009)

Why not use std::ignore instead?

1283 ↗(On Diff #61009)

Comments should end in a period.

1299 ↗(On Diff #61009)

Please clang-format this.

rnk added inline comments.Jun 16 2016, 1:21 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1050–1051 ↗(On Diff #61009)

The TODO is done, you can delete the comment.

1132 ↗(On Diff #61009)

It does, it's forward declared and used as an incomplete type in the header.

aaboud updated this revision to Diff 61050.Jun 16 2016, 5:14 PM
aaboud marked 5 inline comments as done.

Applied changes according to comments from David and Reid.

rnk accepted this revision.Jun 17 2016, 8:02 AM
rnk edited edge metadata.

lgtm, thanks!

This revision is now accepted and ready to land.Jun 17 2016, 8:02 AM
This revision was automatically updated to reflect the committed changes.