This is an archive of the discontinued LLVM Phabricator instance.

[codeview] Add complex record type translation
ClosedPublic

Authored by aaboud on Jun 5 2016, 8:32 PM.

Details

Summary

Improved the support of records (classes and structures) to include virtual and non-virtual methods, virtual and non-virtual inheritance, bitfield members.

Major changes included in the patch

  1. Introduced "ClassInfo" container that is used to collect record info from current debug info metadata that is needed to emit CodeView records.
  2. Extended the TypeIndices map to have another dimension, i.e. from {DINode -> TypeIndex} to {DIType -> {DINode -> TypeIndex} }.
    • This is needed as DISubroutine might be shared for static methods from different classes as well as regular functions, e.g. all three functions below (f1, f2, and f3) will have the same DISubroutine.
void f1() {}
class A {
  static void f2() {}
};
class B {
  static void f3() {}
};
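
A minimal sketch of the two-level map described in item 2, using standard containers rather than LLVM's. `Node`, `TypeIndexTable`, and `getOrCreate` are hypothetical stand-ins and are not the names used in the patch. The outer key is the containing class (null for a free function), so one shared function type can map to a different index per class.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for LLVM's DINode/DIType; not the real classes.
struct Node { const char *Name; };
using TypeIndex = std::uint32_t;

// Two-level map: class scope (nullptr for free functions) -> node -> index.
class TypeIndexTable {
  std::map<const Node *, std::map<const Node *, TypeIndex>> Indices;
  TypeIndex Next = 0x1000; // CodeView's first non-simple type index.

public:
  // Returns the index for (ClassTy, Ty), allocating a new one on first use.
  TypeIndex getOrCreate(const Node *ClassTy, const Node *Ty) {
    assert(Ty && "type node must not be null");
    auto Result = Indices[ClassTy].insert({Ty, Next});
    if (Result.second)
      ++Next; // The pair was new, so the index has been consumed.
    return Result.first->second;
  }
};
```

With this layout, f1, f2, and f3 each get their own index even though all three pass the same function-type node.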

Event Timeline

aaboud updated this revision to Diff 59686.Jun 5 2016, 8:32 PM
aaboud retitled this revision from to [codeview] Add complex record type translation.
aaboud updated this object.
aaboud added reviewers: rnk, majnemer.
aaboud added a subscriber: llvm-commits.
aaboud added subscribers: bwyma, smerritt.
aaboud updated this revision to Diff 59688.Jun 5 2016, 8:35 PM

Fixed some comments.

rnk edited edge metadata.Jun 6 2016, 12:10 PM

Can we separate the instance member function type changes from the class info changes? We should already get method types from member functions.

include/llvm/DebugInfo/CodeView/TypeRecord.h
1362

It would be more consistent to directly take the TypeRecordKind as a parameter here.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
876–878

We should assert if this happens. I'd be interested in a test case that triggered this scenario.

927

I would really prefer it if we formed a new key into our type index map and called the generic getTypeIndex method here. Calling 'lowerTypeMemberFunction' unconditionally is a good way to end up lowering a type twice by accident.

964–966

Ditto, this is bad.

1220

Ah, yeah, OK, when generating complete type info we will need to recurse onto base types. Your comment from before about not caching the hash table lookup makes sense now.

1301

Please don't commit this code if we can just fix it in clang. It's really easy to change our name printing code, we have a printing option that detects if codeview is enabled.

lib/CodeGen/AsmPrinter/CodeViewDebug.h
145–150

Having DenseMaps inside DenseMaps will be terribly inefficient. A better way to do this would be to extend the key to have more data, perhaps with a PointerUnion between DINode* and MyCustomSubroutineKey.

Actually, I think it might be better to change DISubroutineType to have a scope operand. We're going to need to add a callingconv field anyway, might as well go all the way so we can simplify this. I need to look at Duncan's recent ODR changes to understand if it's OK to have this cycle in the metadata.
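
The first suggestion above (extend the key instead of nesting maps) could look roughly like the sketch below. It uses `std::unordered_map` with a pair key as a stand-in for a DenseMap with a richer key. `Node`, `Key`, `KeyHash`, and `lookupOrInsert` are hypothetical names, and the hash mix is just one common way to combine two hashes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

struct Node {}; // Hypothetical stand-in for DINode/DIType.
using TypeIndex = std::uint32_t;
using Key = std::pair<const Node *, const Node *>; // (node, class scope)

// Combines the two pointer hashes into one value.
struct KeyHash {
  std::size_t operator()(const Key &K) const {
    std::size_t H1 = std::hash<const Node *>()(K.first);
    std::size_t H2 = std::hash<const Node *>()(K.second);
    return H1 ^ (H2 + 0x9e3779b9 + (H1 << 6) + (H1 >> 2));
  }
};

// One flat table: a single hash and probe per query instead of two.
using FlatTypeIndexMap = std::unordered_map<Key, TypeIndex, KeyHash>;

// Returns the existing index for (Ty, ClassTy), or stores and returns Fresh.
TypeIndex lookupOrInsert(FlatTypeIndexMap &Map, const Node *Ty,
                         const Node *ClassTy, TypeIndex Fresh) {
  assert(Ty && "type node must not be null");
  return Map.emplace(Key(Ty, ClassTy), Fresh).first->second;
}
```

Compared with a map of maps, this keeps one allocation and one table regardless of how many classes appear.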

rnk edited edge metadata.Jun 6 2016, 12:10 PM
rnk added a subscriber: dexonsmith.
majnemer added inline comments.Jun 6 2016, 4:37 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1239–1243

Please use / 8 here and elsewhere.

1255

Please use the LLVM naming convention.

1591

Please parenthesize the subexpression.

aaboud marked 6 inline comments as done.Jun 8 2016, 7:29 AM

Thanks for the feedback, I will apply fixes in the next patch.
Please see my response on some comments below.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
876–878

No need to add assertion, here is an example:

struct A {
  A* next;
};

void foo() {
 A *p;
 A a;
}
964–966

And once again, an example :)

struct A {
  const A* next;
  A();
};

void foo() {
 const A a;
}
1301

Actually, I think we can change the way clang generates DISubprogram names: http://reviews.llvm.org/rL255744
This commit was not the right thing to do, because when generating a function inside a FieldList we need to give it the naked name (without scoping), whereas in all other places we will need the full name with the scope.

If you ask me, I would prefer that Clang does not add the scope name prefix to the function name, and that this be left to the CodeViewDebug component in the backend.

aaboud updated this revision to Diff 60036.Jun 8 2016, 7:36 AM

Updated patch according to comments from Reid and David.

rnk added inline comments.Jun 8 2016, 10:59 AM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
876–878

This example works fine today without this extra lookup, though. The only way I could imagine this happening is if translating 'A*' triggers translation of the complete A type, and we can avoid that. The graph of types should be acyclic if you don't follow edges through record types, which is what I was trying to do.

Instead, I think we should queue all complete record types that we've seen during type translation in a separate list, and lower them later. We'll need this anyway to implement S_UDT.
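
A rough sketch of the deferred-lowering idea, under simplifying assumptions: `Type`, `Translator`, and `lowerDeferredRecords` are hypothetical and much simpler than the real metadata and CodeView classes. Translating a record only hands out an index and queues it, so translating `A*` never recurses into the members of `A`. The members are visited later from the queue.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>
#include <vector>

// Hypothetical, simplified type graph; not the LLVM metadata classes.
struct Type {
  enum Kind { Pointer, Record } K;
  std::vector<const Type *> Members; // Record: members. Pointer: the pointee.
};
using TypeIndex = std::uint32_t;

class Translator {
  std::map<const Type *, TypeIndex> Indices;
  std::deque<const Type *> DeferredRecords;
  TypeIndex Next = 0x1000;

public:
  std::vector<const Type *> Completed; // Records whose members were lowered.

  TypeIndex getTypeIndex(const Type *Ty) {
    auto It = Indices.find(Ty);
    if (It != Indices.end())
      return It->second;
    if (Ty->K == Type::Pointer) {
      assert(Ty->Members.size() == 1 && "pointer has exactly one pointee");
      getTypeIndex(Ty->Members[0]); // Never descends into record members.
    } else {
      DeferredRecords.push_back(Ty); // Members are lowered later.
    }
    TypeIndex Index = Next++;
    Indices[Ty] = Index;
    return Index;
  }

  // Drains the queue; lowering members may enqueue further records.
  void lowerDeferredRecords() {
    while (!DeferredRecords.empty()) {
      const Type *Rec = DeferredRecords.front();
      DeferredRecords.pop_front();
      for (const Type *Member : Rec->Members)
        getTypeIndex(Member);
      Completed.push_back(Rec);
    }
  }
};
```

For the `struct A { A* next; }` example, `getTypeIndex` on `A*` terminates immediately, and the self-reference is resolved from the cache when the queue is drained.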

1073

Hm, maybe this thing should just take (unsigned RecordTag, unsigned Flags) so we can avoid the duplication.

1301

Sure, we can do that. Just modify the metadata in your test case to have the names you expect and we'll fix clang later.

Thanks Reid for the comments.
Please just let me know what you think about the double check for Modifier and Pointer record generation (see my comment below).
I will upload a new patch with all the fixes accordingly.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
876–878

The only reason why this is working today without this check, is that we have another check in MemoryTypeTableBuilder for the hashed record.
But this is still inefficient as we are creating the Record and building the String using the TypeRecordBuilder, just to find out that we already have the record hashed!

If you think this is good enough, I do not mind removing all these checks (we can add them later anyway).

1073

Will fix in next uploaded patch.

1301

Will do that in next uploaded patch.

aaboud updated this revision to Diff 60107.Jun 8 2016, 3:07 PM
aaboud marked an inline comment as done.

Addressing two of Reid's comments.

rnk added a comment.Jun 8 2016, 6:23 PM

I still feel like this patch is too large and is trying to do too much. Can we limit it to just handling non-virtual methods and their overloads? I don't want to review the VBPtr offset computation code at the same time as bitfields and virtual bases and etc etc. Each of those things requires a certain amount of care.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1092

You should rebase your patch to pick up the calling convention changes, this TODO should be done.

1155–1160

You should be able to key these on MDString* instead of StringRef, because MDStrings are uniqued. That will be more efficient.
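
To illustrate why a uniqued string can be keyed by pointer, here is a small sketch with a hypothetical `StringPool` playing the role of the uniqued MDString storage. None of these names come from LLVM. Once equal contents always yield the same pointer, hashing and comparing keys no longer touches the characters.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical interner standing in for uniqued MDString storage.
class StringPool {
  std::unordered_set<std::string> Storage;

public:
  // Equal contents always yield the same pointer. Element addresses in an
  // unordered_set stay valid when the table rehashes.
  const std::string *intern(const std::string &S) {
    return &*Storage.insert(S).first;
  }
};

// Overload sets keyed by the interned pointer rather than by the text.
using MethodGroups =
    std::unordered_map<const std::string *, std::vector<int>>;

void addMethod(MethodGroups &Groups, const std::string *Name, int MethodId) {
  assert(Name && "method name must be interned first");
  Groups[Name].push_back(MethodId);
}
```

Two overloads named "f" land in the same group because both names intern to one pointer.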

1232

All this code below here scares me. It feels like an ad-hoc reimplementation of inheritance logic that should really be in clang. I think it would really simplify things if we ignored virtual base classes for now, and didn't try to emit them just yet. It would make it easier to review this patch.

1346

This patch uses too much auto in for loops. The LLVM guidelines say to use it in "places where the type is already obvious from the context". As a reader, Methods is some other data structure that I don't know much about, so it would help me to have the type here.
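
As an illustration of the point about `auto`, the sketch below spells out the element types in nested range-based for loops. The container shapes (`MethodsList`, `MethodsMap`) and `MethodInfo` fields are assumptions made for the example and may not match the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct MethodInfo { bool IsVirtual; }; // Hypothetical shape.
using MethodsList = std::vector<MethodInfo>;
using MethodsMap = std::map<std::string, MethodsList>;

// Counts virtual methods across all overload sets.
std::size_t countVirtual(const MethodsMap &Methods) {
  std::size_t Count = 0;
  // Explicit element types tell the reader what each loop iterates over.
  for (const MethodsMap::value_type &Overloads : Methods)
    for (const MethodInfo &Method : Overloads.second)
      if (Method.IsVirtual)
        ++Count;
  return Count;
}
```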

aaboud added a comment.Jun 9 2016, 6:39 AM
In D21011#453066, @rnk wrote:

I still feel like this patch is too large and is trying to do too much. Can we limit it to just handling non-virtual methods and their overloads? I don't want to review the VBPtr offset computation code at the same time as bitfields and virtual bases and etc etc. Each of those things requires a certain amount of care.

OK, this is what I am going to do:

  1. Update (rebase) this patch to top of trunk (no other changes will be made) and will upload the patch.
  2. I will remove from the patch the handling of bitfields, VBPtr, virtual inheritance, etc., and will apply the relevant comments.

I am doing (1) so we still have a reference for this code (updated to top of trunk).
I will upload these patches soon today.

aaboud updated this revision to Diff 60245.Jun 9 2016, 2:55 PM

Updated patch to top of trunk.
Next, I will split the patch into smaller patches.

aaboud updated this revision to Diff 60418.Jun 10 2016, 4:21 PM

Reduced the support to only methods.

rnk added a comment.Jun 15 2016, 10:30 AM

This needs some tests. I'd add something like types-method-overloads.ll, and keep it focused on non-virtual methods so that we don't have to update it too much as we handle more complex virtual inheritance cases.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1189–1196

This does 3 lookups, it only needs to do one, something like this:

auto Insertion = ClassInfoMap.insert({Ty, std::unique_ptr<ClassInfo>()});
std::unique_ptr<ClassInfo> &Info = Insertion.first->second;
if (!Insertion.second)
  return Info;
Info.reset(new ClassInfo());
...
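
A self-contained version of the single-lookup pattern, written against `std::map` so it compiles without LLVM headers. `ClassInfoCache`, `getOrCreate`, and the `ClassInfo` field are hypothetical, and returning a reference is an assumption about how the caller uses the result.

```cpp
#include <cassert>
#include <map>
#include <memory>

struct Type {};                            // Hypothetical key type.
struct ClassInfo { int NumMethods = 0; };  // Hypothetical payload.

class ClassInfoCache {
  std::map<const Type *, std::unique_ptr<ClassInfo>> Map;

public:
  int NumCreated = 0;

  ClassInfo &getOrCreate(const Type *Ty) {
    assert(Ty && "type must not be null");
    // One lookup: emplace returns the slot and whether it was newly added.
    auto Insertion = Map.emplace(Ty, std::unique_ptr<ClassInfo>());
    std::unique_ptr<ClassInfo> &Info = Insertion.first->second;
    if (!Insertion.second)
      return *Info; // Already collected on an earlier call.
    Info.reset(new ClassInfo());
    ++NumCreated;
    return *Info;
  }
};
```

The empty `unique_ptr` acts as a placeholder, so the expensive object is only constructed when the key was not already present.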
1206

Should we claim that all methods are introduced virtual methods, or just the virtual ones?

aaboud marked an inline comment as done.Jun 16 2016, 10:53 AM

Thanks Reid for the comments, I will upload a new version with LIT tests and some fixes.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1206

"Introduced" applies only to virtual methods; however, the "Methods" container holds all methods of the class (virtual and non-virtual).
I am using the same container to be able to output the methods in the right order, though I am not sure whether the order is important.

See "translateMethodKindFlags" at line 1091, where it considers the "Introduced" variable only for virtual functions.

rnk added inline comments.Jun 16 2016, 11:07 AM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1206

I guess I'm questioning the usefulness of MethodInfo. The 'Introduced' boolean is always set to true in this code.

How do you propose to calculate whether a method was introduced? Remember, we can't actually walk the class hierarchy because the frontend may not emit complete debug information about base classes, unless -fstandalone-debug is on. I think this is something that we will need the frontend to tell us directly, see https://llvm.org/bugs/show_bug.cgi?id=28150.

Anyway, this doesn't need to hold up the patch. With tests I think this is ready and we can iterate on it from there.

aaboud marked an inline comment as done.Jun 16 2016, 11:12 AM
aaboud added inline comments.
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1206

Actually, I planned to traverse the hierarchy, though you might have a point about clang not emitting the complete debug info (but I need to produce an example for that first).

Anyway, I added a comment and switched the value to false, as we are handling non-virtual methods for now, and they should not be marked as introduced.

aaboud updated this revision to Diff 61009.Jun 16 2016, 12:29 PM
aaboud marked an inline comment as done.

Added a LIT test, and applied some minor changes to the code according to Reid's comments.

majnemer added inline comments.Jun 16 2016, 12:37 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
759–760

else after return is discouraged by the coding standards: http://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return
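
For readers unfamiliar with that rule, a tiny sketch of the preferred shape. `Access` and `defaultAccess` are made-up names and are unrelated to the code under review.

```cpp
#include <cassert>

// Hypothetical helper; the names are illustrative only.
enum class Access { Private, Public };

// Discouraged form:
//   if (IsClass) { return Access::Private; } else { return Access::Public; }
// Preferred form: return early and keep the remaining code unindented.
Access defaultAccess(bool IsClass) {
  if (IsClass)
    return Access::Private;
  return Access::Public;
}
```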

1136

Does this need to be in the LLVM namespace? Why not stick it in an anonymous namespace?

1461

Why not use std::ignore instead?
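
The code at that line is not shown in this archive, so the following is only a generic sketch of what `std::ignore` is typically used for: unpacking a pair or tuple while discarding one element. `lookupField` and `fieldOffsetOnly` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical function returning two values, only one of which is needed.
std::pair<int, std::string> lookupField() { return {42, "unused"}; }

int fieldOffsetOnly() {
  int Offset = 0;
  // std::ignore discards the second element without a dummy variable.
  std::tie(Offset, std::ignore) = lookupField();
  return Offset;
}
```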

1505

Comments should end in a period.

1552

Please clang-format this.

rnk added inline comments.Jun 16 2016, 1:21 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1094–1095

The TODO is done, you can delete the comment.

1136

It does, it's forward declared and used as an incomplete type in the header.

aaboud updated this revision to Diff 61050.Jun 16 2016, 5:14 PM
aaboud marked 5 inline comments as done.

Applied changes according to comments from David and Reid.

rnk accepted this revision.Jun 17 2016, 8:02 AM
rnk edited edge metadata.

lgtm, thanks!

This revision is now accepted and ready to land.Jun 17 2016, 8:02 AM
This revision was automatically updated to reflect the committed changes.