This is an archive of the discontinued LLVM Phabricator instance.

[codeview] Emit typedefs of types derived from incomplete types
Needs ReviewPublic

Authored by rnk on Dec 3 2018, 3:39 PM.

Details

Reviewers
zturner
aganea
Summary

The last time we evaluated this behavior was in r311904, which had this
message:

S_UDT symbols are the debugger's "index" for all the structs,
typedefs, classes, and enums in a program.  If any of those
structs/classes don't have a complete declaration, or if there is a
typedef to something that doesn't have a complete definition, then
emitting the S_UDT is unhelpful because it doesn't give the debugger
enough information to do anything useful.  On the other hand, it
results in a huge size blow-up in the resulting PDB, which is
exacerbated by an order of magnitude when linking with
/DEBUG:FASTLINK.

With this patch, we drop S_UDT records for types that refer either
directly or indirectly (e.g. through a typedef, pointer, etc) to a
class/struct/union/enum without a complete definition.  This brings us
about 50% of the way towards parity with /DEBUG:FASTLINK PDBs
generated from cl-compiled object files.

This modifies LLVM so that we only discard S_UDTs pointing directly to
incomplete types, but keep S_UDTs that indirectly use incomplete types,
as in the following C++ fragment:

struct Opaque;
typedef Opaque *OpaqueHandle;
OpaqueHandle gv;

If we don't emit the OpaqueHandle typedef in this TU, it's possible that
no TU will emit the typedef. On the other hand, this could considerably
increase object file size, so we should keep an eye on that.

Event Timeline

rnk created this revision.Dec 3 2018, 3:39 PM
zturner added inline comments.Dec 3 2018, 4:02 PM
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1447

Is this sufficient? If T->isForwardDecl() is true, does that mean the debug info does not contain a full decl at all?

rnk marked an inline comment as done.Dec 3 2018, 6:26 PM
rnk added inline comments.
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1447

Yes, the metadata has only a forward decl or a complete definition.

aganea added a comment.EditedDec 4 2018, 8:40 AM

I just wanted to point out that this overall process of dropping UDT records in the compiler is not the behavior that MSVC exhibits. S_UDTs are (should be) stripped by the linker, not by the compiler. It's also the linker's job (in my sense) to remap forward references to full complete types (albeit if you can it locally in the compiler, it is probably better).

> cl udts-fwd.cpp /c /Z7
> cvdump udts-fwd.obj

*** SYMBOLS

(0000C8) S_GDATA32: [0000:00000000], Type:             0x105C, global_ptr
(0000E1) S_UDT:             0x105D, TypedefFwdPtr
(0000F7) S_UDT:             0x105B, TypedefFwd

*** TYPES

0x105b : Length = 34, Leaf = 0x1505 LF_STRUCTURE
	# members = 0,  field list type 0x0000, FORWARD REF, 
	Derivation list type 0x0000, VT shape type 0x0000
	Size = 0, class name = Foo, unique name = .?AUFoo@@

0x105c : Length = 10, Leaf = 0x1002 LF_POINTER
	Pointer (__ptr64), Size: 8
	Element type : 0x105B

0x105d : Length = 10, Leaf = 0x1002 LF_POINTER
	Pointer (__ptr64), Size: 8
	Element type : 0x105B

> link udts-fwd.obj /debug /entry:ExitProcess kernel32.lib /subsytem:console
> cvdump udts-fwd.pdb


*** TYPES

0x1000 : Length = 34, Leaf = 0x1505 LF_STRUCTURE
	# members = 0,  field list type 0x0000, FORWARD REF, 
	Derivation list type 0x0000, VT shape type 0x0000
	Size = 0, class name = Foo, unique name = .?AUFoo@@

0x1001 : Length = 10, Leaf = 0x1002 LF_POINTER
	Pointer (__ptr64), Size: 8
	Element type : 0x1000

*** GLOBALS

S_UDT:             0x1001, TypedefFwdPtr
S_GDATA32: [0003:00000000], Type:             0x1001, global_ptr

Like I was suggesting in D55236, I think this is part of a post-merge step that link.exe does, see here and here (although I haven't looked exactly into detail).

I just wanted to point out that this overall process of dropping UDT records in the compiler is not the behavior that MSVC exhibits. S_UDTs are (should be) stripped by the linker, not by the compiler.

I think it's fine to drop them in the compiler, because we're relying on another TU to emit them. So this saves the linker from doing some unnecessary merging in cases where we know there's no benefit to emitting them in the compiler.

It's also the linker's job (in my sense) to remap forward references to full complete types (albeit if you can it locally in the compiler, it is probably better).

In an ideal world, probably. But this is also going to be rather slow. In Reid's example above, which I'll repeat here:

struct Opaque;
typedef Opaque *OpaqueHandle;
OpaqueHandle gv;

we can't emit a full declaration for the S_UDT for Opaque, because we don't even have one. So the linker still has some work to do here if we're trying to be perfect, because if a full declaration exists anywhere in the program, we would want it to point to it. On the other hand, if it was struct Opaque {}; then the compiler might as well do the remapping to save the linker some work.

But I think this patch isn't trying to address the first case. We can evaluate it in a followup, but since it has the potential to slow down link times, we should be careful.

As an aside, we should also be careful about any remapping done in the linker, since it has the potential to interfere with ghash. In this case we're fine since we only ghash types and not symbols, and S_UDT is a symbol. But it's something to keep in mind.

aganea added a comment.Dec 4 2018, 9:07 AM

My whole point was that our slowdowns were with cl.exe-generated OBJs, not clang OBJs. If LLD doesn't support this remapping (and dropping) in the same way as link.exe, this won't completly fix S_UDTs.

aganea added a comment.Dec 4 2018, 9:11 AM

But overall I agree with @zturner, the more optimizations you can do early in the compiler, the better it is. This patch is fine and achieves that purpose. I'm just saying that it isn't enough if the goal is to support linking with cl OBJs.

My whole point was that our slowdowns were with cl.exe-generated OBJs, not clang OBJs. If LLD doesn't support this remapping (and dropping) in the same way as link.exe, this won't completly fix S_UDTs.

Yes, that's true. However, it's possible my other patch (https://reviews.llvm.org/D55228) could, since it's specifically adding functionality to the linker that wasn't there before. It's possible that our omission of S_UDT records entirely was causing the slowdown.