This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Fix bug in constructor homing for classes with trivial constructors.
ClosedPublic

Authored by akhuang on Sep 16 2020, 6:02 PM.

Details

Summary

This changes the code for constructor homing to check for aggregate classes and classes with trivial default constructors. Previously, this tried to loop through the constructors, which doesn't work when constructors are not yet instantiated.

Bug: https://bugs.llvm.org/show_bug.cgi?id=47556

Diff Detail

Event Timeline

akhuang created this revision.Sep 16 2020, 6:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 16 2020, 6:02 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
akhuang requested review of this revision.Sep 16 2020, 6:02 PM
akhuang edited the summary of this revision. (Show Details)Sep 16 2020, 6:17 PM

@rsmith What's the deal with these anonymous structs/unions? Why do they have copy/move constructors (are those technically called from the enclosing class's copy/move constructors?) but no default constructor to be called from the other ctors of the enclosing class?

akhuang added a reviewer: rnk.Sep 18 2020, 9:39 AM
rnk added a comment.Sep 18 2020, 9:59 AM

@rsmith What's the deal with these anonymous structs/unions? Why do they have copy/move constructors (are those technically called from the enclosing class's copy/move constructors?) but no default constructor to be called from the other ctors of the enclosing class?

Yes, I'm curious: copy and move constructors require an object of this type to already exist in memory. Is there a well-defined way of creating an object of this type in memory when it has no other constructors?

Maybe the issue is that this code is running into the lazy implicit special member declaration optimization. Maybe the class in question has an implicit, trivial, default constructor, but we there is no CXXConstructorDecl present in the ctors list for the loop to find.

In D87808#2282223, @rnk wrote:

@rsmith What's the deal with these anonymous structs/unions? Why do they have copy/move constructors (are those technically called from the enclosing class's copy/move constructors?) but no default constructor to be called from the other ctors of the enclosing class?

Yes, I'm curious: copy and move constructors require an object of this type to already exist in memory. Is there a well-defined way of creating an object of this type in memory when it has no other constructors?

Maybe the issue is that this code is running into the lazy implicit special member declaration optimization. Maybe the class in question has an implicit, trivial, default constructor, but we there is no CXXConstructorDecl present in the ctors list for the loop to find.

Yeah, I'd guess that it has a constructor but it doesn't show up in the loop?

@rsmith What's the deal with these anonymous structs/unions? Why do they have copy/move constructors (are those technically called from the enclosing class's copy/move constructors?) but no default constructor to be called from the other ctors of the enclosing class?

Let's use the class G in the example. G::G directly initializes the field g_, so it can't call any kind of constructor for the anonymous struct member. Instead, anonymous structs and unions are "expanded" when building the constructor initializer lists for all cases other than a copy or move constructor (no initialization action is taken for union members by default, though -- not unless they have a default member initializer or an explicit mem-initializer). So the default constructor of an anonymous struct or union is never really used for anything[*].

But now consider the copy or move constructor for a type like this:

struct X {
  Y y;
  union { ... };
};

That copy or move constructor needs to build a member initializer list that (a) calls the Y copy/move constructor for y, and (b) performs a bytewise copy for the anonymous union. We can't express this in the "expanded" form, because we wouldn't know which union member to initialize.

So for the non-copy/move case, we build CXXCtorInitializers for expanded members, and in the copy/move case, we build CXXCtorInitializers calling the copy/move constructors for the anonymous structs/unions themselves.

[*]: It's not entirely true that the default constructor is never used for anything. When we look up special members for the anonymous struct / union (looking for the copy or move constructor) we can trigger the implicit declaration of the default constructor. And it's actually even possible to *call* that default constructor, if you're tricksy enough: https://godbolt.org/z/Tq56bz

In D87808#2282223, @rnk wrote:

Maybe the issue is that this code is running into the lazy implicit special member declaration optimization. Maybe the class in question has an implicit, trivial, default constructor, but we there is no CXXConstructorDecl present in the ctors list for the loop to find.

That seems very likely. The existing check:

if (Ctor->isTrivial() && !Ctor->isCopyOrMoveConstructor())
  return false;

is skating on thin ice in this regard: a class with an implicit default constructor might or might not have had that default constructor implicitly declared. But I think this code should give the same outcome either way, because a class with any constructor other than a default/copy/move constructor must have a user-declared constructor of some kind, and so will never have an implicit default constructor. (Inherited constructors introduce some complications here, but we don't do lazy constructor declaration for classes with inherited constructors, so I think that's OK too.)

clang/lib/CodeGen/CGDebugInfo.cpp
2294–2297

This looks pretty similar to:

return RD->hasUserDeclaredConstructor() && !RD->hasTrivialDefaultConstructor();

(other than its treatment of user-declared copy or move constructors), but I have to admit I don't really understand what the "at least one constructor and no trivial or constexpr constructors" rule aims to achieve, so it's hard to know if this is the right interpretation. The rule as written in the comment above is presumably not exactly right -- all classes have at least one constructor, and we're not excluding classes with trivial copy or move constructors, only those with trivial default constructors.

I wonder if the intent would be better captured by "at least one non-inline constructor" (that is, assuming all declared functions are defined, there is at least one translation unit in which a constructor is defined that can be used as a home for the class info).

@rsmith What's the deal with these anonymous structs/unions? Why do they have copy/move constructors (are those technically called from the enclosing class's copy/move constructors?) but no default constructor to be called from the other ctors of the enclosing class?

Let's use the class G in the example. G::G directly initializes the field g_, so it can't call any kind of constructor for the anonymous struct member. Instead, anonymous structs and unions are "expanded" when building the constructor initializer lists for all cases other than a copy or move constructor (no initialization action is taken for union members by default, though -- not unless they have a default member initializer or an explicit mem-initializer). So the default constructor of an anonymous struct or union is never really used for anything[*].

But now consider the copy or move constructor for a type like this:

struct X {
  Y y;
  union { ... };
};

That copy or move constructor needs to build a member initializer list that (a) calls the Y copy/move constructor for y, and (b) performs a bytewise copy for the anonymous union. We can't express this in the "expanded" form, because we wouldn't know which union member to initialize.

So for the non-copy/move case, we build CXXCtorInitializers for expanded members, and in the copy/move case, we build CXXCtorInitializers calling the copy/move constructors for the anonymous structs/unions themselves.

OK - I /mostly/ follow all that.

[*]: It's not entirely true that the default constructor is never used for anything. When we look up special members for the anonymous struct / union (looking for the copy or move constructor) we can trigger the implicit declaration of the default constructor. And it's actually even possible to *call* that default constructor, if you're tricksy enough: https://godbolt.org/z/Tq56bz

Good to know! Makes things more interesting. (any case where something could get constructed without calling the constructor is something this feature needs to be aware of/careful about)

In D87808#2282223, @rnk wrote:

Maybe the issue is that this code is running into the lazy implicit special member declaration optimization. Maybe the class in question has an implicit, trivial, default constructor, but we there is no CXXConstructorDecl present in the ctors list for the loop to find.

That seems very likely. The existing check:

if (Ctor->isTrivial() && !Ctor->isCopyOrMoveConstructor())
  return false;

is skating on thin ice in this regard: a class with an implicit default constructor might or might not have had that default constructor implicitly declared.

Yeah, that's subtle & probably best not to rely on being able to gloss over that case.

But I think this code should give the same outcome either way, because a class with any constructor other than a default/copy/move constructor must have a user-declared constructor of some kind, and so will never have an implicit default constructor.

Hmm, trying to parse this. So if we're in the subtle case of having an implicit default constructor (and no other (necessarily user-declared, as you say) constructors) - if it's not been declared, then this function will return false. If it has been declared, it'll return false... hmm, nope, then it'll return true.

It sounds like there's an assumption related to "a class with any constructor other than a default/copy/move" - a default constructor would be nice to use as a constructor home. (certainly a user-defined one, but even an implicit one - so long as it gets IRGen'd when called, and not skipped (as in the anonymous struct/class case) or otherwise frontend-optimized away)

(Inherited constructors introduce some complications here, but we don't do lazy constructor declaration for classes with inherited constructors, so I think that's OK too.)

Ah, handy!

clang/lib/CodeGen/CGDebugInfo.cpp
2294–2297

So the general goal is to detect any type where the construction of that type somewhere must invoke a constructor that will be IR-generated.

Move and copy constructors are ignored because the assumption is they must be moving/copying from some other object, which must've been constructed, ultimately, by a non-move/copy constructor.

Ideally this would be usable even for inline ctors - even if the ctor calls get optimized away later[^1], they'd still allow us to reduce the number of places the type is emitted to only those places that call the ctor.

[^1] actually, the way that should work doesn't seem to be working right now (eg:
type.cpp

struct t1 { t1() { } };
void f1(void*);
int main() {
  f1(new t1());
}

type2.cpp

struct t1 { t1() { } };
void f1(void* v) {
  delete (t1*)v;
}

build: clang++ type.cpp -g -Xclang -fuse-ctor-homing type2.cpp && llvm-dwarfdump a.out
-> definition of "t1" in the DWARF
build with optimizations: clang++ -O3 type.cpp -g -Xclang -fuse-ctor-homing type2.cpp && llvm-dwarfdump a.out
-> missing definition of "t1"
type.cpp is chosen as a home for t1 because it calls a user-defined ctor, but then that ctor gets optimized away and there's no other mention of t1 in type.cpp so the type is dropped entirely. This could happen even with a non-inline definition - under LTO the ctor could get optimized away (though this would be safe in FullLTO - the other references to t1 would be made to refer to the definition and keep it alive - but in ThinLTO the TU defining the ctor might be imported and nothing else - leaving the type unused/dropped)
To fix this we should put 'homed' types in the retained types list so they are preserved even if all other code/references to the type are dropped. I think I implemented this homed type pinning for explicit template specialization definitions, because they have no code attachment point, so similar logic could be used for ctor homing. (vtable homing /might/ benefit from this with aggressive/whole program devirtualization? Not sure - harder to actually optimize away all the references to a type, but possible maybe?)

But I think this code should give the same outcome either way, because a class with any constructor other than a default/copy/move constructor must have a user-declared constructor of some kind, and so will never have an implicit default constructor.

Hmm, trying to parse this. So if we're in the subtle case of having an implicit default constructor (and no other (necessarily user-declared, as you say) constructors) - if it's not been declared, then this function will return false. If it has been declared, it'll return false... hmm, nope, then it'll return true.

Hmm, yes, you're right, if the implicit default constructor would not be trivial, we could get different answers depending on whether we triggered its implicit declaration or not. And for:

struct A { A(); };
struct X { A a; };
struct Y { A a; };
void f(X x) {}
void f(decltype(Y()) y) {}

... I see we get different debug information for X and Y, and it doesn't look like this patch will fix that, but querying hasTrivialDefaultConstructor() would.

clang/lib/CodeGen/CGDebugInfo.cpp
2294–2297

Oh, I see, that's clever and very neat.

So the intent is to identify types for which we know that either no instances are ever created, or some constructor must be actually emitted in some translation unit (prior to optimizations running). And that's why we want to ignore copy and move constructors [and could in fact ignore any constructor taking the class type by value or reference, directly or indirectly] (they cannot be the only place that creates an object) and also why we don't want to do this if there's a constexpr constructor or trivial default constructor (we might not generate code for them). And the "no constructors" check seems to effectively be checking whether aggregate initialization could be performed.

I think we can replace the loop with return !RD->isAggregate() && !RD->hasTrivialDefaultConstructor();.

(Minor aside, it is possible to create an instance of a type if its only constructor is a copy constructor, eg with struct A { A(const A&) {} } a = a;, but I doubt that's a practical concern for your purposes.)

akhuang added inline comments.Sep 22 2020, 3:12 PM
clang/lib/CodeGen/CGDebugInfo.cpp
2294–2297

So the intent is to identify types for which we know that either no instances are ever created, or some constructor must be actually emitted in some translation unit (prior to optimizations running). And that's why we want to ignore copy and move constructors [and could in fact ignore any constructor taking the class type by value or reference, directly or indirectly] (they cannot be the only place that creates an object) and also why we don't want to do this if there's a constexpr constructor or trivial default constructor (we might not generate code for them). And the "no constructors" check seems to effectively be checking whether aggregate initialization could be performed.

I think we can replace the loop with return !RD->isAggregate() && !RD->hasTrivialDefaultConstructor();.

Ok, I think that makes sense - I guess the loop was sort of attempting to do what RD->hasTrivialDefaultConstructor()does.

To fix this we should put 'homed' types in the retained types list so they are preserved even if all other code/references to the type are dropped.

Yeah, think you mentioned this before and then I never added it--I'll make a separate patch for it.

akhuang updated this revision to Diff 293579.Sep 22 2020, 3:56 PM

Update ctor homing check, and add some test cases.

dblaikie added inline comments.Sep 22 2020, 6:55 PM
clang/lib/CodeGen/CGDebugInfo.cpp
2294–2297

And that's why we want to ignore copy and move constructors [and could in fact ignore any constructor taking the class type by value or reference, directly or indirectly] (they cannot be the only place that creates an object)

Yeah, fair point - that'd be a nice generalization. (indirectly, you mean, for instance - ctor of A takes a reference to B that contains an A member?).

I think we can replace the loop with return !RD->isAggregate() && !RD->hasTrivialDefaultConstructor();.

OK - so any type with a user-defined ctor would be a non-aggregate, we don't care about the copy/move ctors so the only possibly trivial ctor that could sneak through would be the default one. Hmm - what cases would we miss by /only/ testing that the type doesn't have a trivial default ctor? Ah, right, the stuff we've just been discussing where the ctor isn't instantiated... and the !aggregate test will cover us there? OK, think I'm with it & really appreciate the nuance.

(Minor aside, it is possible to create an instance of a type if its only constructor is a copy constructor, eg with struct A { A(const A&) {} } a = a;, but I doubt that's a practical concern for your purposes.)

Yeah, I take it that's valid/not UB? Figured there was that tiny case & yeah, I'm willing to let that slide/through, really.

clang/test/CodeGenCXX/debug-info-limited-ctor.cpp
36–56

Probably good to have a comment on each of these tests to clarify their purpose.

46–50

This is to test an implicit, non-trivial default ctor of H?

52–56

Looks like another implicit non-trivial ctor? Oh, I guess the previous case is an implicit non-trivial default ctor that isn't instantiated/built? (so the implementation doesn't observe any default ctor in 'I' at all?)

akhuang updated this revision to Diff 293817.Sep 23 2020, 11:33 AM
akhuang marked an inline comment as done.

Add comments to tests, and add a test for non instantiated trivial ctor and one for lambdas.

dblaikie accepted this revision.Sep 23 2020, 7:29 PM

Generally looks good to me!

clang/lib/CodeGen/CGDebugInfo.cpp
2287

Maybe flesh out the "lambda objects" bit - something about how they are constructed without calling/having a constructor, I guess?

This revision is now accepted and ready to land.Sep 23 2020, 7:29 PM
akhuang updated this revision to Diff 294149.Sep 24 2020, 1:23 PM

Add to comment for lambdas.

akhuang retitled this revision from [DebugInfo] Fix bug in constructor homing where it would use ctor homing when a class only has copy/move constructors to [DebugInfo] Fix bug in constructor homing for classes with trivial constructors..Sep 24 2020, 1:25 PM
akhuang edited the summary of this revision. (Show Details)