This is an archive of the discontinued LLVM Phabricator instance.

Avoid segfault when destructor is not yet known
Needs ReviewPublic

Authored by dim on Mar 15 2018, 1:15 PM.

Details

Summary

In some cases, a class type can have a definition, but its destructor
may not yet be known. In that case, a segfault can occur in
EmitObjectDelete.

Avoid it by checking the return value of CXXRecordDecl::getDestructor,
and add a minimized test case.

Fixes PR36749.

Diff Detail

Event Timeline

dim created this revision.Mar 15 2018, 1:15 PM

I'm not sure it's supposed to be a valid state to get into IRGen with a non-trivial destructor that isn't yet declared. Richard?

dim added a comment.Mar 15 2018, 1:41 PM

I'm not sure it's supposed to be a valid state to get into IRGen with a non-trivial destructor that isn't yet declared. Richard?

As a side note, clang also emits a warning about it (but then crashes :) ):

VMMapsMain-minimized.cpp:10:18: warning: deleting pointer to incomplete type 'c' may cause undefined behavior
  virtual ~d() { delete e; }
                 ^      ~
VMMapsMain-minimized.cpp:7:7: note: forward declaration of 'c'
class c;
      ^
Segmentation fault

Hmm. Sema is lazy about actually creating implicit destructor declarations, but it's supposed to only do it whenever the destructor is actually used for something. I suspect that Sema just thinks that nothing is using c::~c, because the only thing that does use it is d::~d, which is inline and has no apparent users. But from the stack trace, it seems that IRGen is in fact emitting d::~d. So the question is, why is IRGen emitting d::~d? Because AFAIK a non-array new-expression never requires the destructor to exist.

Hmm. Sema is lazy about actually creating implicit destructor declarations, but it's supposed to only do it whenever the destructor is actually used for something. I suspect that Sema just thinks that nothing is using c::~c, because the only thing that does use it is d::~d, which is inline and has no apparent users. But from the stack trace, it seems that IRGen is in fact emitting d::~d. So the question is, why is IRGen emitting d::~d? Because AFAIK a non-array new-expression never requires the destructor to exist.

Oh, because it's used in d's v-table, of course, which is needed by the implicit constructor for d.

dim added a comment.Mar 19 2018, 3:44 PM

@rsmith, any objections?

I think Richard is probably catching up from a week at the C++ committee.

To be clear, I am objecting to this; I think the destructor should clearly have been created at this point. I'm just hoping Richard will have an idea for how best to fix it.

dim added a comment.Mar 28 2018, 1:45 PM

Ping. Open to sugggestions here :)

Hmm. Sema is lazy about actually creating implicit destructor declarations, but it's supposed to only do it whenever the destructor is actually used for something.

I'm looking at a similar problem where clang crashes in IRGen because the destructor of a class hasn't been added to the AST.

I'm not sure why Sema has to add the declaration lazily on demand rather than always adding it after the class definition has been parsed, for example. Could you explain the reason for this behavior? Does doing so break something or cause mis-compile?

It seems that DeclareImplicitDestructor should be called somewhere, but it's not clear to me where it should be called. Calling it at the end of Sema::CheckCompletedCXXClass fixes the crash I'm seeing, but that doesn't seem to be the right fix.

I think it's part of an effort to avoid creating implicit declarations for all the special members of every struct we parse from system headers.

I see, so Sema::CheckCompletedCXXClass probably isn't the right place to call DeclareImplicitDestructor as that could significantly increase the size of the AST.

rjmccall added a comment.EditedMar 28 2018, 9:41 PM

I see, so Sema::CheckCompletedCXXClass probably isn't the right place to call DeclareImplicitDestructor as that could significantly increase the size of the AST.

Right. Again, I'd like Richard to weigh in here, but my suspicion would be that, somehow, the fact that e is an incomplete type when we type-check that call is combining poorly with the fact that it's declared by the time that we generate IR for it. We probably do not record the existence of a pre-emptive use of the destructor. There is no other way to get into this situation because every other thing that cares about the existence of a destructor requires a complete type.

That points us towards a reasonable solution. The code has undefined behavior because it's deleting an object with non-trivial destructor using an incomplete type. We've already emitted a warning that the code is deleting a pointer to an incomplete type. There are things we clearly *can't* make work retroactively; for example, we're not going to go find an in-class operator delete during IRGen. So I think the right fix here really is that we should just set a flag in the CXXDeleteExpr saying that the destructor is trivial, based on what we found in Sema, and in IRGen we should look at that flag before actually looking at the type.

rsmith added a comment.Apr 2 2018, 3:00 PM

It seems that we have two options: either we valiantly try to support this:

  • we keep a list of types for which we've tried to form a delete-expression, but found that the type was incomplete
  • when such a type is completed, we mark the destructor as used, triggering its synthesis or instantiation if necessary, and likewise for the operator delete
  • we change our CXXDeleteExpr representation so that we don't store any information that might be changed by completing the class on the expression, such as the selected OperatorDelete function and whether we're performing a sized delete

... or we say that we're not interested in calling the destructor for code like this that has undefined behavior. The latter is certainly my preference; it seems hard to justify the complexity required to get this "right", especially since we still won't get it right in cases where we choose to emit some function definitions before the end of the TU.

I agree with John that tracking that we're in the "delete of incomplete class type" case (or some equivalent state, such as a "need to run a cleanup" flag) on the CXXDeleteExpr seems best. We should be careful to ensure that we rebuild the expression in template instantiation when the "delete of incomplete class type" flag is set, even if the expression is non-dependent, though. That is, I think this should work:

struct X { ~X(); };
struct A;
A *get();
template<int> void f() { delete (&get)(); }
struct A { X x; };
A *get() { return new A; }
void g() { f<0>(); }

... even though we first form a (non-dependent) delete-expression for an A at a point where A is incomplete. (Note: the above example is carefully contrived to cause TreeTransform to reuse the CXXDeleteExpr from the template AST in the instantiation. Sadly the implicit array-to-pointer decay in the CallExpr is enough to cause it to rebuild rather than reuse...)

rsmith added a comment.Apr 2 2018, 3:03 PM

Right. Again, I'd like Richard to weigh in here, but my suspicion would be that, somehow, the fact that e is an incomplete type when we type-check that call is combining poorly with the fact that it's declared by the time that we generate IR for it. We probably do not record the existence of a pre-emptive use of the destructor. There is no other way to get into this situation because every other thing that cares about the existence of a destructor requires a complete type.

Yes, that's exactly the situation. In all other cases, we can mark the destructor itself as used, and trigger synthesis / instantiation at that point, but if the class is incomplete, we don't have a destructor declaration to mark, and we can't even generate one. (That's not the only problem, though: there is state we store on the CXXDeleteExpr, such as the OperatorDelete and the sized delete flag, that we also need a complete class type in order to correctly generate.)

It seems that we have two options: either we valiantly try to support this:

  • we keep a list of types for which we've tried to form a delete-expression, but found that the type was incomplete
  • when such a type is completed, we mark the destructor as used, triggering its synthesis or instantiation if necessary, and likewise for the operator delete
  • we change our CXXDeleteExpr representation so that we don't store any information that might be changed by completing the class on the expression, such as the selected OperatorDelete function and whether we're performing a sized delete

Yeah, this is probably unworkable. (Of course I'm brainstorming ways we could do it — we could have a global entry for late-realized destructor information. But it just seems like a terrible idea.)

... or we say that we're not interested in calling the destructor for code like this that has undefined behavior. The latter is certainly my preference; it seems hard to justify the complexity required to get this "right", especially since we still won't get it right in cases where we choose to emit some function definitions before the end of the TU.

I agree.

I agree with John that tracking that we're in the "delete of incomplete class type" case (or some equivalent state, such as a "need to run a cleanup" flag) on the CXXDeleteExpr seems best. We should be careful to ensure that we rebuild the expression in template instantiation when the "delete of incomplete class type" flag is set, even if the expression is non-dependent, though. That is, I think this should work:

struct X { ~X(); };
struct A;
A *get();
template<int> void f() { delete (&get)(); }
struct A { X x; };
A *get() { return new A; }
void g() { f<0>(); }

... even though we first form a (non-dependent) delete-expression for an A at a point where A is incomplete.

Good point. I agree that we should allow incomplete deletes to work as long as they've been completed by instantiation time.

dim added a comment.Aug 2 2020, 3:19 AM

Hm, this review's still open after two years, and even as of 2020-08-02 clang still crashes on the sample. :)

Would you like to pick it back up? We laid out an implementation path: we need to track the fact that a delete was of an incomplete class type in the AST and then unconditionally treat such operations as trivial to destroy in IRGen.