This is an archive of the discontinued LLVM Phabricator instance.

Detect mismatching 'new' and 'delete' uses
ClosedPublic

Authored by ismailp on Jul 24 2014, 2:27 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

ismailp updated this revision to Diff 11854.Jul 24 2014, 2:27 PM
ismailp retitled this revision from to Detect mismatching 'new' and 'delete' uses.
ismailp updated this object.
ismailp edited the test plan for this revision. (Show Details)
ismailp added reviewers: rsmith, rtrieu, jordan_rose.
ismailp added a subscriber: Unknown Object (MLST).
rsmith edited edge metadata.Jul 24 2014, 3:07 PM

This seems like a really nice idea.

lib/Sema/SemaExprCXX.cpp
2246–2247 ↗(On Diff #11854)

You should probably delay this until the end of the TU, so you can do the check based on knowledge of the full set of constructors (in the common case where there exists a file that defines them all). Take a look at the way we implement the 'unused private field` warning for inspiration.

2249–2250 ↗(On Diff #11854)

I'm not sure that breaking out once you see any CXXNewExpr is the right approach. Consider:

struct S {
  bool Array;
  T *P;
  S() : Array(false), P(new T) {}
  S(int N) : Array(true), P(new T[N]) {}
  ~S() { if (Array) delete [] P; else delete P; }
};

Here, whether we get a diagnostic, and which one we get, will depend on which constructor we happen to visit first.

2251 ↗(On Diff #11854)

You need to map to the definition of CD here.

2253 ↗(On Diff #11854)

You should walk over single-element InitListExprs here, for cases like : p{new T}, and IgnoreParenImpCasts, for cases like Base *p; [...] : p(new Derived).

2405 ↗(On Diff #11854)

This is not OK as recovery from a warning, especially not a warning with false positives.

ismailp updated this revision to Diff 12356.Aug 11 2014, 11:15 AM
ismailp edited edge metadata.

Updated patch to address following issues:

  • Delay checking delete expressions until the end of translation unit
  • Collect a list of new-expressions that mismatch with delete-expression
  • If at least one new-expression matches with delete-expression, no warning will be emitted
  • Do not attempt to recover
  • Collect ctor-initializers from definitions of constructors
  • Added InitListExprs support
rsmith added inline comments.Aug 11 2014, 11:33 AM
include/clang/Sema/Sema.h
395–396 ↗(On Diff #12356)

This is not a reasonable way to track the relevant state here.

The interesting cases are VarDecls and FieldDecls. For VarDecls, at the point where we see the delete, you can immediately check whether you have an new-expression as an initializer (and don't worry about the obscure cases where an initializer comes after the use). For FieldDecls, perform the check where you see the delete, and if you see either (a) no constructor with a matching new, or (b) only constructors with the wrong kind of new, then add your delete to a list to check at the end of the TU.

That way, in almost all cases we can deal with the diagnostic immediately rather than at end of TU. (This avoids building up a big set of all delete expressions.)

You'll also need to figure out how you're going to handle serialization/deserialization here, in the case where you hit one of these "not sure" cases in a PCH or similar. To that end, you should probably store a list of {SourceLocation, FieldDecl} pairs, or -- better -- a map from FieldDecl to the first source location where we saw a delete. (You'll need one extra bit to describe whether it was delete or delete[] I suppose...).

ismailp added inline comments.Aug 20 2014, 1:26 PM
include/clang/Sema/Sema.h
395–396 ↗(On Diff #12356)

I have implemented the first half of the patch, slightly different than your suggestion. We diagnose this immediately for VarDecl, as you said. For FieldDecls, delete expression is queued, if at least one constructor isn't defined (yet). Otherwise (if all constructor definitions are visible), warning is issued in ActOnCXXDelete right before function returns. Could you please explain the rationale for your point (b)? If we can see all constructors while analyzing a CXXDeleteExpr, we can reason about constructors and in-class initializers immediately. If no constructor initializes the member, we check in-class initializer (if exists). If all constructors initialize member with mismatching type of new, then we can diagnose them all immediately, instead of postponing it to the end of translation unit. I can think of only one case (aside from PCH case) that requires postponing this analysis to the end of translation unit; that is, when we cannot see at least one constructor's definition.

I have a few questions regarding PCH/external source. I have no prior experience with ASTReader/Writer at all. Sorry, if my questions don't make any sense. Do you mean we should handle following case:

// header; compiled as pch.
#ifndef header
struct X {
  int *a;
  ~X() { delete a; }  // not sure
  X();
 };
 #endif

// source file, includes above pch
#include "header"
X::X() : a(new int[1]) { }

We encounter with a delete-expr in a PCH. We can't prove anything, because we can't see definition of X::X() yet. Even if there was an in-class initializer, that wouldn't help, exactly for the same reason. X::X() will be visible only at the end of translation unit. Therefore, we must write this information somewhere in PCH so that we can load it at the end of translation unit.

I have followed UndefinedButUsed as a serialization/deserialization example. As far as I understood, I need to write a record, and read it back. I blindly followed your suggestion here :) This record contains, as you said, FieldDecl, SourceLocation and a bit indicating array form. I put them in an std::tuple<FieldDecl *, SourceLocation, bool>. Can that "bit" be represented as APInt(width=1)? I did, roughly:

RecordData DeleteExprsToAnalyze;
AddDeclRef(get<0>(tuple), DeleteExprsToAnalyze);
AddSource(get<1>(tuple), DeleteExprsToAnalyze);
AddAPInt(APInt(1, get<2>(tuple)), DeleteExprsToAnalyze);
// ...
if (!DeleteExprsToAnalyze.empty())
  Stream.EmitRecord(DELETE_EXPRS_TO_ANALYZE, DeleteExprsToAnalyze);

I am trying the PCH part now. I would like to learn the right way, as I might be doing something completely wrong.

rsmith added inline comments.Sep 24 2014, 6:18 PM
include/clang/Sema/Sema.h
395–396 ↗(On Diff #12356)

Could you please explain the rationale for your point (b)?

The idea is: if you haven't seen all the constructors yet, but you have seen a constructor that used the right kind of new, then there's no need to queue the expression; the delete is OK.

If no constructor initializes the member, we check in-class initializer (if exists).

This should probably be: if any constructor does not explicitly initialize the member, check the in-class initializer. (But it's probably redundant, since the constructor's initializer list will have been extended to contain a reference to the in-class initializer in this case.)

AddAPInt(APInt(1, get<2>(tuple)), DeleteExprsToAnalyze);

This is wasteful; DeleteExprsToAnalyze is a SmallVector of integers. Just do DeleteExprsToAnalyze.push_back(get<2>(tuple));. Other than that, this looks like the right approach.

ismailp updated this revision to Diff 14188.Sep 29 2014, 2:45 PM
  • Added PCH support
  • If not all constructors are visible, queue 'delete' for later analysis
  • Try to suggest fix-it for removal of '[]'

This is looking really good. Some fairly minor comments...

include/clang/Sema/Sema.h
8539–8541 ↗(On Diff #14188)

This should probably say "with the array form of 'delete'."

Also, the function's return type is void; it won't return false.

lib/Sema/Sema.cpp
1489–1492 ↗(On Diff #14188)

Can you avoid making a copy of the DenseMap here?

lib/Sema/SemaExprCXX.cpp
2246–2247 ↗(On Diff #14188)

\brief only goes on the first line.

2271 ↗(On Diff #14188)

Typo "eturn"

2348 ↗(On Diff #14188)

IgnoreParenImpCasts would seem better here; you don't find parenthesized new-expressions here.

2409–2411 ↗(On Diff #14188)

This seems like overspecification; just say that we require a field or a delete expression.

2467–2468 ↗(On Diff #14188)

Maybe pass Detector.NewExprs into DiagnoseMismatchedNewDelete and have it produce the notes as well as the error? That'd remove a bit of duplication between this function and the next one.

lib/Serialization/ASTWriter.cpp
4347 ↗(On Diff #14188)

This writes out the map in a nondeterministic order. Maybe use an llvm::MapVector?

ismailp added inline comments.Sep 30 2014, 3:35 PM
include/clang/Sema/Sema.h
8539–8541 ↗(On Diff #14188)

Yes, the reason is.. it was the function that used to detect the mismatch. I should have written documentation for the latest revision. I have written them for myself, primarily, because I've got interrupted frequently while making this patch.

lib/Sema/SemaExprCXX.cpp
2246–2247 ↗(On Diff #14188)

I vaguely recall that is how multi-line brief was supposed to be, but I can't seem to find that on doxygen documentation.

2451 ↗(On Diff #14188)

Do you have concerns about this function call?

ismailp updated this revision to Diff 14249.Sep 30 2014, 3:40 PM

Addressed comments:

  • Convert DenseMap to MapVector to keep order of queued delete-expressions
  • Fixed (or removed) documentation
  • Convert Sema::DiagnoseMismatchedNewDelete into a static non-member function
  • Simplify assert message in MismatchingNewDeleteDetector::analyzeField
  • Use IgnoreParenImpCasts in MismatchingNewDeleteDetector::getNewExprFromInitListOrExpr
rsmith added inline comments.Nov 13 2014, 6:48 PM
lib/Sema/SemaExprCXX.cpp
2285 ↗(On Diff #14249)

Remove this member and pass DE into analyzeDeleteExpr, which is the only place that uses it.

2349–2354 ↗(On Diff #14249)

Reverse the order of these checks to remove the redundancy:

if (isa<ILE>(E) && 1 init)
  E = ILE->getInit(0)->IgnoreParenImpCasts()
auto *NE = dyn_cast<CXXNewExpr>(E);
2371–2372 ↗(On Diff #14249)

Please give this a better name, saying what you're actually analyzing and what the return value might mean. Suggestion: switch from returning false if you find a matching *new-expression* to returning true and rename this to hasMatchingNewInCtor (and rename analyzeCtorInitializer to hasMatchingNewInCtorInit).

2389–2403 ↗(On Diff #14249)

The logic in here is half dealing with in-class initializers and half finalizing the result of your analysis for the field. It would be better to split this up so that this function doesn't do things that are unrelated to in-class initializers.

lib/Serialization/ASTReader.cpp
3208 ↗(On Diff #14249)

Use const uint64_t Count = Record[I++]; here.

lib/Serialization/ASTWriter.cpp
4347 ↗(On Diff #14249)

APInt is overkill here. Just use DeleteExprsToAnalyze.push_back(DeleteExprsInfo.second.size()).

ismailp updated this revision to Diff 16299.Nov 17 2014, 10:50 AM

Addressed:

  • Removed class member MismatchingNewDeleteDetector::DE, and passed DE to analyzeDeleteExpr
  • Reordered checks in getNewExprFromInitListOrExpr
  • Renamed member functions so that they make more sense. e.g. hasMatchingNewInCtor, hasMathingVarInit
  • Moved multiple checks from within analyzeInClassInitializer into analyzeField
  • Removed unnecessary uses of APInt in serialization code
majnemer added inline comments.
include/clang/Sema/ExternalSemaSource.h
31 ↗(On Diff #16299)

Please order this before CXXConstructorDecl.

ismailp added inline comments.Jan 27 2015, 1:03 PM
include/clang/Sema/ExternalSemaSource.h
31 ↗(On Diff #16299)

Shouldn't CXXDeleteExpr come right after CXXConstructorDecl, and before CXXRecordDecl to preserve alphabetical ordering?

ismailp updated this revision to Diff 18846.Jan 27 2015, 1:03 PM
  • Reordered forward declaration of CXXDeleteExpr after CXXConstructorDecl, and before CXXRecordDecl to keep declarations in alphabetically sorted.

Friendly ping!

rsmith accepted this revision.May 12 2015, 3:23 PM
rsmith edited edge metadata.

LGTM, thanks!

lib/Sema/SemaExprCXX.cpp
2265 ↗(On Diff #18846)

Extra "at" in this comment.

This revision is now accepted and ready to land.May 12 2015, 3:23 PM
This revision was automatically updated to reflect the committed changes.

Thank you for reviewing!