This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][MallocChecker][NFC] Communicate the allocation family to auxiliary functions with parameters
ClosedPublic

Authored by Szelethus on Sep 27 2019, 3:07 PM.

Details

Summary

The following series of refactoring patches aim to fix the horrible mess that MallocChecker.cpp is.

I genuinely hate this file. It goes completely against how most of the checkers are implemented, its by far the biggest headache regarding checker dependencies, checker options, or anything you can imagine. On top of all that, its just bad code. Its seriously everything that you shouldn't do in C++, or any other language really. Bad variable/class names, in/out parameters... Apologies, rant over.

So: there are a variety of memory manipulating function this checker models. One aspect of these functions is their AllocationFamily, which we use to distinguish between allocation kinds, like using free() on an object allocated by operator new. However, since we always know which function we're actually modeling, in fact we know it compile time, there is no need to use tricks to retrieve this information out of thin air n+1 function calls down the line. This patch changes many methods of MallocChecker to take a non-optional AllocationFamily template parameter (which also makes stack dumps a bit nicer!), and removes some no longer needed auxiliary functions.

Diff Detail

Event Timeline

Szelethus created this revision.Sep 27 2019, 3:07 PM
NoQ added a comment.EditedSep 27 2019, 6:54 PM

Thank you, fantastic finding!

in fact we know it compile time

Yeah, but is it accidental or is there a good reason behind always having this information at compile time? 'Cause i don't want to restrict the code to always provide this information at compile time if we're not sure it'll always be able to provide it in compile time.

My mental model for this sort of stuff is something like this:

CallDescriptionMap M = {
  "malloc", {&MallocChecker::MallocMemAux, AF_Malloc},
  "alloca", {&MallocChecker::MallocMemAux, AF_Alloca},
};

void checkPostCall(const CallEvent &Call, CheckerContext &C) {
  if (Info *I = M.lookup(Call))
    I->first(I->second, C, Call);
}
In D68162#1686810, @NoQ wrote:

Thank you, fantastic finding!

in fact we know it compile time

Yeah, but is it accidental or is there a good reason behind always having this information at compile time? 'Cause i don't want to restrict the code to always provide this information at compile time if we're not sure it'll always be able to provide it in compile time.

Definitely accidental. And now that I think about it, maybe it would be better to turn these into non-optional regular arguments. Though its far in the future, we could eventually add annotations that tell whether a function returns with newed or malloc()ated memory.

The entire point of the patch was to get rid of getAllocationFamily, because there really isn't a need to get the allocation family of hardcoded functions.

Szelethus updated this revision to Diff 222781.Oct 2 2019, 2:43 AM
Szelethus retitled this revision from [analyzer][MallocChecker][NFC] Communicate the allocation family information to auxiliary functions with template parameters to [analyzer][MallocChecker][NFC] Communicate the allocation family to auxiliary functions with parameters.

Use regular parameters instead of template parameters.

I like this change. If we can retrieve something with a simple getter, then do not carry extra parameters all over the code. However, if we have to recalculate something over and over again then it is much better to determine it once and pass it around as a parameter. Especially in this case where we have the information statically at top level and do not have to calculate it all if we use that.

clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
1875

Please add an && "<Error message>" to the assertion.

I forgot to emphasise that the entire point of the patch was to get rid of getAllocationFamily, at least the way it used to work, because it was a mess and prevented me from moving forward with changing things to a CallDescriptionMap.

I like this change. If we can retrieve something with a simple getter, then do not carry extra parameters all over the code. However, if we have to recalculate something over and over again then it is much better to determine it once and pass it around as a parameter. Especially in this case where we have the information statically at top level and do not have to calculate it all if we use that.

In D68162#1686810, @NoQ wrote:

My mental model for this sort of stuff is something like this:

CallDescriptionMap M = {
  "malloc", {&MallocChecker::MallocMemAux, AF_Malloc},
  "alloca", {&MallocChecker::MallocMemAux, AF_Alloca},
};

void checkPostCall(const CallEvent &Call, CheckerContext &C) {
  if (Info *I = M.lookup(Call))
    I->first(I->second, C, Call);
}

I guess you two are right. Maybe the best solution would be collapse const CallExpr * and AllocationFamily parameters into CallEvent, and just query the relevant information. Sure, its a query every time we need it, but nothing impacts performance as much as months of hair pulling trying to understand what this file does :^)

NoQ accepted this revision.Nov 7 2019, 5:13 PM

Thanks again!~

I guess you two are right. Maybe the best solution would be collapse const CallExpr * and AllocationFamily parameters into CallEvent, and just query the relevant information. Sure, its a query every time we need it, but nothing impacts performance as much as months of hair pulling trying to understand what this file does :^)

Just make sure not to store call events in the program state, 'cause they're short-lived :)

This revision is now accepted and ready to land.Nov 7 2019, 5:13 PM
This revision was automatically updated to reflect the committed changes.