This is an archive of the discontinued LLVM Phabricator instance.

Improve alpha.core.CastToStruct warn about widening casts of structs also
ClosedPublic

Authored by danielmarjamaki on Aug 15 2016, 6:31 AM.

Details

Summary

The alpha.core.CastToStruct warns when for instance casting a int pointer to a struct pointer. As accessing a field can lead to memory access errors.

This patch improves the checker so it also warns about widening casts of structs.

Diff Detail

Repository
rL LLVM

Event Timeline

danielmarjamaki retitled this revision from to Improve alpha.core.CastToStruct warn about widening casts of structs also.
danielmarjamaki updated this object.
NoQ edited edge metadata.Aug 17 2016, 3:38 AM

Could we re-use some code between the two checks, like obtaining the type of the operands, and throwing the bug report?

P.S. A thing to notice: this checker shouldn't be path-sensitive, it doesn't use any path-sensitive information.

P.S. A thing to notice: this checker shouldn't be path-sensitive, it doesn't use any path-sensitive information.

Yes I do wonder about that.

As the checker is written right now it seems to me it could be moved to clang-tidy.

But I am thinking we could use path-sensitive analysis in this check and see if the cast leads to memory access problems.

NoQ added a comment.Aug 18 2016, 4:46 AM

But I am thinking we could use path-sensitive analysis in this check and see if the cast leads to memory access problems.

Yeah, in particular, we could also see if the suspicious cast is actually a revert of a previous cast. No way it's blocking this review though.

danielmarjamaki edited edge metadata.

Reuse code, and use RecursiveASTVisitor.

NoQ added a comment.Sep 19 2016, 1:10 AM

Thanks!! It looks a lot better now.

lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
1

Ouch, just noticed ^

75

Could you add a test case for other branches of this code (cast without unary operator, cast with non-& unary operator)?

85

Could you add a test case for this? It might force you to add a C++ test file. Adding a dedicated file of tests for this checker is anyway a good idea, i think.

You may also want to make this more accurate with CXXRecordDecl::isDerivedFrom() etc. if you like.

lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
1

I am not sure what problem you see with this line. It looks ok to me.

75

Ok. Maybe the getOpCode is redundant. The ~, !, - operators does not make much sense on pointers and I believe I get compile errors. However imho it's nice for clarity to show that I only look for &var.

85

I started writing a testcase. But I am now thinking about removing this code. I could not write a sensible test case.

If there is code like:

struct Derived : public Base { .. };
struct Base B = {1,2};
struct Derived *D = (struct Derived *)&B;

Then writing the warning seems ok. It seems to me this is just as dangerous as if there was no inheritance.

danielmarjamaki set the repository for this revision to rL LLVM.

Added testfile. Remove code for derived records.

NoQ added inline comments.Sep 19 2016, 4:29 AM
lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
1

The description is from another checker.

85

Then you should exclude dynamic_cast and probably static_cast as well, because they were created for this particular purpose. Even then, i'm afraid many projects still use C-style casts with combination of kind-of custom RTTI (even in C) and have large switches by kind followed by casts to kind everywhere. You should be able to see this if you run the checker on a large codebase - even though the code you posted is clearly invalid, you cannot prove this without proper path-sensitive analysis, and the cast itself isn't necessarily bad.

lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
85

I think you're missing that I am not writing warnings about arbitrary struct pointers.

The dynamic_cast is normally used to cast some kind of pointer variable:

Base *B;
...
Derived *D = dynamic_cast<Derived *>(B);

I do not warn about casting such pointers. Because the check does not know what B points at.

I require that '&' is used, like so:

Base B;
...
Derived *D = dynamic_cast<Derived *>(&Base);

Then the check knows what the pointer points at. It's just a Base. I guess you are telling me that we should not warn here. Well you are right, nothing dangerous happens the cast result will just be 0.

NoQ added a comment.Sep 19 2016, 6:49 AM

Thanks for the tests!

lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
85

Uhm, but the operator-&-heuristic sounds like something extremely restrictive, that should be eventually removed (not even a pair of parentheses is allowed around operator-&).

Some false positives may still appear though, i suspect:

Derived D;
Base &B = D;
Derived *Dptr = (D*)&B; // no-warning

^You can find an example of such code in MallocChecker::checkPreCall, the way they cast CallEvent references. Right now i'm not sure how popular correct and incorrect code patterns of this kind are.

Why i think the heuristic is too restrictive:

Base B;
Base *Bptr = &B;
Derived *Dptr = (D*)B; // expected-warning{{}}
danielmarjamaki removed rL LLVM as the repository for this revision.

Handle case between Base/Derived types better.

Ping.

lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
85

thanks!

yes I agree it's extremely restricted. of course a pair of parentheses should not cause false negatives though.

imho , path-sensitive analysis is needed to warn about "pointers" and "references" as the check must know what it points at / references to avoid false positives.

NoQ accepted this revision.Sep 23 2016, 9:29 AM
NoQ edited edge metadata.

This patch seems to be an improvement for the original checker, and i think it should be ok to commit if your evaluation of the new check shows reasonable true positive rate.

However, if i was to choose, based on this discussion and other experiences, i would have wished this checker go in a different direction: make a single checker for violations of the strict aliasing rule. This should be do-able as a classic path-sensitive checker which would warn if the same object (RegionOffset) is accessed assuming incompatible value types. For symbolic void* or char* pointers, such checker would need to catch two accesses (which forces us to register stuff with program state), while for other pointer symbols and for concrete regions perhaps a single access should be enough to detect the problem. Such checker would probably be improved by finding branches that introduce aliases between pointer symbols, eg.

void foo(int *x, struct S *y) {
  if (x == y) {
    ...
  }
}

which we don't handle very well in the path-sensitive engine, but it's a great place to start looking into this. I wish somebody looked into this, because it sounds like a really good task for the analyzer.

lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp
64–91

You can use a single Sr here, the convenient ArrayRef constructor from a single element allows passing a single element instead of a one-element array into EmitBasicReport.

This revision is now accepted and ready to land.Sep 23 2016, 9:29 AM

However, if i was to choose, based on this discussion and other experiences, i would have wished this checker go in a different direction: make a single checker for violations of the strict aliasing rule.

That is reasonable. Ideally everybody would fix all UB.

As far as I know it's common to break the strict aliasing rule. There are users that would rather use -fno-strict-aliasing than fix their code. I want to have a checker that is useful when -fno-strict-aliasing is used and detects access out of bounds.

This revision was automatically updated to reflect the committed changes.