Page MenuHomePhabricator

[Analyzer] Fix for PR23790: bind real value returned from strcmp when modelling strcmp.

Authored by ayartsev on Jan 19 2016, 6:34 AM.



The patch is a fix for PR23790. Call to StringRef::Compare() returning [1,0,-1] is replaced with the real call to strcmp to be more precise in modelling. This also may theoretically help to find defects if a tested code relays on a value returned from strcmp like
if (strcmp(x, y) == 1) { ... }

Please review!

Diff Detail

Event Timeline

ayartsev updated this revision to Diff 45253.Jan 19 2016, 6:34 AM
ayartsev retitled this revision from to [Analyzer] Fix for PR23790: bind real value returned from strcmp when modelling strcmp..
ayartsev updated this object.
ayartsev added a reviewer: zaks.anna.
ayartsev added a subscriber: cfe-commits.
NoQ added a subscriber: NoQ.Jan 19 2016, 7:27 AM

Hmm. If we want to catch bugs resulting from alternative strcmp() implementations, then probably a test case that demonstrates the improvement would be worth it, eg.:

int x = strcmp("foo", "bar"));
if (x == 1 || x == -1)
  clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
if (x > 1 || x < -1)
  clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}

However, now we don't quite pass it yet, because the hardcoded implementation of strcmp() is still specific, just different depending on how the clang code was compiled (which may be similar to or different from the implementation on which the code under analysis relies).

In order to pass such test, we could conjure a symbol for return value of strcmp() and only enforce range on this symbol (such as [INT_MIN, -1] or [1, INT_MAX]), rather than returning a concrete value.

As Artem notes, you can't defer to the host strcmp() -- doing so is just as unsound as using StringRef::compare() less predictable under optimization of the analyzer. I think his suggested approach is the way to go: create a symbol and constrain its range based on the result of StringRef::compare.

For tests, I would suggest:

int lessThanZero = strcmp("aaa", "nnn");
clang_analyzer_eval(lessThanZero < 0); // expected-warning {{TRUE}}
clang_analyzer_eval(lessThanZero >= 0); // expected-warning {{FALSE}}
clang_analyzer_eval(lessThanZero < -13); // expected-warning {{UNKNOWN}}

int greaterThanZero = strcmp("nnn", "aaa");
clang_analyzer_eval(greaterThanZero > 0); // expected-warning {{TRUE}}
clang_analyzer_eval(greaterThanZero <= 0); // expected-warning {{FALSE}}
clang_analyzer_eval(greaterThanZero > 13); // expected-warning {{UNKNOWN}}

int equalToZero = strcmp("aaa", "aaa");
clang_analyzer_eval(equalToZero == 0); // expected-warning {{TRUE}}

These show that the analyzer does assume the strongest sound postcondition and spot checks that it doesn't assume either the stronger, StringRef-implementation-specific invariant (1/-1) or an invariant from a common unoptimized memcpy() implementation ('a' - 'n' is 13).


Whatever changes you make for the case-sensitive compare should also be analogously applied to the case-insensitive compare.

dcoughlin requested changes to this revision.Jan 19 2016, 8:46 AM
dcoughlin added a reviewer: dcoughlin.
dcoughlin edited edge metadata.
This revision now requires changes to proceed.Jan 19 2016, 8:46 AM


This also may theoretically help to find defects if a tested code relays on a value returned from strcmp like
if (strcmp(x, y) == 1) { ... }

I think it would be useful and not that difficult to write a checker that checks for this explicitly. I don't think an AST-based/clang-tidycheck would quite be enough because I suspect programmers often store the result of strcmp() in a local before using it. One approach would be to have a path-sensitive checker keep a set of symbolic result values from strcmp() and diagnose if any is ever explicitly checked for (dis)-equality with -1 or 1.

ayartsev updated this revision to Diff 46538.Feb 1 2016, 8:09 AM
ayartsev edited edge metadata.

Updated the patch, please review.

dcoughlin accepted this revision.Feb 22 2016, 9:19 AM
dcoughlin edited edge metadata.

Looks great, other than some non-standard indentation.


Non-standard indentation here.

This revision is now accepted and ready to land.Feb 22 2016, 9:19 AM
NoQ closed this revision.Aug 17 2016, 8:15 AM

Committed in r270154 (long time ago).