This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Checker for non-determinism caused by sorting of pointer-like elements
ClosedPublic

Authored by mgrang on Aug 8 2018, 7:30 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

mgrang created this revision.Aug 8 2018, 7:30 PM
NoQ added a comment.Aug 9 2018, 2:47 PM

Hmm, this might make a good optin.* checker. Also i see that this is a pure AST-based checker; did you consider putting this into clang-tidy? Like, you shouldn't if you're not using clang-tidy yourself (it's a bit messy) but this is an option. Also we're actively using ASTMatchers in Analyzer checkers anyway these days because they're just shorter and easier to read and generally cool.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
91–100 ↗(On Diff #159849)

This heuristic ("the argument of std::sort is a class whose first field is of pointer type") is quite wonky, do you have a plan on how to make it more precise?

Szelethus added inline comments.Aug 10 2018, 3:48 AM
lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
91–100 ↗(On Diff #159849)
template< class RandomIt >
void sort( RandomIt first, RandomIt last );

Since RandomIt must satisfy RandomAccesIterator, maybe you could obtain the value type with std::iterator_traits<It>::value_type, and check whether it's a pointer type.

whisperity added inline comments.Aug 10 2018, 3:50 AM
lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
88 ↗(On Diff #159849)

Brrr... equals. StringRef has a == and != operator which accepts string literals on the other side, which would result in a more concise code.

Also, this heuristic can be applied without extra changes (apart from those mentioned by NoQ and might be mentioned later by others) to other sorting functions, such as std::stable_sort and std::stable_partition. Perhaps it would be worthy to enable checking those functions too.

I'm only a beginner, but here are some things that caught my eye. I really like the idea! :)

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
28 ↗(On Diff #159849)

This comment holds little value.

46 ↗(On Diff #159849)

// end of anonymous namespace

88 ↗(On Diff #159849)

Maybe II->isStr("sort")?

93 ↗(On Diff #159849)

Is there a reason for not directly acquiring the record declaration?

119 ↗(On Diff #159849)

// end of anonymous namespace

test/Analysis/ptr-sort.cpp
1 ↗(On Diff #159849)

Always run the core checkers too.
-analyzer-checker=core,alpha.nondeterminism.PointerSorting

The basics of the heuristics look okay as comparing pointers from non-continuous block of memory is undefined, it would be worthy to check if no compiler warning (perhaps by specifying -W -Wall -Wextra -Weverything and various others of these enable-all flags!) is emitted if std::sort is instantiated for such a use case.

There is a chance for a serious false positive with the current approach! One thing which you should add as a FIXME, and implement in a follow-up patch. Using std::sort, std::unique and then erase is common technique for deduplicating a container (which in some cases [[https://stackoverflow.com/a/1041939/1428773 | might even be quicker than using, let's say std::set]]). In case my container contains arbitrary memory addresses (e.g. multiple things give me different stuff but might give the same thing multiple times) which I don't want to do things with more than once, I might use sort-unique-erase and the sort call will emit a report.

george.karpenkov requested changes to this revision.Aug 10 2018, 10:53 AM

All the comments above + if anything, should use ASTMatchers.

This revision now requires changes to proceed.Aug 10 2018, 10:53 AM
NoQ added a comment.Aug 10 2018, 11:03 AM

I'm only a beginner, but here are some things that caught my eye. I really like the idea! :)

Thanks, i appreciate help with reviews greatly.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
93 ↗(On Diff #159849)

I suspect it's about typedefs. We usually use getCanonicalType() for this purpose.

test/Analysis/ptr-sort.cpp
1 ↗(On Diff #159849)

This is a good discipline, but it's not very useful in case of AST-based checkers because they don't interact at all with each other.

Thanks for all your review comments. I will try to address them soon.

Meanwhile, I have started an email thread on the general problem of writing checkers/sanitizers to detect non-deterministic behaviors. See http://lists.llvm.org/pipermail/llvm-dev/2018-August/125191.html.

The background for this is that I had already added support in LLVM to uncover 2 types of non-deterministic behaviors: iteration of unordered containers and sorting of keys with same values. But that is only for LLVM code, not user code. I now wanted to write checkers so that we could detect non-deterministic behavior in user code. It would be great if the reviewers here could chip-in with comments/suggestions for this. Thanks!

MTC added a subscriber: MTC.Aug 12 2018, 7:27 PM
NoQ added a comment.Aug 14 2018, 3:07 PM

I guess one of the things the analyzer could find with path-sensitive analysis is direct comparison of non-aliasing pointers. Not only this is non-deterministic, but there's a related problem that comparison for equality would always yield false and is therefore useless. However, there will be many false negatives because most of the time it's hard to figure out if pointers alias by looking at a small part of the program (a call stack of at most 3-4 functions in the middle of nowhere), as the analyzer does.

mgrang updated this revision to Diff 161152.Aug 16 2018, 5:44 PM

Changed patch to use AST Matchers.

This was my first time using AST matchers so it took me a while to figure out how exactly to get this right. clang-query helped a lot. Backspace seems to be a problem with clang-query though.

mgrang updated this revision to Diff 161165.Aug 16 2018, 10:12 PM
mgrang edited the summary of this revision. (Show Details)Aug 16 2018, 10:12 PM

Added checks for more algorithms: stable_sort, is_sorted, partial_sort, partition, stable_partition, nth_element.

I think testcases for non-class iterator objects would be valuable.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
72–76 ↗(On Diff #161165)

I'm not a matcher-expert, but here we go:

As I understand it, you're matching iterator objects that wrap a pointer to pointer. Would it match this case?:

void fill(int **IntPointerArray);

int main() {
  int *IntPointerArray[20]; // or a simple int array
  fill(IntPointerArray);

  std::sort(IntPointerArray, IntPointerArray + 20);
}

int** is an RandomAccessIterator, but is not a CXXRecordDecl. As I mentioned earlier, it might be worth taking a look at hasType, and attempt to acquire value_type, as it is required for STL algorithms (it could be in a base class as well). value_type is also predefined through std::iterator_traits for pointer iterators like int**.

Szelethus added inline comments.Aug 17 2018, 9:51 AM
lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
72–76 ↗(On Diff #161165)

Also, this could lead to false positives, as iterators may store pointers to pointers that don't compare memory addresses. And it could maybe cause false negatives, if the value_type is a pointer type but the elements are stored in wrappers.

As I understand it.

I think testcases for non-class iterator objects would be valuable.

Thanks @Szelethus. Yes, as you correctly pointed out this would not match non-class iterator objects.

This is my first time with ASTMatchers and I am not sure how to get the value_type from hasType (I don't see a matcher for value_types in ASTMatchers.h). Would I end up using a visitor for that? If yes, then maybe the entire check for pointer types needs to be done via visitors? Sorry for the newbie questions :)

mgrang added inline comments.Aug 17 2018, 1:15 PM
lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
72–76 ↗(On Diff #161165)

Yes, I will try to fine tune the heuristic in subsequent patches. I have noted the false-positives in the FIXME.

Szelethus added a comment.EditedAug 21 2018, 5:00 AM

This is my first time with ASTMatchers and I am not sure how to get the value_type from hasType (I don't see a matcher for value_types in ASTMatchers.h). Would I end up using a visitor for that? If yes, then maybe the entire check for pointer types needs to be done via visitors? Sorry for the newbie questions :)

I played around for a bit, my clang-query is a little out of date, but here's what I found: You can match typedefs with typedefDecl(), and using by typeAliasDecl(), and then add hasName("value_type") as a parameter. As to how to check whether a type has any of these, I'm a little unsure myself, but you could use hasDescendant, and narrow down the matches.

I'm not sure whether this checks inherited type declarations, and it sure doesn't check template specializations of std::iterator_traits, but it is a good start :) I'll revisit this when I have a little more time on my hand.

Minor comments in-line, looking good on first glance.

I'm not sure if we discussed, has this checker been tried on some in-the-wild examples? To see if there are some real findings or crashes?
There is a good selection of projects here in this tool: https://github.com/Xazax-hun/csa-testbench.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
52 ↗(On Diff #161165)

I generally have a concern about calling these thing "keys". How did you mean this? The file's top comment says elements.

81–87 ↗(On Diff #161165)

Please order this listing.

102 ↗(On Diff #161165)

emitDiagnostics takes a const T&. To prevent unnecessary copy operations in this iteration, you should also use const BoundNode &Match.

46 ↗(On Diff #159849)

This comment can be marked as Done.

whisperity added a project: Restricted Project.Sep 4 2018, 5:01 AM

@Szelethus clang-query seems to sometimes not include matcher functions that are perfectly available in the code... I recently had some issue with using isUserDefined(), it was available in my code, but clang-query always rejected it. Seems there isn't an automatic 1:1 mapping between the two.

@NoQ What is your decision about introducing a new NonDeterminism group? Should be, shan't be, is this name right?

mgrang updated this revision to Diff 163876.Sep 4 2018, 11:10 AM

Addressed comments.

mgrang retitled this revision from [Analyzer] Checker for non-determinism caused by sorting of pointer-like keys to [Analyzer] Checker for non-determinism caused by sorting of pointer-like elements.Sep 4 2018, 11:10 AM
mgrang edited the summary of this revision. (Show Details)
mgrang set the repository for this revision to rC Clang.
mgrang marked 7 inline comments as done.

I'm not sure if we discussed, has this checker been tried on some in-the-wild examples? To see if there are some real findings or crashes?
There is a good selection of projects here in this tool: https://github.com/Xazax-hun/csa-testbench.

Thanks @whisperity for pointing me to the Static Analyzer tests. I will run the checker on it and report my findings here.

mgrang added a comment.EditedSep 4 2018, 11:17 AM

This is my first time with ASTMatchers and I am not sure how to get the value_type from hasType (I don't see a matcher for value_types in ASTMatchers.h). Would I end up using a visitor for that? If yes, then maybe the entire check for pointer types needs to be done via visitors? Sorry for the newbie questions :)

I played around for a bit, my clang-query is a little out of date, but here's what I found: You can match typedefs with typedefDecl(), and using by typeAliasDecl(), and then add hasName("value_type") as a parameter. As to how to check whether a type has any of these, I'm a little unsure myself, but you could use hasDescendant, and narrow down the matches.

I'm not sure whether this checks inherited type declarations, and it sure doesn't check template specializations of std::iterator_traits, but it is a good start :) I'll revisit this when I have a little more time on my hand.

I played around a lot with clang-query in the past week but I haven't been able to match typedefDecl with hasName("value_type"). Here are some matchers I tried:

match callExpr(allOf (callee(functionDecl(hasName("std::sort"))), hasArgument(0,          stmt(hasDescendant(expr(hasType(typedefType(hasDeclaration(typedefNameDecl(hasName("value_type")))))))))))

match callExpr(allOf (callee(functionDecl(hasName("std::sort"))), hasArgument(0,            hasDescendant(expr(hasType(typedefType(hasDeclaration(typedefDecl(matchesName(".*value.*"))))))))))

match callExpr(allOf (callee(functionDecl(hasName("std::sort"))), hasArgument(0,            hasDescendant(declRefExpr(to(fieldDecl(hasName("value_type"))))))))))

Here's the AST dump for the IntPointerArray example:

FunctionDecl 0x639e958 <../../tst/s.cpp:5:1, line:10:1> line:5:5 main 'int ()'
`-CompoundStmt 0x639f1d0 <col:12, line:10:1>
  |-DeclStmt 0x639ead0 <line:6:3, col:27>
  | `-VarDecl 0x639ea70 <col:3, col:26> col:8 used IntPointerArray 'int *[20]'
  |-CallExpr 0x639ebd0 <line:7:3, col:23> 'void'
  | |-ImplicitCastExpr 0x639ebb8 <col:3> 'void (*)(int **)' <FunctionToPointerDecay>
  | | `-DeclRefExpr 0x639eb68 <col:3> 'void (int **)' lvalue Function 0x639e890 'fill' 'void (int **)'
  | `-ImplicitCastExpr 0x639ec00 <col:8> 'int **' <ArrayToPointerDecay>
  |   `-DeclRefExpr 0x639eb40 <col:8> 'int *[20]' lvalue Var 0x639ea70 'IntPointerArray' 'int *[20]'
  `-CallExpr 0x639f180 <line:9:3, col:50> 'void'
|-ImplicitCastExpr 0x639f168 <col:3, col:8> 'void (*)(int **, int **)' <FunctionToPointerDecay>
    | `-DeclRefExpr 0x639f0d0 <col:3, col:8> 'void (int **, int **)' lvalue Function 0x639efd0 'sort' 'void (int **, int **)' (FunctionTemplate 0x638dd18 'sort')
    |-ImplicitCastExpr 0x639f1b8 <col:13> 'int **' <ArrayToPointerDecay>
    | `-DeclRefExpr 0x639ec98 <col:13> 'int *[20]' lvalue Var 0x639ea70 'IntPointerArray' 'int *[20]'
    `-BinaryOperator 0x639ed20 <col:30, col:48> 'int **' '+'
      |-ImplicitCastExpr 0x639ed08 <col:30> 'int **' <ArrayToPointerDecay>
      | `-DeclRefExpr 0x639ecc0 <col:30> 'int *[20]' lvalue Var 0x639ea70 'IntPointerArray' 'int *[20]'
      `-IntegerLiteral 0x639ece8 <col:48> 'int' 20

Make sure you use only the C++ projects like BitCoin, LLVM/Clang, ProtoBuf and such from the Xazax CSA Test suite because this checker is not really applicable to C projects.

Generally I like the idea (personally I've ran into a similar issue to ordering of sets in Java missed and on another OS we failed the tests where they tried to stringify elements and compare the output...) very much, this is one of those subleties you can really miss when writing and when reviewing code, and it will only bite you in the ass months down the line.

I think I'll summon @martong here, ASTMatchers are not my fortée, but he has been tinkering with them much recently.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
108–110 ↗(On Diff #163876)

Speaking of not being applicable to C code, I think this snippet here should allow you to skip running the checker if you are not on a C++ file:

if (!Mgr.getLangOpts().CPlusPlus)
      return;

From this little information I have hear are my thoughts:

match callExpr(allOf (callee(functionDecl(hasName("std::sort"))), hasArgument(0, hasDescendant(declRefExpr(to(fieldDecl(hasName("value_type"))))))))))

I think this is a good direction, but keep in mind that value_type is a typedef, thus you should use the typedefNameDecl matcher instead of the fieldDecl.

(Also if I understand correctly then this is good that this matcher does not match in case of the intPointerArray example, because the array does not have any member at all ...)

From this little information I have hear are my thoughts:

match callExpr(allOf (callee(functionDecl(hasName("std::sort"))), hasArgument(0, hasDescendant(declRefExpr(to(fieldDecl(hasName("value_type"))))))))))

I think this is a good direction, but keep in mind that value_type is a typedef, thus you should use the typedefNameDecl matcher instead of the fieldDecl.

Thanks @martong Yes, I tried another matcher with typedefNameDecl which also did not match:

match callExpr(allOf (callee(functionDecl(hasName("std::sort"))), hasArgument(0, stmt(hasDescendant(expr(hasType(typedefType(hasDeclaration(typedefNameDecl(hasName("value_type")))))))))))

(Also if I understand correctly then this is good that this matcher does not match in case of the intPointerArray example, because the array does not have any member at all ...)

How then do you think I should try matching the IntPointerArray example? I am now leaning towards not doing this via AST Matchers as I am not sure if I can easily match more complex cases.

mgrang updated this revision to Diff 164086.Sep 5 2018, 11:19 AM

Enabled checker only for C++.

mgrang marked an inline comment as done.Sep 5 2018, 11:20 AM

How then do you think I should try matching the IntPointerArray example? I am now leaning towards not doing this via AST Matchers as I am not sure if I can easily match more complex cases.

I've sadly reached the end of my knowledge on AST based analysis, but just to throw this in there, are you aware of this amazing guide about the static analyzer? https://github.com/haoNoQ/clang-analyzer-guide/releases There are a couple sections I believe could help you.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
108–110 ↗(On Diff #163876)

Actually, I'm very much shooting in the dark here, but wouldn't this work with objective C++?

Why explicitly skip C projects? We would not be able to match std::X functions anyway.

The code looks fine to me; @mgrang What evaluation steps have you performed? You note there are a few notable false positives, do you have ideas for suppressing those?

Why explicitly skip C projects? We would not be able to match std::X functions anyway.

Why spend time constructing matchers and running on the AST when we are sure that generating any result is impossible? Skipping the entire useless traversal of the tree can end up saving precious execution time that should not be overlooked.

NoQ added a comment.Sep 6 2018, 6:05 PM

I'm generally fine with landing the patch as long as the overall direction is chosen (and be moved towards) for how to safely identify the non-deterministic iterators (my previous inline question). This has to be addressed before we move out of alpha, but i believe we should pick at least some direction now, because i've very little idea about how to actually do that safely, and i'm worried that we'll get stuck in a dead end here.

Package name sounds good.

test/Analysis/ptr-sort.cpp
1 ↗(On Diff #164086)

There's no need to test with -analyzer-output=text for a syntactic check, because you're not checking path diagnostics. It'll make tests tidier by not forcing you to test notes.

If you ever decide to add extra "note" pieces to your checker, they'll be shown in the default output anyway.

Hey @mgrang! Let's see some results on some projects and answer NoQ's comment, then I think we can put this in for all the world to see.

Hey @mgrang! Let's see some results on some projects and answer NoQ's comment, then I think we can put this in for all the world to see.

Was busy with CppCon 2018 in the past couple of weeks so could not work on this. Hope to test this out soon.

I'm marking it as "needs changes" for now -- the idea seems reasonable, but it's hard to tell without a thorough evaluation on at least a few codebases.
Additionally, we should figure out a checker group for this - if "alpha" means "work in progress", then it should be under something else, like "linting" or similar.

george.karpenkov requested changes to this revision.Oct 22 2018, 11:13 AM
This revision now requires changes to proceed.Oct 22 2018, 11:13 AM
mgrang updated this revision to Diff 186549.Feb 12 2019, 3:01 PM
mgrang added a comment.EditedFeb 12 2019, 3:06 PM

Reviving this now that I have some cycles to work on this.

So I tried running this on csa-testbench projects but I didn't have much success. I always run into a bunch of build/env related errors:

python run_experiments.py --config myconfig.json

15:05:20 [libcxx] Checking out project... 
[ERROR] Unknown option: json

15:05:22 [libcxx] LOC: ?.
15:05:22 [libcxx] Generating build log... 
15:05:22 [libcxx_master] Analyzing project... 
[ERROR] Traceback (most recent call last):
  File "/local/mnt/workspace/mgrang/comm_analyzer/CodeChecker/cc_bin/CodeChecker.py", line 20, in <module>
    from shared.ttypes import RequestFailed
ImportError: No module named shared.ttypes

Not sure why it tries to invoke cc_bin/CodeChecker.py. There's already a bin/CodeChecker.py.

I installed CodeChecker and various other dependencies mentioned in the README. Here's what myconfig.json looks like:

{
  "projects": [
    {
      "url": "git://github.com/chapuni/libcxx.git",
      "name": "libcxx",
      "clang_sa_args": "--analyze -Xanalyzer -analyzer-checker=alpha.nondeterminism.PointerSorting",
      "clang_path": "path/to/llvm/install/bin",
      "binary_dir": "build"
    },
    {
      "name": "bitcoin",
      "url": "https://github.com/bitcoin/bitcoin.git",
      "tag": "v0.15.1",
      "clang_sa_args": "--analyze -Xanalyzer -analyzer-checker=alpha.nondeterminism.PointerSorting",
      "clang_path": "path/to/llvm/install/bin",
      "binary_dir": "build"
    },
    {
      "name": "xerces-c",
      "url": "https://github.com/apache/xerces-c.git",
      "prepared": true,
      "clang_sa_args": "--analyze -Xanalyzer -analyzer-checker=alpha.nondeterminism.PointerSorting",
      "clang_path": "path/to/llvm/install/bin",
      "binary_dir": "build"
    }
  ],

  "CodeChecker": {
    "url": "http://localhost:8001/Default",
    "analyze_args": "",
    "store_args": ""
  }
}

@rnkovacs and @xazax.hun might be able to help with that?

Szelethus set the repository for this revision to rC Clang.Feb 13 2019, 4:29 AM

Reviving this now that I have some cycles to work on this.

So I tried running this on csa-testbench projects but I didn't have much success. I always run into a bunch of build/env related errors:

python run_experiments.py --config myconfig.json

15:05:20 [libcxx] Checking out project... 
[ERROR] Unknown option: json

15:05:22 [libcxx] LOC: ?.
15:05:22 [libcxx] Generating build log... 
15:05:22 [libcxx_master] Analyzing project... 
[ERROR] Traceback (most recent call last):
  File "/local/mnt/workspace/mgrang/comm_analyzer/CodeChecker/cc_bin/CodeChecker.py", line 20, in <module>
    from shared.ttypes import RequestFailed
ImportError: No module named shared.ttypes

Hi!

Sorry for the late response. Does CodeChecker work for you (when not using the testbench)?
I think one of the most common reason for such errors is when we do not source the virtualenv of CodeChecker so the dependencies are not available.

mgrang added a comment.EditedFeb 14 2019, 12:47 PM

Reviving this now that I have some cycles to work on this.

So I tried running this on csa-testbench projects but I didn't have much success. I always run into a bunch of build/env related errors:

python run_experiments.py --config myconfig.json

15:05:20 [libcxx] Checking out project... 
[ERROR] Unknown option: json

15:05:22 [libcxx] LOC: ?.
15:05:22 [libcxx] Generating build log... 
15:05:22 [libcxx_master] Analyzing project... 
[ERROR] Traceback (most recent call last):
  File "/local/mnt/workspace/mgrang/comm_analyzer/CodeChecker/cc_bin/CodeChecker.py", line 20, in <module>
    from shared.ttypes import RequestFailed
ImportError: No module named shared.ttypes

Hi!

Sorry for the late response. Does CodeChecker work for you (when not using the testbench)?
I think one of the most common reason for such errors is when we do not source the virtualenv of CodeChecker so the dependencies are not available.

I guess I hadn't sourced the CodeChecker venv correctly. Now that I did source it and started the CodeChecker server I seem to be progressing. The error I get now is compiler related (which I will dig into).

12:51:35 [bitcoin] Checking out project... 
[ERROR] Unknown option: json

[ERROR] configure: error: in `csa-testbench/projects/bitcoin':
configure: error: C++ compiler cannot create executables
See `config.log' for more details

Should I be worried about the "[ERROR] Unknown option: json" message?

mgrang added a comment.EditedFeb 15 2019, 12:54 PM

The error I now get is:

clang is not able to compile a simple test program.
/usr/bin/ld: unrecognised emulation mode: armelf_linux_eabi
  Supported emulations: elf_x86_64 elf32_x86_64 elf_i386 elf_iamcu i386linux
  elf_l1om elf_k1om i386pep i386pe

This is because I need to pass the correct -target and -L options to config. I see that there is a configure_command option that can be set. However, different projects use different build systems (make, cmake, bake, etc). How do we then pass flags to their respective configures?

mgrang added a comment.EditedFeb 15 2019, 3:52 PM

I hacked around the run_experiments.py to set CMAKE_C_FLAGS and now I see the following error in stats.html. Note: I see the same error with an existing checker like PointerArithmChecker. And I do not hit this assert when I run the checker outside of csa-testbench.

Assertion `CheckerTags.count(tag) != 0 && "Requested checker is not registered! Maybe you should add it as a " "dependency in Checkers.td?"'

Here's my config file:

{
  "projects": [
    {
      "name": "TinyXML2",
      "url": "https://github.com/leethomason/tinyxml2.git",
      "clang_sa_args": "-target x86_64-linux-gnu -L /usr/lib/x86_64-linux-gnu -B /usr/lib/x86_64-linux-gnu -B /usr/lib/gcc/x86_64-linux-gnu/5 -L /usr/lib/gcc/x86_64-linux-gnu/5 --analyze -Xanalyzer -analyzer-checker=core -Xanalyzer -analyzer-checker=alpha.core.PointerArithm -Xclang -analyzer-stats",
      "clang_path": "path/install/bin",
      "binary_dir": "build"
    }
  ],

  "CodeChecker": {
    "url": "http://localhost:8001/Default",
    "analyze_args": "",
    "store_args": ""
  }
}

Yeah, it seems upstream has moved away due to @Szelethus' implementation of a much more manageable "checker dependency" system. You most likely will have to rebase your patch first, then check what you missed which got added to other merged, existing checkers.

mgrang updated this revision to Diff 187418.Feb 19 2019, 11:17 AM

Yeah, it seems upstream has moved away due to @Szelethus' implementation of a much more manageable "checker dependency" system. You most likely will have to rebase your patch first, then check what you missed which got added to other merged, existing checkers.

Yes, I see that D55429, D54438 added a new dependency field for checkers. However, I see that not all checkers need/have that field. In this case, shouldn't providing a ento::shouldRegisterPointerSortingChecker function which returns true be enough to register the checker?
Moreover, as I said I hit this assert only when invoking the checker via csa-testbench. Outside that it seems to work fine. For example, my unit test Analysis/ptr-sort.cpp passes.

Szelethus added a comment.EditedFeb 19 2019, 12:25 PM

Yeah, it seems upstream has moved away due to @Szelethus' implementation of a much more manageable "checker dependency" system. You most likely will have to rebase your patch first, then check what you missed which got added to other merged, existing checkers.

Moreover, as I said I hit this assert only when invoking the checker via csa-testbench. Outside that it seems to work fine. For example, my unit test Analysis/ptr-sort.cpp passes.

It's because it invokes CodeChecker, which by default enables valist.Uninitialized, but not ValistBase. Do you have assert failures after my hotfix?

Yes, I see that D55429, D54438 added a new dependency field for checkers. However, I see that not all checkers need/have that field. In this case, shouldn't providing a ento::shouldRegisterPointerSortingChecker function which returns true be enough to register the checker?

It is enough, and this really is not documented properly. There is a revision to solve this problem here: D58065. I guess your input, as someone who didn't participate in the checker dependency related patch reviews would be invaluable, in terms of whether my description is understandable enough.

Edit: and yes, the Dependency field in Checkers.td is completely optional. The shouldRegister function is a tool to prevent checkers being enabled for a particular language -- some C++ checkers, for example, crash when a code is being analyzed according to the C standard.

mgrang updated this revision to Diff 187466.Feb 19 2019, 4:19 PM

It's because it invokes CodeChecker, which by default enables valist.Uninitialized, but not ValistBase. Do you have assert failures after my hotfix?

@Szelethus Thanks. I run into this assert when run through CodeChecker:

Assertion `CheckerTags.count(tag) != 0 && "Requested checker is not registered! Maybe you should add it as a " "dependency in Checkers.td?"'

Is there a workaround?

Also I don't see the helptext for PointerSorting when I run:

clang -cc1 -analyzer-checker-help | grep Pointer

  alpha.core.PointerArithm        Check for pointer arithmetic on locations other than array elements
  alpha.core.PointerSub           Check for pointer subtractions on two pointers pointing to different memory chunks
  alpha.nondeterminism.PointerSorting
  cplusplus.InnerPointer          Check for inner pointers of C++ containers used after re/deallocation

There is a revision to solve this problem here: D58065. I guess your input, as someone who didn't participate in the checker dependency related patch reviews would be invaluable, in terms of whether my description is understandable enough.

Thanks. I took a look at the documentation and it looks fine to me (modulo the comments from other reviewers and a couple of minor typos). I feel the csa-testbench documentation (https://github.com/Xazax-hun/csa-testbench) is very minimal and does not document a lot of intricacies which I had to figure out by trial-and-error.

Szelethus added a comment.EditedFeb 20 2019, 1:07 AM

It's because it invokes CodeChecker, which by default enables valist.Uninitialized, but not ValistBase. Do you have assert failures after my hotfix?

@Szelethus Thanks. I run into this assert when run through CodeChecker:

Assertion `CheckerTags.count(tag) != 0 && "Requested checker is not registered! Maybe you should add it as a " "dependency in Checkers.td?"'

Is there a workaround?

Uhh, I'm afraid the only way out is to actually fix the problem :) couple questions:

  1. Are you *sure* that you rebased to rC354235 (which is the hotfix I mentioned), CodeChecker runs *that* clang when the assert failure happens?
  2. Can you provide a stacktrace? Luckily, that would for sure point me to the problematic checker.

Also I don't see the helptext for PointerSorting when I run:

clang -cc1 -analyzer-checker-help | grep Pointer

  alpha.core.PointerArithm        Check for pointer arithmetic on locations other than array elements
  alpha.core.PointerSub           Check for pointer subtractions on two pointers pointing to different memory chunks
  alpha.nondeterminism.PointerSorting
  cplusplus.InnerPointer          Check for inner pointers of C++ containers used after re/deallocation

How about clang -cc1 -analyzer-checker-help | grep Pointer -A3? There could be a linebreak after the ckecker's name (raises the question though whether it should be there).

There is a revision to solve this problem here: D58065. I guess your input, as someone who didn't participate in the checker dependency related patch reviews would be invaluable, in terms of whether my description is understandable enough.

Thanks. I took a look at the documentation and it looks fine to me (modulo the comments from other reviewers and a couple of minor typos). I feel the csa-testbench documentation (https://github.com/Xazax-hun/csa-testbench) is very minimal and does not document a lot of intricacies which I had to figure out by trial-and-error.

Due to an upcoming conference, I didn't bother with it much, i just used CodeChecker on its own, which is fairly well documented.
I'm trying to move my checker out of alpha, and my testing goes as follows:

  1. Clone LLVM+Clang, cppcheck, rtags, bitcoin, xerces (you could throw in vim, tmux, as your checker isnt C++ exclusive (see edit), or whatever else)
  2. Create compile_commands.json either with CMake (-DCANE_GENERATE_COMPILE_COMMANDS, or smt similar), or CodeChecker log
  3. CodeChecker analyze projects, paying attention to not forgetting the CodeChecker flag --verbose debug_analyzer (and enabling your checker ofc), and piping the output to a file
  4. Create a CodeChecker server, CodeChecker store the results, and stare at the web gui for hours. Its very pretty btw ;)

Csa-testbench is a work in progress, so I guess you can expect more in the future :)

Thanks for your work, even through a few annoyances, very much appreciated!

Edit: Since your checker is C++ exclusive, please return true only if LangOpts.CPlusPlus is true in the shouldRegister function. Didn't catch my eye at first.

mgrang updated this revision to Diff 187695.Feb 20 2019, 4:41 PM

It's because it invokes CodeChecker, which by default enables valist.Uninitialized, but not ValistBase. Do you have assert failures after my hotfix?

@Szelethus Thanks. I run into this assert when run through CodeChecker:

Assertion `CheckerTags.count(tag) != 0 && "Requested checker is not registered! Maybe you should add it as a " "dependency in Checkers.td?"'

Is there a workaround?

Uhh, I'm afraid the only way out is to actually fix the problem :) couple questions:

  1. Are you *sure* that you rebased to rC354235 (which is the hotfix I mentioned), CodeChecker runs *that* clang when the assert failure happens?

I guess, I totally missed your hotfix. After sync'ing to the latest sources I no longer hit the assert. Thanks!

  1. Can you provide a stacktrace? Luckily, that would for sure point me to the problematic checker.

Also I don't see the helptext for PointerSorting when I run:

clang -cc1 -analyzer-checker-help | grep Pointer

  alpha.core.PointerArithm        Check for pointer arithmetic on locations other than array elements
  alpha.core.PointerSub           Check for pointer subtractions on two pointers pointing to different memory chunks
  alpha.nondeterminism.PointerSorting
  cplusplus.InnerPointer          Check for inner pointers of C++ containers used after re/deallocation

How about clang -cc1 -analyzer-checker-help | grep Pointer -A3? There could be a linebreak after the ckecker's name (raises the question though whether it should be there).

Yep, line break it was. I see the help text with -A1.

There is a revision to solve this problem here: D58065. I guess your input, as someone who didn't participate in the checker dependency related patch reviews would be invaluable, in terms of whether my description is understandable enough.

Thanks. I took a look at the documentation and it looks fine to me (modulo the comments from other reviewers and a couple of minor typos). I feel the csa-testbench documentation (https://github.com/Xazax-hun/csa-testbench) is very minimal and does not document a lot of intricacies which I had to figure out by trial-and-error.

Due to an upcoming conference, I didn't bother with it much, i just used CodeChecker on its own, which is fairly well documented.
I'm trying to move my checker out of alpha, and my testing goes as follows:

  1. Clone LLVM+Clang, cppcheck, rtags, bitcoin, xerces (you could throw in vim, tmux, as your checker isnt C++ exclusive (see edit), or whatever else)
  2. Create compile_commands.json either with CMake (-DCANE_GENERATE_COMPILE_COMMANDS, or smt similar), or CodeChecker log
  3. CodeChecker analyze projects, paying attention to not forgetting the CodeChecker flag --verbose debug_analyzer (and enabling your checker ofc), and piping the output to a file
  4. Create a CodeChecker server, CodeChecker store the results, and stare at the web gui for hours. Its very pretty btw ;)

Csa-testbench is a work in progress, so I guess you can expect more in the future :)

Thanks for your work, even through a few annoyances, very much appreciated!

Edit: Since your checker is C++ exclusive, please return true only if LangOpts.CPlusPlus is true in the shouldRegister function. Didn't catch my eye at first.

Done.

So I was able compile a couple of C++ code bases through csa-testbench. I built cppcheck and tinyxml2 without any problems. cppcheck has one invocation std::sort but the keys are not pointers whereas tinyxml2 does not use std::sort. I tried bitcoin, rtags, xerces but run into a lot of configure/build errors.

So I was able compile a couple of C++ code bases through csa-testbench. I built cppcheck and tinyxml2 without any problems. cppcheck has one invocation std::sort but the keys are not pointers whereas tinyxml2 does not use std::sort. I tried bitcoin, rtags, xerces but run into a lot of configure/build errors.

@Szelethus Are you able to build bitcoin, xerces, etc? I seem to hit "Could not link against boost_filesystem" errors with bitcoin. I have installed dependencies mentioned in https://stackoverflow.com/questions/9132335/configure-error-could-not-link-against-boost-system/47773206 and https://github.com/bitcoin/bitcoin/issues/8749.

bitcoin:

sudo apt-get install build-essential libtool autotools-dev automake pkg-config libssl-dev libevent-dev bsdmainutils python3 libboost-all-dev software-properties-common libminiupnpc-dev libzmq3-dev libqt5gui5 libqt5core5a libqt5dbus5 qttools5-dev qttools5-dev-tools libprotobuf-dev protobuf-compiler
sudo add-apt-repository ppa:bitcoin/bitcoin
sudo apt-get update
sudo apt-get install libdb4.8-dev libdb4.8++-dev

git clone https://github.com/bitcoin/bitcoin.git
cd bitcoin

./autogen.sh
./configure
CodeChecker log -b "make -j13" -o compile_commands.json

xerces:

wget http://www-eu.apache.org/dist//xerces/c/3/sources/xerces-c-3.2.2.tar.gz
tar xf xerces-c-3.2.2.tar.gz
mv xerces-c-3.2.2 xerces
rm xerces-c-3.2.2.tar.gz
cd xerces

./configure
CodeChecker log -b "make -j13" -o compile_commands.json

where -j13 means 13 threads.

mgrang added a comment.EditedFeb 22 2019, 4:48 PM

bitcoin:

sudo apt-get install build-essential libtool autotools-dev automake pkg-config libssl-dev libevent-dev bsdmainutils python3 libboost-all-dev software-properties-common libminiupnpc-dev libzmq3-dev libqt5gui5 libqt5core5a libqt5dbus5 qttools5-dev qttools5-dev-tools libprotobuf-dev protobuf-compiler
sudo add-apt-repository ppa:bitcoin/bitcoin
sudo apt-get update
sudo apt-get install libdb4.8-dev libdb4.8++-dev

git clone https://github.com/bitcoin/bitcoin.git
cd bitcoin

./autogen.sh
./configure
CodeChecker log -b "make -j13" -o compile_commands.json

xerces:

wget http://www-eu.apache.org/dist//xerces/c/3/sources/xerces-c-3.2.2.tar.gz
tar xf xerces-c-3.2.2.tar.gz
mv xerces-c-3.2.2 xerces
rm xerces-c-3.2.2.tar.gz
cd xerces

./configure
CodeChecker log -b "make -j13" -o compile_commands.json

where -j13 means 13 threads.

Thanks @Charusso . I have already tried these steps.

By running just ./configure I get this error:

configure: error: cannot figure out how to use std::atomic

Instead, this is the configure command I use for bitcoin:

./configure --disable-wallet CFLAGS='-target x86_64-linux-gnu -L /usr/lib/x86_64-linux-gnu -B /usr/lib/x86_64-linux-gnu -B /usr/lib/gcc/x86_64-linux-gnu/5 -L /usr/lib/gcc/x86_64-linux-gnu/5 -I /usr/include/c++/5 -I /usr/include/x86_64-linux-gnu/c++/5' CXXFLAGS='-target x86_64-linux-gnu -L /usr/lib/x86_64-linux-gnu -B /usr/lib/x86_64-linux-gnu -B /usr/lib/gcc/x86_64-linux-gnu/5 -L /usr/lib/gcc/x86_64-linux-gnu/5 -I /usr/include/c++/5 -I /usr/include/x86_64-linux-gnu/c++/5'

And then I hit this:

configure: error: Could not link against boost_filesystem !

So I was able compile a couple of C++ code bases through csa-testbench. I built cppcheck and tinyxml2 without any problems. cppcheck has one invocation std::sort but the keys are not pointers whereas tinyxml2 does not use std::sort. I tried bitcoin, rtags, xerces but run into a lot of configure/build errors.

Thats great. How about LLVM+Clang? That'll be a pain in the butt to analyze, but is pretty great for testing. Also, did you clone rtags with the git option --recursive?

include/clang/StaticAnalyzer/Checkers/Checkers.td
97 ↗(On Diff #187695)

Hmmm, okay, so your checker ks C++ exclusive I belive? How about making this checker reside in alpha.cplusplus? Rgard this one as more of a question.

lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp
1 ↗(On Diff #187695)

Missing emacs thingie here?

30 ↗(On Diff #187695)

I heard somewhere that LLVM::StringLiteral is used for global constexpr strings? Anyway, constexpr could be added here if I'm wrong sbout that :)

110 ↗(On Diff #187695)

This is no longer needed, your checker wont be registered if the shouldRegister function returns false.

Szelethus accepted this revision.Feb 23 2019, 3:27 AM

But, as a work-in-progress alpha checker, the direction is set and looks great. Please let @NoQ have the final say.

mgrang updated this revision to Diff 188232.Feb 25 2019, 11:57 AM
mgrang set the repository for this revision to rC Clang.
mgrang marked 3 inline comments as done.
mgrang marked an inline comment as done.Feb 25 2019, 12:02 PM
mgrang added inline comments.
include/clang/StaticAnalyzer/Checkers/Checkers.td
97 ↗(On Diff #187695)

This particular checker is C++ exclusive. The reason behind creating a new non-determinism category was so that we could add more language-agnostic checkers for non-determinism in future.

mgrang added a comment.EditedFeb 25 2019, 12:08 PM

So I was able compile a couple of C++ code bases through csa-testbench. I built cppcheck and tinyxml2 without any problems. cppcheck has one invocation std::sort but the keys are not pointers whereas tinyxml2 does not use std::sort. I tried bitcoin, rtags, xerces but run into a lot of configure/build errors.

Thats great. How about LLVM+Clang? That'll be a pain in the butt to analyze, but is pretty great for testing. Also, did you clone rtags with the git option --recursive?

LLVM does not directly use std::sort. It calls llvm::sort which in-turn calls std::sort (but it does have std::stable_sort, etc). However, I run into a host of errors while building LLVM inside csa-testbench. Maybe, once this patch lands, I could use llvm as a bootstrap stage1 compiler with this checker enabled.

Also, did you clone rtags with the git option --recursive?

Yes, I followed the build instructions from https://github.com/Andersbakken/rtags. But run into these errors:

./configure //this invokes "cmake" "." -DCMAKE_EXPORT_COMPILE_COMMANDS=1
  The compiler uses a libstdc++ without c++11 regex support.

./configure --clang-cxxflags '-std=c++11' //this invokes "cmake" "." -DCMAKE_EXPORT_COMPILE_COMMANDS=1 -DLIBCLANG_CXXFLAGS="-std=c++11"
  It's likely that the include file <clang-c/Index.h> could not be found!

I tried installing libclang-dev, libclang-3.8-dev, libclang-6.0-dev as well as adding -I include paths for clang-c to configure but I am not able to go past the above error.

But, as a work-in-progress alpha checker, the direction is set and looks great. Please let @NoQ have the final say.

Thanks a lot @Szelethus! I will wait for @NoQ 's comments.

But, as a work-in-progress alpha checker, the direction is set and looks great. Please let @NoQ have the final say.

Thanks a lot @Szelethus! I will wait for @NoQ 's comments.

@NoQ Could you please review this patch?

mgrang added a comment.Mar 4 2019, 3:41 PM

@NoQ Ping 2 for reviews please.

NoQ accepted this revision.Mar 7 2019, 4:03 PM

Ok, thanks, sounds good! Let's land and see where it goes.

test/Analysis/ptr-sort.cpp
4–25 ↗(On Diff #188232)

Please feel free to put these directly into system-header-simulator-cxx.h.

mgrang updated this revision to Diff 189905.Mar 8 2019, 12:11 PM
mgrang marked an inline comment as done.
This revision was not accepted when it landed; it landed in state Needs Review.Mar 8 2019, 12:14 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2019, 12:14 PM