Page MenuHomePhabricator

[analyzer] Revise GenericTaintChecker's internal representation
Needs ReviewPublic

Authored by boga95 on Dec 15 2018, 1:53 AM.

Details

Summary

Use a more optimal data structure to represent taint propagation rules. There is no change in the checker's behavior.
Clean up the code.

Diff Detail

Event Timeline

boga95 created this revision.Dec 15 2018, 1:53 AM
Szelethus retitled this revision from Refactoring GenericTaintChecker.cpp to [analyzer] Refactoring GenericTaintChecker.cpp.Dec 15 2018, 2:22 AM
Szelethus added a reviewer: NoQ.

If you don't mind, please submit diffs with full context (diff -U99999 or something alike depending on what you use). It makes it much easier to review patches. Also the formatting looks really off in some places where 4 spaces are used. Also another nitpick, nested statement ifs should have the statements within them aligned without adding another indentation level for every line break within the statement, also in case below dyn_cast_or_null may eliminate a need for an additional check in which case you can just use && instead.

if ((D->getName().find("stdin") != StringRef::npos) && D->isExternC())
  if (const PointerType *PtrTy =
          dyn_cast<PointerType>(D->getType().getTypePtr()))
    if (PtrTy->getPointeeType().getCanonicalType() ==
        C.getASTContext().getFILEType().getCanonicalType())
      return true;

Also, is there a reason for pulling in <utility> here?

boga95 updated this revision to Diff 178351.Dec 15 2018, 2:51 AM

Upload diff with full context.

Szelethus added inline comments.Dec 15 2018, 2:57 AM
lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
208–230

Hmmm, the formatting part change already is in your other revision, D54918. Can you please rebase this patch on top of it, and add it as a dependency (that can be done on the right side, "Edit related revisions...").

I use std::move which is in the <utility>.

boga95 updated this revision to Diff 178368.Dec 15 2018, 7:58 AM

Hmm, I find the revision title and summary a little vague -- I'd expect a revision called "Refactoring" not to feature any funcitonal change, yet you changed how variadic functions are handled. Please

  • Keep purely formatting changes to your earlier revision, and rebase this patch on that
  • Edit the revision title and/or summary about what your patch does,
  • If it features any change in the checker's behavior, include tests.
lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
128

typo: marks

135–140

I like this a lot! Call sites look very nice.

146–159

In-class method definitions are implicitly inline -- could you please remove them while we're cleaning up anyways? :)

boga95 updated this revision to Diff 178373.Dec 15 2018, 11:38 AM

Rebase patch on the earlier version. Remove unnecessary inline specifiers. Fix typo.

boga95 retitled this revision from [analyzer] Refactoring GenericTaintChecker.cpp to [analyzer] Revise GenericTaintChecker's internal representation.Dec 15 2018, 11:44 AM
boga95 edited the summary of this revision. (Show Details)
NoQ added a comment.Dec 16 2018, 4:16 PM

Hi, thanks again for looking into this! I have a single piece of doubt that this doesn't change the behavior (the comment about pread), but other than that, this diff is good to go, in my opinion.

lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
210

llvm::ArrayRef has a fancy feature: it can be constructed not only from an initializer list, but also from a single element. This could have allowed skipping {} in these constructors when there is just one value in the list. That's what i would have done, but i guess it's up to you to decide which approach is prettier.

222

Now pread doesn't produce a tainted result when only the buffer is tainted but other arguments are not. Is it a functional change? I guess we should be able to add tests for it. You can also commit this patch with {0, 1, 2, 3} and remove 1 in the next patch that also adds a test.

462–463

auto here as well?

463–464

The more natural way to write this is ArgNum >= CE->getNumArgs(), as opposed to ArgNum < CE->getNumArgs() which is the usual boundary when dealing with arrays that start at 0.

482–483

auto here as well?

boga95 updated this revision to Diff 178537.Dec 17 2018, 2:49 PM
boga95 marked 4 inline comments as done.
a_sidorin added inline comments.Sun, Dec 23, 1:16 AM
lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
156

llvm::find?

191

This string formatting change looks a bit broken.

343–344

for (unsigned ArgNum : TaintArgs)?

482–483

for (unsigned ArgNum : DstArgs)?

boga95 updated this revision to Diff 179436.Sun, Dec 23, 3:07 AM
boga95 marked 7 inline comments as done.