This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Use the custom propagation rules and sinks in GenericTaintChecker
ClosedPublic

Authored by boga95 on Mar 21 2019, 5:50 AM.

Details

Summary

The TaintPropagationRule deduction uses the custom rules from the config.
Check custom sinks when looking for errors. Give an error when at least one of the specified arguments is tainted.
Emit error message and omit a parameter if it's out of bound.

Diff Detail

Event Timeline

boga95 created this revision.Mar 21 2019, 5:50 AM
boga95 updated this revision to Diff 212119.Jul 28 2019, 2:34 PM
boga95 added a subscriber: steakhal.

In general, don't emit to stderr unless we either emit a warning/error about the incorrect configuration. As an experiment, what happens when you try to emit an error in the middle of the symbolic execution? You can retrieve a DiagnosticsEngine from any decl: D->getASTContext().getDiagnostics() (it's funny how you can retrieve almost all major manager objects if you try hard enough).

clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
143

How about llvm::StringLiteral?

820

Hmmm, how do we do with qualified names (MyClass::generateTaint(), std::cin >>)?

boga95 marked 2 inline comments as done.Jul 31 2019, 2:21 AM

I think it shouldn't give compile error in case of incorrect configuration now (maybe warning) because:

  • Without qualified names, I can create a code which cannot be configured properly.
  • It can throw an error without configuration, for example:
void read(int*); // There is an existing propagation rule for it

I suggest to let it unchanged now, and I will change it when the checker can handle qualified names.
On the other hand, I think we should make this type of error configurable (from the command line). So the user can select between warnings and errors.

clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
820

These patches focus on C style functions. I have implemented the uses of qualified names, but I intended to make a separate patch for that.

boga95 updated this revision to Diff 215268.Aug 14 2019, 3:23 PM
boga95 marked an inline comment as done.

Should I do anything or it is ready?

NoQ added a comment.Aug 30 2019, 2:58 PM

I have a few immediate style issues but otherwise it looks good :)

lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
237 ↗(On Diff #215268)

StringLiterals need to always be declared as constexpr.

251 ↗(On Diff #215268)

It should be "user-defined" because it's a single adjective.

I recommend against using the word "sink" in user-facing messages because it's too jargony. Do we have a better word for this?

252 ↗(On Diff #215268)

This semicolon looks unnecessary.

595–596 ↗(On Diff #215268)

These debug prints should not land.

840 ↗(On Diff #215268)

The GenericTaintChecker:: qualifier is unnecessary.

boga95 updated this revision to Diff 218395.Sep 2 2019, 3:13 PM
Szelethus accepted this revision.Sep 5 2019, 12:54 AM
Szelethus added a reviewer: steakhal.

Some nits inline, otherwise LGTM. @steakhal, do you have anything to add to this?

lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
191 ↗(On Diff #218395)

How about only passing CustomPropagations?

605 ↗(On Diff #218395)

I get that there isn't much substance to this assert, but why remove it? We might as well populate the lines in between that and the branch.

844 ↗(On Diff #218395)

Wasn't this commited before?

This revision is now accepted and ready to land.Sep 5 2019, 12:54 AM

Also, please mark inlines done as you fix them :)

steakhal added inline comments.Sep 5 2019, 5:04 AM
lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
191 ↗(On Diff #218395)

I would even consider to move this function out of the whole class. (Not only this function, but the others as well. Like isStdin, etc.)
I think pure, free-functions (in an anonymous namespace) are easier to reason about.

844 ↗(On Diff #218395)
test/Analysis/taint-generic.c
377 ↗(On Diff #218395)

We could use this syntacs to achieve shorter lines. Note that @-1. Same for all the other lines.

mySink(x, 1, 2);
// expected-warning@-1 {{Untrusted data is passed to a user-defined sink}}
boga95 updated this revision to Diff 219172.Sep 6 2019, 2:00 PM
boga95 marked 7 inline comments as done.
Szelethus accepted this revision.Sep 6 2019, 2:46 PM

I'm fine with moving in-class function into anonymous namespace later. LGTM, feel free to commit!

steakhal added inline comments.Sep 7 2019, 1:26 AM
lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
118 ↗(On Diff #218395)

Shouldn't we still need an out-of-class initializer part for each static constexpr class member variable?
These would provide the memory locations for the declarations.

constexpr llvm::StringLiteral GenericTaintChecker::MsgUncontrolledFormatString;
constexpr llvm::StringLiteral GenericTaintChecker::MsgSanitizeSystemArgs;
constexpr llvm::StringLiteral GenericTaintChecker::MsgTaintedBufferSize;
constexpr llvm::StringLiteral GenericTaintChecker::MsgCustomSink;
boga95 closed this revision.Sep 8 2019, 1:03 PM
boga95 marked 10 inline comments as done.

Closed by 080ecafdd8b3e990e5ad19202d089c91c9c9b164.

lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
118 ↗(On Diff #218395)

Constexpr values cannot be initialized out of the class, that's why I moved them to here.

191 ↗(On Diff #218395)

I could do that in a separate patch if necessary.

605 ↗(On Diff #218395)

I think there is no need for this. Maybe I can make the ArgNum const.

251 ↗(On Diff #215268)

Currently, I don't have any better idea. I will think about it.