This is an archive of the discontinued LLVM Phabricator instance.

Checker of proper vfork usage
ClosedPublic

Authored by ygribov on Oct 23 2015, 2:12 AM.

Details

Summary

[analyzer] Add VforkChecker to find unsafe code in vforked process.

This syntactic checker looks for unsafe constructs in vforked process:
function calls (excluding whitelist), memory write and returns.
This was originally motivated by a vfork-related bug in xtables package.

Patch by Yury Gribov.

Diff Detail

Repository
rL LLVM

Event Timeline

ygribov updated this revision to Diff 38218.Oct 23 2015, 2:12 AM
ygribov retitled this revision from to Checker of proper vfork usage.
ygribov updated this object.
ygribov set the repository for this revision to rL LLVM.
ygribov added a subscriber: cfe-commits.
vsk added a subscriber: vsk.Oct 23 2015, 12:49 PM

Thanks for the patch.

I didn't know vfork() is in regular use. Afaict copy-on-write fork() and threading both solve a similar (the same?) problem. I'm also not convinced that flagging all stores in the vfork child process is the right thing to do, since the FP rate could be quite high. For example, doing 'x = malloc(4); *x = 0;' in the child process seems fine to me.

Some inline comments --

lib/StaticAnalyzer/Checkers/VforkChecker.cpp
10 ↗(On Diff #38218)

This comment needs an update.

38 ↗(On Diff #38218)

Please clarify this comment -- it might be better to state a list of specific buggy conditions.

E.g, are writes into heap-allocated allowed? Where are return statements allowed?

44 ↗(On Diff #38218)

This comment isn't needed -- a description to this effect can go in the top-of-file comment.

65 ↗(On Diff #38218)

This comment doesn't add much, please remove it.

75 ↗(On Diff #38218)

Why not nullptr? If there's a good reason, please explain it with a comment.

147 ↗(On Diff #38218)

This is hard to read, please simplify this. E.g;

if (PD = dyn-cast(P)) { /* handle cases where P is-a DeclStmt */ }
else if (Assign = dyn-cast(P)) { /* handle cases where P is-a binop */ }

Using loop constructs here shouldn't be necessary.

167 ↗(On Diff #38218)

Typically nullptr is used here.

186 ↗(On Diff #38218)

Actually, if this is the case, wouldn't it be worthwhile to flag calls to vfork() in child processes?

dcoughlin edited edge metadata.Oct 23 2015, 1:54 PM

Hi Yury,

Thanks for the patch. This is definitely interesting for upstream!

One thing to note (which I assume you are already aware of) is that we already have the "security.insecureAPI.vfork" checker, an AST check that warns on *every* use of vfork. This check is on by default (which I think makes sense, given the variety of ways that vfork is problematic) -- so users of your more permissive check would have to disable the default check.

lib/StaticAnalyzer/Checkers/Checkers.td
360 ↗(On Diff #38218)

The convention here is to not have a '.' after the help text.

lib/StaticAnalyzer/Checkers/VforkChecker.cpp
61 ↗(On Diff #38218)

LLVM convention is to use capitalization and punctuation for comments. (See http://llvm.org/docs/CodingStandards.html#commenting).

74 ↗(On Diff #38218)

Is it possible to use a name that implies a more semantic notion? For example, "VforkResultRegion"?

102 ↗(On Diff #38218)

I think it is better to use SmallSet for VforkWhitelist rather than duplicate that functionality here.

115 ↗(On Diff #38218)

What do you think about making the warnings more descriptive here? For example, you could one warning saying that "Child can only modify variable used to store result of vfork()", one that says "Child can only call _exit() or exec() family of functions", "Child must not return from function calling vfork()", etc.

These descriptions would help the user not have to guess what they are doing wrong.

118 ↗(On Diff #38218)

You should use C.generateErrorNode() here, which expresses the intent to create a node for the purposes of error reporting.

135 ↗(On Diff #38218)

For dyn_cast and friends and you use auto to avoid repeating the type twice. For example:

const auto *PD = dyn_cast_or_null<DeclStmt>(P);
186 ↗(On Diff #38218)

This will get caught by checkPreCall, right? Because vfork is not in the white list.

test/Analysis/vfork-1.c
1 ↗(On Diff #38218)

Tests generally need to have all of core turned on (e.g., -analyzer-checker=core,alpha.unix.Vfork) because the analyzer implements key functionality in the core checkers. (It is not safe to run a path-sensitive checker without core turned on).

68 ↗(On Diff #38218)

I think it would be good to add // no-warning on the lines that shouldn't warn. This will make it easier for people tracking down test failures to know that there really shouldn't be a warning on that line.

test/Analysis/vfork-2.c
1 ↗(On Diff #38218)

Is it possible to combine this with the other test file? If not, can you rename the files to be more descriptive than "-1" or "-2". This will help people adding future tests decide which file they should go in.

ygribov added a comment.EditedOct 23 2015, 1:56 PM

I didn't know vfork() is in regular use.

Neither did I, until I had to debug the damned code. Actually Android has some 5 or 6 uses of it.

I'm also not convinced that flagging all stores in the vfork child process is the right thing to do,
since the FP rate could be quite high.

I also have these concerns. Manpage is very strict about behavior of vforked processes (for a good reason though). Currently most uses of vfork are incorrect according to this checker.

I think there is some potential to relax the constraints:

  • at the very least we can allow calls to functions marked as inline as those won't result in any memory writes
  • we can even allow to call any functions which are guaranteed to not change global state

For example, doing 'x = malloc(4); *x = 0;' in the child process seems fine to me.

I also thought about similar examples. Here are some not so obvious side-effects:

  • you'll get a memory leak (as malloc will be shared by parent and child processes)
  • in the following case behavior of parent is undefined because original value of variable will be corrupted:
x = some_very_important_data;
if (vfork()==0) {
  x = ...; // some_very_important_data is lost
  _exit(1);
}
// parent code
if (some_very_important_data) {
  // UB
}
  • even if child does not explicitly write any parent's variables, compiler may decide to merge stack slots so write into a local variable in child process may corrupt a seemingly unrelated variable in parent
ygribov added inline comments.Oct 23 2015, 1:56 PM
lib/StaticAnalyzer/Checkers/VforkChecker.cpp
10 ↗(On Diff #38218)

Yikes, thanks for pointing this out.

38 ↗(On Diff #38218)

This primitive scheme simply mirrors the manpage which is very strict about this - write to _any_ data besides return value of vfork is disallowed, ditto for function calls (and so "return" only applies to current function).

44 ↗(On Diff #38218)

Right.

65 ↗(On Diff #38218)

Ok.

75 ↗(On Diff #38218)

Ok, this indeed deserves a comment. FTR: nullptr means parent process, VFORK_LHS_NONE means vforked process without explicit LHS and rest means LHS variable.

147 ↗(On Diff #38218)

In this case each statement depends on the previous one so we'll have to resort to deeply nested conditionals. I thought that do-while-false is a common pattern for simplifying such constructs but I may be wrong.

167 ↗(On Diff #38218)

See my comment above, these two encode two different things.

186 ↗(On Diff #38218)

They will be flagged in checkPreCall. But I should probably add a comment as you're the second reviewer who complained about this.

For example, doing 'x = malloc(4); *x = 0;' in the child process seems fine to me.

I don't think this is necessarily safe. For example, malloc() could end up both modifying memory shared between the child and parent process but only modifying process state for the child, leaving the parent in an inconsistent state.

POSIX (before removing vfork() completely) was pretty brutal about the restrictions in the child:

...the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit() or one of the exec family of functions.

One thing to note (which I assume you are already aware of) is that we already have the "security.insecureAPI.vfork" checker,
an AST check that warns on *every* use of vfork.

Yes, I was aware of this one. Unfortunately as I mentioned above vfork is probably to stay with us for quite a while.

This check is on by default (which I think makes sense, given the variety of ways that vfork is problematic) --
so users of your more permissive check would have to disable the default check.

This is a valid concern because it greatly complicates enablement of VforkChecker for a casual user. Do we have some mechanism so that one checker (here VforkChecker) could disable/modify behavior of it's less precise companion (insecureAPI)? This sounds like a generally useful feature (although it damages the modularity of checkers).

lib/StaticAnalyzer/Checkers/Checkers.td
360 ↗(On Diff #38218)

Got it.

lib/StaticAnalyzer/Checkers/VforkChecker.cpp
61 ↗(On Diff #38218)

Thanks, I should probably re-read the std once again.

74 ↗(On Diff #38218)

That would be the fourth iteration of the name) But yes, "Result region" sounds more descriptive.

102 ↗(On Diff #38218)

Ok.

115 ↗(On Diff #38218)

Yes, given the public ignorance about vfork's idiosyncrasies that makes good sense.

118 ↗(On Diff #38218)

Will do.

135 ↗(On Diff #38218)

True. This was forward-ported from our ancient version so lacks C++11 beauties.

test/Analysis/vfork-1.c
1 ↗(On Diff #38218)

Ok, good to know.

68 ↗(On Diff #38218)

Sounds like a good maintenance practice, I'll definitely add it.

test/Analysis/vfork-2.c
1 ↗(On Diff #38218)

Vfork-2.c is actually dedicated to model the bug in libxtables. I can surely merge both tests.

This is a valid concern because it greatly complicates enablement of VforkChecker for a casual user.

I think at the very least I can check that InsecureAPI is enable and issue a warning to user.

I put some suggestions in inline comments.

lib/StaticAnalyzer/Checkers/VforkChecker.cpp
47 ↗(On Diff #38218)

I think it's a good idea to make some functions static and/or move them out of class definition.

97 ↗(On Diff #38218)

What about combination of std::transform + std::sort + std::binary_search?

136 ↗(On Diff #38218)
if (PD && PD->isSingleDecl()) {
  if (const VarDecl *D = dyn_cast<VarDecl>(PD->getSingleDecl())
   return D;
149 ↗(On Diff #38218)

You can use early return to escape do{}.

196 ↗(On Diff #38218)

Is check for non-concrete value is really required?

252 ↗(On Diff #38218)
if (is...)
  report...
ygribov added inline comments.Oct 24 2015, 7:34 AM
lib/StaticAnalyzer/Checkers/VforkChecker.cpp
47 ↗(On Diff #38218)

Right. Which one is preferred btw?

97 ↗(On Diff #38218)

I think I should use SmallSet here as suggested by Devin.

149 ↗(On Diff #38218)

In this particular case - yes. But what's wrong about original do-while-false? I thought it's a common idiom...

196 ↗(On Diff #38218)

This may be remains of old code, I'll re-check.

ygribov added inline comments.Oct 24 2015, 8:20 AM
lib/StaticAnalyzer/Checkers/VforkChecker.cpp
149 ↗(On Diff #38218)

FYI there are several uses of do-while-false in CSA codebase.

a.sidorin added inline comments.Oct 24 2015, 8:57 AM
lib/StaticAnalyzer/Checkers/VforkChecker.cpp
47 ↗(On Diff #38218)

isChildProcess(), getAssignedVariable() are good candidates since they don't use any class member.

ygribov updated this revision to Diff 38423.Oct 26 2015, 9:05 AM
ygribov edited edge metadata.
ygribov removed rL LLVM as the repository for this revision.

Updated based on review.

ygribov marked 43 inline comments as done.Oct 26 2015, 9:06 AM

This is a valid concern because it greatly complicates enablement of VforkChecker for a casual user.

I think at the very least I can check that InsecureAPI is enable and issue a warning to user.

Actually I think that's not a huge problem. InsecureAPI checker would issue a warning but not emit a sink node. So it wouldn't influence VforkChecker at all.

LGTM. Can you update the summary to a commit message? Then I will commit. Thanks for upstreaming!

zaks.anna edited edge metadata.Oct 28 2015, 11:56 PM

Thanks for the patch!

Some minor comments inline.

What happens when this checker and the security.insecureAPI.vfork are enabled at the same time?

Did you run this checker on a large body of code? Did it find any issues? What is needed to turn this into a non-alpha checker?

lib/StaticAnalyzer/Checkers/VforkChecker.cpp
10 ↗(On Diff #38423)

Let's move the more detailed comment that describes what the checker does here.

44 ↗(On Diff #38423)

This should be added as a utility function. Does this exist elsewhere?

63 ↗(On Diff #38423)

Can this be rewritten so that it is more clear why this terminates?
Also, I'd prefer to use "while(true)".

155 ↗(On Diff #38423)

"a vforked process"?

160 ↗(On Diff #38423)

"after successful vfork" -> "after a successful vfork"?
Devin, are we missing an article here and in the other error messages?

163 ↗(On Diff #38423)

Ideally, we would point out where the vfork has occurred along the path with a BugReportVisitor. (But it's not a blocker.)

208 ↗(On Diff #38423)

We are not listing the full whitelist here. Should we drop the "(except ..)" from the message?

226 ↗(On Diff #38423)

"except return value" -> "except the return value"? Or maybe we should drop the "(except ...)" here as well to make the message shorter.

test/Analysis/vfork.c
24 ↗(On Diff #38423)

We should check for the exact expected warning in regression tests.

What happens when this checker and the security.insecureAPI.vfork are enabled at the same time?

Both checkers will emit warnings independently (which I think is ok).

Did you run this checker on a large body of code? Did it find any issues?

Yes, I've ran it on Android 5 and found several violations (function calls, assignments). Frankly I think that most existing uses of vfork do not obey the specification.

What is needed to turn this into a non-alpha checker?

I guess that's question for maintainers) We have to consider that vfork is used rarely enough.

lib/StaticAnalyzer/Checkers/VforkChecker.cpp
44 ↗(On Diff #38423)

Frankly this function only handles two common patterns right now (assignment and initialization). This was enough for all uses of vfork which I've seen but I'm afraid that general case may require much more work.

I'll try to find any dups.

63 ↗(On Diff #38423)

Actually this isn't really a loop - it's a common do-while-false idiom which allows to reduce amount of nesting. You can check for similar pattern in ArrayBoundCheckerV2.cpp and also other parts of codebase.

155 ↗(On Diff #38423)

Right.

163 ↗(On Diff #38423)

Yeah, that would be nice. But vfork code blocks are typically very small (5-10 LOC) so it'll all be clear anyway.

208 ↗(On Diff #38423)

What about

except exec*() or _exit()

?

226 ↗(On Diff #38423)

I'd rather keep it.

test/Analysis/vfork.c
24 ↗(On Diff #38423)

Ok.

ygribov updated this revision to Diff 38740.Oct 29 2015, 8:20 AM
ygribov edited edge metadata.

Updated after Anna's review.

ygribov marked 15 inline comments as done.Oct 29 2015, 8:23 AM
ygribov added inline comments.
lib/StaticAnalyzer/Checkers/VforkChecker.cpp
45 ↗(On Diff #38740)

It seems that other checkers do more or less the same throw-away predicates (e.g. see isAssignmentOp in DereferenceChecker.cpp).

164 ↗(On Diff #38740)

I've added a TODO.

zaks.anna added inline comments.Oct 31 2015, 1:33 PM
lib/StaticAnalyzer/Checkers/VforkChecker.cpp
45 ↗(On Diff #38740)

Frankly this function only handles two common patterns right now

It seems that other checkers do more or less the same throw-away
predicates

Both of these just highlight that it's best to make this a utility function:

  • New checkers will not have to roll their own predicates.
  • The reusable code will get enhanced in the future and all checkers will benefit from it.
64 ↗(On Diff #38740)

Ah, right!

209 ↗(On Diff #38740)

Using regex in a diagnostic is not user friendly. See the comment below.

227 ↗(On Diff #38740)

It is very important to keep the diagnostics as succinct as possible.

The diagnostics in this checker are long as is and I do not think the "except" clauses are helping the user to understand or fix the problem. They will not get the warning in the cases that are described in the "except" clause so why are we talking about this..

Most likely you are using it to make sure that the message is true. How about rephrasing the message to address that issue. For example, you could use something like:
"Assigning variables (except return value of vfork) is prohibited after a successful fork"
->
"This assignment is prohibited after a successful vfork"

ygribov updated this revision to Diff 38926.Nov 2 2015, 8:16 AM
ygribov marked 2 inline comments as done.

Updated after review.

ygribov marked 5 inline comments as done.Nov 2 2015, 8:17 AM
ygribov added inline comments.Nov 2 2015, 8:18 AM
lib/StaticAnalyzer/Checkers/DereferenceChecker.cpp
118 ↗(On Diff #38926)

Semantics is slightly changed for assignment case here: originally S would have been changed iff Init is not NULL but now I also require valid VarDecl on LHS. Frankly I'm not sure if it's a serious change (probably not).

Thanks for addressing all the comments. A couple of additional ones below.

test/Analysis/vfork.c
60 ↗(On Diff #38926)

Shouldn't the warning appear here as well?

You need to register for ReturnStmt callback instead of the checkEndFunction.

64 ↗(On Diff #38926)

I think "from this function" is redundant in this particular error message.
"Return from this function is prohibited after a successful vfork" -> "Return is prohibited after a successful vfork; call _exit() instead."

ygribov updated this revision to Diff 39040.Nov 3 2015, 3:06 AM

Updated warning messages.

ygribov marked 2 inline comments as done.Nov 3 2015, 3:06 AM

What is needed to turn this into a non-alpha checker?

I guess that's question for maintainers) We have to consider that vfork is used rarely enough.

Please, make this into a non-alpha, turned on by default.

Thank you!
Anna.

ygribov updated this revision to Diff 39345.Nov 5 2015, 4:43 AM

Moved to unix package (and thus enabled by default). I now also test for collaboration with security.InsecureAPI.vfork.

I now also test for collaboration with security.InsecureAPI.vfork.

Should probably clarify: I've added checks to testcase to verify that new checker properly interacts with (existing) InsecureAPI.vfork checker,

Ok. Thanks!
I'll commit shortly.

zaks.anna accepted this revision.Nov 5 2015, 10:19 AM
zaks.anna edited edge metadata.
This revision is now accepted and ready to land.Nov 5 2015, 10:19 AM
This revision was automatically updated to reflect the committed changes.