This is an archive of the discontinued LLVM Phabricator instance.

[OPENMP]Initial fix PR42392: Improve -Wuninitialized warnings for OpenMP programs.
ClosedPublic

Authored by ABataev on Jul 8 2019, 11:30 AM.

Details

Summary

Some OpenMP clauses rely on the values of the variables. If the variable
is not initialized and used in OpenMP clauses that depend on the
variables values, it should be reported that the uninitialized variable
is used in the OpenMP clause expression.
This patch adds initial processing for uninitialized variables in OpenMP
constructs. Currently, it checks for use of the uninitialized variables
in the structured blocks.

Diff Detail

Repository
rL LLVM

Event Timeline

ABataev created this revision.Jul 8 2019, 11:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2019, 11:30 AM
NoQ added a comment.Jul 8 2019, 3:14 PM

I don't know much about OpenMP, but i guess i can help with the CFG a bit.

I strongly recommend adding a few direct tests for the CFG so that to check that the order of statements and the overall topology of the CFG is as you intended. Currently those tests are routed through the Static Analyzer because nobody else seems to care; see test/Analysis/cfg.cpp as an example (there are a few more such tests, grep for DumpCFG).

I tried to apply the patch and accidentally noticed that none of the newly added tests actually pass for me. Is this the right patch? Or is it just me?

lib/Analysis/CFG.cpp
4746–4750 ↗(On Diff #208467)

Not sure, are these expressions actually evaluated in run-time when the directive is hit? When i looked at a few AST dumps it seemed to me as if these are just a few DeclRefExprs that refer back to the captured variables; they don't really do anything.

ABataev marked an inline comment as done.Jul 8 2019, 3:43 PM
In D64356#1574540, @NoQ wrote:

I don't know much about OpenMP, but i guess i can help with the CFG a bit.

Thanks a lot!

I strongly recommend adding a few direct tests for the CFG so that to check that the order of statements and the overall topology of the CFG is as you intended. Currently those tests are routed through the Static Analyzer because nobody else seems to care; see test/Analysis/cfg.cpp as an example (there are a few more such tests, grep for DumpCFG).

Thanks for the advice, will do.

I tried to apply the patch and accidentally noticed that none of the newly added tests actually pass for me. Is this the right patch? Or is it just me?

That's strange. Maybe the patch applied incorrectly?

lib/Analysis/CFG.cpp
4746–4750 ↗(On Diff #208467)

Some of them are real expressions, evaluated at the runtime.

Some of them are the captured variables. But for some of those vars we need to create private copies, initialized with the values of the original variables. And we need to diagnose that the original captured variable is used without being initialized.
I'm going to extend this function to check all required expressions later. This is just an initial patch to setup the basic required functionality.

NoQ added a comment.Jul 8 2019, 3:56 PM

That's strange. Maybe the patch applied incorrectly?

Whoops, i needed to pull rC365334. The patch has applied cleanly "with fuzz 2", which meant "oh there's no -Wuninitialized in this run line but i guess that's fine".

NoQ added inline comments.Jul 8 2019, 6:41 PM
lib/Analysis/CFG.cpp
4746–4750 ↗(On Diff #208467)

Aha, ok, got it. So these are kinda like function arguments that are evaluated before the call, outside the call.

My concern is that the CFG for these expressions need to appear only once: either as part of the current CFG or as part of the CFG of the "outlined" statement, but not in both. Otherwise any sort of inter-"procedure"-al analysis over such CFG would mistakenly think that these statements are evaluated twice.

4756–4757 ↗(On Diff #208467)

What you're saying here is "the statement is going to be executed at the current moment of time, exactly once". Just curious, how accurate is this?

Generally it's perfectly fine drop some effects of the statement (eg., atomicity). I'm not sure about things like #pragma omp parallel that may evaluate the statement more than once, but it's not entirely inaccurate to assume that the statement is executed once (on some systems it may as well be true).

test/OpenMP/atomic_messages.c
5–9 ↗(On Diff #208467)

Currently the CFG is as follows:

[B1]
   1: int x;
   2: argc
   3: x
   4: argc = x; // CapturedStmt
   5: #pragma omp atomic read 'argc = x;'

I.e.,

  1. Declare x;
  2. Compute lvalue argc;
  3. Compute lvalue x;
  4. Draw the rest of the owl assignment operator. Ignore the whole complexity with lvalue-to-rvalue casts that are expected to be there.
  5. Suddenly realize that this whole thing was an OpenMP atomic read from the start.

Step 4 actually looks pretty good to me, given that we treat CapturedStmt as some sort of function call. I.e., we simply delegate the work of evaluating the actual statement into an "outlined" function which has its own, separate CFG that doesn't need to be squeezed into the current CFG. And this other CFG is where we're going to have our implicit cast and stuff.

I don't understand the point of having a separate step 5. I just don't see what extra work can be done here that wasn't already done on step 4. I'd probably mildly prefer to have only one of those: either step 4 or (better because it has more information) step 5. I don't mind having both though.

So i guess generally this CFG looks good to me. In particular, i can totally work with it if i end up implementing OpenMP support in the Static Analyzer. At the same time i'd probably be fine with a CFG that contains only steps 1 and 5. I need to see more examples :)

NoQ added a subscriber: rsmith.Jul 8 2019, 7:03 PM

Ok, so i think i more or less understand where this is going and i like it! My only concern about making sure that used-expressions don't appear in both CFGs; and, even then, it's likely that i'm wrong.

+@rsmith just in case he has any immediate thoughts on this.

include/clang/AST/StmtOpenMP.h
292 ↗(On Diff #208467)

This whole X.for_each(λ) idiom doesn't seem to be popular in LLVM; people seem to prefer to write an iterator and then use the generic for_each(X, λ) over it.

(i don't really care)

lib/Analysis/CFG.cpp
2063 ↗(On Diff #208467)

The first check looks redundant. I don't think it wins much performance either.

ABataev marked 3 inline comments as done.Jul 9 2019, 7:40 AM
ABataev added inline comments.
lib/Analysis/CFG.cpp
4746–4750 ↗(On Diff #208467)

Aha, ok, got it. So these are kinda like function arguments that are evaluated before the call, outside the call.

Well, a kind of.

My concern is that the CFG for these expressions need to appear only once: either as part of the current CFG or as part of the CFG of the "outlined" statement, but not in both. Otherwise any sort of inter-"procedure"-al analysis over such CFG would mistakenly think that these statements are evaluated twice.

Agree, but currently they do not appear at all. Those expressions in most cases are not immdiate children of the OpenMP directive, they are children of OpenMP clauses, associated with the directives. And the function iterates through all the clauses to build the CFG for such expressions.

4756–4757 ↗(On Diff #208467)

Do you suggest to exclude the bodies of the directives from the analysis?

test/OpenMP/atomic_messages.c
5–9 ↗(On Diff #208467)

I'll investigate this and will add some CFG specific tests.

ABataev marked an inline comment as done.Jul 9 2019, 7:44 AM
In D64356#1574981, @NoQ wrote:

Ok, so i think i more or less understand where this is going and i like it! My only concern about making sure that used-expressions don't appear in both CFGs; and, even then, it's likely that i'm wrong.

+@rsmith just in case he has any immediate thoughts on this.

What "both" CFGs do you mean?

include/clang/AST/StmtOpenMP.h
292 ↗(On Diff #208467)

I can try to add the iterator instead.

NoQ added a comment.Jul 9 2019, 12:28 PM

Something like this:

Like, you can construct a CFG for an arbitrary statement. CFG₁ is the CFG for xxx() and CFG₂ is the CFG for the CapturedStmt (and its children). I'm trying to say that even if used expressions are duplicated in the AST, they should not be duplicated in our CFGs.

But that's more, like, for the future patches. I'll be happy to accept this patch once you add CFG tests :)

In D64356#1576813, @NoQ wrote:

Something like this:

Like, you can construct a CFG for an arbitrary statement. CFG₁ is the CFG for xxx() and CFG₂ is the CFG for the CapturedStmt (and its children). I'm trying to say that even if used expressions are duplicated in the AST, they should not be duplicated in our CFGs.

But that's more, like, for the future patches. I'll be happy to accept this patch once you add CFG tests :)

I don't expect such duplication here. Those expressions associated with the directive have no relation with the CapturedStmt. They are associated only with the directive itself through the OpenMP clauses.

NoQ added a comment.Jul 9 2019, 1:44 PM

Yay, great!

ABataev updated this revision to Diff 209042.Jul 10 2019, 12:16 PM

Improved code.

NoQ accepted this revision.Jul 10 2019, 12:43 PM

Ugh, i forced a lot of boilerplate on you. Hope it was worth it >.<

Thank you!~

test/Analysis/cfg-openmp.cpp
3 ↗(On Diff #209042)

Those are quite readable because of very verbose pretty-prints but generally i prefer to interleave code and CHECK: lines so that to easily see both on one screen and know what corresponds to what.

This revision is now accepted and ready to land.Jul 10 2019, 12:43 PM
ABataev marked an inline comment as done.Jul 10 2019, 1:48 PM
In D64356#1579154, @NoQ wrote:

Ugh, i forced a lot of boilerplate on you. Hope it was worth it >.<

Thank you!~

No problems, thanks for the review!

test/Analysis/cfg-openmp.cpp
3 ↗(On Diff #209042)

Ok, I will do this.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2019, 7:54 AM