The invalidation of pointer pointers returned by subsequent calls to genenv is
suggested by the POSIX standard, but is too strict from a practical point of
view. A new checker option 'InvalidatingGetEnv' is introduced, and is set to a
more lax default value, which does not consider consecutive getenv calls
invalidating.
The handling of the main function's possible specification where an environment
pointer is also pecified as a third parameter is also considered now.
Details
- Reviewers
Szelethus NoQ donat.nagy balazske steakhal
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp | ||
---|---|---|
99–102 | The state modelling is refined to model the env region coming from the main function and the getenv calls. | |
clang/test/Analysis/cert/env34-c.c | ||
6 | This test file is incomplete. |
The commit looks good in general, I have a few minor suggestions and a serious question about the state transition bureaucracy.
clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp | ||
---|---|---|
74–75 | Nitpick: "practical false positives" sounds strange for me, consider writing | |
131–136 | Perhaps add a comment that clarifies that passing a nullptr as the ExplodedNode to addTransition is equivalent to specifying the current node. I remember this because I was studying its implementation recently, but I would've been surprised and suspicious otherwise. | |
223 | I fear that this state transition will go "sideways" and the later state transitions (which add the note tags) will branch off instead of building onto this. IIUC calling CheckerContext::addTransition registers the transition without updating the "current ExplodedNode" field of CheckerContext, so you need to explicitly store and pass around the ExplodedNode returned by it if you want to build on it. This is an ugly and counter-intuitive API, and I also ran into a very similar issue a few weeks ago (@Szelethus helped me). | |
clang/test/Analysis/cert/env34-c.c | ||
6 | Personally I'd prefer putting those cases into a separate files, because this test file is already 340 lines long and it'd be difficult to understand if it was filled with conditional checks. |
I'm sorry starting the review of this one only now, but I'm quite booked.
Is it still relevant? If so, I'll continue.
clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp | ||
---|---|---|
120–121 | FunctionName and Message will dangle inside the NoteTag. | |
131–136 | If nullptr is equivalent with C.getPredecessor() inside addTransition(), why not simply initialize it to that value instead of to nullptr? | |
219 | We should hoist this into a field, to only construct it once. | |
221–222 | What ensures that Call.getReturnValue().getAsRegion() is not null? | |
223 | I think the usage here is correct. |
Add tests for checker option
Remove unnecessary const_cast
Only model a getenv call if there is a value to model
Use getPredecessor to better indicate what happens during EG building
Hoist GetEnvCall variable
Fix dangling strings in note generation
clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp | ||
---|---|---|
44 | Reworded the message here | |
48 | Hoisted here | |
120–121 | Good catch, thanks! Fixed this with a lambda capture initializer. | |
131–136 | I ended up using C.getPredecessor() instead of explaining; this seems a bit more intuitive (if such a thing even exists in CSA). | |
223 | (the line number of this comment desync-ed) | |
clang/test/Analysis/invalid-ptr-checker.c | ||
51 | This gives 2 warnings. One for subexpression envp, and one for the whole statement *envp. This is the current behaviour ( check clang/test/Analysis/cert/env31-c.c ), and this patch does not change it. |
I would like to go through with this option, and then I would like to fix the following issues with this checker as well:
- the previous function call notes could be more streamlined
- the notes of this checker are also shown when another checker hits those nodes with its report
- for example taint checker giving a warning to getenv usage would also trigger the display of the 'previous function call was here' note here), this I would like to filter with bug category filters
- code examples for this filtering are below
- try to consolidate the multiple warnings coming from this checker's checkLocation callback
category based filtering ( example from lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:167 ):
If (!BR.isInteresting(CallLocation) || BR.getBugType().getCategory() != categories::TaintedData) { //but this would be InvalidPtr BugType's category, namely memory_error return ""; }
or checker based filtering ( example from lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp:397 )
if (&BR.getBugType() != smartptr::getNullDereferenceBugType() || // this is a comparison of the address of a static bugtype !BR.isInteresting(ThisRegion))
This second one gives a more precise filtering, but the implementation-specific detail of storing the bugtype by reference is what seems to make this work, which I find hacky.
If the checker issues a NoteTag, it makes sense in certain situations to make sure that it acts on only reports issued by that checker. The standard way of achieving that is by comparing the tags, as you do in the second example.
There is nothing wrong with this, if that particular checker issued that NoteTag in the first place. It's marginally debatable, if it was not issued by the given checker but rather something else. That would suggest to me some logic flaw, or coupling issue. For cross-checker cases, I think the bug category would be the better option, but I would still need to think about that, so not set in stone :D
FYI I haven't looked at the patch yet, but I wanted to answer your question.
The change looks promising, I only have minor remarks.
clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp | ||
---|---|---|
121–122 | Minor nitpick: in situations like this, when we want to save an already composed string, std::string is better than SmallString because it doesn't occupy more memory than the actual length of the string. (OTOH SmallString is better for buffer variables, I've seen code that creates a SmallString, composes the message in it, then converts it to a std::string for longer-term storage.) Of course these are all just inconsequential micro-optimization details... | |
131–136 |
😆 | |
clang/test/Analysis/invalid-ptr-checker.c | ||
10 | Use -verify=expected,pedantic here and then you can eliminate the pedantic-warning lines that duplicate the messages of expected-warnings. |
Thanks for the ping. I have some concerns inline.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
1000–1008 | I think we should mention this flag in the docs, and an example. | |
clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp | ||
115 | I'd prefer an explicit out parameter instead of capturing &State here. | |
120–121 | On second thought, I'm wrong. It won't dangle, because the StringRef(FunctionName) is owned by the identifier of the function, and thus lives as long as the ASTContext. | |
124 | To me, it feels like all the messages we emit from this NoteTag, are specific to this particular checker. I also have the feeling that it should mark uninteresting the region once it puts a message there - which should stop other notes placed for the same reason for other - basically unrelated env invalidations. | |
138–142 | I'd prefer if we wouldn't put N separate NoteTags, but rather iterate over this set of regions inside the NoteTag. | |
219–225 |
Moving this to GitHub as Phabricator is shutting down. Relevant PR here: https://github.com/llvm/llvm-project/pull/67663
I think we should mention this flag in the docs, and an example.