This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Add support for more basic_string API in DanglingInternalBufferChecker
ClosedPublic

Authored by rnkovacs on Jul 15 2018, 9:20 PM.

Details

Summary

A pointer referring to the elements of a basic_string may be invalidated by calling a non-const member function, except operator[], at, front, back, begin, rbegin, end, and rend. The checker now warns if the pointer is used after such operations.

Diff Detail

Event Timeline

rnkovacs created this revision.Jul 15 2018, 9:20 PM
NoQ accepted this revision.Jul 15 2018, 10:15 PM

Cool! I don't have a strong preference with respect to whitelist vs. blacklist; your approach is safer but listing functions that don't immediately invalidate the buffer would allow us to avoid hard-to-detect false negatives while pretending that our users would notice and report easy-to-fix false positives for us. Also we rarely commit to adding a test for every single supported API function; bonus points for that, but usually 2-3 functions from a series of similar functions is enough :)

lib/StaticAnalyzer/Checkers/DanglingInternalBufferChecker.cpp
112–124

That quote from the Standard would look great here.

This revision is now accepted and ready to land.Jul 15 2018, 10:15 PM
rnkovacs updated this revision to Diff 155770.Jul 16 2018, 3:29 PM
rnkovacs marked an inline comment as done.
rnkovacs edited the summary of this revision. (Show Details)

Added standard quote, marking the section about non-member functions that may also invalidate the buffer as a TODO.
Also changed the note message to that suggested by @NoQ (thanks!). All tests pass now.

In D49360#1163113, @NoQ wrote:

Also we rarely commit to adding a test for every single supported API function; bonus points for that, but usually 2-3 functions from a series of similar functions is enough :)

Um, okay, noted for next time :)

NoQ added a comment.Jul 16 2018, 5:16 PM

Hmm, the destructor-specific message was pretty good, can we keep it? It should be possible to print a different message depending on the program point within N.

It is really nice to see this checker take shape! Some drive by diagnostic comments in line.

test/Analysis/dangling-internal-buffer.cpp
175

In other parts of clang we use the term "inner pointer" to mean a pointer that will be invalidated if its containing object is destroyed https://clang.llvm.org/docs/AutomaticReferenceCounting.html#interior-pointers. There are existing attributes that use this term to specify that a method returns an inner pointer.

I think it would be good to use the same terminology here. So the diagnostic could be something like "Dangling inner pointer obtained here".

176

What do you think about explicitly mentioning the name of the method here when we have it? This will make it more clear when there are multiple methods on the same line.

I also think that instead of saying "is allowed to" (which raises the question "by whom?") you could make it more direct.

How about:
"Inner pointer invalidated by call to 'clear'"

or, for the destructor "Inner pointer invalidated by call to destructor"?

What do you think?

If you're worried about this wording being to strong, you could weaken it with a "may be" to:

"Inner pointer may be invalidated by call to 'clear'"

rnkovacs updated this revision to Diff 155944.Jul 17 2018, 12:10 PM
rnkovacs marked 2 inline comments as done.
rnkovacs added a reviewer: dcoughlin.

Note messages updated.

test/Analysis/dangling-internal-buffer.cpp
175

I feel like I should also retitle the checker to DanglingInnerBuffer to remain consistent. What do you think?

176

I like these, thanks! I went with the stronger versions now, as they seem to fit better with the warnings themselves.

NoQ added a comment.Jul 17 2018, 2:43 PM

I showed the bug mentioned in D49058 to a friend who didn't do much C++ recently, for a fresh look, and he provided a bunch of interesting feedback by explaining the way he didn't understand what the analyzer was trying to say.

  1. When we call c_str(), the pointer is not dangling yet, not until the destructor or realloc is called. He didn't understand the report because he was trying to figure out why do we think the pointer is already dangling.
  2. A generic "use after free" warning on the return site is confusing because the user would expect to see an actual "use" instead of just passing it around. We should be more specific, i.e. "Deallocated pointer returned to the caller".
  3. We mention that there's a destructor, but the destructor is hard to see. Knowing the type of the destroyed object would help. Knowing that it's a temporary object would help.
  4. The whole idea of "string has a buffer that would be destroyed when the string is destroyed and we shouldn't pass the pointer around" needs to be explained all together, rather than separated into different diagnostic pieces. The user needs to be somehow informed that this is how std::string operates because he doesn't necessarily know that.
test/Analysis/dangling-internal-buffer.cpp
176

"Inner pointer invalidated by call to 'clear'"

I think the word "invalidated" may be confusing, how about "reallocated"? And "deallocated" in case of destructors.

MTC added a subscriber: MTC.Jul 17 2018, 7:15 PM
NoQ accepted this revision.Jul 18 2018, 1:52 PM

Let's commit this patch and make another round later, as we gather ideas for better names and messages.

NoQ added inline comments.Jul 18 2018, 1:55 PM
test/Analysis/dangling-internal-buffer.cpp
175

My intuition suggests that we should remove the word "Dangling" from the checker name, because our checker names are usually indicating what they check, not what bugs they find. Eg., MallocChecker doesn't find all mallocs, it checks that mallocs are used correctly. This checker checks that pointers to inner buffers are used correctly, so we may call it InnerPointerChecker or something like that.

This revision was automatically updated to reflect the committed changes.