This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add more detection rules for redundant c_str calls.
ClosedPublic

Authored by etienneb on Mar 25 2016, 9:58 AM.

Details

Summary

The string class contains methods which support receiving either a string literal or a string object.

For example, calls to append can receive either a char* or a string.

string& append (const string& str);
string& append (const char* s);

Which make these cases equivalent, and the .c_str() useless:

std::string s = "123";
str.append(s);
str.append(s.c_str());

In these cases, removing .c_str() doesn't provide any size or speed improvement.
It's only a readability issue.

If the string contains embedded NUL characters, the string literal and the string
object won't produce the same semantic.

Diff Detail

Event Timeline

etienneb updated this revision to Diff 51647.Mar 25 2016, 9:58 AM
etienneb retitled this revision from to [clang-tidy] Add more detection rules for redundant c_str calls..
etienneb updated this object.
etienneb added a reviewer: alexfh.
etienneb added a subscriber: cfe-commits.

Thank you for enhancement!

I think will be good idea to extend check for any function call that receives .c_str(), since it mat be in other classes could have methods which receives both C++ or C strings. See also PR24870.

Thank you for enhancement!

I think will be good idea to extend check for any function call that receives .c_str(), since it mat be in other classes could have methods which receives both C++ or C strings. See also PR24870.

I was thinking the same when I started that patch. But, there is multiple cases where it will bring a false-positive.
As an example:

string& insert (size_t pos, const string& str, size_t subpos, size_t sublen = npos);
string& insert (size_t pos, const char* s, size_t n);

Even if the overloaded name/types are compatible, the semantic is not the same. You can't simply remove the .c_str() here.

I agree that a more general cases could exists, but I would like to be able to turn it off over a large code base when trying to keep false positives as low as posible.

etienneb updated this revision to Diff 51657.Mar 25 2016, 10:36 AM

Fixing some matchers to use the recent new AstMatcher.
(see http://reviews.llvm.org/D18275)

chapuni@: I would like to know why "-target x86_64-unknown -std=c++11" was needed.

I'm fixing an issue with char16_t/char32_t. Is that the problem you just fixed?

etienneb updated this revision to Diff 51664.Mar 25 2016, 11:17 AM
etienneb updated this object.

fix unittest on linux/32-bits.
tested on windows/32-bits.

aaron.ballman added inline comments.
clang-tidy/readability/RedundantStringCStrCheck.cpp
129

I think this comment is incorrect, you want assignment, not equality.

etienneb updated this revision to Diff 51927.Mar 29 2016, 8:47 AM

fix invalid comment.

etienneb marked an inline comment as done.Mar 29 2016, 8:47 AM

thanks, aaron.ballman@.

clang-tidy/readability/RedundantStringCStrCheck.cpp
129

copy-paste! Good catch.

alexfh accepted this revision.Mar 29 2016, 9:55 AM
alexfh edited edge metadata.

Looks good to me. Adding Samuel, since he has done a lot of similar stuff by now and might have good ideas on improving this and making this more general.

This revision is now accepted and ready to land.Mar 29 2016, 9:55 AM
sbenza edited edge metadata.Mar 29 2016, 10:38 AM

Looks good to me. Adding Samuel, since he has done a lot of similar stuff by now and might have good ideas on improving this and making this more general.

There are two things I've done before that might apply here.
First, you can move the function+arg into data and just loop through it to make the matchers.
Eg:

struct Case {
  StringRef Func;
  unsigned Arg;
};
const Case Cases[] = {...};
for (Case C : Cases) {
  Finder->addMatcher(...);
}

This reduces the amount of code duplication.

Another thing I've done is to find matching overloads dynamically instead of hardcoding the list of functions.
The idea is that you match something like callExpr(hasAnyArgument(StringCStrCallExpr)) and then you figure out if that function has a matching overload that takes const string& on that specific argument.
This is flexible, but might give false positives.

etienneb marked an inline comment as done.Mar 29 2016, 11:56 AM

I'll to try to make a better way to describe redundant cases (as proposed by sbenza@) with an array.
I like the array based approach, it's lean and clean.

I'll try to make a prototype of the generic overloaded functions matcher and I'll look at the ratio of false-positive.
It may be gated under a configuration to avoid reporting these false-positives.

Afterward, we can take decision of what/how should this be integrated.

Can you summarize the improvements in docs/ReleaseNotes.rst please?

etienneb closed this revision.Apr 15 2016, 11:17 AM