This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Support std::format and std::print in readability-redundant-string-cstr
ClosedPublic

Authored by mikecrowe on Feb 5 2023, 7:19 AM.

Details

Summary

std::format (C++20) and std::print (C++23) are perfectly happy to accept
std::string arguments. Converting them to C-style strings by calling
c_str() is unnecessary and may cause extra walking of the string to
determine its length.

Depends on D144216

Diff Detail

Event Timeline

mikecrowe created this revision.Feb 5 2023, 7:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2023, 7:19 AM
mikecrowe requested review of this revision.Feb 5 2023, 7:19 AM

This check can also work trivially for fmt::format and fmt::print[1] too (in fact, that's what I'm actually using), if the project would allow them to be added too.

[1] https://fmt.dev/

Please document change in Release Notes, as well as in the check documentation, together with its limitations (can only handle 1 argument at a time).

clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
182

Nit: add a comment explaining what this matcher does, for consistency with how it's done with the other matchers above.

mikecrowe updated this revision to Diff 494937.Feb 5 2023, 12:16 PM

carlosgalvezp wrote:

Please document change in Release Notes, as well as in the check documentation, together with its limitations (can only handle 1 argument at a time).

Hopefully I've done those things.

carlosgalvezp accepted this revision.Feb 6 2023, 1:04 AM

LGTM, thanks for the fix! Might be good to wait a few days for other reviewers to have a look or give suggestions on how to improve the issue of having to do one parameter at a time.

This revision is now accepted and ready to land.Feb 6 2023, 1:04 AM
njames93 requested changes to this revision.Feb 6 2023, 1:50 AM
njames93 added inline comments.
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
186–187
188

The limitation about only transforming the first argument can be alleviated by using the forEachArgumentWithParam matcher

clang-tools-extra/docs/ReleaseNotes.rst
129–131

Please address this needless restriction, see above comment

clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr.cpp
336

Again this test case would need updating to handle applying all fixes

This revision now requires changes to proceed.Feb 6 2023, 1:50 AM

The limitation about only transforming the first argument can be alleviated by using the forEachArgumentWithParam matcher

I shall try to make that work. My initial attempts were not successful, but there's still more to try. If I can't make it work I'll ask for more help here.

Thanks.

mikecrowe added inline comments.Feb 10 2023, 3:27 AM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
188

Thanks for the hint. I did try to overcome this limitation but failed to find forEachArgumentWithParam.

I think that I've managed to make forEachArgumentWithParam work, but only if I lose the materializeTemporaryExpr. I hope that's not essential. I've ended up with:

C++
  Finder->addMatcher(                                                                                                                                                                                      
    traverse(TK_AsIs,                                                                                                                                                                                      
             callExpr(callee(functionDecl(hasAnyName("::std::print", "::std::format"))),                                                                                                                   
                      forEachArgumentWithParam(StringCStrCallExpr, parmVarDecl()))),                                                                                                                       
    this);

I suspect that there's something better than the parmVarDecl() second parameter to forEachArgumentWithParam though.

njames93 added inline comments.Feb 10 2023, 7:04 AM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
188

parmVarDecl is exactly what you need here.

My understanding of the temporary expressions isn't perfect, but from what I can gather, we shouldn't need temporaries when calling the .c_str()method for all but the first argument.

This does highlight an issue with your test code though.
std::format and std::print take a std::format_string as the first argument, but in your tests you have it as const char *
Anyway I think this means the first argument in real world code would result in a materializeTemporaryExpr
See here

mikecrowe added inline comments.Feb 12 2023, 4:45 AM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
188

I mentioned that I'd failed to use a genuine type for the first parameter in my commit message. However, I've now tried again, and I think that I've got something working that is much closer to that required by the standard.

I tried to come up with something that I could call c_str() on to pass as the first argument, but couldn't come up with anything that g++ thought was a constant expression. I tried:

consteval std::string f()                                                                                                                                                                                    {                                                                                                                                                                                                            
    return {"Hi {}"};                                                                                                                                                                                        
}                                                                                                                                                                                                            

    std::puts(std::format(f().c_str(), get().c_str());

but g++ still claims:

format.cpp:22:28: error: ‘std::string{std::__cxx11::basic_string<char>::_Alloc_hider{((char*)(&<anonymous>.std::__cxx11::basic_string<char>::<anonymous>.std::__cxx11::basic_string<char>::<unnamed union>::_M_local_buf))}, 5, std::__cxx11::basic_string<char>::<unnamed union>{char [16]{'H', 'i', ' ', '{', '}', 0, '\000', '\000', '\000', '\000', '\000', '\000', '\000', '\000', '\000', '\000'}}}’ is not a constant expression
   22 |     std::puts(std::format(f().c_str(), get().c_str());
      |                           ~^~

So I think that the risk of inadvertently removing c_str() from the first argument is low, given that such a call appears not to be legal anyway. Even if there does turn out to be a way to do so, I think that removing the c_str() call probably ought to be expected to work too.

mikecrowe updated this revision to Diff 496758.Feb 12 2023, 7:09 AM

Incorporate feedback from Nathan James:

  • Simplify callExpr pattern in matcher.
  • Use forEachArgumentWithParam so that check works on all arguments and remove mentions of previous limitation in commit message and documentation.
  • Make lit test use closer approximation to standard first argument for std::format and std::print rather than const char *.
  • Add wide-character string tests.
  • Other minor improvements to tests.
mikecrowe marked 5 inline comments as done.Feb 12 2023, 7:10 AM
mikecrowe edited the summary of this revision. (Show Details)
mikecrowe marked an inline comment as done.Feb 12 2023, 7:12 AM

Just a few more points then it should be good to land

clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
181–190

Can this be wrapped in a check to make sure it only runs in c++20

Likewise ::std::print is only c++23, so maybe:

getLangOpts().CPlusPlus2B ? hasAnyName("::std::print", "::std::format") : hasAnyName("::std::format")
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr.cpp
1

We shouldn't be removing tests from older language standards
Can all the changes made to the file be moved into a new file redundant-string-cstr-cpp20.cpp
Would likely either need to create a second file for c++23 or make use of the check-suffix to run the tests for std::print

Thanks for the review and further suggestions.

Mike.

clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
181–190

I can do that. I had (presumably incorrectly) assumed that since and use of std::print and std::format in any previous version would be UB then it was safe to apply the check regardless of the C++ version.

clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr.cpp
1

My brief reading up on check-suffix implies that it could be used to keep everything in redundant-string-cstr.cpp, but that file is getting rather large anyway. Having redundant-string-cstr-cpp20.cpp also containing C++23 tests might be a bit confusing. Having redundant-string-cstr-cpp23.cpp too would mean moving the format_args stuff to a header (which might be beneficial regardless.)

Anyway, I shall have a play and see what I can make work.

Do I need to add tests to ensure that the checks don't trigger when running against earlier standard versions?

mikecrowe marked 2 inline comments as done.Feb 16 2023, 1:57 PM

I ended up splitting out the std::format and std::print tests to their own file which meant that I didn't need to modify the existing redundant-string-cstr.cpp file in this commit. (Though of course I had to extract the <string> header in D144216 first.) Let me know if you don't like the split.

I also slightly re-ordered the tests and removed a few incorrect comments that incorrectly survived the switch to using forEachArgumentWithParam.

PiotrZSL added inline comments.
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

Please introduce configuration option to specify custom functions.
For example if some project (like mine) is wrapping fmt::format with some variadic template function, then such function could be specified.
Same goes to things like some loggers.

Check utils/OptionsUtils.h for configuration, and utils/Matchers.h (matchesAnyListedName)

mikecrowe added inline comments.Mar 11 2023, 9:45 AM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

Please introduce configuration option to specify custom functions.
For example if some project (like mine) is wrapping fmt::format with some variadic template function, then such function could be specified.

That's exactly where this change originated (as part of my clang-tidy fmt fork, which I hope to submit for review soon once I've made it configurable too and improved the test cases.)

Same goes to things like some loggers.

Check utils/OptionsUtils.h for configuration, and utils/Matchers.h (matchesAnyListedName)

Thanks for the pointer. I shall study those files for how to support this.

Do you think that the change can land like in its current state first? Or would you prefer that the configuration option is added at the same time?

"Do you think that the change can land like in its current state first? Or would you prefer that the configuration option is added at the same time?"

Code is fine, probably If would would write this, then I would bother to split into C++20 and C++2B simply because std::print wouldn't compile if it wouldn't be available. So I would just use hasAnyName.
But code looks fine, configuration could be added later...

Code is fine, probably If would would write this, then I would bother to split into C++20 and C++2B simply because std::print wouldn't compile if it wouldn't be available. So I would just use hasAnyName.
But code looks fine, configuration could be added later...

Thanks. I originally wrote it like that, but @njames93 requested that I checked the version, so I did. :) It doesn't do any harm and might avoid confusion.

mikecrowe added inline comments.Mar 11 2023, 12:29 PM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

Please introduce configuration option to specify custom functions.
For example if some project (like mine) is wrapping fmt::format with some variadic template function, then such function could be specified.

I could add some sort of readability-redundant-string-cstr.PrintFunction option or readability-redundant-string-cstr.FormatFunction option, but reducing this to its fundamental behaviour would be "a function that takes const char * arguments that is also willing to take std::string arguments". I'm struggling to find a sensible name for such an option. Maybe readability-redundant-string-cstr.StdStringAcceptingFunction=::fmt::format? Or just readability-redundant-string-cstr.FunctionCall=::fmt::format?

Same goes to things like some loggers.

Loggers may be classes, so there would need to be an option that specifies a class name (or even a base class name) and a method name (which may be an operator.) See this hard-coded example (If it's not obvious, there's a description in the README. In such cases the options could be readability-redundant-string-cstr.Class=::BaseTrace and readability-redundant-string-cstr.Method=Log or readability-redundant-string-cstr.Operator=(), but then it would be hard to tie together the right classes and methods. That could be avoided with something like readability-redundant-string-cstr.MemberFunctionCall=::BaseTrace::operator(),::NullTrace::operator() and some parsing I suppose.

Regardless, I'll try and get the simple case working and await suggestions for appropriate option names.

Thanks again for the suggestions.

PiotrZSL added inline comments.Mar 11 2023, 1:58 PM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

I asked AI, and it suggested readability-redundant-string-cstr.FormattingFunctionsList.
Also, I thing that matchesAnyListedName should support things like '::NullTrace::operator()', so one option should be sufficient.

FormattingFunctionsList:
A semicolon-separated list of (fully qualified) function/method/operator names, with the requirement that
any parameter capable of accepting a 'const char*' input should also be able to accept 'std::string' or
'std::string_view' inputs, or proper overload candidates that can do so should exist.

mikecrowe added inline comments.Mar 12 2023, 10:11 AM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

Far be it from me to question the wisdom of the AI :) , but my point was that the check works for any function that accepts const char * parameters that could equally well be std::string parameters. It's not limitted to formatting functions. Despite this, I suspect that if the majority use case is formatting functions then maybe it's more helpful to use a name connected with that for discoverability.

In my prototype (which is proving easier to implement than I expected so far), I've used readability-redundant-string-cstr.StringParameterFunctions, which I can't say I'm entirely happy with either.

PiotrZSL added inline comments.Mar 12 2023, 10:23 AM
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

StringParameterFunctions can be too....

mikecrowe added inline comments.
clang-tools-extra/clang-tidy/readability/RedundantStringCStrCheck.cpp
189

My first go at an implementation is in https://reviews.llvm.org/D145885 (but could be squashed into this change if desired, since it hasn't landed yet.)

PiotrZSL accepted this revision.Mar 12 2023, 1:50 PM

Correct commit message and this could land.

Correct commit message and this could land.

I've read through all the comments and the commit message itself and I can't spot what's wrong with the commit message, except perhaps for its verbosity. What would you like me to correct?

"The dummy implementations of std::format and std::print require C++20,
so it's easier to test them in a separate
redundant-string-cstr-format.cpp check file. This also requires a small
improvement to the basic_string_view implementation in the <string>
dummy header.

With these changes, I was able to successfully run the
readability-redundant-string-cstr check on a real source file that
compiled successfully with what is destined to become GCC 13, and saw
the expected:"

You shouldn't explain your self why you choose this or other implementation, or pasting output from console.
Commit message should contain only description about changes.
All those explanation you can always put into review as comment.

https://llvm.org/docs/DeveloperPolicy.html#commit-messages:
"The body should be concise, but explanatory, including a complete reasoning. Unless it is required to understand the change, examples, code snippets and gory details should be left to bug comments, web review or the mailing list."

mikecrowe edited the summary of this revision. (Show Details)Mar 12 2023, 2:12 PM

Thank you for your patience.

You will need to rebase this, there are conflicts...

mikecrowe updated this revision to Diff 504474.Mar 12 2023, 2:33 PM
mikecrowe edited the summary of this revision. (Show Details)

Rebase. No conflicts.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 12 2023, 2:43 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.