This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Extract string header from redundant-string-cstr checker
ClosedPublic

Authored by mikecrowe on Feb 16 2023, 1:47 PM.

Details

Summary

In preparation for using the implementation of basic_string and
basic_string_view from redundant-string-cstr.cpp in other checks, let's
extract it to a header.

Using the existing <string.h> to provide size_t, using that to define
size_type and using it rather than the previous "size" type makes it
easier to provide the size() method later and makes the implementation
closer to the standard.

Diff Detail

Event Timeline

mikecrowe created this revision.Feb 16 2023, 1:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2023, 1:47 PM
Herald added a subscriber: xazax.hun. · View Herald Transcript
mikecrowe requested review of this revision.Feb 16 2023, 1:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2023, 1:47 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
Eugene.Zelenko set the repository for this revision to rG LLVM Github Monorepo.
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

May be you could use heard in other test in same patch?

carlosgalvezp added inline comments.Feb 17 2023, 1:16 AM
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr.cpp
1–2

Where does this come from? Other tests use the actual path to the folder.

May be you could use heard in other test in same patch?

I'm not quite sure what you mean by "heard". Searching for "heard" in the LLVM code and Googling "llvm heard" reveal nothing useful. Please can you explain what you mean?

If you're suggesting that I could use the new <string> header to replace declarations of basic_string etc. in other tests then I think that would be possible with some careful checking to make sure it include the necessary functionality. I think that would easier to review separately rather than in a single patch though.

clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr.cpp
1–2

I copied it from other checks:

clang-tools-extra/test/clang-tidy/checkers/bugprone/signal-handler-minimal.c:// RUN: -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/bugprone/signal-handler-posix.c:// RUN: -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/bugprone/signal-handler.c:// RUN: %check_clang_tidy %s bugprone-signal-handler %t -- -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/bugprone/signal-handler.cpp:// RUN: %check_clang_tidy -std=c++14 %s bugprone-signal-handler %t -- -- -isystem %clang_tidy_headers -isystem %S/Inputs/signal-handler -target x86_64-unknown-unknown
clang-tools-extra/test/clang-tidy/checkers/bugprone/suspicious-include.cpp:// RUN: %check_clang_tidy %s bugprone-suspicious-include %t -- -- -isystem %clang_tidy_headers -fmodules
clang-tools-extra/test/clang-tidy/checkers/google/objc-function-naming.m:// RUN: %check_clang_tidy %s google-objc-function-naming %t -- -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/llvm/include-order.cpp:// RUN: %check_clang_tidy %s llvm-include-order %t -- -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/modernize/pass-by-value-macro-header.cpp:// RUN: %check_clang_tidy %s modernize-pass-by-value %t -- -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/performance/move-constructor-init.cpp:// RUN: -- -isystem %clang_tidy_headers
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr-format.cpp:// RUN: %check_clang_tidy -check-suffix=STDFORMAT -std=c++20 %s readability-redundant-string-cstr %t -- --  -isystem %clang_tidy_headers -DTEST_STDFORMAT
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr-format.cpp:// RUN: %check_clang_tidy -check-suffixes=STDFORMAT,STDPRINT -std=c++2b %s readability-redundant-string-cstr %t -- --  -isystem %clang_tidy_headers -DTEST_STDFORMAT -DTEST_STDPRINT
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-string-cstr.cpp:// RUN: %check_clang_tidy %s readability-redundant-string-cstr %t -- -- -isystem %clang_tidy_headers

It seems to work!

mikecrowe marked an inline comment as done.Feb 17 2023, 9:08 AM

If you're suggesting that I could use the new <string> header to replace declarations of basic_string etc. in other tests then I think that would be possible with some careful checking to make sure it include the necessary functionality. I think that would easier to review separately rather than in a single patch though.

I had a quick look at likely candidates:

  • abseil/redundant-strcat-calls.cpp appears to declare basic_string outside the std namespace and inside it, and does some strange stuff with a base class. If I rip all that out, and replace uses of string with std::string then the tests pass using <string>.
  • readability/string-compare.cpp and readability/container-data-pointer.cpp require some tweaks to <string> but are straightforward.
  • There may be complications if MSVC differs in its std::basic_string implementation in ways that the -msvc tests care about, but I didn't spot any.

So, it looks like the use of this new <string> header could be extended. Do you have a preference for one big patch, one patch per directory (abseil, readability), or one patch per check? Do you require all to be complete before even this patch can be accepted?

carlosgalvezp accepted this revision.Feb 17 2023, 9:17 AM

Fair enough! I think we should merge this patch as is first, so that people can benefit from it already, and work through the technical debt afterwards.

This revision is now accepted and ready to land.Feb 17 2023, 9:17 AM

Fair enough! I think we should merge this patch as is first, so that people can benefit from it already, and work through the technical debt afterwards.

Great. Thank you. I'm still interested in answers to the questions in my comment since I have the necessary changes to migrate the tests I mentioned to using <string> ready and I'd be happy to have a go at migrating more.

Thanks.

Mike.

If you're suggesting that I could use the new <string> header to replace declarations of basic_string etc. in other tests then I think that would be possible with some careful checking to make sure it include the necessary functionality. I think that would easier to review separately rather than in a single patch though.

I had a quick look at likely candidates:

  • abseil/redundant-strcat-calls.cpp appears to declare basic_string outside the std namespace and inside it, and does some strange stuff with a base class. If I rip all that out, and replace uses of string with std::string then the tests pass using <string>.
  • readability/string-compare.cpp and readability/container-data-pointer.cpp require some tweaks to <string> but are straightforward.
  • There may be complications if MSVC differs in its std::basic_string implementation in ways that the -msvc tests care about, but I didn't spot any.

So, it looks like the use of this new <string> header could be extended. Do you have a preference for one big patch, one patch per directory (abseil, readability), or one patch per check? Do you require all to be complete before even this patch can be accepted?

If it's possible to re-use this new string header into the other checks that would be ideal. After all, in the real world there's only one string header. If it's a drop-in replacement I think it's fine to do multiple checks in one patch. Otherwise if the check / string header needs tweak it's probably better to do one check at a time.

mikecrowe updated this revision to Diff 502385.Mar 4 2023, 1:01 PM
mikecrowe edited the summary of this revision. (Show Details)

This new version uses size_t and size_type rather than the annoyingly-named "size" type that the old implementation in redundant-string-cstr.cpp used and is the basis for using the new <string> header in four further checks to be submitted as separate changes.

The changes that extend the <string> header further and use it in a few more checks are:

and of course my original change:

carlosgalvezp accepted this revision.Mar 5 2023, 7:39 AM

Sounds good, should we land this? If you don't have commit rights, please let us know Github name and email for attribution.

Sounds good, should we land this?

If you're happy to do so, then so am I.

If you don't have commit rights, please let us know Github name and email for attribution.

These are my first code contributions to LLVM, so I don't really know the process. I don't believe that I have commit rights. My GitHub username is "mikecrowe" and my email address is "mac@mcrowe.com".

Thanks!

Mike.

carlosgalvezp added inline comments.Mar 5 2023, 8:05 AM
clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/string
5–6

Sorry I just noticed this - should we keep using the __SYZE_TYPE__ macro that existed in the previous patch? The purpose of this header is to not include any standard header, and yet this is done in line 5 so it kinda defeats the purpose.

I.e add the following under line 8:

typedef __SIZE_TYPE__ size_t

And then use that in line 17.

The purpose of this header is to not include any standard header, and yet this is done in line 5 so it kinda defeats the purpose.

I believe that this <string> header is including clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/string.h rather than the system string.h header. (That's why I'm including string.h to get size_t rather than the more obvious stdlib.h.) If we're going to have sets of "standard-but-not-standard" headers for these checks then it feels like it ought to be safe to use them.

I will double check that this is true once my current build is complete.

Mike.

I will double check that this is true once my current build is complete.

Yes, it's true. I stuck a #error in clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/string.h and I saw the expected error from a file including <string>.

I will double check that this is true once my current build is complete.

Yes, it's true. I stuck a #error in clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/string.h and I saw the expected error from a file including <string>.

Awesome, thanks for checking, I wasn't aware we already had string.h. Then if we do have headers named the same as the standard headers, would it make sense to name this header cstring instead of string?

I will double check that this is true once my current build is complete.

Yes, it's true. I stuck a #error in clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/string.h and I saw the expected error from a file including <string>.

Awesome, thanks for checking, I wasn't aware we already had string.h. Then if we do have headers named the same as the standard headers, would it make sense to name this header cstring instead of string?

I think you mean rename string.h to cstring. (string clearly has to be just string if it's the standard C++ string header.)

A bit of searching in the existing checks shows that string.h is included by C checks (bugprone/signal-handler*.c) and also by checks that deliberately want string.h so they can suggest switching to cstring (modernize/deprecated-headers*) so it looks like it's necessary to keep string.h. If you wish, a cstring wrapper that just includes string.h could be created, but it really ought to put everything in namespace std too. (If so, I think this would probably be better done as a separate change.)

I will double check that this is true once my current build is complete.

Yes, it's true. I stuck a #error in clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/string.h and I saw the expected error from a file including <string>.

Awesome, thanks for checking, I wasn't aware we already had string.h. Then if we do have headers named the same as the standard headers, would it make sense to name this header cstring instead of string?

I think you mean rename string.h to cstring. (string clearly has to be just string if it's the standard C++ string header.)

A bit of searching in the existing checks shows that string.h is included by C checks (bugprone/signal-handler*.c) and also by checks that deliberately want string.h so they can suggest switching to cstring (modernize/deprecated-headers*) so it looks like it's necessary to keep string.h. If you wish, a cstring wrapper that just includes string.h could be created, but it really ought to put everything in namespace std too. (If so, I think this would probably be better done as a separate change.)

You are absolutely right, I don't know why I mixed cstring with string, nevermind! Pushed now, will look at the follow up patches soon. Thanks again for the fix!