This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add bugprone-unsafe-functions checker.
ClosedPublic

Authored by futogergely on Nov 7 2020, 3:24 AM.

Details

Summary

Checks for unsafe functions listed in CERT C Coding Standard
Recommendation MSC24-C and Rule MSC33-C.

For the listed functions, an alternative, more secure replacement is
suggested, if available. The checker heavily relies on the functions
from annex K (Bounds-checking interfaces) of C11.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added a project: Restricted Project. · View Herald TranscriptNov 7 2020, 3:24 AM
ktomi996 requested review of this revision.Nov 7 2020, 3:24 AM
ktomi996 updated this revision to Diff 303633.Nov 7 2020, 3:37 AM
steakhal added a subscriber: steakhal.
lebedev.ri retitled this revision from MSC24-C Obsolescent Functions check to [clang-tidy] CERT MSC24-C Obsolescent Functions check.Nov 7 2020, 4:03 AM
Eugene.Zelenko added inline comments.
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
19 ↗(On Diff #303633)

Why this could not be member of check class?

52 ↗(On Diff #303633)

Why not to use plain assignment? See readability-simplify-boolean-expr.

59 ↗(On Diff #303633)

Early return will be better.

69 ↗(On Diff #303633)

Please elide braces.

80 ↗(On Diff #303633)

Please elide braces.

84 ↗(On Diff #303633)

Could be const auto *, because type is spelled in same statement.

85 ↗(On Diff #303633)

Could be const auto *, because type is spelled in same statement.

92 ↗(On Diff #303633)

Could be const auto *, because type is spelled in same statement.

93 ↗(On Diff #303633)

Could be const auto *, because type is spelled in same statement.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
12 ↗(On Diff #303633)

Please separate include statement and namespaces with empty line.

13 ↗(On Diff #303633)

Please separate with empty line.

clang-tools-extra/docs/clang-tidy/checks/cert-obsolescent-functions.rst
6 ↗(On Diff #303633)

Please make first statement same as in Release Notes. Please enclose all preprocessor variables and function names in double back-ticks.

15 ↗(On Diff #303633)

Link to original rule should be at the end. See other documentation as example.

17 ↗(On Diff #303633)

Please remove space before colon and separate code with empty line.

18 ↗(On Diff #303633)

Should be .. code-block:: c++. See other documentation as formatting example.

clang-tools-extra/test/clang-tidy/cert-obsolescent-functions.cpp
5 ↗(On Diff #303633)

Please separate with empty line.

Eugene.Zelenko set the repository for this revision to rG LLVM Github Monorepo.Nov 7 2020, 7:25 AM

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.
What do you think?

There is a mismatch between your code and docs too, regarding the function you check for.
In the code you match for vsprintf, vsscanf, vswprintf, vswcanf, wcrtomb, wcscat, but none of these are mentioned in the docs.

The checker warns only if STDC_LIB_EXT1 macro is defined and the value of STDC_WANT_LIB_EXT1 macro is 1 in this case it suggests the corresponding functions from Annex K instead the vulnerable function.

I would suggest mentioning these macros and their purpose in the docs.
Eg. that the STDC_WANT_LIB_EXT1 should be defined to 1 but the other is left to the implementation.

That being said, I would request more tests, demonstrating that this macro detection works accordingly.

This checker might be a bit noisy. Have you tried it on open-source projects?
If it is, we should probably note that in the docs as well.

In the tests, It is a good practice to demonstrate that the offered recommendation does not trigger yet another warning.
Don't forget to put a no-warning to highlight that.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
30 ↗(On Diff #303633)

Hm, it must be a typo.
vswcanf -> vswscanf

By the same token, I miss these functions from this enumeration.
vfwprintf, vfwscanf, vprintf, vwprintf, vwscanf
However, they are present in the cert table you mentioned.

33 ↗(On Diff #303633)

Why is strlen an obsolescent function? I don't feel it justified, even if there was a comment about it on the cert page.

69 ↗(On Diff #303633)

Please also check if the value is 1, as your summary and the standard requires.

77–79 ↗(On Diff #303633)
87–89 ↗(On Diff #303633)

We should probably use llvm::Twine instead of multiple std::string constructions.
I might be wrong though.

89 ↗(On Diff #303633)

can -> should
The same at the other diagnostic message.

91 ↗(On Diff #303633)

This and the previous if statement is exclusive, am I right?
If we have a match, that is either a callexpr or a vardecl.

I would prefer a return at the end of the previous if and an assert here that the FunctionPointer must be non-null.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
27 ↗(On Diff #303633)

It should be private.

clang-tools-extra/docs/clang-tidy/checks/cert-obsolescent-functions.rst
10 ↗(On Diff #303633)

Another typo.

19–23 ↗(On Diff #303633)

I'm not convinced by this example.

AFAIK, the intended usage is to define the __STDC_WANT_LIB_EXT1__ to 1 before the string.h header included.
It's important to provide a good example, as this checker will be silent if not used correctly.

23 ↗(On Diff #303633)

We should probably align this comment as well.

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

What do you think?

There is a mismatch between your code and docs too, regarding the function you check for.
In the code you match for vsprintf, vsscanf, vswprintf, vswcanf, wcrtomb, wcscat, but none of these are mentioned in the docs.

The checker warns only if STDC_LIB_EXT1 macro is defined and the value of STDC_WANT_LIB_EXT1 macro is 1 in this case it suggests the corresponding functions from Annex K instead the vulnerable function.

I would suggest mentioning these macros and their purpose in the docs.
Eg. that the STDC_WANT_LIB_EXT1 should be defined to 1 but the other is left to the implementation.

That being said, I would request more tests, demonstrating that this macro detection works accordingly.

This checker might be a bit noisy. Have you tried it on open-source projects?
If it is, we should probably note that in the docs as well.

In the tests, It is a good practice to demonstrate that the offered recommendation does not trigger yet another warning.
Don't forget to put a no-warning to highlight that.

steakhal requested changes to this revision.Dec 23 2020, 4:38 AM

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.
IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

This revision now requires changes to proceed.Dec 23 2020, 4:38 AM

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.

I suspect that should be a flag in the check's options.

IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.

I suspect that should be a flag in the check's options.

I don't see any value in disabling the check for eg. setbuf. Even the man page says that you should use the setvbuf instead.
Why would anyone disable this if one already wants diagnostics for the CERT rule it implements?

IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.

I suspect that should be a flag in the check's options.

I don't see any value in disabling the check for eg. setbuf. Even the man page says that you should use the setvbuf instead.
Why would anyone disable this if one already wants diagnostics for the CERT rule it implements?

IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

I think the question is, *why* are these checks being implemented?
Just to claim that for some particular rule there is a check, and cross it off a list?
Or for them to be actually used?

I think the question is, *why* are these checks being implemented?
Just to claim that for some particular rule there is a check, and cross it off a list?

Initially, yes. I think one could learn a lot from contributing to any project.
It's not inherently a bad thing to combine these two.

Or for them to be actually used?

I want a useful checker, that's why I highlighted some of my concerns and suggested a way forward.
It might not be useful to everyone as it tries to implement a domain-specific CERT rule, but it's still up to the user to enable this.
I think we should ask, who is the audience of this checker?
I assume only the users who are interested in the CERT guideline would use this. At this point, we should be clear about what we are checking for.
I think it's OK, to say that only a part of the rule is implemented, and we should carefully document this fact. But IMO one should go the extra mile to try hard and implement the other parts of the rule as well.
Like, matching on the setbuf is not that hard really. The rest of the missing functions probably fall into the same category.

I might be wrong on this though.
Keep in mind that I'm not tidy dev, so take my opinion with a pinch of salt.

I think the question is, *why* are these checks being implemented?
Just to claim that for some particular rule there is a check, and cross it off a list?
Or for them to be actually used?

We have an existing CERT module for CERT-specific checks. When the check is honoring a published coding standard, that's usually sufficient justification -- but I do think the check should implement what the CERT recommendation says. If the CERT recommendation seems deficient in some way, it's reasonable to reach out to the authors (they allow comments on their wiki and are pretty good about responding to those comments) for clarification or to discuss potential improvements to the recommendation that would make the check more palatable.

whisperity edited the summary of this revision. (Show Details)Jan 5 2021, 2:41 AM
whisperity added a subscriber: gerazo.
futogergely commandeered this revision.Nov 29 2021, 12:46 AM
futogergely added a reviewer: ktomi996.
futogergely retitled this revision from [clang-tidy] CERT MSC24-C Obsolescent Functions check to [clang-tidy] Add cert-msc24-c checker..
futogergely edited the summary of this revision. (Show Details)
steakhal resigned from this revision.Nov 29 2021, 2:24 AM

I've already reviewed this change offline, thus I resign. For me this looks great, I'm eager to hear your opinions!

x64 debian failed

Should/does this work in C++ mode for std::whatever?

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
48 ↗(On Diff #390309)

Is this ordering specific in any way? Is the rule listing them in this order? If not, can we have them listed alphabetically, for easier search and potential later change?

clang-tools-extra/docs/clang-tidy/checks/cert-msc24-c.rst
10 ↗(On Diff #390309)

(And consistent capitalisation later.)

29 ↗(On Diff #390309)
futogergely marked 2 inline comments as done.
futogergely marked an inline comment as done.Dec 3 2021, 3:44 AM
futogergely added inline comments.
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
48 ↗(On Diff #390309)

done

clang-tools-extra/docs/clang-tidy/checks/cert-msc24-c.rst
10 ↗(On Diff #390309)

done

futogergely marked an inline comment as done.Dec 3 2021, 4:01 AM

Should/does this work in C++ mode for std::whatever?

Right now the checker finds the functions in the global namespace only. The recommendation is listed only in the C part of the CERT rules, and as far as I know, Annex K functions are defined in the global namespace only, (or at least based on the standard). I can't really decide if the checker should look for the functions in the std namespace as well or not...

L129 and L135 are uncovered by tests. The rest of the lines are covered by tests, according to lcov.

The checker produced in total 15 reports on these 18 projects:
memcached,tmux,curl,twin,vim,openssl,sqlite,ffmpeg,postgres,tinyxml2,libwebm,xerces,bitcoin,protobuf,qtbase,codechecker,contour,llvm-project
You can check the reports by following this link:
https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?is-unique=on&detection-status=New&detection-status=Reopened&detection-status=Unresolved&run=%2aD91000-diff-4&checker-name=cert-msc24-c&diff-type=New

It seems like none of these projects actually use the annex K functions, which is not really a surprise.
VLC and lighttpd seems to use it. @futogergely could you please run your check on those projects?

clang-tools-extra/docs/clang-tidy/checks/cert-msc24-c.rst
13 ↗(On Diff #390309)

remove trailing whitespaces

Thanks for this new check!

In terms of whether we should expose the check in C++: I think we shouldn't. Annex K *could* be supported by a C++ library implementation (http://eel.is/c++draft/library#headers-10), but none of the identifiers are added to namespace std and there's not a lot of Annex K support in C so I imagine there's even less support in C++. We can always revisit the decision later and expand support for C++ once we know what that support should look like.

I think we should probably also not enable the check when the user compiles in C99 or earlier mode, because there is no Annex K available to provide replacement functions.

We should also circle back with the CERT authors for updates on missing entries in their tables once we've figured out what is missing, and on the terminology concerns of referring to these as "deprecated or obsolescent" functions.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
41–42 ↗(On Diff #391604)

This comment is not accurate. gets_s() is a secure replacement for gets().

49–59 ↗(On Diff #391604)

This list appears to be missing quite a few functions with secure replacements in Annex K. For example: tmpfile_s, tmpnam_s, strerrorlen_s, strlen_s... can you check the list against the actual Annex K, as it seems the CERT recommendation is still out of date.

84–86 ↗(On Diff #391604)

Fixed a few nits with the code, but gets() was never deprecated, so the message is not correct (it was present in C99 and removed in C11 with no deprecation period). I think it may be better to say "function %0 was removed in C11".

103 ↗(On Diff #391604)
33 ↗(On Diff #303633)

I think its exclusion from the CERT page was an oversight. strlen_s() is a secure replacement for strlen() and it would be weird to leave it off the list.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
18 ↗(On Diff #391604)

The terminology used in the CERT recommendation is pretty unfortunate and I don't think we should replicate it. Many of these are *not* deprecated or obsolescent functions and calling them that will confuse users. The crux of the CERT recommendation is that these functions have better replacements in more modern versions of C. So I would probably try to focus our diagnostics and documentation around modernization rather than deprecation.

FWIW, this is feedback that should also go onto the CERT recommendation itself. I noticed someone already observed the same thing: https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions?focusedCommentId=215482395#comment-215482395

"L129 and L135 are uncovered by tests. The rest of the lines are covered by tests, according to lcov."
This happens if STDC_WANT_LIB_EXT1 is defined empty (L129) or STDC_WANT_LIB_EXT1 is not literal (numeric constant, ...).

"It seems like none of these projects actually use the annex K functions, which is not really a surprise.
VLC and lighttpd seems to use it. @futogergely could you please run your check on those projects?"

lighttpd: the checker issued 386 warnings. The reason is that STDC_WANT_LIB_EXT1 is defined in a header which is included everywhere. However, functions from Annex K are used only in one source file.
VLC: 2 warnings, usage of the rewind function.

Both of these projects can be compiled without Annex K as well, which makes the code cumbersome. Maybe this check can be really useful in a project where Annex K is a must.

Maybe we could remove the check for setbuf() and rewind() functions, making this a pure Annex K checker. There is an overlapping with another recommendation (https://wiki.sei.cmu.edu/confluence/display/c/ERR07-C.+Prefer+functions+that+support+error+checking+over+equivalent+functions+that+don%27t), these functions are also listed there.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
41–42 ↗(On Diff #391604)

If gets is removed from C11, and gets_s is introduced in C11, then gets_s cannot be a replacement or? Maybe fgets?

Also I was wondering if we would like to disable this check for C99, maybe we should remove the check for gets all together.

49–59 ↗(On Diff #391604)

Missing functions added: tmpfile/tmpfile_s, tmpnam/tmpnam_s, memset/memset_s, scanf, strlen, wcslen

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
18 ↗(On Diff #391604)

Changed it to:
Checks for functions listed in CERT C Coding Standard Recommendation MSC24-C
that have safer, more secure replacements. The functions checked are considered unsafe because for
example they are missing bounds checking and/or non-reentrant. For the listed functions a
replacement function is suggested, if available. The checker heavily relies on the functions from
Annex K (Bounds-checking interfaces) of C11.

Also I changed the "is obsolescent" to "is not bounds-checking" in the getRationale function

futogergely added inline comments.Jan 4 2022, 1:37 AM
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
84–86 ↗(On Diff #391604)

Done

103 ↗(On Diff #391604)

Done.

The functions asctime and asctime_r are discouraged according to CERT MSC33-C rule. These could be added to this check as well. There is a clang SA checker SecuritySyntaxChecker that contains other obsolete functions (and the whole check looks like it can be done in clang-tidy).

The functions asctime and asctime_r are discouraged according to CERT MSC33-C rule. These could be added to this check as well. There is a clang SA checker SecuritySyntaxChecker that contains other obsolete functions (and the whole check looks like it can be done in clang-tidy).

+1

The functions asctime and asctime_r are discouraged according to CERT MSC33-C rule. These could be added to this check as well. There is a clang SA checker SecuritySyntaxChecker that contains other obsolete functions (and the whole check looks like it can be done in clang-tidy).

The inclusion of CERT MSC33-C rule seems to be straightforward: check for asctime and asctime_r, and suggest asctime_s if Annex K is available, otherwise suggest strftime.

security.insecureAPI: the following functions could be added to the checker: bcmp, bcopy, bzero, getpw, mktemp, vfork, and if arc4random is available: drand48, erand48, jrand48, lcong48, lrand48, mrand48, nrand48, random, rand_r.
I think for now it is enough to issue a warning of using these functions, and not suggest a replacement. Should we add an option to the checker to also check for these functions?

I think for now it is enough to issue a warning of using these functions, and not suggest a replacement. Should we add an option to the checker to also check for these functions?

IMHO, it is okay to start with just simply issuing the warning. Later we might add that option (in a subsequent patch).

Bump! Thanks @futogergely for picking up @ktomi996's work, and @steakhal for the help given during the implementation process!

I think for now it is enough to issue a warning of using these functions, and not suggest a replacement. Should we add an option to the checker to also check for these functions?

IMHO, it is okay to start with just simply issuing the warning. Later we might add that option (in a subsequent patch).

Yes, please. And I suggest indeed hiding it behind an option, e.g. ReportMoreUnsafeFunctions which I believe should default to true, but setting it to false would allow people who want their code to be "only" strict CERT-clean (that is, not care about these additional matches), can request doing so. Not offering a replacement right now is okay too, we can definitely work them in later.

In terms of whether we should expose the check in C++: I think we shouldn't. Annex K *could* be supported by a C++ library implementation (http://eel.is/c++draft/library#headers-10), but none of the identifiers are added to namespace std and there's not a lot of Annex K support in C so I imagine there's even less support in C++. We can always revisit the decision later and expand support for C++ once we know what that support should look like.

(Minor suggestion, but we could pull a trick with perhaps checking whether the macro is defined in the TU by the user and the library, and trying to match against the non-std:: Annex K functions, and if they are available, consider the implementation the user is running under as "AnnexK-capable-even-in-C++-mode" and start issuing warnings? This might sound very technical, and definitely for later consideration!)

I think we should probably also not enable the check when the user compiles in C99 or earlier mode, because there is no Annex K available to provide replacement functions.

Except for warning about gets() and suggesting fgets()?


Some nits: when it comes to inline comments, please mark the comments of the reviewers as "Done" when replying if you consider that comment was handled. Only you, the patch author, can do this. When there are too many open threads of comments, it gets messy to see what has been handled and what is still under discussion.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
41–42 ↗(On Diff #391604)

Yes, it's strange territory. If I make my code safer but stay pre-C11, I actually can't, because the new function isn't there yet. If I also upgrade then I'll have to make my code "safer", because the old function is now missing...

Given how brutally dangerous gets is (you very rarely see a documentation page just going Never use gets()!), I would suggest keeping at least the warning.

Although even the CERT rule mentions gets_s(). We could have some wiggle room here: do the warning for gets(), and suggest two alternatives: fgets(), or gets_s() + Annex K? fgets(stdin, ...) is also safe, if the buffer's size is given appropriately.

41–42 ↗(On Diff #391604)

CERT mentions C99 TC3, which seems to be available here: https://webstore.iec.ch/p-corrigenda/isoiec9899-cor3%7Bed2.0%7Den.pdf . I'm not sure how normative this source is (iec.ch seems legit to me that this isn't just a random WGML draft!), and on page 8, in point 46 it says: "Add new paragraph after paragraph 1: The gets function is obsolescent, and is deprecated.".

This seems like nitpicking, but maybe CppReference is outdated as it never indicated the "deprecation" period?

(FYI: http://www.iso.org/standard/50510.html does not offer any purchase of the older version of the standard, or this errata.)

94 ↗(On Diff #391604)
110 ↗(On Diff #391604)

Maybe say this->PP or rename the parameter here ThePP and do PP = ThePP?

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
19 ↗(On Diff #391604)
21–24 ↗(On Diff #391604)

(Something went wrong with the formatting here.)

42–43 ↗(On Diff #391604)
clang-tools-extra/docs/ReleaseNotes.rst
105–108

Format, reflow, align with the comment in the header, etc.

futogergely marked 14 inline comments as done and an inline comment as not done.Feb 11 2022, 2:50 AM
futogergely added inline comments.
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
41–42 ↗(On Diff #391604)

I don't use 'deprecated' in the warning message, I issue 'is insecure, and it is removed from C11' for gets instead.

I added the following replacement suggestions for gets: gets_s if Annex K is available, fgets if Annex K is not available.

49–59 ↗(On Diff #391604)

strlen/strnlen_s, wcslen/wcsnlen_s, memset/memset_s, scanf/scanf_s has been added.

I did not add tmpfile/tmpfile_s, tmpnam/tmpnam_s because there is a separate CERT rule for it: https://wiki.sei.cmu.edu/confluence/display/c/FIO21-C.+Do+not+create+temporary+files+in+shared+directories.

futogergely marked an inline comment as done.
futogergely retitled this revision from [clang-tidy] Add cert-msc24-c checker. to [clang-tidy] Add cert-msc24-msc33-c checker..
futogergely edited the summary of this revision. (Show Details)

I changed the class name: ObsolescentFunctionsCheck->UnsafeFunctionsCheck.
Since MSC33-C is also included, I changed the checker name to cert-msc24-msc33-c.
I added the following functions from CheckSecuritySyntaxOnly under option 'ReportMoreUnsafeFunctions': bcmp, bcopy, bzero, getpw, vfork. Since there is a replacement suggested there, I added the replacement suggestions also.
I did not add tmpnam, tmpfile, mktemp, mkstemp, rand..() to the checker, because there are separate CERT rules for these.

Now it would be better to have a checker called UnsafeFunctionsCheck (probably in bugprone) and add the cert checkers "msc24-c" and "msc33-c" as aliases. This makes the check extendable if more (CERT rule related or not) cases for unsafe functions are added.

Now it would be better to have a checker called UnsafeFunctionsCheck (probably in bugprone) and add the cert checkers "msc24-c" and "msc33-c" as aliases. This makes the check extendable if more (CERT rule related or not) cases for unsafe functions are added.

Done

Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 5:44 AM
futogergely retitled this revision from [clang-tidy] Add cert-msc24-msc33-c checker. to [clang-tidy] Add bugprone-unsafe-functions checker..

Checker has been moved to bugprone.

I found only small issues.

clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.cpp
109

DeclRef->getSourceRange() can be used

clang-tools-extra/docs/clang-tidy/checks/bugprone-usafe-functions.rst
4

I do not know if this works, it is better if this line has the same length as the line above.

105

These examples do not work: strcat_s and strcpy_s take different parameters than the original.

futogergely marked 3 inline comments as done.Apr 8 2022, 12:25 AM

Just one question if you could try this out for me: what happens if you run clang-tidy a.c b.c (two TUs in the invocation) where one of them (preferably the later one, i.e. b.c) does NOT have Annex K enabled? I believe the cached IsAnnexKAvailable (like any other TU-specific state of the check instance) should be invalidated/cleared in an overridden void onStartTranslationUnit() function.

Also, what happens if the check is run for C++ code?

clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
208–214

What is the reason for this being a module option, when the name of the option looks like a check option? Did something change and the API requires this now? If you do Options.get("ReportMoreUnsafeFunctions", true) it will automatically work and use this default option. You also overridden the storeOptions() function properly by the looks of it, so there should be no reason for this change.

clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.cpp
158–160

(And you can remove all the other // Caching the result comments.

204–213

Instead of wrapping the StringRef into an Optional, couldn't we achieve the same with the empty string(ref)... signalling the fact that there is no replacement?

clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.h
22
48

(Superfluous comment.)

clang-tools-extra/docs/ReleaseNotes.rst
111
clang-tools-extra/docs/clang-tidy/checks/bugprone-unsafe-functions.rst
9 ↗(On Diff #421435)
81–83 ↗(On Diff #421435)

(Visual nit.)

futogergely marked 7 inline comments as done.Jun 27 2022, 6:37 AM

Just one question if you could try this out for me: what happens if you run clang-tidy a.c b.c (two TUs in the invocation) where one of them (preferably the later one, i.e. b.c) does NOT have Annex K enabled? I believe the cached IsAnnexKAvailable (like any other TU-specific state of the check instance) should be invalidated/cleared in an overridden void onStartTranslationUnit() function.

Also, what happens if the check is run for C++ code?

It is working as is, a new ClangTidyCheck is created for every translation unit.

futogergely marked an inline comment as done.Jun 27 2022, 6:38 AM

updates based on comments.

Locations for tests and check documentation was changed recently. Please rebase from main and adjust your code accordingly.

Locations for tests and check documentation was changed recently. Please rebase from main and adjust your code accordingly.

Done.

Eugene.Zelenko added inline comments.Jun 27 2022, 8:02 AM
clang-tools-extra/docs/ReleaseNotes.rst
106

Please keep checks entries in alphabetical order.

108

One sentence should be enough.

futogergely marked 2 inline comments as done.

Another functions that can be added to the list: atoi(), atol(), atoll(), atof(). These are unsafe according to https://wiki.sei.cmu.edu/confluence/display/c/ERR34-C.+Detect+errors+when+converting+a+string+to+a+number.

Generally LGTM. Please revisit the documentation and let's fix the style, and then we can land.

In terms of whether we should expose the check in C++: I think we shouldn't. [...]

I think we should probably also not enable the check when the user compiles in C99 or earlier mode, because there is no Annex K available to provide replacement functions.

@aaron.ballman I think the current version of the check satisfies these conditions. It seems the check will run for every TU, but in case there is no alternative the check could suggest, it will do nothing:

if (!ReplacementFunctionName)
  return;

Is this good enough? This seems more future-proof than juggling the LangOpts instance in yet another member function.

clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.cpp
2

Ditto.

151

(Just so that we do not get "unused variable" warnings when compiling.)

clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.h
2

Style nit: length of line is not padded to 80-col.

clang-tools-extra/docs/clang-tidy/checks/bugprone/unsafe-functions.rst
31–40 ↗(On Diff #440259)

In general, the rendered version of the docs should be visually observed, as backtick in RST only turns on emphasis mode and not monospacing, i.e. it will be memcpy_s and not memcpy_s. Please look at the other checks and if there is a consistent style (e.g. symbol names (function) are monospaced, but options of the check is emphasised) align it to that so we have the documentation "fall into place" visually.

52 ↗(On Diff #440259)

(Style nit: yeah this will not look like a symbol at all here)

In terms of whether we should expose the check in C++: I think we shouldn't. [...]

I think we should probably also not enable the check when the user compiles in C99 or earlier mode, because there is no Annex K available to provide replacement functions.

@aaron.ballman I think the current version of the check satisfies these conditions. It seems the check will run for every TU, but in case there is no alternative the check could suggest, it will do nothing:

if (!ReplacementFunctionName)
  return;

Is this good enough? This seems more future-proof than juggling the LangOpts instance in yet another member function.

@aaron.ballman @njames93 Ping!
It seems @futogergely has resigned from the company, so I'll end up flying the approach, but the one above is the last outstanding question.

Generally LGTM. Please revisit the documentation and let's fix the style, and then we can land.

In terms of whether we should expose the check in C++: I think we shouldn't. [...]

I think we should probably also not enable the check when the user compiles in C99 or earlier mode, because there is no Annex K available to provide replacement functions.

@aaron.ballman I think the current version of the check satisfies these conditions. It seems the check will run for every TU, but in case there is no alternative the check could suggest, it will do nothing:

if (!ReplacementFunctionName)
  return;

Is this good enough? This seems more future-proof than juggling the LangOpts instance in yet another member function.

My concern with that approach was that we pay the full expense of doing the matches only get get into the check() function to bail out on all the Annex K functions, but now that there are replacements outside of Annex K, I don't see a way around paying that expense, so I think my concern has been addressed as well as it could have been.

My concern with that approach was that we pay the full expense of doing the matches only get get into the check() function to bail out on all the Annex K functions, but now that there are replacements outside of Annex K, I don't see a way around paying that expense, so I think my concern has been addressed as well as it could have been.

I think that Clang-Tidy checks are instantiated per AST. I will look into whether we can somehow do the disabling of the check as early as possible! (In that case, we could simply NOT register the matcher related to Annex-K functions.) Either way, I'll do a rebase, re-run the tests and etc., and likely take over the check.

Hi, sorry for the late answer, did not have time to check this in the last few weeks. I will try to address all of the remaining comments.

futogergely marked 5 inline comments as done.
futogergely removed a reviewer: ktomi996.

Addressing review comments.

My concern with that approach was that we pay the full expense of doing the matches only get get into the check() function to bail out on all the Annex K functions, but now that there are replacements outside of Annex K, I don't see a way around paying that expense, so I think my concern has been addressed as well as it could have been.

I think that Clang-Tidy checks are instantiated per AST. I will look into whether we can somehow do the disabling of the check as early as possible! (In that case, we could simply NOT register the matcher related to Annex-K functions.) Either way, I'll do a rebase, re-run the tests and etc., and likely take over the check.

I checked, and I think that at the point of ClangTidyCheck::registerMatchers the preprocessor has not been executed yet... (and we need the value of macros STDC_LIB_EXT1 and STDC_WANT_LIB_EXT1 to decide if we need to register some matchers or not) @whisperity could you maybe double-check it please?
What we could do is:

  1. add a new checker option to decide if we suggest replacements from AnnexK. We could avoid registering matchers this way, but I don't really like this, having an option for something we could decide from the defined macros.
  2. As a TODO, we could make possible to register checkers AFTER the preprocessor is executed. I have not looked into this, so I don't really know if it is possible at all in the current architecture.

What we could do is:

  1. add a new checker option to decide if we suggest replacements from AnnexK. We could avoid registering matchers this way, but I don't really like this, having an option for something we could decide from the defined macros.
  2. As a TODO, we could make possible to register checkers AFTER the preprocessor is executed. I have not looked into this, so I don't really know if it is possible at all in the current architecture.

I think it's fine if we do not go a great length for changing the entire infrastructure around this. At least we looked, it's not possible.

@aaron.ballman said that we've likely did as much as we could. At least the matchers themselves aren't that expensive, it's a small string lookup (hopefully hashed or at least memoised) in a trivial search for a very specific node type only.

I'll try running at least one final grand CI test of this check just to make sure it's not crashing and such, but due to the C11 and Annex K requirement, I do not expect a lot of results...

Tentative LGTM, will still need to run the "usual test projects" in the CI for confirmation. What's missing from this patch are the .rst files for the CERT-specific aliases of the check, but those are trivial and we can "add them in post" anyway.

Alright, the tests came back from the CI positively and there were no crashes on like 15 projects (about 50:50 C and C++). There are some nice results:

  • In LLVM, in LexPPMacroExpansion.cpp ExpandBuiltinMacro() there is a hit for a call Result = asctime(TM);. The suggestion is strftime which is available in all C++ versions and >= C99, so this seems like a valid finding. Result is a const char * which is initialised by this call.
  • In Bitcoin, 2 findings in logging.cpp doing setbuf(out, nullptr);. Not sure if these are legit findings because these seem to be only "resetting" some buffering, but the warning message is legible.
  • In Postgres:
    • 2 findings in isolationtester.c, same setbuf(stdout, NULL); and setbuf(stderr, NULL);. setvbuf/4 is available from C99 so this seems like a valid finding from a language version standpoint.
    • In initdb.c, a call to rewind with the comment preceding: /* now reprocess the file and store the lines */.
  • In SQLite:
    • In lemon.c, a call to rewind [...]
    • In shell.c, 3 calls to rewind [...]
  • In OpenSSL:
    • In randfile.c and apps.c, 2 setbuf(fileptr, NULL); calls [...]
  • In Vim, 4 calls to rewind() in xxd.c, tag.c, and term.c.
    • The case in term.c is extra funny because directly following the rewind(fd) call there is a while (!feof(fd)) ...
  • In MemcacheD, a setbuf(stderr, NULL); with the comment /* set stderr non-buffering ... */

It seems not many projects use Annex K at least from this test set. Note that we're primarily working on Linux, where Annex K isn't in the standard library. It has also been reported to not that useful, although it's good that the check uses them optionally. In general, we seem to have made a good enough framework for registering "unsafe standard APIs" later on with those stringswitches.

Regarding the performance, I did not see any meaningful slow-down. Most of the projects analysed in a few seconds or minutes, similar to other Tidy runs in the same infrastructure.

whisperity accepted this revision.Feb 2 2023, 5:11 AM

Alright, I've done a full reanalysis after a rebase. Overhead is not meaningfully measurable, even for complex TUs such as LLVM. The check is viable in C++ code as it finds cases (such as the one described in LLVM, but also Bitcoin is a primarily C++ project), so I won't rework the check to disable it in C++ mode explicitly. It seems the name lookup is implemented pretty well, helped by the fact that the names to match are short. No crashes had been observed in the test projects; let's hope it stays the same way; the matchers themselves are simple enough. The Annex K. matcher is only registered if in >= C11 mode.
I've also gone through the discussion earlier, and it looks to me as if all the concerns were either resolved or made obsolete due to the evolution of the check.

I did make several NFC changes to the patch in post-production, such as fixing the documentation here and there, ensuring that the cert- aliases are appropriately documented, and a little bit of refactoring so internal details of the check are genuinely internal. Thus, I'll be committing a version that is different from the one here.

This revision is now accepted and ready to land.Feb 2 2023, 5:11 AM
This revision was automatically updated to reflect the committed changes.

Oddly enough, one of the buildbots caught a not matching test that was working locally... on it.

whisperity added a comment.EditedFeb 2 2023, 6:16 AM

Alright, right now, I have no meaningful idea why this failure appears, locally I'm trying every test command as they appear in the test, and all the tests are passing. It's as if on that system the whole Annex K support would not be allowed. Locally I added a few debug prints and I'm getting the right answers for "Whether Annex K is allowed?".

Dumping out the value:

$ clang-tidy --checks='-*,bugprone-unsafe-functions' ~/llvm-project/clang-tools-extra/test/clang-tidy/checkers/bugprone/unsafe-functions.c -- -D__STDC_LIB_EXT1__=1 -D__STDC_WANT_LIB_EXT1__=1

getReplacementFor(gets, 1);

And I'm getting

/home/whisperity/llvm-project/clang-tools-extra/test/clang-tidy/checkers/bugprone/unsafe-functions.c:17:3: warning: function 'gets' is insecure, was deprecated and removed in C11 and C++14; 'gets_s' should be used instead [bugprone-unsafe-functions]

And if the __STDC_ macros are defined for 0 instead of 1, the dummy output shows 0 too and the suggested replacement is fgets.

The interesting part is that builds for other platforms, such as x86_64-debian passes:

PASS: Clang Tools :: clang-tidy/checkers/bugprone/unsafe-functions.c (17639 of 68905)


@njames93 @aaron.ballman Do you have any idea what could make that test fail in that system?

Ping @dyung, it looks like you're the owner of the relevant build-bot. I can't find any information what x86_64-sie is...

Meanwhile, it seems my attempt at fix is not actually fixing anything, I'm trying to figure out how to at least disable the check on this specific architecture. The rest of the architectures seem to be passing normally as expected. Something must be special within Clang's default environment or compiler configuration...

dyung added a comment.Feb 2 2023, 6:49 AM

Ping @dyung, it looks like you're the owner of the relevant build-bot. I can't find any information what x86_64-sie is...

Meanwhile, it seems my attempt at fix is not actually fixing anything, I'm trying to figure out how to at least disable the check on this specific architecture. The rest of the architectures seem to be passing normally as expected. Something must be special within Clang's default environment or compiler configuration...

Sorry responses from me might be delayed today as I am on holiday, but that build bot builds with PS4 as the default target. This is the cmake line that is used:

cmake ../llvm-project/llvm -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_BUILD_TYPE=Release -DCLANG_ENABLE_ARCMT=OFF -DCLANG_ENABLE_CLANGD=OFF -DLLVM_BUILD_RUNTIME=OFF -DLLVM_CCACHE_BUILD=ON -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4 -DLLVM_ENABLE_ASSERTIONS=ON '-DLLVM_LIT_ARGS=--verbose -j100' -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_USE_LINKER=gold '-DLLVM_ENABLE_PROJECTS=clang;cross-project-tests;llvm;clang-tools-extra;lld' -GNinja

Hopefully this can help you to reproduce the problem.

-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4

Hopefully this can help you to reproduce the problem.

Ah, thank you very much. And indeed, giving --target=x86_64-scei-ps4 to Clang-Tidy will make the test case fail, but using x86_64-linux-gnu works perfectly.
Now knowing the exact triple I think I can delve into the preprocessor stack or something to figure out why the Annex K.-related flags, my best bet is that it's explicitly forbidden from being defined. 😉

clang-tidy /tmp/LLVM-Build/tools/clang/tools/extra/test/clang-tidy/checkers/bugprone/Output/unsafe-functions.c.tmp.c --checks='-*,bugprone-unsafe-functions' -config={} -- --target=x86_64-linux-gnu -D__STDC_LIB_EXT1__=1 -D__STDC_WANT_LIB_EXT1__=1 -nostdinc++

My first wish is to figure out how to reliably disable the test case and have this quick-fix pushed immediately so the buildbots don't continue failing, and then we can figure out that this check should be disabled on this specific platform, or something like that...

dyung added a comment.Feb 2 2023, 7:08 AM
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4

Hopefully this can help you to reproduce the problem.

Ah, thank you very much. And indeed, giving --target=x86_64-scei-ps4 to Clang-Tidy will make the test case fail, but using x86_64-linux-gnu works perfectly.
Now knowing the exact triple I think I can delve into the preprocessor stack or something to figure out why the Annex K.-related flags, my best bet is that it's explicitly forbidden from being defined. 😉

clang-tidy /tmp/LLVM-Build/tools/clang/tools/extra/test/clang-tidy/checkers/bugprone/Output/unsafe-functions.c.tmp.c --checks='-*,bugprone-unsafe-functions' -config={} -- --target=x86_64-linux-gnu -D__STDC_LIB_EXT1__=1 -D__STDC_WANT_LIB_EXT1__=1 -nostdinc++

My first wish is to figure out how to reliably disable the test case and have this quick-fix pushed immediately so the buildbots don't continue failing, and then we can figure out that this check should be disabled on this specific platform, or something like that...

One particular oddity of our platform is that we use a different default C++ and I think C standard, so if explicitly setting the C standard causes the test to pass, that is likely the case.

To XFAIL the test, grep for lines with "XFAIL" and "ps4" and you should find some examples. It was recently changed how it worked lately.

whisperity added a comment.EditedFeb 2 2023, 7:44 AM

To XFAIL the test, grep for lines with "XFAIL" and "ps4" and you should find some examples. It was recently changed how it worked lately.

Thank you, yes I found an example. I went with UNSUPPORTED instead of XFAIL because the test implements 4 or 5 separate cases and XFAILing it all would make some cases become "Unexpectedly passed"... I did a quick commit rG9225d08ccca5be900c07eb89e907c4092bbdd462 with marking it as unsupported with a bit of an explanation in the code, and now the buildbots are coming back green.