Page MenuHomePhabricator

[clang-tidy] Add cert-msc24-c checker.
Needs ReviewPublic

Authored by futogergely on Nov 7 2020, 3:24 AM.

Details

Summary

Checks for deprecated and obsolescent functions listed in CERT C Coding
Standard Recommendation MSC24-C.

For the listed functions, an alternative, more secure replacement is
suggested, if available. The checker heavily relies on the functions
from annex K (Bounds-checking interfaces) of C11.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added a project: Restricted Project. · View Herald TranscriptNov 7 2020, 3:24 AM
ktomi996 requested review of this revision.Nov 7 2020, 3:24 AM
ktomi996 updated this revision to Diff 303633.Nov 7 2020, 3:37 AM
steakhal added a subscriber: steakhal.
lebedev.ri retitled this revision from MSC24-C Obsolescent Functions check to [clang-tidy] CERT MSC24-C Obsolescent Functions check.Nov 7 2020, 4:03 AM
Eugene.Zelenko added inline comments.
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
20

Why this could not be member of check class?

53

Why not to use plain assignment? See readability-simplify-boolean-expr.

60

Early return will be better.

70

Please elide braces.

81

Please elide braces.

85

Could be const auto *, because type is spelled in same statement.

86

Could be const auto *, because type is spelled in same statement.

93

Could be const auto *, because type is spelled in same statement.

94

Could be const auto *, because type is spelled in same statement.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
13

Please separate include statement and namespaces with empty line.

14

Please separate with empty line.

clang-tools-extra/docs/clang-tidy/checks/cert-obsolescent-functions.rst
6 ↗(On Diff #303633)

Please make first statement same as in Release Notes. Please enclose all preprocessor variables and function names in double back-ticks.

15 ↗(On Diff #303633)

Link to original rule should be at the end. See other documentation as example.

17 ↗(On Diff #303633)

Please remove space before colon and separate code with empty line.

18 ↗(On Diff #303633)

Should be .. code-block:: c++. See other documentation as formatting example.

clang-tools-extra/test/clang-tidy/cert-obsolescent-functions.cpp
5 ↗(On Diff #303633)

Please separate with empty line.

Eugene.Zelenko set the repository for this revision to rG LLVM Github Monorepo.Nov 7 2020, 7:25 AM

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.
What do you think?

There is a mismatch between your code and docs too, regarding the function you check for.
In the code you match for vsprintf, vsscanf, vswprintf, vswcanf, wcrtomb, wcscat, but none of these are mentioned in the docs.

The checker warns only if STDC_LIB_EXT1 macro is defined and the value of STDC_WANT_LIB_EXT1 macro is 1 in this case it suggests the corresponding functions from Annex K instead the vulnerable function.

I would suggest mentioning these macros and their purpose in the docs.
Eg. that the STDC_WANT_LIB_EXT1 should be defined to 1 but the other is left to the implementation.

That being said, I would request more tests, demonstrating that this macro detection works accordingly.

This checker might be a bit noisy. Have you tried it on open-source projects?
If it is, we should probably note that in the docs as well.

In the tests, It is a good practice to demonstrate that the offered recommendation does not trigger yet another warning.
Don't forget to put a no-warning to highlight that.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
31

Hm, it must be a typo.
vswcanf -> vswscanf

By the same token, I miss these functions from this enumeration.
vfwprintf, vfwscanf, vprintf, vwprintf, vwscanf
However, they are present in the cert table you mentioned.

34

Why is strlen an obsolescent function? I don't feel it justified, even if there was a comment about it on the cert page.

70

Please also check if the value is 1, as your summary and the standard requires.

78–80
88–90

We should probably use llvm::Twine instead of multiple std::string constructions.
I might be wrong though.

90

can -> should
The same at the other diagnostic message.

92

This and the previous if statement is exclusive, am I right?
If we have a match, that is either a callexpr or a vardecl.

I would prefer a return at the end of the previous if and an assert here that the FunctionPointer must be non-null.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
28

It should be private.

clang-tools-extra/docs/clang-tidy/checks/cert-obsolescent-functions.rst
10 ↗(On Diff #303633)

Another typo.

19–23 ↗(On Diff #303633)

I'm not convinced by this example.

AFAIK, the intended usage is to define the __STDC_WANT_LIB_EXT1__ to 1 before the string.h header included.
It's important to provide a good example, as this checker will be silent if not used correctly.

23 ↗(On Diff #303633)

We should probably align this comment as well.

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

What do you think?

There is a mismatch between your code and docs too, regarding the function you check for.
In the code you match for vsprintf, vsscanf, vswprintf, vswcanf, wcrtomb, wcscat, but none of these are mentioned in the docs.

The checker warns only if STDC_LIB_EXT1 macro is defined and the value of STDC_WANT_LIB_EXT1 macro is 1 in this case it suggests the corresponding functions from Annex K instead the vulnerable function.

I would suggest mentioning these macros and their purpose in the docs.
Eg. that the STDC_WANT_LIB_EXT1 should be defined to 1 but the other is left to the implementation.

That being said, I would request more tests, demonstrating that this macro detection works accordingly.

This checker might be a bit noisy. Have you tried it on open-source projects?
If it is, we should probably note that in the docs as well.

In the tests, It is a good practice to demonstrate that the offered recommendation does not trigger yet another warning.
Don't forget to put a no-warning to highlight that.

steakhal requested changes to this revision.Dec 23 2020, 4:38 AM

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.
IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

This revision now requires changes to proceed.Dec 23 2020, 4:38 AM

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.

I suspect that should be a flag in the check's options.

IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.

I suspect that should be a flag in the check's options.

I don't see any value in disabling the check for eg. setbuf. Even the man page says that you should use the setvbuf instead.
Why would anyone disable this if one already wants diagnostics for the CERT rule it implements?

IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

Quoting the revision summary:

This checker guards against using some vulnerable C functions which are mentioned in MSC24-C in obsolescent functions table.

Why don't we check the rest of the functions as well?
asctime, atof, atoi, atol, atoll, ctime, fopen, freopen, rewind, setbuf

Hm, I get that cert-err34-c will already diagnose the uses of atof, atoi, atol, atoll, but then why do we check vfscanf, vscanf then?
We should probably omit these, while documenting this.
On the other hand, I would recommend checking asctime, ctime, fopen, freopen, rewind, setbuf for the sake of completeness.

I only check functions which are in Unchecked Obsolescent Functions without setbuf because setbuf does not have a safer alternative in Annex K.
https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions

From the user's point of view, it does not matter if the safer alternative is inside annex K or not.

I suspect that should be a flag in the check's options.

I don't see any value in disabling the check for eg. setbuf. Even the man page says that you should use the setvbuf instead.
Why would anyone disable this if one already wants diagnostics for the CERT rule it implements?

IMO if the CERT rule says that don't use any of these functions but use these other functions instead.
If we don't check all of them in the list, this checker is incomplete. The name of this checker might lead the user to a false sense of security.

I think the question is, *why* are these checks being implemented?
Just to claim that for some particular rule there is a check, and cross it off a list?
Or for them to be actually used?

I think the question is, *why* are these checks being implemented?
Just to claim that for some particular rule there is a check, and cross it off a list?

Initially, yes. I think one could learn a lot from contributing to any project.
It's not inherently a bad thing to combine these two.

Or for them to be actually used?

I want a useful checker, that's why I highlighted some of my concerns and suggested a way forward.
It might not be useful to everyone as it tries to implement a domain-specific CERT rule, but it's still up to the user to enable this.
I think we should ask, who is the audience of this checker?
I assume only the users who are interested in the CERT guideline would use this. At this point, we should be clear about what we are checking for.
I think it's OK, to say that only a part of the rule is implemented, and we should carefully document this fact. But IMO one should go the extra mile to try hard and implement the other parts of the rule as well.
Like, matching on the setbuf is not that hard really. The rest of the missing functions probably fall into the same category.

I might be wrong on this though.
Keep in mind that I'm not tidy dev, so take my opinion with a pinch of salt.

I think the question is, *why* are these checks being implemented?
Just to claim that for some particular rule there is a check, and cross it off a list?
Or for them to be actually used?

We have an existing CERT module for CERT-specific checks. When the check is honoring a published coding standard, that's usually sufficient justification -- but I do think the check should implement what the CERT recommendation says. If the CERT recommendation seems deficient in some way, it's reasonable to reach out to the authors (they allow comments on their wiki and are pretty good about responding to those comments) for clarification or to discuss potential improvements to the recommendation that would make the check more palatable.

whisperity edited the summary of this revision. (Show Details)Jan 5 2021, 2:41 AM
whisperity added a subscriber: gerazo.
futogergely commandeered this revision.Nov 29 2021, 12:46 AM
futogergely added a reviewer: ktomi996.
futogergely retitled this revision from [clang-tidy] CERT MSC24-C Obsolescent Functions check to [clang-tidy] Add cert-msc24-c checker..
futogergely edited the summary of this revision. (Show Details)
steakhal resigned from this revision.Nov 29 2021, 2:24 AM

I've already reviewed this change offline, thus I resign. For me this looks great, I'm eager to hear your opinions!

x64 debian failed

Should/does this work in C++ mode for std::whatever?

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
49

Is this ordering specific in any way? Is the rule listing them in this order? If not, can we have them listed alphabetically, for easier search and potential later change?

clang-tools-extra/docs/clang-tidy/checks/cert-msc24-c.rst
11

(And consistent capitalisation later.)

30
futogergely marked 2 inline comments as done.
futogergely marked an inline comment as done.Dec 3 2021, 3:44 AM
futogergely added inline comments.
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
49

done

clang-tools-extra/docs/clang-tidy/checks/cert-msc24-c.rst
11

done

futogergely marked an inline comment as done.Dec 3 2021, 4:01 AM

Should/does this work in C++ mode for std::whatever?

Right now the checker finds the functions in the global namespace only. The recommendation is listed only in the C part of the CERT rules, and as far as I know, Annex K functions are defined in the global namespace only, (or at least based on the standard). I can't really decide if the checker should look for the functions in the std namespace as well or not...

L129 and L135 are uncovered by tests. The rest of the lines are covered by tests, according to lcov.

The checker produced in total 15 reports on these 18 projects:
memcached,tmux,curl,twin,vim,openssl,sqlite,ffmpeg,postgres,tinyxml2,libwebm,xerces,bitcoin,protobuf,qtbase,codechecker,contour,llvm-project
You can check the reports by following this link:
https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?is-unique=on&detection-status=New&detection-status=Reopened&detection-status=Unresolved&run=%2aD91000-diff-4&checker-name=cert-msc24-c&diff-type=New

It seems like none of these projects actually use the annex K functions, which is not really a surprise.
VLC and lighttpd seems to use it. @futogergely could you please run your check on those projects?

clang-tools-extra/docs/clang-tidy/checks/cert-msc24-c.rst
14

remove trailing whitespaces

Thanks for this new check!

In terms of whether we should expose the check in C++: I think we shouldn't. Annex K *could* be supported by a C++ library implementation (http://eel.is/c++draft/library#headers-10), but none of the identifiers are added to namespace std and there's not a lot of Annex K support in C so I imagine there's even less support in C++. We can always revisit the decision later and expand support for C++ once we know what that support should look like.

I think we should probably also not enable the check when the user compiles in C99 or earlier mode, because there is no Annex K available to provide replacement functions.

We should also circle back with the CERT authors for updates on missing entries in their tables once we've figured out what is missing, and on the terminology concerns of referring to these as "deprecated or obsolescent" functions.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
34

I think its exclusion from the CERT page was an oversight. strlen_s() is a secure replacement for strlen() and it would be weird to leave it off the list.

41–42

This comment is not accurate. gets_s() is a secure replacement for gets().

49–59

This list appears to be missing quite a few functions with secure replacements in Annex K. For example: tmpfile_s, tmpnam_s, strerrorlen_s, strlen_s... can you check the list against the actual Annex K, as it seems the CERT recommendation is still out of date.

84–86

Fixed a few nits with the code, but gets() was never deprecated, so the message is not correct (it was present in C99 and removed in C11 with no deprecation period). I think it may be better to say "function %0 was removed in C11".

103
clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
18

The terminology used in the CERT recommendation is pretty unfortunate and I don't think we should replicate it. Many of these are *not* deprecated or obsolescent functions and calling them that will confuse users. The crux of the CERT recommendation is that these functions have better replacements in more modern versions of C. So I would probably try to focus our diagnostics and documentation around modernization rather than deprecation.

FWIW, this is feedback that should also go onto the CERT recommendation itself. I noticed someone already observed the same thing: https://wiki.sei.cmu.edu/confluence/display/c/MSC24-C.+Do+not+use+deprecated+or+obsolescent+functions?focusedCommentId=215482395#comment-215482395

"L129 and L135 are uncovered by tests. The rest of the lines are covered by tests, according to lcov."
This happens if STDC_WANT_LIB_EXT1 is defined empty (L129) or STDC_WANT_LIB_EXT1 is not literal (numeric constant, ...).

"It seems like none of these projects actually use the annex K functions, which is not really a surprise.
VLC and lighttpd seems to use it. @futogergely could you please run your check on those projects?"

lighttpd: the checker issued 386 warnings. The reason is that STDC_WANT_LIB_EXT1 is defined in a header which is included everywhere. However, functions from Annex K are used only in one source file.
VLC: 2 warnings, usage of the rewind function.

Both of these projects can be compiled without Annex K as well, which makes the code cumbersome. Maybe this check can be really useful in a project where Annex K is a must.

Maybe we could remove the check for setbuf() and rewind() functions, making this a pure Annex K checker. There is an overlapping with another recommendation (https://wiki.sei.cmu.edu/confluence/display/c/ERR07-C.+Prefer+functions+that+support+error+checking+over+equivalent+functions+that+don%27t), these functions are also listed there.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.cpp
41–42

If gets is removed from C11, and gets_s is introduced in C11, then gets_s cannot be a replacement or? Maybe fgets?

Also I was wondering if we would like to disable this check for C99, maybe we should remove the check for gets all together.

49–59

Missing functions added: tmpfile/tmpfile_s, tmpnam/tmpnam_s, memset/memset_s, scanf, strlen, wcslen

84–86

Done

103

Done.

clang-tools-extra/clang-tidy/cert/ObsolescentFunctionsCheck.h
18

Changed it to:
Checks for functions listed in CERT C Coding Standard Recommendation MSC24-C
that have safer, more secure replacements. The functions checked are considered unsafe because for
example they are missing bounds checking and/or non-reentrant. For the listed functions a
replacement function is suggested, if available. The checker heavily relies on the functions from
Annex K (Bounds-checking interfaces) of C11.

Also I changed the "is obsolescent" to "is not bounds-checking" in the getRationale function

The functions asctime and asctime_r are discouraged according to CERT MSC33-C rule. These could be added to this check as well. There is a clang SA checker SecuritySyntaxChecker that contains other obsolete functions (and the whole check looks like it can be done in clang-tidy).

The functions asctime and asctime_r are discouraged according to CERT MSC33-C rule. These could be added to this check as well. There is a clang SA checker SecuritySyntaxChecker that contains other obsolete functions (and the whole check looks like it can be done in clang-tidy).

+1

The functions asctime and asctime_r are discouraged according to CERT MSC33-C rule. These could be added to this check as well. There is a clang SA checker SecuritySyntaxChecker that contains other obsolete functions (and the whole check looks like it can be done in clang-tidy).

The inclusion of CERT MSC33-C rule seems to be straightforward: check for asctime and asctime_r, and suggest asctime_s if Annex K is available, otherwise suggest strftime.

security.insecureAPI: the following functions could be added to the checker: bcmp, bcopy, bzero, getpw, mktemp, vfork, and if arc4random is available: drand48, erand48, jrand48, lcong48, lrand48, mrand48, nrand48, random, rand_r.
I think for now it is enough to issue a warning of using these functions, and not suggest a replacement. Should we add an option to the checker to also check for these functions?

I think for now it is enough to issue a warning of using these functions, and not suggest a replacement. Should we add an option to the checker to also check for these functions?

IMHO, it is okay to start with just simply issuing the warning. Later we might add that option (in a subsequent patch).