Page MenuHomePhabricator

[analyzer] Implement a prototype checker for detecting Year 2038 related issues.
Needs ReviewPublic

Authored by appcs on Oct 11 2018, 9:03 PM.

Details

Summary

Defines a prototype checker in the Clang Static Analyzer for proper definition/use of time.h related functionality. The intention for this work is to audit user code for identifying potential issues in the light of the "Year 2038" (aka: "Y2K38") problem.

Checks implemented:

0. Warn if the time_t type is not defined to be a signed integer type, at-least 64-bits in size, having file scope. When the size of the time_t type is less than 64-bits, report the actual size in the warning.

  1. Warn when a time_t value is cast to: a. A non-integer type. b. An integer type of lesser width. c. An unsigned integer type.

Diff Detail

Event Timeline

appcs created this revision.Oct 11 2018, 9:03 PM

Thank you very much for the feedback. I have added a little functionality to demonstrate the working of a prototype checker for a some sample Y2K38 issues. There are some TODO's in the test files indicative of some subsequent checking that we will add.

appcs edited the summary of this revision. (Show Details)Oct 11 2018, 9:10 PM
appcs edited the summary of this revision. (Show Details)
appcs edited the summary of this revision. (Show Details)Oct 11 2018, 9:13 PM
appcs edited the summary of this revision. (Show Details)
appcs retitled this revision from Implement a prototype checker for detecting Year 2038 related issues. to Implement a prototype checker in the Clang Static Analyzer for detecting Year 2038 related issues..Oct 11 2018, 9:29 PM
appcs edited the summary of this revision. (Show Details)
Szelethus added a subscriber: NoQ.EditedOct 12 2018, 4:48 AM

That is nice! I love the doc you put up there, I wish every checker had something like that.

If you don't mind, I'll add the [analyzer] tag to your title and reword it a bit. Many of us use Herald to automatically subscribe to revisions that has analyzer in the title name. This is the case here, but [analyzer].* is preferred.

edit: I realized what I wrote here was wrong, so I removed it. Oops :)

edit2.: I wonder whether you could turn this into a simple clang-tidy check.

include/clang/StaticAnalyzer/Checkers/Checkers.td
526

Are you planning to add more time.h related functionality? If not, maybe you could specialize this description, explicitly mentioning the Y2K38 problem.

Buuuuut mentioning would be great idea anyways, because this is a little vague just yet.

lib/StaticAnalyzer/Checkers/TimeChecker.cpp
74

IdentifierInfo actually has a handy isStr method.
T.getBaseTypeIdentifier()->isStr("time_t")

Little bit of shooting in the dark, but, as far as I'm aware, QualType::getAsString has all the sugar and typedef you want to get rid of, but IdentifierInfo::getName() (which is used in isStr() too) is stripped of that.

98

QualType is essentially a wrapper around Type, and can be used similarly to a smart pointer. For example, if you'd like to know that a type is an integer type, you can just do:

QualType T = /* whatever */;
if (T->isIntegerType())
  doStuff();

Prefer QualType::isNull() to checking QualType::getTypePtr().

There are very few cases where you actually need to obtain the underlaying Type.

This is used in a lot of places in your code, but I'll only mention it here, just grep getTypePtr :).

104
110–111
110–111

Why not reuse the string and the stream object outside this block? You have the same 2 lines of code in 3 branches.

212–221

Can you put this on top of the file? AFAIK as much as ODR lets it, usually checker classes are up there.

test/Analysis/TimeChecker-0.c
2–4

Is there a need for this file? Never seen anything like it (didn't look super hard though).

test/Analysis/TimeChecker-1.c
2
  1. Prefer %clang_analyze_cc1 to %clang + -Xclang's
  2. Always run core checkers too. -analyzer-checker=core,alpha.unix.Time
  3. You can actually use newlines to organize the RUN line, although this could just fit into 80 columns.

Sooo:

// RUN: %clang_analyze_cc1 -analyzer-checker=core,alpha.unix.Time \
// RUN:   -verify %s
Szelethus retitled this revision from Implement a prototype checker in the Clang Static Analyzer for detecting Year 2038 related issues. to [analyzer] Implement a prototype checker for detecting Year 2038 related issues..Oct 12 2018, 4:48 AM
test/Analysis/TimeChecker-1.c
2

@Szelethus your second point is actually not correct for syntactic checkers.

test/Analysis/TimeChecker-11.c
8

If those are "TODO", should the patch be marked as "[WIP]"?

Szelethus added inline comments.Oct 22 2018, 11:50 AM
test/Analysis/TimeChecker-1.c
2

Oops, sorry about that.

appcs added inline comments.Oct 22 2018, 12:15 PM
test/Analysis/TimeChecker-11.c
8

The TODO are out of the scope of this prototype patch - (which is not WIP) - I put them in because we wanted a sense of where this checker is headed; I'll take them out as a part of the overall edit based on the feedback so far, for the re-spin. Thanks.

appcs updated this revision to Diff 172030.Oct 31 2018, 3:12 PM

Incorporated code review feedback; thank you @Szelethus and @george.karpenkov

appcs marked 12 inline comments as done.Oct 31 2018, 3:22 PM
appcs added inline comments.
lib/StaticAnalyzer/Checkers/TimeChecker.cpp
63

PS: Moved class TimeChecker up here; (kindly explain the term 'ODR', thank you).

74

Also: T.getAsString() == T.getBaseTypeIdentifier()->getName().str()

98

Eliminated all uses of getTypePtr()

test/Analysis/TimeChecker-0.c
2–4

Done; Renamed TimeChecker-<i>.c to TimeChecker-<i-1>.c; Added:TimeChecker-12.c

test/Analysis/TimeChecker-1.c
2

Was 80 columns after change

test/Analysis/TimeChecker-11.c
8

Removed all TODO's (those warning messages will appear in subsequent functionality).

appcs marked 9 inline comments as done.Oct 31 2018, 3:22 PM
Szelethus added inline comments.Oct 31 2018, 3:51 PM
lib/StaticAnalyzer/Checkers/TimeChecker.cpp
63

Sorry about that O:) One Definition Rule: https://en.wikipedia.org/wiki/One_Definition_Rule

appcs marked an inline comment as done.Nov 1 2018, 10:37 AM
appcs added inline comments.
lib/StaticAnalyzer/Checkers/TimeChecker.cpp
63

Thank you; understood the ODR. However, I am not able to get the implication of the ODR on the placement of the class TimeChecker - sorry, if I am missing something basic here - please let me know. Is the placement of the TimeChecker class as you expected it? (If so, is this patch good to commit?) Thank you!

Szelethus added inline comments.Nov 1 2018, 10:42 AM
lib/StaticAnalyzer/Checkers/TimeChecker.cpp
63

In my checker (UninitializedObject), if I recall correctly, the checker class uses some types I need to define earlier, so I can't place it on top of the file because of that.

My point basically was that try to keep declarations on top, definitions below that. Your solution is perfect imo.

I'll revisit this patch shorty, but I'm a bit busy at the moment :)

appcs marked 3 inline comments as done.Nov 1 2018, 10:50 AM
appcs added inline comments.
lib/StaticAnalyzer/Checkers/TimeChecker.cpp
63

Thank you, yes, I understand and will bear in mind the point about declarations on top and definitions beneath. Appreciate your help very much, thank you.

Let's talk about more general topics: Your checker's name is "TimeChecker", not "Y2K38Checker" or something similar, do you have plans on expanding this checker to cover more time.h related functionality?

appcs marked 2 inline comments as done.Nov 1 2018, 4:21 PM

Let's talk about more general topics: Your checker's name is "TimeChecker", not "Y2K38Checker" or something similar, do you have plans on expanding this checker to cover more time.h related functionality?

Yes, going forward, I plan to detect general indications of use of improperly implemented time.h functionality or improper use of time.h facility, e.g.:

  • Warn if a typedef for time_t does not come from a header file with the standard header name: 'time.h'
  • Warn if a typedef for the time_t type is not found in the translation unit.
  • Verify time.h function signatures.
  • Not using double difftime(time_t time1, time_t time0) to find the time difference between time1 and time0. ...

    While Y2K38 is the immediate problem needing addressing and the motivation for taking a closer look at time_t usage in one's program, the idea is to be sure of using time.h facility correctly and fully, in general; while at it. Also, vice-versa, if the user is checking their usage of time.h related functionality, then now is the time to act on any lurking Y2K38 issues. I imagine that there is analysis to be developed, modeled on how a user would trace through their time_t values (def and use) as they debug issues, so that the TimeChecker makes inferences that the user would eventually make.

Let's talk about more general topics: Your checker's name is "TimeChecker", not "Y2K38Checker" or something similar, do you have plans on expanding this checker to cover more time.h related functionality?

Yes, going forward, I plan to detect general indications of use of improperly implemented time.h functionality or improper use of time.h facility, e.g.:

  • Warn if a typedef for time_t does not come from a header file with the standard header name: 'time.h'
  • Warn if a typedef for the time_t type is not found in the translation unit.
  • Verify time.h function signatures.
  • Not using double difftime(time_t time1, time_t time0) to find the time difference between time1 and time0. ...

    While Y2K38 is the immediate problem needing addressing and the motivation for taking a closer look at time_t usage in one's program, the idea is to be sure of using time.h facility correctly and fully, in general; while at it. Also, vice-versa, if the user is checking their usage of time.h related functionality, then now is the time to act on any lurking Y2K38 issues. I imagine that there is analysis to be developed, modeled on how a user would trace through their time_t values (def and use) as they debug issues, so that the TimeChecker makes inferences that the user would eventually make.

I'm sadly not an expert on this particular topic, but it sounds good to me! The main reason why I asked this question is that I was curious whether you only do AST based analysis, and it seems to me that you do. A natural question that arises with AST based checkers is whether a simpler clang-tidy check could do the job just as well. Were there any particular reasons why you picked the static analyzer instead?

I realize that I should've asked this question before a barrage of lower lever observations :/ I'm not at all saying that this patch should be remade, please don't get me wrong, but in any case, sorry about this, certainly a lesson for me too on how I should leave feedback on patches.

Sorry, I wasn't aware you're looking for more feedback. Feel free to ping once a week.

I have two questions.

  1. As I've stated before, have you looked into the possibility of doing a clang-tidy check?
  2. Is there any particular reason behind using RecursiveASTVisitor, as opposed to AST matchers?
NoQ added a comment.Dec 6 2018, 3:51 PM

I think the idea is great, thanks!

One important thing to notice here is that it seems that time_t is allowed to be a float/double in C11 and is unspecified in C++ (i.e., might as well be a struct). So i suspect that we can't really ban these types entirely. But having it to be an integer type that's too small is definitely bad.

Another thing to think about eventually is how much of this checker can be enabled by default and how much can only be opt-in. For instance, how many projects define their own time_t in a valid manner for good reasons and therefore don't want our advice on where to put it?

Also feel free to make a new directory for your tests in test/Analysis/... and give files within it expressive names.

I have a few minor comments and i think we can land an alpha version so that to unblock your further work.

lib/StaticAnalyzer/Checkers/TimeChecker.cpp
119

Do we care about vector types here? Maybe stick to the usual isIntegerType()/isSignedIntegerType()/isUnsignedIntegerType()?

158

Why is this considered to be a bug?

In particular, i don't see anything wrong with casting time_t to double or with casting it to void (which is a common way of suppressing "unused variable" or "unused return value" warnings).

179

For "file scope", i'd try something like isa<TranslationUnitDecl>(TD->getDeclContext()). That'll include typedefs within classes and namespaces.

appcs added a comment.Dec 7 2018, 3:18 PM

Thank you @Szelethus for the guidance. Thank you @NoQ for further input. Please see inline:

In D53185#1322212, @NoQ wrote:

I think the idea is great, thanks!

Thank you!

One important thing to notice here is that it seems that time_t is allowed to be a float/double in C11 and is unspecified in C++ (i.e., might as well be a struct). So i suspect that we can't really ban these types entirely. But having it to be an integer type that's too small is definitely bad.

I'll relax the constraint so that when time_t is defined to be an integer type, it must be at least 64-bits. I'll add tests so that when C11, it does not complain on 'float'/'double' and also so that when C++, I'll add tests that ensure that it does not complain about whatever (variety of) type is used to define time_t.

Another thing to think about eventually is how much of this checker can be enabled by default and how much can only be opt-in. For instance, how many projects define their own time_t in a valid manner for good reasons and therefore don't want our advice on where to put it?

Sure, keeping some behaviour default and some of it opt-in will make it more usable; we'll do that eventually.

Also feel free to make a new directory for your tests in test/Analysis/... and give files within it expressive names.

Will do.

I have a few minor comments and i think we can land an alpha version so that to unblock your further work.

Will re-spin along with the above and the other three review feedback comments for TimeChecker.cpp

Thank you, so much!