This is an archive of the discontinued LLVM Phabricator instance.

Initial commit of mktime.
ClosedPublic

Authored by rtenneti on Nov 16 2020, 10:03 AM.

Details

Summary

This introduces mktime to LLVM libc, based on C99/C2X/Single Unix Spec.

Co-authored-by: Jeff Bailey <jeffbailey@google.com>

This change doesn't handle TIMEZONE, tm_isdst and leap seconds. It returns -1 for invalid dates. I have verified the return results for all the possible dates with glibc's mktime.

TODO:
+ Handle leap seconds.
+ Handle out of range time and date values that don't overflow or underflow.
+ Implement the following suggestion Siva - As we start accumulating the seconds, we should be able to check if the next amount of seconds to be added can lead to an overflow. If it does, return the overflow value. If not keep accumulating. The benefit is that, we don't have to validate every input, and also do not need the special cases for sizeof(time_t) == 4.
+ Handle timezone and update of tm_isdst

Diff Detail

Event Timeline

rtenneti created this revision.Nov 16 2020, 10:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2020, 10:03 AM
rtenneti requested review of this revision.Nov 16 2020, 10:03 AM
rtenneti added a subscriber: jeffbailey.
rtenneti edited the summary of this revision. (Show Details)Nov 16 2020, 10:17 AM

Drive-by: POSIX.1-2017: "Local timezone information shall be set as though mktime() called tzset()." Is "TZ" handled in this patch?

libc/config/linux/api.td
34 ↗(On Diff #305544)

Seems that tm_isdst is entirely unhandled.

libc/src/time/mktime.cpp
11
15

namespace-scope const/constexpr variables are automatically of internal linkage. You don't need an anonymous namespace (which is also discouraged) in this case.

36

Have two arrays for leap/nonleap wastes space (libc.a space matters a lot). Having one array plus leap adjustment is sufficient.

rtenneti edited the summary of this revision. (Show Details)Nov 16 2020, 10:26 AM
rtenneti updated this revision to Diff 305623.Nov 16 2020, 4:39 PM
rtenneti marked 2 inline comments as done.
rtenneti edited the summary of this revision. (Show Details)

Remove anonymous namespace, fix typos in ILP64 msg.
Removed duplicate arrays for days calculation.

Raman, looks like you lost other files when you updated the patch.

rtenneti updated this revision to Diff 305819.Nov 17 2020, 9:08 AM
rtenneti marked 2 inline comments as done.

Fixed the comments from Fangrui.

sivachandra added inline comments.Nov 17 2020, 12:22 PM
libc/src/time/mktime.cpp
45

The standard says,

"The values in time are not checked for being out of range."

So, is all this checking necessary?

53

Could we perhaps define a OVERFLOW_VALUE instead of doing (time_t)(-1) everywhere? You can define it in the implementation header as a static constexpr which then allows us to use it in the tests as well.

64

Could we do this overflow check in a more general fashion? For example, we can find a maxYears that can fit in TIME_T_MAX (I am making up TIME_T_MAX). After that, assuming valid inputs for other fields, we can progressively check if the result will fit in time_t.

66

Its not clear to me as to what would/should change on a ILP64 platform. Can add a comment explaining this?

libc/test/src/time/mktime_test.cpp
23 ↗(On Diff #305819)

Except for this line, it doesn't seem like you need this function initialize_tm_data. A pattern like this does not work?

tm temp{...};
...
temp = {...};
33 ↗(On Diff #305819)

You can inline lines 31 to 33 to:

EXPECT_EQ(__llvm_libc::mktime(&temp), OVERFLOW_VALUE);

And use the same pattern at other places where relevant.

jeffbailey added inline comments.Nov 17 2020, 12:46 PM
libc/src/time/mktime.cpp
45

The Single Unix Spec says this:

The mktime() function shall return the specified time since the Epoch encoded as a value of type time_t. If the time since the Epoch cannot be represented, the function shall return the value (time_t)-1 [CX] [Option Start] and set errno to indicate the error.

So our bug is that we're not setting errno here to EOVERFLOW.

53

This should be OUT_OF_RANGE since it serves underflow as well.

66

We'll add a comment. In brief: because struct tm is ints, a 32-bit int year can be represented in a 64-bit time_t. However, a 64-bit int year cannot. Without a good way to run tests on it, I'd rather just assume a 32-bit int and make sure it catches fire at build time when that's violated.

libc/test/src/time/mktime_test.cpp
23 ↗(On Diff #305819)

This is the case where I was hoping we could use a C99/C++20 designated initializer so that it was readable and clean.

rtenneti updated this revision to Diff 306888.Nov 21 2020, 6:54 PM
rtenneti marked 10 inline comments as done.

Fixed comments.

rtenneti updated this revision to Diff 306926.Nov 22 2020, 10:37 AM

Fixed code review comments.

Jeff and I have made the changes. Please take another look. Thanks.

libc/config/linux/api.td
34 ↗(On Diff #305544)

Added a TODO to handle tm_isdst . Also needs to read timezone file.

libc/src/time/mktime.cpp
45

Hi Siva and Jeff,

Added a TODO to handle out of range date and time (it seems mktime in C converts invalid dates to valid dates.  For example  "October 40th is silently converted to November 9th")
64

Add a TODO to investigate the above idea.

libc/test/src/time/mktime_test.cpp
23 ↗(On Diff #305819)

Done.

jeffbailey added inline comments.Nov 22 2020, 4:52 PM
libc/src/time/mktime.cpp
64

Is the TODO still necessary with the creation of outOfRange() that sets errno and returns -1?

rtenneti updated this revision to Diff 306952.Nov 22 2020, 6:35 PM

Fixed Jeff's comments.

rtenneti updated this revision to Diff 306955.Nov 22 2020, 6:41 PM
rtenneti marked an inline comment as done.

Fixed the todo.

Removed the TODO. Thanks Jeff.

Mostly LGTM. I have left a few nits along with a not-a-nit comment.

From offline discussions, I understand you are preparing follow ups. It should be OK to land WiP pieces as long as you have clear TODOs about what is missing.

libc/src/time/mktime.cpp
42

Nit: Use static_cast instead of C-style casts, similar to how you do in tests. Even better, define another constant like this:

constexpr time_t OutOfRangeReturnValue = -1;
45

Can you add TODOs for the pieces missing in this implementation? Just rephrase what you have in the commit message as a TODO.

64

Along with the above comment about checking for invalid input, what I meant was this: As we start accumulating the seconds, we should be able to check if the next amount of seconds to be added can lead to an overflow. If it does, return the overflow value. If not keep accumulating. The benefit is that, we don't have to validate every input, and also do need the special cases for sizeof(time_t) == 4.

libc/test/src/time/mktime_test.cpp
30 ↗(On Diff #306955)

Nit: To avoid confusions, can you rename this function to call_mktime?

44 ↗(On Diff #306955)

Nit: Define a global perhaps to avoid repeated listings of static_cast. Something like:

static constexpr time_t OutOfRangeReturnValue = -1;
rtenneti updated this revision to Diff 308412.Nov 30 2020, 10:13 AM
rtenneti marked 2 inline comments as done.

Fixed Siva's comments and added a TODO.

sivachandra accepted this revision.Nov 30 2020, 10:26 AM
sivachandra added inline comments.
libc/src/time/mktime.cpp
53

Ah, a typo in my comment got carried over here.... "... also, do NOT need the special cases ..."

This revision is now accepted and ready to land.Nov 30 2020, 10:26 AM
rtenneti updated this revision to Diff 308420.Nov 30 2020, 10:30 AM

Added "not" to the comment.

rtenneti marked an inline comment as done.Nov 30 2020, 5:08 PM
rtenneti added inline comments.
libc/src/time/mktime.cpp
64

Added a TODO. The above is my intetntion when implementing handle of invalid data (of some one passing INT_MAX in either seconds, minutes, hours ...).

rtenneti marked an inline comment as done.Nov 30 2020, 5:09 PM
rtenneti edited the summary of this revision. (Show Details)Nov 30 2020, 5:13 PM
rtenneti edited the summary of this revision. (Show Details)
rtenneti edited the summary of this revision. (Show Details)
rtenneti edited the summary of this revision. (Show Details)
rtenneti marked 3 inline comments as done.Nov 30 2020, 5:15 PM
This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.Dec 1 2020, 10:47 AM
libc/config/linux/api.td
41 ↗(On Diff #308529)

This may not be a good declaration for 32-bit systems (their support may have lower priority, though, I guess). https://sourceware.org/glibc/wiki/Y2038ProofnessDesign

Then you can avoid all the sizeof(time_t) == 4 special cases.

jeffbailey added inline comments.Dec 1 2020, 11:03 AM
libc/config/linux/api.td
41 ↗(On Diff #308529)

The declaration of time_t comes from the following:

The range and precision of times representable in clock_t and time_t are
implementation-defined. (C99 7.23.1(4))

time_t shall be an integer type. (https://pubs.opengroup.org/onlinepubs/9699919799/toc.htm)

IBM has an analysis: https://www.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.ref.dev.doc/q104610_.htm

In practice, it will be somewhat more nuanced because of backwards compatibility. I expect that *historical* 32-bit systems will need to maintain a 32-bit signed time_t for compat. However, *new* ones (like em32_t) would use a 64-bit time_t. Since this libc doesn't any support for that we've used this. Someone implementing em32_t would have to go over I think many such types, so I'm not worried about this being extra work for them later if/when that happens.