This is an archive of the discontinued LLVM Phabricator instance.

Implement TZ Variable Parsing Function.
Needs ReviewPublic

Authored by rtenneti on Jul 22 2022, 6:12 PM.

Details

Reviewers
sivachandra
Summary

This function takes in a string in the format of
https://datatracker.ietf.org/doc/html/rfc8536 and returns the
struct that will be used by LLVM libc for the date functions.

This commit does not connect this parser to any date functions,
that will be done in subsequent changes.

Tested:
Unit tests

Co-authored-by: Jeff Bailey <jeffbailey@google.com>

Diff Detail

Event Timeline

rtenneti created this revision.Jul 22 2022, 6:12 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 22 2022, 6:12 PM
rtenneti requested review of this revision.Jul 22 2022, 6:12 PM
rtenneti added a subscriber: jeffbailey.

One thing: the test builds should fail with this as this patch depends on my optional.h patch that I'm still reworking. (I'm trying to learn enough about c++ traits to implement the ones I need). However, Raman and I work together on Fridays on our patches and we wanted to get this up for review since it's a large one.

I'm hoping by the end of the weekend to have the reworked Optional patch sent up.

Took a quick review while I had a break this morning.

libc/src/time/time_zone_posix.cpp
19

(Here and below)

69

I keep wondering if this should be bool instead of int. I think it's always plus one or negative one, never zero or another value.

86

The various .has_value() are redundant as optional supports operator bool for these. (Here and below)

121

Could ParseInt be a template and return int8_t, unit16_t, etc? That way the static_casts below wouldn't be needed.

238

We should have a comment saying why this is negative here. Especially since if what it parses after this has a negative, it will invert this to a positive.

(Also in UpdateDstOffset)

libc/test/src/time/TZHelper.h
20 ↗(On Diff #447017)

Instead of this could the callers just use a designated initializer directly?

PosixTimeZone res{
  .m_std_abbr = std_abbr,
  .m_std_offset = std_offset,
  (etc...)

That way this function and the others here wouldn't be needed and it would be clearer when reading it that the arguments aren't accidentally in the wrong order.

tschuett added a subscriber: tschuett.EditedJul 26 2022, 11:07 AM

Is there going to be a fuzzer?

libc/src/time/time_zone_posix.h
36

enum class?

53

Will this become the libc variant?

84

This m_ style is mostly used by LLDB. The rest of the code never uses it.

See https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

jeffbailey added inline comments.Jul 26 2022, 7:27 PM
libc/src/time/time_zone_posix.cpp
191

Should this be if!, and an early return?

204

Should this have an else return Nullopt? I don't think we'd process the /offset below if this part isn't set.

tschuett added inline comments.Jul 30 2022, 11:33 PM
libc/src/time/time_zone_posix.cpp
69

Boolean parameters can be confusing. I would instead wrap the boolean into a class.

class Sign {
};
rtenneti updated this revision to Diff 452560.Aug 14 2022, 1:49 PM

Fixed errors to handle cpp::optional and cpp:nullopt.

rtenneti updated this revision to Diff 458050.Sep 5 2022, 12:56 PM
rtenneti marked 12 inline comments as done.

Addressed all the comments.

Addressed all the comments. PTAL.

libc/src/time/time_zone_posix.cpp
121

Will do in the next CL.

191

It could start with either "," or "/" or both, thus we shouldn't return early.

204

I don't know if the following TZ spec are valid or not. With this change, the following incomplete specs become invalid (we won't be defaulting). Hope it is okay.

+ "PST8PDT,",
+ "PST8PDT,M3",
+ "PST8PDT,M3.",
+ "PST8PDT,M3.2",
+ "PST8PDT,M3.2.",
+ "PST8PDT,M3.2.0,",

+ "PST8PDT,M3.2.0,M11",
+ "PST8PDT,M3.2.0,M11.",
+ "PST8PDT,M3.2.0,M11.1",
+ "PST8PDT,M3.2.0,M11.1.",

libc/src/time/time_zone_posix.h
36

Defined it as enum so we don't need to do static cast. We're relying on how it decays to an int

53

Added a TODO

libc/test/src/time/TZHelper.h
20 ↗(On Diff #447017)

Deleted this file (as it is not needed)

Sorry for not getting to this earlier. I have left a few high level comments. I will do another pass once they are addressed.

libc/src/time/time_zone_posix.cpp
3
25

I know that it is not considered idiomatic to pass a string_view object by reference in general. But, in this case, it seems like a good thing to pass the spec by non-const-reference to these helper methods? It will avoid storing spec as a state variable.

29

LLVM libc style convention is to use snake_case for variable names.

30

strtol and the internel string to integer converter [1] have some gotchas: For example, they allow prefix spaces like 123. Is that OK or should we error out for such strings? Also, strtol can fail or return 0 silently, which probably is OK/irrelevant here?

[1] - https://github.com/llvm/llvm-project/blob/main/libc/src/__support/str_to_integer.h

61

Can we use string_view::find_first_of?

176

You can reduce the verbosity of the static_cast with just uint8_t(month).

283

Is Z permitted in the string for the TZ env var? Also, is ParseOffset handling it?

libc/src/time/time_zone_posix.h
12

Duplicate?

85

Why are these a struct instead of a typedef like this:

typedef uint16_t NonLeapDay;
142

Nit: Do we really want the verbosity of PosixTimeZone? Is TZInfo not good enough?

166

LLVM libc's style is to use snake_case for function and method names. So, this and the private methods below should be fixed. Also, would the name parse_tz_string be more appropriate?

168

Should spec be part of the state, especially when it really denotes nothing after parsing? See below for related comments.