This is an archive of the discontinued LLVM Phabricator instance.

[libc] add strtoll function and backend
ClosedPublic

Authored by michaelrj on Aug 9 2021, 4:48 PM.

Details

Summary

This change adds the stroll function, but most of the implementation is
in the new file str_conv_utils.h since many of the other integer
conversion functions are implemented through what are effectively calls
to strtoll.

Diff Detail

Event Timeline

michaelrj created this revision.Aug 9 2021, 4:48 PM
michaelrj requested review of this revision.Aug 9 2021, 4:48 PM

Just one major comment about errno but the rest are mostly nits and bikeshed comments.

libc/src/__support/CMakeLists.txt
16

Bikeshed: The name stdlib_utils sounds very general and can end up with more than one topic. May be call it str_conv_utils to keep it focused.

20

For targets in the same directory, you use .ctype_utils.

libc/src/__support/stdlib_utils.h
12 ↗(On Diff #365316)

We don't include LLVM libc's errno now. Just do #include <errno.h> and refer to errno directly in code. Lint rules will catch if the header file is not from LLVM libc. For there reference to errno in code, you might need a NOLINT comment.

If you do the standard build, as in not the full build, my feeling is this will not build.

39 ↗(On Diff #365316)

May be s/base_from_num_start/infer_base ?

41 ↗(On Diff #365316)

Nit: Format it with parenthesis: ++(*src)?

55 ↗(On Diff #365316)

Since this is in an internal namespace, you can name it strtoll.

78 ↗(On Diff #365316)

Nit: What does the CUR prefix indicate? It seems to indicate the current value of some variable being updated in a loop. But it is actually a const.

libc/test/src/stdlib/strtoll_test.cpp
54

Why not LLONG_MAX?

59

LLONG_MIN here?

Question: Do you intend to use internal::strtoll to implement atoi and friends? If yes, then may be the internal strtoll function should also take an additional arg for errno?

michaelrj marked 8 inline comments as done.

Addressed comments, renamed the backend file and function.

libc/src/__support/stdlib_utils.h
12 ↗(On Diff #365316)

I have switched it, but all of the stdlib functions are inside of an if(LLVM_LIBC_FuLL_BUILD) in entrypoints.txt, so if full build is off then it won't even try to build this.

78 ↗(On Diff #365316)

In this case it's to distinguish it from LLONG_MAX, since this is the current maximum value we can store (either LLONG_MAX or -LLONG_MIN). I'm not sure what other prefix to use to indicate that this is a local variable, but a const one.

103 ↗(On Diff #365316)

This line gives me compiler warnings due to casting from a const char* to a char* but the specification says that str_end is not const. Is there a better way to do this or alternately a way to suppress the compiler warnings?

libc/test/src/stdlib/strtoll_test.cpp
54

I was trying to avoid including <limits.h>, but on reflection having the variables match across the files is more important than that.

I was planning on using internal::strtoll to implement atoi and friends, but I'm not sure what the additional errno argument you mentioned would do. The existing implementation of atoi gives errors in the same manner as strtoll, e.g. setting errno to ERANGE when passed a string that decodes to a number greater than 2^63.

I was planning on using internal::strtoll to implement atoi and friends, but I'm not sure what the additional errno argument you mentioned would do. The existing implementation of atoi gives errors in the same manner as strtoll, e.g. setting errno to ERANGE when passed a string that decodes to a number greater than 2^63.

If I am reading the C standard and the POSIX standard correctly, atoi and friends do not modify errno. So, what I was trying to point out is that you might want to pass an additional int & argument for errno which you will use in strtoll and friends but not in atoi and friends.

libc/src/__support/stdlib_utils.h
12 ↗(On Diff #365316)

Ah, as a follow up, we should revisit this and may be move some of the functions outside the conditional.

78 ↗(On Diff #365316)

Call it ABS_MAX may be?

103 ↗(On Diff #365316)

*str_end = const_cast<char *>(src); ?

michaelrj marked 3 inline comments as done.Aug 10 2021, 11:46 AM

The relevant line from the specification is "The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined." From what I can tell that means that we aren't required to set the errno at any time, since if you pass it an out of range number then that is undefined behavior. However, implementations in other libcs do set errno to ERANGE when atoi is passed a number that is greater than 2^63.
Since this follows the specification, matches expected behavior from other libcs, and simplifies implementation, I think this is the best way forward.

libc/src/__support/stdlib_utils.h
103 ↗(On Diff #365316)

ah yes, that fixed it. Thank you.

michaelrj marked an inline comment as done.

address comments

sivachandra accepted this revision.Aug 10 2021, 11:57 AM

The relevant line from the specification is "The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined." From what I can tell that means that we aren't required to set the errno at any time, since if you pass it an out of range number then that is undefined behavior. However, implementations in other libcs do set errno to ERANGE when atoi is passed a number that is greater than 2^63.
Since this follows the specification, matches expected behavior from other libcs, and simplifies implementation, I think this is the best way forward.

Ah! Not the first time cppreference.com is incomplete: https://en.cppreference.com/w/c/string/byte/atoi
Also, not all libcs set errno in atoi and friends. For example: https://git.musl-libc.org/cgit/musl/tree/src/stdlib/atoi.c.

This revision is now accepted and ready to land.Aug 10 2021, 11:57 AM

Ah, that makes sense. I was only testing with my system's default glibc. I will land this patch after I do a pull and run the tests on my machine.

michaelrj updated this revision to Diff 365602.Aug 10 2021, 1:25 PM

add strtoll to the spec in stdc.td, as well as add CharPtrPtr to the types defined in spec.td

sivachandra accepted this revision.Aug 10 2021, 2:58 PM
sivachandra added inline comments.
libc/src/__support/str_conv_utils.h
14

You don't need this. In the code below, use errno directly. llvmlibc_errno.h is not available in the default build.

switch over from using the internal errno to using the public errno.
This required changing all the tests, since ErrnoSetterMatcher.h uses only the internal errno.

michaelrj marked an inline comment as done.Aug 11 2021, 10:57 AM
michaelrj added inline comments.
libc/src/__support/str_conv_utils.h
14

Done, but I had to change all the tests to match.

michaelrj updated this revision to Diff 365841.EditedAug 11 2021, 1:32 PM
michaelrj marked an inline comment as done.

fix type for strtoll being char**, it is now char **restrict

michaelrj edited the summary of this revision. (Show Details)Aug 11 2021, 2:16 PM
This revision was landed with ongoing or failed builds.Aug 11 2021, 4:37 PM
This revision was automatically updated to reflect the committed changes.