This is an archive of the discontinued LLVM Phabricator instance.

[libc] Adds strrchr implementation.
ClosedPublic

Authored by cgyurgyik on Jul 29 2020, 10:35 AM.

Details

Summary

Rationale: A naive approach may first use strlen to find the string length, then memrchr to match. This is similar, but just does both in the same loop. Since '\0' is a valid character to match with, we need to ensure that it is checked after the comparison.

Diff Detail

Event Timeline

cgyurgyik created this revision.Jul 29 2020, 10:35 AM
cgyurgyik requested review of this revision.Jul 29 2020, 10:35 AM
cgyurgyik updated this revision to Diff 281662.
Harbormaster completed remote builds in B66248: Diff 281662.
cgyurgyik edited the summary of this revision. (Show Details)Jul 29 2020, 10:59 AM
sivachandra accepted this revision.Jul 31 2020, 10:30 AM
sivachandra added inline comments.
libc/src/string/strrchr.cpp
25

Nit: To avoid the post-increment operator, may be:

for (; ; ++str) {
  if (*str == ch)
    last_occurance = str;
  if (*str == '\0')
    break;
}

To me personally, mixing post/pre-increment operators with other operators make it hard to read so this comment is also driven by my personal preference.

This revision is now accepted and ready to land.Jul 31 2020, 10:30 AM
sivachandra added inline comments.Jul 31 2020, 10:31 AM
libc/test/src/string/strrchr_test.cpp
12

Nit: For completeness, may be add a test where src points to a byte string having more than one null terminators.

cgyurgyik marked an inline comment as done.

Thanks for the review Siva.

libc/src/string/strrchr.cpp
25

I battled with this for a bit as well, but found using a do...while loop cleaner. For this reason, I'm going to leave it as is.

This revision was landed with ongoing or failed builds.Jul 31 2020, 11:39 AM
This revision was automatically updated to reflect the committed changes.

Clearly I'm too late but there are some problems with this revision.

libc/src/string/strrchr.cpp
16–17

Do the const cast at the end. Siva is right too *str++ is an operator precedence rule away from modifying src which is particularly scary when you've const casted early.

18

const not useful here. In any case you can just do c = static_cast<unsigned char>(c). Also why unsigned? Whats wrong with keeping str as a const char *?

25

+1 to Siva's suggestion. The break can be return last_occurance;

libc/test/src/string/strrchr_test.cpp
18–19

This doesn't do anything. You want ASSERT_STREQ(src, "abcde"); And for the next 3 tests. Also all 4 of these can be consolidated into the same function.

sivachandra added inline comments.Jul 31 2020, 1:00 PM
libc/test/src/string/strrchr_test.cpp
18–19

Ah, good catch! I missed it in the strchr_test as well.

cgyurgyik added inline comments.Jul 31 2020, 1:12 PM
libc/test/src/string/strrchr_test.cpp
18–19

Yep thanks will fix.

cgyurgyik added inline comments.Jul 31 2020, 1:29 PM
libc/src/string/strrchr.cpp
16–17

*str++ occurs in the order one might expect if they read it left to right, and follows C operator precedence as expected in this implementation. However, I am not tied to this, and understand your concerns. I will change it in a follow-up revision.

18

Can you explain what you mean by const not useful here?

Is there a difference between what I have and c = static_cast<unsigned char>(c)?

As for your third question, I chose to reinterpret it after reading "each character interpreted as unsigned char": https://en.cppreference.com/w/c/string/byte/strchr

cgyurgyik added inline comments.Jul 31 2020, 1:38 PM
libc/test/src/string/strrchr_test.cpp
18–19

Are you saying consolidate the "source string should not change" part?

abrachet added inline comments.Jul 31 2020, 1:54 PM
libc/src/string/strrchr.cpp
16–17

I'm not questioning the order of precedence, I'm saying for data that we are not allowed to change we should rely on const. I am tied to doing the const_cast at the end. const_cast is dangerous and we should only do it when we absolutely need to.

18

const is useful as a contract for shared state. There isn't a compelling reason to make private state const in my opinion.

Not after optimizations presumably.

FWIW the C standard says "The strrchr function locates the last occurrence of c (converted to a char) in the string pointed to by s" Regardless signedness won't change equality so this is just an extra step we don't need to take as it's just adding unnecessary code.

cgyurgyik added inline comments.Jul 31 2020, 2:10 PM
libc/src/string/strrchr.cpp
16–17

Agreed, const_cast later is the safer option. I did the cast early to avoid having to call it in the while loop.

18

To me, const simply means I don't want the value associated with it to change. In this case, I don't want ch to change later in the function call. This is another layer of type safety.

FWIW the C standard says "The strrchr function locates the last occurrence of c (converted to a char) in the string pointed to by s"

Noted, I'll avoid the unsigned char conversion here.