This is an archive of the discontinued LLVM Phabricator instance.

[libc] Add strtok implementation.
ClosedPublic

Authored by cgyurgyik on Aug 9 2020, 10:48 AM.

Details

Summary

Rationale: Calling strspn and strcspn in strtok would currently create two separate Bitset instances and loop through the delimiters twice. We can avoid this by creating one instance inside the strtok function.

Also removes unnecessary header from String utils.

Diff Detail

Event Timeline

cgyurgyik created this revision.Aug 9 2020, 10:48 AM
cgyurgyik requested review of this revision.Aug 9 2020, 10:48 AM
cgyurgyik updated this revision to Diff 284229.Aug 9 2020, 1:02 PM
cgyurgyik added a reviewer: sivachandra.
cgyurgyik updated this revision to Diff 284240.Aug 9 2020, 4:45 PM
sivachandra added inline comments.Aug 10 2020, 2:44 PM
libc/src/string/strtok.cpp
17

Nit: delimiter_string

18

Nit: delimiter_set

22

Nit: Can you move this out side of this function and call it strtok_str?

24

I think the comments are reversed. It should be "non-delimiter" here and "delimiter" below? But, I think comments are just a distraction here as the code speaks for itself.

28

Nit: s/delimiter_span/token

34

A readability thing to not mix increment/decrement operators with other operators:

*strtok_str = '\0';
 ++strtok_str;
libc/test/src/string/strtok_test.cpp
62

Does strtok as implemented above ever return nullptr? This got me to go look why ASSERT_STREQ is happy. It turns out, our STREQ comparison first converts the string to llvm::StringRef which makes an empty string compare equal to a nullptr! That needs fixing separately.

cgyurgyik added inline comments.Aug 10 2020, 5:53 PM
libc/test/src/string/strtok_test.cpp
62

Do we want strtok to return nullptr in the case the null terminator is hit?

cgyurgyik updated this revision to Diff 284540.Aug 10 2020, 6:17 PM
cgyurgyik marked 6 inline comments as done.

Address Siva's comments.

sivachandra added inline comments.Aug 10 2020, 8:26 PM
libc/test/src/string/strtok_test.cpp
62

Yes. I would think that is what this text from the standard means:

If no such character was found, there are no tokens in str at all, and the function returns a null pointer.

https://en.cppreference.com/w/c/string/byte/strtok

cgyurgyik updated this revision to Diff 284659.Aug 11 2020, 4:45 AM

Address comment.

cgyurgyik marked an inline comment as done.Aug 11 2020, 4:45 AM
sivachandra accepted this revision.Aug 11 2020, 7:20 AM
sivachandra added inline comments.
libc/src/string/strtok.cpp
26

Should we update strtok_str = src ?

libc/test/src/string/strtok_test.cpp
61

Switch this to ASSERT_EQ?

65

Same here.

77

And here.

This revision is now accepted and ready to land.Aug 11 2020, 7:20 AM
cgyurgyik updated this revision to Diff 284729.Aug 11 2020, 7:54 AM
cgyurgyik marked 2 inline comments as done.
cgyurgyik added inline comments.Aug 11 2020, 10:29 AM
libc/test/src/string/strtok_test.cpp
61

I get:

 no matching function for call to 'test'
  ASSERT_EQ(token, nullptr);
...
template ignored: deduced conflicting types for parameter 'ValType' ('char *' vs. 'nullptr_t')
  static bool test(RunContext &Ctx, TestCondition Cond, ValType LHS,

Is there an easy fix for this that I'm missing?

sivachandra added inline comments.Aug 11 2020, 10:35 AM
libc/test/src/string/strtok_test.cpp
61

Never mind then. I will fix it separately.

This revision was automatically updated to reflect the committed changes.