This is an archive of the discontinued LLVM Phabricator instance.

[libc] Implement strsep
ClosedPublic

Authored by abrachet on Apr 3 2023, 10:23 PM.

Event Timeline

abrachet created this revision. Apr 3 2023, 10:23 PM
Herald added a project: Restricted Project. Apr 3 2023, 10:23 PM
abrachet requested review of this revision. Apr 3 2023, 10:23 PM
mcgrathr accepted this revision. Apr 4 2023, 5:30 PM
mcgrathr added a subscriber: mcgrathr.

This is missing the bsd_ext.td addition for the <string.h> decl.

libc/test/src/string/strsep_test.cpp, line 2

Perhaps you can templatize the strtok test and then reuse that here.
Both functions should have all the same cases covered.

This revision is now accepted and ready to land. Apr 4 2023, 5:30 PM

I'm not finding good documentation online, but according to the man page for strsep, it should behave slightly differently from strtok. Instead of skipping past repeated delimiters, it should treat every delimiter as the end of a token, so it can return an empty token (a pointer to '\0'). I've copied the example below:

EXAMPLES
       The program below is a port of the one found in strtok(3), which, however, doesn't discard multiple delimiters
       or empty tokens:

           $ ./a.out 'a/bbb///cc;xxx:yyy:' ':;' '/'
           1: a/bbb///cc
                    --> a
                    --> bbb
                    -->
                    -->
                    --> cc
           2: xxx
                    --> xxx
           3: yyy
                    --> yyy
           4:
                    -->

   Program source

       #include <stdio.h>
       #include <stdlib.h>
       #include <string.h>

       int
       main(int argc, char *argv[])
       {
           char *token, *subtoken;

           if (argc != 4) {
               fprintf(stderr, "Usage: %s string delim subdelim\n", argv[0]);
               exit(EXIT_FAILURE);
           }

           for (unsigned int j = 1; (token = strsep(&argv[1], argv[2])); j++) {
               printf("%u: %s\n", j, token);

               while ((subtoken = strsep(&token, argv[3])))
                   printf("\t --> %s\n", subtoken);
           }

           exit(EXIT_SUCCESS);
       }

SEE ALSO
       index(3), memchr(3), rindex(3), strchr(3), string(3), strpbrk(3), strspn(3), strstr(3), strtok(3)

Linux man-pages 6.01                                  2022-10-09                                            STRSEP(3)
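
To make the difference described in the man page excerpt concrete, here is a minimal sketch that splits the same input with both functions. The helper names split_with_strtok and split_with_strsep are made up for this illustration, and the sketch assumes a platform whose <string.h> declares strsep (for example glibc or the BSDs); it is not the test from this revision.

```cpp
#include <cassert>
#include <cstring>
#include <string.h> // strsep is an extension declared here on glibc and the BSDs
#include <string>
#include <vector>

// Hypothetical helper: collect tokens with strtok, which skips over runs
// of delimiters and therefore never produces an empty token.
std::vector<std::string> split_with_strtok(std::string input,
                                           const char *delims) {
  assert(delims != nullptr);
  std::vector<std::string> tokens;
  for (char *tok = std::strtok(&input[0], delims); tok != nullptr;
       tok = std::strtok(nullptr, delims))
    tokens.emplace_back(tok);
  return tokens;
}

// Hypothetical helper: collect tokens with strsep, which ends a token at
// every delimiter, so adjacent delimiters produce empty tokens.
std::vector<std::string> split_with_strsep(std::string input,
                                           const char *delims) {
  assert(delims != nullptr);
  std::vector<std::string> tokens;
  char *cursor = &input[0];
  while (char *tok = strsep(&cursor, delims))
    tokens.emplace_back(tok);
  return tokens;
}
```

For the input "a/bbb///cc" with the delimiter "/", the strtok helper yields three tokens (a, bbb, cc), while the strsep helper yields five, with two empty tokens between bbb and cc, matching the subtoken lines in the sample output above.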
abrachet updated this revision to Diff 511467. Apr 6 2023, 10:23 AM

Good catch, thanks.

abrachet updated this revision to Diff 511469. Apr 6 2023, 10:34 AM
michaelrj accepted this revision. Apr 6 2023, 10:42 AM

LGTM with one nit.

libc/src/string/string_utils.h, lines 197–199

Nit: This should be if constexpr (SkipDelim) to make it clear this is a compile-time if statement.
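
As a rough illustration of the nit, the sketch below shows how a bool template parameter named SkipDelim could select between strtok-like and strsep-like behavior with a compile-time branch. The function name next_token and its signature are assumptions made for this example, and the body is not the code under review in string_utils.h. It requires C++17 for if constexpr.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not the LLVM libc implementation.
// Returns the next token from *saveptr and advances *saveptr past it.
//   SkipDelim == true:  strtok-like, leading delimiters are skipped.
//   SkipDelim == false: strsep-like, adjacent delimiters give empty tokens.
template <bool SkipDelim>
char *next_token(char **saveptr, const char *delims) {
  assert(saveptr != nullptr && delims != nullptr);
  char *src = *saveptr;
  if (src == nullptr)
    return nullptr;

  // Compile-time branch: the SkipDelim == false instantiation does not
  // contain this code at all.
  if constexpr (SkipDelim) {
    src += std::strspn(src, delims); // skip leading delimiters
    if (*src == '\0') {              // only delimiters were left
      *saveptr = nullptr;
      return nullptr;
    }
  }

  char *token = src;
  char *end = src + std::strcspn(src, delims); // first delimiter or '\0'
  if (*end == '\0') {
    *saveptr = nullptr; // this was the last token
  } else {
    *end = '\0';        // terminate the token in place
    *saveptr = end + 1; // resume after the delimiter
  }
  return token;
}
```

With a plain if on a template parameter the compiler would usually still fold the branch, but if constexpr states the intent explicitly and guarantees the untaken branch is discarded in each instantiation.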

This revision was automatically updated to reflect the committed changes.
abrachet marked an inline comment as done.
Herald added a project: Restricted Project. Apr 6 2023, 10:49 AM
abrachet added inline comments. Apr 6 2023, 10:49 AM
libc/src/string/string_utils.h, lines 197–199

Done in commit