
[libc] Add few docs and implementation of strcpy and strcat.
ClosedPublic

Authored by sivachandra on Sep 20 2019, 3:33 PM.

Details

Summary

This patch illustrates some of the features, like modularity, that we want
in the new libc. A few other ideas, like different kinds of testing and
redirectors, are not yet present.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay added inline comments. Sep 26 2019, 6:53 PM
libc/cmake/modules/LLVMLibCRules.cmake
128 ↗(On Diff #222037)

There is an existing CMake variable to specify this.
LLVM has moved to C++14 now.

157 ↗(On Diff #222037)

Prefer CMAKE_OBJCOPY. Does this target need llvm-objcopy to build?

libc/include/string.h
56 ↗(On Diff #222037)

place mem* functions together

libc/src/string/strcat/strcat.h
13 ↗(On Diff #222037)

blank line above namespace

15 ↗(On Diff #222037)

delete dest and src

libc/src/string/strcpy/strcpy_test.cpp
18 ↗(On Diff #222037)

clang-format

23 ↗(On Diff #222037)

clang-format.

This should be delete[] dest;

llvm/CMakeLists.txt
62 ↗(On Diff #222037)

Making it build by default concerns me: there are some CMake constructs which I am not sure work on non-Linux platforms. If nobody has tested that, probably hold off on this for a bit.

MaskRay added a comment (edited). Sep 26 2019, 7:37 PM

I get a bunch of Ninja targets. When the libc is ready, we will get thousands of targets like these:

build strcat: phony projects/libc/src/string/strcat/strcat
build strcat_objects: phony projects/libc/src/string/strcat/strcat_objects
build strcat_test: phony bin/strcat_test
build strcpy: phony projects/libc/src/string/strcpy/strcpy
build strcpy_objects: phony projects/libc/src/string/strcpy/strcpy_objects
build strcpy_test: phony bin/strcpy_test
build string_h: phony projects/libc/include/string_h

Is there some way not to create the targets in the "global namespace"? The target name bin/strcpy_test is also less than ideal. With thousands of tests, those executables can "overflow" people's bin/ directories.

The build system concerns me the most. I think you should probably find some capable CMake reviewers (I am not well versed in it).

libc/CMakeLists.txt
19 ↗(On Diff #222037)

Delete this. Just use CMAKE_LINKER.

On Windows, the variable defaults to set(CMAKE_LINKER "${LLVM_NATIVE_TOOLCHAIN}/bin/lld-link" CACHE FILEPATH "").

compiler-rt/lib/fuzzer uses ld -r, maybe you can read its CMakeLists.txt for some insights.

libc/cmake/modules/LLVMLibCRules.cmake
27 ↗(On Diff #222037)

Use file(COPY ...)

libc/docs/build_system.md
19 ↗(On Diff #222037)

I am not sure the names libm.a and libc.a are required by POSIX or other standards. I think this is more of a convention. Citation needed if you state this.

libc/docs/source_layout.md
70 ↗(On Diff #222037)

This is usually named utils/ in LLVM projects.

test/ and unittests/ contain lit tests and unit tests, respectively.

libc/src/string/strcpy/strcpy.h
13 ↗(On Diff #222037)

Blank line

15 ↗(On Diff #222037)

Delete dest and src.

sivachandra marked 14 inline comments as done.

Another round of updates.

I get a bunch of Ninja targets. When the libc is ready, we will get thousands of targets like these:

build strcat: phony projects/libc/src/string/strcat/strcat
build strcat_objects: phony projects/libc/src/string/strcat/strcat_objects
build strcat_test: phony bin/strcat_test
build strcpy: phony projects/libc/src/string/strcpy/strcpy
build strcpy_objects: phony projects/libc/src/string/strcpy/strcpy_objects
build strcpy_test: phony bin/strcpy_test
build string_h: phony projects/libc/include/string_h

Is there some way not to create the targets in the "global namespace"? The target name bin/strcpy_test is also less than ideal. With thousands of tests, those executables can "overflow" people's bin/ directories.

I am not aware of a way to avoid creating targets in the global namespace. But, the target names are restricted C names anyway. So, I am not sure it will be a problem. If it does turn out to be a problem, it should be straightforward to add some prefixes and formulate naming rules as required.

I have now moved the test executables out of the bin directory. But note that the test executables are excluded from the "all" target anyway.

The build system concerns me the most. I think you should probably find some capable CMake reviewers (I am not well versed in it).

This is the first patch for the libc, and I am sure I did not get everything perfect, so I share your concerns. At the same time, I am not worried: nothing is set in stone here - we can change things as required as we learn along the way.

libc/cmake/modules/LLVMLibCRules.cmake
27 ↗(On Diff #222037)

file and configure_file do not create concrete build targets. We want concrete build targets so that we have the ability to pick and choose targets.

157 ↗(On Diff #222037)

We don't need llvm-objcopy just yet, but I have a follow-up change which will require it, so I have been carrying it here as well. It is removed now; I will put it back in the next change.

libc/docs/build_system.md
19 ↗(On Diff #222037)

I couldn't pull out an official reference, but I have this for now: http://www.musl-libc.org/faq.html - See the question "Why is libm.a empty?"

libc/docs/source_layout.md
70 ↗(On Diff #222037)

I am for reducing the number of toplevel directories. So, I have modified it to this structure:

+ utils
      build_scripts
      testing
libc/src/string/strcat/strcat.h
15 ↗(On Diff #222037)

This is an internal header file, so I do not see any problem with keeping the names.

libc/src/string/strcpy/strcpy.h
15 ↗(On Diff #222037)

This is an internal header, so I do not see any problem with keeping the names.

llvm/CMakeLists.txt
62 ↗(On Diff #222037)

I am not sure what you mean by "default". libc will be built only if one specifies libc among the list of projects with -DLLVM_ENABLE_PROJECTS, or if one says "all".

We don't have anything documenting the usage of reserved identifiers. The C standard reserves identifiers that begin with a double underscore, or with an underscore followed by a capital letter, for the 'implementation', but assumes that the compiler and library are a single blob. GCC has a documented policy that double-underscore identifiers are reserved for the compiler and underscore-capital identifiers are reserved for the library (but glibc doesn't follow it, and so ended up with __block being used as an identifier in unistd.h, which broke many things for a long time). Do we want to have a stricter policy, for example that the only identifiers we use in public headers that are not standard symbols are __LLVM_LIBC_{FOO} for macros and __llvm_libc_foo for other identifiers?

libc/docs/header_generation.md
9 ↗(On Diff #222087)

I'm not sure I understand how this will work. For existing code, it's common to have to do things like #define or #undef macros like _GNU_SOURCE for a single compilation unit before including libc headers. There are some annoying differences in philosophy between the header exports: most libc implementations export all interfaces but provide a mechanism for hiding non-standard (or later-standard) ones from portable code, glibc exports only standard functions (though I don't recall which set of standards) and requires feature macros to expose others. Code that works with glibc and other libc implementations ends up jumping through hoops to support both models. Are we adding a third mechanism? Does this work for projects that want to use only standard interfaces in most compilation units but some non-standard extensions in their platform abstraction layer?

21 ↗(On Diff #222087)

This sounds like you will end up with only one set of headers per configuration, so you lose the ability to have different projects using the same generated headers but enforcing different sets of standards compliance in their use of the interfaces.

libc/docs/implementation_standard.md
79 ↗(On Diff #222087)

There are a few things that are unclear to me in this description:

  1. How do we express the standards to which an entrypoint conforms? For example, a function defined in C11 or POSIX2008?
  2. How do we differentiate between things that we want to be preemptible versus things that we don't? If we want to call the preemptible version of a symbol in other libc code, will we have the ::foo symbol visible at library build time?
  3. How are we exposing information for building subsets of the implementation that avoid dependencies on certain platform features? For example, a CloudABI-compatible mode that does not provide (or consume) any functions that touch the global namespace.
libc/include/ctype.h
14 ↗(On Diff #222087)

BSD libcs include a cdefs.h that provides macros such as __BEGIN_DECLS and __END_DECLS wrapping this kind of pattern, and __restrict for language keywords that should not be in headers in certain versions of the standard. It looks as if __support/common.h is the equivalent - we should probably have an explicit rule that this header is included first in all libc headers and provide sensible helpers there.

The libc/docs/*.md files should be .rst - there are no other .md docs in LLVM; they were all converted back to .rst.

theraven added inline comments. Sep 27 2019, 3:10 AM
libc/include/math.h
20 ↗(On Diff #222087)

The long long functions should be exposed only for C99 or later and a version of C++ that supports the long long type.

libc/include/string.h
12 ↗(On Diff #222087)

Namespace pollution. The standard expects size_t to be exposed by this header, but not the other types in stddef.h. Software that relies on this pollution is non-portable (and will break on existing libc implementations that follow the standard).

18 ↗(On Diff #222087)

Missing restrict qualifiers on this and many other standard functions in this file. The standard defines memcpy as:

void *memcpy(void * restrict s1,
const void * restrict s2,
size_t n);

I'm surprised that clang doesn't warn about the declaration of memcpy with incorrect types - it usually notices missing qualifiers.

libc/src/string/strcpy/strcpy_test.cpp
16 ↗(On Diff #222087)

Having the libc test suite depend on a working C++ runtime and standard library is likely to make unit testing the library difficult. The implementation of std::string almost certainly depends on several libc functions and new and delete probably do as well. This means that we can test the namespaced libc functions in an environment that already has a libc, but we can't easily extend these tests to run in a process (or other environment) that has only LLVM libc.

I did consider such a layout. However, the same kind of distinction can be achieved by two things:

  1. Next to the implementation and/or target listing for a function, call out the standard/extension that prescribes it.
  2. When composing the target for libc.a for a platform, group the functions per standards/extensions they come from.

    Another point which made me not pick this layout: I agree that as a catalog, the structure you are suggesting could be more meaningful. But as a developer, my mental model is much simpler if all the functions from a header file are grouped in one place, irrespective of the standard they come from.

Which standard the function is from could be useful to think about while writing/reading the code -- it might be good to, in general, only use functions from the same or "lower" standard versions when implementing functions in a "higher" standard. Although...if the namespacing mechanism described here is kept, and calls are all made only to other llvm_libc-namespace functions, it's less problematic to use nonstandard functions. Doing so then won't pollute the linker namespace, unlike if you're calling C toplevel functions.

Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.

The one thing that caught my eye was the use of the section attribute, and I was curious what nvptx does with that. As far as I can tell, perhaps the answer is nothing. I took this:

cat /tmp/f.c 
int __attribute__((section(".llvm.libc.entrypoint.foo"))) foo ()  {
  return 0;
}

clang -target nvptx64 -O3 -S -o - /tmp/f.c
//
// Generated by LLVM NVPTX Back-End
//
 
.version 3.2
.target sm_20
.address_size 64
 
  // .globl foo
 
.visible .func  (.param .b32 func_retval0) foo()
{
  .reg .b32   %r<2>;
 
  mov.u32   %r1, 0;
  st.param.b32  [func_retval0+0], %r1;
  ret;
 
}

so maybe that part will just work?

Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.

The one thing that caught my eye was the use of the section attribute, and I was curious what nvptx does with that. As far as I can tell, perhaps the answer is nothing.

Then I think this scheme won't work, since the point of the sections is to enable the creation of the global symbols post-build.

E.g., I think the idea is that the main implementation defines the function with C++ name __llvm_libc::strcpy(char *, const char *), and places the code in the .llvm.libc.entrypoint.strcpy section. And then another tool comes along and iterates the llvm.libc.entrypoint sections, and adds global symbol aliases for each one.

That scheme seems over-complex, IMO, but I don't have a concrete counter-proposal in mind.

The libc/docs/*.md files should be .rst - there are no other .md docs in LLVM; they were all converted back to .rst.

So, does it mean, this has been reversed: https://reviews.llvm.org/D44910

lebedev.ri added a comment (edited). Sep 27 2019, 10:25 AM

The libc/docs/*.md files should be .rst - there are no other .md docs in LLVM; they were all converted back to .rst.

So, does it mean, this has been reversed: https://reviews.llvm.org/D44910

llvm-project$ find -iname *.md | grep "docs"
llvm-project$

So yes, i suppose that patch needs to be reverted, to avoid misdirecting new docs.

The libc/docs/*.md files should be .rst - there are no other .md docs in LLVM; they were all converted back to .rst.

So, does it mean, this has been reversed: https://reviews.llvm.org/D44910

llvm-project$ find -iname *.md | grep "docs"
llvm-project$

So yes, i suppose that patch needs to be reverted, to avoid misdirecting new docs.

I think you should put *.md in quotes like this to see results:

$> find -iname "*.md" | grep "docs"

sivachandra marked 9 inline comments as done.

Address theraven's comments.

libc/docs/header_generation.md
9 ↗(On Diff #222087)

This is kind of related to the other question you have asked below. I have tried to address the two questions together below.

21 ↗(On Diff #222087)

Yes, that is the general direction in which this is going. We are making the headers for a configuration much simpler to navigate at the cost of having multiple sets of headers. In this day and age, I do not think forcing multiple sets of header files is a bad thing. Note also that users' build systems already have the knowledge and capability to handle multiple configurations. Hence, we are not making the build systems any more complicated.

This is not what traditional libcs have done. So, yes we are introducing a "third mechanism". At the same time, one can also argue that we are doing away with such mechanisms as we require that each configuration have its own set of header files.

libc/docs/implementation_standard.md
79 ↗(On Diff #222087)

Answers as per numbers in the comment.

  1. We can do it in the public header file, or in the implementation .cpp files. Or, at both the places. Did I understand this question correctly? Is this question related to or similar to #3 below? Like, are you asking as to how we will add a new function without breaking the old standard? If yes, then the %%include mechanism is present to accommodate such scenarios: we start with a baseline standard and %%include new standards until they become baseline.
  2. I frankly do not have a good answer and would prefer someone who cares about this use case to contribute. Maybe a real-world example can help me think about this more clearly.
  3. For the header files, we have the %%include_file mechanism. For the library files, we pick and choose to compose a suitable library target. For example, like the one in lib/CMakeLists.txt of this patch.

At some level, my answers are only guessing about how things would evolve. So, I wouldn't be surprised if my answers here aren't valid or relevant even in say 3 months from now.

libc/include/ctype.h
14 ↗(On Diff #222087)

All of this is great, so I have incorporated it now.

libc/include/math.h
20 ↗(On Diff #222087)

We are C17 and higher already. Should we still have such conditions?

libc/include/string.h
12 ↗(On Diff #222087)

Fixed.

18 ↗(On Diff #222087)

I used a script which scanned the standard document to produce these headers and missed restrict. I guess clang did not find it because we do a C++ compilation of the implementations.

libc/src/string/strcpy/strcpy_test.cpp
16 ↗(On Diff #222087)

Using gtest already brings in the C++ runtime. Should that also be avoided?

Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.

The one thing that caught my eye was the use of the section attribute, and I was curious what nvptx does with that. As far as I can tell, perhaps the answer is nothing. I took this:

cat /tmp/f.c 
int __attribute__((section(".llvm.libc.entrypoint.foo"))) foo ()  {
  return 0;
}

clang -target nvptx64 -O3 -S -o - /tmp/f.c
//
// Generated by LLVM NVPTX Back-End
//
  
.version 3.2
.target sm_20
.address_size 64
  
  // .globl foo
  
.visible .func  (.param .b32 func_retval0) foo()
{
  .reg .b32   %r<2>;
  
  mov.u32   %r1, 0;
  st.param.b32  [func_retval0+0], %r1;
  ret;
  
}

so maybe that part will just work?

The way it is set up, it is very ELF-ish (or should I be saying elvish? :)

Very likely, non-ELF systems like Windows and nvptx will require some other mechanism. I am working on the Windows story in parallel, but I am totally new to nvptx. If you have any suggestions on how to make it work there, we can surely incorporate them.

Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.

The one thing that caught my eye was the use of the section attribute, and I was curious what nvptx does with that. As far as I can tell, perhaps the answer is nothing.

Then I think this scheme won't work, since the point of the sections is to enable the creation of the global symbols post-build.

E.g., I think the idea is that the main implementation defines the function with C++ name __llvm_libc::strcpy(char *, const char *), and places the code in the .llvm.libc.entrypoint.strcpy section. And then another tool comes along and iterates the llvm.libc.entrypoint sections, and adds global symbol aliases for each one.

That scheme seems over-complex, IMO, but I don't have a concrete counter-proposal in mind.

Another problem with the .llvm.libc.entrypoint.strcpy + objcopy --add-symbol scheme is that there is no way to change the st_size field. This can impact symbolization precision when debugging information is stripped. This makes me wonder what advantages the namespace __llvm_libc:: brings. @theraven noted the disadvantage:

This means that we can test the namespaced libc functions in an environment that already has a libc, but we can't easily extend these tests to run in a process (or other environment) that has only LLVM libc.

libc/cmake/modules/LLVMLibCRules.cmake
164 ↗(On Diff #222243)

,function -> ,function,global, otherwise the added symbol is STB_LOCAL.

libc/include/string.h
18 ↗(On Diff #222087)

So this should be

void *memcpy(void *__restrict, const void *__restrict, size_t);

to be usable in C++

Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.

The one thing that caught my eye was the use of the section attribute, and I was curious what nvptx does with that. As far as I can tell, perhaps the answer is nothing.

Then I think this scheme won't work, since the point of the sections is to enable the creation of the global symbols post-build.

E.g., I think the idea is that the main implementation defines the function with C++ name __llvm_libc::strcpy(char *, const char *), and places the code in the .llvm.libc.entrypoint.strcpy section. And then another tool comes along and iterates the llvm.libc.entrypoint sections, and adds global symbol aliases for each one.

That scheme seems over-complex, IMO, but I don't have a concrete counter-proposal in mind.

For what it's worth, FreeBSD libc does a similar namespacing trick in C. The internal symbols are underscore prefixed and they're exported as aliases (typically as weak aliases, to allow them to be preempted by other implementations, and to explicitly give names to callers for the preemptible and non-preemptible versions). Making symbols preemptible isn't really possible with PE/COFF, because the linkage model has a stronger concept of where definitions come from than ELF (at least, in the absence of symbol versions in ELF). On ELF platforms, we should support symbol versions as early as possible, because adding them later is an ABI break, even if we change no code.

theraven added inline comments. Sep 29 2019, 12:39 AM
libc/docs/header_generation.md
21 ↗(On Diff #222087)

I think that's fine when you consider building libc and shipping a single configuration, but a lot of projects that I've seen have different feature macros defined for different components. Are they now expected to rebuild the libc headers multiple times for each module? Do they need to drive that from their own build system (which is often not CMake)? It's even more complex when a project contains C89, C11, and C++11 files - these all have subtly different sets of requirements for the functions exposed in libc headers: do we require that they build a set for each? Or do you imagine that anyone shipping C11 will ship a powerset of headers?

The reason that we don't do the separate header thing in libcs today is that we end up with a huge explosion of the set of things that are supported. For example, in FreeBSD we support 3 versions of the C standard, 3 or 4 versions of POSIX, GNU and BSD extensions. Almost any combination of these is allowed, so we're looking at 20-30 possible sets of header files, before we start considering restricted subsets for sandboxed applications, custom configurations for sanitisers, and so on.

libc/docs/implementation_standard.md
79 ↗(On Diff #222087)

Compliance with overlapping standards is one of the core reasons that we make symbols preemptible in existing libc implementations. C reserves a set of identifiers for the C standard and a C89 program is free to use any other identifier. If POSIX2008 or C11 use those identifiers then libc should not call their implementations from internal code, but should allow theirs to be called. Pthreads is subtly different, where users are allowed to bring their own pthreads implementation and libc should correctly consume it.

The third question is more in terms of layering. For example, I recently tried building libc++ to work in kernel space. This is incredibly hard, because a lot of things depend on locale, for example, and pull in iostream dependencies, so you end up needing a load of things that have no real meaning in kernel space. The same is true for sandboxed environments, where things like open may not exist though openat may (important when something like locale support in libc needs to open files: for a sandboxed deployment we should support either baking those files into the binary or not exposing the symbols that depend on them working correctly).

If we don't start out with some declarative definitions of things like C89 / C99 / POSIX2008 compliance, then any kind of automatic tooling to generate a pure C11 library (no POSIX) or to ensure that the correct symbols are preemptible will be very hard.

libc/include/math.h
20 ↗(On Diff #222087)

We are building as C++17. There is no C17. I would hope that we're still aiming for the headers to be consumed by C89 programs, because there are a huge number of those in the world.

libc/include/string.h
16 ↗(On Diff #222243)

Does this work correctly with the inclusion guards? I don't see the stddef.h implementation here, so I don't know what those macros do (they don't do anything in FreeBSD libc's stddef.h; they appear to do something in libc++'s cstddef, though I'm not entirely sure what).

The FreeBSD solution to this is to define types like __size_t, use these in headers that are supposed to use size_t in function prototypes but not provide a definition of size_t (yes, there are several such headers in the C spec; it's annoying, but that's what the standard says), and then add a guarded typedef to turn that into size_t in this header, stddef.h, and a couple of other places.

jyknight added a comment (edited). Sep 30 2019, 9:11 AM

Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.

The one thing that caught my eye was the use of the section attribute, and I was curious what nvptx does with that. As far as I can tell, perhaps the answer is nothing.

Then I think this scheme won't work, since the point of the sections is to enable the creation of the global symbols post-build.

E.g., I think the idea is that the main implementation defines the function with C++ name __llvm_libc::strcpy(char *, const char *), and places the code in the .llvm.libc.entrypoint.strcpy section. And then another tool comes along and iterates the llvm.libc.entrypoint sections, and adds global symbol aliases for each one.

That scheme seems over-complex, IMO, but I don't have a concrete counter-proposal in mind.

For what it's worth, FreeBSD libc does a similar namespacing trick in C. The internal symbols are underscore prefixed and they're exported as aliases (typically as weak aliases, to allow them to be preempted by other implementations, and to explicitly give names to callers for the preemptible and non-preemptible versions). Making symbols preemptible isn't really possible with PE/COFF, because the linkage model has a stronger concept of where definitions come from than ELF (at least, in the absence of symbol versions in ELF). On ELF platforms, we should support symbol versions as early as possible, because adding them later is an ABI break, even if we change no code.

Actually, now that I think I understand the existing proposal better, I believe it's broken, as well as confusing. It's getting the same effect as using __attribute__((alias)), except harder to understand. But it's not ok to have a single object file expose both a strong public alias and an internal alias, for any function that's not in the baseline ISO C standard. It would be OK if the aliases were weak, or if they were strong but exposed by a separate .o file. (In any case, I'd like to suggest not using an external objcopy invocation to achieve this.)

For an example of why this is wrong - consider if libc has an 'open.o' object file, which defines __llvm_libc::open, and has also had the alias open added to it with objcopy. Internally, if libc needs to call open, it calls __llvm_libc::open, which pulls in that open.o file, which then also defines the global 'open' function. There would then be a duplicate symbol error for any standard C (e.g. non-POSIX) program which defines its own open function.

libc/docs/header_generation.md
21 ↗(On Diff #222087)

IMO, it makes sense not to bother making C99/C11-only functions conditionally available. The libc headers still ought to be compatible in C89 mode, but I don't see that there's really much point to excluding declarations for new functions like 'strtof', 'aligned_alloc', etc, when building in older standards modes.

The same most likely can apply to old POSIX versions.

However, I do think it is quite likely to be necessary to preserve the ability to conditionally disable the various standards "layers". That is -- for all the headers specified in ISO C, you should be able to disable the declarations added by POSIX (and extensions) with a define. And for all the headers specified in POSIX, you should be able to disable the declarations added by the GNU/BSD/etc extensions with a define.

libc/include/string.h
16 ↗(On Diff #222243)

Both Clang and GCC ship a stddef.h which supports these defines -- and it's expected to be first in the include path, before libc's headers.

For some reason, FreeBSD and some other platforms remove these compiler-shipped files and replace them with their own for their libc.

Exactly what the contract should be between the compiler headers and the libc headers could be a larger discussion, but for now, I'm strongly in favor of assuming that we're using the existing stddef.h from clang/gcc -- in which case this code will work correctly.

sivachandra marked 3 inline comments as done. Sep 30 2019, 12:06 PM

Actually, now that I think I understand the existing proposal better, I believe it's broken, as well as confusing. It's getting the same effect as using __attribute__((alias)), except harder to understand. But it's not ok to have a single object file expose both a strong public alias and an internal alias, for any function that's not in the baseline ISO C standard. It would be OK if the aliases were weak, or if they were strong but exposed by a separate .o file. (In any case, I'd like to suggest not using an external objcopy invocation to achieve this.)

The objcopy step is required to avoid putting mangled names with the alias attribute. If there is any other way to achieve the same thing, I am open to it.

For example of why this is wrong -- consider if libc has an 'open.o' object file, which defines __llvm_libc::open, and has also had the alias open added to it with objcopy. Internally, if libc needs to call open, it calls __llvm_libc::open, which pulls in that open.o file, which then also defines the global 'open' function. Then there thus be a duplicate symbol error for any Standard C (e.g. non-posix) program which defines its own open function.

I am open to making all public symbols weak. Should we start with that, or should we make them weak on an as-needed basis?

libc/docs/header_generation.md
21 ↗(On Diff #222087)

With respect to excluding extension standards, I will go back to my earlier comment here: The %%include mechanism gives us a way to do it, but calls for a header set per configuration. Note that the libc source tree still only has a single set of files.

True that there will be an "explosion of configurations", but I expect a large number of these to be downstream configurations. Off-the-shelf, we should probably only provide what a normal Linux or a Windows development environment needs.

libc/include/math.h
20 ↗(On Diff #222087)

We have said in the proposal that we will only support C17 and higher: http://llvm.org/docs/Proposals/LLVMLibC.html

libc/include/string.h
16 ↗(On Diff #222243)

I was convinced by jyknight's reasoning last time. If not anything else, I like that it keeps our implementation surface smaller.

The objcopy step is required to avoid putting mangled names with the alias attribute. If there is any other way to achieve the same thing, I am open to it.

Ah, I see! I'd suggest using extern "C" instead. There's no need for these to be C++-mangled -- you can simply use a name prefix instead. E.g., if you define it as extern "C" char *__llvm_libc_strcpy(...) {} then it's trivial to make the strcpy alias without objcopy magic.

The objcopy step is required to avoid putting mangled names with the alias attribute. If there is any other way to achieve the same thing, I am open to it.

Ah, I see! I'd suggest using extern "C" instead. There's no need for these to be C++-mangled -- you can simply use a name prefix instead. E.g., if you define it as extern "C" char *__llvm_libc_strcpy(...) {} then it's trivial to make the strcpy alias without objcopy magic.

Yes, this solution was considered. But, it does not really solve the problem you brought up; we will still need to make the alias weak. It of course eliminates the objcopy step, but we end up with some kind of source code mismatch/inconsistency (for lack of a better word). That is, we will have the main function in the global namespace, while the support/helper functions will live inside of a namespace. I rather prefer everything in a single LLVM-libc specific namespace. The objcopy step is neatly hidden behind a build rule, so I do not see it as being "complicated" or "confusing". Someone coming in to implement a new function just has to use the build rule and put the code in the namespace.

The libc/docs/*.md files should be .rst - there are no other .md docs in llvm; they were all converted back to .rst.

So, does it mean, this has been reversed: https://reviews.llvm.org/D44910

llvm-project$ find -iname *.md | grep "docs"
llvm-project$

So yes, i suppose that patch needs to be reverted, to avoid misdirecting new docs.

I think you should put *.md in quotes like this to see results:

$> find -iname "*.md" | grep "docs"

Aha. But still, all but a few docs were migrated back to RST.
I'm not sure it's great to add new ones in the format being migrated from.

Aha. But still, all but a few docs were migrated back to RST.
I'm not sure it's great to add new ones in the format being migrated from.

Can you kindly point me to some reference which says rst is preferred over markdown? Not that I do not trust you, but I was told by others to prefer markdown, and so I used markdown. I want to have a reference to point to if I am questioned in future about this.

Aha. But still, all but a few docs were migrated back to RST.
I'm not sure it's great to add new ones in the format being migrated from.

Can you kindly point me to some reference which says rst is preferred over markdown? Not that I do not trust you, but I was told by others to prefer markdown, and so I used markdown. I want to have a reference to point to if I am questioned in future about this.

See e.g.
https://lists.llvm.org/pipermail/llvm-dev/2019-June/133038.html
https://reviews.llvm.org/D66305
I think there was some other disscussion but i can't find it.

The objcopy step is required to avoid putting mangled names with the alias attribute. If there is any other way to achieve the same thing, I am open to it.

Ah, I see! I'd suggest using extern "C" instead. There's no need for these to be C++-mangled -- you can simply use a name prefix instead. E.g., if you define it as extern "C" char *__llvm_libc_strcpy(...) {} then it's trivial to make the strcpy alias without objcopy magic.

Yes, this solution was considered. But, it does not really solve the problem you brought up; we will still need to make the alias weak.

Correct, my suggestion above was only to reduce the complexity.

However, back to the weak-alias question -- I have a different suggestion for solving that.

The issue really only exists when you refer to object files across standards layers -- e.g., using an object file that exposes the POSIX symbol "open" from an object file implementing ISO C. If you make sure to always strictly layer the libc, so that ISO C-implementing object files don't use any POSIX-exporting object files, and so on, you won't need to mark anything weak. For the example of "open", you'd have an internal implementation of open in its own file, only exposing libc-internal symbols. Then fopen (ISO C) can use that safely, without dragging in a definition of the symbol open. Separately, the implementation of open (POSIX) can be defined in its own file, also based on the internal open.

It of course eliminates the objcopy step, but we end up with some kind of source code mismatch/inconsistency (for lack of a better word). That is, we will have the main function in the global namespace, while the support/helper functions will live inside of a namespace.

Sort of... At the source level, you can keep using namespaces as usual -- you don't need to move the function to the global namespace. E.g., given

namespace __llvm_libc {
extern "C" char *__llvm_libc_strcat(...) {
  ...
}
}

the __llvm_libc_strcat function is within the namespace __llvm_libc, as far as C++ name-resolution semantics are concerned. It's only at a lower level, in the object-file linkage semantics, that it is "as if" it was in the global namespace -- the function has been given the symbol name __llvm_libc_strcat instead of a C++ namespace-mangled name.

I rather prefer everything in a single LLVM-libc specific namespace. The objcopy step is neatly hidden behind a build rule, so I do not see it as being "complicated" or "confusing". Someone coming in to implement a new function just has to use the build rule and put the code in the namespace.

I do think it's both complicated and confusing to have build rules invoking objcopy to post-process .o files after compilation. If that was necessary, it'd be one thing, but since it's not, I'd say the additional complexity is not justified.

The issue really only exists when you refer to object files across standards layers -- e.g., using an object file that exposes the POSIX symbol "open" from an object file implementing ISO C. If you make sure to always strictly layer the libc, so that ISO C-implementing object files don't use any POSIX-exporting object files, and so on, you won't need to mark anything weak. For the example of "open", you'd have an internal implementation of open in its own file, only exposing libc-internal symbols. Then fopen (ISO C) can use that safely, without dragging in a definition of the symbol open. Separately, the implementation of open (POSIX) can be defined in its own file, also based on the internal open.

This approach is fairly general and can be adopted with the existing setup as well. In fact, I can imagine that irrespective of the approach we take, we will end up with patterns like this.

Sort of... At the source level, you can keep using namespaces as usual -- you don't need to move the function to the global namespace. E.g., given

namespace __llvm_libc {
extern "C" char *__llvm_libc_strcat(...) {
  ...
}
}

the __llvm_libc_strcat function is within the namespace __llvm_libc, as far as C++ name-resolution semantics are concerned. It's only at a lower level, in the object-file linkage semantics, that it is "as if" it was in the global namespace -- the function has been given the symbol name __llvm_libc_strcat instead of a C++ namespace-mangled name.

This approach was also considered. (I missed the __ prefixes in my considerations of course.)

I rather prefer everything in a single LLVM-libc specific namespace. The objcopy step is neatly hidden behind a build rule, so I do not see it as being "complicated" or "confusing". Someone coming in to implement a new function just has to use the build rule and put the code in the namespace.

I do think it's both complicated and confusing to have build rules invoking objcopy to post-process .o files after compilation. If that was necessary, it'd be one thing, but since it's not, I'd say the additional complexity is not justified.

This is in subjective territory...

I prefer not to have the repetitiveness of llvm_libc::llvm_libc_strcpy or anything similar. One can suggest using a macro to avoid the repetitiveness of course, but I'd rather avoid using a macro at every call site. To me, that seems neither pretty nor necessary. I prefer to keep the implementation layer as much as possible like a normal C++ library. The post-processing is done to make this C++ library usable as a C library as well.

I agree that, "post-processing using objcopy" does sound complex as it kind of gives a feel of a complex binary surgery. But, what the current proposal is doing is to just add an alias symbol using objcopy. That is neither invasive, nor complex as "post-processing using objcopy" sounds. Moreover, as I have said earlier, developers do not need to deal with objcopy on a regular basis as it is hidden behind a build rule.

The issue really only exists when you refer to object files across standards layers -- e.g., using an object file that exposes the POSIX symbol "open" from an object file implementing ISO C. If you make sure to always strictly layer the libc, so that ISO C-implementing object files don't use any POSIX-exporting object files, and so on, you won't need to mark anything weak. For the example of "open", you'd have an internal implementation of open in its own file, only exposing libc-internal symbols. Then fopen (ISO C) can use that safely, without dragging in a definition of the symbol open. Separately, the implementation of open (POSIX) can be defined in its own file, also based on the internal open.

This approach is fairly general and can be adopted with the existing setup as well. In fact, I can imagine that irrespective of the approach we take, we will end up with patterns like this.

Yes. This is totally separate from whether to use objcopy or not. I'm sorry the two concerns were merged into one comment thread.

I agree that, "post-processing using objcopy" does sound complex as it kind of gives a feel of a complex binary surgery. But, what the current proposal is doing is to just add an alias symbol using objcopy. That is neither invasive, nor complex as "post-processing using objcopy" sounds. Moreover, as I have said earlier, developers do not need to deal with objcopy on a regular basis as it is hidden behind a build rule.

I continue to disagree that that's at all a good trade-off, but since we're only going around in circles now, I'll drop it in the name of progress.

Move markdown docs to reStructuredText; Will add conf.py and build targets in a later pass.

MaskRay added a comment.EditedOct 3 2019, 12:08 AM

My earlier question is about why we need the namespace __llvm_libc at all. From libc/src/string/strcat/strcat_test.cpp I conclude it is for unit testing in an environment that already has a libc (gtest). This should probably be documented.

https://reviews.llvm.org/D67867#1686834 mentioned that the objcopy scheme will break the st_size fields of symbols.

Can we do things the other way round? No namespace, no __llvm_libc_ prefix. Add the -ffreestanding compiler flag and just define strstr, open, etc. in the global namespace. In unit tests, invoke llvm-objcopy --redefine-syms= to rename strstr to __llvm_libc_strstr, and call __llvm_libc_strstr in the tests. For functions that affect global state, gtest will not be suitable. It is good to think about how the tests will be organized at this early stage.

Since weak aliases have been mentioned. I'd like to say that a weak_alias macro will be needed, probably not now, but it should be taken into consideration so that we will not change things back and forth.

  1. Avoid namespace pollution. ISO C interfaces should not pull in POSIX functions, e.g. fopen can call strstr, strchr (ISO C) but not open (POSIX). To call open, (a) define STV_HIDDEN __open (b) make open a weak alias of __open (c) call __open in fopen.
  2. Weak definition for dummy implementation. Some features do not necessarily pull in functions from other components. Create a static dummy function and create a weak definition. When used as a variable, this can be used to check whether some features are unavailable.
  3. Pure aliases. Required either by standard or ABI compatibility purposes. glibc uses this a lot.

As an example when this will be used: in the implementation of strcpy, call STV_HIDDEN __memcpy (I'm not saying implementing strcpy on top of memcpy is efficient. This is merely an example), not STV_DEFAULT memcpy, to make it explicit symbol interposition is not desired. This matters if you ever support dynamic linking and support libc.so.

In the description, a brief mention of what will be built and how to test will be helpful to reviewers and subscribers. It is not easy to figure out where the libc build artifacts are located.

My earlier question is about why we need the namespace __llvm_libc at all. From libc/src/string/strcat/strcat_test.cpp I conclude it is for unit testing in an environment that already has a libc (gtest). This should probably be documented.

Documentation is a good idea. At this point, we want the implementation to be as much as possible like a normal C++ library and allow mixing this libc with other libcs in various scenarios like unit testing, differential testing etc. This is the first patch; I am sure a lot will change in the coming months. I want to write up things like this after we at least get a clear idea of the Windows strategy (I am working on it).

https://reviews.llvm.org/D67867#1686834 mentioned that the objcopy scheme will break the st_size fields of symbols.

Yes, st_size is broken in the sense that it doesn't show the size of the aliasee. But, it almost never matters practically.

Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2019, 10:10 PM
This revision was automatically updated to reflect the committed changes.
MaskRay added a comment.EditedOct 7 2019, 1:11 AM

The commit was done in a hurry. For the initial commit of a brand new project that sets up the project hierarchy, this seems to have received fewer than enough thumbs up. Many points raised in the review process were just shrugged off.

  • Proper cmake review
  • Detailed summary. The commit message should at least reference some previous discussions on the mailing list, especially since this is a brand new project.
  • The llvm-objcopy issue definitely needs more consideration. This may interfere badly with instrumentation tools, support for which is a selling point of the llvm libc.
  • Why __llvm_libc / .llvm.libc.entrypoint. are necessary is not well explained.
  • Some necessary options -ffreestanding -nostdinc are absent.
  • C++ should not get #define __restrict restrict
  • ...

I understand that you want agile development, but I just did not want 1000 lines of code to be committed and then have 700 of them replaced very quickly in the next month. If you read the source code of musl, you'll find that a large chunk of code still remains untouched since the initial import ("initial check-in, version 0.5.0").

This is a post-commit review anyway so many points are probably moot.

My earlier question is about why we need the namespace __llvm_libc at all. From libc/src/string/strcat/strcat_test.cpp I conclude it is for unit testing in an environment that already has a libc (gtest). This should probably be documented.

Can we do things the other way round? No namespace, no __llvm_libc_ prefix. Add the -ffreestanding compiler flag and just define strstr, open, etc. in the global namespace. In unit tests, invoke llvm-objcopy --redefine-syms= to rename strstr to __llvm_libc_strstr, and call __llvm_libc_strstr in the tests. For functions that affect global state, gtest will not be suitable. It is good to think about how the tests will be organized at this early stage.

The commit was done in a hurry. For the initial commit of a brand new project that sets up the project hierarchy, this seems to have received fewer than enough thumbs up. Many points raised in the review process were just shrugged off.

I don't know if it matters anymore because this was committed but I agree with @MaskRay. His suggestion of using llvm-objcopy to rename the symbols for tests makes much more sense to me. I haven't seen a libc that does testing in an ergonomic way and this suggestion seems the best to me, frankly.

There is a lot going on here, it's hard to follow it all in one patch, and I think some comments got lost because of this. I feel like a lot of big design decisions were made here, did I miss something on the libc-dev mailing list?

The commit was done in a hurry. For the initial commit of a brand new project that sets up the project hierarchy, this seems to have received fewer than enough thumbs up. Many points raised in the review process were just shrugged off.

I don't know if it matters anymore because this was committed but I agree with @MaskRay. His suggestion of using llvm-objcopy to rename the symbols for tests makes much more sense to me. I haven't seen a libc that does testing in an ergonomic way and this suggestion seems the best to me, frankly.

There is a lot going on here, it's hard to follow it all in one patch, and I think some comments got lost because of this. I feel like a lot of big design decisions were made here, did I miss something on the libc-dev mailing list?

FWIW -- llvm project policy is that both pre and post-commit reviews must be addressed.

That this is now committed does not change anything w.r.t. needing to respond to outstanding comments.

That this is now committed does not change anything w.r.t. needing to respond to outstanding comments.

Absolutely!

I have heard everyone, and will try my best to address everything. FWIW, I am not happy myself that I am unable to address jyknight's concerns on post-processing in an acceptable fashion.

lygstate added inline comments.
libc/trunk/include/ctype.h
18

There are no DLL export annotations here for MSVC, so is this only being built as a static C library? Is building it as a shared library not being considered?

Just as an FYI, this patch breaks LLVM_INCLUDE_TESTS=OFF for me:

$ cmake -GNinja -DPYTHON_EXECUTABLE=$(command -v python3)  -DLLVM_ENABLE_PROJECTS=all ../llvm
...
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nathan/src/llvm-project/build

$ cd .. && rm -rf build && mkdir -p build && cd build 

$ cmake -GNinja -DPYTHON_EXECUTABLE=$(command -v python3) -DLLVM_ENABLE_PROJECTS=all -DLLVM_INCLUDE_TESTS=OFF ../llvm
...
-- Configuring done
CMake Error at /home/nathan/src/llvm-project/libc/cmake/modules/LLVMLibCRules.cmake:264 (add_dependencies):
  The dependency target "gtest" of target "strcpy_test" does not exist.
Call Stack (most recent call first):
  /home/nathan/src/llvm-project/libc/src/string/strcpy/CMakeLists.txt:11 (add_libc_unittest)


CMake Error at /home/nathan/src/llvm-project/libc/cmake/modules/LLVMLibCRules.cmake:264 (add_dependencies):
  The dependency target "gtest" of target "strcat_test" does not exist.
Call Stack (most recent call first):
  /home/nathan/src/llvm-project/libc/src/string/strcat/CMakeLists.txt:12 (add_libc_unittest)


-- Generating done
-- Build files have been written to: /home/nathan/src/llvm-project/build

$ git revert -n 4380647e79bd80af1ebf6191c2d6629855ccf556

$ cd .. && rm -rf build && mkdir -p build && cd build 

$ cmake -GNinja -DPYTHON_EXECUTABLE=$(command -v python3) -DLLVM_ENABLE_PROJECTS=all -DLLVM_INCLUDE_TESTS=OFF ../llvm
...
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nathan/src/llvm-project/build

This is as of r374191.