This is an archive of the discontinued LLVM Phabricator instance.

[libc] Minimal Darwin support
AbandonedPublic

Authored by tschuett on Jun 23 2020, 12:49 PM.

Details

Summary

This is the minimal copy and paste version to get something running on Darwin.

Note the file platfrom_defs.h.inc. The section name has a ',' because of Mach-O. The section attribute length can be at most 16 characters. Thus, I had to remove the llvm part of the name.
The objcopy changes are beyond me.
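
A minimal sketch of the section-name constraint described above. Everything here is illustrative: the names `macho_section_spec_ok`, `DEMO_SECTION`, `demo_entry` and the section string `__TEXT,libc_entry` are hypothetical and not taken from the patch, and the ELF section name is only assumed to resemble the one used on Linux. The helper checks that a Mach-O style `segment,section` string keeps both parts within 16 characters.

```cpp
#include <cstddef>

// Mach-O limits segment and section names to 16 characters each.
constexpr std::size_t kMachOMaxNameLen = 16;

// Returns true if `spec` starts with "segment,section" where both parts are
// non-empty and at most 16 characters long. Anything after a second comma
// (optional type/attribute fields) is ignored.
constexpr bool macho_section_spec_ok(const char *spec) {
  std::size_t seg_len = 0;
  while (spec[seg_len] != '\0' && spec[seg_len] != ',')
    ++seg_len;
  if (spec[seg_len] != ',')
    return false; // no comma: not a Mach-O style specifier
  const char *sec = spec + seg_len + 1;
  std::size_t sec_len = 0;
  while (sec[sec_len] != '\0' && sec[sec_len] != ',')
    ++sec_len;
  return seg_len > 0 && seg_len <= kMachOMaxNameLen &&
         sec_len > 0 && sec_len <= kMachOMaxNameLen;
}

#if defined(__APPLE__)
// Hypothetical shortened name: the "llvm" part is dropped to fit the limit.
#define DEMO_SECTION "__TEXT,libc_entry"
static_assert(macho_section_spec_ok(DEMO_SECTION),
              "Mach-O segment/section names must be <= 16 characters");
#else
// Assumed ELF-style name; ELF has no comparable length limit.
#define DEMO_SECTION ".llvm.libc.entrypoint.demo_entry"
#endif

// Place a function into the platform-specific entry point section.
__attribute__((section(DEMO_SECTION))) int demo_entry() { return 42; }
```

With this sketch, a long ELF-style name such as `.llvm.libc.entrypoint` fails the check when used as the section part of a Mach-O specifier, which is why the name had to be shortened.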

It still fails the linting. It claims that stddef.h and stdint.h are system headers. stddef.h and stdint.h are provided by the libc and the resource directory of clang ...

Diff Detail

Event Timeline

tschuett created this revision. Jun 23 2020, 12:49 PM
Herald added a project: Restricted Project. Jun 23 2020, 12:49 PM
tschuett edited the summary of this revision. Jun 23 2020, 12:58 PM
tschuett edited the summary of this revision. Jun 23 2020, 1:10 PM
tschuett updated this revision to Diff 272814. Jun 23 2020, 1:47 PM

copy and paste api.td from linux ...

tschuett updated this revision to Diff 272831. Jun 23 2020, 2:32 PM

some cmake improvements

Thanks a lot for working on this. I have a few first-reaction questions, but I want to play with this patch before I ask them. So, it will take some time before I can come back. In the meantime, @abrachet might share his opinions/ideas. Also, we are heading into a US holiday week during which I will be away the whole week.

tschuett updated this revision to Diff 272924. Jun 24 2020, 12:30 AM

add missing include

tschuett added a comment. (Edited) Jun 24 2020, 2:09 PM

Notes:

  • llvm-objcopy cannot do aliases for Mach-O. It can only rename symbols.
  • ld64 supports an alias_list option to create aliases. It would probably require the mangled names.
  • clang does not support the alias attribute for Darwin.
  • use llvm-readobj to extract the mangled names from the entry point sections.
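
A minimal sketch of why the mangled name matters for aliasing, assuming GCC or clang on an ELF target with the Itanium C++ ABI. The names `demo_libc` and `add_one` are hypothetical. On Darwin, where clang rejects the alias attribute, the sketch falls back to a forwarding wrapper, which is not a true alias.

```cpp
namespace demo_libc {
// Internal C++ implementation; its symbol name is mangled by the compiler.
int add_one(int x) { return x + 1; }
} // namespace demo_libc

#if defined(__APPLE__)
// The alias attribute is not available on Darwin, so forward the call
// through a thin wrapper instead (an extra function, not an alias).
extern "C" int add_one(int x) { return demo_libc::add_one(x); }
#else
// ELF targets: the C symbol `add_one` becomes an alias of the C++ symbol.
// The alias target must be spelled as the mangled name, here
// _ZN9demo_libc7add_oneEi under the Itanium ABI.
extern "C" int add_one(int x)
    __attribute__((alias("_ZN9demo_libc7add_oneEi")));
#endif
```

Calling `add_one(41)` through the C symbol and `demo_libc::add_one(41)` through the C++ name should give the same result on either path.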

Would you mind checking whether it is possible for a program not to be linked against libSystem.dylib, which contains Apple's libc? I thought it wasn't possible, but I might be wrong.

It still fails the linting. It claims that stddef.h and stdint.h are system headers. stddef.h and stdint.h are provided by the libc and the resource directory of clang ...

@PaulkaToast might have some insight here, or maybe ideas on what we can do. But it does make sense that the clang function used in the linter says these are system headers, since they are also provided by Apple: /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stddef.h. This one isn't provided by clang, right?

I don't know whether linking without libSystem is supported at all, but LLVM libc is designed to be layered on top of the system libc. It seems to be a problem of the include order.

This is the stddef.h from clang. The path includes clang's version and something like lib/clang/version ...
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.3/include/stddef.h
This is the stddef.h from the libc. It is similar to your example: SDKs/MacOSX10.15.sdk/usr/include/stddef.h
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/stddef.h
Mine is from the Xcode installation. Yours is from the command line tools.

On Darwin/BSD, I think the OS provides the libc. Not sure if -ffreestanding has any implications. So, for Darwin/BSD, we probably have to use a whitelist approach?

A more important question on Darwin/BSD is, what exactly does a libc mean? Since the OS provides the libc (including the headers), should we just implement redirectors for everything? Maybe not. Maybe LLVM-libc will be an alternative for users wanting to use it, and so we should be like any normal library. There can still be questions like, for how much of the OS-provided libc do we provide an alternative?

However we evolve for Darwin/BSD, I do not think we need to block on concrete answers to the above questions at this point in time. At the same time, my notes here are probably not helping this patch in any way. I did see @tschuett pointing out other valid problems. Unfortunately, I will be away until the 6th of July, so I will not be picking this patch up until then. If others have any further thoughts/opinions, feel free to share.

tschuett added a comment. (Edited) Jun 27 2020, 12:02 AM

Maybe the linting problem is not unique to Darwin, but highlights a more general problem with this approach. So far development was limited to Linux.

Does clang-tidy use -ffreestanding?

Supporting Darwin might help to accelerate the development of LLVM-libc. The development of the string and math library can be done on any platform. My first thought was adding redirectors for malloc and free to support the strdup implementation.

tschuett added a comment. (Edited) Jul 9 2020, 6:21 AM
/Users/XXX/Work/XXX/llvm-project/libc/src/string/memcpy.h:13:1: error: system include stddef.h not allowed, transitively included from /Users/XXX/Work/XXX/llvm-project/libc/src/string/memcpy.h (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/stddef.h) [llvmlibc-restrict-system-libc-headers,-warnings-as-errors]
#include <stddef.h> // size_t

Despite -ffreestanding, it gets the header from the C++ STL?!?

-nostdinc++ solved the linting problems.

tschuett updated this revision to Diff 276956. Jul 10 2020, 2:04 AM
  • fixed linting
  • minimal entry point
sivachandra added inline comments. Jul 10 2020, 11:50 AM
libc/config/darwin/platfrom_defs.h.inc
11

This is probably OK as a compromise. A few points around this:

  1. If we are actually OK with this in general and not just for Darwin, then we don't need to use aliases at all. We can use an arrangement like this:
// entrypoint.h

#define ENTRYPOINT_DECL extern "C"

namespace __llvm_libc {

ENTRYPOINT_DECL <ret_type> entrypoint(<arg list>);

} // namespace __llvm_libc

// entrypoint.cpp
#include "entrypoint.h"

namespace __llvm_libc {

<ret_type> entrypoint(<arg list>) {
 ...
}

} // namespace __llvm_libc

This gives you a C symbol but requires scope qualification in the callers from within LLVM libc. Exactly what we want.

  2. But LLVM libc aims to be intermixed with other libcs, in which case the above mechanism, or even using aliases, does not give us a way to ensure that calls from within LLVM libc to other LLVM libc entrypoints resolve to symbols from LLVM libc itself. This was the reason why we went with the LLVM_LIBC_ENTRYPOINT macro as it is today for Linux.
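
The arrangement from the first point can be sketched concretely as follows. This is only an illustration with hypothetical names (`demo_libc`, `demo_abs`, `demo_distance`), collapsed into one translation unit. It shows a function that gets an unmangled C symbol while C++ callers outside the namespace still have to qualify the name.

```cpp
// --- entrypoint.h (hypothetical) ---
#define ENTRYPOINT_DECL extern "C"

namespace demo_libc {

// C language linkage: the emitted symbol is plain `demo_abs`, but the C++
// name still lives inside the namespace.
ENTRYPOINT_DECL int demo_abs(int x);

} // namespace demo_libc

// --- entrypoint.cpp (hypothetical) ---
namespace demo_libc {

// The definition inherits the C linkage of the earlier declaration.
int demo_abs(int x) { return x < 0 ? -x : x; }

// Another function in the same namespace can call it unqualified.
int demo_distance(int a, int b) { return demo_abs(a - b); }

} // namespace demo_libc
```

From outside the namespace, C++ code has to write `demo_libc::demo_abs(-3)`. Because the linker-level symbol is just `demo_abs`, this sketch does not address the concern in the second point: a same-named symbol from another libc could still be picked up at link time.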

I see, but this solution seems to be a very Linux/ELF way of doing it. I could not find anything similar for Mach-O.

I agree, but also do not have a good answer yet. I am still doing my homework on this. Personally, I strongly believe we need to solve this problem so I am on it. I will share as and when I have something concrete to share.

Now, I realized that I cannot use/develop malloc/free redirectors on Darwin.

tschuett added a comment. (Edited) Jul 10 2020, 12:31 PM

Mach-O is my toy project. The real one will be COFF/Windows.

tschuett abandoned this revision. Sep 17 2020, 12:43 AM