This is an archive of the discontinued LLVM Phabricator instance.

Add implementations of POSIX mmap and munmap functions.
ClosedPublic

Authored by sivachandra on Dec 17 2019, 3:45 PM.

Details

Summary

A set of of linux x86_64 internal syscall helpers have also been added.

This change does not try to be perfect with respect to OS and machine
abstractions. A TODO note has been added at places where such abstractions
would help and make the arrangement scalable and cleaner. Addressing the
TODOs and building such abstractions is not in the scope of this change.
It is hoped that follow up changes cleaning up the problem areas and
addressing the TODOs will better illustrate the need for the changes.

This change also does not try to imitate mmap and munmap implementations
of other libcs. The idea here is to put in the bare minimum required to
obtain a working mmap and munmap, and then add the rest of the
functionality on an as needed basis.

Diff Detail

Event Timeline

sivachandra created this revision.Dec 17 2019, 3:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 17 2019, 3:45 PM
abrachet added inline comments.
libc/config/linux/api.td
7–47

Why are all these in a linux specific directory?

119–121

Is there no linux header for these?

libc/config/linux/platfrom_defs.h.inc
15

Shouldn't this be defined in some linux header that we can include? Either way I don't think we should just be defining it like this, as this is platform specific (although what platforms use a different page size) it should go in the config files I reckon.

libc/config/linux/x86_64/syscall_test.cpp
2

Change the name.

libc/src/sys/mman/mmap_test.cpp
20

Make this pagesize, if it calls mmap2, then this will be an error because it's not a multiple of pagesize.

35

You could do ASSERT_DEATH(array[0] = 123, ::testing::KilledBySignal(SIGSEGV) or something like that. I don't think we need to though, that's just testing the OS at that point. But if you think it is worth the comment it should be worth the test as well.

42

Maybe also a test for invalid munmap?

phosek added inline comments.Dec 18 2019, 7:24 PM
libc/config/linux/platfrom_defs.h.inc
15

I'm not sure whether this should be defined at all, the man page for getpagesize says:

Whether  getpagesize()  is  present as a Linux system call depends on the architecture.  If it is, it returns the kernel symbol PAGE_SIZE, whose value depends on the architecture and machine model.  Generally, one uses binaries that are dependent on the architecture but not on the machine model, in order to have a single binary distribution per architecture.  This means that a user program should not find PAGE_SIZE at compile time from a header file, but use an actual system call, at least for those architectures (like sun4) where this dependency exists.  Here glibc 2.0 fails because its getpagesize() returns a statically derived value, and does not use a system call.  Things are OK in glibc 2.1.
libc/include/sys/syscall.h.def
2

The comment should be updated.

MaskRay added inline comments.Dec 19 2019, 11:01 AM
libc/config/linux/platfrom_defs.h.inc
15

4096 is good for x86. Linux uapi does not define PAGE_SIZE as a constant on arm/aarch64/powerpc/etc.

If you are going to make PAGE_SIZE arch-specific, it probably makes more sense to use PAGESIZE (in POSIX base), and add #define PAGE_SIZE PAGESIZE (XSI extension).

libc/config/linux/syscall_numbers.h.inc
111

Some __NR_*64 macros do not exist on some architectures. In such cases, such SYS_*64 should not be defined as user programs may use #ifdef __SYS_fstatfs64 and the definition here will break them.

It probably also makes sense to define SYS_* to SYS_*64 if the latter is defined, if you don't want users of the new libc also fight with the Y2038 problem.

libc/include/__posix-types.h
15

It probably makes more sense to define off_t as long long (64-bit) or long (32-bit) to not have to deal with LFS64.

sivachandra marked 10 inline comments as done.

Address comments.

sivachandra added inline comments.Dec 19 2019, 1:54 PM
libc/config/linux/api.td
7–47

Right. They can live in a common config file, but they are here as we only have one config right now. When we add a new config, we can move them to a common location as suitable.

119–121

Surprisingly, no. AFAICT, the arch specific values are not exposed via any public linux header. Both glibc and musl also define these macros in their tree.

libc/config/linux/platfrom_defs.h.inc
15

Let me start by saying that it was my mistake to not have added a long comment explaining why I added PAGE_SIZE here.

So, page size should ideally be got from the aux vector[*]. We do not have that yet so I had to rely on some other way to obtain the page size. Looking around, what I found was glibc's use of EXEC_PAGESIZE from linux/param.h. But, glibc is actually mixed about this: in some places it uses EXEC_PAGESIZE, and in some other places uses the value it read from the aux vector. There was a discussion around this long time ago on the glibc list: https://lists.gnu.org/archive/html/bug-hurd/2011-10/msg00035.html.

This hard coding is only an interim step (I have added a comment now), and as @MaskRay points out, for x86 it is actually correct. I plan to move the implementations of mmap and munmap to the config/linux directory (we cannot do it now as the CMake targets will also have to be moved under linux/config. I have a follow up change which helps us avoid it and then we can move.) After moving them, I plan to use EXEC_PAGESIZE from linux/param.h. Though it is not perfect, it will also be an interim step, but a better interim: EXEC_PAGESIZE works for most architectures, but would fail for those on which the page size is not fixed.

* Using page size from the aux vector brings in its own set of problems. It would require the libc to maintain global state with parts of the libc implementation using this state. This causes problems wrt mixing LLVM-libc with another libc. I have not put enough thought into it, so may be it can be overcome cleanly.

libc/config/linux/syscall_numbers.h.inc
111

In a way, I have paralleled what musl does. But, I think you are right so I have now put everything under #ifdef, which is what glibc does.

libc/src/sys/mman/mmap_test.cpp
20

Hmm, I am not sure. AFAIU, mmap2 only changes the interpretation of the offset.

35

Removed now.

abrachet accepted this revision.Dec 19 2019, 3:26 PM

LGTM. The biggest thing at this stage is if our current implementation will make it such that it is too difficult to change later in the pursuit of supporting more architectures and I don't think that is the case here. Even if our internal handling of PAGE_SIZE will change later I think this revision is fine as it is. Wait for someone else to accept too of course.

This revision is now accepted and ready to land.Dec 19 2019, 3:26 PM
phosek accepted this revision.Dec 23 2019, 11:41 AM

LGTM

This revision was automatically updated to reflect the committed changes.