This is an archive of the discontinued LLVM Phabricator instance.

[Support] Support MF_HUGE_HINT on Linux and FreeBSD
Changes Planned (Public)

Authored by riccibruno on Dec 29 2019, 11:37 AM.

Details

Summary

Implement support for large page requests with the allocateMappedMemory interface on Linux and FreeBSD. If the MF_HUGE_HINT flag is passed, we first try an mmap with the relevant flag. If that fails, we fall back to a normal mmap. On some systems (such as Linux with transparent huge pages) it is possible to indicate with madvise that a certain range of memory should use large pages. We do this, where supported, if the first mmap with the large page flag failed.
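For illustration, the fallback sequence described above could look roughly like the following. This is a minimal sketch rather than the code from the patch: mapWithHugeHint and its UsedHugeFlag parameter are hypothetical names, and the preprocessor checks assume that MAP_HUGETLB and MADV_HUGEPAGE are visible as macros where the platform supports them.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper, not part of LLVM. Returns the mapped address, or
// nullptr if both mmap attempts fail. UsedHugeFlag reports whether the
// mapping was created with the explicit huge page flag.
void *mapWithHugeHint(std::size_t NumBytes, bool &UsedHugeFlag) {
  assert(NumBytes > 0 && "zero-length mappings are rejected by mmap");
  const int Prot = PROT_READ | PROT_WRITE;
  const int Flags = MAP_PRIVATE | MAP_ANONYMOUS;
  UsedHugeFlag = false;

#ifdef MAP_HUGETLB
  // First attempt: explicitly request huge pages. Note that unmapping such
  // a region needs a length that is a multiple of the huge page size.
  void *Addr = ::mmap(nullptr, NumBytes, Prot, Flags | MAP_HUGETLB, -1, 0);
  if (Addr != MAP_FAILED) {
    UsedHugeFlag = true;
    return Addr;
  }
#endif

  // Fallback: an ordinary mapping with the default page size.
  void *Plain = ::mmap(nullptr, NumBytes, Prot, Flags, -1, 0);
  if (Plain == MAP_FAILED)
    return nullptr;

#ifdef MADV_HUGEPAGE
  // Best effort: ask transparent huge pages to back this range.
  // A failure here is ignored because the request is only a hint.
  (void)::madvise(Plain, NumBytes, MADV_HUGEPAGE);
#endif
  return Plain;
}
```

The madvise result is deliberately discarded: the mapping is usable either way, which matches the "hint" semantics of the flag.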

Note that only a small test is added since the allocateMappedMemory interface is already well-tested. It seems difficult to reliably test that we got large pages. I am open to suggestions here (maybe with userfaultfd?).

Event Timeline

riccibruno created this revision. Dec 29 2019, 11:37 AM
riccibruno retitled this revision from [Support] Support MF_HUGE_HINT for Linux and FreeBSD to [Support] Support MF_HUGE_HINT on Linux and FreeBSD. Dec 29 2019, 11:38 AM
riccibruno marked an inline comment as done. Dec 30 2019, 7:26 AM
riccibruno added inline comments.
llvm/lib/Support/Unix/Memory.inc
163

Note that the choice to first try without the large page hint is arbitrary. If anyone has an argument for one way or the other, please go ahead.

rnk added a subscriber: aganea. Dec 30 2019, 2:24 PM

+@aganea, who added this flag.

Note that only a small test is added since the allocateMappedMemory interface is already well-tested. It seems difficult to reliably test that we got large pages. I am open to suggestions here (maybe with userfaultfd?).

At a certain point, this would really be a test for the OS. That's pretty hard. I think this is fine without additional testing.

llvm/lib/Support/Unix/Memory.inc
79
169

This code pattern of a recursive tail call shows up in a few places in old LLVM code like this. Personally, I don't like it. It seems like a do / while loop around the mmap call itself would be clearer, since we don't need to re-run most of the flag calculations above. This would also avoid the Impl rename.
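To illustrate the suggestion, a loop-based retry might look like the sketch below. It assumes the retry simply drops a near-address hint after a failed attempt; mapNear and its parameters are hypothetical names and are not taken from Memory.inc.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical sketch: retry inside a do/while loop instead of re-entering
// the whole function. The protection and mapping flags are computed once.
void *mapNear(void *NearHint, std::size_t NumBytes) {
  assert(NumBytes > 0 && "zero-length mappings are rejected by mmap");
  const int Prot = PROT_READ | PROT_WRITE;
  const int Flags = MAP_PRIVATE | MAP_ANONYMOUS;

  void *Hint = NearHint;
  void *Addr = MAP_FAILED;
  bool Retry;
  do {
    Retry = false;
    Addr = ::mmap(Hint, NumBytes, Prot, Flags, -1, 0);
    if (Addr == MAP_FAILED && Hint != nullptr) {
      // The attempt with a hint failed: drop the hint and try once more.
      Hint = nullptr;
      Retry = true;
    }
  } while (Retry);

  return Addr == MAP_FAILED ? nullptr : Addr;
}
```

Because only the hint changes between iterations, nothing above the loop needs to be recomputed, and no separate Impl function is required.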

riccibruno planned changes to this revision. Jan 3 2020, 12:40 PM
riccibruno marked 2 inline comments as done.
In D71975#1799355, @rnk wrote:

+@aganea, who added this flag.

Note that only a small test is added since the allocateMappedMemory interface is already well-tested. It seems difficult to reliably test that we got large pages. I am open to suggestions here (maybe with userfaultfd?).

At a certain point, this would really be a test for the OS. That's pretty hard. I think this is fine without additional testing.

So I did rework the patch to address your comments, but I ran into an "interesting" design decision in how mmap/munmap work on Linux:

  • mmap with the MAP_HUGETLB flag automatically adjusts the length parameter to be a multiple of the size of a huge page. But there is no easy way to find out what value the length was adjusted to. There is also no easy way to find out what the size of a huge page is. The recommended method (and what libhugetlbfs does) is to parse /proc/meminfo.
  • munmap requires the adjusted length.

So I need to:

  • Add a getHugePageSize similar to sys::Process::getPageSize which will do its best to find out about the huge page size(s) on the platform. This needs to be done for Windows too.
  • Use that in allocateMappedMemory.
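
As a rough sketch of the plan above, the huge page size could be read from /proc/meminfo and then used to round the length passed to munmap. The implementation below is an assumption for illustration only: parseHugePageSize and roundUpTo are hypothetical helpers, this getHugePageSize body is not the one from the patch, and the parser assumes the "Hugepagesize:" value is reported in kB.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scan a meminfo-style stream for a line such as
// "Hugepagesize:    2048 kB". Returns the size in bytes, or 0 if the line
// is missing. The value is assumed to be in kB.
std::size_t parseHugePageSize(std::istream &In) {
  std::string Line;
  while (std::getline(In, Line)) {
    std::istringstream Fields(Line);
    std::string Key;
    std::size_t Value = 0;
    if ((Fields >> Key >> Value) && Key == "Hugepagesize:")
      return Value * 1024;
  }
  return 0;
}

// Sketch of a getHugePageSize for Linux: returns 0 if /proc/meminfo cannot
// be opened or does not report a huge page size.
std::size_t getHugePageSize() {
  std::ifstream MemInfo("/proc/meminfo");
  return MemInfo ? parseHugePageSize(MemInfo) : 0;
}

// Round Len up to the next multiple of Align, e.g. to compute the length
// that munmap expects for a MAP_HUGETLB mapping.
std::size_t roundUpTo(std::size_t Len, std::size_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Len + Align - 1) / Align * Align;
}
```

A caller would treat a return value of 0 from getHugePageSize as "unknown" and skip the explicit huge page request in that case.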
llvm/lib/Support/Unix/Memory.inc
169

I agree that your suggestion is nicer.

aganea added a comment. Jan 3 2020, 1:18 PM

So I need to:

  • Add a getHugePageSize similar to sys::Process::getPageSize which will do its best to find out about the huge page size(s) on the platform. This needs to be done for Windows too.

Do you need to know the large page size outside of Memory.inc? If not, you could mark it as a static function, in the same way as the Windows enableProcessLargePages() (which detects the minimum large page size). Additionally, you can take a look at how rpmalloc does it: rpmalloc.c, L1802-L1871 in D71786, as it already handles the same platforms as LLVM.

bsdjhb added a comment. Jan 7 2020, 5:16 PM

For FreeBSD, MAP_ALIGNED_SUPER probably isn't what you want. The default mmap() for MAP_ANON on FreeBSD tries to do what MAP_ALIGNED_SUPER does if the request is at least one large page in size first, and if that fails to find available address space, falls back to finding any possible virtual addresses (it does the two steps you are doing inside of mmap() itself). Specifying MAP_ALIGNED_SUPER tells mmap() to only try the first pass and fail without trying the second pass, but then the second call to mmap() that you would make as a result would just do both passes anyway.

FreeBSD doesn't currently have a "must-only-use-large-pages-and-never-demote-to-small-pages" flag which is probably what you would want to use here instead. (We've talked about adding one, but no one has done it to date.)

For FreeBSD, MAP_ALIGNED_SUPER probably isn't what you want. The default mmap() for MAP_ANON on FreeBSD tries to do what MAP_ALIGNED_SUPER does if the request is at least one large page in size first, and if that fails to find available address space, falls back to finding any possible virtual addresses (it does the two steps you are doing inside of mmap() itself). Specifying MAP_ALIGNED_SUPER tells mmap() to only try the first pass and fail without trying the second pass, but then the second call to mmap() that you would make as a result would just do both passes anyway.

Ah, I did not know that. Thanks for your comment. The reason I used MAP_ALIGNED_SUPER is the note in the man page for mmap(2):

The MAP_ALIGNED_SUPER flag is an optimization that will align the mapping request to the size of a large page similar to MAP_ALIGNED [...]

but what you are saying is that for large enough requests with MAP_ANON the default is already MAP_ALIGNED_SUPER with a fallback. The only difference with using MAP_ALIGNED_SUPER is that we can detect if the first mmap fails and clear the MF_HUGE_HINT flag. However, this is not actually useful since the MF_HUGE_HINT flag in the returned block of memory has no reliable meaning (on some systems the block of memory may use large pages even if MF_HUGE_HINT is cleared, and on other systems the block of memory may use small pages even if MF_HUGE_HINT is set).

FreeBSD doesn't currently have a "must-only-use-large-pages-and-never-demote-to-small-pages" flag which is probably what you would want to use here instead. (We've talked about adding one, but no one has done it to date.)

Not really; MF_HUGE_HINT is only a hint, and a fallback to small pages is entirely acceptable.