This is an archive of the discontinued LLVM Phabricator instance.

ASan: move allocator base to avoid conflict with high-entropy ASLR for x86-64 Linux
Closed · Public

Authored by thurston on Apr 10 2023, 5:44 PM.

Details

Summary

Users have discovered [*] that when CONFIG_ARCH_MMAP_RND_BITS == 32, the kernel's
ASLR will frequently conflict with ASan's allocator on x86-64 Linux, because the
PIE program segment base address of 0x555555555554 plus an ASLR shift of up to
((2**32) * 4K == 0x100000000000) will sometimes exceed ASan's hardcoded
base address of 0x600000000000. We fix this by simply moving the allocator base
to 0x500000000000, which is below the PIE program segment base address. This is
cleaner than trying to move it to another location that is sandwiched between
the PIE program and library segments, because if either of those grows too large,
it will collide with the allocator region.

Note that we will never need to change this base address again (unless we want to increase
the size of the allocator), because ASLR cannot be set above 32 bits for x86-64 Linux (the
PIE program segment and library segments would collide with each other; see also
ARCH_MMAP_RND_BITS_MAX in https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig).

and https://groups.google.com/a/google.com/g/chrome-os-gardeners/c/BbfzCP3dEeo/m/h3C_vVUxCQAJ

Event Timeline

thurston created this revision. Apr 10 2023, 5:44 PM
Herald added a project: Restricted Project. Apr 10 2023, 5:44 PM
Herald added subscribers: Enna1, pengfei.
thurston requested review of this revision. Apr 10 2023, 5:44 PM
Herald added a subscriber: Restricted Project.
thurston retitled this revision from "Move ASan allocator base to avoid conflict with high-entropy ASLR" to "ASan: move allocator base to avoid conflict with high-entropy ASLR for x86-64 Linux". Apr 11 2023, 8:24 AM
thurston edited the summary of this revision.
vitalybuka accepted this revision. Apr 11 2023, 9:58 AM
This revision is now accepted and ready to land. Apr 11 2023, 9:58 AM

I ran a mondo comparing the new allocator base vs. old allocator base; the results are within statistical noise (ordinary flakiness).

Q: Let a = 0x555555555000 + 2**CONFIG_ARCH_MMAP_RND_BITS * 4096 (is this the max load base in the kernel?). Is this patch trying to keep [a, a+image_size) and [kAllocatorSpace, kAllocatorSpace+kAllocatorSize) from overlapping?

If yes, 0x500000000000ULL looks like a good choice!

thurston added a comment. Edited · Apr 12 2023, 9:15 PM

> Q: Let a = 0x555555555000 + 2**CONFIG_ARCH_MMAP_RND_BITS * 4096 (is this the max load base in the kernel?). Is this patch trying to keep [a, a+image_size) and [kAllocatorSpace, kAllocatorSpace+kAllocatorSize) from overlapping?

Yes, that is one of the constraints.

The other constraint is: let b be the top of the library segments, which start at roughly (ignoring the stack) 0x7FFFFFFFFFFF - 2**CONFIG_ARCH_MMAP_RND_BITS * 4096 and grow downwards. We need [b - library_images_size, b] and [kAllocatorSpace, kAllocatorSpace+kAllocatorSize) not to conflict either.

For "reasonable" sizes of image_size, library_images_size and allocator_size, it's actually possible to squeeze the allocator in between the program segment and library segment regions ((0x7fffffffffff - 0x555500000000 - 2 * ((2**32) * 4096)) = roughly 10.7 TiB that can be distributed between the three sizes), but placing it underneath the program segments sidesteps the issue entirely.

Bonus pics of the ASan memory layout before and after the change: https://docs.google.com/presentation/d/1z39DhQxUCT31OBOWWAtjd48BwCm0PgXeaG67QDs3TZI/edit

>> Q: Let a = 0x555555555000 + 2**CONFIG_ARCH_MMAP_RND_BITS * 4096 (is this the max load base in the kernel?). Is this patch trying to keep [a, a+image_size) and [kAllocatorSpace, kAllocatorSpace+kAllocatorSize) from overlapping?
>
> Yes, that is one of the constraints.
>
> The other constraint is: let b be the top of the library segments, which start at roughly (ignoring the stack) 0x7FFFFFFFFFFF - 2**CONFIG_ARCH_MMAP_RND_BITS * 4096 and grow downwards. We need [b - library_images_size, b] and [kAllocatorSpace, kAllocatorSpace+kAllocatorSize) not to conflict either.
>
> For "reasonable" sizes of image_size, library_images_size and allocator_size, it's actually possible to squeeze the allocator in between the program segment and library segment regions ((0x7fffffffffff - 0x555500000000 - 2 * ((2**32) * 4096)) = roughly 10.7 TiB that can be distributed between the three sizes), but placing it underneath the program segments sidesteps the issue entirely.
>
> Bonus pics of the ASan memory layout before and after the change: https://docs.google.com/presentation/d/1z39DhQxUCT31OBOWWAtjd48BwCm0PgXeaG67QDs3TZI/edit

Thank you! The formula and the doc are very useful. I have some understanding of CONFIG_ARCH_MMAP_RND_BITS now...

hans added subscribers: lgrey, hans. Apr 13 2023, 12:54 AM

This broke the lit tests on Mac: LeakSanitizer-AddressSanitizer-x86_64 :: TestCases/Darwin/trampoline.mm
@lgrey who added that in D129385 maybe has some idea of what's going on?

I'll revert this for now.

lgrey added a comment. Apr 13 2023, 8:02 AM

> This broke the lit tests on Mac: LeakSanitizer-AddressSanitizer-x86_64 :: TestCases/Darwin/trampoline.mm
> @lgrey who added that in D129385 maybe has some idea of what's going on?
>
> I'll revert this for now.

Thanks, looking now. Since this wasn't intended to affect Darwin anyway, it can probably reland with SANITIZER_APPLE excluded.

MaskRay added a comment. Edited · Apr 13 2023, 11:31 AM

Confining this to non-Apple platforms looks good to me.

Note that this change (I believe) also affects other less-used 64-bit platforms like s390x/mips64 that I don't know how to test... I think they are likely fine with the change but I cannot say for sure.
Does the Linux kernel documentation give some way to derive the memory mapping layout easily? ;-)

> Confining this to non-Apple platforms looks good to me.
>
> Note that this change (I believe) also affects other less-used 64-bit platforms like s390x/mips64 that I don't know how to test... I think they are likely fine with the change but I cannot say for sure.

Good point. How about limiting the change to SANITIZER_LINUX && defined(__x86_64__)?

> Does the Linux kernel documentation give some way to derive the memory mapping layout easily? ;-)

Dynamically or statically?

>> Confining this to non-Apple platforms looks good to me.
>>
>> Note that this change (I believe) also affects other less-used 64-bit platforms like s390x/mips64 that I don't know how to test... I think they are likely fine with the change but I cannot say for sure.
>
> Good point. How about limiting the change to SANITIZER_LINUX && defined(__x86_64__)?

Not needed :) I have tested this on an AArch64 Linux machine and it works fine.
I think it works with aarch64/amd64 FreeBSD as well and has a fair chance of working on s390x/mips.
Perhaps just Apple platforms need to be excluded.

>> Does the Linux kernel documentation give some way to derive the memory mapping layout easily? ;-)
>
> Dynamically or statically?

Statically, to judge whether a kAllocatorSpace choice works without having to run programs in a qemu-system environment :)

thurston added a comment. Edited · Apr 13 2023, 1:50 PM

>>> Confining this to non-Apple platforms looks good to me.
>>>
>>> Note that this change (I believe) also affects other less-used 64-bit platforms like s390x/mips64 that I don't know how to test... I think they are likely fine with the change but I cannot say for sure.
>>
>> Good point. How about limiting the change to SANITIZER_LINUX && defined(__x86_64__)?
>
> Not needed :) I have tested this on an AArch64 Linux machine and it works fine.
> I think it works with aarch64/amd64 FreeBSD as well and has a fair chance of working on s390x/mips.
> Perhaps just Apple platforms need to be excluded.

I wouldn't want to risk breaking anyone's S390 or MIPS setups. How about making it (SANITIZER_FREEBSD || SANITIZER_LINUX) && defined(__x86_64__)?
(It's not necessary to opt in aarch64 because the old mapping also works there.)

>>> Does the Linux kernel documentation give some way to derive the memory mapping layout easily? ;-)
>>
>> Dynamically or statically?
>
> Statically, to judge whether a kAllocatorSpace choice works without having to run programs in a qemu-system environment :)

That's pretty hard AFAICS. The kernel source code doesn't actually define constants such as 0x555555550000 (the program segment start location) - which made it fun trying to reverse engineer :-). The magic happens in:

#define ELF_ET_DYN_BASE		(2 * DEFAULT_MAP_WINDOW_64 / 3)

and then you need to go down the rabbit hole to find the DEFAULT_MAP_WINDOW_64 value (which is defined in terms of VA_BITS_MIN ...). We also need to calculate ASLR, and the library segments start location and stack size (if we want to place the allocator in between the library and program segments).

We also need to make sure that the allocator doesn't conflict with ASan's shadow mappings, though that can be trivially solved at runtime.

>>>> Confining this to non-Apple platforms looks good to me.
>>>>
>>>> Note that this change (I believe) also affects other less-used 64-bit platforms like s390x/mips64 that I don't know how to test... I think they are likely fine with the change but I cannot say for sure.
>>>
>>> Good point. How about limiting the change to SANITIZER_LINUX && defined(__x86_64__)?
>>
>> Not needed :) I have tested this on an AArch64 Linux machine and it works fine.
>> I think it works with aarch64/amd64 FreeBSD as well and has a fair chance of working on s390x/mips.
>> Perhaps just Apple platforms need to be excluded.
>
> I wouldn't want to risk breaking anyone's S390 or MIPS setups. How about making it (SANITIZER_FREEBSD || SANITIZER_LINUX) && defined(__x86_64__)?
> (It's not necessary to opt in aarch64 because the old mapping also works there.)

I have checked mappings on an x86-64 FreeBSD machine. The address is fine. aarch64 FreeBSD should behave similarly.

I think there is a very good chance that 0x500000000000ULL will work for s390x and mips64, so we don't need to be too defensive here.
In the worst case s390x and mips64 can opt out in a future commit...

>>>> Does the Linux kernel documentation give some way to derive the memory mapping layout easily? ;-)
>>>
>>> Dynamically or statically?
>>
>> Statically, to judge whether a kAllocatorSpace choice works without having to run programs in a qemu-system environment :)
>
> That's pretty hard AFAICS. The kernel source code doesn't actually define constants such as 0x555555550000 (the program segment start location) - which made it fun trying to reverse engineer :-). The magic happens in:
>
> #define ELF_ET_DYN_BASE		(2 * DEFAULT_MAP_WINDOW_64 / 3)
>
> and then you need to go down the rabbit hole to find the DEFAULT_MAP_WINDOW_64 value (which is defined in terms of VA_BITS_MIN ...). We also need to calculate ASLR, and the library segments start location and stack size (if we want to place the allocator in between the library and program segments).
>
> We also need to make sure that the allocator doesn't conflict with ASan's shadow mappings, though that can be trivially solved at runtime.

Reading #define ELF_ET_DYN_BASE is how I approached this as well. Sigh, no good way...

Thanks hans for the heads up about the failed test and taking care of the revert, and lgrey and MaskRay for the advice on how to revise the patch! It's been re-landed (with the suggested change) in D148280.