The original asan port was support Sv39 only, however Sv48 support has
landed to linux kernel[1] for a while, so we are trying to make it support
both Sv39 and Sv48, the key point is we need to use a dynamic offset for the
shadow region, this bring extra runtime overhead, but compare to other
overhead in ASan, this should be acceptable.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
compiler-rt/lib/sanitizer_common/sanitizer_platform.h | ||
---|---|---|
310 | Just out of curiosity, why can't we modify the code for SV57 here which I assume works for Sv48 and SV39? Maybe there is no way to test it? root@74c2d5eadeed:/llvm-project/build# ./nobug string is: Hello world! ==76543==LeakSanitizer has encountered a fatal error. ==76543==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1 ==76543==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc) |
Folks, please let me know in case I can do anything to make this patch merged as soon as possible. I can run tests with Qemu, docker and an unmatched board.
The original asan port was support Sv39 only, however Sv48 support has landed to linux kernel[1] for a while, so we are trying to make it support both Sv39 and Sv48,
Supporting too many address space bits can introduce significant hidden maintenance burden and performance implication, e.g. using SizeClassAllocator32 for LeakSanitizer (standalone or asan/hwasan integrated) is extremely slow.
See D137666 that AArch64 msan dropped <48-bit VMA. A line needs to be drawn.
+1
Also overhead of dynamic shadow is quite significant, it dependents on hardware. So it would be nice if you can just drop smaller address spaces.
Supporting too many address space bits can introduce significant hidden maintenance burden and performance implication, e.g. using SizeClassAllocator32 for LeakSanitizer (standalone or asan/hwasan integrated) is extremely slow.
See D137666 that AArch64 msan dropped <48-bit VMA. A line needs to be drawn.
As I know there is still many RISC-V core are implement with Sv39 for now, we might able to drop support in future but it's not good timing for now I think.
Will defer few more days to commit for continue the discussion.
compiler-rt/lib/sanitizer_common/sanitizer_platform.h | ||
---|---|---|
310 |
In theory: yes, we can modify it to Sv57 and should work with Sv48 and Sv39, but we have no way to test due to the RISC-V linux didn't support that - and we don't have HW to test that too.
The message seems cause by docker. |
With the current (unpatched) LLVM tree on an RV64 SV39 host I get this ASAN test failure (for check-asan):
- AddressSanitizer-riscv64-linux :: TestCases/memcmp_test.cpp
With this patch on an RV64 SV48 host I get these test failures:
- AddressSanitizer-riscv64-linux :: TestCases/Linux/syscalls.cpp
- AddressSanitizer-riscv64-linux :: TestCases/Posix/bcmp_test.cpp
- AddressSanitizer-riscv64-linux :: TestCases/strncpy-overflow.cpp
I'll need to double-check if those results are actually correct but...
@kito-cheng Does that match your test results? If not, what do you observe? If it matches, have you investigated the new test failures?
compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp | ||
---|---|---|
1119 | The "default is 47-bit for RISCV64" is confusing, after listing "32, 39, 48 and 57". By that do you mean that Linux defaults to SV48? (if so fix 47->48). Or that the sanitizer defaults to a 1 << 47 userspace address range? Or both, or something else? |
Just back from Chinese new year vocation :P let me check with my local test environment again, thanks for testing and reporting this.
compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp | ||
---|---|---|
1119 | That's my stupid copy-and-paste error from AArch64, and my intention is support multiple address space layout. |
Sorry for not updating for a while, we/SiFive gonna to seeking a ultimate solution to work on each different Sv* system, my exception is we still need dynamic shadow sentinel which is what we did for this patch, but it not work well as @luismarques report, so I would like to defer this patch landing for now, and it become high priority work to SiFive, we've allocated dedicated engineer resources to improve the ASan including the ASan for scalable vector stuff and RVV intrinsic, the both work are targeting end of Q2.
Hi all,
I just published 3 more reviews:
https://reviews.llvm.org/D152895
https://reviews.llvm.org/D152990
https://reviews.llvm.org/D152991
With the 3 new patches, the check-lsan passes in our sv39, sv48, sv57 QEMU system mode test environments.
The common check-asan failure cases in the 3 virtual memory modes are
AddressSanitizer-riscv64-linux :: TestCases/Linux/syscalls.cpp AddressSanitizer-riscv64-linux :: TestCases/Posix/bcmp_test.cpp AddressSanitizer-riscv64-linux :: TestCases/memcmp_test.cpp AddressSanitizer-riscv64-linux :: TestCases/log-path_test.cpp
The first 3 failures are because the CHECK statements check specific function names in call stack, like
// CHECK: AddressSanitizer: stack-buffer-overflow (v) // CHECK: {{#1.*bcmp}} <= frame1 // CHECK: {{#2.*main}} <= frame2
But the function QuickCheckForUnpoisonedRegion() is not inlined, so the call stack looks a little bit different:
#0 0x2aac31a59c in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) .../llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:804:7 #1 0x2aac31a9aa in QuickCheckForUnpoisonedRegion .../llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:850:10 #2 0x2aac31a9aa in bcmp .../llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:817:7 #3 0x2aac3bbca8 in main .../llvm-project/compiler-rt/test/asan/TestCases/Posix/bcmp_test.cpp:13:13 #4 0x3f886e36c0 (/lib/libc.so.6+0x266c0) #5 0x3f886e376c in __libc_start_main (/lib/libc.so.6+0x2676c) #6 0x2aac306d52 in _start (.../llvm-project/build/test/asan/RISCV64LinuxConfig/TestCases/Posix/Output/bcmp_test.cpp.tmp+0x26d52)
https://reviews.llvm.org/D152991 looses the check statement to make the tests pass.
For AddressSanitizer-riscv64-linux :: TestCases/log-path_test.cpp failure, one of the test tries to set the log path to /INVALID and expects error.
But since in our QEMU test environment uses root, the path is actually valid.
There are 3 additional check-asan failures for sv57:
AddressSanitizer-riscv64-linux :: TestCases/Linux/allocator_oom_test.cpp AddressSanitizer-riscv64-linux :: TestCases/Linux/auto_memory_profile_test.cpp AddressSanitizer-riscv64-linux :: TestCases/Linux/quarantine_size_mb.cpp
We are still investigating them. But since the tests seems not so critical and sv57 is rather little users, we decide to propose the reviews for now.
@vitalybuka this patch is waiting D152895 for make sure LSan is work fine on both Sv39 and Sv48, and D152990 could make it work on Sv57.
This patch has tested ASan on Sv39 and Sv48 by @hauhsu, and all failures found by @luismarques has been addressed by D152991 + D152895.
If I'm reading this right, anything built with ASAN will fail on sv48 or sv57? We're defaulting the mmap() hints to sv48 because we didn't have any failing workloads there (Hotspot and Go fail sv57), but if this fails we should probably default to sv37. Is there a reproducer bouncing around we can put in kselftests?
I guess you might confused by SANITIZER_MMAP_RANGE_SIZE, it a parameter for internal memory allocator, set for larger Sv is always works on smaller Sv but less memory efficiency.
And the original reason fail on the Sv48/Sv57 is because most setting is Sv sensitive (kRISCV64_ShadowOffset64), and we try make it into Sv insensitive by trading few more runtime performance.
The "default is 47-bit for RISCV64" is confusing, after listing "32, 39, 48 and 57". By that do you mean that Linux defaults to SV48? (if so fix 47->48). Or that the sanitizer defaults to a 1 << 47 userspace address range? Or both, or something else?