This patch follows https://reviews.llvm.org/D28609 and enables LSan for arm.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
test/lsan/TestCases/swapcontext.cc | ||
---|---|---|
7 | The point here is that this test fails on arm with SIGBUS due to child_stack is unaligned: bash-3.2# LSAN_OPTIONS=log_pointers=1 /build/llvm/projects/compiler-rt/test/lsan/ARMLsanConfig/TestCases/Output/swapcontext.cc.tmp fooChild stack: 0x7e9ee5f3 Bus error I can align it manually or use malloc/aligned_alloc instead of new[] to avoid the issue. |
You seem to be enabling it on soft and hard float. Did you test on both environments? What type of hardware did you test this?
lib/sanitizer_common/sanitizer_linux.cc | ||
---|---|---|
1269 | This sequence doesn't coincide with the order of parameters above, and can lead to unexpected results. Why do you need to reserve the registers in this code at all? |
lib/sanitizer_common/sanitizer_linux.cc | ||
---|---|---|
1269 | Why should it? The sole purpose of register annotations is to pass arguments to swi call below. |
lib/sanitizer_common/sanitizer_linux.cc | ||
---|---|---|
1269 | Right, I just realised the compiler will do the necessary moves. This code is really opaque, can you add some comments like the ones above? | |
1284 | If this is ARMv4T and no interwork, this will fail, as blx was introduced in ARMv5. Check compiler-rt/lib/builtins/assembly.h for more info on the multiple ways to return. |
lib/sanitizer_common/sanitizer_linux.cc | ||
---|---|---|
1269 | Note that that's how it's done for other cores, including AArch64. All comments is given for x86 internal_clone implementation above (it basically says that this hairy code is inspired by Glibc clone implementation). |
I've tested the patch on two configurations:
- Linaro Ubuntu 14.04 (Trusty) on Arndale board. It's a hard float system, right?
- I've built a softfp sysroot and tested the patch there on the same Arndale board (just chrooted to new root).
lib/sanitizer_common/sanitizer_linux.cc | ||
---|---|---|
1315 | I think you could simplify to just: #ifdef ARCH_HAS_BX # ifdef ARCH_HAS_BLX # define BLX(R) "blx " #R # else # define BLX(R) "mov lr, pc; bx" #R # endif #else # define BLX(R) "mov lr, pc; mov pc," #R #endif And then on inline assembly just call BLX(ip). | |
lib/sanitizer_common/sanitizer_linux_libcdep.cc | ||
249 | This is not what I am seeing on my environment: 2.25 - 1216 The 2.22 to 2.23 change (cause by 3e2ee6f0e3471ce commit) removed HAVE_FORCED_UNWIND and thus 'struct pthread' adds a 'struct _Unwind_Exception' regardless if architecture defined or not HAVE_FORCED_UNWIND. |
Oh, things become more complicated... I see many test failures with host GCC 4.8 because fast unwinder failed to get backtraces for mallocs. This happens because __interceptor_malloc, compiled by GCC 4.8, didn't properly save frame pointer register:
00000000 <__interceptor_malloc>: 0: e92d 47f0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, lr} 4: f5ad 6d85 sub.w sp, sp, #1064 ; 0x428 8: 4d29 ldr r5, [pc, #164] ; (b0 <__interceptor_malloc+0xb0>) a: 4b2a ldr r3, [pc, #168] ; (b4 <__interceptor_malloc+0xb4>) c: af04 add r7, sp, #16 e: 447d add r5, pc
But if compile with clang (3.3, 3.5 or trunk) or GCC 4.9, the fp is saved properly:
00000000 <__interceptor_malloc>: 0: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr} 4: e28db01c add fp, sp, #28 8: e24dd024 sub sp, sp, #36 ; 0x24 c: e24ddb01 sub sp, sp, #1024 ; 0x400 10: e1a08000 mov r8, r0
So, it seems that LSan's operability depends on what compiler is used to build it :(.
It is a know issue [1] [2] and in fact it a little more complicated, it depends on the flags used as well:
if you mix thumb and arm code fast unwinder won't work. I am not sure which option would be better,
but I am afraid to get a reliable stacktrack on arm (with mixed objects and function built with different compilers)
we will need to use a external unwinder (maybe libunwinder). But it will have the side-effect of making
backtrace way slower.
[1] https://llvm.org/bugs/show_bug.cgi?id=22741
[2] https://llvm.org/bugs/show_bug.cgi?id=18505
Right, but I thought it affects only Thumb (that might be mixed with arm) but it seems that even arm mode is affected depending on GCC version. This complicates enabling tests on arm targets, but still, It would be nice to have LSan available on arm.
My only concern is what kind of information user will see when potentially use with different libraries and systems.
We can constrain testing on a working environment, but it does not help when real usercase won't get meaningful
stacktraces.
Other projects usually rely on dwarf CFI for get arm stacktraces (valgrind, glibc) and I think we might need to actually
use it or rely on external libraries for arm (libunwind, etc.). Valgrind even tries to add some instructions scanning
to get around the thumb issues (coregrind/m_stacktrace.c) but it even states that it results in bad output.
One option would be to enable lsan as is and try to improve sanitizer stacktrace on arm
(lib/sanitizer_common/sanitizer_stacktrace.cc).
Yeah, users will need to compile all their code with -marm -fno-omit-frame-pointer to get stacktraces (ASan would benefit from this too) or use slow unwinder with small context (e.g. 5) but this would significantly slow down application under test.
Other projects usually rely on dwarf CFI for get arm stacktraces (valgrind, glibc) and I think we might need to actually
use it or rely on external libraries for arm (libunwind, etc.). Valgrind even tries to add some instructions scanning
to get around the thumb issues (coregrind/m_stacktrace.c) but it even states that it results in bad output.One option would be to enable lsan as is and try to improve sanitizer stacktrace on arm
(lib/sanitizer_common/sanitizer_stacktrace.cc).
Perhaps we can enable slow unwinder for LSan tests on ARM (compile with -funwind-tables and run with 'fast_unwind_on_malloc=false')? This would make almost all tests pass except few ones (e.g. use_tls_pthread_specific_dynamic.cc) where slow unwinder might be unable to get caller PC (e.g. from libpthread.so).
Updating. I've enabled slow unwinder for LSan tests on ARM to make them pass regardless of arm/thumb mode and compiler that was used to build LSan runtime.
That's pretty bad.
Yes, the tests will pass, but the tool will be utterly useless in real life. Slow unwinder is just too slow.
And setting fast_unwind_on_malloc=false in the test config means we are not testing what the users are going to use.
I agree. ARM ABI is pretty bad -- in thumb mode we can't use fast unwinder because of known reasons, but I really don't know how can we reliably run LSAN tests on Arm without enabling slow unwinder. In any case, I cannot push these changes -- you are the code owner after all, but even with slow unwinder we got pretty fine results on our distributiive. Is there any way how we can enable LSAN for Arm?
One option is to force libasan and tests to compile to ARM (via -marm switch) which is known to work. I think common understanding is that fast unwinder is unfixable in Thumb mode.
lib/sanitizer_common/sanitizer_linux_libcdep.cc | ||
---|---|---|
202 | Or just !isalpha(*end)? |
BTW how is the situation different from Asan? Fast unwinder should have _exactly_ same problems there but I guess just decided to live with them.
As far as I can see, ASan tests that rely on fast unwinder are just disabled on ARM (e.g. use-after-free.cc) via REQUIRES: stable-runtime.
I'm afraid this isn't enough, consider two following cases:
- If LSan itself is compiled with Thumb fast unwinder would be unable to leave first frame.
- Likewise, if LSan is compiled with GCC 4.8, fast unwinder would be unable to leave first frame due to __interceptor_malloc's frame doesn't contain saved fp (see objdump output several comments above).
Disabling slow unwinder in tests.
Still, if compile LSan with thumb, the whole LSan testsuite would fail. Can we somehow disable it on thumb targets? E.g. check for '-mthumb' flag into lit configs and enable/disable tests accordingly?
E.g. check for '-mthumb' flag into lit configs and enable/disable tests accordingly?
Something like REQUIRES: fast-unwinder-works.
Rebased. I've also disabled LSan tests on Thumb targets filtering them out by presence of -mthumb flag.
I have no further comments and defer lgtm to folks who know something about the unwinder stuff.
lib/lsan/lsan_common.h | ||
---|---|---|
151 | IsItaniumABIArrayCookie | |
162 | IsARMABIArrayCookie | |
lib/sanitizer_common/sanitizer_linux.cc | ||
1265 | Isn't stack 4 bytes aligned on arm32? | |
1293 | Why do we push them to stack? Aren't all the params for swi passed in registers? |
Going back to the issue with interceptor_malloc not saving FP when built with GCC. In clang a function using builtin_frame_address(0) automatically gets a proper frame pointer. As far as I can see, arm-gcc in the android ndk (version 4.9) behaves the same way on a simple test case. Is that some new optimization where a frame address is computed on the fly?
lib/sanitizer_common/sanitizer_linux_libcdep.cc | ||
---|---|---|
202 | what exactly are we trying to avoid here? Something like 1.2abc ? | |
test/lsan/TestCases/swapcontext.cc | ||
28 | should it not be 16 for x86_64? |
No, I just missed that GCC 4.8 emits thumb code by default on my Linaro Ubuntu 14.04 (configured with --with-mode=thumb switch). Passing -marm resolves the issue.
lib/sanitizer_common/sanitizer_linux_libcdep.cc | ||
---|---|---|
202 | Yeah, just filter out "strange" Glibc configurations. | |
test/lsan/TestCases/swapcontext.cc | ||
28 | Yeah, x86_64 needs 16 here (not sure why did this work on my x86_64 box, perhaps my machine we OK with unaligned access). |
LGTM with the aligned(16) change
test/lsan/TestCases/swapcontext.cc | ||
---|---|---|
28 | This only matters for some sse instructions which can appear out of thin air, like an inlined memcpy of a stack-allocated buffer. I guess it did not happen in you case. Still, better be safe. |
This breaks bootstrap builds. I put the error message in the revert commit description: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170410/444101.html
Yeeks, sorry about this. The problem seems to be here (missing space between bx and #R):
# define BLX(R) "mov lr, pc; bx" #R "\n"
I'll re-land the patch with this fix.
IsItaniumABIArrayCookie