Page MenuHomePhabricator

[lsan] Enable LSan for arm Linux
ClosedPublic

Authored by m.ostapenko on Feb 6 2017, 8:27 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

m.ostapenko created this revision.Feb 6 2017, 8:27 AM
ygribov added inline comments.Feb 6 2017, 9:04 AM
lib/sanitizer_common/sanitizer_linux_libcdep.cc
187

Misleading indent.

test/lsan/TestCases/swapcontext.cc
7

Deserves an explanation.

m.ostapenko added inline comments.Feb 6 2017, 9:53 AM
test/lsan/TestCases/swapcontext.cc
7

The point here is that this test fails on arm with SIGBUS due to child_stack is unaligned:

bash-3.2# LSAN_OPTIONS=log_pointers=1 /build/llvm/projects/compiler-rt/test/lsan/ARMLsanConfig/TestCases/Output/swapcontext.cc.tmp fooChild stack: 0x7e9ee5f3
Bus error

I can align it manually or use malloc/aligned_alloc instead of new[] to avoid the issue.

Updating according to Yura's nits.

rengolin edited edge metadata.Feb 6 2017, 10:48 AM
This comment was removed by rengolin.

You seem to be enabling it on soft and hard float. Did you test on both environments? What type of hardware did you test this?

lib/sanitizer_common/sanitizer_linux.cc
1269

This sequence doesn't coincide with the order of parameters above, and can lead to unexpected results.

Why do you need to reserve the registers in this code at all?

ygribov added inline comments.Feb 6 2017, 11:13 AM
lib/sanitizer_common/sanitizer_linux.cc
1269

Why should it? The sole purpose of register annotations is to pass arguments to swi call below.

rengolin added inline comments.Feb 6 2017, 1:37 PM
lib/sanitizer_common/sanitizer_linux.cc
1269

Right, I just realised the compiler will do the necessary moves.

This code is really opaque, can you add some comments like the ones above?

1284

If this is ARMv4T and no interwork, this will fail, as blx was introduced in ARMv5.

Check compiler-rt/lib/builtins/assembly.h for more info on the multiple ways to return.

ygribov added inline comments.Feb 6 2017, 9:55 PM
lib/sanitizer_common/sanitizer_linux.cc
1269

Note that that's how it's done for other cores, including AArch64. All comments is given for x86 internal_clone implementation above (it basically says that this hairy code is inspired by Glibc clone implementation).

Updating with a little bit more comments.

You seem to be enabling it on soft and hard float. Did you test on both environments? What type of hardware did you test this?

I've tested the patch on two configurations:

  1. Linaro Ubuntu 14.04 (Trusty) on Arndale board. It's a hard float system, right?
  2. I've built a softfp sysroot and tested the patch there on the same Arndale board (just chrooted to new root).
m.ostapenko added inline comments.Feb 7 2017, 12:37 AM
lib/sanitizer_common/sanitizer_linux.cc
1269

Yeah, the reason we have all that stuff is explained in x86_64 port. Anyway, I've added several comments to inline asm to improve readability. Looks better now?

1284

Ah, OK, thanks. Fixed now.

zatrazz added inline comments.Feb 7 2017, 12:16 PM
lib/sanitizer_common/sanitizer_linux.cc
1315

I think you could simplify to just:

#ifdef ARCH_HAS_BX
# ifdef ARCH_HAS_BLX
#  define BLX(R)        "blx "  #R
# else
#  define BLX(R)        "mov     lr, pc; bx" #R
# endif
#else
# define BLX(R)         "mov     lr, pc; mov pc," #R
#endif

And then on inline assembly just call BLX(ip).

lib/sanitizer_common/sanitizer_linux_libcdep.cc
249

This is not what I am seeing on my environment:

2.25 - 1216
2.24 - 1216
2.23 - 1216
2.22 - 1120
2.21 - 1120
2.20 - 1120

The 2.22 to 2.23 change (cause by 3e2ee6f0e3471ce commit) removed HAVE_FORCED_UNWIND and thus 'struct pthread' adds a 'struct _Unwind_Exception' regardless if architecture defined or not HAVE_FORCED_UNWIND.

m.ostapenko updated this revision to Diff 87612.Feb 8 2017, 1:25 AM
m.ostapenko added inline comments.Feb 8 2017, 1:28 AM
lib/sanitizer_common/sanitizer_linux_libcdep.cc
202

I don't really like this change . Perhaps we can create some white list of terminator symbols, or just drop the check?

249

Oh, my bad, should be fixed.

You seem to be enabling it on soft and hard float. Did you test on both environments? What type of hardware did you test this?

I've tested the patch on two configurations:

  1. Linaro Ubuntu 14.04 (Trusty) on Arndale board. It's a hard float system, right?

Oh, things become more complicated... I see many test failures with host GCC 4.8 because fast unwinder failed to get backtraces for mallocs. This happens because __interceptor_malloc, compiled by GCC 4.8, didn't properly save frame pointer register:

00000000 <__interceptor_malloc>:
   0:   e92d 47f0       stmdb   sp!, {r4, r5, r6, r7, r8, r9, sl, lr}
   4:   f5ad 6d85       sub.w   sp, sp, #1064   ; 0x428
   8:   4d29            ldr     r5, [pc, #164]  ; (b0 <__interceptor_malloc+0xb0>)
   a:   4b2a            ldr     r3, [pc, #168]  ; (b4 <__interceptor_malloc+0xb4>)
   c:   af04            add     r7, sp, #16
   e:   447d            add     r5, pc

But if compile with clang (3.3, 3.5 or trunk) or GCC 4.9, the fp is saved properly:

00000000 <__interceptor_malloc>:
   0:   e92d4ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
   4:   e28db01c        add     fp, sp, #28
   8:   e24dd024        sub     sp, sp, #36     ; 0x24
   c:   e24ddb01        sub     sp, sp, #1024   ; 0x400
  10:   e1a08000        mov     r8, r0

So, it seems that LSan's operability depends on what compiler is used to build it :(.

zatrazz edited edge metadata.Feb 9 2017, 11:38 AM

You seem to be enabling it on soft and hard float. Did you test on both environments? What type of hardware did you test this?

I've tested the patch on two configurations:

  1. Linaro Ubuntu 14.04 (Trusty) on Arndale board. It's a hard float system, right?

Oh, things become more complicated... I see many test failures with host GCC 4.8 because fast unwinder failed to get backtraces for mallocs. This happens because __interceptor_malloc, compiled by GCC 4.8, didn't properly save frame pointer register:

00000000 <__interceptor_malloc>:
   0:   e92d 47f0       stmdb   sp!, {r4, r5, r6, r7, r8, r9, sl, lr}
   4:   f5ad 6d85       sub.w   sp, sp, #1064   ; 0x428
   8:   4d29            ldr     r5, [pc, #164]  ; (b0 <__interceptor_malloc+0xb0>)
   a:   4b2a            ldr     r3, [pc, #168]  ; (b4 <__interceptor_malloc+0xb4>)
   c:   af04            add     r7, sp, #16
   e:   447d            add     r5, pc

But if compile with clang (3.3, 3.5 or trunk) or GCC 4.9, the fp is saved properly:

00000000 <__interceptor_malloc>:
   0:   e92d4ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
   4:   e28db01c        add     fp, sp, #28
   8:   e24dd024        sub     sp, sp, #36     ; 0x24
   c:   e24ddb01        sub     sp, sp, #1024   ; 0x400
  10:   e1a08000        mov     r8, r0

So, it seems that LSan's operability depends on what compiler is used to build it :(.

It is a know issue [1] [2] and in fact it a little more complicated, it depends on the flags used as well:
if you mix thumb and arm code fast unwinder won't work. I am not sure which option would be better,
but I am afraid to get a reliable stacktrack on arm (with mixed objects and function built with different compilers)
we will need to use a external unwinder (maybe libunwinder). But it will have the side-effect of making
backtrace way slower.

[1] https://llvm.org/bugs/show_bug.cgi?id=22741
[2] https://llvm.org/bugs/show_bug.cgi?id=18505

You seem to be enabling it on soft and hard float. Did you test on both environments? What type of hardware did you test this?

I've tested the patch on two configurations:

  1. Linaro Ubuntu 14.04 (Trusty) on Arndale board. It's a hard float system, right?

Oh, things become more complicated... I see many test failures with host GCC 4.8 because fast unwinder failed to get backtraces for mallocs. This happens because __interceptor_malloc, compiled by GCC 4.8, didn't properly save frame pointer register:

00000000 <__interceptor_malloc>:
   0:   e92d 47f0       stmdb   sp!, {r4, r5, r6, r7, r8, r9, sl, lr}
   4:   f5ad 6d85       sub.w   sp, sp, #1064   ; 0x428
   8:   4d29            ldr     r5, [pc, #164]  ; (b0 <__interceptor_malloc+0xb0>)
   a:   4b2a            ldr     r3, [pc, #168]  ; (b4 <__interceptor_malloc+0xb4>)
   c:   af04            add     r7, sp, #16
   e:   447d            add     r5, pc

But if compile with clang (3.3, 3.5 or trunk) or GCC 4.9, the fp is saved properly:

00000000 <__interceptor_malloc>:
   0:   e92d4ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
   4:   e28db01c        add     fp, sp, #28
   8:   e24dd024        sub     sp, sp, #36     ; 0x24
   c:   e24ddb01        sub     sp, sp, #1024   ; 0x400
  10:   e1a08000        mov     r8, r0

So, it seems that LSan's operability depends on what compiler is used to build it :(.

It is a know issue [1] [2] and in fact it a little more complicated, it depends on the flags used as well:

if you mix thumb and arm code fast unwinder won't work. I am not sure which option would be better,

but I am afraid to get a reliable stacktrack on arm (with mixed objects and function built with different compilers)
we will need to use a external unwinder (maybe libunwinder). But it will have the side-effect of making
backtrace way slower.

[1] https://llvm.org/bugs/show_bug.cgi?id=22741
[2] https://llvm.org/bugs/show_bug.cgi?id=18505

Right, but I thought it affects only Thumb (that might be mixed with arm) but it seems that even arm mode is affected depending on GCC version. This complicates enabling tests on arm targets, but still, It would be nice to have LSan available on arm.

So, it seems that LSan's operability depends on what compiler is used to build it :(.

It is a know issue [1] [2] and in fact it a little more complicated, it depends on the flags used as well:

if you mix thumb and arm code fast unwinder won't work. I am not sure which option would be better,

but I am afraid to get a reliable stacktrack on arm (with mixed objects and function built with different compilers)
we will need to use a external unwinder (maybe libunwinder). But it will have the side-effect of making
backtrace way slower.

[1] https://llvm.org/bugs/show_bug.cgi?id=22741
[2] https://llvm.org/bugs/show_bug.cgi?id=18505

Right, but I thought it affects only Thumb (that might be mixed with arm) but it seems that even arm mode is affected depending on GCC version. This complicates enabling tests on arm targets, but still, It would be nice to have LSan available on arm.

My only concern is what kind of information user will see when potentially use with different libraries and systems.
We can constrain testing on a working environment, but it does not help when real usercase won't get meaningful
stacktraces.

Other projects usually rely on dwarf CFI for get arm stacktraces (valgrind, glibc) and I think we might need to actually
use it or rely on external libraries for arm (libunwind, etc.). Valgrind even tries to add some instructions scanning
to get around the thumb issues (coregrind/m_stacktrace.c) but it even states that it results in bad output.

One option would be to enable lsan as is and try to improve sanitizer stacktrace on arm
(lib/sanitizer_common/sanitizer_stacktrace.cc).

So, it seems that LSan's operability depends on what compiler is used to build it :(.

It is a know issue [1] [2] and in fact it a little more complicated, it depends on the flags used as well:

if you mix thumb and arm code fast unwinder won't work. I am not sure which option would be better,

but I am afraid to get a reliable stacktrack on arm (with mixed objects and function built with different compilers)
we will need to use a external unwinder (maybe libunwinder). But it will have the side-effect of making
backtrace way slower.

[1] https://llvm.org/bugs/show_bug.cgi?id=22741
[2] https://llvm.org/bugs/show_bug.cgi?id=18505

Right, but I thought it affects only Thumb (that might be mixed with arm) but it seems that even arm mode is affected depending on GCC version. This complicates enabling tests on arm targets, but still, It would be nice to have LSan available on arm.

My only concern is what kind of information user will see when potentially use with different libraries and systems.
We can constrain testing on a working environment, but it does not help when real usercase won't get meaningful
stacktraces.

Yeah, users will need to compile all their code with -marm -fno-omit-frame-pointer to get stacktraces (ASan would benefit from this too) or use slow unwinder with small context (e.g. 5) but this would significantly slow down application under test.

Other projects usually rely on dwarf CFI for get arm stacktraces (valgrind, glibc) and I think we might need to actually
use it or rely on external libraries for arm (libunwind, etc.). Valgrind even tries to add some instructions scanning
to get around the thumb issues (coregrind/m_stacktrace.c) but it even states that it results in bad output.

One option would be to enable lsan as is and try to improve sanitizer stacktrace on arm
(lib/sanitizer_common/sanitizer_stacktrace.cc).

Perhaps we can enable slow unwinder for LSan tests on ARM (compile with -funwind-tables and run with 'fast_unwind_on_malloc=false')? This would make almost all tests pass except few ones (e.g. use_tls_pthread_specific_dynamic.cc) where slow unwinder might be unable to get caller PC (e.g. from libpthread.so).

Updating. I've enabled slow unwinder for LSan tests on ARM to make them pass regardless of arm/thumb mode and compiler that was used to build LSan runtime.

kcc edited edge metadata.Feb 16 2017, 1:42 PM

Updating. I've enabled slow unwinder for LSan tests on ARM to make them pass regardless of arm/thumb mode and compiler that was used to build LSan runtime.

That's pretty bad.
Yes, the tests will pass, but the tool will be utterly useless in real life. Slow unwinder is just too slow.
And setting fast_unwind_on_malloc=false in the test config means we are not testing what the users are going to use.

In D29586#679101, @kcc wrote:

Updating. I've enabled slow unwinder for LSan tests on ARM to make them pass regardless of arm/thumb mode and compiler that was used to build LSan runtime.

That's pretty bad.
Yes, the tests will pass, but the tool will be utterly useless in real life. Slow unwinder is just too slow.
And setting fast_unwind_on_malloc=false in the test config means we are not testing what the users are going to use.

I agree. ARM ABI is pretty bad -- in thumb mode we can't use fast unwinder because of known reasons, but I really don't know how can we reliably run LSAN tests on Arm without enabling slow unwinder. In any case, I cannot push these changes -- you are the code owner after all, but even with slow unwinder we got pretty fine results on our distributiive. Is there any way how we can enable LSAN for Arm?

One option is to force libasan and tests to compile to ARM (via -marm switch) which is known to work. I think common understanding is that fast unwinder is unfixable in Thumb mode.

lib/sanitizer_common/sanitizer_linux_libcdep.cc
202

Or just !isalpha(*end)?

BTW how is the situation different from Asan? Fast unwinder should have _exactly_ same problems there but I guess just decided to live with them.

BTW how is the situation different from Asan? Fast unwinder should have _exactly_ same problems there but I guess just decided to live with them.

As far as I can see, ASan tests that rely on fast unwinder are just disabled on ARM (e.g. use-after-free.cc) via REQUIRES: stable-runtime.

One option is to force libasan and tests to compile to ARM (via -marm switch) which is known to work. I think common understanding is that fast unwinder is unfixable in Thumb mode.

I'm afraid this isn't enough, consider two following cases:

  1. If LSan itself is compiled with Thumb fast unwinder would be unable to leave first frame.
  2. Likewise, if LSan is compiled with GCC 4.8, fast unwinder would be unable to leave first frame due to __interceptor_malloc's frame doesn't contain saved fp (see objdump output several comments above).

Disabling slow unwinder in tests.
Still, if compile LSan with thumb, the whole LSan testsuite would fail. Can we somehow disable it on thumb targets? E.g. check for '-mthumb' flag into lit configs and enable/disable tests accordingly?

E.g. check for '-mthumb' flag into lit configs and enable/disable tests accordingly?

Something like REQUIRES: fast-unwinder-works.

Rebased. I've also disabled LSan tests on Thumb targets filtering them out by presence of -mthumb flag.

Gentle ping. Is there any chance that LSan for ARM will be accepted?

Aleksey is looking at lsan now, hopefully he can help review this.

Rebased. Gentle reminder.

m.ostapenko updated this revision to Diff 93847.Apr 3 2017, 6:03 AM

Rebased. Ping.

alekseyshl edited edge metadata.Apr 3 2017, 11:02 AM

I have no further comments and defer lgtm to folks who know something about the unwinder stuff.

lib/lsan/lsan_common.h
149

IsItaniumABIArrayCookie

160

IsARMABIArrayCookie

lib/sanitizer_common/sanitizer_linux.cc
1265

Isn't stack 4 bytes aligned on arm32?

1293

Why do we push them to stack? Aren't all the params for swi passed in registers?

m.ostapenko updated this revision to Diff 94077.Apr 4 2017, 7:26 AM

Updating according to Aleksey's comments.

eugenis edited edge metadata.Apr 4 2017, 5:12 PM

Going back to the issue with interceptor_malloc not saving FP when built with GCC. In clang a function using builtin_frame_address(0) automatically gets a proper frame pointer. As far as I can see, arm-gcc in the android ndk (version 4.9) behaves the same way on a simple test case. Is that some new optimization where a frame address is computed on the fly?

lib/sanitizer_common/sanitizer_linux_libcdep.cc
202

what exactly are we trying to avoid here? Something like 1.2abc ?
This is where is comes from: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60038

test/lsan/TestCases/swapcontext.cc
28

should it not be 16 for x86_64?

Going back to the issue with interceptor_malloc not saving FP when built with GCC. In clang a function using builtin_frame_address(0) automatically gets a proper frame pointer. As far as I can see, arm-gcc in the android ndk (version 4.9) behaves the same way on a simple test case. Is that some new optimization where a frame address is computed on the fly?

No, I just missed that GCC 4.8 emits thumb code by default on my Linaro Ubuntu 14.04 (configured with --with-mode=thumb switch). Passing -marm resolves the issue.

lib/sanitizer_common/sanitizer_linux_libcdep.cc
202

Yeah, just filter out "strange" Glibc configurations.

test/lsan/TestCases/swapcontext.cc
28

Yeah, x86_64 needs 16 here (not sure why did this work on my x86_64 box, perhaps my machine we OK with unaligned access).

eugenis accepted this revision.Apr 5 2017, 1:33 PM

LGTM with the aligned(16) change

test/lsan/TestCases/swapcontext.cc
28

This only matters for some sse instructions which can appear out of thin air, like an inlined memcpy of a stack-allocated buffer. I guess it did not happen in you case. Still, better be safe.

This revision is now accepted and ready to land.Apr 5 2017, 1:33 PM
This revision was automatically updated to reflect the committed changes.

This breaks bootstrap builds. I put the error message in the revert commit description: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170410/444101.html

This breaks bootstrap builds. I put the error message in the revert commit description: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170410/444101.html

Yeeks, sorry about this. The problem seems to be here (missing space between bx and #R):

#  define BLX(R) "mov lr, pc; bx" #R "\n"

I'll re-land the patch with this fix.