This is an archive of the discontinued LLVM Phabricator instance.

[asan] Skip all non-shared objects in FindFirstDSOCallback.
Needs Revision
Public

Authored by dankm on Nov 27 2015, 7:07 PM.

Details

Summary

FreeBSD's implementation of dl_iterate_phdr iterates over all ELF
objects, but we only care about shared objects (PT_LOAD type). Skip
the non-shared ones so that the address sanitizer runtime can work as
a shared object on FreeBSD.

Event Timeline

dankm updated this revision to Diff 41317. Nov 27 2015, 7:07 PM
dankm retitled this revision from to Skip all non-shared objects in FindFirstDSOCallback..
dankm updated this object.
dankm added a comment. Nov 27 2015, 7:10 PM

The changes to CMakeLists.txt are to ensure that the shared runtime doesn't link libc before libthr. If that happens then the pthread implementation doesn't get intercepted properly.

dankm retitled this revision from Skip all non-shared objects in FindFirstDSOCallback. to [asan] Skip all non-shared objects in FindFirstDSOCallback.. Nov 27 2015, 7:22 PM
dankm removed a reviewer: llvm-commits.
dankm set the repository for this revision to rL LLVM.
dankm removed rL LLVM as the repository for this revision. Nov 27 2015, 7:38 PM
dim added inline comments. Nov 28 2015, 5:27 AM
lib/asan/CMakeLists.txt
68

So where did libc itself go? Was it actually needed?

dankm added inline comments. Nov 30 2015, 7:44 AM
lib/asan/CMakeLists.txt
68

It's not needed, at least on FreeBSD. It implicitly links against libc. The resulting shared object has libc last in its DT_NEEDED set.

eugenis edited edge metadata. Nov 30 2015, 5:12 PM

This is fun.
We never explicitly link libc on Linux because the CMake check for printf in libc fails spectacularly:

[1/2] Building C object CMakeFiles/cmTC_724d8.dir/CheckFunctionExists.c.o
FAILED: /code/build-llvm0/bin/clang -fPIC -Wall -W -Wno-unused-parameter -Wwrite-strings -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Werror -fcolor-diagnostics -ffunction-sections -fdata-sections -DCHECK_FUNCTION_EXISTS=printf -o CMakeFiles/cmTC_724d8.dir/CheckFunctionExists.c.o -c /usr/local/share/cmake-3.4/Modules/CheckFunctionExists.c
/usr/local/share/cmake-3.4/Modules/CheckFunctionExists.c:6:6: error: incompatible redeclaration of library function 'printf' [-Werror,-Wincompatible-library-redeclaration]
char CHECK_FUNCTION_EXISTS();
     ^
<command line>:1:31: note: expanded from here
#define CHECK_FUNCTION_EXISTS printf
                              ^
/usr/local/share/cmake-3.4/Modules/CheckFunctionExists.c:6:6: note: 'printf' is a builtin with type 'int (const char *, ...)'
<command line>:1:31: note: expanded from here
#define CHECK_FUNCTION_EXISTS printf
                              ^
1 error generated.

Just kill the whole thing.

dim accepted this revision. Dec 4 2015, 10:36 AM
dim edited edge metadata.

Ok, this looks good to me. On Linux, however, this might slightly change the first DSO found, as long as it's *not* asan. With the example from dl_iterate_phdr(3) (slightly modified to print the type), I get this output on Linux:

$ ./dlip
name= (8 segments), type PT_PHDR
		 header  0: address=  0x400040
		 header  1: address=  0x400200
		 header  2: address=  0x400000
		 header  3: address=  0x6009a0
		 header  4: address=  0x6009b8
		 header  5: address=  0x40021c
		 header  6: address=  0x400848
		 header  7: address=     (nil)
name=linux-vdso.so.1 (4 segments), type PT_LOAD
		 header  0: address=0x7ffcbed7f000
		 header  1: address=0x7ffcbed7f318
		 header  2: address=0x7ffcbed7f818
		 header  3: address=0x7ffcbed7f854
name=/lib/x86_64-linux-gnu/libc.so.6 (10 segments), type PT_PHDR
		 header  0: address=0x7f3ec9f70040
		 header  1: address=0x7f3eca0dc330
		 header  2: address=0x7f3ec9f70000
		 header  3: address=0x7f3eca30f740
		 header  4: address=0x7f3eca312ba0
		 header  5: address=0x7f3ec9f70270
		 header  6: address=0x7f3eca30f740
		 header  7: address=0x7f3eca0dc34c
		 header  8: address=0x7f3ec9f70000
		 header  9: address=0x7f3eca30f740
name=/lib64/ld-linux-x86-64.so.2 (7 segments), type PT_LOAD
		 header  0: address=0x7f3eca319000
		 header  1: address=0x7f3eca539c00
		 header  2: address=0x7f3eca539e70
		 header  3: address=0x7f3eca3191c8
		 header  4: address=0x7f3eca336440
		 header  5: address=0x7f3eca319000
		 header  6: address=0x7f3eca539c00

So whereas libc.so.6 would be found earlier, it will now first find ld-linux-x86-64.so.2. I'm not sure why libc.so.6 is identified as PT_PHDR, though.

When using -fsanitize=address, the output is changed to:

$ ./dlip
name= (8 segments), type PT_PHDR
		 header  0: address=  0x400040
		 header  1: address=  0x400200
		 header  2: address=  0x400000
		 header  3: address=  0x601478
		 header  4: address=  0x6014a8
		 header  5: address=  0x40021c
		 header  6: address=  0x4012d4
		 header  7: address=     (nil)
name=linux-vdso.so.1 (4 segments), type PT_LOAD
		 header  0: address=0x7ffebd599000
		 header  1: address=0x7ffebd599318
		 header  2: address=0x7ffebd599818
		 header  3: address=0x7ffebd599854
name=/usr/lib/x86_64-linux-gnu/libasan.so.1 (7 segments), type PT_LOAD
		 header  0: address=0x7ff8acafc000
		 header  1: address=0x7ff8acd97a00
		 header  2: address=0x7ff8acd98848
		 header  3: address=0x7ff8acafc1c8
		 header  4: address=0x7ff8acd97a00
		 header  5: address=0x7ff8acb84d18
		 header  6: address=0x7ff8acafc000
name=/lib/x86_64-linux-gnu/libc.so.6 (10 segments), type PT_PHDR
		 header  0: address=0x7ff8ac753040
		 header  1: address=0x7ff8ac8bf330
		 header  2: address=0x7ff8ac753000
		 header  3: address=0x7ff8acaf2740
		 header  4: address=0x7ff8acaf5ba0
		 header  5: address=0x7ff8ac753270
		 header  6: address=0x7ff8acaf2740
		 header  7: address=0x7ff8ac8bf34c
		 header  8: address=0x7ff8ac753000
		 header  9: address=0x7ff8acaf2740
name=/lib/x86_64-linux-gnu/libpthread.so.0 (9 segments), type PT_PHDR
		 header  0: address=0x7ff8ac536040
		 header  1: address=0x7ff8ac5481c0
		 header  2: address=0x7ff8ac536000
		 header  3: address=0x7ff8ac74db80
		 header  4: address=0x7ff8ac74dd50
		 header  5: address=0x7ff8ac536238
		 header  6: address=0x7ff8ac5481dc
		 header  7: address=0x7ff8ac536000
		 header  8: address=0x7ff8ac74db80
name=/lib/x86_64-linux-gnu/libdl.so.2 (9 segments), type PT_PHDR
		 header  0: address=0x7ff8ac332040
		 header  1: address=0x7ff8ac333a30
		 header  2: address=0x7ff8ac332000
		 header  3: address=0x7ff8ac534d60
		 header  4: address=0x7ff8ac534d88
		 header  5: address=0x7ff8ac332238
		 header  6: address=0x7ff8ac333a4c
		 header  7: address=0x7ff8ac332000
		 header  8: address=0x7ff8ac534d60
name=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 (8 segments), type PT_LOAD
		 header  0: address=0x7ff8ac027000
		 header  1: address=0x7ff8ac313108
		 header  2: address=0x7ff8ac31a178
		 header  3: address=0x7ff8ac027200
		 header  4: address=0x7ff8ac313108
		 header  5: address=0x7ff8ac0f13ec
		 header  6: address=0x7ff8ac027000
		 header  7: address=0x7ff8ac313108
name=/lib/x86_64-linux-gnu/libm.so.6 (9 segments), type PT_PHDR
		 header  0: address=0x7ff8abd26040
		 header  1: address=0x7ff8abe1d1f0
		 header  2: address=0x7ff8abd26000
		 header  3: address=0x7ff8ac025d90
		 header  4: address=0x7ff8ac025da8
		 header  5: address=0x7ff8abd26238
		 header  6: address=0x7ff8abe1d20c
		 header  7: address=0x7ff8abd26000
		 header  8: address=0x7ff8ac025d90
name=/lib64/ld-linux-x86-64.so.2 (7 segments), type PT_LOAD
		 header  0: address=0x7ff8ad9d3000
		 header  1: address=0x7ff8adbf3c00
		 header  2: address=0x7ff8adbf3e70
		 header  3: address=0x7ff8ad9d31c8
		 header  4: address=0x7ff8ad9f0440
		 header  5: address=0x7ff8ad9d3000
		 header  6: address=0x7ff8adbf3c00
name=/lib/x86_64-linux-gnu/libgcc_s.so.1 (6 segments), type PT_LOAD
		 header  0: address=0x7ff8abb10000
		 header  1: address=0x7ff8abd25450
		 header  2: address=0x7ff8abd25470
		 header  3: address=0x7ff8abb10190
		 header  4: address=0x7ff8abb23490
		 header  5: address=0x7ff8abb10000

As you can see, libasan.so.1 is the first PT_LOAD segment.

On FreeBSD, the output is rather different:

$ ./dlip
name=/tmp/dlip (8 segments), type PT_PHDR
                 header  0: address= 0x8048034
                 header  1: address= 0x8048134
                 header  2: address= 0x8048000
                 header  3: address= 0x8049920
                 header  4: address= 0x8049934
                 header  5: address= 0x804814c
                 header  6: address= 0x80488d4
                 header  7: address=       0x0
name=/lib/libc.so.7 (6 segments), type PT_LOAD
                 header  0: address=0x2806e000
                 header  1: address=0x281ca000
                 header  2: address=0x281ccad8
                 header  3: address=0x281ca000
                 header  4: address=0x281c98e8
                 header  5: address=0x2806e000
name=/libexec/ld-elf.so.1 (0 segments), type (null phdr)

E.g. here libc.so.7 will be found as the first PT_LOAD segment. (Since I don't have a dynamic asan lib yet, I can't show the output for that case.)
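
For readers who want to reproduce listings like the ones above, here is a minimal sketch of such a program. It is not the actual dlip source, which is not shown in this thread; the names phdr_type_name, print_object and dump_loaded_objects are hypothetical, and the output format only approximates the listings.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc declares dl_iterate_phdr only when this is set */
#endif
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Printable name for the common program header types (hypothetical helper). */
const char *phdr_type_name(unsigned long type) {
  switch (type) {
  case PT_NULL:    return "PT_NULL";
  case PT_LOAD:    return "PT_LOAD";
  case PT_DYNAMIC: return "PT_DYNAMIC";
  case PT_INTERP:  return "PT_INTERP";
  case PT_NOTE:    return "PT_NOTE";
  case PT_PHDR:    return "PT_PHDR";
  case PT_TLS:     return "PT_TLS";
  default:         return "(other)";
  }
}

/* Called once per loaded ELF object; prints it and bumps the counter in data. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data) {
  int *count = data;
  int have_phdrs = info->dlpi_phdr != NULL && info->dlpi_phnum > 0;
  (void)size;

  /* The "type" printed here is the type of program header 0 only. */
  printf("name=%s (%d segments), type %s\n",
         info->dlpi_name ? info->dlpi_name : "", (int)info->dlpi_phnum,
         have_phdrs ? phdr_type_name(info->dlpi_phdr[0].p_type)
                    : "(null phdr)");
  for (int i = 0; have_phdrs && i < (int)info->dlpi_phnum; i++) {
    uintptr_t addr = (uintptr_t)(info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
    printf("\t\t header %2d: address=%10p\n", i, (void *)addr);
  }
  ++*count;
  return 0; /* zero means: keep iterating */
}

/* Walks every loaded object and returns how many were reported. */
int dump_loaded_objects(void) {
  int count = 0;
  dl_iterate_phdr(print_object, &count);
  return count;
}
```

Calling dump_loaded_objects() from main prints one block per loaded object; the order and the addresses depend on the platform and the dynamic linker.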

This revision is now accepted and ready to land. Dec 4 2015, 10:36 AM
eugenis added inline comments. Dec 4 2015, 3:48 PM
lib/asan/asan_linux.cc
100

You are looking at the first segment type. In libc on Linux it happens to be PT_PHDR for some reason, and the PT_LOAD segments are the 3rd and the 4th. You should probably iterate.
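
To make the distinction concrete, here is a rough sketch that counts objects both ways: by looking only at header 0, and by scanning the whole program header table. It is an illustration only, not the code under review; struct load_tally, object_has_load_segment and tally_load_segments are hypothetical names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc declares dl_iterate_phdr only when this is set */
#endif
#include <link.h>
#include <stddef.h>

/* Tallies for one pass over the loaded objects (hypothetical struct). */
struct load_tally {
  int objects;       /* every object the callback saw */
  int first_is_load; /* objects whose header 0 is PT_LOAD */
  int any_is_load;   /* objects with PT_LOAD anywhere in the table */
};

/* Returns 1 if any program header of the object is PT_LOAD, else 0. */
int object_has_load_segment(const struct dl_phdr_info *info) {
  if (info->dlpi_phdr == NULL)
    return 0;
  for (int i = 0; i < (int)info->dlpi_phnum; i++)
    if (info->dlpi_phdr[i].p_type == PT_LOAD)
      return 1;
  return 0;
}

/* Callback: updates the tally passed in data, never stops the walk. */
static int tally_object(struct dl_phdr_info *info, size_t size, void *data) {
  struct load_tally *t = data;
  (void)size;
  t->objects++;
  if (info->dlpi_phdr != NULL && info->dlpi_phnum > 0 &&
      info->dlpi_phdr[0].p_type == PT_LOAD)
    t->first_is_load++;
  if (object_has_load_segment(info))
    t->any_is_load++;
  return 0;
}

/* Runs one pass over all loaded objects and returns the tallies. */
struct load_tally tally_load_segments(void) {
  struct load_tally t = {0, 0, 0};
  dl_iterate_phdr(tally_object, &t);
  return t;
}
```

Comparing first_is_load with any_is_load shows how many objects a header-0 check would miss. Note that a full scan will normally also match the main executable, since it has PT_LOAD segments of its own.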

eugenis requested changes to this revision. Dec 4 2015, 3:48 PM
eugenis edited edge metadata.
This revision now requires changes to proceed. Dec 4 2015, 3:48 PM
dim added inline comments. Dec 7 2015, 9:41 AM
lib/asan/asan_linux.cc
100

You mean iterate through the info->dlpi_phdr array? But as far as I understand, this shows the segments loaded by that particular library. So do you think that searching recursively will help here?

Also, on FreeBSD, which this change was meant for, this would actually not help, since the executable itself also has PT_LOAD segments. I.e., if I print the contents of the dlpi_phdr array, I see this:

name=/tmp/dlip (8 segments), type PT_PHDR
		 header  0: address= 0x8048034, type=PT_PHDR
		 header  1: address= 0x8048134, type=PT_INTERP
		 header  2: address= 0x8048000, type=PT_LOAD
		 header  3: address= 0x8049ab4, type=PT_LOAD
		 header  4: address= 0x8049ac8, type=PT_DYNAMIC
		 header  5: address= 0x804814c, type=PT_NOTE
		 header  6: address= 0x8048a68, type=PT_GNU_EH_FRAME
		 header  7: address=       0x0, type=PT_GNU_STACK
name=/lib/libc.so.7 (6 segments), type PT_LOAD
		 header  0: address=0x2806e000, type=PT_LOAD
		 header  1: address=0x281ca000, type=PT_LOAD
		 header  2: address=0x281ccad8, type=PT_DYNAMIC
		 header  3: address=0x281ca000, type=PT_TLS
		 header  4: address=0x281c98e8, type=PT_GNU_EH_FRAME
		 header  5: address=0x2806e000, type=PT_GNU_STACK
name=/libexec/ld-elf.so.1 (0 segments), type (null phdr)

In short, I'm not sure how to distinguish a shared library from the main executable this way. @dankm, any idea?

Of course, every DSO has LOAD segments.
Sorry, it looks like I don't understand the purpose of this change. What exactly is this not-shared-object that you are trying to avoid? The main executable? I think on linux it can be distinguished by empty dlpi_name.

One idea: it looks like we can rely on the fact that the main executable is the first one in dl_iterate_phdr (is it? what about LD_PRELOAD?). If so, you can walk its dynamic section and look for DT_NEEDED records.

One idea: it looks like we can rely on the fact that the main executable is the first one in dl_iterate_phdr (is it? what about LD_PRELOAD?). If so, you can walk its dynamic section and look for DT_NEEDED records.

On FreeBSD the main executable is the first one in dl_iterate_phdr's callback, but there is currently no guarantee that this won't change.
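
The two heuristics discussed here (empty dlpi_name, and "first entry is the main executable") can be checked on a given system with a small probe. This is a sketch under the assumption that the first callback invocation is the entry of interest; struct first_object and inspect_first_object are hypothetical names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc declares dl_iterate_phdr only when this is set */
#endif
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* What the first dl_iterate_phdr entry looked like (hypothetical struct). */
struct first_object {
  int found;         /* 1 once the callback has run */
  int name_is_empty; /* 1 if dlpi_name was NULL or "" */
  char name[256];    /* copy of dlpi_name, possibly truncated */
};

/* Callback: records the first entry, then stops the walk. */
static int record_first(struct dl_phdr_info *info, size_t size, void *data) {
  struct first_object *f = data;
  const char *name = info->dlpi_name ? info->dlpi_name : "";
  (void)size;
  f->found = 1;
  f->name_is_empty = (name[0] == '\0');
  snprintf(f->name, sizeof f->name, "%s", name);
  return 1; /* nonzero stops dl_iterate_phdr after this object */
}

/* Returns a description of the first object reported by dl_iterate_phdr. */
struct first_object inspect_first_object(void) {
  struct first_object f = {0, 0, {0}};
  dl_iterate_phdr(record_first, &f);
  return f;
}
```

Going by the listings in this thread, name_is_empty would be set on the Linux system shown, while on FreeBSD name would hold the executable's path; neither result is guaranteed by this sketch.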

dim added a comment. Dec 7 2015, 11:44 AM

Of course, every DSO has LOAD segments.
Sorry, it looks like I don't understand the purpose of this change. What exactly is this not-shared-object that you are trying to avoid? The main executable? I think on linux it can be distinguished by empty dlpi_name.

Yes, that is precisely the point of the change. On FreeBSD, the main executable does have its dlpi_name filled with the path to the executable.

One idea: it looks like we can rely on the fact that the main executable is the first one in dl_iterate_phdr (is it? what about LD_PRELOAD?).

It seems to be so, both on Linux and FreeBSD, but I'm unsure whether it is a hard rule.

If so, you can walk its dynamic section and look for DT_NEEDED records.

I'd think that was the job of the dynamic linker. :-)
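
For reference, walking the dynamic section as suggested could look roughly like the sketch below. It only counts the DT_NEEDED records of the first object reported and deliberately does not resolve library names, because that needs DT_STRTAB, whose address is handled differently across platforms. The ElfW fallback and the names count_needed and count_needed_in_first_object are assumptions made for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc declares dl_iterate_phdr only when this is set */
#endif
#include <link.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ElfW
#define ElfW(type) Elf_##type /* assumption: platform uses Elf_Dyn-style names */
#endif

/* Callback: counts DT_NEEDED records of the first object, then stops. */
static int count_needed(struct dl_phdr_info *info, size_t size, void *data) {
  int *needed = data;
  (void)size;
  *needed = -1; /* stays -1 if the object has no PT_DYNAMIC segment */
  for (int i = 0; info->dlpi_phdr != NULL && i < (int)info->dlpi_phnum; i++) {
    if (info->dlpi_phdr[i].p_type != PT_DYNAMIC)
      continue;
    /* The dynamic section lives at load base + p_vaddr in memory. */
    const ElfW(Dyn) *dyn = (const ElfW(Dyn) *)(uintptr_t)(
        info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
    *needed = 0;
    for (; dyn->d_tag != DT_NULL; dyn++)
      if (dyn->d_tag == DT_NEEDED)
        ++*needed;
    break;
  }
  return 1; /* nonzero stops the walk after the first object */
}

/* Returns the DT_NEEDED count of the first object, or -1 if it has none. */
int count_needed_in_first_object(void) {
  int needed = -1;
  dl_iterate_phdr(count_needed, &needed);
  return needed;
}
```

This relies on the first entry being the main executable, which, as noted in this thread, is observed behaviour rather than a documented guarantee.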

emaste added a comment. Dec 7 2015, 1:20 PM

It seems to be so, both on Linux and FreeBSD, but I'm unsure whether it is a hard rule.

Today it is not a hard rule on FreeBSD, but I would like to make it so -- see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199943