[libunwind] Use _dl_find_object if available
As shown in P2544R0 [1] and the accompanying benchmark [2], the
current unwinding logic does not scale for multi-threaded programs.
This is because dl_iterate_phdr takes a global lock.
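
For context, the kind of lookup that this lock serializes can be sketched as follows. This is a minimal illustration of finding the loaded object that contains an address via dl_iterate_phdr, not libunwind's actual code. The names LookupState, find_object_callback, and find_eh_frame_hdr_slow are hypothetical, and error handling is reduced to a boolean result.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info, ElfW, PT_* constants
#include <cstddef>
#include <cstdint>

// Hypothetical lookup state for this sketch (not a libunwind type).
struct LookupState {
  uintptr_t target;        // address whose containing object we want
  uintptr_t eh_frame_hdr;  // start of the PT_GNU_EH_FRAME segment, 0 if absent
  bool found;              // set once an object containing target is seen
};

// Called once per loaded object; a non-zero return value stops the iteration.
static int find_object_callback(struct dl_phdr_info *info, size_t, void *data) {
  auto *state = static_cast<LookupState *>(data);
  bool contains_target = false;
  uintptr_t eh_frame_hdr = 0;
  for (size_t i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr) &ph = info->dlpi_phdr[i];
    const uintptr_t begin = info->dlpi_addr + ph.p_vaddr;
    if (ph.p_type == PT_LOAD) {
      if (state->target >= begin && state->target < begin + ph.p_memsz)
        contains_target = true;
    } else if (ph.p_type == PT_GNU_EH_FRAME) {
      eh_frame_hdr = begin;
    }
  }
  if (!contains_target)
    return 0;  // keep scanning the remaining objects
  state->eh_frame_hdr = eh_frame_hdr;
  state->found = true;
  return 1;
}

// Linear scan over all loaded objects for every single lookup.
bool find_eh_frame_hdr_slow(const void *pc, uintptr_t *eh_frame_hdr) {
  LookupState state{reinterpret_cast<uintptr_t>(pc), 0, false};
  dl_iterate_phdr(find_object_callback, &state);
  if (state.found)
    *eh_frame_hdr = state.eh_frame_hdr;
  return state.found;
}
```

Every call walks the full list of loaded objects, so threads that throw concurrently all contend on the same iteration.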
glibc 2.35 added _dl_find_object, which directly returns the unwind
info for a given target address. _dl_find_object is fully lock-free
and hence allows parallel exception unwinding on multiple threads.
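
A minimal sketch of such a lookup is shown below. It is an illustration under stated assumptions, not the code added by this commit. UnwindSections, kHaveDlFindObject, and find_unwind_sections are hypothetical names. The sketch assumes that the glibc feature macro DLFO_STRUCT_HAS_EH_DBASE is an adequate compile-time check for the presence of _dl_find_object, and it reports failure on systems where the macro is missing.

```cpp
#include <dlfcn.h>  // _dl_find_object, struct dl_find_object (glibc >= 2.35)
#include <cstdint>

// Hypothetical result type for this sketch (not a libunwind type).
struct UnwindSections {
  uintptr_t map_start;     // first byte of the object's mapping
  uintptr_t map_end;       // first byte past the object's mapping
  uintptr_t eh_frame_hdr;  // unwind info (PT_GNU_EH_FRAME on most targets)
};

// Assumption: glibc defines this macro whenever _dl_find_object is declared.
#if defined(DLFO_STRUCT_HAS_EH_DBASE)
constexpr bool kHaveDlFindObject = true;
#else
constexpr bool kHaveDlFindObject = false;
#endif

// Returns true and fills *out if pc lies inside a loaded object.
bool find_unwind_sections(const void *pc, UnwindSections *out) {
#if defined(DLFO_STRUCT_HAS_EH_DBASE)
  struct dl_find_object found;
  // A single call, with no callback and no iteration over all objects.
  if (_dl_find_object(const_cast<void *>(pc), &found) != 0)
    return false;
  out->map_start = reinterpret_cast<uintptr_t>(found.dlfo_map_start);
  out->map_end = reinterpret_cast<uintptr_t>(found.dlfo_map_end);
  out->eh_frame_hdr = reinterpret_cast<uintptr_t>(found.dlfo_eh_frame);
  return true;
#else
  // _dl_find_object is unavailable; a real caller would fall back to
  // dl_iterate_phdr here.
  (void)pc;
  (void)out;
  return false;
#endif
}
```

The compile-time guard keeps the code buildable against older glibc versions and other C libraries, which is why the feature is used only "if available".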
With this commit, libunwind takes advantage of _dl_find_object when it
is available. On benchmark [2], this reduces the time for unwinding
exceptions on 20 threads at a 10% failure rate from 1103ms to 78ms
(measured on an Intel Xeon Silver 4114 with 20 physical cores).
[1] https://isocpp.org/files/papers/P2544R0.html
[2] https://github.com/neumannt/exceptionperformance
Detailed performance numbers from the benchmark:
Before:
Testing unwinding performance: sqrt computation with occasional errors
testing baseline using 1 2 4 8 16 20 threads
failure rate 0%: 34 35 34 35 35 36
testing exceptions using 1 2 4 8 16 20 threads
failure rate 0%: 16 32 33 34 35 36
failure rate 0.1%: 16 32 34 36 35 36
failure rate 1%: 20 40 40 43 90 113
failure rate 10%: 59 92 140 304 880 1103
[...]
Testing invocation overhead: recursive fib with occasional errors
testing exceptions using 1 2 4 8 16 20 threads
failure rate 0%: 19 32 37 38 39 36
failure rate 0.1%: 22 32 40 40 39 34
failure rate 1%: 20 28 38 39 48 40
failure rate 10%: 25 39 44 50 92 113
After:
Testing unwinding performance: sqrt computation with occasional errors
testing baseline using 1 2 4 8 16 20 threads
failure rate 0%: 19 30 35 38 39 35
testing baseline using 1 2 4 8 16 20 threads
failure rate 0%: 32 35 33 34 34 36
testing exceptions using 1 2 4 8 16 20 threads
failure rate 0%: 16 35 33 37 35 35
failure rate 0.1%: 16 32 36 33 34 37
failure rate 1%: 21 37 39 40 40 41
failure rate 10%: 72 75 76 80 80 78
[...]
Testing invocation overhead: recursive fib with occasional errors
testing baseline using 1 2 4 8 16 20 threads
failure rate 0%: 18 35 37 34 38 37
testing exceptions using 1 2 4 8 16 20 threads
failure rate 0%: 19 33 40 40 41 39
failure rate 0.1%: 21 33 39 38 39 38
failure rate 1%: 20 36 39 40 41 40
failure rate 10%: 25 45 41 42 44 43
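
As a worked example of reading these numbers, the sketch below computes the per-thread-count speedup for the "failure rate 10%" row of the sqrt unwinding test. The values are copied from the Before and After output above and are assumed to be in milliseconds, like the 1103ms and 78ms figures. The names kThreads, kBeforeMs, kAfterMs, speedup, and print_speedups are illustrative only.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>

// "failure rate 10%" row of the sqrt unwinding test, copied from the output.
constexpr std::array<int, 6> kThreads = {1, 2, 4, 8, 16, 20};
constexpr std::array<int, 6> kBeforeMs = {59, 92, 140, 304, 880, 1103};
constexpr std::array<int, 6> kAfterMs = {72, 75, 76, 80, 80, 78};

// Ratio of the old time to the new time for column i (> 1 means faster).
constexpr double speedup(std::size_t i) {
  return static_cast<double>(kBeforeMs[i]) / kAfterMs[i];
}

// Prints one line per thread count with both timings and the ratio.
void print_speedups() {
  for (std::size_t i = 0; i < kThreads.size(); ++i)
    std::printf("%2d threads: %4d -> %3d (%.2fx)\n", kThreads[i],
                kBeforeMs[i], kAfterMs[i], speedup(i));
}
```

For this row, the ratio grows with the thread count: 880/80 is 11x on 16 threads and 1103/78 is roughly 14x on 20 threads, while the single-threaded time does not improve (59 before, 72 after).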