This is an archive of the discontinued LLVM Phabricator instance.

[TSan] Improve handling of stack pointer mangling in {set,long}jmp, pt.1
ClosedPublic

Authored by yln on Apr 22 2019, 2:46 PM.

Details

Summary

TSan needs to infer which calls to setjmp/longjmp are corresponding
pairs. My understanding is, that we can't simply use the jmp_buf
address, since this buffer is just a plain data structure storing the
environment (registers) with no additional semantics, i.e., it can be
copied around and is still expected to work. So we use the stack pointer
(SP) instead.

The setjmp interceptor stores some metadata, which is then consumed in
the corresponding call to longjmp. We use the SP as an "index" (stable
identifier) into the metadata table. So far so good.

However, when mangling is used, the setjmp interceptor observes the
UNmangled SP, but the longjmp interceptor only knows the mangled value
for SP. To still correlate corresponding pairs of calls, TSan currently
derives the mangled representation in setjmp and uses it as the stable
identifer, so that longjmp can do it's lookup.

Currently, this works since "mangling" simply means XOR with a secret
value. However, in the future we want to use operations that do not
allow us to easily go from unmangled -> mangled (pointer
authentication). Going from mangled -> unmangled should still be
possible (for pointer authentication it means zeroing a few bits).

This patch is part 1 of changing set/longjmp interceptors to use the
unmangled SP for metadata lookup. Instead of deriving the mangled SP in
setjmp, we will derive the unmangled SP in longjmp. Since this change
involves difficult-to-test code, it will be done in (at least) 2 parts:
This patch only replicates the existing behavior and checks that the
newly computed value for SP matches with what we have been doing so far.
This should help me to fix issues on architectures I cannot test
directly. I tested this patch on x86-64 (Linux/Darwin) and arm64
(Darwin).

This patch will also address an orthogonal issue: there is a lot of code
duplication in the assembly files, because the
void __tsan_setjmp(uptr sp, uptr mangled_sp) already demands the
mangled SP. This means that the code for computing the mangled SP is
duplicated at every call site (in assembly).

Diff Detail

Repository
rL LLVM

Event Timeline

yln created this revision.Apr 22 2019, 2:46 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 22 2019, 2:46 PM
Herald added subscribers: llvm-commits, Restricted Project, kristof.beyls and 3 others. · View Herald Transcript
dvyukov accepted this revision.Apr 23 2019, 5:11 AM
dvyukov added a subscriber: dvyukov.

JmpBufGarbageCollect uses <= on sp values to detect bufs that are no longer active. But as far as I understand we still will able to do this, right?

It can make sense to combine the first part of LongJmp that extracts sp from context with UnmangleLongJmpSp in future changes, because they have effectively the same set of checks on OS/arch. E.g. GetSPFromContext that will both extract and demangle it.

This revision is now accepted and ready to land.Apr 23 2019, 5:11 AM
yln added a comment.Jun 28 2019, 10:24 AM

@dvyukov: I finally have a chance to continue working on this. I hope you are still onboard?
I plan to land this and let the bots (all the different architectures) churn on it over the weekend.

My next patch will be switching over the lookup to use the SP instead of the mangled SP, followed by some cleanup that is enabled by this.
I will make sure to put you as a reviewer for all of those.

JmpBufGarbageCollect uses <= on sp values to detect bufs that are no longer active. But as far as I understand we still will able to do this, right?

Yes.

It can make sense to combine the first part of LongJmp that extracts sp from context with UnmangleLongJmpSp in future changes, because they have effectively the same set of checks on OS/arch. E.g. GetSPFromContext that will both extract and demangle it.

I will do a round of cleanups and that sounds like a good candidate!

This revision was automatically updated to reflect the committed changes.

@dvyukov: I finally have a chance to continue working on this. I hope you are still onboard?

To the degree I will find time to review changes...

It seems I already approved this, so as long as there are no significant changes I guess that approval still holds.