Comment "No lock needed" in CurrentThreadContext was wrong.
Concurent ThreadRegistry::CreateThread can resize and relocate
ThreadRegistry::threads_ the same time CurrentThreadContext reads it.
To mitigate lock cost we store ThreadContext* instead of tid in
THREADLOCAL cache, we can tid from the ThreadContext*.
Doesn't clang-format prefer ThreadContextLsanBase * to ThreadContextLsanBase*?