pthread_join needs to map pthread_t of the joined thread to our Tid.
Currently we do this with linear search over all threads.
This has quadratic complexity and becomes much worse with the new
tsan runtime, which memorizes all threads that ever existed.
To resolve this add a hash map of live threads only (that are still
associated with pthread_t) and use it for the mapping.
With the new tsan runtime some programs spent 1/3 of time in this mapping.
After this change the mapping disappears from profiles.
Depends on D113996.
Insert seems to have its own CHECK.
Although I find the API could be improved.. I'll comment on the other patch.