I'm making a change in this area (https://reviews.llvm.org/D138461), so update the test:
- Add proper synchronization instead of a sleep.
- Avoid some unnecessary size_t casts.
- Spawn the number of hardware threads instead of 10.
- Check that __cxa_get_globals and __cxa_get_globals_fast return the same values.
- Split the test into with-threads and without-threads tests to simplify the code.
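
For reviewers, the shape of the with-threads check can be sketched roughly as below. This is not the test from the patch: `get_globals` and `get_globals_fast` are hypothetical stand-ins (backed by a `thread_local` object) for `__cxa_get_globals` and `__cxa_get_globals_fast`, and `one_shot_barrier` and `run_check` are illustrative names. The sketch only shows how a barrier can replace the sleep: every thread stays alive until all of them have recorded their pointer, so the per-thread addresses can be compared safely.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for __cxa_get_globals / __cxa_get_globals_fast:
// each thread gets its own object, and both functions return its address.
struct eh_globals { void* caught; unsigned uncaught; };
eh_globals* get_globals() { thread_local eh_globals g{}; return &g; }
eh_globals* get_globals_fast() { return get_globals(); }

// Minimal one-shot barrier (assumption: C++20 std::barrier is not used,
// so that the sketch also builds as C++11).
class one_shot_barrier {
  std::mutex m_;
  std::condition_variable cv_;
  std::size_t remaining_;
public:
  explicit one_shot_barrier(std::size_t n) : remaining_(n) {}
  void arrive_and_wait() {
    std::unique_lock<std::mutex> lock(m_);
    if (--remaining_ == 0)
      cv_.notify_all();  // last arrival releases everyone
    else
      cv_.wait(lock, [this] { return remaining_ == 0; });
  }
};

// Returns true if every thread saw the same pointer from both accessors
// and no two threads shared a pointer.
bool run_check() {
  std::size_t n = std::thread::hardware_concurrency();
  if (n == 0) n = 2;  // hardware_concurrency() may return 0 if unknown

  std::vector<eh_globals*> seen(n, nullptr);
  std::vector<char> same(n, 0);  // char, not bool: one distinct object per thread
  one_shot_barrier barrier(n);

  std::vector<std::thread> threads;
  for (std::size_t i = 0; i != n; ++i) {
    threads.emplace_back([&, i] {
      eh_globals* g = get_globals();
      same[i] = (g == get_globals_fast());
      seen[i] = g;
      // Keep this thread (and its thread-local storage) alive until all
      // threads have recorded their pointer; this replaces the sleep.
      barrier.arrive_and_wait();
    });
  }
  for (std::thread& t : threads) t.join();

  bool all_same = std::all_of(same.begin(), same.end(),
                              [](char c) { return c != 0; });
  std::sort(seen.begin(), seen.end());
  bool unique = std::adjacent_find(seen.begin(), seen.end()) == seen.end();
  return all_same && unique && seen.front() != nullptr;
}
```

A driver would simply assert that `run_check()` returns true. Note that the sketch assumes all threads can be created; if thread creation fails partway, the remaining threads would wait at the barrier indefinitely.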