Following on from the RFC here: https://reviews.llvm.org/D69043
This patch extends the TimeProfiler to support multiple threads (for ThinLTO) by making TimeTraceProfilerInstance thread local. timeTraceProfilerFinishThread() moves the thread-local instance into a global vector of instances, and timeTraceProfilerWrite() writes the recorded trace data from all instances. This reduces locking compared with the initial implementation presented in the RFC and is much quicker, even on a relatively modest 6-core machine.
In the generated trace, threads are identified based on their thread ids. Totals are reported with artificial thread ids higher than the real ones.
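
For reviewers unfamiliar with the pattern, the thread-local design described above can be sketched roughly as follows. This is a simplified illustration, not the code in this patch: ProfilerInstance, profilerInitialize(), profilerRecord(), profilerFinishThread() and profilerCountEvents() are hypothetical stand-ins, and events are plain strings rather than timed entries.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the per-thread profiler state.
struct ProfilerInstance {
  std::thread::id Tid = std::this_thread::get_id(); // identifies the thread
  std::vector<std::string> Events;                  // recorded entries
};

// Each thread records into its own instance, so recording takes no lock.
thread_local std::unique_ptr<ProfilerInstance> TLInstance;

// Finished instances are collected here; the mutex is taken only when an
// instance is handed off or when the collected data is read.
std::mutex FinishedMutex;
std::vector<std::unique_ptr<ProfilerInstance>> FinishedInstances;

// Create the calling thread's instance.
void profilerInitialize() { TLInstance = std::make_unique<ProfilerInstance>(); }

// Record an event on the calling thread; a no-op if profiling is not enabled.
void profilerRecord(std::string Name) {
  if (TLInstance)
    TLInstance->Events.push_back(std::move(Name));
}

// Move the calling thread's instance into the global vector.
// After the move, TLInstance is null, so a second call does nothing.
void profilerFinishThread() {
  if (!TLInstance)
    return;
  std::lock_guard<std::mutex> Lock(FinishedMutex);
  FinishedInstances.push_back(std::move(TLInstance));
}

// Stand-in for the write step: visit every finished instance plus the
// calling thread's own instance, if it has not been handed off yet.
std::size_t profilerCountEvents() {
  std::lock_guard<std::mutex> Lock(FinishedMutex);
  std::size_t N = 0;
  for (const auto &I : FinishedInstances)
    N += I->Events.size();
  if (TLInstance)
    N += TLInstance->Events.size();
  return N;
}
```

In this sketch the only synchronization point is the hand-off in profilerFinishThread(), which is the property that keeps the recording path cheap when many ThinLTO backend threads are active.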