Using TLS to implement the threadprivate directive has shown a 10x performance improvement over the current cache-based implementation on PPC machines.
This patch introduces a TLS-based implementation that is currently activated only for PPC machines. It also creates CGOpenMPRuntimes.cpp, which is meant to extend the OpenMP code generation class in order to drive optimized implementations for different targets.
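As a rough illustration of the semantics the TLS-based lowering relies on, the sketch below uses plain C++ thread_local with std::thread. It is not the code this patch generates and it does not use the OpenMP pragma. The names counter and run_workers are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a variable that would be marked threadprivate.
// With TLS, every thread owns a separate copy of this variable, so no
// per-access lookup in a runtime-managed cache is needed to find it.
thread_local int counter = 0;

// Spawn `num_threads` workers; each one bumps its own copy of `counter`
// `increments` times and records the final value it observed.
std::vector<int> run_workers(int num_threads, int increments) {
    assert(num_threads >= 0 && increments >= 0);
    std::vector<int> observed(num_threads, -1);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&observed, t, increments] {
            for (int i = 0; i < increments; ++i) {
                ++counter;  // touches only this thread's copy
            }
            observed[t] = counter;  // each thread writes only its own slot
        });
    }
    for (auto &w : workers) {
        w.join();
    }
    return observed;
}
```

Calling run_workers(4, 1000) from a thread that has not modified counter should yield 1000 in every slot, while the caller's own copy of counter stays at 0, since no thread ever sees another thread's increments.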
This patch complements the OpenMP runtime patch under review in http://lists.cs.uiuc.edu/pipermail/openmp-commits/2015-June/000347.html
A thread_local variable cannot be threadprivate; restore the original code.