This patch introduces an alternative OpenMP GPU kernel offloading interface called target kernel region (or TRegion).
The commit includes:
- The Clang code generation for OpenMP target constructs based on the interface.
- The runtime library implementation for the NVPTX device plugin, built mostly on top of the existing functionality.
- An LLVM optimization for TRegions that tries to enable SPMD-mode or use custom state machines for the kernels.
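To illustrate the distinction the optimization cares about, here is a minimal sketch in plain OpenMP C. It does not show the TRegion interface itself; the function names scale_spmd and scale_generic are hypothetical and chosen only for this example. The first region consists solely of a combined parallel loop, which is the kind of kernel that is typically a candidate for SPMD-mode. The second region runs sequential code before a nested parallel construct, which is the kind of kernel that generally needs worker threads to wait in a state machine.

```c
/* Combined construct: the region contains nothing but the parallel loop,
 * so all threads can start executing the kernel together (SPMD-style). */
void scale_spmd(double *a, int n, double f) {
#pragma omp target teams distribute parallel for map(tofrom : a[0:n])
  for (int i = 0; i < n; ++i)
    a[i] *= f;
}

/* Sequential code precedes the nested parallel loop, so only the initial
 * thread runs the first statement while the workers wait for parallel work
 * (generic mode with a state machine). */
void scale_generic(double *a, int n, double f) {
#pragma omp target map(tofrom : a[0:n])
  {
    double g = f * 2.0; /* executed by the initial thread only */
#pragma omp parallel for
    for (int i = 0; i < n; ++i)
      a[i] *= g;
  }
}
```

Both functions compute the same kind of result on the host when compiled without OpenMP support, since the pragmas are then ignored; the difference is only in how the device kernel can be executed.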
The interface is deliberately simple so that it is easy to analyze in the middle end. Design decisions include:
- Hide all (complex) implementation choices in the runtime library but allow complete removal of the abstraction once the runtime is inlined.
- Provide all runtime calls with sufficient, easily encoded information.
- Make the LLVM optimization as general as possible.
Positive and negative examples for the LLVM optimization are provided in the