This patch is one part of the original patch posted in http://reviews.llvm.org/D14031.
This patch implements the device runtime library (RTL) whose interface is used by code generation for OpenMP offloading devices. Currently there is a single device RTL, written in CUDA and meant for CUDA-enabled GPUs. The interface is a variation of the kmpc interface that includes some extra calls for thread and storage management that only make sense for a GPU target.
Depends on http://reviews.llvm.org/D14031.
MATCHES evaluates a regex, so "clang" should be enough.
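To illustrate the point, here is a minimal sketch that can be run with `cmake -P`. It assumes the check under review compared a compiler path against a pattern padded with wildcards; the variable name EXAMPLE_COMPILER and the path are hypothetical and are not taken from the patch.

```cmake
# Hypothetical compiler path, used only for illustration.
set(EXAMPLE_COMPILER "/usr/bin/clang++")

# if(... MATCHES ...) succeeds when the regex matches any substring,
# so no leading or trailing wildcards are needed.
if(EXAMPLE_COMPILER MATCHES "clang")
  message(STATUS "plain pattern matched")
endif()

# The padded form accepts the same strings; the wildcards add nothing.
if(EXAMPLE_COMPILER MATCHES ".*clang.*")
  message(STATUS "padded pattern matched")
endif()
```

Both conditions are true for this input, which is why the shorter pattern is sufficient.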