If available, use the clang that is built as part of the same project as the CUDA compiler, unless another executable is explicitly specified. This also ensures that the generated deviceRTL IR is consistent with the version of Clang.
The change in add_subdirectory order is required to ensure that, if clang is part of the project build, its target exists before openmp is included. Alternatively, LLVM_TOOL_CLANG_BUILD could be used to determine whether the clang build is enabled, though I am not sure how reliable that is.
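As a rough illustration of the selection order and of why the add_subdirectory order matters, a minimal CMake sketch might look like the following. The names EXAMPLE_CUDA_COMPILER and cuda_compiler are hypothetical placeholders, not necessarily the ones used in this patch, and the final fallback to CMAKE_C_COMPILER is an assumption made only for the sketch.

```cmake
# Hypothetical override variable; the actual option name may differ.
set(EXAMPLE_CUDA_COMPILER "" CACHE STRING
    "Explicit CUDA compiler executable (empty means auto-detect)")

if(EXAMPLE_CUDA_COMPILER)
  # An explicitly defined executable always takes precedence.
  set(cuda_compiler "${EXAMPLE_CUDA_COMPILER}")
elseif(TARGET clang)
  # Only true if the clang target has already been defined, i.e. clang's
  # add_subdirectory was processed before this file.
  set(cuda_compiler "$<TARGET_FILE:clang>")
else()
  # Assumed fallback for the sketch: the host C compiler.
  set(cuda_compiler "${CMAKE_C_COMPILER}")
endif()
```

Because `if(TARGET clang)` is evaluated at configure time, it only sees targets created earlier in the run, which is why clang has to be added before openmp. The generator expression `$<TARGET_FILE:clang>` is resolved at generate time, so it is only usable in contexts such as the COMMAND of add_custom_command.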
This patch is required to reliably test OpenMP offloading in a buildbot without either a two-stage build (e.g. with LLVM_ENABLE_RUNTIMES) or a separately installed clang on the worker that will eventually become outdated.