SYCL is single source offload programming model relying on compiler to
separate device code (i.e. offloaded to an accelerator) from the code
executed on the host.
Here is code example of the SYCL program to demonstrate compiler
outlining work:
int foo(int x) { return ++x; } int bar(int x) { throw std::exception("CPU code only!"); } ... using namespace cl::sycl; queue Q; buffer<int, 1> a(range<1>{1024}); Q.submit([&](handler& cgh) { auto A = a.get_access<access::mode::write>(cgh); cgh.parallel_for<init_a>(range<1>{1024}, [=](id<1> index) { A[index] = index[0] * 2 + index[1] + foo(42); }); } ...
SYCL device compiler must compile lambda expression passed to
cl::sycl::handler::parallel_for method and function foo called from this
lambda expression for an "accelerator". SYCL device compiler also must
ignore bar function as it's not required for offloaded code execution.
This patch adds the sycl_kernel attribute, which is used to mark code
passed to cl::sycl::handler::parallel_for as "accelerated code".
Attribute must be applied to function templates which parameters include
at least "kernel name" and "kernel function object". These parameters
will be used to establish an ABI between the host application and
offloaded part.
Is there a reason to not also introduce a C++11 and C2x style spelling in the clang namespace? e.g., [[clang::sycl_device]]