Index: docs/LanguageExtensions.rst
===================================================================
--- docs/LanguageExtensions.rst
+++ docs/LanguageExtensions.rst
@@ -1516,6 +1633,285 @@
 Query the presence of this new mangling with
 ``__has_feature(objc_protocol_qualifier_mangling)``.
 
+
+OpenCL Features
+===============
+
+C++ for OpenCL
+--------------
+
+This functionality is built on top of OpenCL C v2.0 and C++17 enabling most of
+regular C++ features in OpenCL kernel code. Most functionality from OpenCL C
+is inherited. This section describes minor differences to OpenCL C and any
+limitations related to C++ support as well as interactions between OpenCL and
+C++ features that are not documented elsewhere.
+
+Restrictions to C++17
+^^^^^^^^^^^^^^^^^^^^^
+
+The following features are not supported:
+
+- Virtual functions
+- Exceptions
+- ``dynamic_cast`` operator
+- Non-placement ``new``/``delete`` operators
+- Standard C++ libraries. Currently there is no solution for alternative C++
+  libraries provided. A future release will feature library support.
+
+
+Interplay of OpenCL and C++ features
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Address space behavior
+""""""""""""""""""""""
+
+Address spaces are part of the type qualifiers; many rules are just inherited
+from the qualifier behavior documented in OpenCL C v2.0 s6.5 and Embedded C
+extension ISO/IEC JTC1 SC22 WG14 N1021 s3.1. Note that since the address space
+behavior in C++ is not documented formally, Clang extends the existing concept
+from C and OpenCL. For example, conversion rules are extended from qualification
+conversion but the compatibility is determined using notation of sets and
+overlapping of address spaces from Embedded C (ISO/IEC JTC1 SC22 WG14 N1021
+s3.1.3). For OpenCL it means that implicit conversions are allowed from
+a named address space (except for ``__constant``) to ``__generic`` (OpenCL C
+v2.0 s6.5.5). Reverse conversion is only allowed explicitly. The ``__constant``
+address space does not overlap with any other and therefore no valid conversion
+between ``__constant`` and other address spaces exists. Most of the rules
+follow this logic.
+
+**Casts**
+
+C-style casts follow OpenCL C v2.0 rules (s6.5.5). All cast operators
+permit conversion to ``__generic`` implicitly. However, converting from
+``__generic`` to named address spaces can only be done using ``addrspace_cast``.
+Note that conversions between ``__constant`` and any other address space
+are disallowed.
+
+.. _opencl_cpp_addrsp_deduction:
+
+**Deduction**
+
+Address spaces are not deduced for:
+
+- non-pointer/non-reference template parameters or any dependent types except
+  for template specializations.
+- non-pointer/non-reference class members except for static data members that are
+  deduced to ``__global`` address space.
+- non-pointer/non-reference alias declarations.
+- ``decltype`` expressions.
+
+.. code-block:: c++
+
+  template <typename T>
+  void foo() {
+    T m; // address space of m will be known at template instantiation time.
+    T * ptr; // ptr points to __generic address space object.
+    T & ref = ...; // ref references an object in __generic address space.
+  };
+
+  template <int N>
+  struct S {
+    int i; // i has no address space
+    static int ii; // ii is in global address space
+    int * ptr; // ptr points to __generic address space int.
+    int & ref = ...; // ref references int in __generic address space.
+  };
+
+  template <int N>
+  void bar()
+  {
+    S<N> s; // s is in __private address space
+  }
+
+TODO: Add example for type alias and decltype!
+
+**References**
+
+Reference types can be qualified with an address space.
+
+.. code-block:: c++
+
+  __private int & ref = ...; // references int in __private address space
+
+By default references will refer to ``__generic`` address space objects, except
+for dependent types that are not template specializations
+(see :ref:`Deduction <opencl_cpp_addrsp_deduction>`). Address space compatibility
+checks are performed when references are bound to values. The logic follows the
+rules from address space pointer conversion (OpenCL v2.0 s6.5.5).
+
+**Default address space**
+
+All non-static member functions take an implicit object parameter ``this`` that
+is a pointer type. By default this pointer parameter is in the ``__generic``
+address space. All concrete objects passed as an argument to ``this`` parameter
+will be converted to the ``__generic`` address space first if such conversion is
+valid. Therefore programs using objects in the ``__constant`` address space will
+not be compiled unless the address space is explicitly specified using address
+space qualifiers on member functions
+(see :ref:`Member function qualifier <opencl_cpp_addrspace_method_qual>`) as the
+conversion between ``__constant`` and ``__generic`` is disallowed. Member function
+qualifiers can also be used in case conversion to the ``__generic`` address space
+is undesirable (even if it is legal). For example, a method can be implemented to
+exploit memory access coalescing for segments with memory bank. This not only
+applies to regular member functions but to constructors and destructors too.
+
+.. _opencl_cpp_addrspace_method_qual:
+
+**Member function qualifier**
+
+Clang allows specifying an address space qualifier on member functions to signal
+that they are to be used with objects constructed in some specific address space.
+This works just the same as qualifying member functions with ``const`` or any
+other qualifiers. Overload resolution will select the candidate with the
+most specific address space if multiple candidates are provided. If there is no
+conversion to an address space among candidates, compilation will fail with a
+diagnostic.
+
+.. code-block:: c++
+
+  struct C {
+    void foo() __local;
+    void foo();
+  };
+
+  __kernel void bar() {
+    __local C c1;
+    C c2;
+    __constant C c3;
+    c1.foo(); // will resolve to the first foo
+    c2.foo(); // will resolve to the second foo
+    c3.foo(); // error due to mismatching address spaces - can't convert to
+              // __local or __generic
+  }
+
+**Implicit special members**
+
+All implicit special members (default, copy, or move constructor, copy or move
+assignment, destructor) will be generated with the ``__generic`` address space.
+
+.. code-block:: c++
+
+  class C {
+    // Has the following implicit definition
+    //   void C() __generic;
+    //   void C(const __generic C &) __generic;
+    //   void C(__generic C &&) __generic;
+    //   operator= '__generic C &(__generic C &&)'
+    //   operator= '__generic C &(const __generic C &) __generic'
+  };
+
+**Builtin operators**
+
+All builtin operators are available in the specific address spaces, thus no
+conversion to ``__generic`` is performed.
+
+**Templates**
+
+There is no deduction of address spaces in non-pointer/non-reference template
+parameters and dependent types (see :ref:`Deduction <opencl_cpp_addrsp_deduction>`).
+The address space of a template parameter is deduced during type deduction if
+it is not explicitly provided in the instantiation.
+
+.. code-block:: c++
+
+   1 template <typename T>
+   2 void foo(T* i){
+   3   T var;
+   4 }
+   5
+   6 __global int g;
+   7 void bar(){
+   8   foo(&g); // error: template instantiation failed as function scope variable
+   9            // appears to be declared in __global address space (see line 3)
+  10 }
+
+It is not legal to specify multiple different address spaces between template
+definition and instantiation. If multiple different address spaces are specified in
+template definition and instantiation, compilation of such a program will fail with
+a diagnostic.
+
+.. code-block:: c++
+
+  template <typename T>
+  void foo() {
+    __private T var;
+  }
+
+  void bar() {
+    foo<__global int>(); // error: conflicting address space qualifiers are provided
+                         // __global and __private
+  }
+
+Once a template has been instantiated, regular restrictions for address spaces will
+apply.
+
+.. code-block:: c++
+
+  template <typename T>
+  void foo(){
+    T var;
+  }
+
+  void bar(){
+    foo<__global int>(); // error: function scope variable cannot be declared in
+                         // __global address space
+  }
+
+**Temporary materialization**
+
+All temporaries are materialized in the ``__private`` address space. If a
+reference with another address space is bound to them, the conversion will be
+generated in case it is valid, otherwise compilation will fail with a diagnostic.
+
+.. code-block:: c++
+
+  int bar(const unsigned int &i);
+
+  void foo() {
+    bar(1); // temporary is created in __private address space but converted
+            // to __generic address space of parameter reference
+  }
+
+  __global const int& f(__global float &ref) {
+    return ref; // error: address space mismatch between temporary object
+                // created to hold value converted float->int and return
+                // value type (can't convert from __private to __global)
+  }
+
+**Initialization of local and constant address space objects**
+
+TODO
+
+Constructing and destroying global objects
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Global objects must be constructed before the first kernel using the global
+objects is executed and destroyed just after the last kernel using the
+program objects is executed. In OpenCL v2.0 drivers there is no specific
+API for invoking global constructors. However, an easy workaround would be
+to enqueue a constructor initialization kernel that has a name
+``@_GLOBAL__sub_I_``. This kernel is only present if there
+are any global objects to be initialized in the compiled binary. One way to
+check this is by passing ``CL_PROGRAM_KERNEL_NAMES`` to ``clGetProgramInfo``
+(OpenCL v2.0 s5.8.7).
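Review note, not part of the patch: the kernel-name check described above could be sketched on the host side roughly as follows. This is only an illustration under stated assumptions. It assumes the semicolon-separated list that ``clGetProgramInfo`` returns for ``CL_PROGRAM_KERNEL_NAMES`` has already been copied into a ``std::string``. The helper name ``findGlobalCtorKernels`` is hypothetical, and matching on the substring ``_GLOBAL__sub_I_`` (so that a leading ``@`` is tolerated) is an assumption about how a driver reports the name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not an OpenCL or Clang API.
// Input: the semicolon-separated kernel name list obtained from
// clGetProgramInfo(..., CL_PROGRAM_KERNEL_NAMES, ...).
// Output: every name that contains the global-constructor marker.
std::vector<std::string> findGlobalCtorKernels(const std::string &kernelNames) {
  // Assumption: initialization kernels carry this marker in their name.
  const std::string marker = "_GLOBAL__sub_I_";
  std::vector<std::string> result;
  std::size_t begin = 0;
  while (begin <= kernelNames.size()) {
    // Locate the end of the current name (next ';' or end of string).
    std::size_t end = kernelNames.find(';', begin);
    if (end == std::string::npos)
      end = kernelNames.size();
    const std::string name = kernelNames.substr(begin, end - begin);
    if (name.find(marker) != std::string::npos)
      result.push_back(name);
    begin = end + 1;
  }
  return result;
}
```

A host application would then enqueue each returned kernel once before any kernel that uses the global objects. When several modules are linked together, the list may contain more than one entry.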
+
+Note that if multiple files are compiled and linked into libraries, multiple
+kernels that initialize global objects for multiple modules would have to be
+invoked.
+
+Applications are currently required to run initialization of global objects
+manually before running any kernels in which the objects are used.
+
+.. code-block:: console
+
+  clang -cl-std=clc++ test.cl
+
+If there are any global objects to be initialized, the final binary will
+contain the ``@_GLOBAL__sub_I_test.cl`` kernel to be enqueued.
+
+Global destructors cannot be invoked in OpenCL v2.0 drivers. However, all
+memory used for program scope objects is released on ``clReleaseProgram``.
+
 Initializer lists for complex numbers in C
 ==========================================
 
Index: docs/UsersManual.rst
===================================================================
--- docs/UsersManual.rst
+++ docs/UsersManual.rst
@@ -2397,7 +2408,8 @@
 This will produce a generic test.bc file that can be used in vendor toolchains
 to perform machine code generation.
 
-Clang currently supports OpenCL C language standards up to v2.0.
+Clang currently supports OpenCL C language standards up to v2.0. Starting from
+clang 9, a C++ mode is available for OpenCL (see :ref:`C++ for OpenCL <opencl_cpp>`).
 
 OpenCL Specific Options
 -----------------------
@@ -2756,6 +2768,46 @@
 enqueue query functions from `section 6.13.17.5 `_.
 
+.. _opencl_cpp:
+
+C++ for OpenCL
+--------------
+
+Starting from clang 9, kernel code can contain C++17 features: classes, templates,
+function overloading, type deduction, etc. Please note that this is not an
+implementation of `OpenCL C++
+`_ and
+there is no plan to support it in clang in any new releases in the near future.
+
+For detailed information about restrictions to allowed C++ features please
+refer to :doc:`LanguageExtensions`.
+
+Since C++ features are to be used on top of OpenCL C functionality, all existing
+restrictions from OpenCL C v2.0 will inherently apply. All OpenCL C builtin types
+and function libraries are supported and can be used in this mode.
+
+To enable the C++ for OpenCL mode, pass one of the following command line options
+when compiling a ``.cl`` file: ``-cl-std=clc++``, ``-cl-std=CLC++``,
+``-std=clc++`` or ``-std=CLC++``.
+
+   .. code-block:: c++
+
+     template<class T> T add( T x, T y )
+     {
+       return x + y;
+     }
+
+     __kernel void test( __global float* a, __global float* b)
+     {
+       auto index = get_global_id(0);
+       a[index] = add(b[index], b[index+1]);
+     }
+
+
+   .. code-block:: console
+
+     clang -cl-std=clc++ test.cl
+
 .. _target_features:
 
 Target-Specific Features and Limitations