This is an archive of the discontinued LLVM Phabricator instance.

[SYCL][Doc] Add design document for SYCL mode
ClosedPublic

Authored by bader on Mar 29 2021, 12:39 AM.

Details

Summary

Initial version of the document covers address space handling

Diff Detail

Unit TestsFailed

Event Timeline

bader created this revision.Mar 29 2021, 12:39 AM
bader requested review of this revision.Mar 29 2021, 12:39 AM
Herald added a project: Restricted Project. · View Herald Transcript

Thanks. I am guessing you will address relevant review comments from D99190 in here?

bader added a comment.Mar 29 2021, 3:38 AM

Thanks. I am guessing you will address relevant review comments from D99190 in here?

Yes. I'm working on it. I wanted to make it in a separate update, so it should be easy to see what has changed.

Thanks. I am guessing you will address relevant review comments from D99190 in here?

Yes. I'm working on it. I wanted to make it in a separate update, so it should be easy to see what has changed.

Great! Makes sense.

OlegM added a subscriber: OlegM.Mar 29 2021, 5:43 AM
bader updated this revision to Diff 333853.Mar 29 2021, 6:50 AM

Applied code review comments.

Mostly grammar related review comments, nothing substantive.

clang/docs/SYCLSupport.md
816 ↗(On Diff #333853)
aaron.ballman added inline comments.Mar 29 2021, 9:02 AM
clang/docs/SYCLSupport.md
818 ↗(On Diff #333853)
819 ↗(On Diff #333853)
825 ↗(On Diff #333853)
826 ↗(On Diff #333853)
827 ↗(On Diff #333853)
831 ↗(On Diff #333853)

You should also re-wrap for 80 col (whenever we finish talking about what words to add).

839 ↗(On Diff #333853)
845 ↗(On Diff #333853)
851 ↗(On Diff #333853)
899–900 ↗(On Diff #333853)
902 ↗(On Diff #333853)
903 ↗(On Diff #333853)
914 ↗(On Diff #333853)
915 ↗(On Diff #333853)
916 ↗(On Diff #333853)
919 ↗(On Diff #333853)
Naghasan added inline comments.Mar 29 2021, 9:41 AM
clang/docs/SYCLSupport.md
914–919 ↗(On Diff #333853)

I think this section should be extended.

Pointers to Default address space should get lowered into a pointer to a generic address space (or flat to reuse more general terminology).
But depending on the allocation context, the default address space of a non-pointer type is assigned to a specific address space. This is described in https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace.

This is also in line with the behaviour of CUDA (small example https://godbolt.org/z/veqTfo9PK).

818–819 ↗(On Diff #333779)

Or more simply just SYCL specification uses C++ wrapper classes.

multi_ptr are not "pointers to" but a wrapper around the actual pointer.

Do you plan to implement any of the following restriction?

To allocate local memory within a kernel, the user can either pass a sycl::local_accessor object as a argument to an ND-range kernel (that has a user-defined work-group size), or can define a variable in work-group scope inside sycl::parallel_for_work_group.

Explicit pointer class values cannot be passed as arguments to kernels or stored in global memory.

Can I confirm that in your implementation any raw pointer obtained from multi_ptr will be in generic/default address space? This is something that might be worth adding in the documentation unless it is explained in the spec already?

clang/docs/SYCLSupport.md
821 ↗(On Diff #333853)

The memory model for SYCL devices is based on the OpenCL 1.2 memory model.

Is this possibly a spec bug? OpenCL didn't have generic address space in v1.2, it has only been added in v2.0.

https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

830 ↗(On Diff #333853)

Is this explained somewhere would you be able to add any reference?

851 ↗(On Diff #333853)

This example demonstrates the problem with compiling C++ code when address space type qualifiers are inferred.

The example compiles in accordance with OpenCL language semantic...

https://godbolt.org/z/9jzxK5xc4 - ToT clang doesn't compile this example.

I am still not clear what message you are trying to convey here? In OpenCL kernel languages any object is always in some address space so if you write the following decltype(p), it will always have address space attribute in a type. OpenCL spec is very explicit about this:

https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#addr-spaces-inference

So if you compare a type not attributed by an address space with an attributed one they will never compare as equal because according to C++ rules if the qualifiers differ the types will differ. You need to use a special type trait to remove an address space if you need to compare types not qualified by an address space. What is important to highlight however is that address space inference is where OpenCL differs to C or C++. But of course, neither C nor C++ have address spaces so it is hard to compare.

In relation to your documentation, it is not clear what you are trying to achieve with this paragraph?

909 ↗(On Diff #333853)

Since SYCL spec has constant AS you should explain whether it is going to be supported or not and if so then how.

914–919 ↗(On Diff #333853)

Ok, if the implementation plans to follow the spec precisely then just adding a reference should be sufficient.

Do I understand it correctly that your implementation will use the first approach from the two described in:

If the target of the SYCL backend can represent the generic address space, then the "common address space deduction rules" in Section 5.9.2 and the "generic as default address space rules" in Section 5.9.3 apply. If the target of the SYCL backend cannot represent the generic address space, then the "common address space deduction rules" in Section 5.9.2 and the "inferred address space rules" in Section 5.9.4 apply.

This should be added to the documentation btw.

Btw does this statement in any way relate to the following statement:

Within kernels, the underlying C++ pointer types can be obtained from an accessor. The pointer types will contain a compile-time deduced address space. So, for example, if a C++ pointer is obtained from an accessor to global memory, the C++ pointer type will have a global address space attribute attached to it. The address space attribute will be compile-time propagated to other pointer values when one pointer is initialized to another pointer value using a defined algorithm.

from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_access_to_memory

Or if not where can I find the algorithm it refers to?

915 ↗(On Diff #333853)

You should also explain what address spaces are super/sub-sets because this impacts implicit and explicit conversion behavior in an embedded C-like models. In relation to that, you should highlight that the private, local and global ASes are disjoint.

bader updated this revision to Diff 334420.Mar 31 2021, 5:56 AM
bader marked 24 inline comments as done.

Applied code review suggestions

Resolved merge conflicts with D99190.

clang/docs/SYCLSupport.md
821 ↗(On Diff #333853)
830 ↗(On Diff #333853)

I wasn't able to find documentation for this implementation detail, but we should be able to confirm that by printing AST for example.

Here is the documentation I found for CUDA in llvm project:

NVIDIA documentation - https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#variable-memory-space-specifiers.
It says that memory space is assigned using "variable specifiers" rather than type qualifiers.

851 ↗(On Diff #333853)

In relation to your documentation, it is not clear what you are trying to achieve with this paragraph?

This paragraph provides clarification to the question why we can't apply OpenCL address space inference rules for SYCL mode.
I think it might be unnecessary because the SYCL specification defines address space deduction rules now.
Do you suggest removing this paragraph?

909 ↗(On Diff #333853)

The first raw of this table covers mapping between SYCL constant_space and address space attribute.
Could you clarify what else do we need?

914–919 ↗(On Diff #333853)

I think this section should be extended.

Pointers to Default address space should get lowered into a pointer to a generic address space (or flat to reuse more general terminology).
But depending on the allocation context, the default address space of a non-pointer type is assigned to a specific address space. This is described in https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace.

This is also in line with the behaviour of CUDA (small example https://godbolt.org/z/veqTfo9PK).

I've added this text to the document.

Ok, if the implementation plans to follow the spec precisely then just adding a reference should be sufficient.

Do I understand it correctly that your implementation will use the first approach from the two described in:

If the target of the SYCL backend can represent the generic address space, then the "common address space deduction rules" in Section 5.9.2 and the "generic as default address space rules" in Section 5.9.3 apply. If the target of the SYCL backend cannot represent the generic address space, then the "common address space deduction rules" in Section 5.9.2 and the "inferred address space rules" in Section 5.9.4 apply.

This should be added to the documentation btw.

The implementation residing in https://github.com/intel/llvm targets devices supporting generic address space. If I understand it correctly, another approach is supported by ComputeCPP. @Naghasan, are you aware of any plans to upstream the second approach? If no, I can clarify that it's not supported.

Btw does this statement in any way relate to the following statement:

Within kernels, the underlying C++ pointer types can be obtained from an accessor. The pointer types will contain a compile-time deduced address space. So, for example, if a C++ pointer is obtained from an accessor to global memory, the C++ pointer type will have a global address space attribute attached to it. The address space attribute will be compile-time propagated to other pointer values when one pointer is initialized to another pointer value using a defined algorithm.

from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_access_to_memory

Or if not where can I find the algorithm it refers to?

AFAIK, the pointer types mentioned in this section is not implemented yet, but their implementation can be done in the library using the attributes described in this section and C++ class templates.

Just to demonstrate the idea, let's use the example implementation for multi_ptr template class provided above:

`multi_ptr` class implementation example:

``` C++
template <typename T, address_space AS> class multi_ptr {
  // DecoratedType applies corresponding address space attribute to the type T
  // DecoratedType<T, global_space>::type == "__attribute__((opencl_global)) T"
  // See sycl/include/CL/sycl/access/access.hpp for more details
  using pointer_t = typename DecoratedType<T, AS>::type *;

  pointer_t m_Pointer;
  public:
  pointer_t get() { return m_Pointer; }
  T& operator* () { return *reinterpret_cast<T*>(m_Pointer); }
}

"decorated" pointers will return pointer_t, where as "raw" pointers will return the type casted to "generic" address space.

914–919 ↗(On Diff #333853)

Ok, if the implementation plans to follow the spec precisely then just adding a reference should be sufficient.

Do I understand it correctly that your implementation will use the first approach from the two described in:

If the target of the SYCL backend can represent the generic address space, then the "common address space deduction rules" in Section 5.9.2 and the "generic as default address space rules" in Section 5.9.3 apply. If the target of the SYCL backend cannot represent the generic address space, then the "common address space deduction rules" in Section 5.9.2 and the "inferred address space rules" in Section 5.9.4 apply.

This should be added to the documentation btw.

Btw does this statement in any way relate to the following statement:

Within kernels, the underlying C++ pointer types can be obtained from an accessor. The pointer types will contain a compile-time deduced address space. So, for example, if a C++ pointer is obtained from an accessor to global memory, the C++ pointer type will have a global address space attribute attached to it. The address space attribute will be compile-time propagated to other pointer values when one pointer is initialized to another pointer value using a defined algorithm.

from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_access_to_memory

Or if not where can I find the algorithm it refers to?

914–919 ↗(On Diff #333853)

I think this section should be extended.

Pointers to Default address space should get lowered into a pointer to a generic address space (or flat to reuse more general terminology).
But depending on the allocation context, the default address space of a non-pointer type is assigned to a specific address space. This is described in https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace.

This is also in line with the behaviour of CUDA (small example https://godbolt.org/z/veqTfo9PK).

915 ↗(On Diff #333853)

It's already covered in the SYCL device memory model section of the specification: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_device_memory_model.
What additional clarification do we need in this document?

bader updated this revision to Diff 334433.Mar 31 2021, 7:10 AM

Convert document to ReST format.

Naghasan added inline comments.Apr 1 2021, 4:22 AM
clang/docs/SYCLSupport.md
909 ↗(On Diff #333853)

To be more specific here, the OpenCL constant address space no longer have an equivalent in the SYCL core spec memory model https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_device_memory_model.

The multi_ptr for that address space is now deprecated and implementors can map it the global address space (as described here).

914–919 ↗(On Diff #333853)

If I understand it correctly, another approach is supported by ComputeCPP. @Naghasan, are you aware of any plans to upstream the second approach?

We do support that other approach, but we do not have plan to upstream this. However an implementation doesn't have to support both, so you should just mention the implementation is solely based on the usage of generic.

from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_access_to_memory

There is a spec bug here, this only applies to the deduced address space mode.

Or if not where can I find the algorithm it refers to?

Section 5.9.4, however you will see a note stating rework is due. But that doesn't impact this design here.

clang/docs/SYCLSupport.rst
222–224

Nitpicking things here, feel free to discard.

bader updated this revision to Diff 334689.Apr 1 2021, 7:42 AM
bader marked 3 inline comments as done.

Applied code review suggestions from @Naghasan.

Anastasia added inline comments.Apr 1 2021, 11:15 AM
clang/docs/SYCLSupport.md
830 ↗(On Diff #333853)

Btw I don't know if it is outdated - do you plan to use any of the conversion intrinsics https://llvm.org/docs/NVPTXUsage.html#address-space-conversion or do you plan to use addrspacecast instruction like OpenCL and some other languages?

NVIDIA documentation - https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#variable-memory-space-specifiers.
It says that memory space is assigned using "variable specifiers" rather than type qualifiers.

Yeah it seems like it is some sort of embedded C flavor but not quite the same concept though.

An automatic variable declared in device code without any of the device, shared and constant memory space specifiers described in this section generally resides in a register. However in some cases the compiler might choose to place it in local memory, which can have adverse performance consequences as detailed in Device Memory Accesses.

It might have been better to introduce a separate attribute instead of using embedded C's address spaces for this or introduce a different address space entry. In OpenCL we do so because generic also has a different flavor from emb C. I guess the original design didn't assume wide usage so it has not flagged up. It was clearly something custom for a particular vendor with some separation of concerns i.e. you wouln't compile CUDA to an arbitrary CPU that also needs to compile C or C++. So towards popularizing this approach it is not unreasonable that its better understanding and some adjustments might be required.

If you plan to use the CUDA approach in a straight-forward way it is reasonable to leave this undocumented for now and see if we can provide some details about the exact semantics while refining the implementation.

Since in this model the automatic objects are not allocated in the "default"/"generic" address space in contrast to embedded C and C++ just like in OpenCL btw, we should at least highlight that fact for now. It is probably good to refer to SYCL spec s3.8.2.1 here where object allocation is explained.

851 ↗(On Diff #333853)
In relation to your documentation, it is not clear what you are trying to achieve with this paragraph?

This paragraph provides clarification to the question why we can't apply OpenCL address space inference rules for SYCL mode.

I can't quite get what exactly it is trying to say. Perhaps you need to provide more context or full example, nit just a small fragement.

I think it might be unnecessary because the SYCL specification defines address space deduction rules now.
Do you suggest removing this paragraph?

Yeah, it does not seem very relevant here since we are not comparing the semantics from different languages. We just want to describe the expected behavior.

909 ↗(On Diff #333853)

Ok, if that's permitted by the spec, the current table is good enough.

914–919 ↗(On Diff #333853)
 C++
template <typename T, address_space AS> class multi_ptr {
  // DecoratedType applies corresponding address space attribute to the type T
  // DecoratedType<T, global_space>::type == "__attribute__((opencl_global)) T"
  // See sycl/include/CL/sycl/access/access.hpp for more details
  using pointer_t = typename DecoratedType<T, AS>::type *;

  pointer_t m_Pointer;
  public:
  pointer_t get() { return m_Pointer; }
  T& operator* () { return *reinterpret_cast<T*>(m_Pointer); }
}

"decorated" pointers will return pointer_t, where as "raw" pointers will return the type casted to "generic" address space.

FYI depending on your conversion semantics you might need to use addrspace_cast instead of reinterpret_cast https://www.khronos.org/opencl/assets/CXX_for_OpenCL.html#_casts

from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_access_to_memory

There is a spec bug here, this only applies to the deduced address space mode.

Ok, this makes sense. So for the current implementation in clang would it be something like:

The pointer types will contain a compile-time deduced address space. -> The pointer types will point to a generic/default address space object.?

Or is it something else that will be different... If it can't be easily fixed in the spec, let's at least document it here.

Or if not where can I find the algorithm it refers to?

Section 5.9.4, however you will see a note stating rework is due. But that doesn't impact this design here.

Thanks, it seems it is not relevant for this implementation at the moment then.

915 ↗(On Diff #333853)

Sorry I can not see.

There are two approaches here:

  • Embedded C s5.1.3 "Address space nesting and rules for pointers" requires:

An implementation must define the relationship between all pairs of address spaces. (The complete set of address spaces includes the generic address space plus any address spaces that may be defined within a translation unit, if the implementation supports such definitions within a program.) There is no requirement that named address spaces (intrinsic or otherwise) be subsets of the generic address space

  • In OpenCL C we also use C terminology i.e. implicit/explicit conversions in addition to that:

https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#addr-spaces-conversions

Either way, it has to be clear what types of conversions are allowed or disallowed.

clang/docs/SYCLSupport.rst
243

This is ambiguous now because every language will use clang's "default" address space because at least one address space is always needed by every language but it has different semantics in languages. We should either attempt to describe it somehow or perhaps just point out that it is inherited from CUDA and currently undocumented.

341

Do you mean both implicit and explicit conversions? Does it mean that in your AS model named ASes are subset of generic AS and generic AS is a subset of named ASes so they are equivalent sets? It is probably good to mention here that all named address spaces are disjoint.

344

Interesting, will this deduction always be target specific or can it be generalized since it is governed by the language semantic already?

347

I think it is also relevant to highlight that you don't perform inference of the address space qualifiers and the memory segment binding is performed as a final phase of parsing. This is quite relevant since embedded C or C++ have no address space inference at all and OpenCL explicitly requires inference in the type qualifiers.

bader updated this revision to Diff 334927.Apr 2 2021, 2:22 AM
bader marked 4 inline comments as done.

Address comments from @Anastasia.

  • Removed controversial clarifications.
  • Reshuffled text to keep language semantics clarifications closer to each other.
bader marked 4 inline comments as done.Apr 2 2021, 2:28 AM
bader added inline comments.
clang/docs/SYCLSupport.rst
243

Removed this paragraph as it's already covered by SYCL specification.

341

Do you mean both implicit and explicit conversions? Does it mean that in your AS model named ASes are subset of generic AS and generic AS is a subset of named ASes so they are equivalent sets? It is probably good to mention here that all named address spaces are disjoint.

Updated paragraph:

The default address space is "generic-memory", which is a virtual address space
that overlaps the global, local, and private address spaces. SYCL mode enables
both explicit and implicit conversion to/from the default address space from/to
the address space-attributed type. All named address spaces are disjoint and
sub-sets of default address space.

344

Interesting, will this deduction always be target specific or can it be generalized since it is governed by the language semantic already?

It's target specific deduction. CPU targets doesn't require such deduction.

347

I think it is also relevant to highlight that you don't perform inference of the address space qualifiers and the memory segment binding is performed as a final phase of parsing. This is quite relevant since embedded C or C++ have no address space inference at all and OpenCL explicitly requires inference in the type qualifiers.

I move this paragraph before the code example right after this section:

SYCL borrows its memory model from OpenCL however SYCL doesn't perform
the address space qualifier inference as detailed in
OpenCL C v3.0 6.7.8 <https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#addr-spaces-inference>_.

Anastasia accepted this revision.Apr 6 2021, 10:45 AM

LGTM! Thanks for working on this. The expected sematic seems fairly clear now. We might add a few more details while refining the implementation but it should not block the development progress at this point.

clang/docs/SYCLSupport.rst
255

I think this should be converted into a hyper link?

258

same here, the list should be converted into a hyper link.

344

That's right we have a similar situation in OpenCL. If you expect certain logic to be common enough it might make sense to add abstractions at some point.

This revision is now accepted and ready to land.Apr 6 2021, 10:45 AM
bader updated this revision to Diff 335596.Apr 6 2021, 11:03 AM
bader marked 7 inline comments as done.

Add ReST marks to hyperlinks.

bader added a comment.Apr 7 2021, 6:31 AM

LGTM! Thanks for working on this. The expected sematic seems fairly clear now.

Thanks for review! I also fixed external hyperlinks formatting.

We might add a few more details while refining the implementation but it should not block the development progress at this point.

Great! Please, let me know if there any comment for the implementation - https://reviews.llvm.org/D89909.

bader updated this revision to Diff 340011.Apr 23 2021, 6:59 AM

Incorporate https://reviews.llvm.org/D89909 review feedback.

Allow one way implicit conversion only for now.
From named address space to default.

Anastasia accepted this revision.Apr 26 2021, 4:46 AM

LGTM! Thanks!

bader updated this revision to Diff 340526.Apr 26 2021, 7:45 AM

Rebased patch to unblock commit.

This patch had a dependency on D99190, which adds SYCLSupport.rst document.
To unblock commit of D99488, I switched the order of these two patches.
Now D99488 add SYCLSupport.rst document with just two sections: "Introduction" and "Address space handling".
D99190 will be rebased on top of D99488 to add more content to the document.

bader retitled this revision from [SYCL][Doc] Add address space handling section to SYCL documentation to [SYCL][Doc] Add design document for SYCL mode.Apr 26 2021, 7:49 AM
bader edited the summary of this revision. (Show Details)

Rebased patch to unblock commit.

This patch had a dependency on D99190, which adds SYCLSupport.rst document.
To unblock commit of D99488, I switched the order of these two patches.
Now D99488 add SYCLSupport.rst document with just two sections: "Introduction" and "Address space handling".
D99190 will be rebased on top of D99488 to add more content to the document.

Makes sense. LGTM!

This revision was landed with ongoing or failed builds.Apr 26 2021, 8:40 AM
This revision was automatically updated to reflect the committed changes.

If I build docs now I get the following output:

llvm-project/build-doc/tools/clang/docs/SYCLSupport.rst:102: WARNING: Error in "code-block" directive:
1 argument(s) required, 0 supplied.

.. code-block::

   TODO: add support for `__attribute__((opencl_global_host))` and
   `__attribute__((opencl_global_device))`.

Is this something already being looked at?

bader added a comment.Apr 29 2021, 6:25 AM

If I build docs now I get the following output:

llvm-project/build-doc/tools/clang/docs/SYCLSupport.rst:102: WARNING: Error in "code-block" directive:
1 argument(s) required, 0 supplied.

.. code-block::

   TODO: add support for `__attribute__((opencl_global_host))` and
   `__attribute__((opencl_global_device))`.

Is this something already being looked at?

It looks like it can be fixed by adding language parameter:

.. code-block:: c++

Unfortunately, I can't verify this fix locally. I see other types of warnings, which are treated as errors.

tools/clang/docs/ClangCommandLineReference.rst:22:Duplicate explicit target name: "cmdoption-clang--prefix".

Does anyone know how to avoid this issue?
If no, @Anastasia, could you confirm that adding c++ parameter fixes the warning, please? If it does, I can commit this fix.

If I build docs now I get the following output:

llvm-project/build-doc/tools/clang/docs/SYCLSupport.rst:102: WARNING: Error in "code-block" directive:
1 argument(s) required, 0 supplied.

.. code-block::

   TODO: add support for `__attribute__((opencl_global_host))` and
   `__attribute__((opencl_global_device))`.

Is this something already being looked at?

It looks like it can be fixed by adding language parameter:

.. code-block:: c++

Unfortunately, I can't verify this fix locally. I see other types of warnings, which are treated as errors.

tools/clang/docs/ClangCommandLineReference.rst:22:Duplicate explicit target name: "cmdoption-clang--prefix".

Does anyone know how to avoid this issue?

I think you expected to get some bots failing. If not you can ask on cfe-dev perhaps...

I normally verify locally by building the docs e.g. docs-clang-html build target. It should catch the issues.

If no, @Anastasia, could you confirm that adding c++ parameter fixes the warning, please? If it does, I can commit this fix.

With the following diff

diff --git a/clang/docs/SYCLSupport.rst b/clang/docs/SYCLSupport.rst
index 8c1ed19dff4e..b03dcfadafa5 100644
--- a/clang/docs/SYCLSupport.rst
+++ b/clang/docs/SYCLSupport.rst
@@ -98,8 +98,7 @@ space attributes for pointers:
    * - ``__attribute__((opencl_private))``
      - private_space
 
-
-.. code-block::
+.. code-block:: C++
 
    TODO: add support for `__attribute__((opencl_global_host))` and
    `__attribute__((opencl_global_device))`.

I get

llvm-project/build-doc/tools/clang/docs/SYCLSupport.rst:101: WARNING: Could not lex literal_block as "C++". Highlighting skipped.

But this seems to work

diff --git a/clang/docs/SYCLSupport.rst b/clang/docs/SYCLSupport.rst
index 8c1ed19dff4e..f714500300e1 100644
--- a/clang/docs/SYCLSupport.rst
+++ b/clang/docs/SYCLSupport.rst
@@ -98,8 +98,6 @@ space attributes for pointers:
    * - ``__attribute__((opencl_private))``
      - private_space
 
+.. code-block:: C++
 
-.. code-block::
-
-   TODO: add support for `__attribute__((opencl_global_host))` and
-   `__attribute__((opencl_global_device))`.
+    //TODO: add support for __attribute__((opencl_global_host)) and __attribute__((opencl_global_device)).

Thanks! I've uploaded this version to https://reviews.llvm.org/D101549.