Page MenuHomePhabricator

[OpenCL] Add global_device and global_host address spaces
ClosedPublic

Authored by sidorovd on Jun 19 2020, 3:30 AM.

Details

Summary

This patch introduces 2 new address spaces in OpenCL: global_device and global_host
which are a subset of a global address space, so the address space scheme will be
looking like:

generic->global->host
                          ->device
             ->private
             ->local
constant

Justification: USM allocations may be associated with both host and device memory. We
want to give users a way to tell the compiler the allocation type of a USM pointer for
optimization purposes. (Link to the Unified Shared Memory extension:
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc)

Before this patch USM pointer could be only in opencl_global
address space, hence a device backend can't tell if a particular pointer
points to host or device memory. On FPGAs at least we can generate more
efficient hardware code if the user tells us where the pointer can point -
being able to distinguish between these types of pointers at compile time
allows us to instantiate simpler load-store units to perform memory
transactions.

Diff Detail

Event Timeline

sidorovd created this revision.Jun 19 2020, 3:30 AM
sidorovd edited the summary of this revision. (Show Details)Jun 19 2020, 3:31 AM
sidorovd edited the summary of this revision. (Show Details)

I think you should also test the conversions of address spaces and it would be good to add a cl file i.e. OpenCL C level test.

clang/include/clang/Basic/AttrDocs.td
3132

USM abbreviation is not clear here. I suggest to writenit in full.

Is there a language extension spec for this or any other form of documentation already?

If not I would suggest explaining the conversions rules/relation to other OpenCL address spaces here. Alternatively you can also consider describing this in Claqng extensions: https://clang.llvm.org/docs/LanguageExtensions.html

clang/lib/CodeGen/CodeGenModule.cpp
1340

This is numbers are governed by SPIR that doesn't have this address spaces. Any reasons not to use continuous numbering?

sidorovd added inline comments.Jun 19 2020, 7:45 AM
clang/include/clang/Basic/AttrDocs.td
3132

Is there a language extension spec for this or any other form of documentation already?

Only for SYCL at this point: https://github.com/intel/llvm/commit/66bdf4a6c753448376aec1d7411dc626c2c5cf1c

Okay, I'll expand attribute documentation here, thanks!

clang/lib/CodeGen/CodeGenModule.cpp
1340

I didn't use continuous numbering because rest number are occupied by other target in clang, for example numbers from 5 to 7 are assigned to cuda_device, cuda_constant and cuda_shared address spaces. Do you mean that within SPIR target we can overlap with address spaces from different targets?

Btw could you upload a full diff, please?
https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
It will simplify the review....

clang/lib/CodeGen/CodeGenModule.cpp
1340

I think here we are using SPIR address spaces - not the same as Clang internal address spaces?

Anastasia added inline comments.Jun 19 2020, 8:05 AM
clang/lib/CodeGen/CodeGenModule.cpp
1340

I.e. this is just used in the metadata so technically this should just be the same as corresponding address spaces in the address space map of SPIR target which is what you have. However, you could also just use continuous numbering I guess i.e. 5,6?

Anastasia added inline comments.Jun 19 2020, 8:08 AM
clang/include/clang/Basic/AttrDocs.td
3132

Btw do I understand it correctly that global_host means it is visible to the host and the device as opposed to global_device that is only visible on the device?

sidorovd updated this revision to Diff 272415.Jun 22 2020, 7:04 AM
sidorovd updated this revision to Diff 272418.Jun 22 2020, 7:13 AM

Uploaded the full diff. I have already forgot have to Phabricator properly :)

In the update I addressed:

  1. Updated info in AttrDocs;
  2. Added test in address space conversion.

TODO: update address space numerical values for SPIR target
Regarding the TODO. Still I like an idea of detached address spaces a little bit more, but I see your point. I would love to check how it works with SPIR-V address space mapping in SPIR-V to LLVM IR translator. I know, it shall have a little impact on a clang work, but still what to see the work I have to do there first (5 and 6 are already occupied there).

sidorovd marked 4 inline comments as done.Jun 22 2020, 7:15 AM
sidorovd added inline comments.
clang/include/clang/Basic/AttrDocs.td
3132

Yes!

aaron.ballman added inline comments.Jun 22 2020, 8:35 AM
clang/include/clang/Basic/Attr.td
1183

I'd appreciate a slightly different name here. OpenCLGlobalAddressSpacesDocs and OpenCLAddressSpaceGlobalDocs are pretty similar names. In fact, given how this relates to opencl_global, perhaps that documentation should be expanded instead of introducing a different doc category.

clang/include/clang/Basic/AttrDocs.td
3130

You should use backticks instead of parens for the attribute names (and fully spell them out), as in other parts of the docs.

3131

distinguishing -> distinguish

3132

"pointers that access device memory that access global memory" -- I'm not certain what you're trying to say with this, but I also don't know the domain very well. Do you mean "pointers that access global device memory from those that access global host memory"?

3134

of the opencl_global (add the the), also put backticks around opencl_global.

3135

is looking like -> looks like

3142–3143

Backticks around syntax identifiers.

TODO: update address space numerical values for SPIR target
Regarding the TODO. Still I like an idea of detached address spaces a little bit more, but I see your point. I would love to check how it works with SPIR-V address space mapping in SPIR-V to LLVM IR translator. I know, it shall have a little impact on a clang work, but still what to see the work I have to do there first (5 and 6 are already occupied there).

Makes sense. I think conversion to SPIR-V is one of the primary use cases. However, if checking doesn't need a lot of time it would be better to commit with the final numbering scheme. Otherwise things like this are pretty hard to change once they start to be used.

sidorovd updated this revision to Diff 272768.Jun 23 2020, 10:25 AM
sidorovd marked an inline comment as done.
sidorovd marked 6 inline comments as done.Jun 23 2020, 10:29 AM
sidorovd added inline comments.
clang/include/clang/Basic/Attr.td
1183

I would rather leave it in a separate doc as it's a vendor extension now. If/when we collect a positive feedback I don't mind to merge it with OpenCLAddressSpaceGlobalDocs. What about OpenCLGlobalDeviceHostAddressSpacesDocs?

TODO: update address space numerical values for SPIR target
Regarding the TODO. Still I like an idea of detached address spaces a little bit more, but I see your point. I would love to check how it works with SPIR-V address space mapping in SPIR-V to LLVM IR translator. I know, it shall have a little impact on a clang work, but still what to see the work I have to do there first (5 and 6 are already occupied there).

Makes sense. I think conversion to SPIR-V is one of the primary use cases. However, if checking doesn't need a lot of time it would be better to commit with the final numbering scheme. Otherwise things like this are pretty hard to change once they start to be used.

I have checked it. Values 5 and 6 are occupied for mapping Input and Output storage classes and I have a feeling that they are actually mapped on some real address spaces from some internal frontend. I asked the author of this change in the translator https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/579#discussion_r444377669 if I can move SPIRAS_Input/Output or not, waiting for a reply.

aaron.ballman added inline comments.Jun 23 2020, 10:51 AM
clang/include/clang/Basic/Attr.td
1183

I feel like the combined docs could be written to clarify it's a vendor extension, but I don't insist on consolidating the docs if that's your preference. How about OpenCLAddressSpaceGlobalExtDocs to make it clear it's an extension related to the other docs?

sidorovd marked an inline comment as done.Jun 23 2020, 11:39 AM
sidorovd added inline comments.
clang/include/clang/Basic/Attr.td
1183

I like it, thanks!

sidorovd updated this revision to Diff 272818.Jun 23 2020, 1:58 PM

The attribute bits LGTM, but I have no comment on the OpenCL bits.

sidorovd updated this revision to Diff 273020.Jun 24 2020, 7:31 AM

Made SPIR AS map continuous.

TODO: update address space numerical values for SPIR target
Regarding the TODO. Still I like an idea of detached address spaces a little bit more, but I see your point. I would love to check how it works with SPIR-V address space mapping in SPIR-V to LLVM IR translator. I know, it shall have a little impact on a clang work, but still what to see the work I have to do there first (5 and 6 are already occupied there).

Makes sense. I think conversion to SPIR-V is one of the primary use cases. However, if checking doesn't need a lot of time it would be better to commit with the final numbering scheme. Otherwise things like this are pretty hard to change once they start to be used.

It appeared to be, that SPIRVAS_Input/Output has no mapping on real LLVM address spaces, so I'll just move them in the translator.

I have checked it. Values 5 and 6 are occupied for mapping Input and Output storage classes and I have a feeling that they are actually mapped on some real address spaces from some internal frontend. I asked the author of this change in the translator https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/579#discussion_r444377669 if I can move SPIRAS_Input/Output or not, waiting for a reply.

sidorovd marked 7 inline comments as done.Jun 25 2020, 1:03 AM
Anastasia added inline comments.Jun 26 2020, 3:24 AM
clang/include/clang/Basic/AttrDocs.td
3134

Let's add the official spelling too:

opencl_global -> __global/opencl_global

The same applies below.

3134

so address space scheme for OpenCL 2.0 with the extension looks like -> the full address space set model for OpenCL 2.0 with the extension looks as follows:

3142

Do you refer to implicit conversion here? What about converting the other direction i.e. global_host to global?

I would also encourage you to add a reference to embedded C (ISO/IEC TR 18037) section 5 which OpenCL adopts too that explains subsets orverlapping and conversions in details.

clang/lib/AST/ASTContext.cpp
922–935

Perhaps not critical but I believe we could make numeration here continuous too as it is used only for testing purposes I believe. However it might require some tests to be changed.

clang/lib/AST/ItaniumMangle.cpp
2400

Why not to follow the same convention and use lower case CLhost. I would also suggest the full spelling i.e. CLglobalhost

clang/lib/AST/MicrosoftMangle.cpp
1827

The same here, why not _ASCLdevice or even _ASCLglobaldevice

clang/lib/AST/TypePrinter.cpp
1872

This should technically print as a source spelling. Have you considered adding this spelling to clang? Alternatively you could print attribute spelling.

Also you should test this either via diagnostic output or an AST dump.

sidorovd marked 2 inline comments as done.Jun 26 2020, 3:55 AM
sidorovd added inline comments.
clang/include/clang/Basic/AttrDocs.td
3142

What about converting the other direction i.e. global_host to global?

This possibility is yet being discussed. I'm not ready to disallow this type of conversion of explicitly specify that it's allowed in this patch (or at least right now).

clang/lib/AST/ItaniumMangle.cpp
2400

I think it's a bad idea, since demanglers may consider it to be opencl_global address space + unknown token.

Anastasia added inline comments.Jul 3 2020, 10:55 AM
clang/include/clang/Basic/AttrDocs.td
3142

Ok, I was asking to document the current behavior of your patch. You can continue the discussion and amend it later on if needed.

But actually is global_host/global_device a subset of global? Then you should not be able to convert the other direction implicitly but only explicitly. Perhaps your testing should be improved in this area.

clang/lib/AST/ItaniumMangle.cpp
2400

I would imagine demanglers would read until the next tag to complete the decoding of the current element.

Also your comment above is inconsistent because it explains the mangling scheme as follows:

"CL" [ "global" | "local" | "constant" |
      //                                "private"| "generic" | "global_device" |
      //                                "global_host" ]
sidorovd updated this revision to Diff 278165.Jul 15 2020, 6:26 AM
sidorovd marked 9 inline comments as done.
sidorovd marked an inline comment as done.

Sorry for a late response and thank you very much for the feedback!

clang/include/clang/Basic/AttrDocs.td
3142

Thanks for pointing it out, added the test.

clang/lib/AST/ItaniumMangle.cpp
2400

Made the mangling be CLdevice and CLhost.
I've made some experiments with an online demangler (not specialized on clang and OpenCL).
I took void foo(attribute((opencl_global)) int * Val); function, which as mangled as _Z3fooPU8CLglobali , and with changing it's mangling processed it through this mangler. Here is the result:
_Z3fooPU8CLglobali -> foo(int CLglobal*) wasn't parsed properly, but still OK
_Z3fooPU8CLdevicei -> foo(int CLdevice*)
same
_Z3fooPU8CLglobal_devicei -> _Z3fooPU8CLglobal_devicei yeap, it wasn't demangled
_Z3fooPU8CLglobaldevicei -> foo(double CLglobal*, long double, void, int, char, long double, int)
wow
_Z3fooPU8CLglobalDevicei ->foo(decimal128 CLglobal*, void, int, char, long double, int) // wow

So I think keeping mangling as easy as possible without overlapping with something known for demanglers is a safer approach.

clang/lib/AST/TypePrinter.cpp
1872

I would prefer to leave AST spelling here to be consistent with other address spaces. The appropriate test is in place.

Anastasia added inline comments.Jul 17 2020, 9:56 AM
clang/test/SemaOpenCL/usm-address-spaces-conversions.cl
23

Looking good! Can you extend for conversions with other address spaces please? It should result in an error apart from generic should work?

sidorovd updated this revision to Diff 280579.Jul 24 2020, 1:45 PM
sidorovd marked 2 inline comments as done.
sidorovd added inline comments.
clang/test/SemaOpenCL/usm-address-spaces-conversions.cl
23

Expanded the test. I left it in its own source file to distinguish extension testing from the core, but if you have objections - I can merge it with address-spaces-conversions-cl2.0.cl .

Anastasia accepted this revision.Jul 28 2020, 1:06 PM

LGTM, Thanks! Although I left a comment about possible test simplification that can be addressed before committing.

clang/test/SemaOpenCL/usm-address-spaces-conversions.cl
36

I think you can simplify this test by removing this macro. It doesn't seem to change the test coverage and you can minimize the number of run lines...

This revision is now accepted and ready to land.Jul 28 2020, 1:06 PM
sidorovd updated this revision to Diff 281517.Jul 29 2020, 4:33 AM

Simplified the test

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2020, 7:25 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript