Page MenuHomePhabricator

[SYCL] Implement SYCL device code outlining
Needs ReviewPublic

Authored by Fznamznon on Apr 9 2019, 5:36 AM.

Details

Summary

SYCL is single source offload programming model so compiler should be
able to separate device code from host code.

Here is code example of the SYCL program demonstrates complier outlining
work:

int foo(int x) { return ++x; }
int bar(int x) { throw std::exception("CPU code only!"); }
...
using namespace cl::sycl;
queue Q;
buffer<int, 1> a(range<1>{1024});
Q.submit([&](handler& cgh) {
      auto A = a.get_access<access::mode::write>(cgh);
      cgh.parallel_for<init_a>(range<1>{1024}, [=](id<1> index) {
        A[index] = index[0] * 2 + index[1] + foo(42);
      });
    }
...

SYCL compiler needs to compile lambda function passed to
cl::sycl::handler::parallel_for method and function foo called from this
lambda function. Compiler also must ignore bar function when we "device" part
of the single source code.

Current approach is to add an attribute SYCL kernel, which SYCL runtime will use
to mark code passed to cl::sycl::handler::parallel_for as "kernel functions".
Obviously runtime library can't mark foo as "device" code - this is a compiler
job: to traverse all symbols accessible from kernel functions and add them to
the "device part" of the code marking them with new attribute SYCL
device.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Anastasia added inline comments.May 30 2019, 10:53 AM
clang/lib/Parse/ParseAST.cpp
171

Do you also need to prevent generation of non-device functions somehow?

clang/lib/Sema/SemaSYCL.cpp
24

Ping!

I would suggest to document it a bit more including any current limitations/assumption that you can mark under FIXME i.e. does your code handle lambdas yet, what if lambdas are used in function parameters, etc...

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
5535–5542

May be this should go into a helper function as it seems to be now a bigger chunk of code that is repeated?

Although, I am not very familiar with this code. You can try to get someone to review who has contributed to this more recently.

clang/test/CodeGenSYCL/device-functions.cpp
25

I can't see where the SPIR calling convention is currently set for SYCL?

clang/test/SemaSYCL/device-attributes-on-non-sycl.cpp
4

I don't think this comment is necessary.

Fznamznon added inline comments.May 31 2019, 5:52 AM
clang/include/clang/Basic/Attr.td
1036

Oh, Thank you for that!

clang/lib/Parse/ParseAST.cpp
171

I think It's already prevented by change to CodeGenModule.cpp in this patch. CodeGen just ignores declarations without SYCL device attribute now.

clang/lib/Sema/SemaSYCL.cpp
24

Oh, sorry, I missed this comment when I updated patch last time.
Could you please advise in which form I can document it?

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
5535–5542

I think this chunk of code seems big because of big repeated comment.

clang/test/CodeGenSYCL/device-functions.cpp
25

If I understand correct it's set automatically on AST level because we use SPIR-based triple for device code. Only in case of C++ methods clang doesn't set SPIR calling convention. We did a modification in our codebase to get SPIR calling convention for C++ methods too (available here )

Anastasia added inline comments.Jun 3 2019, 3:00 AM
clang/test/CodeGenSYCL/device-functions.cpp
25

Ok and what happens if some other target is used - not SPIR?

Fznamznon added inline comments.Jun 3 2019, 3:28 AM
clang/test/CodeGenSYCL/device-functions.cpp
25

There will be no SPIR calling convention for device functions.

Anastasia added inline comments.Jun 3 2019, 7:54 AM
clang/test/CodeGenSYCL/device-functions.cpp
25

Just FYI at some point we generalized SPIR calling convention to be used for kernels irrespective from target by default (see TargetCodeGenInfo::getOpenCLKernelCallingConv). Not sure if it might make sense to do for SYCL device functions too. I am not saying it belongs to this patch though.

Fznamznon updated this revision to Diff 203785.Jun 10 2019, 2:24 AM

Applied comments from @Anastasia

  • Added link to documentation for sycl_device attribute
  • Removed redundant comment from test

@Anastasia, do you have additional comments?

@aaron.ballman , please let me know if you have additional comments/suggestions. If not, could you please accept this revision?

Most of the comments are about minor nits like grammar and coding conventions, but I did have some questions regarding what kinds of functions the sycl_kernel attribute gets applied to. Also, I'd like to see some additional tests that demonstrate the sycl device attribute is being implicitly created on the proper declarations as expected (can probably do that using -ast-dump and checking to see if the right functions have the attribute attached).

clang/include/clang/Basic/AttrDocs.td
259

is SYCL "kernel function" -> is a SYCL "kernel function"

260

SYCL -> A SYCL

261

Kernel is a -> A kernel is a

263–264

This doesn't really demonstrate the need for the attribute -- the attribute is never shown in the code example. I'd prefer an example that shows when and how a user would write this attribute.

278

called SYLC -> called a SYLC

280

use sycl_kernel -> use the sycl_kernel

281

as SYCL -> as a SYCL
Compiler is supposed to -> The compiler will

284

In this code example compiler is supposed to add "foo" function -> In this code example, the compiler will add the "foo" function

clang/include/clang/Sema/Sema.h
11197

Function -> function

11198

In SYCL, when we generate device code, we don't know

11199

we emit sycl kernels, so we add device

11204

adds the function declaration

11205

Should be named addSyclDeviceFunc() per coding standards. Similar for the other new functions.

11209

Don't repeat the function name in the comments, please. Also, rather than returning a concrete SmallVector<>, I think it would be more natural to return a SmallVectorImpl so that callers don't have to contend with the explicit size. There should also be a const overload for this function.

11212

Constructs a SYCL kernel that is compatible with OpenCL from the SYCL "kernel

11215–11216

Marks all functions accessible from SYCL kernels with the SYCL device attribute

clang/lib/CodeGen/CodeGenModule.cpp
2408

with the SYCL device attribute

2410–2413

These if statements can be combined.

2536

Missing a full stop at the end of the comment.

clang/lib/Sema/SemaSYCL.cpp
15

This include doesn't seem to be necessary?

24

e does not use our usual naming conventions.

42

Spurious whitespace can be removed.

45

Elide braces.

49

Don't use auto as the type is not spelled out in the initialization.

53

Elide braces.

69

with the SYCL device attribute

71

elt -> Elt per naming conventions

72

auto * since the type is spelled out in the initialization.

74

Elide braces

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
5538

for the SYCL kernel attribute

5540

Elide braces

5552

for the SYCL kernel attribute

5554

Elide braces

clang/test/SemaSYCL/device-attributes-on-non-sycl.cpp
5

#ifndef ?

12

I'd prefer to spell this with __attribute__, same in the other test

clang/test/SemaSYCL/device-attributes.cpp
4

I'd like to see some more tests covering less obvious scenarios. Can I add this attribute to a lambda? What about a member function? How does it work with virtual functions? That sort of thing.

Fznamznon added inline comments.Jun 18 2019, 8:01 AM
clang/test/SemaSYCL/device-attributes.cpp
4

Actually there is no restrictions for adding this attribute to any function to outline device code so I just checked the simplest variant.

But I'm working on new patch which will put some requirements on function which is marked with sycl_kernel attribute.
This new patch will add generation of OpenCL kernel from function marked with sycl_kernel attribute. The main idea of this approach is described in this document (in this document generated kernel is called "kernel wrapper").
And to be able to generate OpenCL kernel using function marked with sycl_kernel attribute we put some requirements on this function, for example it must be a template function. You can find these requirements and example of proper function which can be marked with sycl_kernel in this comment .

aaron.ballman added inline comments.Jun 18 2019, 3:15 PM
clang/test/SemaSYCL/device-attributes.cpp
4

Actually there is no restrictions for adding this attribute to any function to outline device code so I just checked the simplest variant.

So there are no concerns about code like:

struct Base {
  __attribute__((sycl_kernel)) virtual void foo();
  virtual void bar();
};

struct Derived : Base {
  void foo() override;
  __attribute__((sycl_kernel)) void bar() override;
};

void f(Base *B, Derived *D) {
  // Will all of these "do the right thing"?
  B->foo();
  B->bar();

  D->foo();
  D->bar();
}
Fznamznon updated this revision to Diff 205663.Jun 19 2019, 1:42 PM

Appled part of comments from @aaron.ballman:

  • Fixed grammar and code style in all places except sycl_kernel docs
  • Added a lit test which checks that sycl_device attribute implicitly added to proper declarations
Fznamznon added inline comments.Jun 19 2019, 1:45 PM
clang/include/clang/Basic/AttrDocs.td
263–264

I see. I will update documentation in the next version.

bader added inline comments.Jun 19 2019, 2:00 PM
clang/test/SemaSYCL/device-attributes.cpp
4

Actually there is no restrictions for adding this attribute to any function to outline device code so I just checked the simplest variant.
But I'm working on new patch which will put some requirements on function which is marked with sycl_kernel attribute.

@aaron.ballman, sorry for confusing. The usage scenarios should have been articulated more accurately.
We have only four uses of this attribute in our implementation:
https://github.com/intel/llvm/blob/sycl/sycl/include/CL/sycl/handler.hpp#L538 (lines 538-605).
All four uses are applied to member functions of cl::sycl::handler class and all of them have similar prototype (which is mentioned by Mariya in the previous comment:

namespace cl { namespace sycl {
class handler {
  template <typename KernelName, typename KernelType/*, ...*/>
  __attribute__((sycl_kernel)) void sycl_kernel_function(KernelType KernelFuncObj) {
    KernelFuncObj();
  }
};
}}

Here is the list of SYCL device compiler expectations with regard to the function marked with sycl_kernel attribute.

  • Function template with at least one parameter is expected. The compiler generates OpenCL kernel and uses first template parameter as unique name to the generated OpenCL kernel. Host application uses this unique name to invoke the OpenCL kernel generated for the sycl_kernel_function specialized by this name and KernelType (which might be a lambda type).
  • Function must have at least one parameter. First parameter expected to be a function object type (named or unnamed i.e. lambda). Compiler uses function object type field to generate OpenCL kernel parameters.

Aaron, I hope it makes more sense now.

We don't plan in any use cases other than in SYCL standard library implementation mentioned above.
If I understand you concerns correctly, you want to be sure that clang prohibits other uses of this attribute, which are not intended. Right?
What is the best way to do this? Add more negative tests cases and make sure that clang generate error diagnostic messages?

keryell added inline comments.Jun 19 2019, 6:55 PM
clang/test/SemaSYCL/device-attributes.cpp
4

If I understand you concerns correctly, you want to be sure that clang prohibits other uses of this attribute, which are not intended. Right?

But since it is an attribute to be used by SYCL run-time writers, I am not sure there is a lot of value in over-engineering the restrictions of its use. It diverts brain power from the real implementation & review and might even prevent innovation due to some creative use cases.

Fznamznon updated this revision to Diff 205813.Jun 20 2019, 7:54 AM

Updated sycl_kernel attribute documentation.

Fznamznon updated this revision to Diff 205831.Jun 20 2019, 8:58 AM

Fixed a couple coding style issues, renamed markDevice function with markSYCLDevice.

aaron.ballman added inline comments.Jun 24 2019, 1:28 PM
clang/test/SemaSYCL/device-attributes.cpp
4

If I understand you concerns correctly, you want to be sure that clang prohibits other uses of this attribute, which are not intended. Right?

Effectively, yes. I'd like to ensure that situations where the attribute does not do what the user expects are diagnosed. A good rule of thumb that I use is to diagnose (as a warning) situations where the attribute will be silently ignored, and diagnose (as an error) situations where applying the attribute would cause really bad results (like miscompiles, security concerns, etc).

What is the best way to do this? Add more negative tests cases and make sure that clang generate error diagnostic messages?

That's a good approach, yes. Though for the situations you describe, I'd probably just warn rather than err because it seems like it's harmless to ignore the attribute so long as the user knows it's being ignored.

4

But since it is an attribute to be used by SYCL run-time writers, I am not sure there is a lot of value in over-engineering the restrictions of its use. It diverts brain power from the real implementation & review and might even prevent innovation due to some creative use cases.

I disagree. Part of the real implementation is ensuring the attribute is not accidentally misused. It's frustrating for users to have an attribute silently ignored because it's easy to mistake that situation for the attribute behaving as expected.

Fznamznon updated this revision to Diff 206861.Jun 27 2019, 7:17 AM

Added warning diagnostic for sycl_kernel attribute.

Now if the sycl_kernel attribute applied to a function which doesn't meet requirements for OpenCL kernel generation, attribute will be ignored and diagnostic will be emitted.

aaron.ballman added inline comments.Jul 1 2019, 8:13 AM
clang/include/clang/Basic/Attr.td
1034

Shouldn't this be FunctionTemplate instead?

clang/include/clang/Basic/AttrDocs.td
260

generate an OpenCL kernel

261

demonstrates the compiler's

278

defines the entry point

279

The compiler will

281

the compiler will add the "foo" function

282

More details about the compilation of functions for the device part can be found

284

of the code, the SYCL runtime

308

generate an OpenCL kernel

313

The function must be a template with at least two type template parameters.

314

generates an OpenCL kernel and uses the first template parameter as a unique name

315

The host application uses

318

The function must
The first parameter is required to be a

319

The compiler uses the function object type

321

The function must return void. The compiler reuses the body of marked functions to generate the OpenCL kernel body, and the OpenCL kernel must return void.

I'd move the "The sycl_kernel_function" sentence to its own paragraph rather than as part of the final bullet.

clang/include/clang/Basic/DiagnosticSemaKinds.td
9739–9740

I think this diagnostic should be split out into a few diagnostics that explicitly cover the requirements. Something like:
'sycl_kernel' attribute only applies to a %select{templated function|function returning 'void'|etc}0. It's best to avoid trying to send users to documentation if we can just tell them explicitly what they did wrong with their code.

clang/include/clang/Sema/Sema.h
11210

Can you add a const overload that returns a const container reference?

Also, why return the container rather than returning an iterator range from the container?

clang/lib/Sema/SemaDeclAttr.cpp
6449

Spurious newline above and missing a full stop at the end of the comment. Comments below are also missing full stops.

6450–6454

You can replace all this with a cast<FunctionDecl>(D) because the common attribute handler already verifies the subject is correct.

6456

I'd appreciate this being declared as a const pointer (same for the other nodes obtained through FT).

6459–6462

If you switch the subject to FunctionTemplate, then I believe this predicate can also go away.

clang/lib/Sema/SemaSYCL.cpp
65–66

I think there's some type confusion happening here. I would expect Elt to either be auto * or Func to be a const auto &. I suspect Elt should be declared as auto *.