Initial check-in of Acxxel (StreamExecutor renamed)

Description

Summary:
Acxxel is basically a simplified redesign of StreamExecutor.

Here are the major points where Acxxel differs from the current
StreamExecutor design:

  • Acxxel doesn't support the kernel and kernel loader types designed for emission by the compiler to support type-safe kernel launches. For CUDA, kernels in Acxxel can be seamlessly launched using the standard CUDA triple-chevron kernel launch syntax that is available with clang and nvcc. For CUDA and OpenCL, kernel arguments can be passed in the old-fashioned way, as one array of pointers to arguments and another array of argument sizes. Although OpenCL doesn't get a type-safe kernel launch method, it does still get the benefit of all the memory management wrappers. In the future, clang may add support for triple-chevron OpenCL kernel launches, or some other type-safe OpenCL kernel launch method.
  • Acxxel does not depend on any other code in LLVM, so it builds completely independently from LLVM.
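
To illustrate the "old-fashioned" argument-passing convention mentioned in the first point, here is a minimal host-only sketch. It does not use the Acxxel API: `saxpyUntyped` and `runSaxpyExample` are hypothetical names invented for this example, and the "kernel" is an ordinary C++ function standing in for device code. The point is only to show what packing arguments as one array of pointers plus one array of sizes looks like, and why nothing in that scheme checks argument types. With the CUDA triple-chevron syntax the compiler checks the argument types at the call site instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a kernel; NOT part of Acxxel.
// Computes Y[I] = A * X[I] + Y[I] from type-erased arguments.
bool saxpyUntyped(void **Args, const std::size_t *Sizes, std::size_t NumArgs) {
  // The size array is the only sanity check available here: nothing
  // verifies that the caller packed the types this function expects.
  if (NumArgs != 4 || Sizes[0] != sizeof(float) ||
      Sizes[1] != sizeof(const float *) || Sizes[2] != sizeof(float *) ||
      Sizes[3] != sizeof(std::size_t))
    return false;
  float A = *static_cast<float *>(Args[0]);
  const float *X = *static_cast<const float **>(Args[1]);
  float *Y = *static_cast<float **>(Args[2]);
  std::size_t N = *static_cast<std::size_t *>(Args[3]);
  for (std::size_t I = 0; I < N; ++I)
    Y[I] = A * X[I] + Y[I];
  return true;
}

// Packs the arguments as an array of pointers and an array of sizes,
// then invokes the stand-in kernel and returns the updated Y vector.
std::vector<float> runSaxpyExample() {
  float A = 2.0f;
  std::vector<float> X = {1.0f, 2.0f, 3.0f};
  std::vector<float> Y = {10.0f, 20.0f, 30.0f};
  const float *XPtr = X.data();
  float *YPtr = Y.data();
  std::size_t N = X.size();

  // Each entry points at the argument value itself, not at the data it names.
  void *Args[] = {&A, &XPtr, &YPtr, &N};
  std::size_t Sizes[] = {sizeof(A), sizeof(XPtr), sizeof(YPtr), sizeof(N)};

  bool Ok = saxpyUntyped(Args, Sizes, 4);
  assert(Ok && "argument sizes did not match the kernel's expectations");
  (void)Ok; // Avoid an unused-variable warning when asserts are disabled.
  return Y;
}
```

Swapping two entries of `Args` that happen to have the same size (for example the two pointers) would still pass the size check and silently compute the wrong thing, which is the kind of mistake a type-safe launch method rules out.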

The goal will be to check in Acxxel and remove StreamExecutor, or
perhaps to remove the old StreamExecutor and rename Acxxel to
StreamExecutor, so I think Acxxel should be thought of as a new version
of StreamExecutor, not as a separate project.

Reviewers: jlebar, jprice

Subscribers: beanz, mgorny, modocache, parallel_libs-commits

Differential Revision: https://reviews.llvm.org/D25701

Details

Committed
jhen, Oct 25 2016, 1:18 PM
Differential Revision
D25701: Initial check-in of Acxxel (StreamExecutor renamed)
Parents
rL285110: revert: "Remove debug location from common tail when tail-merging"

Event Timeline

It seems I am missing some context here, but what is the motivation behind this change? The summary states what you changed, but I am missing the reason why the old StreamExecutor design did not work and the situations in which it did not meet your needs.

In fact, I am very interested in using something like StreamExecutor as a runtime system for Polly's accelerator generation system. So I am certainly planning to look closer into this change. (Maybe we can also talk at or around the LLVM dev meeting?)

jhen added a comment. Oct 31 2016, 9:51 AM

Hi grosser. Sorry for not including the motivation in the commit message. During the review (https://reviews.llvm.org/D25701) I added a comment at the beginning about the motivation for the change in response to the same question from hfinkel. I won't repeat the details here. Instead, I'll just provide the previous link and summarize by saying that the old kernel launch model didn't work with templated CUDA kernels, so I decided not to keep it, but it could return later (hopefully in a more general form).

I'm really glad to hear you are considering StreamExecutor/Acxxel for use in Polly. I'd be glad to discuss any changes that can make the two projects work better together. Unfortunately, I will be out of the Bay Area during the LLVM dev meeting. Perhaps we can start a discussion thread on parallel_libs-dev, or feel free to contact me directly by email and we can set up a call or an in-person meeting, whichever is most convenient.

Alright. Thank you for the info. Yes, we should discuss this at some point!