This implementation does not currently support multiple concurrent streams, and
it won't allow kernels to be launched with grids larger than one block or
blocks larger than one thread. These limitations could be removed in the future
by launching new threads on the host, but that is not done in this
implementation.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
204 ↗ | (On Diff #71053) | Is the second const necessary? I feel like it kind of confuses things without really affecting the meaning of the signature in any way. And the second const has no bearing on whether or not you can bind an std::function to a given function: https://godbolt.org/g/XiksZ9 |
223 ↗ | (On Diff #71053) | Nit, maybe HostFunction != nullptr would be clearer? They mean the same thing. |
242 ↗ | (On Diff #71053) | Why do we return this as a pointer rather than a reference? |
streamexecutor/include/streamexecutor/platforms/host/HostPlatform.h | ||
37 ↗ | (On Diff #71053) | Isn't the Platform supposed to own the Device object? |
streamexecutor/include/streamexecutor/platforms/host/HostPlatformDevice.h | ||
38 ↗ | (On Diff #71053) | Oh, I see why you return a pointer from getHostFunction(). Um. You could still return a reference and take the address here... That doesn't actually seem so bad to me. In either case you probably want to make these multi-kernel objects not-movable, or otherwise allocate the std::function inside a unique_ptr. Because if this thing gets moved we're currently screwed. |
151 ↗ | (On Diff #71053) | Does this file supercede the for-testing platform we have? (Not necessary to do the switch in this patch, of course.) |
- Respond to jlebar's comments
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
204 ↗ | (On Diff #71053) | You were right, it wasn't necessary. I removed it. |
223 ↗ | (On Diff #71053) | It's a unique_ptr now and it uses the != nullptr test. |
242 ↗ | (On Diff #71053) | Changed it to return a reference. |
streamexecutor/include/streamexecutor/platforms/host/HostPlatform.h | ||
37 ↗ | (On Diff #71053) | Right. I totally forgot about that. Now it does. |
streamexecutor/include/streamexecutor/platforms/host/HostPlatformDevice.h | ||
38 ↗ | (On Diff #71053) | Now returning a ref and taking the address. Also converted the MultiKernelLoaderSpec to store a unique_ptr to the function in order to correctly handle moving. |
151 ↗ | (On Diff #71053) | Yes, I think I will get rid of the for-testing host platform in another patch. |
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
273 ↗ | (On Diff #71184) | Nit, std::move(Function). (std::function objects may be heavyweight in general, even though in this case it probably doesn't matter.) |
streamexecutor/include/streamexecutor/platforms/host/HostPlatform.h | ||
39 ↗ | (On Diff #71184) | Do we need this to be thread-safe? |
39 ↗ | (On Diff #71184) | I'm confused where we call delete on the HostPlatformDevice. It's not super important, but I think if we don't it may show up as a leak in asan, and that means it will show up as a leak in user programs. |
- Make thread safe and destroy PlatformDevice
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
273 ↗ | (On Diff #71184) | I made that change and also changed the signature of the calling function, addHostFunction to take the function argument as an rvalue ref to prevent the copy during the function call. |
streamexecutor/include/streamexecutor/platforms/host/HostPlatform.h | ||
39 ↗ | (On Diff #71184) | Yes, we do want thread safety. Thanks for catching that. I added a mutex to protect it. I didn't do lazy static initialization because, as jprice suggested before, I think we are going to want to add another method to clear out the platform. I also made the HostDevicePlatform explicitly owned by HostPlatform so it's clear when it is destroyed. |
streamexecutor/include/streamexecutor/KernelSpec.h | ||
---|---|---|
273 ↗ | (On Diff #71193) | The usual convention I've seen is to take the heavyweight, copyable arg (be it an std::function, std::string, std::vector, whatever) that you're taking ownership of by value, rather than by rvalue ref. You're right that making the function arg an rvalue ref prevents accidental copies. The problem is that it also makes *intentional* copies a pain. I think you have to spell out the whole type again: std::function<const void*> fn = ...; addHostFunction("kernel_name", std::function<const void*>(fn)); |