The current context is thread-local state, and in preparation for GPU async execution (on multiple threads) we need to set the context before calling APIs that create resources.
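To make the thread-local point concrete, here is a minimal sketch in plain C++ that only models the behaviour. It does not call the CUDA driver; `FakeContext`, `currentContext`, `createResource`, and `createOnWorker` are hypothetical names invented for illustration. It shows that a context bound on one thread is not visible on a newly started thread unless that thread binds it too.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for a driver context handle (not the real CUcontext).
using FakeContext = int;

// Models the per-thread "current context" slot; 0 means no context is bound.
thread_local FakeContext currentContext = 0;

// Hypothetical analogue of a resource-creating call: it only succeeds when
// the calling thread has a current context.
bool createResource() { return currentContext != 0; }

// Runs createResource() on a fresh thread, optionally binding ctx first.
bool createOnWorker(FakeContext ctx, bool bindFirst) {
  bool ok = false;
  std::thread worker([&] {
    if (bindFirst)
      currentContext = ctx; // analogue of setting the current context
    ok = createResource();
  });
  worker.join();
  return ok;
}
```

In this model, binding a context on the main thread lets `createResource()` succeed there, while `createOnWorker(ctx, false)` still fails because the worker starts with an empty slot.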
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp | ||
---|---|---|
61 | Should this rather use push/pop in case there is some external (to the gpu dialect) use of the context, too? Like if this runs inside of some other runtime. |
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp | ||
---|---|---|
61 | It certainly could, but it seems a little over-engineered at this stage. But happy to add it if you think it makes sense. |
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp | ||
---|---|---|
61 | CUDA context issues are annoying to debug, so why not avoid creating that issue if we can? I will forget this and then be puzzled :) |
Hmm, this turned out more complex than I had thought. I had a simple push/pop in mind. If that is not enough, let's keep it at the simple version for now.
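For reference, the "simple push/pop" idea can be sketched as a scoped guard. This is a plain C++ model, not the patch's code and not the driver API: `contextStack`, `current`, `ScopedContext`, and `wrapperCall` are hypothetical names, and in real code the push and pop would presumably map to `cuCtxPushCurrent` and `cuCtxPopCurrent`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a driver context handle (not the real CUcontext).
using FakeContext = int;

// Models the per-thread context stack; the top entry is the current context.
thread_local std::vector<FakeContext> contextStack;

// Returns the current context of the calling thread, or 0 if none is bound.
FakeContext current() { return contextStack.empty() ? 0 : contextStack.back(); }

// RAII guard: pushes a context on construction and pops it on destruction,
// so whatever an outer runtime had bound becomes current again afterwards.
class ScopedContext {
public:
  explicit ScopedContext(FakeContext ctx) { contextStack.push_back(ctx); }
  ~ScopedContext() { contextStack.pop_back(); }
  ScopedContext(const ScopedContext &) = delete;
  ScopedContext &operator=(const ScopedContext &) = delete;
};

// Hypothetical wrapper entry point: works under its own context and reports
// which context was current while it ran.
FakeContext wrapperCall(FakeContext ours) {
  ScopedContext guard(ours);
  return current();
}
```

The point of the guard is that an externally bound context survives the wrapper call unchanged, which is the scenario raised in the inline comment on line 61.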
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp | ||
---|---|---|
45 | Creating it always as before would make this less complex. What is the drawback? | |
52 | This might no longer be the current one, if it was just created. | |
56 | Why not use cuCtxPopCurrent here? | |
69 | Doesn't cuCtxCreate already do this? |
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp | ||
---|---|---|
45 | Setting a specific context allows running on a different device, for example. The use is quite limited, though, because mgpuSetContext() is not thread-safe. We will probably need to expose the context per thread, or per function that needs one. I switched it to the primary context, which is the simplest. | |
52 | See comment below. | |
56 | The CUDA context stack is from early CUDA days. I have not seen anyone using it in years, and the HIP equivalent is marked deprecated. | |
69 | cuCtxCreate sets the current context; this restores it so that the c'tor can grab it. It's a bit of a back and forth, but there is no call_once-else. |
mlir/tools/mlir-rocm-runner/rocm-runtime-wrappers.cpp | ||
---|---|---|
49 | This should say hipCtxGetCurrent; will fix later. |
Thanks for cleaning this up.
mlir/tools/mlir-rocm-runner/rocm-runtime-wrappers.cpp | ||
---|---|---|
50 | context -> Context |