Page MenuHomePhabricator

[mlir] Set CUDA/ROCm context before creating resources.
ClosedPublic

Authored by csigg on Jan 12 2021, 7:11 AM.

Details

Summary

The current context is thread-local state, and in preparation of GPU async execution (on multiple threads) we need to set the context before calling API that create resources.

Diff Detail

Event Timeline

csigg created this revision.Jan 12 2021, 7:11 AM
csigg requested review of this revision.Jan 12 2021, 7:11 AM
csigg retitled this revision from Set CUDA/ROCm context before creating resources. to [mlir] Set CUDA/ROCm context before creating resources..Jan 12 2021, 10:21 AM
herhut added inline comments.Jan 15 2021, 7:11 AM
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp
61

Should this rather use push/pop in case there is some external (to the gpu dialect) use of the context, too? Like if this runs inside of some other runtime.

csigg added inline comments.Jan 20 2021, 4:31 AM
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp
61

It certainly could, but it seems a little over-engineered at this stage. But happy to add it if you think it makes sense.

herhut added inline comments.Jan 20 2021, 8:54 AM
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp
61

CUDA context issues are annoying to debug and why not if we can avoid creating that issue. I will forget this and then be puzzled :)

csigg updated this revision to Diff 318136.Jan 21 2021, 2:17 AM

Restore context.

Hmm, this turned out more complex than I had thought. I had a simple push/pop in mind. If that is not enough, lets keep it at the simple version for now.

mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp
45

Creating it always as before would make this less complex. What is the drawback?

52

This might no longer be the current one, if it was just created.

56

Why not use cuCtxPopCurrent here?

69

Doesn't cuCtxCreate already do this?

csigg updated this revision to Diff 318167.Jan 21 2021, 5:14 AM

Switch to primary context.

csigg added inline comments.Jan 21 2021, 5:15 AM
mlir/tools/mlir-cuda-runner/cuda-runtime-wrappers.cpp
45

Setting a specific context allows running on a different device, for example. The use is quite limited though because mgpuSetContext() is not thread safe.

We will probably need to expose the per-thread context per thread, or per function that needs one.

I switched it to the primary context, which is the simplest.

52

See comment below.

56

The CUDA context stack is from early CUDA days. I have not seen anyone using it in years, and the HIP equivalent is marked deprecated.

69

cuCtxCreate sets the current context, this restores it so that the c'tor can grab it. It's a bit of a back and forth, but there is no call_once-else.

csigg updated this revision to Diff 318170.Jan 21 2021, 5:33 AM

Rebase.

csigg added inline comments.Jan 24 2021, 4:56 AM
mlir/tools/mlir-rocm-runner/rocm-runtime-wrappers.cpp
49

This should say hipCtxGetCurrent will fix later.

herhut accepted this revision.Jan 25 2021, 12:53 AM

Thanks for cleaning this up.

mlir/tools/mlir-rocm-runner/rocm-runtime-wrappers.cpp
50

context -> Context

This revision is now accepted and ready to land.Jan 25 2021, 12:53 AM
csigg updated this revision to Diff 319274.Jan 26 2021, 5:25 AM

Fixing typos.

This revision was automatically updated to reflect the committed changes.