This is an archive of the discontinued LLVM Phabricator instance.

Synchronize after each GPU action in the nextgen plugin
AcceptedPublic

Authored by efwright on Jun 27 2023, 12:12 AM.

Details

Reviewers
jdoerfert
ye-luo
Summary

Creating a debug option to synchronize GPU kernel launches and data transfers immediately. Done through an environment variable. Currently done in the common plugin interface, so hopefully it would be applicable to all architectures. We instead could do it inside the individual architecture implementations instead, such as having a "cudaStreamSynchronize" call immediately after the "cudaLaunchKernel", for example. Though I think the way I have it now is basically equivalent and doesn't need to be done for each architecture.]

The env variable for now is "LIBOMPTARGET_FORCE_SYNCHRONIZE"

Diff Detail

Event Timeline

efwright created this revision.Jun 27 2023, 12:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 12:12 AM
efwright requested review of this revision.Jun 27 2023, 12:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 12:12 AM
kevinsala added inline comments.
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
254

You can store this envar in the GenericDeviceTy as other envars using BoolEnvar or similar. This way the value will be cached.

260

I think we should return the error instead of processing it here. The plugin API should be the one reporting the error and perform the necessary actions.

932

Could this function could be executed inside the AsyncInfoWrapper.finalize(Err)?

openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h
535 ↗(On Diff #534851)

If you will keep this function, I would rename it to a clearer name. In this way, it seems like you are just checking if the force synchronization is enabled.

openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp
853 ↗(On Diff #534851)

Should be removed

Please also document the new envar in openmp/docs/design/Runtimes.rst, in the Libomptarget Environment Variables section.

Documentation of the new env var is missing. See openmp/docs/...

ye-luo added a subscriber: ye-luo.Jun 27 2023, 12:26 PM

Please also include the environment variable in the patch summary.

efwright updated this revision to Diff 540467.Jul 14 2023, 9:56 AM
efwright edited the summary of this revision. (Show Details)

Moved the functionality to the AsyncInfoWrapperTy.finalize() function. Stores the error and it will be returned from the function to reported properly. Looked into the BoolEnvar as recommended. was thinking of storing it in the AsyncInfoWrapperTy class, but since that class gets created and destroyed frequently it's probably a better idea to store it somewhere more permanent.

jdoerfert added inline comments.Jul 14 2023, 11:20 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
216

Are we sure that this has the desired effect? Also, do we need to check for the presence of Queue?

kevinsala added inline comments.Jul 17 2023, 2:03 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
214

We could define this variable as static to initialize just once.

216

We could generalize it and merge with the if below (line 221):

if ((ForceSynchronization || AsyncInfoPtr == &LocalAsyncInfo) && AsyncInfoPtr->Queue && !Err)
  Err = Device.synchronize(AsyncInfoPtr);
efwright updated this revision to Diff 541000.Jul 17 2023, 6:46 AM

Made the env var static and added a check for the queue.

@jdoerfert I have a code where I'm just launching a couple of different kernels back to back, and I at least see the code is synchronizing here. Do you have something specific I could test? Possibly something we know will crash to see if this change helps with the error feedback and what not?

kevinsala added inline comments.Jul 17 2023, 7:01 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
217

I believe the parenthesis in my snippet above are necessary. As the code is now, it would synchronize inconditionally when the new envar is set. However, when there is no queue (stream) or there was a previous error (Err), we shouldn't synchronize. I think the condition should be:

if ((ForceSynchronization || AsyncInfoPtr == &LocalAsyncInfo) && AsyncInfoPtr->Queue && !Err)

Also, the sycnhronize call should be changed to the code below. Otherwise, we will always synchronize with the local async info.

Err = Device.synchronize(AsyncInfoPtr);
efwright updated this revision to Diff 543605.Jul 24 2023, 9:42 AM
efwright marked 5 inline comments as done.
efwright marked an inline comment as done.
ye-luo accepted this revision.Jul 24 2023, 9:51 AM
ye-luo added inline comments.
openmp/docs/design/Runtimes.rst
1248

Please mention the default value.

This revision is now accepted and ready to land.Jul 24 2023, 9:51 AM
ye-luo added inline comments.Jul 24 2023, 9:52 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
953

Better also remove all the added empty lines.