This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][libomptarget] Notify the plugins regarding new mapping/unmappings
ClosedPublic

Authored by kevinsala on Jan 24 2023, 4:08 PM.

Details

Summary

The NextGen plugins use the information regarding new mapping/unmappings to lock/unlock the corresponding host buffer and speed up the host-device transfers. Handle also the buffers that were already locked externally, i.e., from the application.

Diff Detail

Event Timeline

kevinsala created this revision.Jan 24 2023, 4:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 24 2023, 4:08 PM
kevinsala requested review of this revision.Jan 24 2023, 4:08 PM
kevinsala updated this revision to Diff 491952.Jan 24 2023, 4:13 PM

Fix format.

tianshilei1992 requested changes to this revision.Jan 24 2023, 4:16 PM
tianshilei1992 added a subscriber: tianshilei1992.

mapped/unmapped is a confusing name, especially in the context of OpenMP, you can always check if a pointer is mapped in libomptarget. Of course I understand the "map" here is not OpenMP map.

This revision now requires changes to proceed.Jan 24 2023, 4:16 PM

mapped/unmapped is a confusing name, especially in the context of OpenMP, you can always check if a pointer is mapped in libomptarget. Of course I understand the "map" here is not OpenMP map.

True. Any recommendation for the rename?

mapped/unmapped is a confusing name, especially in the context of OpenMP, you can always check if a pointer is mapped in libomptarget. Of course I understand the "map" here is not OpenMP map.

True. Any recommendation for the rename?

I'm working on Nvidia GPUs, so IMO pinned memory makes sense. Not sure if AMD people would object, though in the implementation you are using that name. ;-)

ye-luo added inline comments.Jan 24 2023, 5:33 PM
openmp/libomptarget/include/omptargetplugin.h
219

The same as line 215

@tianshilei1992 This is supposed to mean "mapped" in the OpenMP sense. So the plugins are informed if a mapping occurs.

I think this is fine. Two nits and Ye's comment need to be addressed. @tianshilei1992 please accept or suggest a different name. Since we really mean "OpenMP mapped" here, I find the name fitting.

openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
1855
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h
533

Docs

tianshilei1992 added a comment.EditedJan 24 2023, 6:39 PM

If it servers as notifying data mapping, __tgt_rtl_notify_data_map or something like that sounds more appropriate to me. __tgt_rtl_data_mapped sounds like checking whether it is mapped, and that's how it confused me initially.

If it servers as notifying data mapping, __tgt_rtl_notify_data_map or something like that sounds more appropriate to me. __tgt_rtl_data_mapped sounds like checking whether it is mapped, and that's how it confused me initially.

Can be tgt_rtl_data_notify_mapped and tgt_rtl_data_notify_unmapped by keeping __tgt_rtl_data prefix

ye-luo added inline comments.Jan 24 2023, 7:58 PM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
887

I don't understand this part of code.
The implementation actually lock the host buffer instead of just notifying.

jdoerfert added inline comments.Jan 24 2023, 8:31 PM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
887

While it is mapped, we want to keep it (by default) pinned. The escape hatch (env var) is missing though.
This allows us to avoid the pin/unpin on the data transfer back and intermediate updates.

ye-luo added inline comments.Jan 24 2023, 8:48 PM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
887

Unconditionally pinning mapped host memory should be an option to opt-in rather than default.
If I just map 10 scalars, don't pin them, staging can be better.
if users want fine-grain control, use the lock API. For really lazy users, give them the env var to opt-in.

jdoerfert added inline comments.Jan 25 2023, 8:02 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
887

Mapping scalars as firstprivate or as tofrom? The former would not go through this mechanism. The latter makes reasonable sense to pin the page, no?

I agree we need an env var, default is the question.

ye-luo added inline comments.Jan 25 2023, 8:04 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
887

I was referring to map(tofrom) scalars. firstprivte is not considered a map technically. I don't think you want to pay the cost of pinning 10 pages due to 10 scalars. Pinning is very expensive (interacting with OS).

Let's make it opt-in then. And we can use heuristics as well as we go.

kevinsala updated this revision to Diff 492603.EditedJan 26 2023, 5:08 PM

Changes:

  • Fixing comments.
  • Re-writing the PinnedAllocationMapTy functions to be clearer and easier to read.
  • Renaming RTL API functions.
  • Disable automatic locking of mapped host buffers by default. Can be enabled by setting LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS envar.
  • Although the previous envar is not set, mapped buffers that were externally locked (e.g, the app allocating HSA memory) are registered into the plugin's map and we use a single-step asynchronous HSA memory transfer. (@ye-luo)

This patch still detects invalid partial overlapping in offloading/complex_reduction.cpp, which I have to fix. The previous patch fails with OpenMC, I've to check if this also fails.

kevinsala marked 8 inline comments as done.Jan 26 2023, 5:09 PM
kevinsala added inline comments.
openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
1855

It's an output parameter. I'll add the proper comment.

kevinsala updated this revision to Diff 492664.Jan 27 2023, 1:30 AM
kevinsala marked an inline comment as done.

Fix format.

kevinsala updated this revision to Diff 493034.Jan 28 2023, 2:35 PM

Fixing bug. Now complex reduction tests work.

tianshilei1992 added inline comments.Jan 28 2023, 3:21 PM
openmp/libomptarget/include/omptargetplugin.h
215

__tgt_rtl_notify_data_mapped?

kevinsala added inline comments.Jan 29 2023, 2:43 AM
openmp/libomptarget/include/omptargetplugin.h
215

This name was proposed by @ye-luo to be consistent with the other data-related API functions:

Can be tgt_rtl_data_notify_mapped and tgt_rtl_data_notify_unmapped by keeping __tgt_rtl_data prefix

This revision is now accepted and ready to land.Jan 29 2023, 6:18 AM
ye-luo added inline comments.Jan 29 2023, 8:33 AM
openmp/libomptarget/src/device.cpp
611

"Size=%" missing "%"

kevinsala updated this revision to Diff 493102.Jan 29 2023, 9:15 AM

Fix debug print

kevinsala marked 2 inline comments as done.Jan 29 2023, 9:16 AM
ye-luo accepted this revision.Jan 29 2023, 9:31 AM
kevinsala updated this revision to Diff 494687.Feb 3 2023, 10:33 AM

The locking of mapped host buffers is disabled by default. Can be enabled through the LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS envar. The envar accepts boolean values (on/off) and a special option:

  • off: Do not lock mapped host buffers (default).
  • on: Lock mapped host buffers automatically, but do not report lock failures if the plugin fails to lock them.
  • mandatory: Lock mapped host buffers automatically and treat locking failures in the plugins as fatal errors. This option may be useful for debugging purposes.

This is probably a very late report but I happened to be looking at openmp bots for a different reason, some of the error returns in this change are failing to compile:
https://lab.llvm.org/buildbot/#/builders/4/builds/32052/steps/5/logs/stdio

Despite the "gcc" name, this bot appears to use clang-11 which is still supported for building llvm.

This is a simplified version: https://godbolt.org/z/x91h9KcGq

I'm not sure how we're expected to use Expected (no pun intended) in this scenario. There is probably a helper function somewhere.

This is probably a very late report but I happened to be looking at openmp bots for a different reason, some of the error returns in this change are failing to compile:
https://lab.llvm.org/buildbot/#/builders/4/builds/32052/steps/5/logs/stdio

Despite the "gcc" name, this bot appears to use clang-11 which is still supported for building llvm.

This is a simplified version: https://godbolt.org/z/x91h9KcGq

I'm not sure how we're expected to use Expected (no pun intended) in this scenario. There is probably a helper function somewhere.

Needs an explicit move https://godbolt.org/z/Kj1s7od7e.

This is probably a very late report but I happened to be looking at openmp bots for a different reason, some of the error returns in this change are failing to compile:
https://lab.llvm.org/buildbot/#/builders/4/builds/32052/steps/5/logs/stdio

Despite the "gcc" name, this bot appears to use clang-11 which is still supported for building llvm.

This is a simplified version: https://godbolt.org/z/x91h9KcGq

I'm not sure how we're expected to use Expected (no pun intended) in this scenario. There is probably a helper function somewhere.

This should resolve it, thanks for bringing this to my attention https://reviews.llvm.org/rGedc03550063c.