
[OpenMP][CUDA] Fix the issue that P2P memcpy doesn't work
Needs Review · Public

Authored by tianshilei1992 on Mar 30 2022, 2:56 PM.

Details

Reviewers
jdoerfert
Summary

This patch fixes P2P memcpy, which did not work because the current context was not set before calling the API function. It also adds a matrix that tracks the state of each pair of devices, so each pair only needs to be queried and configured once.
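
The per-pair state tracking described above could look roughly like the following. This is a minimal sketch, not the code from the diff: `PeerAccessState`, `PeerAccessMatrix`, `canUsePeerCopy`, and the `TryEnable` callback are hypothetical names. In a real plugin the callback would presumably set the current context (e.g. `cuCtxSetCurrent`) and then call `cuDeviceCanAccessPeer` and `cuCtxEnablePeerAccess`; that mapping is an assumption. The sketch also ignores thread safety, which a real runtime would need to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names throughout; this only illustrates the idea of
// querying and configuring each device pair once.
enum class PeerAccessState { Unknown, Yes, No };

class PeerAccessMatrix {
public:
  // Stand-in for the driver work: make the source device's context
  // current, query peer capability, and enable peer access.
  // Returns true if peer access is usable for (SrcDevId, DstDevId).
  using TryEnableFn = std::function<bool(int SrcDevId, int DstDevId)>;

  PeerAccessMatrix(std::size_t NumDevices, TryEnableFn TryEnable)
      : NumDevices(NumDevices),
        States(NumDevices * NumDevices, PeerAccessState::Unknown),
        TryEnable(std::move(TryEnable)) {}

  // Lazily resolves the state of a pair on first use and caches it,
  // so the callback runs at most once per ordered pair.
  bool canUsePeerCopy(int SrcDevId, int DstDevId) {
    assert(SrcDevId >= 0 && static_cast<std::size_t>(SrcDevId) < NumDevices);
    assert(DstDevId >= 0 && static_cast<std::size_t>(DstDevId) < NumDevices);
    PeerAccessState &State =
        States[static_cast<std::size_t>(SrcDevId) * NumDevices +
               static_cast<std::size_t>(DstDevId)];
    if (State == PeerAccessState::Unknown)
      State = TryEnable(SrcDevId, DstDevId) ? PeerAccessState::Yes
                                            : PeerAccessState::No;
    return State == PeerAccessState::Yes;
  }

private:
  std::size_t NumDevices;
  std::vector<PeerAccessState> States; // row-major: [Src][Dst]
  TryEnableFn TryEnable;
};
```

A caller would check `canUsePeerCopy(Src, Dst)` before a device-to-device copy and fall back to a non-peer copy path when it returns false; after the first call for a pair, the cached state is returned without touching the driver again.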

Event Timeline

tianshilei1992 created this revision. · Mar 30 2022, 2:56 PM
Herald added a project: Restricted Project. · Mar 30 2022, 2:56 PM
tianshilei1992 requested review of this revision. · Mar 30 2022, 2:56 PM

add blank line

set the current context before the D2D copy, in case setting up the destination device fails

jdoerfert added inline comments. · Apr 6 2022, 8:00 AM
openmp/libomptarget/plugins/cuda/src/rtl.cpp, line 1110

Do we want to do this every time? Is this costly the second time around? Should we keep a flag?

add a state matrix

tianshilei1992 marked an inline comment as done. · Apr 12 2022, 9:40 AM
tianshilei1992 edited the summary of this revision. · Apr 12 2022, 9:48 AM

lazy initialization

remove unnecessary changes

remove intermediate state

tianshilei1992 edited the summary of this revision. · Apr 15 2022, 7:20 PM

rebase and ping again