This patch fixes P2P memcpy, which previously did not work. The root cause was that we did not set the current context before calling the API function. The patch also adds a matrix that tracks the state of each pair of devices, so that each pair only needs to be queried and configured once.
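To illustrate the per-pair state matrix described above, here is a minimal sketch. It is not the code from this patch: `PeerAccessState`, `PeerAccessMatrix`, `canAccessPeer`, and the `TryEnable` callback are hypothetical names. The callback stands in for the driver step, which in the real plugin would presumably make the source device's context current before querying and enabling peer access.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is a sketch, not the patch itself.
enum class PeerAccessState { Unknown, Yes, No };

class PeerAccessMatrix {
  std::size_t NumDevices;
  // Row-major NumDevices x NumDevices table, one entry per (Src, Dst) pair.
  std::vector<PeerAccessState> States;
  // Stand-in for the driver query/enable step. Assumed to set the current
  // context for Src itself and to return true if Src can access Dst.
  std::function<bool(int, int)> TryEnable;

public:
  PeerAccessMatrix(std::size_t N, std::function<bool(int, int)> TryEnable)
      : NumDevices(N), States(N * N, PeerAccessState::Unknown),
        TryEnable(std::move(TryEnable)) {}

  // Returns whether Src can access Dst. The callback runs only the first
  // time a given pair is seen; later calls reuse the cached result.
  bool canAccessPeer(int Src, int Dst) {
    PeerAccessState &S =
        States[static_cast<std::size_t>(Src) * NumDevices +
               static_cast<std::size_t>(Dst)];
    if (S == PeerAccessState::Unknown)
      S = TryEnable(Src, Dst) ? PeerAccessState::Yes : PeerAccessState::No;
    return S == PeerAccessState::Yes;
  }
};
```

The sketch caches negative results as well, so a pair that cannot use P2P is not queried again. It does no bounds checking and is not thread-safe; a real plugin would need to consider concurrent transfers.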
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
1064 | Do we want to do this every time? Is this costly the second time around? Should we keep a flag? |
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
1079–1080 | This default label will cause warning/error... |
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
1079–1080 | Thanks for pointing that out. I'll push a fix pretty soon. |