Page MenuHomePhabricator

[Libomptarget][WIP] Introduce VGPU Plugin
Needs ReviewPublic

Authored by atmnpatel on Nov 6 2021, 7:41 PM.

Details

Summary

This patch introduces a virtual GPU (x86) plugin. This allows for the
emulation of the GPU environment on the host. This re-uses the same
execution model, compilation paths, runtimes as a physical GPU. The
number of threads, warps, and CTAs are set through the environment
variables VGPU_{NUM_THREADS,NUM_WARPS,WARPS_PER_CTA} respectively.

Known Bugs:

  • There is UB somewhere in the DeviceRTL that occasionally sets stride to zero, causing a FPE segfault.

Diff Detail

Unit TestsFailed

TimeTest
90 msx64 debian > LLVM.Bindings/Go::go.test
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llvm-go go=/usr/bin/go test llvm.org/llvm/bindings/go/llvm

Event Timeline

atmnpatel created this revision.Nov 6 2021, 7:41 PM
atmnpatel requested review of this revision.Nov 6 2021, 7:41 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptNov 6 2021, 7:41 PM
jdoerfert added inline comments.Nov 8 2021, 7:59 AM
clang/lib/CodeGen/CGOpenMPRuntimeVirtualGPU.cpp
54 ↗(On Diff #385318)

We should be able to get rid of this file (and the cuda/hip) version. Might be the right time now as a precommit.

llvm/include/llvm/ADT/Triple.h
168

Let's call it OpenMP_VGPU or something like that to make it clear.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
2177 ↗(On Diff #385318)

@tianshilei1992 This needs a test.

openmp/libomptarget/DeviceRTL/src/Kernel.cpp
107 ↗(On Diff #385318)

I don't think we should do this. Instead, the plugin should signal as threads finish the kernel.

openmp/libomptarget/DeviceRTL/src/Mapping.cpp
239

We probably should use kind(CPU) or something instead. Nothing x86 specific about it I think.

openmp/libomptarget/include/DeviceEnvironment.h
83 ↗(On Diff #385318)

This should go into a new file (ThreadEnvironment)

I removed the shared var opt - might be best to keep this in a separate patch @tianshilei1992. Also addressed comments.

small nit fix

tianshilei1992 added inline comments.Nov 11 2021, 9:04 AM
clang/lib/Driver/ToolChains/Gnu.cpp
3075

Maybe "x86_64-openmp_vpu" now?

llvm/lib/Support/Triple.cpp
189

"openmp_vpu"?

I can't see it in the diff - does the cmake somewhere enable the existing tests on this new target?

A bit surprised to see ffi involved, are we thinking of spawning a separate process for the target?

clang/lib/Basic/Targets/X86.h
49

It's not clear to me what this is x86 specific. Being able to run our tests on power / arm etc seems like an advantage. Would also mean we would avoid adding openmp stuff the x86 specific files. Maybe OpenMPVGPUAddrSpaceMap and put it in one of the openmp source files?

clang/lib/Frontend/CompilerInvocation.cpp
3984

Add a isOpenmpVGPU function?

openmp/libomptarget/DeviceRTL/CMakeLists.txt
135

Should only add this include to the vgu, not all the plugins. May be able to use relative include paths to drop it entirely

atmnpatel updated this revision to Diff 398370.Sat, Jan 8, 1:13 PM
  • Fixed lifetime issue around ffi_call
  • Addressed comments

The existing x86 plugin uses ffi, so this does as well, no explicit benefit in doing so. Is it worth keeping?

jdoerfert added inline comments.Mon, Jan 10, 7:00 AM
llvm/lib/Support/Triple.cpp
512
openmp/libomptarget/DeviceRTL/src/Debug.cpp
53
openmp/libomptarget/DeviceRTL/src/Kernel.cpp
128 ↗(On Diff #398370)
openmp/libomptarget/DeviceRTL/src/Mapping.cpp
28
openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
290
314

Pass the memory order, also rename the arguments to match the coding convention.

317

Pass the mask

openmp/libomptarget/DeviceRTL/src/Utils.cpp
53
65

Can't we merge this with AMDGPU?

117
openmp/libomptarget/plugins/vgpu/src/rtl.cpp
304

Can we split this up and create some helper functions maybe?

openmp/libomptarget/src/rtl.cpp
34

Introduce an environment variable, if it is set, X86 target should skip the image.
Also, add a TODO such that we later look into the image and inspect it to decide automatically.

openmp/libomptarget/test/lit.cfg
189 ↗(On Diff #398370)

Leftovers.

tianshilei1992 added inline comments.Wed, Jan 12, 6:55 AM
openmp/libomptarget/DeviceRTL/CMakeLists.txt
231

It's not a good practice to specify include directories in CMake in this way. Use include_directories instead.

openmp/libomptarget/DeviceRTL/src/Kernel.cpp
127 ↗(On Diff #398370)

Are these code here unintentional? We don't need to specialize this function for vgpu IIRC.

ormris removed a subscriber: ormris.Tue, Jan 18, 10:04 AM
jdoerfert added inline comments.Tue, Jan 18, 12:50 PM
openmp/libomptarget/DeviceRTL/src/Kernel.cpp
127 ↗(On Diff #398370)

we might be able to avoid it if we move the synchronize::threads "effect" into the VGPU instead.

atmnpatel edited the summary of this revision. (Show Details)Tue, Jan 18, 11:32 PM
atmnpatel marked 9 inline comments as done.

Addressed comments

atmnpatel added inline comments.Tue, Jan 18, 11:36 PM
openmp/libomptarget/DeviceRTL/CMakeLists.txt
231

can't quite do that here I think, afaik both include_directories and target_include_directories require that CMake builds the target, but we specify custom targets/build commands so they don't get pulled in