This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Introduce basic JIT support to OpenMP target offloading
ClosedPublic

Authored by tianshilei1992 on Dec 4 2022, 7:26 PM.

Details

Summary

This patch adds basic JIT support for OpenMP target offloading. Currently it only works on NVIDIA GPUs.

Support for AMDGPU can be added easily by implementing just three interface functions. However, the infrastructure needs a small extra extension (adding a pre-process hook) to support AMDGPU portability, because the AMDGPU backend reads target features from functions. https://github.com/shiltian/llvm-project/commit/02bc7effccc6ff2f5ab3fe5218336094c0485766#diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432 shows roughly how it works.

As for testing, even though I added the corresponding code to the CMake files, the test still cannot be triggered because some code is missing in the new plugin's CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, -foffload-lto is needed when compiling, and -foffload-lto -Wl,--embed-bitcode is needed when linking. That implies that LTO is required to enable JIT mode.
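For example, a JIT-enabled build might look like the following (only the LTO-related flags are what this patch requires; the offload-target flag and file names are illustrative):

  clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -foffload-lto -c foo.c -o foo.o
  clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -foffload-lto -Wl,--embed-bitcode foo.o -o foo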

Diff Detail

Event Timeline

tianshilei1992 created this revision.Dec 4 2022, 7:26 PM
Herald added a project: Restricted Project. Dec 4 2022, 7:26 PM
tianshilei1992 requested review of this revision.Dec 4 2022, 7:26 PM
Herald added projects: Restricted Project, Restricted Project. Dec 4 2022, 7:26 PM
tianshilei1992 added inline comments.Dec 4 2022, 7:29 PM
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
879 ↗(On Diff #479964)

This will be pushed by Joseph in another patch.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/CMakeLists.txt
15–22

I guess this might cause the issue of non-protected global symbols.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
185

Is there any way to avoid writing it to a file here?

256

Is there a better way to compare two triples?

tianshilei1992 added inline comments.Dec 4 2022, 7:32 PM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
229

I might change the return value to Expected<bool> so that it can pass the error info back to the caller.
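A minimal sketch of what that could look like (the function name and the check are hypothetical), using LLVM's Expected/Error utilities:

#include "llvm/Support/Error.h"
#include "llvm/Support/MemoryBuffer.h"

// Hypothetical validity check that returns Expected<bool> so the caller gets a
// reason on failure rather than a bare false.
llvm::Expected<bool> isValidImage(llvm::MemoryBufferRef Image) {
  if (Image.getBufferSize() == 0)
    return llvm::createStringError(llvm::inconvertibleErrorCode(),
                                   "empty device image");
  // ... real validity checks would go here ...
  return true;
}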

openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
723

Do we want a configurable value for the OptLevel, or can we determine from somewhere else what value was used at compile time?

Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by libomptarget proper rather than the plugins. I guess this is needed for per-kernel specialization? My idea of the rough pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
879 ↗(On Diff #479964)

Did that this morning.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/CMakeLists.txt
15–22

Should we be able to put all this in the add_llvm_library?

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
48–52

We could probably limit these to the ones we actually care about since we know the triples. Not sure if it would save us much runtime.

185

Why do we need to invoke LTO here? I figured that we could call the backend directly since we have no need to actually link any files, and we may not need to run more expensive optimizations when the bitcode is already optimized. If you do that, then you should be able to just use a raw_svector_ostream as your output stream and get the compiled output written to that buffer.
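A minimal sketch of that approach, assuming an already-optimized Module M and an existing TargetMachine TM (the enum spelling matches the LLVM version at the time of this review; CGFT_AssemblyFile emits PTX for NVPTX targets):

#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetMachine.h"

// Run only the codegen backend and capture its output in memory instead of
// writing it to a temporary file.
llvm::SmallVector<char, 0> emitAssembly(llvm::Module &M,
                                        llvm::TargetMachine &TM) {
  llvm::SmallVector<char, 0> Buffer;
  llvm::raw_svector_ostream OS(Buffer);
  llvm::legacy::PassManager PM;
  if (TM.addPassesToEmitFile(PM, OS, /*DwoOut=*/nullptr,
                             llvm::CGFT_AssemblyFile))
    llvm::report_fatal_error("Target does not support emitting this file type");
  PM.run(M);
  // Buffer now holds the backend output and can be wrapped in a MemoryBuffer.
  return Buffer;
}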

Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by libomptarget proper rather than the plugins. I guess this is needed for per-kernel specialization? My idea of the rough pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)

We could handle them in libomptarget, but that's going to require adding another two interface functions: is_valid_bitcode_image and compile_bitcode_image. It is doable. Handling them in the plugin as a separate module can just reuse the two existing interfaces.

Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)

We can give CUDA PTX directly, since the CUDA JIT effectively just calls ptxas (as opposed to ptxas -c, which would require nvlink afterwards).
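For illustration, a minimal sketch of handing PTX text straight to the CUDA driver (the driver API calls are standard; everything else, including error handling, is an assumption):

#include <cuda.h>

// The driver JIT-compiles the PTX with its built-in ptxas; no separate
// ptxas/nvlink invocation is needed. PTX is assumed to be null-terminated.
CUmodule loadPTX(const char *PTX) {
  CUmodule Module = nullptr;
  CUresult Res = cuModuleLoadDataEx(&Module, PTX, /*numOptions=*/0,
                                    /*options=*/nullptr,
                                    /*optionValues=*/nullptr);
  if (Res != CUDA_SUCCESS) {
    // Real code would translate Res into a proper error here.
  }
  return Module;
}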

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
185

For the purpose of this basic JIT support, we indeed just need the backend. However, since we have plans for super-optimization, etc., having an optimization pipeline here is also useful.

Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by libomptarget proper rather than the plugins. I guess this is needed for per-kernel specialization? My idea of the rough pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)

We could handle them in libomptarget, but that's going to require adding another two interface functions: is_valid_bitcode_image and compile_bitcode_image. It is doable. Handling them in the plugin as a separate module can just reuse the two existing interfaces.

Would we need to consult the plugin? We can just check the magic directly; if it's bitcode, we just compile it for its triple. If this was wrong, then the plugin will error when it gets the compiled image.
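A hedged sketch of such a check in libomptarget (the function name is hypothetical), using LLVM's magic-byte detection:

#include "llvm/BinaryFormat/Magic.h"
#include "llvm/Support/MemoryBuffer.h"

// Detect LLVM bitcode without consulting the plugin by inspecting the image's
// magic bytes.
bool isBitcodeImage(llvm::MemoryBufferRef Image) {
  return llvm::identify_magic(Image.getBuffer()) == llvm::file_magic::bitcode;
}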

Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)

We can give CUDA PTX directly, since the CUDA JIT effectively just calls ptxas (as opposed to ptxas -c, which would require nvlink afterwards).

That makes it easier for us, so the only command line tool we need to call is lld for AMDGPU.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
185

We should be able to configure our own optimization pipeline in that case; we might want the extra control as well.

tianshilei1992 added a comment.EditedDec 5 2022, 7:52 AM

Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by libomptarget proper rather than the plugins. I guess this is needed for per-kernel specialization? My idea of the rough pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)

We could handle them in libomptarget, but that's going to require adding another two interface functions: is_valid_bitcode_image and compile_bitcode_image. It is doable. Handling them in the plugin as a separate module can just reuse the two existing interfaces.

Would we need to consult the plugin? We can just check the magic directly; if it's bitcode, we just compile it for its triple. If this was wrong, then the plugin will error when it gets the compiled image.

I prefer to error out at an earlier stage. In particular, if we have a bitcode image and both NVIDIA and AMD support JIT, then both plugins will report a valid binary and continue compiling the image, initializing the plugin, etc., which could give us wrong results.

Also, I don't know if we can skip ptxas by giving CUDA the PTX directly; we will probably still need to invoke lld on the command line though, right?

for each image:
  if image is bitcode
    image = compile(image)
  register(image)

We can give CUDA PTX directly, since the CUDA JIT effectively just calls ptxas (as opposed to ptxas -c, which would require nvlink afterwards).

That makes it easier for us, so the only command line tool we need to call is lld for AMDGPU.

Yes, for AMDGPU it can be handled in the post-processing, which is not done in this patch.
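A rough sketch of what such a post-processing step could look like (the lld path, flags, and temporary-file handling are all assumptions for illustration, not what a later patch necessarily does):

#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Program.h"

// Link the backend's AMDGPU output into a loadable image by invoking lld as an
// external tool.
bool linkWithLLD(llvm::StringRef LLDPath, llvm::StringRef InputPath,
                 llvm::StringRef OutputPath) {
  llvm::SmallVector<llvm::StringRef> Args = {LLDPath, "-flavor", "gnu",
                                             "-shared", "-o", OutputPath,
                                             InputPath};
  int RC = llvm::sys::ExecuteAndWait(LLDPath, Args);
  return RC == 0;
}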

tianshilei1992 added inline comments.Dec 5 2022, 7:54 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
185

That would mean we basically rewrite the opt and backend functions from LTO.cpp. I thought about just invoking the backend before, especially since using LTO requires us to build the resolution table. However, on second thought, I think it would be better to just use LTO.

jhuber6 added inline comments.Dec 5 2022, 8:06 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
185

Building the passes isn't too complicated; it would take up about the same amount of space as the symbol resolutions, and it has the advantage that we don't need to write the output to a file. I could write an implementation for this to see how well it works.
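For reference, a sketch of what building the default pipeline ourselves could look like (the O3 level is an assumption):

#include "llvm/Analysis/CGSCCPassManager.h"
#include "llvm/Analysis/LoopAnalysisManager.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Target/TargetMachine.h"

// Run the default per-module optimization pipeline on the device module
// without going through the LTO machinery.
void optimizeModule(llvm::Module &M, llvm::TargetMachine &TM) {
  llvm::LoopAnalysisManager LAM;
  llvm::FunctionAnalysisManager FAM;
  llvm::CGSCCAnalysisManager CGAM;
  llvm::ModuleAnalysisManager MAM;

  llvm::PassBuilder PB(&TM);
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  llvm::ModulePassManager MPM =
      PB.buildPerModuleDefaultPipeline(llvm::OptimizationLevel::O3);
  MPM.run(M, MAM);
}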

jplehr added a subscriber: jplehr.Dec 5 2022, 8:13 AM

drop LTO and fix comments

tianshilei1992 marked 5 inline comments as done.Dec 7 2022, 12:35 PM
tianshilei1992 added inline comments.
openmp/libomptarget/plugins-nextgen/common/PluginInterface/CMakeLists.txt
30–54

Have to figure out which components are needed here.

add build components

We should probably make a test for this. Do we currently test the nextgen plugins?

All but a test and this looks good to me.

We probably want to enable a new test configuration to have each test run in JIT mode.

rebase and refine

It currently crashes in setupLLVMOptimizationRemarks

jdoerfert added inline comments.Dec 9 2022, 10:42 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
65

Not unreachable. Use a printf and an abort.

rebase and fix opt error

rebase and fix comments

tianshilei1992 marked an inline comment as done.Dec 11 2022, 7:37 PM
jdoerfert added inline comments.Dec 12 2022, 8:13 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
723

We most likely want an env var, and maybe later we can even pass the value through. An env var is good enough for now.
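A minimal sketch of reading such an env var (the variable name LIBOMPTARGET_JIT_OPT_LEVEL is hypothetical, used only for illustration):

#include <cstdlib>

// Read the JIT optimization level from the environment, defaulting to 3 and
// clamping out-of-range values.
static unsigned getJITOptLevel() {
  unsigned OptLevel = 3;
  if (const char *Env = std::getenv("LIBOMPTARGET_JIT_OPT_LEVEL"))
    OptLevel = static_cast<unsigned>(std::atoi(Env));
  return OptLevel > 3 ? 3 : OptLevel;
}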

tianshilei1992 retitled this revision from [WIP][OpenMP] Introduce basic JIT support to OpenMP target offloading to [OpenMP] Introduce basic JIT support to OpenMP target offloading.Dec 12 2022, 9:14 AM
tianshilei1992 edited the summary of this revision. (Show Details)

add env for opt level

tianshilei1992 marked an inline comment as done and an inline comment as not done.Dec 12 2022, 9:24 AM
tianshilei1992 added inline comments.
openmp/cmake/OpenMPTesting.cmake
6 ↗(On Diff #482158)

Oh, this change is unrelated. Will remove it later.

tianshilei1992 edited the summary of this revision. (Show Details)Dec 12 2022, 1:35 PM

Some nits.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
122

I'm tempted to move this into LLVM somewhere since it's been duplicated so many times.

133

typo

151

Why do we need TT if we expect to read the triple out of the Module?

277–278

Should work.

302

Why a std::list? Since we use pointers we shouldn't need to worry about having a stable pointer and could use a SmallVector or similar right?

320

Should probably prefer C++ casts even if they are ridiculously verbose.

344

MCPU is the more common version AFAICT.

openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp
825

Missing comment?

jdoerfert accepted this revision.Dec 12 2022, 10:19 PM

LG, as there seem to be only nits left. We can expand on this in-tree.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
122

Let's do this as a follow up.

185

Agreed. We can test it in a follow up and decide then.

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.h
42
This revision is now accepted and ready to land.Dec 12 2022, 10:19 PM
tianshilei1992 marked an inline comment as done.Dec 13 2022, 4:54 AM
tianshilei1992 added inline comments.
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
185

I already switched to building the pipeline ourselves and dropped LTO.

tianshilei1992 marked 10 inline comments as done.

rebase and fix comments

openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
277–278

What if OS is not null-terminated?

jhuber6 added inline comments.Dec 27 2022, 9:01 AM
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
277–278

Do we need it to be? The memory buffer should contain the size, unless we need to convert it to a C string somewhere. In that case you could do OS << '\0' but it would probably mess up the size.

rebase and fix comment

tianshilei1992 marked an inline comment as done.Dec 27 2022, 11:19 AM
tianshilei1992 added inline comments.
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp
277–278

str() returns a StringRef, which is good. I thought it returned const char *.

This revision was landed with ongoing or failed builds.Dec 27 2022, 4:07 PM
This revision was automatically updated to reflect the committed changes.
tianshilei1992 marked an inline comment as done.

Seems like this broke the AMDGPU buildbot, please resolve:
https://lab.llvm.org/buildbot/#/builders/193/builds/24122

Seems like this broke the AMDGPU buildbot, please resolve:
https://lab.llvm.org/buildbot/#/builders/193/builds/24122

Reverted. Will fix it soon.

tianshilei1992 reopened this revision.Dec 27 2022, 7:17 PM
This revision is now accepted and ready to land.Dec 27 2022, 7:17 PM

fix compile error

This revision was landed with ongoing or failed builds.Dec 27 2022, 7:19 PM
This revision was automatically updated to reflect the committed changes.
tianshilei1992 marked an inline comment as done.
ye-luo added a subscriber: ye-luo.EditedDec 28 2022, 8:21 AM

Got tons of runtime failures:

target doesn't support jit
UNREACHABLE executed at /gpfs/jlse-fs0/users/yeluo/opt/llvm-clang/llvm-project-nightly/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:543!

on an AMD GPU (gfx906) when running the miniqmc ctest. The failure happens with LIBOMPTARGET_NEXTGEN_PLUGINS=1.

hbae added a subscriber: hbae.Jan 6 2023, 6:54 AM

Looks like GCC 7.5 cannot build LLVM after this change. Could you please take a look?

In file included from /localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp:11:0:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> > llvm::omp::target::plugin::GenericDeviceTy::doJITPostProcessing(std::unique_ptr<llvm::MemoryBuffer>) const’:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:389:12: error: could not convert ‘MB’ from ‘std::unique_ptr<llvm::MemoryBuffer>’ to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’
     return MB;
            ^~

Looks like GCC 7.5 cannot build LLVM after this change. Could you please take a look?

In file included from /localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp:11:0:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> > llvm::omp::target::plugin::GenericDeviceTy::doJITPostProcessing(std::unique_ptr<llvm::MemoryBuffer>) const’:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:389:12: error: could not convert ‘MB’ from ‘std::unique_ptr<llvm::MemoryBuffer>’ to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’
     return MB;
            ^~

Some older GCCs have a problem with the implicit move on copy elision, AFAIK. I'll add a std::move; let me know if that fixes it.
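For reference, a sketch of the workaround based on the error above (the post-processing body is elided):

#include <memory>
#include <utility>
#include "llvm/Support/Error.h"
#include "llvm/Support/MemoryBuffer.h"

// Make the move explicit when returning a std::unique_ptr as an Expected,
// instead of relying on the implicit conversion during copy elision, which
// older GCC rejects.
llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
doJITPostProcessing(std::unique_ptr<llvm::MemoryBuffer> MB) {
  // ... post-processing elided ...
  return std::move(MB); // 'return MB;' fails to compile with GCC 7.5
}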

hbae added a comment.Jan 6 2023, 7:07 AM

Looks like GCC 7.5 cannot build LLVM after this change. Could you please take a look?

In file included from /localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp:11:0:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> > llvm::omp::target::plugin::GenericDeviceTy::doJITPostProcessing(std::unique_ptr<llvm::MemoryBuffer>) const’:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:389:12: error: could not convert ‘MB’ from ‘std::unique_ptr<llvm::MemoryBuffer>’ to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’
     return MB;
            ^~

Some older GCCs have a problem with the implicit move on copy elision, AFAIK. I'll add a std::move; let me know if that fixes it.

Yes, that fixes it. We have two more places in JIT.cpp (lines 126 and 185).

Looks like GCC 7.5 cannot build LLVM after this change. Could you please take a look?

In file included from /localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp:11:0:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> > llvm::omp::target::plugin::GenericDeviceTy::doJITPostProcessing(std::unique_ptr<llvm::MemoryBuffer>) const’:
/localdisk/hbae/LLVM/llvm-base/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:389:12: error: could not convert ‘MB’ from ‘std::unique_ptr<llvm::MemoryBuffer>’ to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’
     return MB;
            ^~

Some older GCCs have a problem with the implicit move on copy elision, AFAIK. I'll add a std::move; let me know if that fixes it.

Yes, that fixes it. We have two more places in JIT.cpp (lines 126 and 185).

Thanks for pointing them out, I'll fix them right away.