diff --git a/openmp/docs/SupportAndFAQ.rst b/openmp/docs/SupportAndFAQ.rst
--- a/openmp/docs/SupportAndFAQ.rst
+++ b/openmp/docs/SupportAndFAQ.rst
@@ -53,14 +53,15 @@
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 To build an *effective* OpenMP offload capable compiler, only one extra CMake
 option, `LLVM_ENABLE_RUNTIMES="openmp"`, is needed when building LLVM (Generic
-information about building LLVM is available `here `__.).
-Make sure all backends that are targeted by OpenMP to be enabled. By default,
-Clang will be built with all backends enabled.
-When building with `LLVM_ENABLE_RUNTIMES="openmp"` OpenMP should not be enabled
-in `LLVM_ENABLE_PROJECTS` because it is enabled by default.
+information about building LLVM is available `here
+`__.). Make sure all backends that
+are targeted by OpenMP to be enabled. By default, Clang will be built with all
+backends enabled. When building with `LLVM_ENABLE_RUNTIMES="openmp"` OpenMP
+should not be enabled in `LLVM_ENABLE_PROJECTS` because it is enabled by
+default.
 
-For Nvidia offload, please see :ref:`_build_nvidia_offload_capable_compiler`.
-For AMDGPU offload, please see :ref:`_build_amdgpu_offload_capable_compiler`.
+For Nvidia offload, please see :ref:`build_nvidia_offload_capable_compiler`.
+For AMDGPU offload, please see :ref:`build_amdgpu_offload_capable_compiler`.
 
 .. note::
   The compiler that generates the offload code should be the same (version) as
@@ -86,41 +87,51 @@
 
 Q: How to build an OpenMP AMDGPU offload capable compiler?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-A subset of the `ROCm ` toolchain is
+A subset of the `ROCm `_ toolchain is
 required to build the LLVM toolchain and to execute the openmp application.
 Either install ROCm somewhere that cmake's find_package can locate it, or
 build the required subcomponents ROCt and ROCr from source.
 
-The two components used are ROCT-Thunk-Interface, roct, and ROCR-Runtime,
-rocr. Roct is the userspace part of the linux driver. It calls into the
-driver which ships with the linux kernel. It is an implementation detail of
-Rocr from OpenMP's perspective. Rocr is an implementation of `HSA `.
+The two components used are ROCT-Thunk-Interface, roct, and ROCR-Runtime, rocr.
+Roct is the userspace part of the linux driver. It calls into the driver which
+ships with the linux kernel. It is an implementation detail of Rocr from
+OpenMP's perspective. Rocr is an implementation of `HSA
+`_.
 
-  SOURCE_DIR=same-as-llvm-source # e.g. the checkout of llvm-project, next to openmp
-  BUILD_DIR=somewhere
-  INSTALL_PREFIX=same-as-llvm-install
-
-  cd $SOURCE_DIR
-  git clone git@github.com:RadeonOpenCompute/ROCT-Thunk-Interface.git -b roc-4.1.x --single-branch
-  git clone git@github.com:RadeonOpenCompute/ROCR-Runtime.git -b rocm-4.1.x --single-branch
-
-  cd $BUILD_DIR && mkdir roct && cd roct
-  cmake $SOURCE_DIR/ROCT-Thunk-Interface/ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
-  make && make install
-
-  cd $BUILD_DIR && mkdir rocr && cd rocr
-  cmake $SOURCE_DIR/ROCR-Runtime/src -DIMAGE_SUPPORT=OFF -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON
-  make && make install
+.. code-block:: text
 
-IMAGE_SUPPORT requires building rocr with clang and is not used by openmp.
+  SOURCE_DIR=same-as-llvm-source # e.g. the checkout of llvm-project, next to openmp
+  BUILD_DIR=somewhere
+  INSTALL_PREFIX=same-as-llvm-install
+
+  cd $SOURCE_DIR
+  git clone git@github.com:RadeonOpenCompute/ROCT-Thunk-Interface.git -b roc-4.1.x \
+    --single-branch
+  git clone git@github.com:RadeonOpenCompute/ROCR-Runtime.git -b rocm-4.1.x \
+    --single-branch
+
+  cd $BUILD_DIR && mkdir roct && cd roct
+  cmake $SOURCE_DIR/ROCT-Thunk-Interface/ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX \
+    -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
+  make && make install
+
+  cd $BUILD_DIR && mkdir rocr && cd rocr
+  cmake $SOURCE_DIR/ROCR-Runtime/src -DIMAGE_SUPPORT=OFF \
+    -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release \
+    -DBUILD_SHARED_LIBS=ON
+  make && make install
+
+``IMAGE_SUPPORT`` requires building rocr with clang and is not used by openmp.
 
 Provided cmake's find_package can find the ROCR-Runtime package, LLVM will
-build a tool `bin/amdgpu-arch` which will print a string like 'gfx906' when
+build a tool ``bin/amdgpu-arch`` which will print a string like ``gfx906`` when
 run if it recognises a GPU on the local system. LLVM will also build a shared
 library, libomptarget.rtl.amdgpu.so, which is linked against rocr.
 
 With those libraries installed, then LLVM build and installed, try:
 
+.. code-block:: shell
+
   clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa example.c -o example && ./example
 
 Q: What are the known limitations of OpenMP AMDGPU offload?
@@ -153,8 +164,8 @@
 Q: Does Clang support `<math.h>` and `<complex.h>` operations in OpenMP target on GPUs?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Yes, LLVM/Clang allows math functions and complex arithmetic inside of OpenMP target regions
-that are compiled for GPUs.
+Yes, LLVM/Clang allows math functions and complex arithmetic inside of OpenMP
+target regions that are compiled for GPUs.
 
 Clang provides a set of wrapper headers that are found first when `math.h` and
 `complex.h`, for C, `cmath` and `complex`, for C++, or similar headers are
@@ -202,8 +213,8 @@
 Currently, the only solution is to change how the application is built and avoid
 the use of static libraries.
 
-Q: Can I use dynamically linked libraries with OpenMP offloading
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Q: Can I use dynamically linked libraries with OpenMP offloading?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Dynamically linked libraries can be only used if there is no device code split
 between the library and application. Anything declared on the device inside the
@@ -220,3 +231,36 @@
 For example, if your system-wide GCC installation is too old to build LLVM and
 you would like to use a newer GCC, set the CMake variable `GCC_INSTALL_PREFIX` to
 inform clang of the GCC installation you would like to use in the second stage.
+
+Q: How can I include OpenMP offloading support in my CMake project?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Currently, there is an experimental CMake find module for OpenMP target
+offloading provided by LLVM. It will attempt to find OpenMP target offloading
+support for your compiler. The flags necessary for OpenMP target offloading will
+be loaded into the ``OpenMPTarget::OpenMPTarget_<device>`` target or the
+``OpenMPTarget_<device>_FLAGS`` variable if successful. Currently supported
+devices are ``AMDGCN`` and ``NVPTX``.
+
+To use this module, simply add the path to CMake's current module path and call
+``find_package``. The module will be installed with your OpenMP installation by
+default. Including OpenMP offloading support in an application should now only
+require a few additions.
+
+.. code-block:: cmake
+
+  cmake_minimum_required(VERSION 3.13.4)
+  project(offloadTest VERSION 1.0 LANGUAGES CXX)
+
+  list(APPEND CMAKE_MODULE_PATH "${PATH_TO_OPENMP_INSTALL}/lib/cmake/openmp")
+
+  find_package(OpenMPTarget REQUIRED NVPTX)
+
+  add_executable(offload)
+  target_link_libraries(offload PRIVATE OpenMPTarget::OpenMPTarget_NVPTX)
+  target_sources(offload PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/src/Main.cpp)
+
+Using this module requires at least CMake version 3.13.4. Supported languages
+are C and C++ with Fortran support planned in the future. Compiler support is
+best for Clang but this module should work for other compiler vendors such as
+IBM, GNU.
diff --git a/openmp/tools/Modules/FindOpenMPTarget.cmake b/openmp/tools/Modules/FindOpenMPTarget.cmake
--- a/openmp/tools/Modules/FindOpenMPTarget.cmake
+++ b/openmp/tools/Modules/FindOpenMPTarget.cmake
@@ -140,10 +140,6 @@
 
 # Get flags for setting the device's architecture for each compiler.
 function(_OPENMP_TARGET_DEVICE_ARCH_CANDIDATES LANG DEVICE DEVICE_FLAG)
-  # AMD requires the architecture, default to gfx908 if not provided.
-  if((NOT OpenMPTarget_${DEVICE}_ARCH) AND ("${DEVICE}" STREQUAL "AMDGCN"))
-    set(OpenMPTarget_${DEVICE}_ARCH "gfx908")
-  endif()
   if(OpenMPTarget_${DEVICE}_ARCH)
     # Only Clang supports selecting the architecture for now.
     set(OMPTarget_ARCH_Clang "-Xopenmp-target=${DEVICE_FLAG} -march=${OpenMPTarget_${DEVICE}_ARCH}")
diff --git a/openmp/tools/Modules/README.rst b/openmp/tools/Modules/README.rst
--- a/openmp/tools/Modules/README.rst
+++ b/openmp/tools/Modules/README.rst
@@ -15,7 +15,7 @@
 device. The module will attempt to compile a test program using known compiler
 flags for each requested architecture. If successful, the flags required for
 offloading will be loaded into the ``OpenMPTarget::OpenMPTarget_<device>``
-target or the ``OpenMPTarget_NVPTX_FLAGS`` variable. Currently supported target
+target or the ``OpenMPTarget_<device>_FLAGS`` variable. Currently supported target
 devices are ``NVPTX`` and ``AMDGCN``. This module is still under development so
 some features may be missing.
 
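
Usage sketch for the AMDGCN case: because the ``FindOpenMPTarget.cmake`` hunk
drops the implicit ``gfx908`` default, a project that wants ``-march`` passed
for an AMD GPU would presumably have to set ``OpenMPTarget_AMDGCN_ARCH`` itself
before calling ``find_package``. The following mirrors the NVPTX example from
the FAQ hunk and is only a sketch under assumptions: ``gfx906`` is a placeholder
value (substitute whatever ``bin/amdgpu-arch`` prints), ``PATH_TO_OPENMP_INSTALL``
is taken from the FAQ example, and the target name
``OpenMPTarget::OpenMPTarget_AMDGCN`` is inferred from the module's naming
pattern rather than shown in the patch.

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.13.4)
  project(offloadTest VERSION 1.0 LANGUAGES CXX)

  list(APPEND CMAKE_MODULE_PATH "${PATH_TO_OPENMP_INSTALL}/lib/cmake/openmp")

  # Assumption: gfx906 is only a placeholder architecture. Respect a value
  # given on the command line (-DOpenMPTarget_AMDGCN_ARCH=...) if present.
  if(NOT OpenMPTarget_AMDGCN_ARCH)
    set(OpenMPTarget_AMDGCN_ARCH "gfx906")
  endif()

  find_package(OpenMPTarget REQUIRED AMDGCN)

  add_executable(offload)
  # Target name inferred from the OpenMPTarget::OpenMPTarget_NVPTX pattern.
  target_link_libraries(offload PRIVATE OpenMPTarget::OpenMPTarget_AMDGCN)
  target_sources(offload PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/src/Main.cpp)

Whether the module still finds a working flag set for AMDGCN when no
architecture is given is not stated in the patch, so setting the variable
explicitly is the conservative choice.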