This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][WIP] Pass auxiliary YAML along with device image for OpenMP offload.
AbandonedPublic

Authored by vzakhari on Jan 26 2021, 2:32 PM.

Details

Reviewers
grokos
jdoerfert
Summary

This is an RFC for bundling a YAML file along with the device image in clang-offload-wrapper, so that auxiliary information is passed to the offload plugins.

Currently, many plugins expect that the device image is an ELF image, so they look for the ELF magic word and go from there. This change-set puts extra data before the actual image like this (see code in ClangOffloadWrapper.cpp):

// Insert 1OMP header magic word.
// The expected image structure is like this:
//   struct {
//     char MagicWord[ONEOMPMAGLEN]; // "1OMP"
//     uint32_t HeaderSize;          // In little-endian.
//     char Header[HeaderSize];      // Uncompressed YAML with auxiliary
//                                   // information.
//     char ActualImage[];           // ELF or another image.
//   }

For backward compatibility, plugins have to continue working with plain ELF images. If the 1OMP magic word is detected, then the corresponding Header has to be read as a YAML file with some mandatory keys.

As long as unification between multiple offload targets is not practical, vendors/plugins may use optional keys (such as CompileOptions for cuModuleLoadDataEx()) to convey auxiliary information from the source code or compiler options to the device-specific plugins.

The main concerns about this change-set are the dependency on LLVMSupport (which requires a pre-built LLVM project) and the final size of the plugins (due to linking LLVMSupport).

As an example, the plugins will see this auxiliary data like this:

TARGET x86_64 RTL --> 1OMP header: HeaderSize(133), HeaderBegin(0x00000000004009f8), ImageBegin(0x0000000000400a7d), ImageEnd(0x0000000000402ba5)
TARGET x86_64 RTL --> Read OpenMPAttr:
TARGET x86_64 RTL --> <CompileOptions: [
TARGET x86_64 RTL --> ,
TARGET x86_64 RTL --> ]>
TARGET x86_64 RTL --> Read OpenMPAttr:
TARGET x86_64 RTL --> <LinkOptions: [
TARGET x86_64 RTL --> ,
TARGET x86_64 RTL --> ]>
TARGET x86_64 RTL --> Read OpenMPAttr:
TARGET x86_64 RTL --> <Producer: [
TARGET x86_64 RTL --> unknown,
TARGET x86_64 RTL --> ]>
TARGET x86_64 RTL --> Read OpenMPAttr:
TARGET x86_64 RTL --> <TargetID: [
TARGET x86_64 RTL --> ,
TARGET x86_64 RTL --> ]>
TARGET x86_64 RTL --> Read OpenMPAttr:
TARGET x86_64 RTL --> <VendorId: [
TARGET x86_64 RTL --> 9,
TARGET x86_64 RTL --> ]>
TARGET x86_64 RTL --> Read OpenMPAttr:
TARGET x86_64 RTL --> <VendorName: [
TARGET x86_64 RTL --> llvm,
TARGET x86_64 RTL --> ]>

Diff Detail

Event Timeline

vzakhari created this revision. Jan 26 2021, 2:32 PM
vzakhari requested review of this revision. Jan 26 2021, 2:32 PM
vzakhari updated this revision to Diff 319412. Jan 26 2021, 2:34 PM
vzakhari removed a reviewer: jdoerfert.
vzakhari removed subscribers: sstefan1, jvesely, nhaehnle and 5 others.
vzakhari edited reviewers, added: grokos; removed: jdoerfert. Jan 26 2021, 2:34 PM
jdoerfert requested changes to this revision. Jan 26 2021, 6:46 PM

My herald rules are very annoying sometimes, apologies. Can we discuss this tomorrow? I put it on the agenda https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit?usp=sharing

This revision now requires changes to proceed. Jan 26 2021, 6:46 PM
jdoerfert retitled this revision from [WIP] Pass auxiliary YAML along with device image for OpenMP offload. to [OpenMP][WIP] Pass auxiliary YAML along with device image for OpenMP offload.. Jan 26 2021, 6:46 PM

The patch was uploaded to serve as a base for our discussions :)

There are a bunch of side channels for communicating between the compiler and the runtime already: compiler-emitted magic symbols, a msgpack blob for amdgpu, writing extra stuff into the ELF in general, etc.

What information do you intend to pass along through this new channel, why yaml, and why for all plugins?

Yes, there may be many ways to communicate information from the compiler to the runtime. The proposal is one way; we and others wanted to communicate information to the runtime and come up with a standard scheme.
We added this to other plugins to show how it can be done. Plugins can opt in; those that do not want it simply don't add the unique signature and the header.

Right, I am wondering if we can replace all the existing side channels with a unified one. But there is always an option for a plugin/vendor to use the existing methods. Code in clang-offload-wrapper is not supposed to be enabled by default, so there will be some mechanism to switch between the current and the new wrapping. A vendor-specific toolchain will be able to invoke clang-offload-wrapper as it wants.

So to answer the "what information" question: all the existing information (if possible; if not possible, I would like to understand why) and more down the road, e.g. compilation/linking options for OpenCL/CUDA backends, TargetIDs for alternative images targeting different versions of a target device, ProducerID for identifying backward-incompatible changes in plugins, etc.

I chose YAML just because it is a human-readable and easily editable format with existing support in the LLVMSupport library.

I looked through the AMDGCN offload toolchain, and I think we should be able to use ELF for embedding the auxiliary information the same way msgpack is used for AMDGCN offload. We can also use some special PT_NOTE section for common things like vendorid/vendorname/producer. I believe the targetid may be represented with e_machine and probably additional auxiliary information in vendor-specific PT_NOTE section(s) (if needed at all).

I suggest adding a PT_NOTE section with a new type (e.g. NT_OPENMP_METADATA) with the name matching one of the vendor-name values from Table 1.2 of the OpenMP Additional Definitions document. The desc of this section contains a contiguous "list" of null-terminated strings. The first string specifies the OpenMP metadata version (the structure of the desc), i.e. 1.0 for the initial version. Version 1.0 has only one additional string in the "list", which is the producer information (interpreted in a vendor-specific way, e.g. for llvm we could embed the LLVM version and a "version" of the OpenMP toolchain, which is clang-offload-bundler, clang-offload-wrapper, libomptarget, etc.).
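To illustrate the proposed desc layout, here is a minimal sketch of how a consumer might split the contiguous "list" into its strings. The function name `splitNoteDesc` is hypothetical, and the sketch assumes the caller passes exactly the descsz bytes of the desc (without trailing padding).

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Splits a note desc made of consecutive null-terminated strings.
// Under the proposed version 1.0 layout, element 0 would be the metadata
// version and element 1 the producer string.
std::vector<std::string_view> splitNoteDesc(std::string_view Desc) {
  std::vector<std::string_view> Strings;
  while (!Desc.empty()) {
    const std::size_t End = Desc.find('\0');
    if (End == std::string_view::npos)
      break; // Unterminated tail: ignore it rather than read past the desc.
    Strings.push_back(Desc.substr(0, End));
    Desc.remove_prefix(End + 1);
  }
  return Strings;
}
```

A reader would typically check the first string before interpreting the rest, since the version determines how many strings follow.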

Maybe using just LLVM version is enough: I suppose the minor version of LLVM will change if any changes to the OpenMP toolchain are made, so we can use LLVM version, if needed, to check for compatibility issues. Alternatively, we can use an OMP toolchain version and make it independent of LLVM releases.

namesz    #
descsz    #
type      NT_OPENMP_METADATA
name      llvm\0<pad>
desc      1.0\0LLVM12.0.0/OMP1.0.0\0<pad>
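As a rough sketch of the record above, the following builds the bytes of one such note using the standard ELF note layout (three 32-bit words, then name and desc, each padded to a 4-byte boundary). The numeric value of `NT_OPENMP_METADATA` is a placeholder, since no value was assigned in this discussion; `buildOpenMPNote` and its helpers are hypothetical names, and little-endian encoding is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Placeholder value: the discussion does not assign a number to this type.
constexpr std::uint32_t NT_OPENMP_METADATA = 1;

// Appends a 32-bit value in little-endian byte order (an assumption here;
// a real writer would follow the byte order of the target ELF file).
static void appendLE32(std::vector<std::uint8_t> &Out, std::uint32_t V) {
  for (int I = 0; I < 4; ++I)
    Out.push_back(static_cast<std::uint8_t>(V >> (8 * I)));
}

// Appends the bytes, then zero padding up to a multiple of 4 bytes.
static void appendPadded(std::vector<std::uint8_t> &Out,
                         const std::string &Bytes) {
  Out.insert(Out.end(), Bytes.begin(), Bytes.end());
  const std::size_t Pad = (4 - Bytes.size() % 4) % 4;
  Out.insert(Out.end(), Pad, 0);
}

std::vector<std::uint8_t> buildOpenMPNote(const std::string &Vendor,
                                          const std::string &Version,
                                          const std::string &Producer) {
  // namesz and descsz count the terminating nulls but not the padding.
  std::string Name = Vendor;
  Name.push_back('\0');
  std::string Desc = Version;
  Desc.push_back('\0');
  Desc += Producer;
  Desc.push_back('\0');

  std::vector<std::uint8_t> Note;
  appendLE32(Note, static_cast<std::uint32_t>(Name.size()));
  appendLE32(Note, static_cast<std::uint32_t>(Desc.size()));
  appendLE32(Note, NT_OPENMP_METADATA);
  appendPadded(Note, Name);
  appendPadded(Note, Desc);
  return Note;
}
```

For the example values in the table, `buildOpenMPNote("llvm", "1.0", "LLVM12.0.0/OMP1.0.0")` would give namesz 5 and descsz 24, with three padding bytes after the name.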

For the rest of the auxiliary information (whatever it is), vendor-specific toolchains and implementations may use additional PT_NOTE sections (just as AMDGCN does).

I have not found anything like PT_NOTE in COFF, and supporting two formats in the toolchain is not very convenient, so I think it makes sense to use ELF for Windows as well. This seems to require getting rid of the libelf dependency in all plugins. We can probably use the LLVMObject implementation.

Hi @gregrodgers, I am trying to comprehend the example that you showed at the meeting, where you had two offload images compiled for different archs. I wonder how this should work when we execute on a system where the actual GPU can support/execute both images. Currently, an offload image is registered with an RTL whenever is_valid_binary returns true. So it looks like both such images will be registered with your RTL, but they are basically for the same OpenMP program, so it is not clear how they should interact. Obviously, they cannot be linked in any way at runtime, since they are for the same OpenMP program. This is off-topic, though.

What do you think about using PT_NOTE/SHT_NOTE for keeping the additional information with the offload image?

grokos added a comment. Edited Mar 25 2021, 12:02 PM

New implementation is here: https://reviews.llvm.org/D99360

Are we abandoning this patch then? Can you close it?

vzakhari abandoned this revision. Mar 25 2021, 12:10 PM