We will start generating this in a future patch.
Change-Id: Id5fae1e8e6887ab7bbecbf445d1834a3aed4a5c9
Differential D21482
Support/ELF: Add R_AMDGPU_GOTPCREL relocation
Authored by tstellarAMD on Jun 17 2016, 2:31 PM.
Event Timeline

git-clang-format the patch.
Out of curiosity, why do you need a GOT? Are GPUs fancy enough to have
Cheers,

We only support generating shared libraries of GPU code. The shared libraries are loaded into memory, and then the CPU host code essentially calls functions in the GPU libraries (not directly, but through a GPU runtime library such as OpenCL, which invokes the functions). We don't support preemption, but we need to use a GOT for referencing external variables.
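As a rough illustration of what a GOT-relative, PC-relative relocation asks the linker to compute, here is a minimal C sketch. It assumes the conventional ELF formula G + GOT + A - P (GOT base plus slot offset plus addend, minus the address of the field being patched) and a signed 32-bit result; whether R_AMDGPU_GOTPCREL is defined exactly this way is an assumption, and `resolve_gotpcrel32` and its parameters are hypothetical names, not part of LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not an LLVM API.
 * Computes G + GOT + A - P:
 *   got_base        (GOT) address of the start of the GOT
 *   got_slot_offset (G)   offset of the symbol's slot within the GOT
 *   addend          (A)   constant stored with the relocation
 *   place           (P)   address of the 32-bit field being patched
 * The result is the signed distance from the patched field to the slot. */
static int32_t resolve_gotpcrel32(uint64_t got_base, uint64_t got_slot_offset,
                                  int64_t addend, uint64_t place) {
    /* Unsigned subtraction wraps, so a GOT below the code still works. */
    int64_t disp = (int64_t)(got_base + got_slot_offset - place) + addend;
    /* Assumption: the field is 32 bits wide, so the distance must fit. */
    assert(disp >= INT32_MIN && disp <= INT32_MAX);
    return (int32_t)disp;
}
```

With a GOT at 0x2000, a slot at offset 0x10, an addend of 4, and the patched field at 0x1000, the helper returns 0x1014. The code then loads the variable's real address from that slot at run time, which is what lets the address be filled in after linking.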
Sorry, I am still missing something. By "external", do you mean "not in the .o" or "not in the .so"? You don't
Cheers,

"External" means not in the .so. There can be multiple .so files. A common case is a multi-GPU system where you have one .so for data that is stored in device memory. This device .so is loaded into the device memory of each GPU (so one copy of the .so per device). Then you have a second .so for data that is stored in host memory. There is only a single copy of this .so, stored in host memory, which all devices can access. The .so stored in device memory will access variables defined in the .so stored in host memory.
One question I have: if we don't support preemption, is there some way for the whole toolchain to know this, so that the linker can resolve more symbols at link time?

Hi Rafael,
In addition to what Tom said, external variables can be allocated and defined by the host, at runtime, before dispatching GPU kernels. Here is a quick "memory copy" example:

GPU Kernel (OpenCL):

    extern global int *ExternVar;

    kernel void foo(global int *A, const int size) {
      int gid = get_global_id(0);
      if (gid < size)
        A[gid] = ExternVar[gid];
    }

Host Code (using hsa runtime, pseudo code and hand waving):

    int *hostA = NULL
    hsa_memory_allocate(global_region, 4 * sizeof(int), &hostA)
    int *hostExternVar = NULL
    hsa_memory_allocate(global_region, 4 * sizeof(int), &hostExternVar)
    for (i = 0; i < 4; i++)
      hostExternVar[i] = i
    exec = hsa_executable_create
    hsa_executable_global_variable_define(exec, "ExternVar", hostExternVar)
    hsa_executable_load_code_object(exec, device, code object for GPU kernel above)
    setup kernel arguments and dispatch kernel "foo"

LGTM: add missing `` in docs and add a test.
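The host-defines-then-kernel-reads flow in the example above can be modeled on the CPU with a few lines of C. This is only a sketch of the indirection, not the HSA runtime: `mock_executable`, `mock_define_global`, `mock_got_lookup`, and `mock_foo` are hypothetical names invented for illustration, and a real loader would resolve the name to a fixed GOT slot once at load time rather than searching by name on every access.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8

/* Hypothetical stand-in for a code object's GOT: one address per
 * externally defined variable, filled in by the host before dispatch. */
struct got_entry { const char *name; void *addr; };
struct mock_executable { struct got_entry got[MAX_SLOTS]; size_t count; };

/* Models the host defining an external variable (the role played by
 * hsa_executable_global_variable_define in the pseudo code above).
 * Returns 0 on success, -1 if the table is full. */
static int mock_define_global(struct mock_executable *exec,
                              const char *name, void *addr) {
    if (exec->count == MAX_SLOTS)
        return -1;
    exec->got[exec->count].name = name;
    exec->got[exec->count].addr = addr;
    exec->count++;
    return 0;
}

/* Returns the address recorded for name, or NULL if it was never defined. */
static void *mock_got_lookup(const struct mock_executable *exec,
                             const char *name) {
    for (size_t i = 0; i < exec->count; i++)
        if (strcmp(exec->got[i].name, name) == 0)
            return exec->got[i].addr;
    return NULL;
}

/* CPU model of kernel "foo" for a single work-item: the variable's
 * address comes from the GOT slot, not from a link-time constant. */
static void mock_foo(const struct mock_executable *exec,
                     int *A, int size, int gid) {
    int *extern_var = mock_got_lookup(exec, "ExternVar");
    assert(extern_var != NULL); /* must be defined before dispatch */
    if (gid < size)
        A[gid] = extern_var[gid];
}
```

Calling `mock_define_global(&exec, "ExternVar", host_buffer)` and then `mock_foo` once per work-item copies `host_buffer` into `A`, mirroring the memory-copy example. The kernel code never embeds the variable's address, so the host is free to choose it at run time.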