[libomptarget] Implement target_impl for amdgcn
Smallest atomic addition for a new target. Implements enough of the amdgcn
specific code that some of the source files under nvptx/src could be compiled,
without modification, to run on amdgcn.
This foreshadows a work in progress patch to move said source out of nvptx/src.
Patch based on fork at https://github.com/ROCm-Developer-Tools/llvm-project
__forceinline__?