This is an archive of the discontinued LLVM Phabricator instance.

[Openmp] Libomptarget plugin for NEC SX-Aurora
ClosedPublic

Authored by manorom on Mar 26 2020, 6:26 AM.

Details

Summary

This patch adds a a libomptarget plugin for the NEC SX-Aurora TSUBASA Vector Engine, which will also support OpenMP offloading in the future.
The code is mostly based on the plugins for generic-elf and uses NEC's VEO and VEOSINFO libraries to run the offloaded code on the target device.

The plugin was developed by the HPC group at RWTH Aachen for offloading OpenMP using NEC's ncc compiler for offloading compilation, but can also be used with Clang as target compiler.

Diff Detail

Event Timeline

manorom created this revision.Mar 26 2020, 6:26 AM

Thanks for working on this!
Comments are mostly nitpicks. I am not sure we can assume that the VEOS device ids are consecutive.

openmp/libomptarget/plugins/ve/src/rtl.cpp
11

typo: Aurora TSUBASE

123

Can we assume that?

169

else after return

215

else after return

223

typo: craete

299

else after return

337

else after return

363

Looks pretty good to me. Thin shim over veo_*.

openmp/libomptarget/plugins/ve/CMakeLists.txt
44

Worth considering adding "-Wl,-z,defs" here to detect unresolved symbols up front.

openmp/libomptarget/plugins/ve/src/rtl.cpp
48

Nice. I hadn't noticed this file.

184

I think a few plugins do this, just found it in hsa. It's stylistically unusual though. omptargetplugin.h declares the API under extern "C", and there's an export file that internalises the other symbols, so I'm pretty sure this #ifdef block and the one at the end can be removed.

grokos added inline comments.Mar 26 2020, 1:17 PM
openmp/libomptarget/plugins/ve/CMakeLists.txt
11

additonal --> additional

Thanks for pushing this upstream, Manoel! Some comments below, inline.

openmp/libomptarget/plugins/ve/src/rtl.cpp
51

Please get rid of the VEO_MAX_VERSION. The VEO calls that you use are stable and will not be changed, so you won't need this. Also you'd want to build libomptarget with AVEO because it is faster and less limited. I plan to change AVEOs API version number because there are new calls inside, but have no agreement with the VEOS devs whether it shall become "8" or get an offset to VEO API versions.

123

No. Devices can be missing or offline. For sizing the objects below this is ok, but not for selecting the ve nodeids.

190

I think you need the online devices here, so you'll need to keep the node_info structure around and check the status array, while going through the nodeids.

220

See above: would be good to delete this else block because VEO_MAX_VERSION is not actually needed.

446

Please at least accumulate the return codes as the call actually can fail.

manorom added inline comments.Apr 7 2020, 9:39 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
123

Do we have to assume that VE node ids can be assigned completely at random?

We hoped that that would not be the case. Our rationale for simply re-using the device ids we get passed from libomptarget was, that the user would be able to set OMP_DEFAULT_DEVICE or the device() clause to a VE node id.

Now it seems to me that libomptarget expects openmp device ids to be consecutive, so if VE node ids are random or if we only report those which are present and online, we need to have a mapping between openmp device ids and VE node ids which should be somewhat predictable for a user.

simoll added inline comments.Apr 8 2020, 10:09 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
123

Do we have to assume that VE node ids can be assigned completely at random?

Yes

We hoped that that would not be the case. Our rationale for simply re-using the device ids we get passed from libomptarget was, that the user would be able to set OMP_DEFAULT_DEVICE or the device() clause to a VE node id.

OpenMP device ids should not be physical device ids. It really messes with portability.

Now it seems to me that libomptarget expects openmp device ids to be consecutive, so if VE node ids are random or if we only report those which are present and online, we need to have a mapping between openmp device ids and VE node ids which should be somewhat predictable for a user.

Here is a mapping that gives you predicable behavior:

  1. fill up an array compactly with the active VE node ids. Report the number of active VE nodes as 'NumDevices'
  2. sort that array
  3. use the array to index into the ProcHandles/Contests/FuncOrGlbEntry/LibraryHandles arrays.

The omp device ids are now deterministic (with regards to the available VE nodes) and you gain some portability, eg devices '0' to 'n-1' will be valid as long as there are 'n' available nodes.

manorom updated this revision to Diff 256090.Apr 8 2020, 12:50 PM
  • Add file elf_common.c to the diff, missing from the previous revision of the diff.
  • Add "-Wl,-z,defs" as linker flags in CmakeLists.txt
  • Fix Typos in CMakeLists.txt and rtl.cpp
  • Remove Else after Return (rtl,cpp lines 168 NOT 298 and 336)
  • Remove checks of the VEO api version
  • Use correct return type in rtl.cpp:362
manorom marked 6 inline comments as done.Apr 8 2020, 12:53 PM
grokos added inline comments.Apr 8 2020, 1:48 PM
openmp/libomptarget/plugins/ve/src/rtl.cpp
184

I second that. All __tgt_rtl_* functions are declared as extern "C" in omptargetplugin.h, so their definitions do not need to do the same.

232–236

libomptarget first checks whether we have a valid binary for this plugin (__tgt_rtl_is_valid_binary) and if yes then it will try to load it (__tgt_rtl_load_binary). So this EFL check is guaranteed to have been executed as part of the __tgt_rtl_is_valid_binary call, it can be removed from here.

manorom marked 2 inline comments as done.Apr 9 2020, 5:35 AM
manorom marked 3 inline comments as done.Apr 9 2020, 5:38 AM
manorom added inline comments.
openmp/libomptarget/plugins/ve/src/rtl.cpp
299

I don't think so: If the process handle for a dynamic process is created successfully, we do not return but also do not want to visit the else branch.

337

Same as above: if the library handle is created successfully, there is no return here.

simoll added inline comments.Apr 9 2020, 6:34 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
299

This is simply a matter of coding style. You can rewrite the else to

if (!is_dyn) {
}

and it is clear under which condition it will be executed.

337

same as above

Hahnfeld added inline comments.
openmp/libomptarget/plugins/ve/src/rtl.cpp
299

This would repeat the condition on is_dyn which is less desirable IMHO. I agree with @manorom here: The else is not on the branch with the return, which is handling a possible error (ie !proc_handle). The code is properly indented, so at least to me it's obvious what's happening here:

if (...) {
  // veo_proc_create
  if (error) {
    // log
    return NULL;
  }
  // success
} else {
  // veo_proc_create_static
  if (error) {
    // log
    return NULL;
  }
}
simoll added inline comments.Apr 9 2020, 9:25 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
299

It's not a major concern in any case.
However, you have this error handler pattern repeated a lot in your code (7 times in this function):

if (error condition) {
  DP(<pretty print some error message>)
  return NULL
}

Here is an idea. You could wrap that in a macro to the reduce the amount of repetition and indentation in the code.

if (is_dyn) {
  proc_handle = veo_..
  RETURN_NULL_IF(!proc_handle, "veo_proc_create() failed for device %d\n", ID)
} else { 
  proc_handle = veo_..
  RETURN_NULL_IF(!proc_handle, "...");
}
Hahnfeld added inline comments.Apr 10 2020, 1:21 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
299

My 2 cents: I don't think this improves readability. And to be honest, I'm not aware of any part in LLVM that does error handling like this.

simoll added inline comments.Apr 14 2020, 12:12 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
299

Look, this is really just a minor nitpick and not a blocker for me. The device ID issue is what need's resolving before this can go in.

manorom updated this revision to Diff 258945.Apr 21 2020, 3:17 AM
manorom marked an inline comment as not done.
  • The plugin will only report VE nodes which are present and online to libomptarget (and will map OpenMP device id's to VE node ids)
  • Remove redundant libelf checks
  • Remove redundant extern "C" declaration

This is ok with me. Does anybody else have comments?

manorom updated this revision to Diff 259015.Apr 21 2020, 8:42 AM

So now the mapping from OpenMP device ids and VE node ids is actually used to create processes (sorry about that)

Thanks for getting this patch into shape! I am a bit hesitant to commit this right away for the following reasons, and please do correct me, if i am wrong:

  • We cannot test this without merging out-of-tree code.
  • (meaning) This is dead code until we have functional offloading. If this patch is upstream and we do not make it functional quickly, it is likely to be removed.

If that is so, i'd say we leave this here on Phabricator until there is a OMPT prototype on Phabricator or elsewhere that implements VE OMPT offloading in a way that could be upstreamed. Btw, Clang support for the VE target, at least for scalar code, is only 3-5 patches out.

It is currently dead, but I think it's probably worth landing anyway. Keeping it compiling is unlikely to be any overhead and it's another example implementation which is a potentially useful reference for other targets.

openmp/libomptarget/src/rtl.cpp
25–26

These could be sorted lexicographically, for the sake of removing uncertainty about where in the list to insert a new entry.

It is currently dead, but I think it's probably worth landing anyway. Keeping it compiling is unlikely to be any overhead and it's another example implementation which is a potentially useful reference for other targets.

Ok. We will add ompt to our buildbot for VE to make sure this gets frequently compiled and does not rot silently.

simoll added inline comments.Apr 29 2020, 8:16 AM
openmp/libomptarget/src/rtl.cpp
25–26

I think sorting this list should happen in a separate patch after this one has been committed.

openmp/libomptarget/src/rtl.cpp
25–26

Oh, sure. Before or after this lands is of course fine.

simoll accepted this revision.May 5 2020, 12:33 AM

LGTM then. The VE buildbot is configured for OpenMP now.

This revision is now accepted and ready to land.May 5 2020, 12:33 AM

@manorom Do you need me to commit this for you?

simoll requested changes to this revision.May 5 2020, 9:08 AM

Ok. Could you rebase this patch onto master (Arcanist is telling me that the parent commit of this one is not in the LLVM tree) ? Also, please apply the following diff to your patch. As it stands the plugin is not registered with cmake and so the plugin never gets build:

+++ b/openmp/libomptarget/plugins/CMakeLists.txt
@@ -70,6 +70,7 @@ add_subdirectory(cuda)
 add_subdirectory(ppc64)
 add_subdirectory(ppc64le)
 add_subdirectory(x86_64)
+add_subdirectory(ve)
This revision now requires changes to proceed.May 5 2020, 9:08 AM
simoll added inline comments.May 6 2020, 1:47 AM
openmp/libomptarget/plugins/ve/src/rtl.cpp
26

Missing #include <algorithm> for the std::sort below.

370

Use %d instead of PRIu64 for the int ret.

manorom updated this revision to Diff 263240.May 11 2020, 12:14 PM
  • Add ve plugin to openmp/libomptarget/plugins/CMakeLists.txt
  • Fixed format strings

@simoll this is not submitted with arcanist but I put it all on a clean fork from the current master. Can you try again?

simoll accepted this revision.May 12 2020, 1:45 AM

The plugin compiles now. I'll commit shortly. Thanks!

This revision is now accepted and ready to land.May 12 2020, 1:45 AM
This revision was automatically updated to reflect the committed changes.