This is an archive of the discontinued LLVM Phabricator instance.

[libomptarget] Fail on missing symbols in device image
Needs ReviewPublic

Authored by JonChesterfield on Nov 18 2020, 3:16 PM.

Details

Summary

[libomptarget] Fail on missing symbols in device image

_exec_mode is unconditionally emitted by codegen. It missing implies a corrupt
device image, which may have further problems.

omptarget_device_environment is unconditionally present in deviceRTL. It
missing implies that some of the deviceRTL has not been linked into the
application.

Diff Detail

Event Timeline

JonChesterfield requested review of this revision.Nov 18 2020, 3:16 PM

Cuda reads from _exec_mode, and writes to omptarget_device_environment, after the image has been loaded onto the gpu. Amdgpu read/writes the device image before it is loaded onto the gpu to avoid the extra round trip. This means amdgpu has an elf symbol table parsing function and makes a possibly avoidable copy of the device image (I haven't checked whether libomptarget handles a plugin mutating the image gracefully). Do we want the same optimisation for cuda?

This makes sense, can we test this?

Manually, yeah - jury rig clang to elide one symbol, rename the one in deviceRTL. Automatically? Not sure it's worth the setup.

Manually, yeah - jury rig clang to elide one symbol, rename the one in deviceRTL. Automatically? Not sure it's worth the setup.

Hm, I thought we could prepare an image. Or maybe we can build an offload test, strip a symbol via existing llvm tools, then run and observe the result?
You think that can be done reasonably easily?

llvm-objcopy is willing to strip the symbols from the device image. Can get the name of exec_mode from llvm-nm. I'm not sure we have tooling for pulling apart the openmp image and reassembling it, or for injecting an external tool to mangle the device image on the way into the binary. Expressing the result in lit/filecheck form means long pipelines and/or several temporary files, but is doable.

I would say the test would be substantially more complicated than the change, and much more likely to be buggy. If we had better testing infra - stuff for making binary files and dicing them - I'd be more in favour.