This is an archive of the discontinued LLVM Phabricator instance.

[Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.
ClosedPublic

Authored by sfantao on Jun 29 2016, 10:34 AM.

Details

Summary

This patch adds a new specialized action builder to create OpenMP offloading actions. The specialized builder is added to the action builder already containing the CUDA specialized builder.

OpenMP offloading dependences between host and device actions (expressed with OffloadActions) are different that what is used for CUDA:

  • Device compile action depends on the host compile action - the device frontend extracts the information about the declarations that have to be emitted by looking into the metadata produced by the host frontend.
  • The host link action depends on the device link actions - the device images are embedded in the host binary at link time.

Event Timeline

sfantao updated this revision to Diff 62238.Jun 29 2016, 10:34 AM
sfantao retitled this revision from to [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions..
sfantao updated this object.
ABataev added inline comments.Jun 29 2016, 9:23 PM
lib/Driver/Driver.cpp
1834

'final'

sfantao updated this revision to Diff 62519.Jul 1 2016, 12:09 PM
sfantao marked an inline comment as done.
  • Mark class as final and remove \brief from comments.

Hi Alexey,

Thanks for the review! Addressed your comment in the new diff. Also removed \brief from the comments.

jlebar edited edge metadata.Jul 1 2016, 12:39 PM

Hi, Alexy. Would you mind not asking for 'final' in additional reviews until we've resolved this thread elsewhere? Feel free to find me on IRC if you want to talk about it synchronously.

Thanks!

sfantao updated this revision to Diff 62580.Jul 1 2016, 5:10 PM
sfantao edited edge metadata.
  • Rebase
sfantao updated this revision to Diff 63684.Jul 12 2016, 8:21 AM
  • Rebase.
mkuron added a subscriber: mkuron.Jul 26 2016, 5:27 AM
sfantao updated this revision to Diff 66019.Jul 28 2016, 2:50 PM
  • Rebase.
sfantao updated this revision to Diff 72120.Sep 21 2016, 3:46 PM
  • Rebase.
hfinkel added inline comments.Sep 28 2016, 12:02 PM
lib/Driver/Driver.cpp
1846

Depences - Spelling?

1864

to a -> as a

1889

as dependence -> as a dependence (or as the dependence)

1890

declaration -> declarations

1891

have prevent -> prevent

1928

related with -> related to

1960

Since we can have both OpenMP offloading and CUDA, please add a test that the phases work correctly for that case (or that we produce an error if that can't currently work correctly).

sfantao updated this revision to Diff 75722.Oct 25 2016, 9:50 AM
sfantao marked 7 inline comments as done.
sfantao edited edge metadata.
  • Fix typos and add test tht checks phases when OpenMP and CUDA are used simultaneously.

Hi Hal,

Thanks for the review! Fixed the typos in the new diff.

lib/Driver/Driver.cpp
1960

Added new test for that. The phases generation should work well if CUDA and OpenMP offloading are used on the same file.

However, the bindings for these phases cannot be generated given that the NVPTX toolchain support for OpenMP is not implemented yet and the CUDA implementation interprets actions differently, e.g. in CUDA linking is the combination of binaries of different devices (GPUs) whereas for OpenMP actual linking takes place, i.e. symbols are resolved by looking into other compilation units.

hfinkel accepted this revision.Oct 26 2016, 3:18 PM
hfinkel edited edge metadata.

LGTM

lib/Driver/Driver.cpp
1960

Okay; after this is committed, please file a PR showing what happens and explaining the issue.

This revision is now accepted and ready to land.Oct 26 2016, 3:18 PM
sfantao closed this revision.Oct 27 2016, 10:17 AM

A PR was generated as requested by Hal explaining why we do not generate jobs for NVPTX targets yet.

https://llvm.org/bugs/show_bug.cgi?id=30812

mkuron added a comment.EditedOct 28 2016, 8:14 AM

I think OffloadAction::DeviceDependences::add(..., ..., /*BoundArch=*/nullptr, Action::OFK_OpenMP) is never sufficient. The invalid BoundArch eventually ends up in NVPTX::Assembler::ConstructJob and triggers an assert; I don't think there is any code path with OpenMP offloading where the GPU architecture is set correctly. If I compile a simple test file with

clang -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c example.c -march=sm_30

the error message is the following:

clang: /llvm/tools/clang/lib/Driver/Tools.cpp:11960: virtual void clang::driver::tools::NVPTX::Assembler::ConstructJob(clang::driver::Compilation&, const clang::driver::JobAction&, const clang::driver::InputInfo&, const InputInfoList&, const llvm::opt::ArgList&, const char*) const: Assertion `gpu_arch != CudaArch::UNKNOWN && "Device action expected to have an architecture."' failed.

On a related but different note, leaving out -march=sm_30 in the clang call above causes an earlier assert to trigger:

clang: /llvm/tools/clang/lib/Driver/ToolChains.cpp:5049: virtual void clang::driver::toolchains::CudaToolChain::addClangTargetOptions(const llvm::opt::ArgList&, llvm::opt::ArgStringList&) const: Assertion `!GpuArch.empty() && "Must have an explicit GPU arch."' failed.

The more appropriate flag would probably be --cuda-gpu-arch=sm_30, but that is not recognized.

I thought I'd just report this here as it seemed to me that with the merge of all of @sfantao's code yesterday the OpenMP offloading support should mostly work. If this is not the case or I should report the issue elsewhere, please let me know. Also, I'm not sure if/how this relates to the bug report you mentioned.

Hi Michael,

I think OffloadAction::DeviceDependences::add(..., ..., /*BoundArch=*/nullptr, Action::OFK_OpenMP) is never sufficient. The invalid BoundArch eventually ends up in NVPTX::Assembler::ConstructJob and triggers an assert; I don't think there is any code path with OpenMP offloading where the GPU architecture is set correctly. If I compile a simple test file with

clang -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c example.c -march=sm_30

the error message is the following:

clang: /llvm/tools/clang/lib/Driver/Tools.cpp:11960: virtual void clang::driver::tools::NVPTX::Assembler::ConstructJob(clang::driver::Compilation&, const clang::driver::JobAction&, const clang::driver::InputInfo&, const InputInfoList&, const llvm::opt::ArgList&, const char*) const: Assertion `gpu_arch != CudaArch::UNKNOWN && "Device action expected to have an architecture."' failed.

On a related but different note, leaving out -march=sm_30 in the clang call above causes an earlier assert to trigger:

clang: /llvm/tools/clang/lib/Driver/ToolChains.cpp:5049: virtual void clang::driver::toolchains::CudaToolChain::addClangTargetOptions(const llvm::opt::ArgList&, llvm::opt::ArgStringList&) const: Assertion `!GpuArch.empty() && "Must have an explicit GPU arch."' failed.

The more appropriate flag would probably be --cuda-gpu-arch=sm_30, but that is not recognized.

I thought I'd just report this here as it seemed to me that with the merge of all of @sfantao's code yesterday the OpenMP offloading support should mostly work. If this is not the case or I should report the issue elsewhere, please let me know. Also, I'm not sure if/how this relates to the bug report you mentioned.

These patches do not implement any specific support for GPUs. Only toolchains based on gcc are expected to work. GPUs will require some extra work on the toolchain which is under progress.

In any case, it is not nice to have these assertions when trying an unsupported toolchain. I'll work on a diagnostic so that the driver stops before attempting to create jobs for unsupported toolchains.

Thanks for reporting this!