This is an archive of the discontinued LLVM Phabricator instance.

[Libomptarget] Add checks for CUDA subarchitecture using new info
ClosedPublic

Authored by jhuber6 on Jun 10 2022, 8:31 AM.

Details

Summary

This patch extends the is_valid_binary routine to also check if the
binary's architecture string matches the one parsed from the runtime.
This should allow us to only use the binary whose compute capability
matches, allowing us to support basic multi-architecture binaries for
CUDA.

Depends on D127432

Diff Detail

Event Timeline

jhuber6 created this revision.Jun 10 2022, 8:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 8:31 AM
Herald added subscribers: mattd, yaxunl. · View Herald Transcript
jhuber6 requested review of this revision.Jun 10 2022, 8:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 8:31 AM
tianshilei1992 added inline comments.Jun 10 2022, 8:50 AM
openmp/libomptarget/plugins/cuda/src/rtl.cpp
1549

I think here we should check if all devices are compatible instead of one.

  if (ArchStr != image->Info.Arch)
    return true;
}

return true.
tianshilei1992 added inline comments.Jun 10 2022, 8:52 AM
openmp/libomptarget/plugins/cuda/src/rtl.cpp
1548–1551

Should I add a test for this? Something like

clang --offload-arch=sm_35 --offload-arch=sm_37 --offload-arch=sm_50 --offload-arch=sm_52 --offload-arch=sm_53 --offload-arch=sm_60 --offload-arch=sm_61 --offload-arch=sm_62 --offload-arch=sm_70 --offload-arch=sm_72 --offload-arch=sm_75 --offload-arch=sm_80 --offload-arch=sm_86
openmp/libomptarget/plugins/cuda/src/rtl.cpp
1549

Yeah I wasn't sure how to handle this in the case of multiple devices, but it's probably more reasonable to expect each one to match the architecture for it to be compatible. I'll change it.

Should I add a test for this? Something like

clang --offload-arch=sm_35 --offload-arch=sm_37 --offload-arch=sm_50 --offload-arch=sm_52 --offload-arch=sm_53 --offload-arch=sm_60 --offload-arch=sm_61 --offload-arch=sm_62 --offload-arch=sm_70 --offload-arch=sm_72 --offload-arch=sm_75 --offload-arch=sm_80 --offload-arch=sm_86

I don't think our current runtime libraries support the case that there are multiple images for one plugin.

Should I add a test for this? Something like

clang --offload-arch=sm_35 --offload-arch=sm_37 --offload-arch=sm_50 --offload-arch=sm_52 --offload-arch=sm_53 --offload-arch=sm_60 --offload-arch=sm_61 --offload-arch=sm_62 --offload-arch=sm_70 --offload-arch=sm_72 --offload-arch=sm_75 --offload-arch=sm_80 --offload-arch=sm_86

I don't think our current runtime libraries support the case that there are multiple images for one plugin.

That's the whole point of this patch set. Now we will embed all of these images into an offloading binary which contains the architecture string. Inside the runtime we can then check if the architecture string matches and use that if it matches.

Should I add a test for this? Something like

clang --offload-arch=sm_35 --offload-arch=sm_37 --offload-arch=sm_50 --offload-arch=sm_52 --offload-arch=sm_53 --offload-arch=sm_60 --offload-arch=sm_61 --offload-arch=sm_62 --offload-arch=sm_70 --offload-arch=sm_72 --offload-arch=sm_75 --offload-arch=sm_80 --offload-arch=sm_86

I don't think our current runtime libraries support the case that there are multiple images for one plugin.

That's the whole point of this patch set. Now we will embed all of these images into an offloading binary which contains the architecture string. Inside the runtime we can then check if the architecture string matches and use that if it matches.

Oh, sorry didnt see its parent patch.

jhuber6 updated this revision to Diff 436011.Jun 10 2022, 12:23 PM

Rebase on the previous changes.

jhuber6 updated this revision to Diff 436409.Jun 13 2022, 8:29 AM

Rebase to use new function.

This revision is now accepted and ready to land.Jul 21 2022, 10:05 AM
This revision was landed with ongoing or failed builds.Jul 21 2022, 10:20 AM
This revision was automatically updated to reflect the committed changes.