This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Use CUDA's non-RDC mode when LTO has whole program visibility
ClosedPublic

Authored by jhuber6 on Apr 22 2022, 12:34 PM.

Details

Summary

When we do LTO we consider ourselves to have whole program visibility if
every single input file we have contains LLVM bitcode. If we have whole
program visibility then we can create a single image and utilize CUDA's
non-RDC mode by not passing -c to ptxas and skipping the nvlink
job. This should be faster in some situations and also saves us the
time spent executing nvlink.
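
The decision described in the summary can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: the `InputFile` struct and the helpers `hasWholeProgramVisibility`, `buildPtxasArgs`, and `needsNvlink` are hypothetical names, and treating an empty input list as lacking whole program visibility is an assumption. The only ptxas options used are `-arch`, `-o`, and `-c`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one linker input; not a type from the patch.
struct InputFile {
  std::string Name;
  bool IsBitcode; // true if the file contains LLVM bitcode
};

// Whole program visibility as the summary defines it: every input is bitcode.
// Rejecting an empty input list is an assumption made for this sketch.
bool hasWholeProgramVisibility(const std::vector<InputFile> &Inputs) {
  return !Inputs.empty() &&
         std::all_of(Inputs.begin(), Inputs.end(),
                     [](const InputFile &F) { return F.IsBitcode; });
}

// Build a ptxas command line. In RDC mode "-c" requests a relocatable object
// that nvlink links later; with whole program visibility "-c" is omitted so
// ptxas emits the final image directly.
std::vector<std::string> buildPtxasArgs(const std::string &Arch,
                                        const std::string &Input,
                                        const std::string &Output,
                                        bool WholeProgram) {
  assert(!Arch.empty() && "a GPU architecture such as sm_70 is required");
  std::vector<std::string> Args = {"ptxas", "-arch", Arch, "-o", Output, Input};
  if (!WholeProgram)
    Args.push_back("-c");
  return Args;
}

// The nvlink job is only needed when ptxas produced a relocatable object.
bool needsNvlink(bool WholeProgram) { return !WholeProgram; }
```

With two bitcode inputs, `hasWholeProgramVisibility` returns true, the ptxas arguments contain no `-c`, and `needsNvlink` returns false. A single non-bitcode input flips all three results.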

Event Timeline

jhuber6 created this revision. Apr 22 2022, 12:34 PM
Herald added a project: Restricted Project. Apr 22 2022, 12:34 PM
jhuber6 requested review of this revision. Apr 22 2022, 12:34 PM
tra accepted this revision. Apr 22 2022, 12:49 PM

LGTM with a minor test nit.

clang/test/Driver/linker-wrapper.c
41–42

// LTO-NOT: nvlink

This revision is now accepted and ready to land. Apr 22 2022, 12:49 PM
jhuber6 marked an inline comment as done. Apr 22 2022, 12:50 PM
jhuber6 updated this revision to Diff 424587. Apr 22 2022, 12:51 PM

Add test line

This revision was landed with ongoing or failed builds. Apr 23 2022, 9:43 AM
This revision was automatically updated to reflect the committed changes.