The previous patches allowed us to create a static library containing
all the device code. This patch uses that library to perform the device
runtime linking late when performing LTO. This in addition to
simplifying the libraries, allows us to transparently handle the runtime
library as-needed without needing Clang to manually pass the necessary
library in the linker wrapper job.
Depends on D125315