These changes were extracted from D8463 to separate the driver part from the rest.
- Changed the driver pipeline to compile the host and device sides of CUDA files and pack the device-side results into the host object file.
- Added tests for CUDA pipeline creation in the driver.
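
For readers unfamiliar with the intended flow, the pipeline described above can be pictured with the toy model below. It is only a sketch and does not use the real clang driver classes: `Step`, `buildCudaPipeline`, the output naming scheme, and the idea of running one device compile per GPU architecture are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one pipeline step; NOT clang's real Action class.
struct Step {
    std::string side;                   // "device" or "host"
    std::string input;                  // source file being compiled
    std::string output;                 // artifact produced by this step
    std::vector<std::string> embedded;  // device artifacts packed into a host object
};

// Build the steps for one CUDA source file. Assumption for illustration:
// one device-side compile per GPU architecture, followed by a single
// host-side compile whose object file embeds every device-side result.
std::vector<Step> buildCudaPipeline(const std::string &source,
                                    const std::vector<std::string> &gpuArchs) {
    std::vector<Step> steps;
    std::vector<std::string> deviceOutputs;

    // Device side: compile the same source once per requested architecture.
    for (const std::string &arch : gpuArchs) {
        std::string out = source + "." + arch + ".device";  // made-up naming
        steps.push_back({"device", source, out, {}});
        deviceOutputs.push_back(out);
    }

    // Host side: compile the source and pack the device results into the object.
    steps.push_back({"host", source, source + ".o", deviceOutputs});
    return steps;
}
```

Calling `buildCudaPipeline("a.cu", {"sm_20", "sm_35"})` in this model yields two device steps followed by one host step whose `embedded` list names both device outputs.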
Differential Revision: http://reviews.llvm.org/D8463
AtTopLevel needs more documentation; its meaning is not clear, and neither is its usage below.