This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Improve CUDA compilation pipeline creation.
ClosedPublic

Authored by tra on Jul 16 2015, 3:07 PM.

Details

Summary

The current implementation tries to guess which Action will result in a job that needs to incorporate device-side GPU binaries. The guessing was an attempt to work around the fact that multiple actions may be combined into a single compiler invocation. If CudaHostAction ends up being combined (and thus bypassed during action list traversal), none of the device-side actions it points to are processed. The guessing worked for most of the usual cases, but fell apart when an external assembler was used.

This change removes the guessing and makes sure we create and pass device-side jobs regardless of how the jobs get combined.

  • CudaHostAction is always inserted either at the Compile phase or at the FinalPhase of the current compilation, whichever happens first.
  • If selectToolForJob combines CudaHostAction with other actions, it passes info about CudaHostAction up to the caller.
  • When it sees that CudaHostAction got combined with other actions (and hence will never be passed to BuildJobsForActions), BuildJobsForActions creates device-side jobs the same way they would be created if CudaHostAction were passed to BuildJobsForActions directly.
  • Added two more test cases to make sure GPU binaries are passed to correct jobs.
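
The approach in the list above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the code from the patch: Kind, Action, Job, Selection, selectTool, buildDeviceJobs, buildJobs, the combineActions flag, the tool names and the ".gpubin" suffix are all hypothetical stand-ins invented for illustration, and the real driver classes and signatures differ. It shows tool selection reporting a collapsed CUDA host action to its caller, and the caller creating the device-side jobs either way.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for driver concepts; NOT the real clang::driver classes.
enum class Kind { Input, Compile, Backend, Assemble, CudaHost };

struct Action {
  Kind kind;
  const Action *input;                    // upstream action, nullptr for Input
  std::vector<std::string> deviceInputs;  // device-side sources (CudaHost only)
};

struct Job {
  std::string tool;
  std::vector<std::string> inputs;
  std::string output;
};

// Result of tool selection: which tool runs, where traversal continues, and
// which CUDA host action (if any) was folded into the combined invocation.
struct Selection {
  std::string tool;
  const Action *remaining;
  const Action *collapsedCudaHost;
};

static std::string toolName(Kind k) {
  switch (k) {
  case Kind::Compile:  return "compile";
  case Kind::Backend:  return "backend";
  case Kind::Assemble: return "as";
  default:             return "none";
  }
}

// Hypothetical analogue of tool selection: when combining is enabled, an
// Assemble action absorbs the Backend/CudaHost/Compile actions below it.
Selection selectTool(const Action *top, bool combineActions) {
  Selection s{toolName(top->kind), top->input, nullptr};
  if (!combineActions || top->kind != Kind::Assemble)
    return s;
  const Action *cur = top->input;
  while (cur && (cur->kind == Kind::Backend || cur->kind == Kind::Compile ||
                 cur->kind == Kind::CudaHost)) {
    if (cur->kind == Kind::CudaHost)
      s.collapsedCudaHost = cur;  // report the bypassed action to the caller
    cur = cur->input;
  }
  s.tool = "cc1";
  s.remaining = cur;
  return s;
}

// Creates one device-side job per device input and returns the GPU binaries.
std::vector<std::string> buildDeviceJobs(const Action &cudaHost,
                                         std::vector<Job> &jobs) {
  std::vector<std::string> bins;
  for (const std::string &src : cudaHost.deviceInputs) {
    std::string out = src + ".gpubin";
    jobs.push_back(Job{"device-cc1", {src}, out});
    bins.push_back(out);
  }
  return bins;
}

// Returns the outputs produced for `a`, appending any created jobs to `jobs`.
std::vector<std::string> buildJobs(const Action *a, bool combineActions,
                                   std::vector<Job> &jobs) {
  assert(a && "action chain must not be empty");
  if (a->kind == Kind::Input)
    return {"input.cu"};
  if (a->kind == Kind::CudaHost) {
    // Direct visit: host-side result plus the device-side binaries.
    std::vector<std::string> outs = buildJobs(a->input, combineActions, jobs);
    std::vector<std::string> bins = buildDeviceJobs(*a, jobs);
    outs.insert(outs.end(), bins.begin(), bins.end());
    return outs;
  }
  Selection sel = selectTool(a, combineActions);
  std::vector<std::string> inputs =
      buildJobs(sel.remaining, combineActions, jobs);
  if (sel.collapsedCudaHost) {
    // The CUDA host action was bypassed, so create its device jobs here.
    std::vector<std::string> bins =
        buildDeviceJobs(*sel.collapsedCudaHost, jobs);
    inputs.insert(inputs.end(), bins.begin(), bins.end());
  }
  std::string out = sel.tool + ".out";
  jobs.push_back(Job{sel.tool, inputs, out});
  return {out};
}
```

In this toy model, the chain Input, Compile, CudaHost, Backend, Assemble yields exactly one device-side job whether or not combining is enabled. The GPU binary is handed to whichever host-side job consumes the CUDA host action's result, which is the property the change aims for.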

Diff Detail

Repository
rL LLVM

Event Timeline

tra updated this revision to Diff 29947. Jul 16 2015, 3:07 PM
tra retitled this revision to [CUDA] Improve CUDA compilation pipeline creation.
tra updated this object.
tra added reviewers: echristo, bogner, dblaikie.
tra added a subscriber: cfe-commits.
dblaikie edited edge metadata. Jul 17 2015, 1:49 PM
dblaikie added a subscriber: dblaikie.

Manuel - just got this private email from Phab. Seems that should've gone
to all the subscribers & ended up on the mailing list, but didn't?

echristo accepted this revision. Aug 24 2015, 1:25 PM
echristo edited edge metadata.

This seems like a decent incremental improvement. I think we still need to separate out the pipeline a bit further and have the CUDA compilation just be separate actions that don't need this "inject" bit, but this can at least prep the code for that cleanup.

-eric

This revision is now accepted and ready to land. Aug 24 2015, 1:25 PM
This revision was automatically updated to reflect the committed changes.