This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Check that our CUDA install supports the requested architectures.
ClosedPublic

Authored by jlebar on Jun 29 2016, 4:00 PM.

Details

Summary

Raise an error if you're using a CUDA installation that's too old for
the requested architectures. In practice, this means that you need a
CUDA 8 install to compile for sm_6*.

Diff Detail

Repository
rL LLVM

Event Timeline

jlebar updated this revision to Diff 62301.Jun 29 2016, 4:00 PM
jlebar retitled this revision from to [CUDA] Check that our CUDA install supports the requested architectures..
jlebar updated this object.
jlebar added a reviewer: tra.
jlebar added a subscriber: cfe-commits.
jlebar added a comment.Jul 6 2016, 6:00 PM

Friendly ping.

tra added inline comments.Jul 7 2016, 9:59 AM
include/clang/Basic/DiagnosticDriverKinds.td
32 ↗(On Diff #62301)

Is it supposed to be two dashes or one?

include/clang/Driver/Options.td
1722–1724 ↗(On Diff #62301)

IMO we should use double-dash for this option.

-nocudalib/-nocudainc were mimicking existing -nostdlib/-nostdinc and thus ended with single dash in front.
--nocuda-version-check is not constrained by this.

lib/Driver/ToolChains.cpp
1704 ↗(On Diff #62301)

Might add an example of what's in that file, so it's easier to understand what the code below is doing.

1793 ↗(On Diff #62301)

What if it's cuda-8, but we've failed to read the file due to permissions.
Perhaps we should differentiate no-such-file from other errors.

4711 ↗(On Diff #62301)

-nocudainc should not preclude CUDA version check, IMO.
Primary use of version check is to make sure we can compile for a given GPU architecture. I think we still want to do that even if -nocudainc is passed.

jlebar updated this revision to Diff 63094.Jul 7 2016, 10:48 AM
jlebar marked 3 inline comments as done.

Address review comments.

include/clang/Driver/Options.td
1722–1724 ↗(On Diff #62301)

I guess I have to do a double dash because --no-cuda-noopt-device-debug is thus spelled. Notice there's also a dash after "no".

...this is so screwed up.

lib/Driver/ToolChains.cpp
1793 ↗(On Diff #62301)

Urgh.

I feel like simpler is probably better. Like, I would rather say:

  • We consider /usr/local/cuda, /usr/local/cuda-8.0, ...
  • We pick the first one that has a few key files
  • We infer its version by looking for version.txt. If we can't read it, for whatever reason, we assume 7.0.

Than the more complicated

  • We infer its version by looking for version.txt. If it's present but we can't read it, we try to look at the file path (/usr/local/cuda-8.0/...).

Especially because as we make things more complicated, our next step will probably be to change the algorithm to something like:

  • For each directory D in /usr/local/cuda, /usr/local/cuda-8.0, ...
    • Infer D's version using the algorithm above.
    • If D is not compatible with all of the cuda_archs specified, break and try the next one...

I do agree that if someone hits a permission problem on version.txt, that would be very difficult for them to debug. I suppose we could raise a proper error if version.txt is present but not readable? I'm not sure it's worth doing that, though...

4711 ↗(On Diff #62301)

We also do a check if you invoke ptxas. So the only circumstances under which we don't do a check is -nocudainc and no invocation of ptxas. In which case the only thing we're using from the CUDA install is fatbinary, which doesn't care about the sm version (afaik). We could add a version check on the fatbin invocation too if you want. My goal was just that if we don't use anything from the CUDA install, we should not check its version.

tra accepted this revision.Jul 7 2016, 11:17 AM
tra edited edge metadata.

LGTM.

lib/Driver/ToolChains.cpp
1798 ↗(On Diff #63094)

Good point. It may be worth a separate patch. Falling back to 7.0 on any error is OK for now.

4727 ↗(On Diff #63094)

OK.

This revision is now accepted and ready to land.Jul 7 2016, 11:17 AM
This revision was automatically updated to reflect the committed changes.