This is an archive of the discontinued LLVM Phabricator instance.

[SE] KernelSpec return best PTX
ClosedPublic

Authored by jhen on Sep 13 2016, 4:24 PM.

Details

Summary

Previously, the kernel spec returned PTX only for exactly the requested
compute capability. With this patch, it returns the PTX with the largest
compute capability that does not exceed the requested compute
capability.
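
The lookup described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the names `ComputeCapability`, `findBestPTX`, and `PTXByCapability` are hypothetical, and the sketch assumes the PTX strings are kept in a `std::map` ordered by (major, minor) compute capability.

```cpp
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical key type: (major, minor) compute capability.
// std::pair compares lexicographically, so (3, 0) < (3, 5) < (5, 0).
using ComputeCapability = std::pair<int, int>;

// Returns a pointer to the PTX registered for the largest compute capability
// that does not exceed Requested, or nullptr if every registered capability
// is larger than Requested (or the map is empty).
const std::string *
findBestPTX(const std::map<ComputeCapability, std::string> &PTXByCapability,
            ComputeCapability Requested) {
  // upper_bound returns the first entry whose key is strictly greater than
  // Requested, so the entry just before it (if any) is the best match.
  auto It = PTXByCapability.upper_bound(Requested);
  if (It == PTXByCapability.begin())
    return nullptr; // No entry is less than or equal to Requested.
  return &std::prev(It)->second;
}
```

Using `upper_bound` and stepping back one entry handles both the exact match and the "next lower" case in one path; `lower_bound` would need a separate check for an exact hit.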

Diff Detail

Repository
rL LLVM

Event Timeline

jhen updated this revision to Diff 71258. Sep 13 2016, 4:24 PM
jhen retitled this revision from to [SE] KernelSpec return best PTX.
jhen updated this object.
jhen added a reviewer: jlebar.
jhen added subscribers: parallel_libs-commits, jprice.
jlebar added inline comments. Sep 13 2016, 4:33 PM
streamexecutor/lib/KernelSpec.cpp
Line 39 (On Diff #71258)

Gosh this is subtle -- even though I've seen it before, I still had to spend five minutes convincing myself it's right. Oh well, I don't have a concrete suggestion. :)

jlebar accepted this revision. Sep 13 2016, 4:33 PM
jlebar edited edge metadata.
This revision is now accepted and ready to land. Sep 13 2016, 4:33 PM
This revision was automatically updated to reflect the committed changes.