Previously, the kernel spec returned PTX only for an exact match on the
requested compute capability. With this patch, it returns the PTX with the
largest compute capability that does not exceed the requested compute
capability.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
streamexecutor/lib/KernelSpec.cpp | ||
---|---|---|
39 | (On Diff #71258) | Gosh this is subtle -- even though I've seen it before, I still had to spend five minutes convincing myself it's right. Oh well, I don't have a concrete suggestion. :) |