Differential D24619: [SE] Cache CUDA modules
Authored by jhen on Sep 15 2016, 12:42 PM.

Instead of reloading a module if the same kernel is requested multiple times,
cache the loaded module and return the cached value.
The CUDAPlatformDevice now also keeps handles to all its modules so they can be
unloaded if the device is cleared.
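
As an illustration of the scheme described in the summary, here is a minimal C++ sketch of a module cache keyed by the PTX text, using the CUDA driver API (cuModuleLoadData / cuModuleUnload) directly. The names ModuleCache, GetOrLoad, and Clear are hypothetical and are not the actual StreamExecutor / CUDAPlatformDevice code from this diff.

```
// Minimal sketch, not the actual StreamExecutor implementation.
// Assumes the CUDA driver API and a cache keyed by the full PTX text,
// as described in the summary above.
#include <cuda.h>

#include <map>
#include <string>

class ModuleCache {  // Hypothetical name.
 public:
  // Returns the cached module for this PTX text, loading it on first use.
  CUmodule GetOrLoad(const std::string &PtxCode) {
    auto It = LoadedModules.find(PtxCode);
    if (It != LoadedModules.end())
      return It->second;  // Cache hit: no reload for a repeated kernel.
    CUmodule Module;
    if (cuModuleLoadData(&Module, PtxCode.c_str()) != CUDA_SUCCESS)
      return nullptr;  // Real code would report the CUresult error.
    LoadedModules.emplace(PtxCode, Module);
    return Module;
  }

  // Unloads every cached module, e.g. when the device is cleared.
  void Clear() {
    for (auto &Entry : LoadedModules)
      cuModuleUnload(Entry.second);
    LoadedModules.clear();
  }

 private:
  // Keyed by the PTX string itself; see the lookup-cost comment below.
  std::map<std::string, CUmodule> LoadedModules;
};
```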
We've decided to come at this problem from a different angle, so I'm abandoning this revision.
Hm. This makes a copy of "Code" in the map. And every time we do a lookup, we're going to have to compare whole PTX strings, which are potentially very long.
Is there no other identifier we could use as the map key?
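
One way to avoid copying and comparing full PTX strings, assuming each kernel's PTX text lives at a stable address for the lifetime of the device, would be to key the cache on that address instead. This is only an illustrative alternative prompted by the question above, not the approach taken in the diff (the revision was ultimately abandoned).

```
// Illustrative alternative only: key the cache on the address of the PTX
// text rather than its contents, assuming each kernel's PTX string has a
// stable address for the lifetime of the device.
#include <cuda.h>

#include <map>

class ModuleCacheByAddress {  // Hypothetical name.
 public:
  CUmodule GetOrLoad(const char *PtxCode) {
    auto It = LoadedModules.find(PtxCode);
    if (It != LoadedModules.end())
      return It->second;  // Pointer comparison instead of a string compare.
    CUmodule Module;
    if (cuModuleLoadData(&Module, PtxCode) != CUDA_SUCCESS)
      return nullptr;
    LoadedModules.emplace(PtxCode, Module);
    return Module;
  }

 private:
  // No copy of the PTX is stored; lookups compare pointers, not strings.
  std::map<const char *, CUmodule> LoadedModules;
};
```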