Right now if the Action graph is a DAG and we encounter an action twice,
we will run it twice.
This patch is difficult to test as-is, but I have test cases that exercise it
through CUDA compilation.
This actually has a subtle issue that the existing unit tests do not catch: BuildJobsForAction has an out-parameter, and we don't set it on a cache hit.
Please hold off reviewing this until I fix the problem.
In the new CUDA world, we have the following graph, which I hope will render properly:
foo.cu --> foo.s (PTX) --> foo.cubin --> foo.fatbin
             |                               ^
             └-------------------------------┘
That is, foo.s is an input to foo.cubin *and* an input to foo.fatbin.
The Driver stores each Action's inputs. So starting from the fatbin, we look at its two inputs, and try to create jobs for them. Fine. Then we look at the input to the cubin. That's foo.s, which we already visited.
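To make the double visit concrete, here is a minimal sketch of caching job creation per Action. It is not the Driver's actual code: Action, InputInfo, CachedResult, and JobBuilder are hypothetical stand-ins, and the std::string out-parameter only loosely models the one on BuildJobsForAction. The sketch stores the out-parameter's value alongside the result, so a cache hit can still set it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Driver's types; not clang's real classes.
struct Action {
  std::string Name;
  std::vector<const Action *> Inputs;
};

struct InputInfo {
  std::string Filename;
};

// What we remember per Action: the result *and* the out-parameter value.
struct CachedResult {
  InputInfo Result;
  std::string LinkingOutput;
};

struct JobBuilder {
  std::map<const Action *, CachedResult> Cache;
  std::vector<std::string> Jobs; // one entry per job actually created

  InputInfo build(const Action *A, std::string &LinkingOutput) {
    assert(A && "null action");

    auto It = Cache.find(A);
    if (It != Cache.end()) {
      // Cache hit: no new job, but the out-parameter must still be set.
      LinkingOutput = It->second.LinkingOutput;
      return It->second.Result;
    }

    // Visit inputs first; a shared input is built once and then served
    // from the cache on later visits.
    for (const Action *In : A->Inputs) {
      std::string Ignored;
      build(In, Ignored);
    }

    Jobs.push_back(A->Name);
    LinkingOutput = A->Name; // placeholder value, for illustration only
    InputInfo Result{A->Name};
    Cache.emplace(A, CachedResult{Result, LinkingOutput});
    return Result;
  }
};
```

With the four-node graph above (foo.fatbin taking both foo.cubin and foo.s as inputs), calling build on the fatbin action creates one job per action rather than two for foo.s, and a later call on foo.s returns the cached result with the out-parameter filled in.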