SerializeToCubin depends on CUDA at *runtime*, which is undesirable for MLIR's
general use case, as compilation should be doable on any host, regardless of
whether it has a GPU.
SerializeToCubin is needed to run some GPU tests, so when we build mlir-opt,
the SerializeToCubin pass is linked directly into mlir-opt.