This fixes a bug where we were unable to compile the following CUDA
file with libstdc++ (libc++ was not tested):

  #include <future>
  void foo() { std::shared_future<int> x; }

The problem is that <future> only defines std::shared_future if
__GCC_ATOMIC_INT_LOCK_FREE > 1. When we compiled this file for device,
the macro was set to 1, so the class was not defined at all.
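
To illustrate the failure mode, here is a minimal sketch of a header that
hides a class behind a macro check. It is not the libstdc++ source: the
macro DEMO_ATOMIC_INT_LOCK_FREE, the namespace demo, and the helper
has_shared_future() are hypothetical stand-ins for
__GCC_ATOMIC_INT_LOCK_FREE and the guard in <future>.

```cpp
// Simplified sketch of a class guarded by a macro check.
// DEMO_ATOMIC_INT_LOCK_FREE and namespace demo are hypothetical stand-ins
// for __GCC_ATOMIC_INT_LOCK_FREE and the guard in <future>.
#ifndef DEMO_ATOMIC_INT_LOCK_FREE
#define DEMO_ATOMIC_INT_LOCK_FREE 1  // the value seen when compiling for device
#endif

namespace demo {

#if DEMO_ATOMIC_INT_LOCK_FREE > 1
// Only declared when the macro is greater than 1.
template <typename T>
class shared_future {};
#endif

// True only when the guarded class above was actually declared.
constexpr bool has_shared_future() { return DEMO_ATOMIC_INT_LOCK_FREE > 1; }

}  // namespace demo
```

With the default value of 1, any use of demo::shared_future<int> fails to
compile because the name is never declared. Building with
-DDEMO_ATOMIC_INT_LOCK_FREE=2 makes the declaration visible again.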