CUDA target attributes are used for function overloading and must not be merged.
This fixes a bug where attributes were inherited during function template
specialization in CUDA, which made it impossible for a specialized function
to provide its own target attributes.
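
For illustration only, here is a minimal sketch of the kind of code the description seems to have in mind. The name `side` and the particular attribute combination are hypothetical and do not come from the patch. The fallback macros let a plain host C++ compiler build the sketch. Whether a given CUDA compiler version accepts this exact combination is an assumption, not something stated above.

```cpp
// Fallback definitions so a plain C++ host compiler can build this sketch.
// Under a CUDA compilation the CUDA headers already define these macros.
#ifndef __host__
#define __host__
#endif
#ifndef __device__
#define __device__
#endif

// Hypothetical primary template, marked device-only.
template <typename T>
__device__ int side(T) { return 0; }

// Hypothetical explicit specialization that states its own target attribute.
// If attributes were merged from the primary template, this declaration would
// end up __host__ __device__ instead of the __host__ written here.
template <>
__host__ int side<int>(int) { return 1; }
```

With merging in place, the specialization could not be host-only, because the `__device__` attribute of the primary template would be added to it.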
Other reviewers have pointed out to me that we don't usually (ever?) need this. I think these have to do with LLVM's ability to generate code for our targets, but that's not relevant to Clang here.