Aligned allocation versus CUDA: make the deallocation function preference order
match the other CUDA preference orders, per discussion with jlebar. Selection of
operator delete is now modeled to match overload resolution as closely as possible:
- First, we throw out all operator delete functions that are not callable due to a CUDA host/device mismatch.
- Then we apply sizedness / alignedness preferences based on whether the type is overaligned and whether the deallocation function is a member.
- Finally, we use the CUDA callability preference as a tiebreaker.
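
The three steps above can be sketched as a small standalone C++ model. This is an illustration only, not the compiler's code: Candidate, isBetter, selectDeallocation, and the integer cudaPref scale (0 meaning "not callable") are hypothetical names, the caller is assumed to have already computed wantSize and wantAlign from the type's alignment and from whether the function is a member, and checking alignedness before sizedness is an assumption of this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one candidate operator delete.
struct Candidate {
  std::string name;
  bool hasSize;   // takes a std::size_t parameter
  bool hasAlign;  // takes a std::align_val_t parameter
  int cudaPref;   // assumed scale: 0 = not callable here, higher = preferred
};

// Returns true if `a` should be chosen over `b`.
// Shape preferences come first; CUDA preference only breaks ties.
bool isBetter(const Candidate &a, const Candidate &b,
              bool wantSize, bool wantAlign) {
  if (a.hasAlign != b.hasAlign)
    return a.hasAlign == wantAlign;
  if (a.hasSize != b.hasSize)
    return a.hasSize == wantSize;
  return a.cudaPref > b.cudaPref;
}

// Returns the selected candidate, or nullptr if none is callable.
const Candidate *selectDeallocation(const std::vector<Candidate> &cands,
                                    bool wantSize, bool wantAlign) {
  const Candidate *best = nullptr;
  for (const Candidate &c : cands) {
    // Step 1: drop candidates that cannot be called from this
    // host/device context at all.
    if (c.cudaPref == 0)
      continue;
    // Steps 2 and 3: sizedness / alignedness, then CUDA tiebreak.
    if (!best || isBetter(c, *best, wantSize, wantAlign))
      best = &c;
  }
  return best;
}
```

In this model a callable candidate with the desired shape beats one with a higher CUDA preference but the wrong shape, because the CUDA preference is consulted only when the sizedness and alignedness checks cannot separate two candidates.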