Set the maximum width of atomic operations on x86-32 based on the target
CPU. The 64-bit inline atomics require cmpxchg8b, which is an i586
instruction. Other inline atomics require cmpxchg, which is an i486
instruction.
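
As a rough illustration of the rule above, the CPU-to-width mapping could be sketched as follows. The `X86CPU` enum and the `maxAtomicInlineWidth` function are hypothetical names invented for this sketch, not the identifiers used in the patch, and the zero width for i386 is an assumption that follows from the description (neither instruction is available there).

```cpp
// Hypothetical stand-in for the compiler's CPU enumeration; only the
// ordering (i386 < i486 < i586 < i686) matters for this sketch.
enum class X86CPU { i386, i486, i586, i686 };

// Returns the widest atomic operation, in bits, that can be inlined
// on x86-32 for the given CPU.
constexpr unsigned maxAtomicInlineWidth(X86CPU cpu) {
  if (cpu >= X86CPU::i586)
    return 64; // cmpxchg8b is available
  if (cpu >= X86CPU::i486)
    return 32; // only cmpxchg is available
  return 0;    // assumption: i386 has neither instruction
}

// The i486 case is the one the FreeBSD default hits.
static_assert(maxAtomicInlineWidth(X86CPU::i486) == 32, "i486 lacks cmpxchg8b");
```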
This fixes the incorrect values of __GCC_ATOMIC_LLONG_LOCK_FREE
and __atomic_always_lock_free() on FreeBSD, where clang defaults to the
i486 CPU (PR#31864).
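
One way to observe the two values mentioned above is a small probe like the one below. It is only a sketch: the helper names are made up for illustration, and it assumes a GCC-compatible compiler that predefines the macro and provides the builtin. The printed values depend on the target CPU, so none are shown here.

```cpp
#include <cstdio>

// Value of the predefined macro for long long:
// 0 = never lock-free, 1 = sometimes lock-free, 2 = always lock-free.
constexpr int llongLockFreeMacro() { return __GCC_ATOMIC_LLONG_LOCK_FREE; }

// Compile-time answer from the builtin for an object of long long size;
// a null second argument asks about a typically aligned object.
constexpr bool llongAlwaysLockFree() {
  return __atomic_always_lock_free(sizeof(long long), 0);
}

// Prints both values so they can be compared across -march settings.
void reportLLongAtomics() {
  std::printf("__GCC_ATOMIC_LLONG_LOCK_FREE = %d\n", llongLockFreeMacro());
  std::printf("__atomic_always_lock_free(%zu, 0) = %d\n",
              sizeof(long long), llongAlwaysLockFree() ? 1 : 0);
}
```

Building a program that calls `reportLLongAtomics()` for a 32-bit x86 target once with `-march=i486` and once with `-march=i586` should make the difference described here visible.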
For CUDA device builds, assume i586+. This matches the default CPUs for
all x86-32 targets on systems supporting CUDA.
Someone else is calling setCPU on the HostTarget. If they're calling it after we call it here, this is obviously not going to work. Since it does work, I presume they are calling it before we get here. In that case, can we not just set MaxAtomicPromoteWidth = HostTarget->MaxAtomicPromoteWidth right after we set MaxAtomicInlineWidth = HostTarget->getMaxAtomicInlineWidth(); below?
(Note that the code as written definitely isn't right, because it assumes that HostTarget is non-null.)
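
To make the suggestion concrete, here is a minimal sketch of copying both widths from the host target with a null check. `TargetInfoSketch` and `inheritAtomicWidths` are hypothetical, simplified stand-ins written for this comment, not the real classes or the code in the patch; only the member names MaxAtomicPromoteWidth, MaxAtomicInlineWidth and getMaxAtomicInlineWidth() come from the discussion above.

```cpp
// Hypothetical, heavily simplified stand-in for the target-info class.
struct TargetInfoSketch {
  unsigned MaxAtomicPromoteWidth = 0;
  unsigned MaxAtomicInlineWidth = 0;
  unsigned getMaxAtomicInlineWidth() const { return MaxAtomicInlineWidth; }
};

// Copies the host's atomic widths into the device target, but only when a
// host target actually exists. Assumes the host's CPU was set beforehand.
void inheritAtomicWidths(TargetInfoSketch &Device,
                         const TargetInfoSketch *HostTarget) {
  if (!HostTarget)
    return; // no host target: keep the device defaults
  Device.MaxAtomicInlineWidth = HostTarget->getMaxAtomicInlineWidth();
  Device.MaxAtomicPromoteWidth = HostTarget->MaxAtomicPromoteWidth;
}
```

Both widths are taken from the same place, so the device target cannot end up with an inline width that disagrees with the host's promote width.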