This will result in atomic operations larger than 128 bits being
expanded to __atomic_* libcalls by AtomicExpandPass. This is part of a
broader effort to clean up atomics handling in the same way on all
targets.
AArch64 always supports 128-bit atomics, so this is nice and simple.
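
As a rough illustration of the effect described above, the sketch below
models the decision between native lowering and a libcall. It is a
simplified stand-in, not the actual AtomicExpandPass code: the names
MaxAtomicSizeInBits and expandsToLibcall are hypothetical, and the rule
that under-aligned operations also become libcalls is an assumption
based on how the pass is generally understood to behave.

```cpp
// Hypothetical constant mirroring the 128-bit limit this change sets
// for AArch64. The real value lives in the target lowering code.
constexpr unsigned MaxAtomicSizeInBits = 128;

// Simplified model (assumption, not the real pass): an atomic operation
// stays with the backend only if it is at least naturally aligned and no
// wider than the supported maximum. Anything else becomes a libcall.
constexpr bool expandsToLibcall(unsigned SizeInBytes, unsigned AlignInBytes) {
  return !(AlignInBytes >= SizeInBytes &&
           SizeInBytes <= MaxAtomicSizeInBits / 8);
}

// i32 and i128 with natural alignment are handled natively.
static_assert(!expandsToLibcall(4, 4), "i32 stays native");
static_assert(!expandsToLibcall(16, 16), "i128 stays native");
// A 256-bit operation exceeds the limit and becomes a libcall.
static_assert(expandsToLibcall(32, 32), "i256 goes to a libcall");
// Under this model, an under-aligned i128 also becomes a libcall.
static_assert(expandsToLibcall(16, 8), "under-aligned i128 goes to a libcall");
```

The checks are static_asserts, so compiling the snippet is enough to
confirm that the model behaves as described.
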
Additionally, adjust some comments, and remove the partial code dealing
with larger-than-128-bit atomics, as it's now unreachable.
The arm64-irtranslator.ll test was adjusted because it was using an
i258 type as a hack to avoid IR atomic lowering and thereby test
GlobalISel behavior. It now passes -mattr=+lse and uses i32 instead,
which accomplishes the same goal in a more reasonable manner.