Prior to v8.4a[*], the only way to get a 128-bit atomic load or store was via ldxp/stxp (or casp). For a load this is not only inefficient but outright impossible on memory without write access, since the sequence has to store back to the location. So v8.4a extended the memory model so that any 16-byte operation aligned to 16 bytes (as all LLVM atomic loads and stores must be) is atomic.
This patch implements ISel for these 128-bit atomic loads and stores in both SDAG and GISel. In both cases we go for ldp/stp implementations, since atomics are much more likely to be GPR-based operations.
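
For illustration only, here is a minimal C++ sketch of the precondition described above: a 16-byte payload that is also 16-byte aligned. The names `Pair128` and `isAligned16` are hypothetical and are not part of this patch; the sketch only checks size and alignment and does not itself perform an atomic access.

```cpp
#include <cstdint>

// Hypothetical 128-bit payload. alignas(16) mirrors the 16-byte alignment
// that 128-bit atomic loads and stores are required to have.
struct alignas(16) Pair128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

static_assert(sizeof(Pair128) == 16, "payload must be exactly 16 bytes");
static_assert(alignof(Pair128) == 16, "payload must be 16-byte aligned");

// Returns true when the address is a multiple of 16, i.e. when it meets
// the alignment precondition for the ldp/stp lowering discussed above.
bool isAligned16(const void *p) {
  return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}
```

An object of type `Pair128` always passes `isAligned16`, whereas an address 8 bytes into it does not, so an access at that offset would not qualify.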