Based on the same for AArch64: 4751cadcca45984d7671e594ce95aed8fe030bf1
At -O0, the fast register allocator may insert spills between the ldrex and
strex instructions inserted by AtomicExpandPass when expanding atomicrmw
instructions in LL/SC loops. To avoid this, expand to cmpxchg loops and
therefore expand the cmpxchg pseudos after register allocation.
Required a tweak to ARMExpandPseudo::ExpandCMP_SWAP to use the 4-byte encoding
of UXT, since the pseudo instruction can be allocated a high register (R8-R15)
which the 2-byte encoding doesn't support.
The previously committed attempt in D101164 had to be reverted due to runtime
failures in the test suites. Rather than spending time fixing that
implementation (adding another implementation of atomic operations and more
divergence between backends) I have chosen to follow the approach taken in
D101163.
Compared to D101164, this patch also fixes the problem for Thumb 2.
Depends on D101912
The text of this assertion seems seems wrong; the point of the assertion is that ARMv8-M.baseline doesn't have t2UXTB/t2UXTH