AtomicExpandPass expands atomicrmw instructions to loop structures. On
ARM/AArch64, these make use of exclusive load/store instructions. Any additional
store that occurs between these instructions will invalidate the exclusive
access monitor, and potentially cause an infinite loop. Therefore the register
allocator must be prevented from inserting spills between these two points.
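To illustrate the failure mode, here is a minimal sketch of a toy exclusive monitor. It is not ARM's actual semantics and not code from this patch: ToyMonitor, atomic_add and the retry budget are hypothetical names introduced for illustration. The model assumes, as described above, that any plain store clears the monitor; real hardware behaviour is more nuanced and is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-core model: an exclusive load arms the monitor,
// and any store (plain or exclusive) disarms it.
struct ToyMonitor {
  bool armed = false;

  int32_t load_exclusive(const int32_t &addr) {
    armed = true;
    return addr;
  }

  // Models a spill: an ordinary store that clears the monitor.
  void plain_store(int32_t &addr, int32_t value) {
    addr = value;
    armed = false;
  }

  // Succeeds only if the monitor is still armed.
  bool store_exclusive(int32_t &addr, int32_t value) {
    if (!armed)
      return false;
    addr = value;
    armed = false;
    return true;
  }
};

// Shape of an expanded atomicrmw add: retry until the exclusive store
// succeeds. Returns the number of attempts used, or -1 if the retry budget
// ran out (the budget stands in for what would be an infinite loop).
int atomic_add(ToyMonitor &m, int32_t &target, int32_t delta,
               bool spill_inside, int max_attempts) {
  assert(max_attempts > 0);
  int32_t spill_slot = 0;
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    int32_t old = m.load_exclusive(target);
    if (spill_inside)
      m.plain_store(spill_slot, old); // spill between load and store
    if (m.store_exclusive(target, old + delta))
      return attempt;
  }
  return -1;
}
```

Without the spill, the loop completes on its first attempt. With the spill, every iteration clears the monitor again before the exclusive store, so in this model the loop can never make progress.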
The approach taken here is to create a bundle containing all the instructions
between the exclusive load and store. This prevents the register allocator from
inserting spills.
This exposed an issue with RegAllocFast, wherein a virtual register defined
inside the bundle might be assigned the same physical register as a virtual
register with a use that occurs after the def. For example:
  %0 = something global
  BUNDLE implicit-def %1, implicit %0 {
    %1 = MOVi 123
    store %0, ...
  }
In the above example it was possible to allocate the same physical register to both
%0 and %1. RegAllocFast has been updated to avoid this. RegAllocGreedy does not
have a similar problem, since it uses liveness analysis.
Finally, UnpackMachineBundles is added after register allocation for ARM/AArch64
to remove the bundles.
Differential Revision: https://reviews.llvm.org/D94949