This is the first step toward adding memory synchronization semantics to LSE Atomics.
This patch does not change the functionality of the existing LSE Atomics; it only lays the groundwork.
The memory semantics feature itself will be added in a subsequent patch.
This patch also applies several corrections to the existing LSE Atomics implementation, based on ARM Errata D11904 from 05/12/2017.
You can put these into the "multiclass binary_atomic_op".
That will automatically instantiate the _8 ordered variants whenever the plain _8 is created. You will need the same for _16, _32, ...
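For context, here is a minimal C++ sketch of what the ordered _8 variants correspond to at the source level. The helper names are hypothetical and not part of this patch. The instruction names in the comments are the byte-sized LSE forms a compiler would typically be expected to select once the ordered patterns exist, assuming LSE is enabled for the target; they are an expectation, not something this patch verifies.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helpers: one 8-bit atomic add per memory ordering.
// Each returns the value held before the addition.

std::uint8_t add_relaxed(std::atomic<std::uint8_t>& v, std::uint8_t x) {
    // No ordering constraint; typically the plain form (LDADDB).
    return v.fetch_add(x, std::memory_order_relaxed);
}

std::uint8_t add_acquire(std::atomic<std::uint8_t>& v, std::uint8_t x) {
    // Acquire ordering; typically the "A" form (LDADDAB).
    return v.fetch_add(x, std::memory_order_acquire);
}

std::uint8_t add_release(std::atomic<std::uint8_t>& v, std::uint8_t x) {
    // Release ordering; typically the "L" form (LDADDLB).
    return v.fetch_add(x, std::memory_order_release);
}

std::uint8_t add_seq_cst(std::atomic<std::uint8_t>& v, std::uint8_t x) {
    // Sequentially consistent; typically the "AL" form (LDADDALB).
    return v.fetch_add(x, std::memory_order_seq_cst);
}
```

The _16, _32, and wider cases would follow the same shape with `std::uint16_t`, `std::uint32_t`, and so on.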