These calls are neither intercepted by compiler-rt nor is libatomic.a
naturally instrumented.
This patch uses the existing libcall mechanism to detect a call
to atomic_load or atomic_store, and instruments them much like
the preexisting instrumentation for atomics.
Calls to _load are modified to have at least Acquire ordering, and
calls to _store at least Release ordering. Because this needs to be
converted at runtime, msan injects a LUT (implemented as a vector
with extractelement).
One thing that turned out a little strange is that because I have to insert instructions *after* the atomic load, including the origin update, the msan reporter decides that the origin is one line below the call. Is there anything I can do about this?