On ARM, there are multiple versions of each of the intrinsics, with acquire/relaxed/release barrier semantics.
The 64-bit versions that were so far within "ifdef x86_64" are also necessary for ARM. I think the reason they were in ifdefs is that winnt.h already provides inline versions of them, within ifdef _M_IX86.
The newly added ones are provided as inline functions here instead of builtins, since they should only be available on certain archs (arm/aarch64).
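
As a rough illustration of the inline-function approach, the sketch below maps three barrier variants of an exchange onto the GCC/Clang `__atomic_exchange_n` builtin. The `sketch_` names are hypothetical and chosen so they do not collide with real intrinsics; the suffixes follow the MSVC convention (`_acq`, `_nf` for "no fence", `_rel`). The arch guard is left out so the sketch compiles on any target, and this is not necessarily how the patch itself implements them.

```c
/* Hypothetical names; a real header would guard these with something like
   #if defined(__arm__) || defined(__aarch64__). */

/* Acquire: later memory accesses cannot be reordered before the exchange. */
static inline long sketch_exchange_acq(long volatile *target, long value) {
    return __atomic_exchange_n(target, value, __ATOMIC_ACQUIRE);
}

/* No fence (relaxed): atomic, but imposes no ordering on other accesses. */
static inline long sketch_exchange_nf(long volatile *target, long value) {
    return __atomic_exchange_n(target, value, __ATOMIC_RELAXED);
}

/* Release: earlier memory accesses cannot be reordered after the exchange. */
static inline long sketch_exchange_rel(long volatile *target, long value) {
    return __atomic_exchange_n(target, value, __ATOMIC_RELEASE);
}
```

Each variant returns the value previously stored at `target`, so only the ordering guarantee differs between them.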
This is necessary in order to compile C++ code for ARM in MSVC mode.
Perhaps we should add static asserts that _BitPos is within limits for the signed shift.
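
To make the shift concern concrete, here is a minimal sketch of a bit-test-and-set with acquire semantics. Since `_BitPos` is a runtime argument, this version uses a runtime `assert` for the range check and builds the mask with an unsigned shift, which avoids shifting into the sign bit. The function name is hypothetical, and whether out-of-range positions should be rejected at all is an assumption of the sketch, not something the patch decides.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: atomically sets bit `bit_pos` in *base and returns
   the previous value of that bit (0 or 1). */
static inline unsigned char sketch_bittestandset_acq(long volatile *base,
                                                     long bit_pos) {
    /* Assumed limit: the position must address a bit inside one long. */
    assert(bit_pos >= 0 && bit_pos < (long)(sizeof(long) * CHAR_BIT));

    /* Unsigned shift, so setting the top bit is well defined. */
    unsigned long mask = 1UL << bit_pos;
    unsigned long old = __atomic_fetch_or((unsigned long volatile *)base,
                                          mask, __ATOMIC_ACQUIRE);
    return (unsigned char)((old >> bit_pos) & 1UL);
}
```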