This adds naive instruction selection for the @llvm.aarch64.ldaxr intrinsic.
It selects the right instruction based on the number of bytes being loaded, but it doesn't try to fold any neighbouring instructions into the load. This isn't great for code size, but it at least prevents us from falling back when we encounter this intrinsic.
(Since the instructions we're not folding have already been selected by the time we run into the intrinsic, it's difficult to do this the "right" way during selection.)
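As a rough illustration of the size-based choice described above, here is a minimal standalone sketch. It is not the code from this patch: the helper name ldaxrOpcodeNameForSize is hypothetical, and it returns the AArch64 opcode names as plain strings rather than using LLVM's own types.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (an assumption for illustration, not the selector code
// in this patch): map the number of bytes being loaded to the name of the
// matching load-acquire exclusive opcode. An empty string means no
// single-register form exists for that size.
std::string ldaxrOpcodeNameForSize(std::uint64_t NumBytes) {
  switch (NumBytes) {
  case 1:
    return "LDAXRB"; // byte load
  case 2:
    return "LDAXRH"; // halfword load
  case 4:
    return "LDAXRW"; // 32-bit load
  case 8:
    return "LDAXRX"; // 64-bit load
  default:
    return ""; // unsupported size; a real selector would bail out here
  }
}
```

A real selector would presumably take the size from the load's memory operand and emit the instruction directly; the lookup above only shows the mapping itself.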
Add a test for the intrinsic (select-ldaxr-intrin.mir) and update arm64-ldxr-stxr.ll as well.