As discussed on https://github.com/llvm/llvm-project/issues/54682, MemorySSA currently has a bug when computing the clobber of calls that access loop-varying locations. I think a "proper" fix for this on the MemorySSA side might be non-trivial, but we can easily work around this in MemCpyOpt:
Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess() call to find a clobber of either the src or the dest location, and then refines that result into separate src and dest clobbers. This was intended as an optimization, as the location-less API is cached, while the location-aware APIs are not.
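To make the two query strategies concrete, here is a toy model, not LLVM code. It assumes a purely linear def chain with no MemoryPhis, so it cannot reproduce the loop bug itself. It also assumes the workaround is to start the location-aware walks from the call's defining access rather than from the cached location-less result. All names (ToyChain, clobberAny, clobberFor, viaLocationlessQuery, viaDefiningAccess, makeExample) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a linear MemorySSA def chain. This is NOT LLVM's API.
struct ToyChain {
  // Writes[I] is the location written by access I; access 0 is "live on entry".
  std::vector<std::string> Writes;
  // Only location-less queries are cached, keyed by the querying access.
  mutable std::unordered_map<std::size_t, std::size_t> Cache;

  // In a linear chain the defining access is simply the previous access.
  std::size_t definingAccess(std::size_t I) const {
    assert(I > 0 && I < Writes.size());
    return I - 1;
  }

  // Location-aware walk: nearest access at or above Start that writes Loc.
  std::size_t clobberFor(std::size_t Start, const std::string &Loc) const {
    std::size_t I = Start;
    while (I > 0 && Writes[I] != Loc)
      --I;
    return I;
  }

  // Location-less walk: nearest access above Call that writes Src or Dest.
  std::size_t clobberAny(std::size_t Call, const std::string &Src,
                         const std::string &Dest) const {
    auto It = Cache.find(Call);
    if (It != Cache.end())
      return It->second;
    std::size_t I = definingAccess(Call);
    while (I > 0 && Writes[I] != Src && Writes[I] != Dest)
      --I;
    Cache.emplace(Call, I);
    return I;
  }
};

struct Clobbers {
  std::size_t Src;
  std::size_t Dest;
};

// Scheme described above: location-less query first, then refine per location.
Clobbers viaLocationlessQuery(const ToyChain &C, std::size_t Call,
                              const std::string &Src, const std::string &Dest) {
  std::size_t Any = C.clobberAny(Call, Src, Dest);
  return {C.clobberFor(Any, Src), C.clobberFor(Any, Dest)};
}

// Assumed workaround: skip the location-less query, walk from the defining access.
Clobbers viaDefiningAccess(const ToyChain &C, std::size_t Call,
                           const std::string &Src, const std::string &Dest) {
  std::size_t Def = C.definingAccess(Call);
  return {C.clobberFor(Def, Src), C.clobberFor(Def, Dest)};
}

// Example chain: access 5 models a copy from "src" to "dst".
ToyChain makeExample() {
  ToyChain C;
  C.Writes = {"", "dst", "other", "src", "other", "dst"};
  return C;
}
```

In this toy chain both schemes find the same src and dest clobbers; the second one just never touches the cache.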
However, I don't think this really makes a difference in practice, because nothing is likely to use the cached clobbers on those calls later anyway. Per http://llvm-compile-time-tracker.com/compare.php?from=cd55e51516f03203f3bf632ff4a65ae7518a8319&to=3a6ff3cb24c5e2fc4b9cd70c80f96bdce6cfa405&stat=instructions the impact actually seems to be very mildly positive.
So I think this is a reasonable way to avoid the problem for now, though MemorySSA should also get a fix.