Call slot optimization currently merges the metadata between the call and the load. However, we also need to merge in the metadata of the store.
Part of the reason we may have gotten away with this previously is that the load and the store are usually the same instruction (a memcpy); the problem can only arise when call slot optimization is performed on an actual load/store pair.
This addresses the issue reported in https://reviews.llvm.org/D115615#3251386.
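
For illustration only, the merge described above can be sketched with a toy model. This is not the LLVM API: `ToyInst`, `mergeNoAlias`, and `mergeForCallSlot` are hypothetical names, and the model assumes that only `!noalias`-style scope sets are tracked and that merging them means intersecting them.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical stand-in for an instruction; it models only a set of
// !noalias-style scope IDs, nothing else.
struct ToyInst {
  std::set<int> NoAlias;
};

// Conservative merge: the call keeps only the scopes that both
// instructions agree on (set intersection).
void mergeNoAlias(ToyInst &Call, const ToyInst &Other) {
  std::set<int> Common;
  std::set_intersection(Call.NoAlias.begin(), Call.NoAlias.end(),
                        Other.NoAlias.begin(), Other.NoAlias.end(),
                        std::inserter(Common, Common.begin()));
  Call.NoAlias = Common;
}

// Merge the load's metadata into the call, then the store's as well,
// unless both refer to the same instruction (the memcpy case).
void mergeForCallSlot(ToyInst &Call, const ToyInst &Load,
                      const ToyInst &Store) {
  mergeNoAlias(Call, Load);
  if (&Load != &Store)
    mergeNoAlias(Call, Store);
}
```

In the memcpy case the second merge is skipped because the load and the store are the same object; with a separate load and store, the call ends up with only the scopes common to all three.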