Currently, memcpyopt rewrites sequences like
memset(a, byte, N); memcpy(b, a, M);
to
memset(a, byte, N); memset(b, byte, M);
if M <= N. This often allows further simplifications down the line, which can drop the first memset entirely.
This patch extends this optimization to the case where M > N, provided we know that the bytes a[N..M] are undef due to alloca/lifetime.start.
This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844.
For the implementation, I'm reusing a bit of code from a similar existing optimization (direct memcpy of undef).
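As a rough illustration of the condition described above, here is a minimal C sketch. It is not the memcpyopt code: the names can_forward_memset, before and after are hypothetical, and the tail_is_undef flag stands in for whatever analysis establishes that a[N..M] is undef.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the legality check (not LLVM's actual API).
 * n: size of the memset on a, m: size of the memcpy from a to b.
 * tail_is_undef: assumed to mean that a[n..m] is known to be undef,
 * e.g. because a was just alloca'd or passed to lifetime.start. */
static bool can_forward_memset(size_t n, size_t m, bool tail_is_undef) {
    if (m <= n)
        return true;      /* existing case: every copied byte was set */
    return tail_is_undef; /* new case: the extra bytes may take any value */
}

/* Original sequence. Called here only with m <= n, so every byte
 * read by memcpy has been initialized. */
static void before(unsigned char *a, unsigned char *b, int byte,
                   size_t n, size_t m) {
    memset(a, byte, n);
    memcpy(b, a, m);
}

/* Rewritten sequence: the memcpy becomes a memset of the same byte. */
static void after(unsigned char *a, unsigned char *b, int byte,
                  size_t n, size_t m) {
    memset(a, byte, n);
    memset(b, byte, m);
}
```

For M <= N both sequences leave identical bytes in b. For M > N the rewrite only picks one particular value for bytes that were undef anyway, which is why the extra knowledge about a[N..M] is required.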
This probably makes sense, but it looks like it'll impact other transforms; would it be possible to test separately? Or does it actually not have any impact on other transforms for some reason?
Looks fine otherwise.