Use pointee type of a MemTransferInst's source value,
instead of an integer type, when replacing it with
a load/store pair.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/Transforms/InstCombine/memcpy_alloca.ll | ||
---|---|---|
83 | Why is the cast necessary for the test? |
llvm/test/Transforms/InstCombine/memcpy_alloca.ll | ||
---|---|---|
83 | InstCombine reduces the function down to ret void when I remove the cast. |
There are regressions in tests. To avoid them you probably need to restrict this to the case where the alloca type is a simple type rather than an aggregate. Currently you end up introducing a load of [2 x i32] type in the swap example, which is something we're not good at handling.
Could you explain what the larger context for this change is, possibly with a phase ordering test? We will not be able to reliably "guess" the type here, and I'm not sure special-casing the alloca case makes sense. The general expectation is that if the chosen type is "incorrect", we'll optimize away introduced bitcasts when that becomes evident.
The long term plan here is that this will make use of the byte type, because other choices are generally incorrect due to either poison-propagation or provenance issues. The introduction of the byte type is somewhat stuck right now though.
Also, there are a number of correctness issues in your implementation: a) You don't check that the alloca type has the same size as the memcpy, so you might end up copying e.g. more bytes here. b) The alloca type might have interior padding. E.g. a load/store of { i8, i32 } is not the same as a memcpy of 8 bytes.
Agreed with @nikic, this patch can't be right. The type of the alloca is not meaningful. Alas, we could replace it with a single bytes parameter in the future.
Lowering memcpys to load/store is tricky business. Let's not add more such optimizations until there's a proper solution implemented.
The general way to handle this is to support mismatched types in relevant optimizations by inserting bitcasts. See D147348 for an example of how to do that.
Thanks everyone for feedback to this review. I implemented a different approach in D150900, which is rather safer than this approach. I will abandon this revision as a result.
Why is the cast necessary for the test?