This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Simplify MemTransferInst with type inference
AbandonedPublic

Authored by gandhi21299 on May 15 2023, 10:23 AM.

Details

Summary

Use pointee type of a MemTransferInst's source value,
instead of an integer type, when replacing it with
a load/store pair.

Diff Detail

Event Timeline

gandhi21299 created this revision.May 15 2023, 10:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2023, 10:23 AM
gandhi21299 requested review of this revision.May 15 2023, 10:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2023, 10:23 AM
  • Simplified @test7 in memcpy_alloca.ll
gandhi21299 edited the summary of this revision. (Show Details)May 15 2023, 10:45 AM
gandhi21299 added reviewers: reames, skatkov, bcahoon.
arsenm added a subscriber: arsenm.May 15 2023, 11:54 AM
arsenm added inline comments.
llvm/test/Transforms/InstCombine/memcpy_alloca.ll
83

Why is the cast necessary for the test?

gandhi21299 added inline comments.May 15 2023, 11:59 AM
llvm/test/Transforms/InstCombine/memcpy_alloca.ll
83

InstCombine reduces the function down to ret void when I remove the cast.

  • Simplified test further by eliminating addrspacecast
gandhi21299 marked an inline comment as done.May 15 2023, 12:17 PM
nikic requested changes to this revision.May 16 2023, 1:09 AM
nikic added a reviewer: nlopes.

There are regressions in tests. To avoid them you probably need to restrict this to the case where the alloca type is a simple type rather than an aggregate. Currently you end up introducing a load of [2 x i32] type in the swap example, which is something we're not good at handling.

Could you explain what the larger context for this change is, possibly with a phase ordering test? We will not be able to reliably "guess" the type here, and I'm not sure special-casing the alloca case makes sense. The general expectation is that if the chosen type is "incorrect", we'll optimize away introduced bitcasts when that becomes evident.

The long term plan here is that this will make use of the byte type, because other choices are generally incorrect due to either poison-propagation or provenance issues. The introduction of the byte type is somewhat stuck right now though.

This revision now requires changes to proceed.May 16 2023, 1:09 AM
nikic added a comment.May 16 2023, 1:12 AM

Also, there are a number of correctness issues in your implementation: a) You don't check that the alloca type has the same size as the memcpy, so you might end up copying e.g. more bytes here. b) The alloca type might have interior padding. E.g. a load/store of { i8, i32 } is not the same as a memcpy of 8 bytes.

Agreed with @nikic, this patch can't be right. The type of the alloca is not meaningful. Alas, we could replace it with a single bytes parameter in the future.
Lowering memcpys to load/store is tricky business. Let's not add more such optimizations until there's a proper solution implemented.

@nikic @nlopes Is there a way to reliably infer the type of alloca then? The purpose is to enable certain optimizations like merging multiple stores into a phi, etc.

@nikic @nlopes Is there a way to reliably infer the type of alloca then? The purpose is to enable certain optimizations like merging multiple stores into a phi, etc.

The general way to handle this is to support mismatched types in relevant optimizations by inserting bitcasts. See D147348 for an example of how to do that.

Thanks everyone for feedback to this review. I implemented a different approach in D150900, which is rather safer than this approach. I will abandon this revision as a result.

gandhi21299 abandoned this revision.May 18 2023, 12:19 PM