[InstCombine] Handle PHI nodes in PtrReplacer
ClosedPublic

Authored by gandhi21299 on Oct 18 2022, 1:38 PM.

Details

Summary

This patch builds on the functionality implemented
in rG42ab5dc5a5dd6c79476104bdc921afa2a18559cf,
which added PHI node support to the use-def traversal
that determines whether an alloca is ever overwritten
in addition to being written by a memmove/memcpy. This patch
implements the support the PointerReplacer needs to collect
all (indirect) users of the alloca when a PHI
is involved. Finally, the replace method defines a new PHI
that takes the replaced incoming values and
updates the WorkMap accordingly.
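
As an illustration of the last sentence of the summary, here is a minimal, self-contained sketch of that replacement step. It is not the patch's code: values are modelled as plain strings, and ToyPhi, getReplacement, and replacePhi are hypothetical names introduced for this sketch. Only the name WorkMap comes from the summary.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR PHI node: a name plus {value, block} incoming pairs.
struct ToyPhi {
  std::string Name;
  std::vector<std::pair<std::string, std::string>> Incoming;
};

// Hypothetical helper: the replacement recorded for V, or V itself if none.
std::string getReplacement(const std::map<std::string, std::string> &WorkMap,
                           const std::string &V) {
  auto It = WorkMap.find(V);
  return It == WorkMap.end() ? V : It->second;
}

// Builds a new PHI from the already-replaced incoming values and records
// old -> new in WorkMap so that later users pick up the new PHI.
ToyPhi replacePhi(const ToyPhi &Old,
                  std::map<std::string, std::string> &WorkMap) {
  assert(WorkMap.count(Old.Name) == 0 && "PHI must not be replaced twice");
  ToyPhi New;
  New.Name = Old.Name + ".new";
  for (const auto &In : Old.Incoming)
    New.Incoming.push_back({getReplacement(WorkMap, In.first), In.second});
  WorkMap[Old.Name] = New.Name;
  return New;
}
```

The sketch only works if every incoming value has been replaced before the PHI itself, which is the ordering requirement discussed later in the review.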

Diff Detail

Unit Tests: Failed

Time    Test
60,820 ms    x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/non-overloaded::vloxseg.c
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/16/include -nostdsysteminc -triple riscv64 -target-feature +v -target-feature +zfh -target-feature +experimental-zvfh -disable-O0-optnone -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/non-overloaded/vloxseg.c -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/opt -S -passes=mem2reg | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --check-prefix=CHECK-RV64 /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/non-overloaded/vloxseg.c
61,010 ms    x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/non-overloaded::vluxseg.c
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/16/include -nostdsysteminc -triple riscv64 -target-feature +v -target-feature +zfh -target-feature +experimental-zvfh -disable-O0-optnone -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/non-overloaded/vluxseg.c -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/opt -S -passes=mem2reg | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --check-prefix=CHECK-RV64 /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/non-overloaded/vluxseg.c
60,940 ms    x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded::vloxseg.c
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/16/include -nostdsysteminc -triple riscv64 -target-feature +v -target-feature +zfh -target-feature +experimental-zvfh -disable-O0-optnone -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded/vloxseg.c -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/opt -S -passes=mem2reg | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --check-prefix=CHECK-RV64 /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded/vloxseg.c
61,210 ms    x64 debian > Clang.CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded::vluxseg.c
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/16/include -nostdsysteminc -triple riscv64 -target-feature +v -target-feature +zfh -target-feature +experimental-zvfh -disable-O0-optnone -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded/vluxseg.c -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/opt -S -passes=mem2reg | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --check-prefix=CHECK-RV64 /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded/vluxseg.c
50 ms    x64 debian > LLVM.CodeGen/LoongArch::O0-pipeline.ll
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc --mtriple=loongarch32 -O0 --debug-pass=Structure /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/LoongArch/O0-pipeline.ll -o /dev/null 2>&1 | grep -v "Verify generated machine code" | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
View Full Test Results (6 Failed)

Event Timeline

There are a very large number of changes, so older changes are hidden.
nikic added inline comments. Dec 8 2022, 6:44 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
75

Why do we care about default address space here? I think what you probably want to check is PHI->getType()->getPointerAddressSpace() != TheCopy->getPointerAddressSpace()? But also, this code assumes that TheCopy is available before the PHI is seen, and I don't think anything guarantees that -- order of uses is unpredictable. I think you'd want to perform the bailout in PointerReplacer::collectUsers() instead? Maybe I'm misunderstanding the purpose here.

nikic added inline comments. Dec 8 2022, 7:06 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
67

Why do we need the LastPHI variable? I'd have expected that just setting IsOffset=true is equivalent? (With a rename IsOffset -> ForbidCopy or something?)

gandhi21299 updated this revision to Diff 481948. Edited Dec 11 2022, 1:29 PM
gandhi21299 marked an inline comment as done.
  • Eliminated LastPHI as it had no valuable impact
  • Removed the addrspace cast checks in isOnlyCopiedFromConstantMemory(..) as pointed out. Instead, while collecting users of an alloca in PointerReplacer::collectUsers(..), determine whether the alloca is used directly as an incoming value of a PHI. If it is, and the address spaces of the PHI and the destination of the Copy inst differ, replacing the PHI would require an addrspace cast. Since we do not allow addrspace casts, reject the PHI at that point.
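
The bail-out described in the second bullet can be sketched roughly as below. This is a standalone toy model and not the code from the diff: ToyValue, ToyPhi, and mustRejectPhi are hypothetical names, and address spaces are reduced to plain unsigned numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy value that only carries a pointer address space.
struct ToyValue {
  unsigned AddrSpace;
};

// Toy PHI: its own address space plus the values flowing into it.
struct ToyPhi {
  unsigned AddrSpace;
  std::vector<const ToyValue *> Incoming;
};

// Returns true when the PHI has to be rejected: the alloca is a direct
// incoming value and the PHI's address space differs from the copy's, so
// replacing the PHI would need an addrspace cast.
bool mustRejectPhi(const ToyPhi &Phi, const ToyValue &Alloca,
                   unsigned CopyAddrSpace) {
  assert(!Phi.Incoming.empty() && "a PHI has at least one incoming value");
  bool UsesAllocaDirectly =
      std::find(Phi.Incoming.begin(), Phi.Incoming.end(), &Alloca) !=
      Phi.Incoming.end();
  return UsesAllocaDirectly && Phi.AddrSpace != CopyAddrSpace;
}
```

The address space is typed as unsigned rather than auto, in line with the style preference voiced later in the thread.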
gandhi21299 marked 4 inline comments as done. Dec 11 2022, 1:33 PM
gandhi21299 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
75

I implemented it for the function @addrspace_diff_keep_alloca in the test file. I improved upon the solution by moving the changes over to collectUsers() which does make a lot more sense wrt my previous solution. I hope it seems fine now.

gandhi21299 marked an inline comment as done.

removed dbg statements

arsenm added inline comments. Dec 13 2022, 3:17 PM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
130

I think I'm getting lost in the different revisions of the patch; not sure what this comment is referring to anymore. There isn't anything about phi handling right here?

300

Don't like auto for addrspace

Use unsigned instead of auto

gandhi21299 marked 2 inline comments as done. Dec 13 2022, 11:19 PM
gandhi21299 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
130

In a previous revision I was tracking PHI nodes separately, as opposed to using IsOffset. I replaced that with my changes in PointerReplacer::collectUsers(..).

gandhi21299 marked an inline comment as done.

Allow PHI nodes only if the number of incoming values is less than 5

  • PHIs need to be tracked to avoid inserting them repeatedly into ValuesToInspect.
  • Applied clang-format and rebased
  • Insert a PHI into ValuesToInspect only when it hasn't been seen before. Otherwise, continue with the use-def traversal.
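
The bookkeeping in the bullets above can be pictured with a small worklist sketch. It is a toy model under stated assumptions, not the patch itself: values are strings, UseGraph, SeenPhis, and inspectUsers are hypothetical names, and only ValuesToInspect is borrowed from the comment. The point is that a loop-carried PHI is queued once, so the traversal terminates on a cyclic use graph.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy use graph: each value maps to the values that use it.
using UseGraph = std::map<std::string, std::vector<std::string>>;

// Visits all transitive users of Root. A PHI is pushed onto ValuesToInspect
// only the first time it is seen; later encounters are skipped.
std::vector<std::string> inspectUsers(const UseGraph &Users,
                                      const std::set<std::string> &Phis,
                                      const std::string &Root) {
  std::vector<std::string> ValuesToInspect{Root};
  std::set<std::string> SeenPhis;
  std::vector<std::string> Inspected;
  while (!ValuesToInspect.empty()) {
    std::string V = ValuesToInspect.back();
    ValuesToInspect.pop_back();
    Inspected.push_back(V);
    auto It = Users.find(V);
    assert(It != Users.end() && "every value needs an entry in the toy graph");
    for (const std::string &U : It->second) {
      if (Phis.count(U) != 0 && !SeenPhis.insert(U).second)
        continue; // PHI already queued once: keep traversing other users.
      ValuesToInspect.push_back(U);
    }
  }
  return Inspected;
}
```

Without the SeenPhis check, a graph in which a GEP feeds back into the PHI that it uses would keep re-queuing the PHI indefinitely.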

@nikic Thanks for committing on my behalf, may I close this revision?

@gandhi21299 I only landed the basic isOnlyCopiedFromConstantMemory() support, but the PointerReplacer part is still open. I think the logic there is still not quite correct -- will take another look tomorrow.

nikic requested changes to this revision. Jan 12 2023, 5:32 AM

Rebased patch over committed changes: https://gist.github.com/nikic/ce69a66fdc0985d1af7a56bf024bc860

With additional tests added (in particular @addrspace_diff_keep_alloca_extra_gep) the patch causes an assertion failure.

You are currently going over phi operands and checking whether one of them is the memcpy dest, but I'm not sure what this check is intended to achieve, semantically. I think it just works by accident on existing test cases.

The problem is that we need a closed graph to perform the pointer replacement (can't have any external inputs) and additionally the worklist also needs to be ordered correctly, so that inputs are always replaced before the using instruction. The former is a hard requirement for this to work, the latter is a requirement of the current algorithm.

If you don't want to change the current replacement approach, then I think a good way to enforce this would be to do the following: When processing a phi node, check whether all phi operands are already in the worklist. If not, don't visit the phi yet (wait until everything is in the worklist). Add the phi to some extra set, and after collectUsers(), check that all phis are in the worklist. This is to handle the case where an input was not part of the graph and the phi is never added to the worklist.
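
One way to read the suggestion above is the following standalone sketch. It is a toy model and not the code that eventually landed: instructions are strings, ToyInst, ToyFunction, and ToyCollector are hypothetical, and the member names merely echo identifiers mentioned elsewhere in this thread (Worklist, ValuesToRevisit, collectUsers, collectUsersRecursive). A PHI is deferred until all of its incoming values are in the worklist, and the final check fails if any deferred PHI never made it in, which indicates an input from outside the graph.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy instruction; for a PHI, Operands holds its incoming values.
struct ToyInst {
  bool IsPhi;
  std::vector<std::string> Operands;
  std::vector<std::string> Users;
};
using ToyFunction = std::map<std::string, ToyInst>;

class ToyCollector {
public:
  explicit ToyCollector(const ToyFunction &Fn) : F(Fn) {}

  // True only if every deferred PHI ended up in the worklist, i.e. the
  // collected graph is closed (no external inputs).
  bool collectUsers(const std::string &Root) {
    assert(F.count(Root) != 0 && "root must be a known instruction");
    collectUsersRecursive(Root);
    return std::all_of(
        ValuesToRevisit.begin(), ValuesToRevisit.end(),
        [this](const std::string &Phi) { return InWorklist.count(Phi) != 0; });
  }

  const std::vector<std::string> &worklist() const { return Worklist; }

private:
  void collectUsersRecursive(const std::string &V) {
    for (const std::string &U : F.at(V).Users) {
      if (InWorklist.count(U) != 0)
        continue;
      const ToyInst &Inst = F.at(U);
      if (Inst.IsPhi) {
        bool MissingInput = std::any_of(
            Inst.Operands.begin(), Inst.Operands.end(),
            [this](const std::string &Op) { return InWorklist.count(Op) == 0; });
        if (MissingInput) {
          ValuesToRevisit.insert(U); // Defer: not all inputs collected yet.
          continue;
        }
      }
      Worklist.push_back(U); // Inputs always precede their users.
      InWorklist.insert(U);
      collectUsersRecursive(U);
    }
  }

  const ToyFunction &F;
  std::vector<std::string> Worklist;
  std::set<std::string> InWorklist;
  std::set<std::string> ValuesToRevisit;
};
```

In this model a PHI fed by two GEPs of the root is first deferred when reached through the first GEP and then added when reached through the second, so the worklist stays ordered. A PHI that also takes a value unrelated to the root stays deferred, and collectUsers reports failure.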

This revision now requires changes to proceed. Jan 12 2023, 5:32 AM
  • Revised PointerReplacer::collectUsers(..) as requested.
  • Passes all tests under check-llvm-transforms, including replace-alloca-phi.ll
  • Rebased and removed an irrelevant comment.
nikic added inline comments. Jan 13 2023, 1:39 PM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
320

I don't think you need this check (and as such this whole loop can be dropped).

338

I think this should only be done after the whole collectUsers() phase, currently you do it on each recursive call.

gandhi21299 marked 2 inline comments as done.
  • Eliminated the redundant loop as pointed out by reviewer
  • Created a new method that drives the recursive algorithm for collecting users; this was necessary to avoid making the member variables of PointerReplacer public.
  • clang-format
gandhi21299 retitled this revision from [InstCombine] Handle PHI nodes when eliminating constant memcpy to [InstCombine] Handle PHI nodes in PtrReplacer. Jan 13 2023, 2:29 PM
gandhi21299 edited the summary of this revision.
nikic added a comment. Jan 13 2023, 2:38 PM

Logic looks correct to me now. Looks like the test changes got lost though?

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
15

Shouldn't be needed (already used in file).

263

Copy argument no longer needed

286

Could make ValuesToRevisit use Instruction * rather than Value * and save this cast.

313

Can use any_of instead of for here and avoid the ValueInserted variable.
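
For readers unfamiliar with the idiom, the sketch below contrasts a flag-setting loop with an any_of call. It is illustrative only: the function names and the int-based worklist are hypothetical, and the predicate is a guess at the kind of check involved, not the code under review. It uses std::any_of from the standard library; LLVM's own any_of in STLExtras.h accepts the range directly.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Loop form: a flag records whether some incoming value is not yet collected.
bool hasMissingInputLoop(const std::vector<int> &Incoming,
                         const std::set<int> &Worklist) {
  bool Missing = false;
  for (int V : Incoming) {
    if (Worklist.count(V) == 0) {
      Missing = true;
      break;
    }
  }
  return Missing;
}

// any_of form: same result, without the extra flag variable.
bool hasMissingInput(const std::vector<int> &Incoming,
                     const std::set<int> &Worklist) {
  bool Result =
      std::any_of(Incoming.begin(), Incoming.end(),
                  [&Worklist](int V) { return Worklist.count(V) == 0; });
  assert(Result == hasMissingInputLoop(Incoming, Worklist) &&
         "both forms must agree");
  return Result;
}
```

Because any_of stops at the first match, its predicate is best kept free of side effects, with any bookkeeping done after the call returns.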

gandhi21299 marked 3 inline comments as done. Jan 13 2023, 3:04 PM

Logic looks correct to me now. Looks like the test changes got lost though?

I rebased the changes and there weren't any test changes in the reduced version of this patch.

gandhi21299 marked an inline comment as done. Jan 13 2023, 9:30 PM
  • Changes as requested by reviewer, updated tests accordingly
nikic added inline comments. Jan 14 2023, 12:29 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
300

Drop llvm:: prefix.

307

Missing the insertion into ValuesToRevisit here?

  • Replaced calls from collectUsers to collectUsersRecursive, which now generates the code we expect for @loop_phi_remove_alloca and @loop_phi_late_memtransfer_remove_alloca.
gandhi21299 marked 2 inline comments as done. Jan 14 2023, 11:23 AM
  • removed #include "STLExtras.h"
nikic added inline comments. Jan 16 2023, 1:17 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
68

Spurious change.

307–313

The ValuesToRevisit update should happen outside the any_of.

458

Spurious change.

llvm/test/Transforms/InstCombine/replace-alloca-phi.ll
11

Broken check line (should be CHECK).

gandhi21299 marked 5 inline comments as done.
  • changes as suggested by reviewer
nikic accepted this revision. Jan 17 2023, 3:59 AM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
299

Spurious change

307

Unneeded Inst capture.

This revision is now accepted and ready to land. Jan 17 2023, 3:59 AM
gandhi21299 marked 2 inline comments as done.
  • requested updates

Thanks a bunch for such an in-depth review, I will commit this patch.

This revision was landed with ongoing or failed builds. Jan 17 2023, 9:56 AM
This revision was automatically updated to reflect the committed changes.