This patch adds on to the functionality implemented
where PHI nodes are supported in the use-def traversal
algorithm to determine if an alloca ever overwritten
in addition to a memmove/memcpy. This patch implements
the support needed by the PointerReplacer to collect
all (indirect) users of the alloca in the case where a PHI
is involved. Finally, a new PHI is defined in the replace
method which takes in replaced incoming values and
updates the WorkMap accordingly.
This patch adds on to the functionality implemented
|150 ms||x64 debian > LLVM.Transforms/InstCombine::replace-alloca-phi.ll|
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/opt -passes=instcombine -S -o - /var/lib/buildkite-agent/builds/llvm-project/llvm/test/Transforms/InstCombine/replace-alloca-phi.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/Transforms/InstCombine/replace-alloca-phi.ll
|4,720 ms||x64 debian > Profile-x86_64.Linux::counter_promo_for.c|
Script: -- : 'RUN: at line 1'; rm -fr /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/profile/Profile-x86_64/Linux/Output/counter_promo_for.c.tmp.promo.prof
|4,470 ms||x64 debian > Profile-x86_64.Linux::counter_promo_while.c|
Script: -- : 'RUN: at line 1'; rm -fr /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/profile/Profile-x86_64/Linux/Output/counter_promo_while.c.tmp.promo.prof
|60,190 ms||x64 debian > SanitizerCommon-Unit._/Sanitizer-x86_64-Test/36::48|
Script(shard): -- GTEST_OUTPUT=json:/var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/lib/sanitizer_common/tests/./Sanitizer-x86_64-Test-SanitizerCommon-Unit-4045684-36-48.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=48 GTEST_SHARD_INDEX=36 /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/lib/sanitizer_common/tests/./Sanitizer-x86_64-Test
|60,100 ms||x64 debian > SanitizerCommon-msan-x86_64-Linux.Linux::soft_rss_limit_mb_test.cpp|
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -gline-tables-only -fsanitize=memory -m64 -funwind-tables -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test -ldl -O2 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Linux/soft_rss_limit_mb_test.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/sanitizer_common/msan-x86_64-Linux/Linux/Output/soft_rss_limit_mb_test.cpp.tmp
|View Full Test Results (13 Failed)|
Why do we care about default address space here? I think what you probably want to check is PHI->getType()->getPointerAddressSpace() != TheCopy->getPointerAddressSpace()? But also, this code assumes that TheCopy is available before the PHI is seen, and I don't think anything guarantees that -- order of uses is unpredictable. I think you'd want to perform the bailout in PointerReplacer::collectUsers() instead? Maybe I'm misunderstanding the purpose here.
Why do we need the LastPHI variable? I'd have expected that just setting IsOffset=true is equivalent? (With a rename IsOffset -> ForbidCopy or something?)
- Eliminated LastPHI as it had no valuable impact
- Removed the addrspace cast checks in isOnlyCopiedFromConstantMemory(..) as pointed out. Instead, determine if the alloca is directly used as an incoming value in a PHI while collecting users of an alloca in PointerReplacer::collectUsers(..). If the address spaces of the PHI and the destination of the Copy inst differ and there is a direct use of the alloca, an addrspace cast will be required to further allow the PHI for replacement. Since we do not allow for addrspace casts, reject the PHI at that moment.
I implemented it for the function @addrspace_diff_keep_alloca in the test file. I improved upon the solution by moving the changes over to collectUsers() which does make a lot more sense wrt my previous solution. I hope it seems fine now.
I had a revision previously where I was tracking PHI nodes seperately, as opposed to IsOffset. I replaced it with my changes in PointerReplacer::collectUsers(..).
- PHI's need to be kept track of to avoid inserting them repeatedly into ValuesToInspect.
- Insert PHI into ValuesToInspect only when it hasn't seen before. Otherwise, continue with the use-def traversal.
@gandhi21299 I only landed the basic isOnlyCopiedFromConstantMemory() support, but the PointerReplacer part is still open. I'm think the logic there is still not quite correct -- will take another look tomorrow.
Rebased patch over committed changes: https://gist.github.com/nikic/ce69a66fdc0985d1af7a56bf024bc860
With additional tests added (in particular @addrspace_diff_keep_alloca_extra_gep) the patch causes an assertion failure.
You are currently going over phi operands and checking whether one of them is the memcpy dest, but I'm not sure what this check is intended to achieve, semantically. I think it just works by accident on existing test cases.
The problem is that we need a closed graph to perform the pointer replacement (can't have any external inputs) and additionally the worklist also needs to be ordered correctly, so that inputs are always replaced before the using instruction. The former is a hard requirement for this to work, the latter is a requirement of the current algorithm.
If you don't want to change the current replacement approach, then I think a good way to enforce this would be to do the following: When processing a phi node, check whether all phi operands are already in the worklist. If not, don't visit the phi yet (wait until everything is in the worklist). Add the phi to some extra set, and after collectUsers(), check that all phis are in the worklist. This is to handle the case where an input was not part of the graph and the phi is never added to the worklist.
- Revised PointerReplacer::collectUsers(..) as requested.
- Passes all tests under check-llvm-transforms, including replace-alloca-phi.ll
- Eliminated the redundant loop as pointed out by reviewer
- Created a new method that drives the recursive algorithm for collecting users, this was necessary to avoid publicizing the member variables of PointerReplacer.
Logic looks correct to me now. Looks like the test changes got lost though?
Shouldn't be needed (already used in file).
Copy argument no longer needed
Could make ValuesToRevisit use Instruction * rather than Value * and save this cast.
Can use any_of instead of for here and avoid the ValueInserted variable.
I rebased the changes and there weren't any test changes in the reduced version of this patch.
- Replaced calls from collectUsers to collectUsersRecursive, which now generates the code we expect for @loop_phi_remove_alloca and @loop_phi_late_memtransfer_remove_alloca.
- requested updates
Thanks a bunch for such in-depth review, I will commit this patch.