This is an archive of the discontinued LLVM Phabricator instance.

Fix a missed opportunity to optimize consecutive stores.
ClosedPublic

Authored by nadav on Jan 9 2022, 9:55 AM.

Details

Summary

This commit fixes a missed opportunity in merging consecutive stores. The code that searches for stores skipped the case of stores that directly connect to the root. The comment above the implementation lists this case but the code did not handle it. I found this pattern when looking into the shared_ptr destructor. GCC generates the right sequence. Here is a small repo:

int foo(int* buff) {
    buff[0] = 0;
    int x = buff[1];
    buff[1] = 0;
    return x;
}

Diff Detail

Event Timeline

nadav created this revision.Jan 9 2022, 9:55 AM
nadav requested review of this revision.Jan 9 2022, 9:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 9 2022, 9:55 AM
nadav edited the summary of this revision. (Show Details)Jan 9 2022, 9:57 AM
nadav added reviewers: wenlei, MatzeB, hoy.
nadav updated this revision to Diff 398504.Jan 9 2022, 9:23 PM
hoy added inline comments.Jan 10 2022, 12:35 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17497

Does it matter if Store3 interferes with one of the loads? Wondering if it is dealt with somewhere, otherwise merging store3 with the stores following the loads might cause be a problem.

hoy added a comment.Jan 10 2022, 12:37 AM

This is a good catch. Thanks for the fix and adding a test for it.

nadav updated this revision to Diff 398659.Jan 10 2022, 8:50 AM

@hoy Thank you for the review. Yes, the code in TryToAddCandidate and CandidateMatch checks that all of the stores have the same base. I added a second test where the load aliases and prevents the store merging.

nadav marked an inline comment as done.Jan 10 2022, 8:50 AM

Added the test merge_store_alias that checks that aliasing loads prevent the merging of the store operations.

hoy accepted this revision.Jan 10 2022, 9:54 AM

LGTM.

This revision is now accepted and ready to land.Jan 10 2022, 9:54 AM
This revision was automatically updated to reflect the committed changes.

Hi,

Anyone seen problems with this patch?
I have a case for my out-of-tree target where this patch causes two stores to be merged leading to a cycle in the DAG.
So in the initial DAG we have the following dependencies

store2 <- load1 <- load2 <- load3 <- store1

and then store2 and store1 are merged to store21 and we end up with dependencies like this:

store12 <- load1 <- load2 <- load3
                          <- store12

ie. load2 depends on both load3 and store12 so we have a broken DAG with a cycle in it.
So then we end up with

Operand not processed?0x882a240
t9: i16,ch = load<(load (s16) from %ir.next1.i)> t26, t8, undef:i16

UNREACHABLE executed at ../lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:503!

I haven't been able to reproduce this for any in-tree targets yet so it might very well be something we've broken for our target but I thought I'd mention here that we see problems in case anyone else does too.

bjope added a subscriber: bjope.Jan 26 2022, 1:28 AM
bjope added inline comments.Jan 26 2022, 1:33 AM
llvm/test/CodeGen/Hexagon/store-widen-aliased-load.ll
1

Why are you disabling the combiner-store-merging in this test case? Looks like a typo?

Reading the comment history in this ticket I got a feeling that this test case was added to verify that no store merge happened in this case. But then it seems really weird to verify that in a test case were the store merging is disabled.

Maybe there should be two RUN-lines, expecting the same result regardless if combiner-store-merging is true or false?

bjope added inline comments.Jan 26 2022, 1:36 AM
llvm/test/CodeGen/Hexagon/store-widen-aliased-load.ll
1

Aw, sorry, I now realize that the test case you added for the alias situation was in test/CodeGen/X86/MergeConsecutiveStores.ll.

bjope added a comment.Feb 3 2022, 7:13 AM

Hi,

Anyone seen problems with this patch?
I have a case for my out-of-tree target where this patch causes two stores to be merged leading to a cycle in the DAG.
So in the initial DAG we have the following dependencies

store2 <- load1 <- load2 <- load3 <- store1

and then store2 and store1 are merged to store21 and we end up with dependencies like this:

store12 <- load1 <- load2 <- load3
                          <- store12

ie. load2 depends on both load3 and store12 so we have a broken DAG with a cycle in it.
So then we end up with

Operand not processed?0x882a240
t9: i16,ch = load<(load (s16) from %ir.next1.i)> t26, t8, undef:i16

UNREACHABLE executed at ../lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:503!

I haven't been able to reproduce this for any in-tree targets yet so it might very well be something we've broken for our target but I thought I'd mention here that we see problems in case anyone else does too.

Here is a reproducer using aarch64 as target:

; RUN: llc -mtriple aarch64-- -o /dev/null %s | FileCheck %s

%str0 = type { i64, i64 }
%str1 = type { i64, %str1* }

@g0 = external global %str0, align 1

define i64 @foo() {
entry:
  %sp0 = getelementptr inbounds %str0, %str0* @g0, i32 0, i32 0
  %sp1 = getelementptr inbounds %str0, %str0* @g0, i32 0, i32 1
  store i64 0, i64* %sp1, align 1, !tbaa !1
  %l0 = load %str1*, %str1** undef, align 1
  %lp0 = getelementptr inbounds %str1, %str1* %l0, i32 0, i32 1
  %l1 = load %str1*, %str1** %lp0, align 1
  %lp1 = getelementptr inbounds %str1, %str1* %l1, i32 0, i32 0
  %l2 = load i64, i64* %lp1, align 1, !tbaa !7
  store i64 0, i64* %sp0, align 1
  ret i64 %l2
}

!llvm.ident = !{!0}

!0 = !{!"clang version 14.0.0.prerel"}
!1 = !{!2, !6, i16 1}
!2 = !{!"dinges", !3, i16 0, !6, i16 1}
!3 = !{!"int", !4, i16 0}
!4 = !{!"omnipotent char", !5, i16 0}
!5 = !{!"Simple C/C++ TBAA"}
!6 = !{!"any pointer", !4, i16 0}
!7 = !{!2, !3, i16 0}

As mentioned here https://discourse.llvm.org/t/dagcombiner-chains-and-mergeconsecutivestores/59623 , maybe the fault isn't exactly related to this patch (even if it was exposed by this patch) but rather that checkMergeStoreCandidatesForDependencies need to analyse the chain operands as well.

nadav added a comment.Feb 3 2022, 9:29 AM

Hi Björn. Thank you for looking into this bug and reproducing the problem on aarch64. It sounds like you found the underlying issue in checkMergeStoreCandidatesForDependencies. Do you need help with anything?

bjope added a comment.Feb 3 2022, 1:22 PM

Hi Björn. Thank you for looking into this bug and reproducing the problem on aarch64. It sounds like you found the underlying issue in checkMergeStoreCandidatesForDependencies. Do you need help with anything?

Well, I proposed a fix here: D118943
I haven't really thought much about alternative solutions.