This is an archive of the discontinued LLVM Phabricator instance.

[INDVARS]
ClosedPublic

Authored by zinovy.nis on Jul 28 2014, 5:42 AM.

Details

Summary

This patch extends induction variable widening to "sub nsw" and "mul nsw" instructions. Currently only "add nsw" instructions are widened.
This patch eliminates many "sext" instructions in 64-bit code (and the corresponding target code) in cases like:

int N = 100;
float **A;

void foo(int x0, int x1)
{
        float * A_cur = &A[0][0];
        float * A_next = &A[1][0];
        for(int x = x0; x < x1; ++x)
        {
          // Currently only the [x + N] case is widened. The other 2 cases lead to sext.
          // This patch fixes that, so none of the 3 cases needs a sext.
          const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
          A_next[x] = div;
        }
}
...
> clang++ test.cpp -march=core-avx2 -Ofast  -fno-unroll-loops -fno-tree-vectorize -S -o -

(with my patch)

.LBB0_2:                                # %for.body
                                        # =>This Inner Loop Header: Depth=1
        vmovss  (%rdi,%rcx,4), %xmm0
        vaddss  (%rdx,%rcx,4), %xmm0, %xmm0
        vaddss  (%rax), %xmm0, %xmm0
        vmovss  %xmm0, (%r8,%rcx,4)
        incq    %rcx
        addq    %r9, %rax
        cmpl    %esi, %ecx
        jl      .LBB0_2

vs trunk:

.LBB0_2:                                # %for.body
                                        # =>This Inner Loop Header: Depth=1
        vmovss  (%r10,%rcx,4), %xmm0
        leal    (%r11,%rcx), %edx
        movslq  %edx, %rdx
        vaddss  (%rax,%rdx,4), %xmm0, %xmm0
        movslq  %edi, %rdi
        vaddss  (%rax,%rdi,4), %xmm0, %xmm0
        vmovss  %xmm0, (%r8,%rcx,4)
        incq    %rcx
        addl    %r9d, %edi
        cmpl    %esi, %ecx
        jl      .LBB0_2
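
The reason the movslq instructions can go away is that, for an operation marked nsw, sign-extending the 32-bit result gives the same value as performing the operation on sign-extended 64-bit operands. The following is a minimal sketch of that equivalence, not code from the patch; the names Op, narrowThenExtend and extendThenWide are hypothetical, and the inputs are assumed not to overflow in 32 bits (which is what nsw promises).

```cpp
#include <cstdint>

// Hypothetical names for illustration only; nothing here is taken from
// IndVarSimplify.cpp.
enum class Op { Add, Sub, Mul };

// Narrow form: compute in 32 bits, then sign-extend the result.
// Assumption: the 32-bit operation does not overflow (the nsw guarantee).
int64_t narrowThenExtend(Op op, int32_t a, int32_t b) {
  int32_t r = 0;
  switch (op) {
  case Op::Add: r = a + b; break;
  case Op::Sub: r = a - b; break;
  case Op::Mul: r = a * b; break;
  }
  return static_cast<int64_t>(r); // plays the role of sext i32 -> i64
}

// Widened form: sign-extend the operands once, then compute in 64 bits.
int64_t extendThenWide(Op op, int32_t a, int32_t b) {
  const int64_t wa = a;
  const int64_t wb = b;
  switch (op) {
  case Op::Add: return wa + wb;
  case Op::Sub: return wa - wb;
  case Op::Mul: return wa * wb;
  }
  return 0; // not reached for the three opcodes above
}
```

For any non-overflowing inputs the two functions agree, so in the loop above the extension can be applied once to the induction variable instead of after every x - N and x * N.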

Event Timeline

zinovy.nis updated this revision to Diff 11943. Jul 28 2014, 5:42 AM
zinovy.nis added reviewers: atrick, rafael.
zinovy.nis set the repository for this revision to rL LLVM.
zinovy.nis updated this object. Jul 28 2014, 5:45 AM

Gentle ping #2.

hfinkel accepted this revision. Aug 20 2014, 9:40 AM
hfinkel added a reviewer: hfinkel.
hfinkel added a subscriber: hfinkel.

LGTM. Thanks!

lib/Transforms/Scalar/IndVarSimplify.cpp, line 847

I recommend using llvm_unreachable here. The caller can't handle a null return value, and this function should never be called with any other opcode.
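
A minimal sketch of the suggested pattern, not the actual patch code. It assumes a hypothetical Opcode enum and function name, and uses a local stand-in (unreachableStandIn) in place of llvm_unreachable so that it builds without LLVM headers.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical opcode list for illustration; not taken from IndVarSimplify.cpp.
enum class Opcode { Add, Sub, Mul, Shl };

// Local stand-in for llvm_unreachable (an assumption made so the sketch is
// self-contained): report the message and abort instead of returning.
[[noreturn]] inline void unreachableStandIn(const char *msg) {
  std::fprintf(stderr, "UNREACHABLE: %s\n", msg);
  std::abort();
}

// Handles only the opcodes the caller is expected to pass. Any other opcode
// fails loudly here rather than producing a value the caller cannot handle.
long long applyWideOp(Opcode opc, long long lhs, long long rhs) {
  switch (opc) {
  case Opcode::Add: return lhs + rhs;
  case Opcode::Sub: return lhs - rhs;
  case Opcode::Mul: return lhs * rhs;
  default:
    unreachableStandIn("unsupported opcode for widening");
  }
}
```

The point of the design is that a contract violation stops at the place where it happens, instead of surfacing later as a null dereference in the caller.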

This revision is now accepted and ready to land. Aug 20 2014, 9:40 AM
atrick accepted this revision. Aug 20 2014, 9:40 AM
atrick edited edge metadata.

Zinovy, this is great! I'm sorry I didn't review it right away. Thanks for the ping.
LGTM.

zinovy.nis closed this revision. Aug 21 2014, 1:35 AM
zinovy.nis updated this revision to Diff 12749.

Closed by commit rL216160 (authored by @zinovy.nis).