This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put back into a vector
ClosedPublic

Authored by ABataev on Feb 13 2017, 11:48 AM.

Details

Summary

Previously the cost of the existing ExtractElement/ExtractValue
instructions was considered as a dead cost only if it was detected that
they have only one use. But these instructions may be considered
dead also if users of the instructions are also going to be vectorized,
like:

%x0 = extractelement <2 x float> %x, i32 0
%x1 = extractelement <2 x float> %x, i32 1
%x0x0 = fmul float %x0, %x0
%x1x1 = fmul float %x1, %x1
%add = fadd float %x0x0, %x1x1

This can be transformed to

%1 = fmul <2 x float> %x, %x
%2 = extractelement <2 x float> %1, i32 0
%3 = extractelement <2 x float> %1, i32 1
%add = fadd float %2, %3

because though %x0 and %x1 have 2 users each other, these users are
part of the vectorized tree and we can consider these extractelement
instructions as dead.

Diff Detail

Repository
rL LLVM

Event Timeline

ABataev created this revision.Feb 13 2017, 11:48 AM
hfinkel added inline comments.
lib/Transforms/Vectorize/SLPVectorizer.cpp
1629 ↗(On Diff #88234)

Maybe this should be called hasOneUser?

1672 ↗(On Diff #88234)

Why are we checking this use(r) count at all? If canReuseExtract is true, then don't we just care that all users are part of the to-be-vectorized tree?

ABataev added inline comments.Feb 13 2017, 12:05 PM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1629 ↗(On Diff #88234)

Ok, will do.

1672 ↗(On Diff #88234)

It does not check if we can vectorize extract, it checks should we remove this extract from the code or there are other users of this extract. If we can remove this extract from the code, we can consider this instruction as dead and subtract from the cost, otherwise this instruction is alive and its cost should be considered during vectorization.

ABataev updated this revision to Diff 88239.Feb 13 2017, 12:12 PM

Update after review

hfinkel added inline comments.Feb 13 2017, 12:16 PM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1672 ↗(On Diff #88234)

Right, but what's special in this regard about all of the uses belonging to the same instruction? What if the test case, for example, looked like this:

%x0 = extractelement <2 x float> %x, i32 0
%x1 = extractelement <2 x float> %x, i32 1
%x0x1 = fmul float %x0, %x1
%x1x1 = fmul float %x1, %x1
%add = fadd float %x0x1, %x1x1
ret float %add

}

are the costs of all of these extracts still accounted for correctly? The final code presumably has a shuffle in place of one of the extracts?

ABataev added inline comments.Feb 14 2017, 1:39 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1672 ↗(On Diff #88234)

Oh, now I see what are you talking about. But we still may run into something like this:

@a = global float 0.000000e+00, align 4

define float @f(<2 x float> %x) {
  %x0 = extractelement <2 x float> %x, i32 0
  %x1 = extractelement <2 x float> %x, i32 1
  %x0x0 = fmul float %x0, %x0
  %x1x1 = fmul float %x1, %x1
  %add = fadd float %x0x0, %x1x1
  store float %add, float* @a
  ret float %x0
}

where we have ret user of %x0, which is not a part of the vectorized tree. What we should do is to check that all users of extractelement instructions are going to be vectorized.

ABataev updated this revision to Diff 88357.Feb 14 2017, 5:26 AM

Consider extractelement instructions dead if all users are going to be vectorized.

ABataev edited the summary of this revision. (Show Details)Feb 14 2017, 5:27 AM
ABataev set the repository for this revision to rL LLVM.
hfinkel added inline comments.Feb 14 2017, 7:05 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1663 ↗(On Diff #88357)

Why is the hasOneUse check still needed if we're checking the ScalarToTreeEntry map? Or is this just a microoptimization?

ABataev added inline comments.Feb 14 2017, 7:08 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1663 ↗(On Diff #88357)

Yes, it is a microoptimization. If we have only onlу user, this user will be vectorized/removed for sure

hfinkel accepted this revision.Feb 14 2017, 7:11 AM
hfinkel added inline comments.
lib/Transforms/Vectorize/SLPVectorizer.cpp
1663 ↗(On Diff #88357)

Okay, please add a comment explaining that. Otherwise, LGTM.

This revision is now accepted and ready to land.Feb 14 2017, 7:11 AM
This revision was automatically updated to reflect the committed changes.