This is an archive of the discontinued LLVM Phabricator instance.

Improve ISel using across lane min/max reduction
ClosedPublic

Authored by junbuml on Sep 4 2015, 10:26 AM.

Details

Summary

In vectorized integer min/max reduction code, the final "reduce" step
is sub-optimal. In AArch64, this change wll combine :

%svn0 = vector_shuffle %0, undef<2,3,u,u>
%smax0 = smax %0, svn0
%svn3 = vector_shuffle %smax0, undef<1,u,u,u>
%sc = setcc %smax0, %svn3, gt
%n0 = extract_vector_elt %sc, #0
%n1 = extract_vector_elt %smax0, #0
%n2 = extract_vector_elt $smax0, #1
%result = select %n0, %n1, n2

becomes :

%1 = smaxv %0
%result = extract_vector_elt %1, 0

This change extends r246790.

Diff Detail

Event Timeline

junbuml updated this revision to Diff 34044.Sep 4 2015, 10:26 AM
junbuml retitled this revision from to Improve ISel using across lane min/max reduction.
junbuml updated this object.
mcrosier added inline comments.Sep 4 2015, 12:21 PM
test/CodeGen/AArch64/aarch64-minmaxv.ll
2

Please add -aarch64-neon-syntax=generic, so this doesn't fail on Darwin.

junbuml updated this revision to Diff 34056.Sep 4 2015, 12:24 PM
junbuml updated this revision to Diff 34057.Sep 4 2015, 12:33 PM
junbuml marked an inline comment as done.Sep 10 2015, 12:31 PM

Confirmed that this patch was passed correctness tests for spec2000/2006, and applied several spec benchmarks including siding, hammer, h264ref, gobmk, gcc, spoiled, sphinx3, and vpr.

mcrosier accepted this revision.Sep 11 2015, 11:15 AM
mcrosier edited edge metadata.

LGTM with a few nits.

lib/Target/AArch64/AArch64ISelLowering.cpp
8760

It might improve readability if we check against != 0. I.e.,

.. getZExtValue() != 0)

8765

It might improve readability if we check against != 0. I.e.,

.. getZExtValue() != 0)

test/CodeGen/AArch64/aarch64-addv.ll
1

I believe this fix was committed in r246833. You may need to rebase your patch.

This revision is now accepted and ready to land.Sep 11 2015, 11:15 AM
junbuml updated this revision to Diff 34572.Sep 11 2015, 1:06 PM
junbuml edited edge metadata.
junbuml marked 3 inline comments as done.

Addressed Chad's comments.
Thanks Chad for the review !

James,

Sure, I will certainly wait your review.

jmolloy accepted this revision.Sep 12 2015, 6:33 AM
jmolloy edited edge metadata.

Hi,

This generally looks fine, I have a few specific comments and it looks like the testcases could probably do with some simplification (anonymizing of the long names?)

Apart from that LGTM.

Cheers,

James

lib/Target/AArch64/AArch64ISelLowering.cpp
8615

Any reduction should be commutative, right? surely at least SMAX and UMAX should be? Not just ADD?

8620

Braces around the else

8620

necessarily

junbuml updated this revision to Diff 34686.Sep 14 2015, 9:14 AM
junbuml edited edge metadata.
junbuml marked 3 inline comments as done.

Addressed James' comments.
Thanks James for the review.

junbuml closed this revision.Sep 14 2015, 9:37 AM

Committed in r247575.