This is an archive of the discontinued LLVM Phabricator instance.

[Reassociate] Add initial support for vector instructions.
ClosedPublic

Authored by rob.lougher on Feb 11 2015, 10:02 AM.

Details

Summary

This patch adds initial support for vector instructions to the reassociation pass. It enables most parts of the pass to work with vectors but to keep the size of the patch small, optimization of Xor trees, canonicalization of negative constants and converting shifts to muls, etc., have been left out. This will be handled in later patches.

The patch is based on an initial patch by Chad Rosier (see http://reviews.llvm.org/D5222).

Robert Lougher
SN Systems - Sony Computer Entertainment Group

Diff Detail

Repository
rL LLVM

Event Timeline

rob.lougher retitled this revision from to [Reassocate] Add initial support for vector instructions..
rob.lougher updated this object.
rob.lougher edited the test plan for this revision. (Show Details)
rob.lougher added a reviewer: mcrosier.
rob.lougher added a subscriber: Unknown Object (MLST).
rob.lougher retitled this revision from [Reassocate] Add initial support for vector instructions. to [Reassociate] Add initial support for vector instructions..Feb 11 2015, 10:41 AM

Rebased the patch and added tests for 'or' and 'and' (previously these just showed that the operands were commuted - the tests now check that the expressions are optimized).

This is the minimum changes necessary to add reassociation of vectors. The idea is to split the changes into smaller, reviewable chunks. The differences with Chad's original patch is:

  1. Xor instructions are not optimized (the code in the original patch is incomplete). This will be added later.
  1. Various constant handling is missing (e.g. removal of negative constant factors). Chad's original patch extended the checks to include constant splats. Again, these will addressed in later patches.
  1. CanonicalizeNegConstExpr() was added after Chad's patch. Again, this will be handled in later patches.
  1. Tests that rely on other passes (gvn, instcombine) have either been modified or removed.
  1. FIXME tests have been added for some of the missing optimizations.

Chad's original review has been dormant since October. Note, I emailed Chad privately to make sure he was happy with someone else continuing the work.

spatel added inline comments.Feb 23 2015, 9:07 AM
lib/Transforms/Scalar/Reassociate.cpp
2091 ↗(On Diff #19894)

Can you add a bit to the comment to explain?
Is this because we want to distinguish zero'ing idioms?

test/Transforms/Reassociate/fast-ReassociateVector.ll
4 ↗(On Diff #19894)

I assume all of the floating point transforms apply identically to doubles as well as floats. Can you change some of these tests to use doubles so we have some coverage for those?

Thanks for the review Sanjay.

lib/Transforms/Scalar/Reassociate.cpp
2091 ↗(On Diff #19894)

OK, I'll add a comment here. The only reason is that OptimizeXor (and helpers) hasn't been vectorized yet. This will be done in a later patch.

test/Transforms/Reassociate/fast-ReassociateVector.ll
4 ↗(On Diff #19894)

OK. I'll update the patch.

New version of the patch that addresses Sanjay's comments. Tests 3, 7, 9 and 11 now use double rather than float.

mcrosier edited edge metadata.Feb 23 2015, 12:42 PM

Thanks for taking this on, Rob. I agree that this should be committed in incremental patches.

What kind of correctness/performance testing have you done?

IIRC, I didn't see a lot of change in SPEC2000/SPEC2006 with my original patch.

Hi Chad,

Sorry for the delay. I have ran SPEC CPU 2006 and open-source Bullet. Unfortunately, I found no performance difference at all. Sanjay also tried the n-body sim from https://github.com/tycho/nbody and didn't see any perf difference either. Apart from micro-benchmarks, the only consistent speed-up I see is in an internal benchmark of ~1%. I didn't see any performance regressions during testing.

mcrosier accepted this revision.Mar 12 2015, 12:19 PM
mcrosier edited edge metadata.

LGTM. My past experience is that this really made no difference for my workloads (i.e., spec2000/spec2006/eembc). I just wanted to make sure we didn't have any odd regressions.

This revision is now accepted and ready to land.Mar 12 2015, 12:19 PM
This revision was automatically updated to reflect the committed changes.

I reverted the commit as it caused a couple of arm64 clang tests to fail. This was not expected to happen after a change to LLVM. Hopefully will be easy to fix and I'll resubmit on Monday.

I have reapplied the patch at revision 232209. Hopefully it will be OK this time!