This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Disable unrolling in the loop vectorizer if the loop is not vectorized.
Closed Public

Authored by cfang on Mar 7 2017, 2:13 PM.

Details

Summary

This additional unrolling (interleaving) increases register usage and most likely hurts performance.
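To illustrate the register-usage point, here is a minimal hand-written sketch of what interleaving by 2 without vectorizing looks like for a simple reduction. The function names `sumScalar` and `sumInterleaved2` are hypothetical and are not part of this patch, and the code is not what the vectorizer actually emits. It only shows that the interleaved form keeps more values live at once.

```cpp
#include <cstddef>

// Plain scalar reduction: a single accumulator is live across the loop.
int sumScalar(const int *a, std::size_t n) {
  int acc = 0;
  for (std::size_t i = 0; i < n; ++i)
    acc += a[i];
  return acc;
}

// Hand-written analogue of scalar interleaving with a factor of 2.
// Two independent accumulators (and two loaded values) are live in
// every iteration, so the loop body needs more registers.
int sumInterleaved2(const int *a, std::size_t n) {
  int acc0 = 0, acc1 = 0;
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    acc0 += a[i];
    acc1 += a[i + 1];
  }
  if (i < n) // remainder iteration when n is odd
    acc0 += a[i];
  return acc0 + acc1;
}
```

Both functions compute the same sum. The second trades extra live values for more independent work per iteration, which is the trade-off the summary argues is unfavorable on this target.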

Diff Detail

Repository
rL LLVM

Event Timeline

cfang created this revision. Mar 7 2017, 2:13 PM
arsenm edited edge metadata. Mar 7 2017, 3:16 PM

Can you add a test for this?

cfang updated this revision to Diff 91049. Mar 8 2017, 10:37 AM

Add a test case.

I was thinking that we should still be able to use the backend option -force-target-max-scalar-interleave.

But it seems the following code in LoopVectorize.cpp prevents us from doing this:

// Don't attempt if
// 1. the target claims to have no vector registers, and
// 2. interleaving won't help ILP.
//
// The second condition is necessary because, even if the target has no
// vector registers, loop vectorization may still enable scalar
// interleaving.
if (!TTI->getNumberOfRegisters(true) && TTI->getMaxInterleaveFactor(1) < 2) {
  return false;
}
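
As a rough illustration of how this early exit behaves, the sketch below models the two queries with a hypothetical `MockTTI` struct. `MockTTI`, `shouldRunLoopVectorize`, and the scalar register count of 32 are assumptions made up for this example; only the condition itself is taken from the snippet above.

```cpp
// Hypothetical stand-in for the two TargetTransformInfo queries used in
// the quoted condition. This is not the real LLVM class.
struct MockTTI {
  unsigned NumVectorRegs; // value reported for getNumberOfRegisters(true)
  unsigned MaxInterleave; // value reported for getMaxInterleaveFactor(1)

  unsigned getNumberOfRegisters(bool Vector) const {
    return Vector ? NumVectorRegs : 32; // 32 scalar registers is arbitrary
  }
  unsigned getMaxInterleaveFactor(unsigned /*VF*/) const {
    return MaxInterleave;
  }
};

// Mirrors the quoted check: give up only when the target reports no
// vector registers AND a scalar interleave factor below 2.
bool shouldRunLoopVectorize(const MockTTI &TTI) {
  if (!TTI.getNumberOfRegisters(true) && TTI.getMaxInterleaveFactor(1) < 2)
    return false;
  return true;
}
```

Under this model, a target that reports no vector registers and an interleave factor of 1 is rejected by this check, while raising either value lets the pass proceed.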

arsenm added a comment. Mar 8 2017, 2:58 PM

The test should go in test/Transforms/LoopVectorize/AMDGPU.

cfang updated this revision to Diff 91086. Mar 8 2017, 3:20 PM

Move the test case to test/Transforms/LoopVectorize/AMDGPU.

arsenm added a comment. Mar 8 2017, 3:23 PM

You need to add a new lit.local.cfg like the other target-specific test directories.

cfang updated this revision to Diff 91088. Mar 8 2017, 3:32 PM

Add a lit.local.cfg file to the newly created test/Transforms/LoopVectorize/AMDGPU directory.

arsenm accepted this revision. Mar 8 2017, 3:34 PM

LGTM

This revision is now accepted and ready to land. Mar 8 2017, 3:34 PM
This revision was automatically updated to reflect the committed changes.