This is following patch for http://reviews.llvm.org/D15177.
Just as Hal's comment in http://reviews.llvm.org/D15177?id=41809#inline-126094,
For induction variable only used in GetElementPtr and ICmp, D15177 knows it will not have vectorized version and can be added into VecValuesToIgnore, but it didn't consider the case that induction variable may be used by a add/sub before used in GetElementPtr.
for loop like below:
char a[1000];
char b[1000];
for (long i = 0; i < N; i++)
a[i] = b[i] * 6 + (b[i] + b[i + 1]) * 4 + b[i - 2] + b[i + 2];
When we are computing RegUsages for VF==8 and VF==16,
it is important for the register usages estimation component to know array index exprs like i, i+1, i+2 and i+3 will not have vectorized version after the loop being vectorized, and their live ranges should not be counted as vector register usages, or else it is likely to exaggerate the vector register pressure.
The patch adds instructions for which isUniformAfterVectorization returns true into the VecValuesToIgnore set. A special case is that PHI instructions are never included into the Uniforms set in collectLoopUniforms(), so a special handling for PHI is that when the result of PHI is only used in GetElementPtr or Uniform instructions, the PHI will be added into VecValuesToIgnore set too.
Another following patch in plan is if estimated vector register usage is more than the number of available hardware vector registers for certain VF, don't simply give up the VF. Just add the extra spill cost into the VectorCost. If the total VectorCost of the VF is the lowest, the VF is still worthy to try.
Are you sure about this?
Nonconsecutive pointer values when there is no gather/scatter will be scalarized, but they aren't uniform.
So I'm not sure we should be counting them as uniform. This will work correctly for your new use of isUniformAfterVectorization() (since we really don't need vector registers in either case). But I think it may do the wrong thing for the existing use, in getInstructionCost(). We shouldn't be evaluating the cost of non-consecutive loads/stores as if they are a single scalar load/store.
Am I confused?