There are certain loops like the one below:
for (int i = 0; i < n; i++) {
  a[i] = b[i] + 1;
  *inv = a[i];
}
that can only be vectorised if we are able to extract the last lane of the
vectorised form of 'a[i]'. For fixed width vectors this already works, since
we know at compile time what the final lane is. For scalable vectors the
number of lanes is only known at run time, so the index of the final lane is
not a compile-time constant. This patch adds support for extracting the last
lane from a scalable vector using a runtime-determined lane value. I have
added support to VPIteration for non-constant lanes that still permits the
caching of values. Whilst doing this work I couldn't find any explicit tests
for extracting the last lane values of fixed width vectors, so I added tests
for both scalable and fixed width vectors.
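
To illustrate why the last lane is needed, here is a minimal scalar model of
the loop above, written in plain C++ rather than as vectoriser code. It is a
sketch under assumptions, not what the patch generates: the function name
vectorisedLoopModel and the parameter vl are hypothetical, with vl standing in
for a lane count that is only known at run time (as it is for scalable
vectors). The point is that the store to the invariant address only needs
lane vl - 1 of each vector of 'a[i]' values, and that index has to be
computed at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the vectorised loop. `vl` is the number of lanes and
// is a runtime value here, mimicking a scalable vector whose length is not
// known at compile time. Returns the value the scalar loop would leave in
// *inv (or `invInit` if the loop never runs).
int vectorisedLoopModel(std::vector<int> &a, const std::vector<int> &b,
                        std::size_t vl, int invInit) {
  assert(vl > 0 && "a vector must have at least one lane");
  const std::size_t n = b.size();
  a.resize(n);
  int inv = invInit;

  std::size_t i = 0;
  // Vector body: process whole groups of `vl` lanes.
  for (; i + vl <= n; i += vl) {
    for (std::size_t lane = 0; lane < vl; ++lane)
      a[i + lane] = b[i + lane] + 1;
    // Only the final lane's store to the invariant address is observable
    // afterwards, so extract lane `vl - 1` (a runtime-computed index).
    inv = a[i + (vl - 1)];
  }

  // Scalar epilogue for the remaining n % vl iterations.
  for (; i < n; ++i) {
    a[i] = b[i] + 1;
    inv = a[i];
  }
  return inv;
}
```

For example, with b = {1, 2, 3, 4, 5, 6, 7} and vl = 4 the model returns 8,
the same value the original scalar loop would store through 'inv' on its
final iteration.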
It looks like this is only used in llvm/lib/Transforms/Vectorize/VPlan.cpp in this patch. Should it be defined directly there? If it needs to be shared between multiple files, it would probably be better to put the declaration into a header.