When we inline a function with a min-legal-vector-width attribute, we need to make sure the caller also ends up with at least that vector width.
In the future we may want heuristics that block inlining between functions with different vector widths, possibly controlled by another attribute, but we haven't defined that yet.
I've based this entirely on the stack-probe-size merging code.
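As a rough illustration of the merging rule described above (not the patch itself), the caller's width after inlining can be modeled as the maximum of the two widths. This is a minimal sketch that does not use the LLVM API: `VectorWidth`, `mergeMinLegalVectorWidth`, and `checkMergeExamples` are hypothetical names, and the handling of a function with no attribute is an assumption made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Hypothetical model: an empty value means the function carries no
// min-legal-vector-width attribute.
using VectorWidth = std::optional<unsigned>;

// Width the caller should carry once the callee has been inlined into it.
// Assumption for this sketch: a callee without the attribute imposes no
// requirement, and a caller without it adopts the callee's value.
VectorWidth mergeMinLegalVectorWidth(VectorWidth Caller, VectorWidth Callee) {
  if (!Callee)
    return Caller; // nothing to propagate
  if (!Caller)
    return Callee; // caller adopts the callee's requirement
  return std::max(*Caller, *Callee); // caller ends up with at least the callee's width
}

// Small usage example for the helper above.
inline void checkMergeExamples() {
  assert(mergeMinLegalVectorWidth(128u, 256u) == 256u);
  assert(mergeMinLegalVectorWidth(512u, 256u) == 512u);
}
```

The point of taking the maximum is that the merged body still contains the callee's vector operations, so the caller must not be given a narrower legal width than those operations were written for.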
I feel like we're going to want to do something a bit more nuanced than this...
For example, consider a function doing dynamic dispatch based on CPUID detection. It will look like:
I don't think we're going to want to promote the min legal width of this wrapper to be the largest of all the things it calls, even if they are viable for inlining....
Right now, these wrappers usually select between subtargets, and I think we *block* inlining in that case. But now that we can talk about vector width, I could imagine the above selecting a 256-bit algorithm when running on a Skylake CPU but a 128-bit algorithm when running on older CPUs, without needing any target features to differ between the two, just the min legal vector width.
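To make the concern concrete, here is a hypothetical sketch of such a dispatching wrapper. Every name in it (`CpuKind`, `runNarrowKernel`, `runWideKernel`, `dispatch`, `widthAfterInliningAll`) is invented for illustration, the kernels are placeholders that only report the width they would need, and `CpuKind` stands in for real CPUID detection.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Hypothetical stand-in for the result of a CPUID query.
enum class CpuKind { Older, Skylake };

constexpr unsigned NarrowWidth = 128; // bits
constexpr unsigned WideWidth = 256;   // bits

// Placeholder kernels: each just reports the vector width it would need.
unsigned runNarrowKernel() { return NarrowWidth; }
unsigned runWideKernel() { return WideWidth; }

// The wrapper uses no vectors itself; it only picks a kernel at run time.
unsigned dispatch(CpuKind Cpu) {
  return Cpu == CpuKind::Skylake ? runWideKernel() : runNarrowKernel();
}

// Width the wrapper would carry if the width of every inlined callee were
// merged into it by taking the maximum.
unsigned widthAfterInliningAll(unsigned WrapperWidth,
                               std::initializer_list<unsigned> CalleeWidths) {
  unsigned Width = WrapperWidth;
  for (unsigned CalleeWidth : CalleeWidths)
    Width = std::max(Width, CalleeWidth);
  return Width;
}
```

Under these assumptions, `widthAfterInliningAll(0, {NarrowWidth, WideWidth})` yields 256 for the wrapper even though `dispatch(CpuKind::Older)` only ever reaches the 128-bit kernel, which is the promotion the comment above is worried about.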
If we need a heuristic, the one I would suggest goes along the lines of:
Would #2 still be too restrictive for the use cases you have in mind?