This is the first patch based on the RFC sent to llvm-dev to enable support for a preferred vector width for the vectorizer. I'll also submit a clang patch shortly to hook it up to the driver.
RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-November/118734.html
This stores the preference as an enum in the subtarget, ordered by increasing strictness, because the autogenerated subtarget code needs to be able to take the max of the encoding values if both prefer-avx128 and prefer-avx256 are specified, in any order.
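To illustrate the ordering argument, here is a minimal standalone sketch, not the code in this patch. The enum name, the PreferNone enumerator, and the helpers combine and preferredWidthInBits are hypothetical; only Prefer128 and Prefer256 come from the discussion. It assumes a narrower preferred width gets a larger encoding, so taking the max always yields the stricter preference.

```cpp
// Hypothetical encoding: a larger value means a stricter (narrower) preference.
enum PreferVectorWidth : unsigned {
  PreferNone = 0, // no preference, any legal width may be used
  Prefer256 = 1,  // prefer vectors no wider than 256 bits
  Prefer128 = 2,  // prefer vectors no wider than 128 bits (strictest)
};

// Combining two preferences by taking the max of their encodings is
// order-independent, so prefer-avx128 wins over prefer-avx256 regardless
// of which feature is processed first.
constexpr PreferVectorWidth combine(PreferVectorWidth A, PreferVectorWidth B) {
  return static_cast<unsigned>(A) > static_cast<unsigned>(B) ? A : B;
}

// Hypothetical helper: clamp the widest legal register width (in bits)
// to the preferred cap.
constexpr unsigned preferredWidthInBits(PreferVectorWidth P, unsigned MaxWidth) {
  return P == Prefer128   ? (MaxWidth < 128 ? MaxWidth : 128)
         : P == Prefer256 ? (MaxWidth < 256 ? MaxWidth : 256)
                          : MaxWidth;
}

static_assert(combine(Prefer128, Prefer256) == combine(Prefer256, Prefer128),
              "combining preferences must not depend on order");
```

With this encoding, Prefer128 is "stricter" than Prefer256 in the sense that it excludes more register widths, which is why it takes the larger value.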
From initial experiments there's still more work to do to prevent zmm register usage, but this gives us a baseline and plumbing that we can build on.
What's the best way to test the register width output from TTI in a lit test?
It might be worth expanding upon this a bit to make it clear why Prefer128 is "stricter" than Prefer256.
I have no objection to this choice of ordering, but it does make the code at lines 535-539 read a little funny.