The getScaledOffsetDup optimization in constructDup(...) expects the EXTRACT_VECTOR's source to be a NEON-sized vector. Because the DUPLANE* patterns won't match a larger fixed-width vector, bail out of that optimization early when a larger vector is seen.
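For reviewers skimming the description, here is a rough standalone sketch of what the guard is meant to do. It is not the code in the patch: `FixedVecTy`, `isNeonSized`, and `canUseScaledOffsetDup` are hypothetical names made up for illustration, and it assumes "NEON sized" means a total width of 64 or 128 bits.

```cpp
// Hypothetical stand-in for a fixed-width vector type. This is NOT an LLVM
// class; it only carries the two numbers the check needs.
struct FixedVecTy {
  unsigned NumElts; // number of elements
  unsigned EltBits; // bits per element
  constexpr unsigned sizeInBits() const { return NumElts * EltBits; }
};

// Assumption: a NEON-sized vector fits a D register (64 bits) or a
// Q register (128 bits).
constexpr bool isNeonSized(const FixedVecTy &VT) {
  return VT.sizeInBits() == 64 || VT.sizeInBits() == 128;
}

// Hypothetical guard: returns false (bail out of the optimization) when the
// extract's source vector is wider than NEON, because the DUPLANE* patterns
// would not match it anyway.
constexpr bool canUseScaledOffsetDup(const FixedVecTy &ExtractSrc) {
  return isNeonSized(ExtractSrc);
}
```

With these assumptions, a 4 x 32-bit source (128 bits) passes the guard, while a 16 x 32-bit source (512 bits, as in the new test) is rejected before any DUPLANE* node is built.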
Two notes on the test case:
- I bumped the -aarch64-sve-vector-bits-min option to 512 bits so that the new DUPLANE* test can live in this file. I've confirmed that the two existing test cases still show their original failure at this new width, so there is no change in coverage.
- The new test case generates a ton of instructions. I wasn't sure whether we should keep using the update_llc_test_checks.py script to generate the CHECK lines or switch to something else. Any preferences?