This nearly resolves issues such as #50466, #40981, and #38190.
In the added test, the extra-long vector reduction seems to skip the reduction I added. Where should I look for that?
I'm not sure who to add as a reviewer for this; feel free to add anyone else or remove yourself.
I think I would move this into one of the if (Subtarget->hasNEON()) { blocks in ARMTargetLowering::ARMTargetLowering, maybe near all the setLoadExtAction calls for the legal extloads. At least that seems to be how things have been done so far.