This extends PerformSplittingToWideningLoad to handle FP_EXTEND in addition to sign and zero extends. It uses an integer extending load followed by a VCVTL on the bottom lanes to efficiently perform an fpext from a smaller-than-legal type.
The existing code had to be rewritten slightly: rather than splitting the node in two and leaving the rest to legalization, it now splits the load directly into legal-sized chunks.
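As a rough illustration of what "splitting into legal chunks" means, here is a standalone sketch that models only the index and offset arithmetic. It does not use the SelectionDAG API, and the names ChunkLoad and splitIntoLegalChunks are hypothetical, not taken from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record describing one extending load produced by the split.
struct ChunkLoad {
  unsigned FirstElt;   // index of the first source element this chunk covers
  unsigned NumElts;    // number of elements loaded by this chunk
  unsigned ByteOffset; // offset from the original base pointer, in bytes
};

// Model of the splitting step: divide a load of TotalElts elements, each
// SrcEltBytes wide in memory, into chunks of LegalElts elements.
// Returns an empty list when the vector does not divide evenly; in that
// case a real combine would presumably bail out (an assumption here).
std::vector<ChunkLoad> splitIntoLegalChunks(unsigned TotalElts,
                                            unsigned LegalElts,
                                            unsigned SrcEltBytes) {
  std::vector<ChunkLoad> Chunks;
  if (LegalElts == 0 || TotalElts % LegalElts != 0)
    return Chunks;
  for (unsigned First = 0; First < TotalElts; First += LegalElts)
    Chunks.push_back({First, LegalElts, First * SrcEltBytes});
  return Chunks;
}
```

For example, a 16-element source with 2-byte elements and four-element legal chunks gives four loads at byte offsets 0, 8, 16 and 24, where splitting in two would give only two eight-element halves that still need further legalization.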
Instead of looping like this, you can probably just return DAG.getNode(ISD::CONCAT_VECTORS, DL, ConcatVT, Loads).