An alternative to D109348, this adds fallback broadcast patterns on AVX1 targets instead.
I've added AVX2 test coverage to help show a missing fold: vbroadcastsd_ymm(vbroadcastss_load_xmm()) should be foldable to a single vbroadcastss_load_ymm() on AVX1 and AVX2 targets if we create the broadcast nodes.
Thanks for creating this bugfix. This patch only covers the following pattern:

  def : Pat<(v4f64 (X86VBroadcast v2f64:$src)),
            (VINSERTF128rr (INSERT_SUBREG (v4f64 (IMPLICIT_DEF)),
                                          (v2f64 (VMOVDDUPrr VR128:$src)), sub_xmm),
                           (v2f64 (VMOVDDUPrr VR128:$src)), 1)>;

I think we'd better add more cases.
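For reference, here is a rough lane-level sketch of what the v4f64 pattern quoted above computes. It is a plain-Python model, not LLVM code: lists stand in for vector registers, and the function names (movddup, vinsertf128, broadcast_v4f64) are hypothetical helpers that only mimic the instruction semantics as I understand them.

```python
# Lane-level model of the AVX1 fallback broadcast; lists stand in for registers.

def movddup(xmm):
    """Model VMOVDDUPrr: duplicate the low f64 lane of a 2-lane vector."""
    return [xmm[0], xmm[0]]


def vinsertf128(ymm, xmm, imm):
    """Model VINSERTF128rr: overwrite the 128-bit half chosen by imm (0 = low, 1 = high)."""
    out = list(ymm)
    out[2 * imm:2 * imm + 2] = xmm
    return out


def broadcast_v4f64(xmm):
    """Model the pattern: splat the low f64 lane of xmm across all four ymm lanes."""
    dup = movddup(xmm)
    undef = [None] * 4                 # IMPLICIT_DEF: all lanes start undefined
    low = vinsertf128(undef, dup, 0)   # stands in for INSERT_SUBREG ..., sub_xmm
    return vinsertf128(low, dup, 1)    # high half filled by VINSERTF128 with imm 1


result = broadcast_v4f64([1.5, 2.5])
```

Only lane 0 of the source reaches the result, so the other element types and widths would each need an analogous sequence, which is why more cases seem worth adding.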