This bug manifests when two loads and two stores are chained as follows in a DAG:
(ld v3f32) -> (st f32) -> (ld v3f32) -> (st f32)
and the value of each store is extracted from the preceding vector load.
The current code in trunk creates a build_vector node and then inserts the new merged store node between the two load nodes. This creates a cycle between the merged store node, the build_vector node, the extract_vector_elt node, and the second vector load: the merged store consumes a value derived from the second load, while the second load is chained after the merged store. The cycle eventually results in a crash during type legalization. To visualize the transformation, set a breakpoint at DAGCombiner.cpp:10056.
This patch fixes the bug by inserting the new merged store at the position of the last store node in the chain.
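The difference between the two placements can be illustrated with a toy dependency graph. This is only a sketch and does not use LLVM: ToyDAG, the node ids, and the helper functions are hypothetical names invented for this illustration, and the edges are an assumed simplification of the chain and value operands described above.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a SelectionDAG (not LLVM API): Operands[N] lists the
// nodes that node N uses, covering both chain and value operands.
struct ToyDAG {
  std::vector<std::vector<int>> Operands;
};

// Hypothetical node ids for the pattern described above.
enum { Ld1, Ld2, Ext1, Ext2, BuildVec, MergedSt, NumNodes };

// Value edges shared by both placements: each extract_vector_elt reads a
// load, build_vector reads both extracts, the merged store reads build_vector.
static ToyDAG buildCommon() {
  ToyDAG G;
  G.Operands.resize(NumNodes);
  G.Operands[Ext1] = {Ld1};
  G.Operands[Ext2] = {Ld2};
  G.Operands[BuildVec] = {Ext1, Ext2};
  G.Operands[MergedSt] = {BuildVec};
  return G;
}

// Assumed model of the old behavior: the merged store sits between the loads,
// so it is chained after Ld1 and Ld2 is chained after it.
static ToyDAG buildMergedBetweenLoads() {
  ToyDAG G = buildCommon();
  G.Operands[MergedSt].push_back(Ld1);
  G.Operands[Ld2].push_back(MergedSt);
  return G;
}

// Assumed model of the fix: the merged store takes the position of the last
// store, so Ld2 is chained directly after Ld1 and the store after Ld2.
static ToyDAG buildMergedAtLastStore() {
  ToyDAG G = buildCommon();
  G.Operands[Ld2].push_back(Ld1);
  G.Operands[MergedSt].push_back(Ld2);
  return G;
}

// Depth-first search; State is 0 = unvisited, 1 = on the stack, 2 = done.
static bool visit(const ToyDAG &G, int N, std::vector<int> &State) {
  if (State[N] == 1)
    return true; // reached a node that is still on the stack: cycle
  if (State[N] == 2)
    return false;
  State[N] = 1;
  for (int Op : G.Operands[N])
    if (visit(G, Op, State))
      return true;
  State[N] = 2;
  return false;
}

static bool hasCycle(const ToyDAG &G) {
  std::vector<int> State(G.Operands.size(), 0);
  for (std::size_t N = 0; N < G.Operands.size(); ++N)
    if (visit(G, static_cast<int>(N), State))
      return true;
  return false;
}
```

In this model, hasCycle(buildMergedBetweenLoads()) returns true because of the path Ld2 -> MergedSt -> BuildVec -> Ext2 -> Ld2, while hasCycle(buildMergedAtLastStore()) returns false.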