This bug manifests when two loads and two stores are chained as follows in a DAG,
(ld v3f32) -> (st f32) -> (ld v3f32) -> (st f32)
and each store's value is extracted from the preceding vector load.
The current code in trunk creates a build_vector node and then inserts the new merged store node between the two load nodes. The build_vector takes an extract_vector_elt of the second vector load as an operand, while the second load is now chained after the merged store, so the result is a cycle: merged store -> build_vector -> extract_vector_elt -> second vector load -> merged store. (Set a breakpoint at DAGCombiner.cpp:10056 to visualize the transformation.) The cycle eventually results in a crash during type legalization.
This patch fixes the bug by inserting the new merged store at the position of the last store node in the chain.
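The difference between the two insertion points can be sketched with a toy dependency graph. This is only an illustration under simplifying assumptions, not the DAGCombiner code: the node names (ld0, ld1, ext0, ext1, merged_st) and the helpers hasCycle and mergeStores are hypothetical, chain and value edges are treated alike, and the second load is assumed to be re-chained to the first load once the first store is removed.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy dependency graph: each node maps to the nodes it uses (its operands).
// Chain edges and value edges are not distinguished in this sketch.
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first search; returns true if a cycle is reachable from `node`.
static bool reachesCycle(const Graph &g, const std::string &node,
                         std::set<std::string> &onPath,
                         std::set<std::string> &done) {
  if (onPath.count(node)) return true;  // back edge: we are in a cycle
  if (done.count(node)) return false;   // already fully explored
  onPath.insert(node);
  auto it = g.find(node);
  if (it != g.end())
    for (const auto &operand : it->second)
      if (reachesCycle(g, operand, onPath, done)) return true;
  onPath.erase(node);
  done.insert(node);
  return false;
}

bool hasCycle(const Graph &g) {
  std::set<std::string> onPath, done;
  for (const auto &entry : g)
    if (reachesCycle(g, entry.first, onPath, done)) return true;
  return false;
}

// Hypothetical model of the graph after the two f32 stores are merged.
// atLastStore == false: merged store sits between the two loads (old behavior).
// atLastStore == true:  merged store sits where the last store was (the fix).
Graph mergeStores(bool atLastStore) {
  Graph g;
  g["ld0"] = {};
  g["ext0"] = {"ld0"};                   // extract_vector_elt of first load
  g["ext1"] = {"ld1"};                   // extract_vector_elt of second load
  g["build_vector"] = {"ext0", "ext1"};
  if (atLastStore) {
    g["ld1"] = {"ld0"};                  // assumed: ld1 re-chained to ld0
    g["merged_st"] = {"build_vector", "ld1"};
  } else {
    g["ld1"] = {"merged_st"};            // ld1 is chained after the merged store
    g["merged_st"] = {"build_vector", "ld0"};
  }
  // Sanity check: every operand refers to a node that exists in the graph.
  for (const auto &entry : g)
    for (const auto &operand : entry.second)
      assert(g.count(operand) == 1);
  return g;
}
```

In this model, hasCycle(mergeStores(false)) finds the loop merged_st -> build_vector -> ext1 -> ld1 -> merged_st, while hasCycle(mergeStores(true)) finds none, because the merged store then comes after both loads.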
Comment does not match code now: "earliest store" should be "latest store"?