This is a second attempt at the optimization from http://reviews.llvm.org/D6698. That patch was checked in at r224611 but reverted at r225031 because it caused a crash.
The crash happened because the code did not recognize runs of consecutive stores with mixed source values (loads and vector element extracts). This version adds a check that bails out if any stored value does not come from a vector element extract.
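To illustrate the shape of that check, here is a minimal standalone sketch. It is not the code from the patch: `SourceKind`, `StoreCandidate`, and `allValuesAreVectorExtracts` are hypothetical names that stand in for the real DAG node types, and treating an empty list as a bail-out is also an assumption.

```cpp
#include <vector>

// Hypothetical stand-in for the kind of node that produces a stored value.
// These are illustration-only types, not LLVM's SelectionDAG classes.
enum class SourceKind { Load, ExtractVectorElt, Constant };

// Hypothetical record for one store in a run of consecutive stores.
struct StoreCandidate {
  SourceKind ValueSource; // what produces the value being stored
};

// Returns true only if every stored value comes from a vector element
// extract. Any other source (for example a load) makes the run "mixed",
// so the caller should bail out instead of merging the stores.
bool allValuesAreVectorExtracts(const std::vector<StoreCandidate> &Stores) {
  if (Stores.empty())
    return false; // assumption: nothing to merge counts as a bail-out
  for (const StoreCandidate &S : Stores)
    if (S.ValueSource != SourceKind::ExtractVectorElt)
      return false; // mixed sources: bail out
  return true;
}
```

A caller would run this before attempting the merge and return early when it yields `false`, which is the conservative behavior described above.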
Use SmallVectorImpl<MemOpLink> instead.