This updates MergedLoadStoreMotion to use and preserve MemorySSA.
It depends on D7864.
Prior to this, the algorithm was O(N^2) for loads and O(N^2 R) for stores, where N is the number of instructions in
the block and R is the number of removed stores
(it restarted the reverse walk every time it removed a store, due to iterator invalidation).
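To illustrate why the restart matters, here is a minimal sketch of a reverse walk that erases elements in a single pass. It is not taken from the patch: `reverseWalkErase` is a hypothetical helper, and a `std::list<int>` stands in for a basic block's instruction list. It relies only on the fact that erasing from a `std::list` invalidates nothing but the erased element's iterator.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper, not part of the patch: walks Insts from back to front
// and erases every element for which ShouldRemove returns true.
// Returns the number of elements visited, to show that no restart is needed.
template <typename Pred>
std::size_t reverseWalkErase(std::list<int> &Insts, Pred ShouldRemove) {
  std::size_t Visited = 0;
  auto It = Insts.end();
  while (It != Insts.begin()) {
    auto Cur = std::prev(It); // element examined in this step
    ++Visited;
    if (ShouldRemove(*Cur)) {
      // erase() returns the iterator following Cur, which is still It,
      // so the walk continues with the element before the erased one.
      It = Insts.erase(Cur);
    } else {
      It = Cur;
    }
  }
  return Visited;
}
```

Each element is visited exactly once regardless of how many are removed, whereas restarting the walk after each removal would add up to one extra pass per removed element.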
It is now O(M) for loads, where M is the number of memory instructions in the two blocks,
and O(max(M, S^2)) for stores, where S is the number of stores: because we have no downwards clobbering API yet,
the hash table does not help us determine memory dependence for our uses.
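As a rough illustration of how a hash table can make load matching linear in expectation, here is a minimal sketch. It is not the code in this patch: the `Load` struct, the integer IDs, and the choice of key (pointer plus the memory definition the load observes) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load instruction: the address it reads and the
// ID of the memory definition that reaches it (its clobbering access).
struct Load {
  std::uint32_t Pointer;
  std::uint32_t ClobberingDef;
};

// Hash for the (Pointer, ClobberingDef) key: pack both IDs into 64 bits.
struct KeyHash {
  std::size_t operator()(const std::pair<std::uint32_t, std::uint32_t> &K) const {
    std::uint64_t Packed = (static_cast<std::uint64_t>(K.first) << 32) | K.second;
    return std::hash<std::uint64_t>()(Packed);
  }
};

// Returns (LeftIndex, RightIndex) pairs of loads that read the same pointer
// under the same memory state. One pass over each block, so the expected cost
// is linear in the total number of loads.
std::vector<std::pair<std::size_t, std::size_t>>
findMatchingLoads(const std::vector<Load> &Left, const std::vector<Load> &Right) {
  std::unordered_map<std::pair<std::uint32_t, std::uint32_t>, std::size_t, KeyHash> Seen;
  for (std::size_t I = 0; I < Left.size(); ++I)
    // emplace keeps the first load seen for a given key.
    Seen.emplace(std::make_pair(Left[I].Pointer, Left[I].ClobberingDef), I);

  std::vector<std::pair<std::size_t, std::size_t>> Matches;
  for (std::size_t J = 0; J < Right.size(); ++J) {
    auto It = Seen.find(std::make_pair(Right[J].Pointer, Right[J].ClobberingDef));
    if (It != Seen.end()) {
      Matches.emplace_back(It->second, J);
      Seen.erase(It); // each left-hand load is paired at most once
    }
  }
  return Matches;
}
```

The same lookup does not settle the store case on its own, since deciding whether a store can move also requires knowing which later accesses it affects, which is the missing downwards query mentioned above.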
I have deliberately not changed which loads/stores it will remove, or the compile-time
controls, in order to keep the change minimal.
(Hopefully Phabricator will not screw up a revision that is a branch of a branch;
arc diff --preview showed the right stuff.)