This updates MergedLoadStoreMotion to use and preserve MemorySSA.

It depends on D7864.

Prior to this, the algorithm was O(N^2) for loads and O(N^2 R) for stores, where N is the number of instructions in the block and R is the number of removed stores. (It restarted the reverse walk every time it removed a store, due to iterator invalidation.)
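
To illustrate where the extra factor of R comes from, here is a minimal toy sketch, not the pass's actual code. `Inst`, `removable`, and both function names are hypothetical, and a `std::list` stands in for a basic block. It models only the cost of restarting the walk, not the dependence queries behind the N^2 term.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical stand-in for an instruction; not an LLVM type.
struct Inst {
  int id;
  bool removable; // assumption: "this store can be removed"
};

// Old shape: after every removal the reverse walk starts over from the end,
// so R removals cost up to R extra walks over the block.
// Returns the number of instructions visited.
std::size_t removeWithRestart(std::list<Inst> &block) {
  std::size_t visited = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    for (auto it = block.rbegin(); it != block.rend(); ++it) {
      ++visited;
      if (it->removable) {
        // Convert the reverse iterator to the forward iterator of the same
        // element, erase it, then abandon the now-invalid walk.
        block.erase(std::next(it).base());
        changed = true;
        break;
      }
    }
  }
  return visited;
}

// Single reverse pass: erase() returns the element after the erased one,
// so stepping back from it continues the walk without restarting.
std::size_t removeSinglePass(std::list<Inst> &block) {
  std::size_t visited = 0;
  auto it = block.end();
  while (it != block.begin()) {
    --it;
    ++visited;
    if (it->removable)
      it = block.erase(it);
  }
  assert(visited >= block.size());
  return visited;
}
```

Both functions leave the same instructions in the block; only the number of visits differs, and the gap grows with the number of removals.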

It is now O(M) for loads, where M is the number of memory instructions in the two blocks, and O(max(M, S^2)) for stores. (Because we have no downwards clobbering API yet, the hash table does not help us determine memory dependence for our uses.)
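
As a rough sketch of how a hash table can give expected linear-time matching for loads, the example below pairs loads from two blocks that share a key. This is an illustration under assumptions, not the patch itself: `Load`, `pairLoads`, and the key made of a pointer id plus a `clobber` id (a stand-in for "the memory state this load reads") are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load; not an LLVM type.
struct Load {
  int id;      // which instruction this is
  int pointer; // assumption: id of the address being loaded
  int clobber; // assumption: id of the memory state the load reads
};

// Pair each load in `right` with an unmatched load in `left` that has the
// same (pointer, clobber) key. One pass over each block, so the expected
// cost is linear in the total number of loads.
std::vector<std::pair<int, int>> pairLoads(const std::vector<Load> &left,
                                           const std::vector<Load> &right) {
  auto key = [](const Load &l) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(l.pointer)) << 32) |
           static_cast<std::uint32_t>(l.clobber);
  };

  std::unordered_map<std::uint64_t, int> seen;
  for (const Load &l : left)
    seen.emplace(key(l), l.id); // keeps the first load for each key

  std::vector<std::pair<int, int>> pairs;
  for (const Load &r : right) {
    auto it = seen.find(key(r));
    if (it != seen.end()) {
      pairs.emplace_back(it->second, r.id);
      seen.erase(it); // each left-hand load is matched at most once
    }
  }
  assert(pairs.size() <= left.size());
  return pairs;
}
```

A lookup like this answers "is there an equivalent access in the other block?", but not "what later accesses does this one clobber?", which is the kind of question the store case still has to answer by scanning.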

I have deliberately not changed behavior in terms of what loads/stores it will remove or the compile time controls, in order to make the change minimal.

(Hopefully Phabricator will not screw up a revision that is a branch of a branch; arc diff --preview showed the right stuff.)