This upgrades DSE to use MemorySSA instead of MemDeps, which allows it to work across basic blocks in a sparser manner.
Halfway into making this I found D29624 by bryant, which is an attempt at the same thing, so I stole all their good ideas. There is also D29866, which is a PDSE pass by the same author. As far as I understand, that would be superior but harder to write. Unfortunately both seem to have been abandoned.
I believe this version should handle everything that the old MemDeps version does (it passes all the tests). This includes complete overwrites (so long as the later store post-dominates the earlier one), no-op stores, partial overwrites, stores before frees/lifetime_ends, and PartialEarlierWithFullLater.
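For reviewers less familiar with these categories, here is a rough source-level sketch of three of them. The function names are hypothetical and are not taken from the patch or its tests. Whether a given store is actually removed depends on the IR the frontend emits and on alias information, so treat the comments as "candidates for removal" rather than guarantees.

```cpp
#include <cstdlib>

// Complete overwrite across basic blocks: the first store to *p is never
// read, and the second store to *p is reached on every path after it,
// so the first store is a candidate for removal.
int complete_overwrite(int *p, bool flag) {
  *p = 1;              // candidate dead store
  int side = 0;
  if (flag)            // the branch splits this into several basic blocks
    side = 10;
  *p = 2;              // later store that fully overwrites the earlier one
  return *p + side;
}

// No-op store: writing back the value that was just loaded from the same
// location does not change memory.
void noop_store(int *p) {
  int v = *p;
  *p = v;              // candidate no-op store
}

// Store before free: the stored value can never be observed, because the
// memory is released before anything reads it.
void store_before_free() {
  int *p = static_cast<int *>(std::malloc(sizeof(int)));
  if (!p)
    return;
  *p = 42;             // candidate dead store
  std::free(p);
}
```

None of these functions change observable behaviour when the marked stores are dropped, which is the property the pass relies on.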
The only exception that I know of is the coroutine tests, which rely on removing stores to soon-to-be-freed data, even across function calls that may throw. See test37 in simple.ll and ex3.ll in the coroutine tests. It should be possible to get that working, but it might involve looking through the llvm.coro.begin.
I'm putting this up for early review; I still need to do some extra testing, benchmarking, compile-time measurement, etc. I added as subscribers anyone who looked interested in D29624. This is a fairly big chunk of code, so let me know if I can do anything to make it easier to review.