A release fence acts as a publication barrier: it makes stores performed earlier in the current thread visible to other threads that observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence.
We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't eliminate the first store while the two stores remain on opposite sides of the fence.
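To make the two cases concrete, here is a rough source-level sketch in C++ rather than IR. All names (`payload`, `slot`, `ready`, and the three functions) are made up for illustration and do not come from the patch or its tests; the comments describe what an optimizer would be permitted to do, not what EarlyCSE is guaranteed to do.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical globals, for illustration only.
int payload = 0;                  // plain, non-atomic location
std::atomic<int> slot{0};         // location written on both sides of a fence
std::atomic<bool> ready{false};   // flag other threads use to observe the fence

// Case 1: store-to-load forwarding across a release fence.
int forward_across_release_fence() {
    payload = 42;
    std::atomic_thread_fence(std::memory_order_release);
    // The fence does not force this thread to observe other threads' stores,
    // so an optimizer may replace this load with the stored value 42.
    return payload;
}

// Case 2: stores to the same location on opposite sides of the fence.
void stores_on_opposite_sides() {
    // This store must survive even though it is overwritten below: the fence
    // publishes it to any thread that observes ready == true.
    slot.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
    slot.store(2, std::memory_order_relaxed);
}

// A reader that synchronizes with the fence through the flag.
int reader() {
    if (ready.load(std::memory_order_acquire)) {
        // May return 1 or 2, but not the initial 0, as long as the first
        // store in stores_on_opposite_sides() was not eliminated.
        return slot.load(std::memory_order_relaxed);
    }
    return -1;  // flag not observed yet
}
```

If the first store to `slot` were deleted, a reader running between the `ready` store and the second `slot` store could see the initial 0, which the fence is supposed to rule out.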
I have a similar change, D11436, for MemoryDependenceAnalysis, but I consider this change much lower risk than that one.
p.s. The LangRef indicates only atomic loads and stores are affected by fences. This patch chooses to be far more conservative than that. I'm not even really sure the LangRef's definition is helpful.
Getting re-acquainted with the code: it doesn't seem like this change exercises a test?