This is another step on the long road to fixing the last part of PR21711:
https://llvm.org/bugs/show_bug.cgi?id=21711
This change was mentioned in:
http://reviews.llvm.org/D10662 and
http://reviews.llvm.org/D10905
The change in DAGCombiner is simple: use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate.
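To illustrate the shape of that check, here is a minimal, self-contained sketch. It does not use the real LLVM types: `MockTarget`, its two fields, and `shouldMergeAccess()` are hypothetical names invented for this example, and the real TLI.allowsMemoryAccess() takes more parameters than the simplified hook shown here. The only point is that a merged candidate is accepted when the target reports the access as both legal and fast.

```cpp
#include <cassert>

// Hypothetical stand-in for the target lowering hook. The simplified
// signature below is an assumption made for illustration only.
struct MockTarget {
  bool AllowsMisaligned; // misaligned accesses are legal on this target
  bool MisalignedIsFast; // ...and they are also fast

  // Returns true if the access is legal. If IsFast is non-null, it is set
  // to whether the access is also expected to be fast.
  bool allowsMemoryAccess(unsigned AccessBytes, unsigned Align,
                          bool *IsFast = nullptr) const {
    assert(AccessBytes > 0 && "access must have a non-zero size");
    bool Aligned = Align >= AccessBytes;
    if (IsFast)
      *IsFast = Aligned || (AllowsMisaligned && MisalignedIsFast);
    return Aligned || AllowsMisaligned;
  }
};

// Hypothetical helper mirroring the check described above: only merge when
// the wider access is both allowed and reported as fast.
bool shouldMergeAccess(const MockTarget &TLI, unsigned MergedBytes,
                       unsigned Align) {
  bool IsFast = false;
  return TLI.allowsMemoryAccess(MergedBytes, Align, &IsFast) && IsFast;
}
```

Under these assumptions, a target that permits misaligned accesses but reports them as slow would no longer get an 8-byte merged access at 4-byte alignment, while a naturally aligned candidate would still be merged.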
However, I think that change exposes a bug in the AMD/SI implementation of allowsMisalignedMemoryAccesses(): a test failure shows up in test/CodeGen/AMDGPU/merge-stores.ll. Matt, can you confirm or deny that the AMDGPU part of the patch is what you expect?
This looks fine to me