The ultimate motivation for this patch is to fix the part of PR21711 ( https://llvm.org/bugs/show_bug.cgi?id=21711#c12 ) that is still not working. To get there, I'd like to use TLI.allowsMemoryAccess() in DAGCombiner's MergeConsecutiveStores(). This will require fixing bugs in x86, AArch64 (see post-commit thread for r227242) and possibly other targets.
This patch fixes the x86 implementation of allowsMisalignedMemoryAccesses() to correctly set the 'Fast' output parameter for 32-byte accesses. To test that, an existing load merging optimization is changed to use the TLI hook instead of checking the CPU attribute directly. This exposes a shortcoming in the current logic, which is why the regression test is updated. Changing the other direct users of the isUnalignedMem32Slow() x86 CPU attribute is left for a follow-on patch.
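For reviewers who want the intended behavior in one place, here is a rough, self-contained sketch of how the 'Fast' output could be derived from the subtarget attributes. It is a simplified model, not the code in this patch: SubtargetModel, allowsMisalignedAccessModel() and shouldUse32ByteLoad() are hypothetical names, and the sketch assumes that x86 always permits the misaligned access and only reports speed through 'Fast'.

```cpp
#include <cassert>

// Hypothetical stand-in for the x86 subtarget attributes mentioned above.
struct SubtargetModel {
  bool UnalignedMemAccessFast; // misaligned accesses below 32 bytes are fast
  bool UnalignedMem32Slow;     // misaligned 32-byte accesses are slow
};

// Simplified model of the TLI hook (assumption: the access is always legal
// on x86, so the return value is true and only *Fast varies).
bool allowsMisalignedAccessModel(const SubtargetModel &ST, unsigned SizeInBits,
                                 bool *Fast = nullptr) {
  assert(SizeInBits % 8 == 0 && "expected a whole number of bytes");
  if (Fast) {
    if (SizeInBits == 256)
      // 32-byte accesses are governed by their own CPU attribute.
      *Fast = !ST.UnalignedMem32Slow;
    else
      *Fast = ST.UnalignedMemAccessFast;
  }
  return true;
}

// Caller-side check, in the spirit of the load merging change: query the
// hook rather than reading the CPU attribute directly.
bool shouldUse32ByteLoad(const SubtargetModel &ST) {
  bool Fast = false;
  return allowsMisalignedAccessModel(ST, 256, &Fast) && Fast;
}
```

With this shape, a CPU that is fast for 16-byte misaligned accesses but slow for 32-byte ones reports Fast = true at 128 bits and Fast = false at 256 bits, so a caller that goes through the hook would keep the two 16-byte loads instead of merging them.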