This is a follow-on from the discussion in http://reviews.llvm.org/D12154.
This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has generally fast unaligned memory ops.
A motivating use case for this change is a clang invocation that doesn't explicitly set the CPU, but does target a feature that we know only exists on a CPU that supports fast unaligned memops. For example:
$ clang -O1 foo.c -mavx
This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449
Currently, we use different store types depending on whether the example can be lowered as a memset or not.