[x86, AVX] replace masked load with full vector load when possible
Converting masked vector loads to regular vector loads for x86 AVX should always be a win.
I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any
- x86 already does this kind of optimization for multiple scalar loads -> vector load.
- If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner.
Differential Revision: http://reviews.llvm.org/D18094