A masked load with a zero mask performs no load at all; the result is just the pass-through operand.
A masked load with an all-ones mask is equivalent to a normal vector load.
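To make the two cases concrete, here is a rough scalar model of a 4-lane masked load. It is only a sketch of the semantics, not code from the patch: `maskedLoad`, `Vec4`, and `Mask4` are hypothetical names made up for illustration, and the lane count and element type are arbitrary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only; not part of LLVM.
using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Scalar model of a masked load: enabled lanes read from memory,
// disabled lanes take the corresponding pass-through element.
Vec4 maskedLoad(const int *ptr, const Mask4 &mask, const Vec4 &passThru) {
  Vec4 result = passThru;
  for (std::size_t i = 0; i < result.size(); ++i) {
    if (mask[i]) {
      // Memory is only touched for enabled lanes.
      assert(ptr != nullptr && "enabled lane requires a valid pointer");
      result[i] = ptr[i];
    }
  }
  return result;
}
```

With every mask lane false, the loop never reads memory and the function returns `passThru` unchanged, so the call can be replaced by the pass-through value. With every lane true, each element comes from memory, which is what an ordinary vector load would produce.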
I think something similar may be happening in CodeGenPrepare with D13855, but that logic doesn't trigger for a target that actually supports these ops (an x86 AVX target, for example), so we may be able to remove some of it. Doing these transforms in InstCombine is a better solution because they will trigger sooner and enable more optimizations from other passes.
Eventually, I think we should be able to replace the x86 intrinsics with the LLVM IR intrinsics.