On X86 (AVX1/AVX2), non-boolean masked loads only demand the sign bit of the mask; we already do the equivalent for masked stores.
Annoyingly, I can't easily handle this inside TargetLowering::SimplifyDemandedBits, as this is an x86-specific case for a generic node.
Do we ever have expanding loads with non-i1 masks?
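
For reference, the sign-bit-only behaviour can be illustrated with a small scalar model. This is only a sketch, not code from the patch: `maskedLoadSignBit` and `keepSignBit` are hypothetical helper names, the lane type is assumed to be i32, and masked-off lanes are zeroed as VMASKMOV does (a generic passthru operand is ignored here).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of an AVX-style masked load with a non-boolean
// (vXi32) mask: a lane is loaded only if the sign bit of its mask element
// is set. Masked-off lanes are zeroed.
template <std::size_t N>
std::array<int32_t, N> maskedLoadSignBit(const std::array<int32_t, N> &Mem,
                                         const std::array<int32_t, N> &Mask) {
  std::array<int32_t, N> Result{};
  for (std::size_t I = 0; I != N; ++I)
    if (Mask[I] < 0) // only the sign bit of the mask element is inspected
      Result[I] = Mem[I];
  return Result;
}

// Hypothetical stand-in for what a demanded-bits simplification is allowed
// to do: every mask bit other than the sign bit may be dropped.
template <std::size_t N>
std::array<int32_t, N> keepSignBit(const std::array<int32_t, N> &Mask) {
  std::array<int32_t, N> Out{};
  for (std::size_t I = 0; I != N; ++I)
    Out[I] = Mask[I] < 0 ? INT32_MIN : 0;
  return Out;
}
```

Under these assumptions, loading through `Mask` and through `keepSignBit(Mask)` gives the same lanes, which is why the lower mask bits are not demanded.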