By the magic of masked loads, a widened MLOAD is almost identical to the original MLOAD.
This also needs to handle a few more INSERT_SUBVECTOR legalization cases to avoid crashing on the testcase.
The code for computing the mask is unfortunately not very efficient; maybe we need a target-independent version of whilelo? Or a DAGCombine to form whilelo?
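For readers unfamiliar with the idea, here is a minimal scalar model in plain C++ of why widening is nearly free and what a whilelo-style mask looks like. It is only a sketch, not the SelectionDAG code in this patch: `widenMask`, `maskedLoad` and `whileLoMask` are hypothetical helper names, and it assumes the extra lanes of the widened mask are simply set to false.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pad the original mask with inactive (false) lanes.
// Assumes wideLanes >= orig.size().
std::vector<bool> widenMask(const std::vector<bool> &orig, size_t wideLanes) {
  std::vector<bool> wide(orig);
  wide.resize(wideLanes, false);
  return wide;
}

// Scalar model of a masked load: inactive lanes never touch memory and
// take their value from the pass-through vector instead.
std::vector<int> maskedLoad(const int *ptr, const std::vector<bool> &mask,
                            const std::vector<int> &passThru) {
  std::vector<int> result(passThru);
  for (size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      result[i] = ptr[i];
  return result;
}

// whilelo-style predicate: lane i is active iff base + i < limit (unsigned),
// written so that base + i cannot wrap around.
std::vector<bool> whileLoMask(uint64_t base, uint64_t limit, size_t lanes) {
  std::vector<bool> mask(lanes, false);
  for (size_t i = 0; i < lanes; ++i)
    mask[i] = base < limit && i < limit - base;
  return mask;
}
```

In this model, `whileLoMask(0, origLanes, wideLanes)` yields the "first `origLanes` lanes active" predicate in one step, which is the kind of mask a whilelo-like node could produce directly.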
clang-tidy: warning: invalid case style for variable 'dl' [readability-identifier-naming]
not useful