If the root operations of the vectorization tree are integer cast operations, D41948 does not consider them as candidates for truncation. However, instcombine sometimes needs the trunc in order to trigger type shrinking or to prevent type extension.
For example, consider vectorizing
  %800 = zext <16 x i16> %799 to <16 x i32>
  %803 = zext <16 x i16> %802 to <16 x i32>
  %804 = sub nsw <16 x i32> %800, %803
  %805 = extractelement <16 x i32> %804, i32 0
  %806 = sext i32 %805 to i64
  %808 = extractelement <16 x i32> %804, i32 1
  %809 = sext i32 %808 to i64
  ...
  %850 = extractelement <16 x i32> %804, i32 15
  %851 = sext i32 %850 to i64
to
  %800 = zext <16 x i16> %799 to <16 x i32>
  %803 = zext <16 x i16> %802 to <16 x i32>
  %804 = sub nsw <16 x i32> %800, %803
  %805 = sext <16 x i32> %804 to <16 x i64>
  %806 = trunc <16 x i64> %805 to <16 x i32>
  %807 = extractelement <16 x i32> %806, i32 0
  %808 = extractelement <16 x i32> %806, i32 1
  ...
  %837 = extractelement <16 x i32> %806, i32 15
A trunc after the vectorized sext is needed to prevent the types of the first three instructions (two zext and one sub) from being widened to i64.
This patch checks whether the sources of the vectorized cast operations can be evaluated in a different type. If so, the trunc is still added.
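As a rough scalar illustration of why the trunc after the sext does not change any values, the sketch below models one lane of the example in plain C++. The function names are hypothetical and nothing here is taken from the patch itself; it only assumes that the inputs are 16-bit values zero-extended to 32 bits, as in the IR above.

```cpp
#include <cstdint>

// One lane of the vectorized sequence:
// zext i16 -> i32, sub, sext i32 -> i64, trunc i64 -> i32.
int32_t lane_with_sext_trunc(uint16_t a, uint16_t b) {
    int32_t wide_a = static_cast<int32_t>(a);      // zext i16 -> i32
    int32_t wide_b = static_cast<int32_t>(b);      // zext i16 -> i32
    int32_t diff = wide_a - wide_b;                // sub; cannot overflow i32
    int64_t extended = static_cast<int64_t>(diff); // sext i32 -> i64
    return static_cast<int32_t>(extended);         // trunc i64 -> i32
}

// The same lane computed entirely in i32, without the sext/trunc pair.
int32_t lane_i32_only(uint16_t a, uint16_t b) {
    return static_cast<int32_t>(a) - static_cast<int32_t>(b);
}
```

Both functions return the same value for every pair of 16-bit inputs, which is consistent with the sext/trunc pair being a no-op on the i32 result.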