Previously, the conversion cost table had no entries for integer-to-integer conversions on SSE2, which results in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2. The cost numbers were derived from the output of running llc on the new test case in this patch.
These values don't appear to be correct for SSE41, which has the PMOVSX/PMOVZX ops. Maybe split the 128-bit extension ops off from AVXConversionTbl into an SSE41ConversionTbl?
You may not need SSSE3 (PSHUFB is often as costly as fixed shuffles on older hardware; it just reduces register use), but splitting the extensions from AVX into SSE41 needs to be done.
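For illustration, here is a rough sketch of the layered lookup this split implies: check the most capable ISA table first and fall through to the older one. Everything in it is a simplified, hypothetical stand-in (the `Op`, `VT`, `ConvEntry`, `Subtarget`, `lookup` and `conversionCost` names are not the real LLVM API), and the cost values are placeholders, not the numbers from this patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the real LLVM types.
enum class Op { SExt, ZExt, Trunc };
enum class VT { v4i8, v4i16, v4i32 };

struct ConvEntry {
  Op op;
  VT dst;
  VT src;
  unsigned cost;
};

struct Subtarget {
  bool hasSSE2;
  bool hasSSE41;
};

// Placeholder costs for illustration only; not taken from the patch.
const std::vector<ConvEntry> SSE41Tbl = {
    {Op::SExt, VT::v4i32, VT::v4i8, 1},
    {Op::ZExt, VT::v4i32, VT::v4i8, 1},
};

const std::vector<ConvEntry> SSE2Tbl = {
    {Op::SExt, VT::v4i32, VT::v4i8, 5},
    {Op::ZExt, VT::v4i32, VT::v4i8, 2},
    {Op::Trunc, VT::v4i8, VT::v4i32, 3},
};

// Linear scan; returns nullptr when the table has no matching entry.
const ConvEntry *lookup(const std::vector<ConvEntry> &tbl, Op op, VT dst,
                        VT src) {
  for (const ConvEntry &e : tbl)
    if (e.op == op && e.dst == dst && e.src == src)
      return &e;
  return nullptr;
}

// Query the newest ISA table first, then fall back to older ones, and
// finally to a caller-supplied generic cost.
unsigned conversionCost(const Subtarget &st, Op op, VT dst, VT src,
                        unsigned fallback) {
  assert((!st.hasSSE41 || st.hasSSE2) && "SSE4.1 implies SSE2");
  if (st.hasSSE41)
    if (const ConvEntry *e = lookup(SSE41Tbl, op, dst, src))
      return e->cost;
  if (st.hasSSE2)
    if (const ConvEntry *e = lookup(SSE2Tbl, op, dst, src))
      return e->cost;
  return fallback;
}
```

With this layout, an SSE41 target still picks up SSE2 entries (such as the truncation above) that the SSE41 table does not override, so each table only needs to list the conversions that actually get cheaper at that ISA level.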
I haven't checked this very thoroughly but you might need to improve non-simple type handling here, especially for extensions?
I agree. SSE41 does provide several instructions that can greatly reduce those costs. I have added a table for SSE41 and also updated the test case. PTAL.
This patch is actually adding cost entries for non-simple types. In addition to this query, I added another query on the table below for non-simple types. The float/int conversion entries also need to be updated for non-simple types, and I will do that later. To prevent an explosion of vector-size/type combinations, I think we probably need to redesign the cost table.
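As a rough sketch of one way to cost a non-simple type without adding a table entry per size: legalize the wide type into register-sized parts, then scale a per-register cost by the number of parts. This is an assumed, simplified model (128-bit registers, power-of-two lane counts, splitting decided by the destination type only), and `VecTy`, `legalize`, `legalSExtCost` and `sextCost` are hypothetical names with placeholder costs, not the code in this patch.

```cpp
#include <cassert>
#include <utility>

// Hypothetical simplified vector type: `lanes` elements of `bits` bits each.
struct VecTy {
  unsigned lanes;
  unsigned bits;
};

// Assumed legalization model: halve the lane count until the vector fits in
// one 128-bit register. Returns (number of parts, type of each part).
std::pair<unsigned, VecTy> legalize(VecTy ty) {
  assert(ty.lanes > 0 && ty.bits > 0);
  unsigned parts = 1;
  while (ty.lanes > 1 && ty.lanes * ty.bits > 128) {
    ty.lanes /= 2;
    parts *= 2;
  }
  return {parts, ty};
}

// Placeholder per-register sign-extension costs; illustrative only.
unsigned legalSExtCost(unsigned srcBits, unsigned dstBits) {
  if (srcBits == 8 && dstBits == 32)
    return 5;
  if (srcBits == 16 && dstBits == 32)
    return 2;
  return 1;
}

// Cost of sign-extending every lane of `src` to `dstBits` bits: split the
// (wider) destination type into legal parts and pay the per-part cost once
// for each part.
unsigned sextCost(VecTy src, unsigned dstBits) {
  assert(dstBits > src.bits && "sign extension must widen the element");
  VecTy dst{src.lanes, dstBits};
  std::pair<unsigned, VecTy> legal = legalize(dst);
  return legal.first * legalSExtCost(src.bits, dstBits);
}
```

Under this model the table only needs entries keyed on legal types, and wider vectors are covered by the multiplier. It ignores any extra shuffles needed to split the source, which is part of why a more careful redesign may still be needed.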