Add shufflevector instruction to the expression graph post-dominated by trunc,
allowing TruncInstCombine to reduce bitwidth of expressions containing these
Authored by anton-afanasyev on Mar 22 2022, 8:05 AM.
I mean, what if we have a two-input shuffle, and one of the operands is unused as per the shuffle mask?
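To make the unused-operand case concrete, here is a minimal standalone sketch of how one could tell from the mask alone which operands a two-input shuffle actually reads. It does not use LLVM headers and is not code from this patch: `usedShuffleOperands` and its parameters are hypothetical names, and the only convention borrowed from LLVM is that a negative mask index (-1) marks an undefined lane.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, not part of TruncInstCombine.
// Returns {usesFirst, usesSecond} for a two-input shuffle mask.
// Indices in [0, numSrcElts) select from the first operand, indices in
// [numSrcElts, 2 * numSrcElts) select from the second operand, and a
// negative index marks an undefined lane that reads neither operand.
std::pair<bool, bool> usedShuffleOperands(const std::vector<int> &mask,
                                          int numSrcElts) {
  bool usesFirst = false;
  bool usesSecond = false;
  for (int idx : mask) {
    if (idx < 0)
      continue; // undefined lane
    assert(idx < 2 * numSrcElts && "mask index out of range");
    if (idx < numSrcElts)
      usesFirst = true;
    else
      usesSecond = true;
  }
  return {usesFirst, usesSecond};
}
```

For example, with four-element sources the mask `{0, 2, 1, 3}` reads only the first operand, so the second operand would not need to be part of the truncated expression graph at all.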
This might introduce regressions as the shuffle costs for the same mask but different element types can vary considerably (SSE v4i32/v4i16 unary shuffles are really cheap but v4i8 or v4i64 can be a lot more expensive).
Do you mean using TTI.getShuffleCost()? There's an issue here: we don't know the exact shuffle type at the point where we would need its cost. That type (given by MinBitWidth) is inferred only after the expression graph has already been built. Handling this would require refactoring the whole pass, which seems excessive.