[X86] Attempt to more accurately model the cost of a bool reduction of wide vector type.
Previously we multiplied the cost for the table entries by the number of splits needed. But that implies that each split goes through a reduction to scalar independently. I think what really happens is that the we AND/OR the split pieces until we're down to a single value with a legal type and then do special reduction sequence on that.
So to model that this patch takes the number of splits minus one multiplied by the cost of a AND/OR at the legal element count and adds that on top of the table lookup.
Differential Revision: https://reviews.llvm.org/D76400