This patch adds a peephole optimization that simplifies specific instruction sequences ending in XXSPLTW. Which sequences are simplified, and what they are transformed into, depends on whether the target is P8 or P9.
For P9:
- load -> permute (or a shift) -> xxspltw will become lxvwsx
- load -> xxspltw will become lxvwsx
For P8:
- load -> permute (or a shift) -> xxspltw will become load -> xxspltw, ensuring that the correct element is splatted while removing the redundant permute instruction.
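
As an illustration only, the rules above can be modeled as a small rewrite function. This is a toy sketch, not the code in this patch: the names `Cpu`, `Seq`, and `simplifySplat` are hypothetical, and instructions are represented as plain strings rather than real machine instructions.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the subtarget check (P8 vs P9).
enum class Cpu { P8, P9 };

// Toy model: an instruction sequence is a list of opcode-class names.
using Seq = std::vector<std::string>;

// Applies the rewrite rules from the description to a whole sequence.
// Sequences that match no rule are returned unchanged.
Seq simplifySplat(const Seq &in, Cpu cpu) {
  const bool loadSplat = in == Seq{"load", "xxspltw"};
  const bool loadPermSplat = in == Seq{"load", "permute", "xxspltw"} ||
                             in == Seq{"load", "shift", "xxspltw"};

  // P9: both forms collapse into a single load-and-splat.
  if (cpu == Cpu::P9 && (loadSplat || loadPermSplat))
    return {"lxvwsx"};

  // P8: only the redundant permute/shift is dropped.
  if (cpu == Cpu::P8 && loadPermSplat)
    return {"load", "xxspltw"};

  return in;
}
```

Note that the sketch leaves a plain load -> xxspltw untouched on P8, since the description lists no P8 rule for it. It also does not model how the splatted element is chosen once the permute is removed.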