This patch fixes PR31144.
Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so moving a 1- or 2-byte value to a VSR through MTVSRWZ is much faster than storing the extended value to the stack and loading it back with LXSIWZX.
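For illustration only, here is a minimal sketch of the kind of source that could exercise this path, assuming the affected pattern is a 1- or 2-byte integer load whose value is then needed in a VSR (for example, for a conversion to floating point). The function names are hypothetical and do not come from the patch or from PR31144. Whether the compiler actually emits MTVSRWZ depends on the target options (for example `-mcpu=power8`) and the compiler version.

```c
#include <stdint.h>

/* Hypothetical helper: load one byte and convert it to double.
 * The zero-extended byte has to reach a VSR before the conversion. */
double byte_to_double(const uint8_t *p) {
    return (double)*p;
}

/* Hypothetical helper: the same pattern for a 2-byte (halfword) load. */
double half_to_double(const uint16_t *p) {
    return (double)*p;
}
```

Comparing the generated assembly for these functions before and after the patch would be one way to check whether the store and LXSIWZX reload has been replaced by a direct move.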
Differential D27287
[PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSR
Authored by Carrot on Nov 30 2016, 4:11 PM.
Event Timeline
Other than the inline comment, I think this patch is fine, but I'll let Hal have a look for the official stamp of approval.