We currently have no way to emit xxspltw for a word splat, and a call to vec_splat on vectors with 8-byte elements gets translated to a vperm whose mask vector is loaded from memory.
This patch provides intrinsics to get at xxspltw and xxspltd (extended mnemonic). In a subsequent patch, altivec.h will be modified to use these intrinsics for the respective vec_splat definitions.
This provides a significant improvement in one benchmark that uses vec_splat.
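For reference, here is a minimal scalar sketch in portable C of the splat semantics the description refers to: one selected element is copied into every lane. It is not the intrinsic added by this patch and not the altivec.h implementation. The type and function names (v4u32, v2u64, splat_word, splat_doubleword) are hypothetical, and the index is a plain array index that ignores endian-specific element numbering.

```c
#include <stdint.h>

/* Hypothetical stand-ins for 128-bit vectors; not the real vector types. */
typedef struct { uint32_t w[4]; } v4u32;  /* four 4-byte words       */
typedef struct { uint64_t d[2]; } v2u64;  /* two 8-byte doublewords  */

/* Model of a word splat: copy word `idx` into all four lanes. */
static v4u32 splat_word(v4u32 v, unsigned idx) {
    v4u32 r;
    uint32_t x = v.w[idx & 3u];           /* keep the index in range 0..3 */
    for (int i = 0; i < 4; i++) {
        r.w[i] = x;
    }
    return r;
}

/* Model of a doubleword splat: copy doubleword `idx` into both lanes. */
static v2u64 splat_doubleword(v2u64 v, unsigned idx) {
    v2u64 r;
    uint64_t x = v.d[idx & 1u];           /* keep the index in range 0..1 */
    r.d[0] = x;
    r.d[1] = x;
    return r;
}
```

Both operations depend only on a compile-time element index, which is why a single register-to-register splat instruction can stand in for a vperm that needs a mask vector loaded from memory.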
It looks like the "This is not yet implemented. When it is, we need to uncomment the following:" part of this comment is outdated. If so, please remove it.