On AMD processors we may be able to handle indirect jumps with a simpler lfence-based mechanism, though indirect calls may still require a retpoline. If that turns out to be the right solution for AMD, we will need to add code to support it.
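
To make the suggestion concrete, here is a rough sketch of the kind of per-vendor selection I have in mind. All names here (`Vendor`, `BranchKind`, `Mitigation`, `chooseMitigation`) are hypothetical and do not exist in the patch, and the policy itself is only the assumption stated above: lfence for indirect jumps on AMD, retpoline everywhere else.

```cpp
// Hypothetical names throughout; none of these come from the patch under review.
enum class Vendor { Intel, AMD };
enum class BranchKind { IndirectJump, IndirectCall };
enum class Mitigation { Retpoline, LFenceBeforeBranch };

// Assumption (not confirmed): on AMD an lfence ahead of an indirect jump
// is enough, while indirect calls keep going through the retpoline thunk.
// Every other combination falls back to the retpoline.
constexpr Mitigation chooseMitigation(Vendor V, BranchKind K) {
  if (V == Vendor::AMD && K == BranchKind::IndirectJump)
    return Mitigation::LFenceBeforeBranch;
  return Mitigation::Retpoline;
}

// Compile-time checks of the policy as sketched.
static_assert(chooseMitigation(Vendor::AMD, BranchKind::IndirectJump) ==
              Mitigation::LFenceBeforeBranch, "AMD jumps use lfence");
static_assert(chooseMitigation(Vendor::AMD, BranchKind::IndirectCall) ==
              Mitigation::Retpoline, "AMD calls keep the retpoline");
```

Keeping the retpoline as the default means an unknown or unhandled case stays on the conservative path.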

vpermps/vpermpd/vpermd are expensive on Ryzen. This patch creates cases where a vperm* is introduced where none existed before, which may cause slowdowns on Ryzen.

