relational: Implement shuffle builtin
This builtin was added in OpenCL 1.1.
Tested with a Radeon HD 7850 (Pitcairn) using the CL CTS via:
v2: Add half-precision support to shuffle when available.
    Move to misc/ and add section 6.12.12 to clc.h.
Signed-off-by: Aaron Watry <firstname.lastname@example.org>
Reviewed-by: Jan Vesely <email@example.com>