This patch implements buffer_load_format and tbuffer_load_format intrinsics that support half data types.
For types that are not currently legal (v4f16, for example), we use ReplaceNodeResults to change the
type and cast it back after custom lowering.
I think we should invert this to HasUnpackedD16Mem. Only the one odd target uses the unpacked layout; the packed layout is the expected one for every other subtarget.
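
For reference, here is a rough sketch of the two layouts as I understand them. It is a standalone model, not code from the patch: packD16 and unpackedD16 are hypothetical helper names, and placing the first component in the low half of a packed register, and zeroing the upper bits in the unpacked case, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers (not LLVM APIs): model how 16-bit components
// would occupy 32-bit registers under each D16 memory layout.

// Packed layout: two 16-bit components share one 32-bit register.
// Assumption: the even-indexed component sits in the low 16 bits.
std::vector<uint32_t> packD16(const std::vector<uint16_t> &Elts) {
  std::vector<uint32_t> Regs((Elts.size() + 1) / 2, 0);
  for (std::size_t I = 0; I < Elts.size(); ++I)
    Regs[I / 2] |= static_cast<uint32_t>(Elts[I]) << (16 * (I % 2));
  return Regs;
}

// Unpacked layout: each 16-bit component gets its own 32-bit register,
// stored in the low 16 bits. The upper bits are zeroed in this model.
std::vector<uint32_t> unpackedD16(const std::vector<uint16_t> &Elts) {
  std::vector<uint32_t> Regs;
  Regs.reserve(Elts.size());
  for (uint16_t E : Elts)
    Regs.push_back(static_cast<uint32_t>(E));
  return Regs;
}
```

With four components, the packed form needs two registers and the unpacked form needs four, which is why the unpacked case is the one that wants a dedicated subtarget feature.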