Intel documents the 128-bit versions of the F16C intrinsics as being in emmintrin.h and the 256-bit versions as being in immintrin.h.
This patch adds a new __emmtrin_f16c.h to hold the 128-bit versions, which is included from emmintrin.h. The existing f16cintrin.h now contains only the 256-bit versions and is included from immintrin.h, with an error if it's included directly.
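For readers unfamiliar with the "error if included directly" pattern, here is a minimal sketch of how such a check is usually written. SKETCH_IMMINTRIN_H and sketch_included_via_umbrella are hypothetical names used only for illustration; the real headers use their own guard macro, and the exact error text in the patch may differ.

```c
/* Sketch only: SKETCH_IMMINTRIN_H is a hypothetical stand-in for the macro
   that the umbrella header defines before pulling in its sub-headers. */

/* What the umbrella header (immintrin.h) would do first. */
#define SKETCH_IMMINTRIN_H

/* What the top of the sub-header (f16cintrin.h) would do. */
#if !defined(SKETCH_IMMINTRIN_H)
#error "Never use <f16cintrin.h> directly; include <immintrin.h> instead."
#endif

/* Reaching this point means the include came through the umbrella header. */
static int sketch_included_via_umbrella(void) { return 1; }
```

If the #define is removed, compilation stops at the #error line, which is the behaviour a direct include of the sub-header is meant to trigger.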
One interesting thing to note here: the 256-bit F16C intrinsics were being guarded by AVX2 when _MSC_VER was defined and modules weren't supported. This was definitely incorrect, since F16C is a separate feature from AVX2.
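To make the guard issue concrete, the sketch below contrasts the condition described above with one keyed on F16C itself. The SKETCH_* macros and helper functions are hypothetical, and the second condition is an assumption about what a corrected guard could look like, not a quote from the patch.

```c
/* Fallback so the sketch also compiles on compilers without __has_feature. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Condition as described for the old behaviour: with _MSC_VER defined and no
   modules support, the 256-bit F16C intrinsics were visible only with AVX2. */
#if !defined(_MSC_VER) || __has_feature(modules) || defined(__AVX2__)
#define SKETCH_OLD_GUARD_VISIBLE 1
#else
#define SKETCH_OLD_GUARD_VISIBLE 0
#endif

/* Assumed corrected form: key the same condition on F16C instead of AVX2. */
#if !defined(_MSC_VER) || __has_feature(modules) || defined(__F16C__)
#define SKETCH_NEW_GUARD_VISIBLE 1
#else
#define SKETCH_NEW_GUARD_VISIBLE 0
#endif

/* Small accessors so the result of each condition can be inspected. */
static int sketch_old_guard_visible(void) { return SKETCH_OLD_GUARD_VISIBLE; }
static int sketch_new_guard_visible(void) { return SKETCH_NEW_GUARD_VISIBLE; }
```

Under the old condition, a build with _MSC_VER defined, no modules, and F16C enabled but AVX2 disabled would hide the 256-bit intrinsics even though the target supports them.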