MSVC header files use vectorcall to differentiate overloaded functions, which
causes failures for the AMDGPU target. This is because clang does not check a
function's calling convention based on the function's target.
This patch checks the calling convention using the proper target info.
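As a rough illustration of the idea only, here is a toy model that is not clang's actual code. Every name in it (`ToyTargetInfo`, `FnTarget`, `checkForFunction`) is hypothetical, and it assumes a host target that supports vectorcall and a device target that does not. The calling convention is checked only against the target(s) the function is compiled for:

```cpp
#include <cassert>

// Hypothetical toy types; none of these are clang APIs.
enum class CallConv { C, VectorCall };
enum class FnTarget { Host, Device, HostDevice };
enum class CheckResult { OK, Ignore };

struct ToyTargetInfo {
  bool supportsVectorCall;

  // Report whether this target accepts the given calling convention.
  CheckResult check(CallConv cc) const {
    if (cc == CallConv::VectorCall && !supportsVectorCall)
      return CheckResult::Ignore;
    return CheckResult::OK;
  }
};

// Check the convention against the host and/or device target info,
// depending on where the function is meant to run.
CheckResult checkForFunction(CallConv cc, FnTarget ft,
                             const ToyTargetInfo &host,
                             const ToyTargetInfo &device) {
  const bool checkHost = ft != FnTarget::Device;
  const bool checkDevice = ft != FnTarget::Host;

  CheckResult result = CheckResult::OK;
  if (checkHost)
    result = host.check(cc);
  if (result == CheckResult::OK && checkDevice)
    result = device.check(cc);
  return result;
}
```

In this model a host-only vectorcall function is accepted even when the device target lacks vectorcall, while the same convention on a device or host-and-device function is flagged.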
This is rather convoluted.
Perhaps it would be easier to understand if the code were restructured along these lines: