X86: Implement the vectorcall calling convention
This is a Microsoft calling convention that supports both x86 and x86_64
subtargets. It passes vector and floating point arguments in XMM0-XMM5,
and passes them indirectly once they are consumed.
Homogenous vector aggregates of up to four elements can be passed in
sequential vector registers, but this part is not implemented in LLVM
and will be handled in Clang.
On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as
integer register parameters and is callee cleanup. On x86_64, it
delegates to the normal win64 calling convention.
Differential Revision: http://reviews.llvm.org/D5943