When deducing a reference type for a forwarding reference, do not add the default address space to the template argument if it already has one.
In OpenCL, all parameters are in the private address space. Therefore, when we initialize a forwarding reference with a parameter, we should just inherit the address space from it, i.e. keep __private instead of __generic (see test case).
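For context, here is a minimal sketch of the deduction rule involved, written in standard C++ because address space qualifiers cannot be expressed there. The helper deduces_to and the function pass_param are hypothetical names used only for illustration; they are not part of the patch or its test case. Passing a named parameter (an lvalue) to T && deduces T as an lvalue reference, and in OpenCL that deduced reference is where the address space matters: it should stay __private rather than become __generic.

```cpp
#include <type_traits>

// Hypothetical helper: true when the forwarding reference deduces T as Expected.
template <typename Expected, typename T>
constexpr bool deduces_to(T &&) {
    return std::is_same<Expected, T>::value;
}

// Hypothetical caller: 'param' is a named parameter and therefore an lvalue,
// so T is deduced as 'int &'. In OpenCL this parameter lives in the private
// address space (not modeled here), which the deduced type should keep.
constexpr bool pass_param(int param) {
    return deduces_to<int &>(param);
}

// An lvalue argument deduces T = int &.
static_assert(pass_param(0), "lvalue argument should deduce T as int &");
// An rvalue argument deduces T = int.
static_assert(deduces_to<int>(42), "rvalue argument should deduce T as int");
```

The static_asserts make the sketch self-checking at compile time; the actual address space behavior can only be observed with an OpenCL-enabled compiler.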
This check fails on 32-bit Windows, where the compiler adds __attribute__((thiscall)) to both the constructor and the call operator.
Something like this should fix the problem: