An sret value is returned through a temporary variable allocated on the
stack, so the sret pointer should use the alloca address space.
Currently clang uses the default address space for sret pointers. This
causes inefficient code to be generated for the AMDGPU backend, since the
alloca address space is 32-bit whereas a generic pointer is 64-bit.
It also triggers assertions where the alloca address space is expected.
This patch uses the alloca address space for sret pointers. It is NFC
for targets whose alloca address space is the default address space.
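
For illustration only, here is a minimal sketch of source code that would
typically exercise the sret path. The names Big, makeBig, sumBig and
selfCheck are hypothetical and are not part of this patch. Whether a given
aggregate is actually returned via sret depends on the target ABI.

```cpp
#include <cassert>

// Hypothetical aggregate, large enough that common ABIs return it in memory.
struct Big {
  int data[16];
};

// Returned by value: clang typically lowers this to a function that takes a
// hidden pointer parameter marked `sret`, pointing at caller-owned storage.
Big makeBig(int seed) {
  Big b;
  for (int i = 0; i < 16; ++i)
    b.data[i] = seed + i;
  return b;
}

// The caller's temporary for the result lives on the stack (an alloca in IR),
// which is why the sret pointer naturally belongs to the alloca address space.
int sumBig(int seed) {
  Big tmp = makeBig(seed);
  int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += tmp.data[i];
  return sum;
}

// Small usage check: 0 + 1 + ... + 15 == 120.
void selfCheck() {
  assert(sumBig(0) == 120);
}
```

Emitting LLVM IR for such a file (for example with clang's -S -emit-llvm
options) should show the sret attribute on the first parameter of makeBig
on targets that return Big in memory.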