Taking the address of a parameter is legal in PTX, and we do generate code that does it.
Alas, such code is currently miscompiled by ptxas on sm_50+ (NVIDIA issue 1789042).
Work around the issue by enforcing a minimum alignment on byval arguments of device functions.
The change is effectively a no-op at the SASS level for sm_3x, where ptxas already aligns the local copy to at least 4 bytes.
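
As a rough illustration of the workaround, the alignment rule could be sketched as below. This is not the code from the patch: the helper name byvalParamAlign and the constant kMinByValAlign are hypothetical, and the minimum of 4 is an assumption inferred from the sm_3x behaviour described above.

```cpp
// Hypothetical sketch: clamp the alignment of a byval device-function
// argument to a minimum value. Names and the value 4 are assumptions,
// not taken from the actual change.
constexpr unsigned kMinByValAlign = 4;

// Returns the alignment to emit for a byval parameter whose ABI
// alignment is abiAlign: never less than kMinByValAlign, otherwise
// the ABI alignment unchanged.
constexpr unsigned byvalParamAlign(unsigned abiAlign) {
  return abiAlign < kMinByValAlign ? kMinByValAlign : abiAlign;
}
```

Under these assumptions, a struct of chars (ABI alignment 1) would be emitted with alignment 4, while a struct containing a double (ABI alignment 8) would keep its alignment of 8.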