The alloca + memcpy could be optimized out entirely by Clang,
since the buffer wasn't referenced by anything outside of the function
and had no visible effect.
Even if it weren't optimized out, there is strictly no
guarantee that the buffer allocated by alloca sits exactly at the
stack frame boundary when the next function is called; the compiler
could easily add extra padding around the allocation at the
bottom of the frame.
Switch to the same full switch statement as the fallback in z_Linux_util.cpp,
which supports up to 15 arguments. This fixes calls to microtasks with
more than 6 parameters in optimized builds with Clang. However, for
build configurations where the alloca approach happened to work, this
change breaks the misc_bugs/many-microtask-args.c test, which
requires passing 17 arguments to the microtask.
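As a rough illustration of the switch-based approach described above, here is a minimal sketch in plain C. It is not the code from z_Linux_util.cpp: the names sketch_invoke, generic_fn, fn0..fn3 and sum2 are hypothetical, and the sketch stops at 3 arguments instead of 15 to stay short. It shows why the technique is bounded: each supported argument count needs its own case, and anything larger falls through to the default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic function pointer used to carry the microtask. */
typedef void (*generic_fn)(void);

/* One concrete signature per supported argument count (assumed shapes). */
typedef void (*fn0)(int *, int *);
typedef void (*fn1)(int *, int *, void *);
typedef void (*fn2)(int *, int *, void *, void *);
typedef void (*fn3)(int *, int *, void *, void *, void *);

/* Dispatch on argc with an explicit call per case, so every argument is
   passed through the normal calling convention and nothing depends on
   stack layout. Returns 1 on success, 0 if argc is not supported. */
static int sketch_invoke(generic_fn pkfn, int gtid, int tid, int argc,
                         void *argv[]) {
  assert(pkfn != NULL);
  switch (argc) {
  case 0:
    ((fn0)pkfn)(&gtid, &tid);
    break;
  case 1:
    ((fn1)pkfn)(&gtid, &tid, argv[0]);
    break;
  case 2:
    ((fn2)pkfn)(&gtid, &tid, argv[0], argv[1]);
    break;
  case 3:
    ((fn3)pkfn)(&gtid, &tid, argv[0], argv[1], argv[2]);
    break;
  default:
    /* More arguments than the switch has cases for: cannot be called. */
    return 0;
  }
  return 1;
}

/* Hypothetical two-argument microtask: adds *b into *a. */
static void sum2(int *gtid, int *tid, void *a, void *b) {
  (void)gtid;
  (void)tid;
  *(int *)a += *(int *)b;
}
```

Calling sketch_invoke((generic_fn)sum2, 0, 0, 2, args) with args pointing at two ints runs the microtask, while any argc above the highest case is rejected. That is the same limitation that makes the 17-argument test fail with a 15-case switch.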
This tries to fix the same issue as D137827, but does so in
pure C code. However, this fix only works up to a fixed maximum
number of arguments.
The same implementation is used as the fallback implementation in
z_Linux_util.cpp for architectures without an assembly implementation,
where it also fails the misc_bugs/many-microtask-args.c test.
Is there a fallback path for ARM64 MSVC? If there isn't, then our libomp140.aarch64.dll might break. @natgla FYI