This change introduces GOMP doacross compatibility. There are 12 new interface
functions 6 for long type and 6 for unsigned long long type:
GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_doacross_[schedule]_start
where schedule can be static, dynamic, guided, or runtime.
These functions just translate the parameters if necessary and send them
to the corresponding kmp function.
E.g., GOMP_doacross_post() -> __kmpc_doacross_post()
For the GOMP_doacross_post function, there is template specialization to
account for when long is a four byte vs an eight byte type. If it is a
four byte type, then a temporary array has to be created to convert the
four byte integers into eight byte integers and then sending that into
__kmpc_doacross_post(). Because GOMP_doacross_wait uses varargs, it
always needs a temporary array and does not need template specialization.
This statement seems to cause the harm. I think this makes all threads free their doacross buffer after processing their last iteration, right?
That's breaking runtime/test/worksharing/for/kmp_doacross_check.c which has its own call to __kmpc_doacross_fini at line 46 :-(