CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions we
introduce the ompx_attribute clause that takes these kernel
attributes and emits code as if they had been attached to the kernel
function (which is implicitly generated).
To limit the impact, we only support three kernel attributes:
 - amdgpu_waves_per_eu, for AMDGPU
 - amdgpu_flat_work_group_size, for AMDGPU
 - launch_bounds, for NVPTX
The existing implementations of those attributes are used for error
checking and code generation. ompx_attribute can be attached to any
executable target region and it can hold more than one kernel attribute.
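
A minimal usage sketch, assuming a Clang build with this change and OpenMP
offloading enabled. The function name `scale` and the attribute values are
hypothetical and chosen only for illustration. Each kernel attribute is given
its own ompx_attribute clause here. Compilers without this extension (or
without OpenMP enabled) are expected to ignore the pragma and run the loop on
the host.

```cpp
#include <vector>

// Hypothetical helper: scales each element of `data` in place.
void scale(float *data, int n, float factor) {
  // launch_bounds is meant for NVPTX and amdgpu_flat_work_group_size for
  // AMDGPU. The numeric values below are illustrative assumptions, not
  // tuned recommendations.
#pragma omp target teams distribute parallel for map(tofrom : data[0:n]) \
    ompx_attribute(__attribute__((launch_bounds(128, 2)))) \
    ompx_attribute([[clang::amdgpu_flat_work_group_size(64, 128)]])
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}
```

The attributes only influence code generation for the implicitly generated
kernel; the result of the loop should be the same with or without them.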
Note: it is bad to have a Sema dependency in the AST.