Currently, if a kernel has no arguments, device synchronization is
skipped. This can lead to two issues:
- If there is any device error, it will not be captured;
- The target region might end before the kernel is done, which is not spec conformant.
The test added in this patch runs only on the NVPTX platform, and it will not
be executed by Phab at all. It also requires the `not` tool, which is not
available on most systems.
", and it is invoked asynchronously", really? Maybe the language is just confusing.