Add a new call to Clang to perform task allocation for the target.
Details
Diff Detail
- Repository
- rC Clang
Event Timeline
As for D63010, the new function isn't used anywhere (and hence cannot be tested). What's the advantage of splitting changes at such granularity?
This patches does not add any functionality. I thought, it was published by an accident. The patch should be abandoned or reworked to add a functional part of the code.
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | For nowait we need to use the target_task_alloc variant. There are runtimes - other than the open source one - for which this function does something different. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | Trunk relies only to libomptarget interfaces. Why we should take into account some other runtime libraries? Plus, what's so different for async and non-async versions of the directive? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | Async is not yet fully supported in the OpenMP open source runtime but at some point it will be. This is the first step towards its support. I'm not sure what your objection is. The difference is clear. Device ID must be passed on the async version of this call. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | The difference is not obvious. Why we can't use the same function for sync directives? The fact the it has DeviceId parameter is not an argument here. What's so special from functional point of view? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | When you have multiple device on the same system you have to be able to distinguish between then and manage dependencies across these different devices. Knowing the device is a crucial first step in handling these inter-device dependencies. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | Ok, this is why we need it for async version. But why we can't use for sync version? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | Based on how the runtime is currently structured, nowait functions are separate from the synchronous ones. See target vs. target_nowait functions and there are many examples like that. The async support is always kept separate from the synchronous support at least in terms of entry points in the runtime. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | Right now, target_task indicates two things: 1) there are depenences to offload and 2) it's a nowait constructs. If we use target_task_alloc for both situations, we need to distinguish these two cases using flags, because the handling iseventually different. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
5121 | Thanks a lot for the comment @AlexEichenberger |
Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?
If I had to guess this is because some directives have the device clause, so the tests should check that the generated code passes the correct argument in that case.
The tests must check the device ID for target-based calls of the task alloc function.
When I was just passing the default value the LLVM-IR was: i64 -1 i.e. constant, easy to check.
With the latest change the emitted code is: i64 %123 i.e. where %123 is a local derived from the expression of the device ID.
If the value is constant, check for the constant. And at least several tests with the expressions should check for the correct value of the expression.
All the tests here use expressions so the value is never constant. If you'd like me to add a test with constant I can.
And at least several tests with the expressions should check for the correct value of the expression.
I'll have to check how to do this. There's nothing that distinguishes an expression that represents the device ID from any other expression in the code.
Yes, there must be at least one test with the default value.
And at least several tests with the expressions should check for the correct value of the expression.
I'll have to check how to do this. There's nothing that distinguishes an expression that represents the device ID from any other expression in the code.
device_id is int64_t