This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Add target task alloc function with device ID
ClosedPublic

Authored by gtbercea on Jun 7 2019, 8:57 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

gtbercea created this revision.Jun 7 2019, 8:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2019, 8:57 AM

As for D63010, the new function isn't used anywhere (and hence cannot be tested). What's the advantage of splitting changes at such granularity?

lib/CodeGen/CGOpenMPRuntime.cpp
610 ↗(On Diff #203565)

device_id is int64_t

1928 ↗(On Diff #203565)

device_id must be a int64_t type, not a machine dependent quantity.

This patches does not add any functionality. I thought, it was published by an accident. The patch should be abandoned or reworked to add a functional part of the code.

gtbercea updated this revision to Diff 204139.Jun 11 2019, 1:02 PM
  • Add temporary implementation.
Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2019, 1:02 PM
gtbercea marked 2 inline comments as done.Jun 11 2019, 1:05 PM
gtbercea updated this revision to Diff 204141.Jun 11 2019, 1:09 PM
  • Add tests.
gtbercea updated this revision to Diff 204294.Jun 12 2019, 8:00 AM
  • Add device ID if available.
gtbercea updated this revision to Diff 204320.Jun 12 2019, 10:08 AM
  • Fix tests.
This revision is now accepted and ready to land.Jun 12 2019, 10:42 AM
ABataev added inline comments.Jun 13 2019, 2:09 PM
lib/CodeGen/CGOpenMPRuntime.cpp
1927 ↗(On Diff #204320)

The comment is not correct, fix the function name

5122 ↗(On Diff #204320)

Can we use the same function in both modes, with nowait clause and without?

gtbercea marked 2 inline comments as done.Jun 14 2019, 7:10 AM
gtbercea added inline comments.
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

For nowait we need to use the target_task_alloc variant. There are runtimes - other than the open source one - for which this function does something different.

gtbercea marked an inline comment as done.Jun 14 2019, 7:11 AM
gtbercea updated this revision to Diff 204763.Jun 14 2019, 7:13 AM
  • Fix function name.
gtbercea marked an inline comment as done.Jun 14 2019, 7:13 AM
ABataev added inline comments.Jun 14 2019, 7:29 AM
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

Trunk relies only to libomptarget interfaces. Why we should take into account some other runtime libraries? Plus, what's so different for async and non-async versions of the directive?

gtbercea marked an inline comment as done.Jun 14 2019, 7:36 AM
gtbercea added inline comments.
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

Async is not yet fully supported in the OpenMP open source runtime but at some point it will be. This is the first step towards its support. I'm not sure what your objection is. The difference is clear. Device ID must be passed on the async version of this call.

ABataev added inline comments.Jun 14 2019, 7:45 AM
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

The difference is not obvious. Why we can't use the same function for sync directives? The fact the it has DeviceId parameter is not an argument here. What's so special from functional point of view?

gtbercea marked an inline comment as done.Jun 14 2019, 7:48 AM
gtbercea added inline comments.
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

When you have multiple device on the same system you have to be able to distinguish between then and manage dependencies across these different devices. Knowing the device is a crucial first step in handling these inter-device dependencies.

ABataev added inline comments.Jun 14 2019, 8:06 AM
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

Ok, this is why we need it for async version. But why we can't use for sync version?

gtbercea marked an inline comment as done.Jun 14 2019, 8:50 AM
gtbercea added inline comments.
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

Based on how the runtime is currently structured, nowait functions are separate from the synchronous ones. See target vs. target_nowait functions and there are many examples like that. The async support is always kept separate from the synchronous support at least in terms of entry points in the runtime.

lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

Right now, target_task indicates two things: 1) there are depenences to offload and 2) it's a nowait constructs. If we use target_task_alloc for both situations, we need to distinguish these two cases using flags, because the handling iseventually different.
If we want to explore this avenue, then we have to look into using the proxy flag is set to indicates nowait. It may have implications for the tasks on the KMP side, so this might be more risky.
This is why I suggest at the present time that we only generate target task for the limited "target task depend nowait"

gtbercea marked an inline comment as done.Jun 14 2019, 12:12 PM
gtbercea added inline comments.
lib/CodeGen/CGOpenMPRuntime.cpp
5122 ↗(On Diff #204320)

Thanks a lot for the comment @AlexEichenberger

ABataev accepted this revision.Jun 14 2019, 12:51 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2019, 1:16 PM

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

If I had to guess this is because some directives have the device clause, so the tests should check that the generated code passes the correct argument in that case.

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

If I had to guess this is because some directives have the device clause, so the tests should check that the generated code passes the correct argument in that case.

+1. Missed the test checks, tests must be fixed

ABataev reopened this revision.Jun 15 2019, 1:24 AM
This revision is now accepted and ready to land.Jun 15 2019, 1:24 AM
ABataev requested changes to this revision.Jun 15 2019, 1:25 AM

The tests must check the device ID for target-based calls of the task alloc function.

This revision now requires changes to proceed.Jun 15 2019, 1:25 AM

The tests must check the device ID for target-based calls of the task alloc function.

Since device ID is now an expression I can only do this:

i64 {{.*}})

Is this what you wanted? @ABataev @Hahnfeld ?

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

When I was just passing the default value the LLVM-IR was: i64 -1 i.e. constant, easy to check.

With the latest change the emitted code is: i64 %123 i.e. where %123 is a local derived from the expression of the device ID.

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

When I was just passing the default value the LLVM-IR was: i64 -1 i.e. constant, easy to check.

With the latest change the emitted code is: i64 %123 i.e. where %123 is a local derived from the expression of the device ID.

If the value is constant, check for the constant. And at least several tests with the expressions should check for the correct value of the expression.

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

When I was just passing the default value the LLVM-IR was: i64 -1 i.e. constant, easy to check.

With the latest change the emitted code is: i64 %123 i.e. where %123 is a local derived from the expression of the device ID.

If the value is constant, check for the constant.

All the tests here use expressions so the value is never constant. If you'd like me to add a test with constant I can.

And at least several tests with the expressions should check for the correct value of the expression.

I'll have to check how to do this. There's nothing that distinguishes an expression that represents the device ID from any other expression in the code.

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

When I was just passing the default value the LLVM-IR was: i64 -1 i.e. constant, easy to check.

With the latest change the emitted code is: i64 %123 i.e. where %123 is a local derived from the expression of the device ID.

If the value is constant, check for the constant.

All the tests here use expressions so the value is never constant. If you'd like me to add a test with constant I can.

Yes, there must be at least one test with the default value.

And at least several tests with the expressions should check for the correct value of the expression.

I'll have to check how to do this. There's nothing that distinguishes an expression that represents the device ID from any other expression in the code.

ABataev resigned from this revision.Jun 19 2019, 12:24 PM
ABataev removed a reviewer: ABataev.
This revision is now accepted and ready to land.Jun 19 2019, 12:24 PM
gtbercea closed this revision.Jun 19 2019, 12:29 PM