This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Unify omptarget API and usage wrt. `__tgt_async_info`
ClosedPublic

Authored by jdoerfert on Feb 9 2021, 3:51 PM.

Details

Summary

This patch unifies our libomptarget API in two ways:

  • always pass a __tgt_async_info object, the Queue member decides if it is in use or not.
  • (almost) always synchronize in the interface layer and not in the omptarget layer.

A side effect is that we now put all constructor and static initializer
kernels in a stream too, if the device utilizes __tgt_async_info.

The patch contains a TODO which can be addressed as we add support for
asynchronous malloc and free in the plugin API. This is the only
synchronizeAsyncInfo left in the omptarget layer.

Site note: On a V100 system the GridMini performance for small sizes
more than doubled.

Diff Detail

Event Timeline

jdoerfert created this revision.Feb 9 2021, 3:51 PM
jdoerfert requested review of this revision.Feb 9 2021, 3:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2021, 3:51 PM
Herald added a subscriber: sstefan1. · View Herald Transcript
jdoerfert updated this revision to Diff 322526.Feb 9 2021, 3:56 PM

Minor code movement

tianshilei1992 added inline comments.Feb 9 2021, 6:01 PM
openmp/libomptarget/src/interface.cpp
505–507

It's good to unify the usage of device id and device object. Probably better to separate it to another patch.

507

It's really "uncomfortable" to read the code mixed with different styles, like rc and AsyncInfo.

openmp/libomptarget/src/omptarget.cpp
1313

We might want a unified use of either device id or device object.

openmp/libomptarget/src/private.h
16

Why do we need this include?

I'll split the patch tomorrow and do some NFC pre commits without review to unify some style choices.

openmp/libomptarget/src/private.h
16

clangd automatically added it. sometimes that is a problem when I used something from the header but later removed the use. I'll get rid of the include if I can.

Interesting.

A big cleanup would be for the async object passed to the plugins to always be non-null. Add an API to construct an async object, let synchronise act as the destructor. Then every _async call is passed a valid, not null thing to work with, and we remove N points of if-passed-null-try-to-construct.

Async would then look more like start transaction, append entries, finalise transaction.

jdoerfert updated this revision to Diff 322722.Feb 10 2021, 9:55 AM

Rebase after splitting off formating patches and API adjustments

Interesting.

A big cleanup would be for the async object passed to the plugins to always be non-null. Add an API to construct an async object, let synchronise act as the destructor. Then every _async call is passed a valid, not null thing to work with, and we remove N points of if-passed-null-try-to-construct.

Async would then look more like start transaction, append entries, finalise transaction.

Yes, but layer by layer. For now, we make it a required object on the omptarget layer and then we go down.

Rebase w/ wrapper

tianshilei1992 added inline comments.Feb 10 2021, 2:22 PM
openmp/libomptarget/src/omptarget.cpp
647–648

I'm lost in the comment...We don't need the synchronization here if we have async data free.

jdoerfert added inline comments.Feb 10 2021, 3:05 PM
openmp/libomptarget/src/omptarget.cpp
647–648

Clang format mixed the TODO with the comment that was there.

// TODO: We should not synchronized here but pass the AsyncInfo object to the allocate/deallocate device APIs. 

// We need to synchronize before deallocating
// data. If AsyncInfo is nullptr, the previous data transfer (if has) will be
// synchronous, so we don't need to synchronize again. If AsyncInfo->Queue is
// nullptr, there is no data transfer happened because once there is,
// AsyncInfo->Queue will not be nullptr, so again, we don't need to
// synchronize.
jdoerfert updated this revision to Diff 322876.Feb 10 2021, 5:34 PM

Include functions that got lost in translation and improve comments

This revision is now accepted and ready to land.Feb 10 2021, 5:46 PM
grokos added a subscriber: grokos.Feb 11 2021, 7:56 AM
grokos added inline comments.
openmp/libomptarget/src/omptarget.cpp
648–650

synchronized --> synchronize

651–653

If AsyncInfo is nullptr, the previous data transfer (if has) will be synchronous, so we don't need to synchronize again.

I think this sentence is no longer relevant. After this patch, we will always have an AsyncInfo object here.

653–655

Also, the second part of this comment is confusing. From now on we always call AsyncInfo.synchronize(), so we cannot claim that:

so again, we don't need to synchronize

1317–1318

So now this TODO (and the whole else-branch if I understand correctly) is ready to be removed, right? Synchronization takes place at the outer level (in __tgt_target*).

I'll update comments and the leftover sync.

openmp/libomptarget/src/omptarget.cpp
1317–1318

Yes, all this rebasing. I removed it before.

jdoerfert marked 4 inline comments as done.

Addressed comments

This revision was landed with ongoing or failed builds.Feb 16 2021, 1:38 PM
This revision was automatically updated to reflect the committed changes.