This is an archive of the discontinued LLVM Phabricator instance.

[OPENMP][NVPTX]Improve number of threads counter, NFC.
Abandoned · Public

Authored by ABataev on May 3 2019, 10:36 AM.

Details

Summary

Patch improves performance of the full runtime mode by moving the
number-of-threads counter to shared memory. It also saves
global memory.
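
For readers without access to the diff, the idea can be sketched roughly as follows. This is only an illustration of keeping a per-team counter in shared memory rather than global memory, not the code from this revision: the names `gThreadsInTeam`, `threadsInTeam`, `kMaxTeams`, `counterInGlobal` and `counterInShared` are hypothetical, and the real runtime data structures are not shown on this page.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical upper bound on the number of teams (blocks); an assumption
// for this sketch, not a value taken from the patch.
constexpr int kMaxTeams = 1024;

// Variant A: one counter slot per team, resident in global memory.
__device__ unsigned short gThreadsInTeam[kMaxTeams];

__global__ void counterInGlobal(unsigned short *out) {
  if (threadIdx.x == 0)
    gThreadsInTeam[blockIdx.x] = static_cast<unsigned short>(blockDim.x);
  __syncthreads(); // make the write visible to the rest of the block
  if (threadIdx.x == blockDim.x - 1)
    out[blockIdx.x] = gThreadsInTeam[blockIdx.x];
}

// Variant B: a single counter per block in shared memory. No global array
// is needed, and reads are served from on-chip memory.
__global__ void counterInShared(unsigned short *out) {
  __shared__ unsigned short threadsInTeam;
  if (threadIdx.x == 0)
    threadsInTeam = static_cast<unsigned short>(blockDim.x);
  __syncthreads(); // all threads of the block now see the counter
  if (threadIdx.x == blockDim.x - 1)
    out[blockIdx.x] = threadsInTeam;
}

int main() {
  const int teams = 4, threads = 128;
  unsigned short hOut[teams];
  unsigned short *dOut = nullptr;
  if (cudaMalloc(&dOut, sizeof(hOut)) != cudaSuccess) return 1;

  counterInShared<<<teams, threads>>>(dOut);
  // cudaMemcpy waits for the kernel to finish before copying.
  if (cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost) !=
      cudaSuccess)
    return 1;

  for (int i = 0; i < teams; ++i)
    std::printf("team %d: %d threads\n", i, hOut[i]);
  cudaFree(dOut);
  return 0;
}
```

Variant B drops the `kMaxTeams`-sized global array entirely, which is the kind of global memory saving the summary refers to. Whether the actual patch is structured this way is an assumption.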

Event Timeline

ABataev created this revision. May 3 2019, 10:36 AM
Herald added a project: Restricted Project. May 3 2019, 10:36 AM
grokos accepted this revision. May 3 2019, 12:40 PM

Looks good.

libomptarget/deviceRTLs/nvptx/src/libcall.cu, lines 33–36

Can you make this comment clearer? What is the first parallel region and what are the other parallel regions? I suppose you mean L1 parallel vs nested?

This revision is now accepted and ready to land. May 3 2019, 12:40 PM
ABataev updated this revision to Diff 198069. May 3 2019, 12:48 PM

Updated comment

ABataev abandoned this revision. May 24 2019, 11:09 AM

There was another patch instead of this one.