This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][Archer] Avoid false positive for OpenMP tasking
Needs Review · Public

Authored by protze.joachim on Aug 17 2023, 3:47 AM.

Details

Summary

OpenMP tasks can be understood as asynchronous function calls where shared variables are passed by reference and firstprivate variables are passed by value. The OpenMP runtime needs to store these arguments from the instantiation/creation to the asynchronous execution. libomp stores these values along with the internal task data structure that is allocated from the internal memory manager.
During task creation, compiler-generated code copies the values into the task object. At the beginning of task execution, compiler-generated code loads these values onto the stack. Because of the runtime's memory management, the task object is reused for a later task after the task has finished execution. OpenMP semantics do not synchronize the load during task execution with the store during creation of the next task that reuses the task object; only the memory manager provides this synchronization.
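
As a rough illustration of the pattern described above (not the test added by this patch), the following sketch creates many short tasks in sequence, each with one firstprivate and one shared variable. The function name `sum_with_tasks` is hypothetical. With libomp, the task objects of finished tasks can be handed out again for later tasks, which is the situation in which the load of `i` in one task and the store of `i` for a later task touch the same memory.

```cpp
// Sums 0..n-1 using one explicit task per iteration.
// Compiles with or without -fopenmp; without it the pragmas are ignored
// and the loop simply runs serially.
int sum_with_tasks(int n) {
  int sum = 0;
#pragma omp parallel
#pragma omp single
  {
    for (int i = 0; i < n; ++i) {
      // firstprivate(i): the value of i is copied into the task object
      // when the task is created and loaded again when the task starts.
      // shared(sum): only a reference to sum is stored in the task object.
#pragma omp task firstprivate(i) shared(sum)
      {
#pragma omp atomic
        sum += i;
      }
    }
    // Wait for all child tasks before leaving the single region.
#pragma omp taskwait
  }
  return sum;
}
```

The program itself is race-free: the only shared write is the atomic update of `sum`. Any report on the storage of `i` would therefore concern runtime-internal memory rather than user code.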

The newly added test demonstrates the false positive report with current LLVM.

This patch adds annotations for the new semantics of the task object. The patch depends on two other patches, one for the TSan runtime and one for the OpenMP runtime.

Event Timeline

protze.joachim created this revision. Aug 17 2023, 3:47 AM
Herald added a project: Restricted Project. Aug 17 2023, 3:47 AM
protze.joachim requested review of this revision. Aug 17 2023, 3:47 AM

Is the annotation in endTask because in startTask there may already be data filled in by the runtime?

openmp/tools/archer/ompt-tsan.cpp, line 485

Can you adjust the comment style to be consistent with the others?

Line 971

Do we need this extra check? I think PrivateDataSize is guaranteed 0 if Archer is unable to get this information, right?

Addressing review comments

protze.joachim marked 2 inline comments as done. Aug 28 2023, 2:30 AM

Between creation and execution of the task we have an explicit happens-before edge.
The false positive comes from the access to the private data during execution and the reuse of that memory for the next task. startTask is called before the execution of the task.
We need to store the memory information, because OMPT only lets us query the task memory in startTask. In endTask we would get the information for the task that reached the task scheduling point where our task of interest was scheduled.
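
To make the "query in startTask, use in endTask" idea concrete, here is a simplified, self-contained sketch. It is not the code from this patch: `MockNewMemory` stands in for the TSan new-memory annotation, `QueryFn` mirrors the shape of the OMPT task-memory query, and `FakeQuery`, `FakeTaskObject`, and the `TaskData` layout are assumptions made for illustration. It also shows only the endTask side, whereas the review discussion indicates that the patch annotates in startTask as well.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the TSan annotation: records ranges instead of calling TSan.
static std::vector<std::pair<void *, std::size_t>> AnnotatedRanges;
static void MockNewMemory(void *Addr, std::size_t Size) {
  AnnotatedRanges.emplace_back(Addr, Size);
}

// Same shape as the OMPT task-memory query: it reports a memory block of
// the task that is current at the time of the call.
using QueryFn = int (*)(void **Addr, std::size_t *Size, int Block);

// Hypothetical per-task bookkeeping kept by the tool.
struct TaskData {
  void *PrivateData = nullptr;
  std::size_t PrivateDataSize = 0; // stays 0 if the query yields nothing
};

// Fake runtime storage and query, used only to exercise the sketch.
static char FakeTaskObject[64];
static int FakeQuery(void **Addr, std::size_t *Size, int /*Block*/) {
  *Addr = FakeTaskObject;
  *Size = sizeof(FakeTaskObject);
  return 0; // no further blocks
}

// At task start the task of interest is the current task, so this is the
// point where its memory range can be queried and remembered.
void startTask(TaskData &T, QueryFn Query) {
  void *Addr = nullptr;
  std::size_t Size = 0;
  Query(&Addr, &Size, 0);
  if (Addr != nullptr && Size > 0) {
    T.PrivateData = Addr;
    T.PrivateDataSize = Size;
  }
}

// At task end a query would describe a different task, so the stored
// range is used to mark the memory as new before the runtime reuses it.
void endTask(TaskData &T) {
  if (T.PrivateDataSize > 0)
    MockNewMemory(T.PrivateData, T.PrivateDataSize);
}
```

Calling `startTask(T, FakeQuery)` followed by `endTask(T)` records exactly one range covering `FakeTaskObject`; a task for which the query returns nothing leaves `PrivateDataSize` at 0 and is skipped.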

Ok; I still don't understand though why Archer needs to call TsanNewMemory both in startTask and again for the last block in endTask. Does the LLVM OpenMP runtime guarantee that there is only one range, or that the last range is "special" in some way?