Currently if the variable is captured in captured region, capture record for this region stores reference to this variable for future use. But we don't need to provide the reference to the original variable if it was explicitly marked as private in the 'private' clause of the OpenMP construct, this variable is replaced by private copy.
We cannot eliminate corresponding FieldDecl for this variable from the capture record, because this FieldDecl may be required for some constructs like 'omp task'.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Also do not capture local variables in outer regions if it is marked as private in inner OpenMP region, capture it only in this OpenMP region.
Can you expand a bit on what the task directive needs the capture field for if it's never initialized? This feels like a very significant change, and I'd be a lot happier if the field could be dropped entirely.
At the very least, CapturedStmt need to document this.
lib/CodeGen/CGStmt.cpp | ||
---|---|---|
2120 ↗ | (On Diff #25287) | The more officially-approved way to do this would be to use "continue". Not a big deal, though. |
John, we need all this stuff because of asynchronous nature of tasks.
In other OpenMP directives we can directly allocate private copies and
use them instead of the original variables in the outlined OpenMP function:
void outlined(...) {
%g = alloca i32
store i32 0, %g
}
But for tasks we cannot allocate private copies on stack in the outlined
function. It happens because there are actually 2 kind of tasks - tied
tasks and untied tasks. Untied tasks may have many entry/exit points and
because of that we cant simply allocate private copies in the outlined
function.
Also we can't allocate memory for these privates in a calling function
because tasks are asynchronous and the parent function may be terminated
earlier than the task itself.
Instead we're adding private copies at the end of kmp_task_t structure
and the memory for them is allocated by the runtime. Also we're
generating a function with the destructors calls for each private field
(if it is required) and provide a pointer to this function to runtime:
- struct kmp_task_t_with_privates { struct kmp_task_t { void *shareds; /< pointer to block of pointers to shared vars */ kmp_routine_entry_t routine; /< pointer to outlined
function for tasks */
kmp_int32 part_id; /**< part id for the untied
task (defines current entry point) */
kmp_routine_entry_t destructors; /* pointer to function to
invoke deconstructors of private C++ objects */
} task_data; struct .kmp_private. { // private copies of all private/firstprivate
variables
int a; S s; } privates;
};
- Our outlined function for task has the next profile
void oulined(int ThreadID, int PartId, void *shareds) {
...
}
This function with this profile is automatically generated from
CapturedStmt construct.
- Then we have to create a task using
void *new_task = __kmpc_task_allocate(<loc>, i32 ThreadId, flags, size_t
sizeof(kmp_task_t_with_privates), size_t sizeof(shareds),
kmp_routine_entry_t outlined);
libcall and it returns a pointer to created kmp_task_t_with_privates
structure.
- We have to initialize our private copies:
(kmp_task_t_with_privates*)new_task->privates.a = original_a; if 'a'
is a firstprivate;
DefaultConstructor_for_S_class
(&(kmp_task_t_with_privates*)new_task->privates.s); if 's' is a
private and because of that it must be initialized with the default
constructor.
- We have to generate a function for destructors calls:
void destructors(kmp_task_t_with_privates *task) {
~Destructor_for_S_class(&task->privates.s);
}
...
(kmp_task_t_with_privates*)new_task->task_data.destructors =
(kmp_routine_entry_t)&destructors;
- You're saying that capture field is not initialized. It is not so.
We're using it to provide a reference to private copy to outlined task
function:
(shareds_type*)((kmp_task_t_with_privates*)new_task->task_data.shareds)->ref_a
&(kmp_task_t_with_privates*)new_task->privates.a;
(shareds_type*)((kmp_task_t_with_privates*)new_task->task_data.shareds)->ref_s
&(kmp_task_t_with_privates*)new_task->privates.s;
So we have to initialize these references manually rather than
automatically as for other constructs.
Shareds is the structure of captures generated by the CapturedStmt
construct.
- We're creating a task by calling
__kmpc_omp_task(<loc>, i32 ThreadId, new_task);
This is how tasks are implemented in libiomp5 runtime library.
19.05.2015 1:56, John McCall пишет:
Can you expand a bit on what the task directive needs the capture field for if it's never initialized? This feels like a very significant change, and I'd be a lot happier if the field could be dropped entirely.
At the very least, CapturedStmt need to document this.
Comment at: lib/CodeGen/CGStmt.cpp:2120
@@ +2119,3 @@
+ // Do not emit initialization for OpenMP private variables.
+ if (CurField->hasCapturedVLAType() || *I) {+ LValue LV = EmitLValueForFieldInitialization(SlotLV, *CurField);
The more officially-approved way to do this would be to use "continue". Not a big deal, though.
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
Okay, thank you. What you're saying is that the shareds structure contains a pointer to every captured variable, and those pointers are normally initialized by taking the address of the original variable, but since private captures are allocated separately their address must be initialized separately. That all makes sense to me.
What I'm not clear on is why the shareds structure needs to contain pointers to the private captures at all. It looks to me like that's purely an artifact of our CapturedStmt code generation, where it only expects to receive a pointer to the shareds structure. The actual task function passed to kmp_omp_task_alloc takes a pointer to the task structure, which is where the private variables are actually allocated, and we currently just make that a proxy function that drills down to the shareds structure so that we can satisfy CapturedStmt.
It seems to me that, if we can recognize that private captures need different initialization, we should also be able to recognize that they don't need an actual field in the shareds structure, and code-generation can just directly use their address in the task. Does that not work? Is the current formulation necessary for consistency in some way?
John, this patch tries to avoid generation of pointers to private
captures. But for tasks we need them.
CapturedStmt is built in Sema, while kmp_task_t (used in proxy function
as argument) is generated only in codegen. That's the main issue.
Currently I don't see a solution how to make implicitly outlined
function, built from CapturedStmt, and that does not know anything about
this kmp_task_t type, to use this kmp_task_t. Currently the only way is
to use this shareds structure with pointers manually remapped to private
copies.
But I'll try to investigate this situation a little bit deeper, maybe
I'll find some better solution.
Best regards,
Alexey Bataev
Software Engineer
Intel Compiler Team
19.05.2015 21:21, John McCall пишет:
Okay, thank you. What you're saying is that the shareds structure contains a pointer to every captured variable, and those pointers are normally initialized by taking the address of the original variable, but since private captures are allocated separately their address must be initialized separately. That all makes sense to me.
What I'm not clear on is why the shareds structure needs to contain pointers to the private captures at all. It looks to me like that's purely an artifact of our CapturedStmt code generation, where it only expects to receive a pointer to the shareds structure. The actual task function passed to kmp_omp_task_alloc takes a pointer to the task structure, which is where the private variables are actually allocated, and we currently just make that a proxy function that drills down to the shareds structure so that we can satisfy CapturedStmt.
It seems to me that, if we can recognize that private captures need different initialization, we should also be able to recognize that they don't need an actual field in the shareds structure, and code-generation can just directly use their address in the task. Does that not work? Is the current formulation necessary for consistency in some way?
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
Sema doesn't know much about the "shareds" structure, either. To be sure, it builds it as part of building the CapturedStmt, but it's not like Sema creates MemberRefExprs that point into it whenever you refer to a captured variable. It's not even like lambdas, where that struct type is visible in the enclosing AST. It's purely an implementation detail.
It should be possible to have EmitDeclRefLValue just delegate to the current CapturedStmtInfo, and then the OpenMP implementations can map privates however they're implemented.
Yeah, I thought about it already. Hope to prepare a fix for it today
Best regards,
Alexey Bataev
Software Engineer
Intel Compiler Team
20.05.2015 20:03, John McCall пишет:
Sema doesn't know much about the "shareds" structure, either. To be sure, it builds it as part of building the CapturedStmt, but it's not like Sema creates MemberRefExprs that point into it whenever you refer to a captured variable. It's not even like lambdas, where that struct type is visible in the enclosing AST. It's purely an implementation detail.
It should be possible to have EmitDeclRefLValue just delegate to the current CapturedStmtInfo, and then the OpenMP implementations can map privates however they're implemented.
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
Updated after changes in codegen for task directive. Currently we don't need to emit references to private captures at all.