Page MenuHomePhabricator

[OpenMP] Permit map with DSA on combined directive
ClosedPublic

Authored by jdenny on Aug 6 2019, 4:43 PM.

Details

Summary

For map, the following restriction changed in OpenMP 5.0:

  • OpenMP 4.5 [2.15.5.1, Restrictions]: "A list item cannot appear in both a map clause and a data-sharing attribute clause on the same construct.
  • OpenMP 5.0 [2.19.7.1, Restrictions]: "A list item cannot appear in both a map clause and a data-sharing attribute clause on the same construct unless the construct is a combined construct."

This patch removes this restriction in the case of combined constructs
and OpenMP 5.0, and it updates Sema not to capture a scalar by copy in
the target region when firstprivate and map appear for that scalar
on a combined target construct.

This patch also adds a fixme to a test that now reveals that a
diagnostic about loop iteration variables is dropped in the case of
OpenMP 5.0. That bug exists regardless of this patch's changes.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jdenny added a comment.Aug 9 2019, 3:43 PM

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Clang currently doesn't permit is_device_ptr with firstprivate either, but I'm not aware of any reason to diagnose that. Is there? And then there are privatization clauses like linear and reduction....

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Clang currently doesn't permit is_device_ptr with firstprivate either, but I'm not aware of any reason to diagnose that. Is there? And then there are privatization clauses like linear and reduction....

I assume, we should allow all allowed datasharing clauses along with is_device_ptr clause, including shared, linear, etc.

jdenny added a comment.Aug 9 2019, 3:53 PM

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Clang currently doesn't permit is_device_ptr with firstprivate either, but I'm not aware of any reason to diagnose that. Is there? And then there are privatization clauses like linear and reduction....

I assume, we should allow all allowed datasharing clauses along with is_device_ptr clause, including shared, linear, etc.

Agreed. I was just wondering if anyone saw an obvious issue that might deserve a warning (as in the private plus is_device_ptr case). I'm not planning to implement warnings right now. I'm just curious for later.

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Upon further reflection, this is not clearly allowed by the standard. My experience is that, when reading standards, sometimes things are disallowed by contradiction (i.e., the standard does not define some behavior, and what the standard does say that's relevant is self contradictory). In this case, 2.19.3 says that list items which are privatized (and which are used) undergo replacement (with new items created as specified) while 2.12.5 says that "The is_device_ptr clause is used to indicate that a list item is a device pointer already in the device data environment and that it should be used directly." A given list item cannot simultaneously be "used directly" (2.12.5) and also undergo replacement: "Inside the construct, all references to the original list item are replaced by references to a new list item received by the task or SIMD lane" (2.19.3). Thus, it may be disallowed.

kkwli0 added a comment.Aug 9 2019, 4:09 PM

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Upon further reflection, this is not clearly allowed by the standard. My experience is that, when reading standards, sometimes things are disallowed by contradiction (i.e., the standard does not define some behavior, and what the standard does say that's relevant is self contradictory). In this case, 2.19.3 says that list items which are privatized (and which are used) undergo replacement (with new items created as specified) while 2.12.5 says that "The is_device_ptr clause is used to indicate that a list item is a device pointer already in the device data environment and that it should be used directly." A given list item cannot simultaneously be "used directly" (2.12.5) and also undergo replacement: "Inside the construct, all references to the original list item are replaced by references to a new list item received by the task or SIMD lane" (2.19.3). Thus, it may be disallowed.

That is what I thought. Specifying these two clauses on the target construct creates ambiguity on which p it referred to inside the construct. The private p or the pointer p that stores the device address? I will work with the committee to fix the spec.

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Upon further reflection, this is not clearly allowed by the standard. My experience is that, when reading standards, sometimes things are disallowed by contradiction (i.e., the standard does not define some behavior, and what the standard does say that's relevant is self contradictory). In this case, 2.19.3 says that list items which are privatized (and which are used) undergo replacement (with new items created as specified) while 2.12.5 says that "The is_device_ptr clause is used to indicate that a list item is a device pointer already in the device data environment and that it should be used directly." A given list item cannot simultaneously be "used directly" (2.12.5) and also undergo replacement: "Inside the construct, all references to the original list item are replaced by references to a new list item received by the task or SIMD lane" (2.19.3). Thus, it may be disallowed.

I think, this combination is still allowed. I assume that

#Pragma omp target parallel is_device_ptr(a) <dsa_clause>(a)

is the same as

#pragma omp target is_device_ptr(a)
#pragma omp parallel <dsa_clause>(a)

i.e. datasharing clauses are applied to inner sub-regions, not the target region itself.

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Upon further reflection, this is not clearly allowed by the standard. My experience is that, when reading standards, sometimes things are disallowed by contradiction (i.e., the standard does not define some behavior, and what the standard does say that's relevant is self contradictory). In this case, 2.19.3 says that list items which are privatized (and which are used) undergo replacement (with new items created as specified) while 2.12.5 says that "The is_device_ptr clause is used to indicate that a list item is a device pointer already in the device data environment and that it should be used directly." A given list item cannot simultaneously be "used directly" (2.12.5) and also undergo replacement: "Inside the construct, all references to the original list item are replaced by references to a new list item received by the task or SIMD lane" (2.19.3). Thus, it may be disallowed.

I think, this combination is still allowed. I assume that

#Pragma omp target parallel is_device_ptr(a) <dsa_clause>(a)

is the same as

#pragma omp target is_device_ptr(a)
#pragma omp parallel <dsa_clause>(a)

i.e. datasharing clauses are applied to inner sub-regions, not the target region itself.

With the parallel, that makes sense to me. In that case, however, I'd imagine that the privitization works as normal and the code wouldn't crash. Right?

I want to be sure we're on the same page: For OpenMP 5.0, should we allow is_device_ptr with the private clauses?

Yes, since it is allowed by the standard.

Umm ... I probably missed some earlier discussions! What would be the behavior of the following code?

p = omp_target_alloc(...);
#pragma omp target private(p) is_device_ptr(p)
  p[...] = ...;   // crash or not?

It must crush, I assume. The main problem is that this construct is allowed by the standard.

Yep. We should add a warning message for it.

Upon further reflection, this is not clearly allowed by the standard. My experience is that, when reading standards, sometimes things are disallowed by contradiction (i.e., the standard does not define some behavior, and what the standard does say that's relevant is self contradictory). In this case, 2.19.3 says that list items which are privatized (and which are used) undergo replacement (with new items created as specified) while 2.12.5 says that "The is_device_ptr clause is used to indicate that a list item is a device pointer already in the device data environment and that it should be used directly." A given list item cannot simultaneously be "used directly" (2.12.5) and also undergo replacement: "Inside the construct, all references to the original list item are replaced by references to a new list item received by the task or SIMD lane" (2.19.3). Thus, it may be disallowed.

I think, this combination is still allowed. I assume that

#Pragma omp target parallel is_device_ptr(a) <dsa_clause>(a)

is the same as

#pragma omp target is_device_ptr(a)
#pragma omp parallel <dsa_clause>(a)

i.e. datasharing clauses are applied to inner sub-regions, not the target region itself.

With the parallel, that makes sense to me. In that case, however, I'd imagine that the privitization works as normal and the code wouldn't crash. Right?

It leads to the use of uninitialized memory. We create private non-initialized copy of the pointer, not the copy of the data it addresses. It shall work with firstprivate clause.

jdenny updated this revision to Diff 214542.Aug 10 2019, 3:07 PM
jdenny retitled this revision from [OpenMP] Fix map/is_device_ptr with DSA on combined directive to [OpenMP] Permit map with DSA on combined directive.
jdenny edited the summary of this revision. (Show Details)

I made the following changes, as suggested:

  • Add back restriction for OpenMP < 5.0.
  • Remove is_device_ptr changes.

Alexey, you said:

Plus, these changes are not enough to support this new feature from OpenMP 5.0. There definitely must be some changes in the codegen. If the variable is mapped in the target construct, we should not generate a code for the private clause of this variable on the target construct, since, in this case, private clauses are applied for the inner subdirectives, which are the part of the combined directive, but not the target part of the directive.

I haven't spent enough time exploring codegen here in the past, so I'm starting out by investigating the codegen output for various examples. I'm looking for differences between a combined target teams and an equivalent separate target and teams. So far, I see nothing but superficial differences. Do you have an example where there's an undesirable difference?

Thanks.

jdenny edited the summary of this revision. (Show Details)Aug 10 2019, 3:15 PM

I made the following changes, as suggested:

  • Add back restriction for OpenMP < 5.0.
  • Remove is_device_ptr changes.

    Alexey, you said:

Plus, these changes are not enough to support this new feature from OpenMP 5.0. There definitely must be some changes in the codegen. If the variable is mapped in the target construct, we should not generate a code for the private clause of this variable on the target construct, since, in this case, private clauses are applied for the inner subdirectives, which are the part of the combined directive, but not the target part of the directive.

I haven't spent enough time exploring codegen here in the past, so I'm starting out by investigating the codegen output for various examples. I'm looking for differences between a combined target teams and an equivalent separate target and teams. So far, I see nothing but superficial differences. Do you have an example where there's an undesirable difference?

Thanks.

Try something like target parallel firstprivate (a) map(a). Currently it will create a firstprivate copy of the variable a in target context thought it is not required at all. It may lead to increased register pressure and performance degradation.

jdenny updated this revision to Diff 214559.Aug 11 2019, 8:44 AM
jdenny marked 3 inline comments as done.
jdenny edited the summary of this revision. (Show Details)
jdenny set the repository for this revision to rG LLVM Github Monorepo.

Try something like target parallel firstprivate (a) map(a). Currently it will create a firstprivate copy of the variable a in target context thought it is not required at all. It may lead to increased register pressure and performance degradation.

Thanks. The only combination that appears to be affected is firstprivate and map for scalar types. I had only tried arrays and structs earlier, but they're not affected. If I had looked a little more closely at the test case this patch already introduced, I would have noticed that int is affected. The problematic analysis is in sema, where there was an apparent assumption that firstprivate wouldn't appear with map due to the previous restriction. This update fixes that.

For my previous update, I meant to point out that I introduced a fixme into the test suite. See the phabricator summary for details.

Try something like target parallel firstprivate (a) map(a). Currently it will create a firstprivate copy of the variable a in target context thought it is not required at all. It may lead to increased register pressure and performance degradation.

Thanks. The only combination that appears to be affected is firstprivate and map for scalar types. I had only tried arrays and structs earlier, but they're not affected. If I had looked a little more closely at the test case this patch already introduced, I would have noticed that int is affected. The problematic analysis is in sema, where there was an apparent assumption that firstprivate wouldn't appear with map due to the previous restriction. This update fixes that.

For my previous update, I meant to point out that I introduced a fixme into the test suite. See the phabricator summary for details.

This is wrong. It affects all possible combinations and not only fof scalar types, all of them are affected.

This is wrong. It affects all possible combinations and not only fof scalar types, all of them are affected.

Are you saying the patch isn't sufficient because other types need to be fixed too? Can you give an example?

This is wrong. It affects all possible combinations and not only fof scalar types, all of them are affected.

Are you saying the patch isn't sufficient because other types need to be fixed too? Can you give an example?

Target teams private map will produce extra private in target context, other constructs either.

Target teams private map will produce extra private in target context, other constructs either.

Here's what I tried:

int a;
#pragma omp target teams private(a) map(a)
  ;

The same code is generated as for:

int a;
#pragma omp target map(a)
#pragma omp teams private(a)
  ;

I haven't debugged it yet, but it seems orthogonal to whether you have a combined directive, which is what this patch is about.

Target teams private map will produce extra private in target context, other constructs either.

Here's what I tried:

int a;
#pragma omp target teams private(a) map(a)
  ;

The same code is generated as for:

int a;
#pragma omp target map(a)
#pragma omp teams private(a)
  ;

I haven't debugged it yet, but it seems orthogonal to whether you have a combined directive, which is what this patch is about.

Did you check the mapping flags, generated during host codegen? They must be the same. With private clause it may generate just map(alloc) instead of map(tofrom).

Target teams private map will produce extra private in target context, other constructs either.

Here's what I tried:

int a;
#pragma omp target teams private(a) map(a)
  ;

The same code is generated as for:

int a;
#pragma omp target map(a)
#pragma omp teams private(a)
  ;

I haven't debugged it yet, but it seems orthogonal to whether you have a combined directive, which is what this patch is about.

Did you check the mapping flags, generated during host codegen? They must be the same. With private clause it may generate just map(alloc) instead of map(tofrom).

I diffed the .ll files for combined vs. separate constructs. The only difference is the file ID. @.offload_maptypes isn't generated in either (but it is if I replace private with firstprivate).

Target teams private map will produce extra private in target context, other constructs either.

Here's what I tried:

int a;
#pragma omp target teams private(a) map(a)
  ;

The same code is generated as for:

int a;
#pragma omp target map(a)
#pragma omp teams private(a)
  ;

I haven't debugged it yet, but it seems orthogonal to whether you have a combined directive, which is what this patch is about.

Did you check the mapping flags, generated during host codegen? They must be the same. With private clause it may generate just map(alloc) instead of map(tofrom).

I diffed the .ll files for combined vs. separate constructs. The only difference is the file ID. @.offload_maptypes isn't generated in either (but it is if I replace private with firstprivate).

Maptypes array must be generated in all cases, check the host codegen.
Also, test codegen with the different kinds of maptypes, not only to from, but also alloc, to, from, etc. Yoh will see the difference in many cases and 2ith many kinds of types, not only scalars.

Target teams private map will produce extra private in target context, other constructs either.

Here's what I tried:

int a;
#pragma omp target teams private(a) map(a)
  ;

The same code is generated as for:

int a;
#pragma omp target map(a)
#pragma omp teams private(a)
  ;

I haven't debugged it yet, but it seems orthogonal to whether you have a combined directive, which is what this patch is about.

Did you check the mapping flags, generated during host codegen? They must be the same. With private clause it may generate just map(alloc) instead of map(tofrom).

I diffed the .ll files for combined vs. separate constructs. The only difference is the file ID. @.offload_maptypes isn't generated in either (but it is if I replace private with firstprivate).

Maptypes array must be generated in all cases, check the host codegen.
Also, test codegen with the different kinds of maptypes, not only to from, but also alloc, to, from, etc. Yoh will see the difference in many cases and 2ith many kinds of types, not only scalars.

I'll work on another patch for the bugs where directives are separate, and I'll return to this patch if more fixes are then needed to support combined directives consistently. Thanks.

jdenny updated this revision to Diff 215298.Aug 14 2019, 5:58 PM

Rebase. Add more tests, as requested at D66247#1630441.

Why there are the changes from the another patch?

I think you're referring to stuff like OpenMPCaptureLevel in ScopeInfo.h. I wrote those changes for this patch first, as can be seen in phabricator history. I needed them for the other patch too, and I thought we were going to end up applying that patch first, so I was going to strip them out here. Of course, as I understand it, the plan now is to apply only this patch.

I think you're referring to stuff like OpenMPCaptureLevel in ScopeInfo.h. I wrote those changes for this patch first, as can be seen in phabricator history. I needed them for the other patch too, and I thought we were going to end up applying that patch first, so I was going to strip them out here. Of course, as I understand it, the plan now is to apply only this patch.

These changes required for this functionality, please remove.

jdenny updated this revision to Diff 216420.Aug 21 2019, 9:19 AM
jdenny edited the summary of this revision. (Show Details)
jdenny set the repository for this revision to rG LLVM Github Monorepo.

As requested, backed out the changes related to firstprivate scalars, and updated tests.

This revision is now accepted and ready to land.Aug 21 2019, 9:38 AM

I want to be sure we're on the same page. Due to the changes I just backed out, the following two examples now generate different code:

int a = 0;
#pragma omp target map(a)
#pragma omp teams firstprivate(a)
  ;
int a = 0;
#pragma omp target teams firstprivate(a) map(a)
  ;

The difference is whether a is passed by reference (the first case) or value (the second case) to the offloading function.

Is that fine for you?

I want to be sure we're on the same page. Due to the changes I just backed out, the following two examples now generate different code:

int a = 0;
#pragma omp target map(a)
#pragma omp teams firstprivate(a)
  ;
int a = 0;
#pragma omp target teams firstprivate(a) map(a)
  ;

The difference is whether a is passed by reference (the first case) or value (the second case) to the offloading function.

Is that fine for you?

No, this is what I warned about. We shall have the same codegen just like in the first case, the value must be passed by reference and mapped as tofrom.

ABataev requested changes to this revision.Aug 21 2019, 9:56 AM

Need to fix the mapping

This revision now requires changes to proceed.Aug 21 2019, 9:56 AM

I want to be sure we're on the same page. Due to the changes I just backed out, the following two examples now generate different code:

int a = 0;
#pragma omp target map(a)
#pragma omp teams firstprivate(a)
  ;
int a = 0;
#pragma omp target teams firstprivate(a) map(a)
  ;

The difference is whether a is passed by reference (the first case) or value (the second case) to the offloading function.

Is that fine for you?

No, this is what I warned about. We shall have the same codegen just like in the first case, the value must be passed by reference and mapped as tofrom.

If I add back those changes I just backed out, we get the same codegen. Is that what you want?

I want to be sure we're on the same page. Due to the changes I just backed out, the following two examples now generate different code:

int a = 0;
#pragma omp target map(a)
#pragma omp teams firstprivate(a)
  ;
int a = 0;
#pragma omp target teams firstprivate(a) map(a)
  ;

The difference is whether a is passed by reference (the first case) or value (the second case) to the offloading function.

Is that fine for you?

No, this is what I warned about. We shall have the same codegen just like in the first case, the value must be passed by reference and mapped as tofrom.

If I add back those changes I just backed out, we get the same codegen. Is that what you want?

Those 2 cases must result in the same codegen. But I rather doubt we need your previous changes. Check Sema::isOpenMPCapturedByRef instead, required functionality must be handled in this function.

I want to be sure we're on the same page. Due to the changes I just backed out, the following two examples now generate different code:

int a = 0;
#pragma omp target map(a)
#pragma omp teams firstprivate(a)
  ;
int a = 0;
#pragma omp target teams firstprivate(a) map(a)
  ;

The difference is whether a is passed by reference (the first case) or value (the second case) to the offloading function.

Is that fine for you?

No, this is what I warned about. We shall have the same codegen just like in the first case, the value must be passed by reference and mapped as tofrom.

If I add back those changes I just backed out, we get the same codegen. Is that what you want?

Those 2 cases must result in the same codegen. But I rather doubt we need your previous changes. Check Sema::isOpenMPCapturedByRef instead, required functionality must be handled in this function.

That's the focus of my previous changes. The rest just supports the changes there.

I want to be sure we're on the same page. Due to the changes I just backed out, the following two examples now generate different code:

int a = 0;
#pragma omp target map(a)
#pragma omp teams firstprivate(a)
  ;
int a = 0;
#pragma omp target teams firstprivate(a) map(a)
  ;

The difference is whether a is passed by reference (the first case) or value (the second case) to the offloading function.

Is that fine for you?

No, this is what I warned about. We shall have the same codegen just like in the first case, the value must be passed by reference and mapped as tofrom.

If I add back those changes I just backed out, we get the same codegen. Is that what you want?

Those 2 cases must result in the same codegen. But I rather doubt we need your previous changes. Check Sema::isOpenMPCapturedByRef instead, required functionality must be handled in this function.

That's the focus of my previous changes. The rest just supports the changes there.

We don't need this new level counter to correctly handle this situation. Just check for the combined target directive in Sema::isOpenMPCapturedByRef and return true if it is mapped as to from or just from. The change must be very simple.

We don't need this new level counter to correctly handle this situation. Just check for the combined target directive in Sema::isOpenMPCapturedByRef and return true if it is mapped as to from or just from. The change must be very simple.

I don't see a way to do that without also affecting capturing on the teams region in the case of a combined target teams construct.

If it's simple, it might save time for both of us if you post the change you're thinking of here.

We don't need this new level counter to correctly handle this situation. Just check for the combined target directive in Sema::isOpenMPCapturedByRef and return true if it is mapped as to from or just from. The change must be very simple.

I don't see a way to do that without also affecting capturing on the teams region in the case of a combined target teams construct.

If it's simple, it might save time for both of us if you post the change you're thinking of here.

Chwck 2 ifs: if (IsVariableUsedInMapClause) and the second if (IsByRef && Ty.getNonReferenceType()->isScalarType()). If the variable is mapped, IsByRef is set to true, but later it may be changed in the second if, when we check for the use in the firstprivate clause. We need to add a check, that the IsByRef must be set to false onlly if IsVariableUsedInMapClause is set to false too for combined target constructs.

Chwck 2 ifs: if (IsVariableUsedInMapClause) and the second if (IsByRef && Ty.getNonReferenceType()->isScalarType()). If the variable is mapped, IsByRef is set to true, but later it may be changed in the second if, when we check for the use in the firstprivate clause. We need to add a check, that the IsByRef must be set to false onlly if IsVariableUsedInMapClause is set to false too for combined target constructs.

I've tried that. It results in capturing by reference on both the target region and the teams region in the case of a combined target teams construct. Thus, we end up with different codegen than when the directives are separate, where capturing by reference is set only on the target region.

The changes I just backed out handle this correctly because they distinguish between the target and teams regions.

Chwck 2 ifs: if (IsVariableUsedInMapClause) and the second if (IsByRef && Ty.getNonReferenceType()->isScalarType()). If the variable is mapped, IsByRef is set to true, but later it may be changed in the second if, when we check for the use in the firstprivate clause. We need to add a check, that the IsByRef must be set to false onlly if IsVariableUsedInMapClause is set to false too for combined target constructs.

I've tried that. It results in capturing by reference on both the target region and the teams region in the case of a combined target teams construct. Thus, we end up with different codegen than when the directives are separate, where capturing by reference is set only on the target region.

The changes I just backed out handle this correctly because they distinguish between the target and teams regions.

Ahh, I see, in this case, the codegen for the inner subregion is changed. Ok, I see now. Then yes, we need this change with subregion counter.
In this case, I would recommend trying to fix https://bugs.llvm.org/show_bug.cgi?id=40253 at first, as it relates to the same problem, I think. Better to fix the problem separately and then extend it for new feature rather than add 2 new features in one patch.

jdenny updated this revision to Diff 216444.Aug 21 2019, 11:55 AM
jdenny edited the summary of this revision. (Show Details)

Restore previous version of patch, and rebase.

I tried, and this patch is not sufficient to fix PR40253. If there are indeed common changes, it seems it's just as well to put them here.

ABataev added inline comments.Aug 21 2019, 12:39 PM
clang/include/clang/Sema/ScopeInfo.h
763 ↗(On Diff #216444)

I would add a parameter for OpenMPCaptureLevel rather than use default value.

clang/include/clang/Sema/Sema.h
9026 ↗(On Diff #216444)

Do not use default value here, just set it to 0 in the call in SemaOpenMP.cpp. We use it there tat target level always and thus CaptureLevel is always 0 there.

clang/lib/Sema/Sema.cpp
2108 ↗(On Diff #216444)

I would suggest adding a new parameter for CapturedLevel and the same parameter to ActOnCapturedRegionStart function. And pass the value for CapturedLevel explicitly at ActOnCapturedRegionStart function call in SemaOpenMP.cpp rather than rely on this solution.
Also, for each separate OpenMP directive, this counter must start from 0. Currently, I think, it may lead to crash if we have several directives. Something like:

#pragma omp parallel
#pragma omp target teams
...

I think it will work incorrectly for this example.

2109 ↗(On Diff #216444)

CapturedRegionScopeInfo *->auto *

jdenny marked an inline comment as done.Aug 21 2019, 1:08 PM
jdenny added inline comments.
clang/lib/Sema/Sema.cpp
2108 ↗(On Diff #216444)

I think the check that OpenMPLevel is the same prevents the problem you mention.

ABataev added inline comments.Aug 21 2019, 1:11 PM
clang/lib/Sema/Sema.cpp
2108 ↗(On Diff #216444)

Ah, yes, missed this. Anyway, better to specify this explicitly as function parameters, I think, rather than trying to evaluate it here this way. We eventually may change the handling of captured regions in OpenMP, better to have everything in SemaOpenMP.cpp

ABataev added inline comments.Aug 21 2019, 1:13 PM
clang/lib/Sema/Sema.cpp
2108 ↗(On Diff #216444)

BTW, here you can use the parameter with the default value 0.

jdenny marked an inline comment as done.Aug 21 2019, 1:17 PM
jdenny added inline comments.
clang/lib/Sema/Sema.cpp
2108 ↗(On Diff #216444)

OK.

CaptureLevel or CapturedLevel? Do you care?

ABataev added inline comments.Aug 21 2019, 1:23 PM
clang/lib/Sema/Sema.cpp
2108 ↗(On Diff #216444)

Up to you

jdenny updated this revision to Diff 216469.Aug 21 2019, 1:50 PM
jdenny marked 7 inline comments as done.
jdenny set the repository for this revision to rG LLVM Github Monorepo.

Make suggested changes for passing around the capture level.

ABataev added inline comments.Aug 21 2019, 2:29 PM
clang/include/clang/Sema/Sema.h
1419 ↗(On Diff #216469)

Better to use OpenMPCaptureLevel since this param is OpenMP specific.

1419 ↗(On Diff #216469)

Add default value for the parameter here too.

3977 ↗(On Diff #216469)

Same, add prefix OpenMP

9027 ↗(On Diff #216469)

Same, OpenMPCaptureLevel here and in other places.

clang/lib/Sema/SemaExpr.cpp
17748 ↗(On Diff #216469)

Restore original code here

clang/lib/Sema/SemaOpenMP.cpp
1350 ↗(On Diff #216469)

Use /*param=*/ format, please.

2973 ↗(On Diff #216469)

Same for param comment here and other places.

clang/lib/Sema/SemaStmt.cpp
4324 ↗(On Diff #216469)

Use default value of the parameter, no need to pass 0 here

jdenny updated this revision to Diff 216502.Aug 21 2019, 3:24 PM
jdenny marked 8 inline comments as done.

Make suggested changes to default arguments, comments on literals, and parameter names.

jdenny added inline comments.Aug 21 2019, 3:27 PM
clang/lib/Sema/SemaExpr.cpp
17748 ↗(On Diff #216469)

OK, I did. What's the reason for no newline at the end of the file?

ABataev added inline comments.Aug 21 2019, 3:34 PM
clang/lib/Sema/SemaExpr.cpp
17748 ↗(On Diff #216469)

Changes not related to the patch, like formatting etc., better to commit in separate patches.

ABataev accepted this revision.Aug 21 2019, 3:38 PM

LG

clang/lib/Sema/SemaExpr.cpp
17749 ↗(On Diff #216502)

Still marked as changed code, better to restore it completely.

This revision is now accepted and ready to land.Aug 21 2019, 3:38 PM
jdenny marked 4 inline comments as done.Aug 21 2019, 3:41 PM

Thanks.

clang/lib/Sema/SemaExpr.cpp
17748 ↗(On Diff #216469)

Ah.

17749 ↗(On Diff #216502)

There's no change here. I checked the diff I uploaded to be sure.

This revision was automatically updated to reflect the committed changes.
jdenny marked 2 inline comments as done.
Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2019, 8:33 PM