This is an archive of the discontinued LLVM Phabricator instance.

Paths

flang/include/flang/Lower/AbstractConverter.h
flang/include/flang/Lower/OpenMP.h
flang/include/flang/Lower/PFTBuilder.h
flang/lib/Lower/Bridge.cpp
flang/lib/Lower/OpenMP.cpp
flang/lib/Lower/PFTBuilder.cpp
flang/test/Lower/OpenMP/Todo/omp-threadprivate.f90
flang/test/Lower/OpenMP/threadprivate-char-array-chararray.f90
flang/test/Lower/OpenMP/threadprivate-commonblock.f90
flang/test/Lower/OpenMP/threadprivate-integer-different-kinds.f90
flang/test/Lower/OpenMP/threadprivate-pointer-allocatable.f90
flang/test/Lower/OpenMP/threadprivate-real-logical-complex-derivedtype.f90
flang/test/Lower/OpenMP/threadprivate-use-association.f90

Differential D124226

[flang][OpenMP] Support lowering parse-tree to MLIR for threadprivate directive
ClosedPublic

Authored by peixin on Apr 21 2022, 7:22 PM.

Details

Reviewers

kiranchandramohan
kiranktp
clementval
jeanPerier
schweitz
ftynse
shraiysh
NimishMishra
rriddle
arnamoy10
sscalpone
jdoerfert

Commits

rG411bd2d40788: [flang][OpenMP] Support lowering parse-tree to MLIR for threadprivate directive

Summary

This patch supports lowering the parse tree to MLIR for the threadprivate
directive, following the OpenMP 5.1 standard [2.21.2]. Take the following
program as an example:

program m
  integer, save :: i
  !$omp threadprivate(i)
  call sub(i)
  !$omp parallel
    call sub(i)
  !$omp end parallel
end

This is lowered to the following MLIR:

func.func @_QQmain() {
  %0 = fir.address_of(@_QFEi) : !fir.ref<i32>
  %1 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
  fir.call @_QPsub(%1) : (!fir.ref<i32>) -> ()
  omp.parallel {
    %2 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
    fir.call @_QPsub(%2) : (!fir.ref<i32>) -> ()
    omp.terminator
  }
  return
}

A threadprivate operation (omp.threadprivate) is created for all
references to a threadprivate variable. The runtime returns either the
threadprivate variable itself (%1 above) or its per-thread copy (%2 above),
depending on whether the reference is outside or inside a parallel region.
For a threadprivate access outside a parallel region, the threadprivate
operation is created in instantiateVar. Inside a parallel region, it
is created in createBodyOfOp.
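
The "variable or its copy" behaviour described in the summary can be pictured with a toy model. The sketch below is neither the OpenMP runtime nor code from this patch: `ThreadprivateCache`, `lookup`, `threadprivateExample`, and the integer thread ids are hypothetical names chosen for illustration. It assumes that thread 0 stands for the initial thread, which keeps using the original variable, and that every other thread lazily receives its own copy.

```cpp
#include <cassert>
#include <map>

// Toy model only: hypothetical names, not the OpenMP runtime or this patch.
class ThreadprivateCache {
public:
  explicit ThreadprivateCache(int &original) : original_(original) {}

  // Return the storage that thread `tid` should use for the variable.
  int &lookup(int tid) {
    if (tid == 0)
      return original_; // assumption: the initial thread keeps the original
    // Every other thread gets its own copy, created on first use and
    // initialized from the current value of the original.
    auto it = copies_.find(tid);
    if (it == copies_.end())
      it = copies_.emplace(tid, original_).first;
    return it->second;
  }

private:
  int &original_;
  std::map<int, int> copies_;
};

// Usage: a write through thread 1's copy leaves the original untouched.
void threadprivateExample() {
  int i = 5;
  ThreadprivateCache cache(i);
  cache.lookup(1) = 7;
  assert(i == 5 && cache.lookup(1) == 7);
}
```

In this model, the two `omp.threadprivate` results in the example (%1 and %2) correspond to two `lookup` calls on the same variable made in different scopes.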

A new utility function, collectSymbolSet, is added to collect all the
variables with a given property within an evaluation, which may be a
Fortran, OpenMP, or OpenACC construct.
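
As a rough illustration of what such a collector does, here is a minimal sketch. It does not reproduce the signature or the data structures of collectSymbolSet in the patch: `Symbol`, `Evaluation`, `collectSymbols`, and `collectSymbolsExample` are hypothetical stand-ins, and the property is modelled as a plain predicate.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for a symbol and a (possibly nested) evaluation.
struct Symbol {
  std::string name;
  bool threadprivate = false;
};

struct Evaluation {
  std::vector<const Symbol *> symbols; // symbols referenced directly here
  std::vector<Evaluation> nested;      // nested statements or constructs
};

// Collect every symbol under `eval` that satisfies `hasProperty`.
// A set is used so that repeated references are recorded once.
void collectSymbols(const Evaluation &eval,
                    const std::function<bool(const Symbol &)> &hasProperty,
                    std::set<const Symbol *> &result) {
  for (const Symbol *sym : eval.symbols)
    if (hasProperty(*sym))
      result.insert(sym);
  for (const Evaluation &child : eval.nested)
    collectSymbols(child, hasProperty, result);
}

// Usage: gather the threadprivate symbols of a small construct.
void collectSymbolsExample() {
  Symbol i{"i", true}, j{"j", false};
  Evaluation construct;
  construct.symbols = {&i, &j, &i};
  std::set<const Symbol *> found;
  collectSymbols(construct,
                 [](const Symbol &s) { return s.threadprivate; }, found);
  assert(found.size() == 1 && found.count(&i) == 1);
}
```

Passing the property as a predicate keeps the walk itself independent of which attribute is being queried.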

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

peixin created this revision. Apr 21 2022, 7:22 PM

Herald added a reviewer: sscalpone. Apr 21 2022, 7:22 PM

Herald added a project: Restricted Project.

Herald added subscribers: awarzynski, sdasgup3, wenzhicui and 22 others.

peixin requested review of this revision. Apr 21 2022, 7:22 PM

Herald added a reviewer: jdoerfert. Apr 21 2022, 7:22 PM

Herald added a project: Restricted Project.

Herald added subscribers: sstefan1, stephenneuendorffer, nicolasvasilache.

peixin edited the summary of this revision. Apr 21 2022, 7:23 PM

peixin added a parent revision: D124225: [flang] Add lowering stubs for OpenMP/OpenACC declarative constructs.

For omp-threadprivate-4.f90 and omp-threadprivate-5.f90 (lowering for pointer and allocatable variables), I got the following errors when running flang-new -fc1 -emit-llvm -fopenmp. The errors do not occur when this patch is applied on the fir-dev branch.

error: loc("./omp-threadprivate-5.f90":8:29): 'llvm.mlir.global' op initializer region type '!llvm.ptr<struct<(ptr<i32>, i64, i32, i8, i8, i8, i8)>>' does not match global type '!llvm.struct<(ptr<i32>, i64, i32, i8, i8, i8, i8)>'
error: loc("./omp-threadprivate-5.f90":9:30): 'llvm.mlir.global' op initializer region type '!llvm.ptr<struct<(ptr<f32>, i64, i32, i8, i8, i8, i8)>>' does not match global type '!llvm.struct<(ptr<f32>, i64, i32, i8, i8, i8, i8)>'
error: loc("./omp-threadprivate-5.f90":8:23): 'llvm.mlir.global' op initializer region type '!llvm.ptr<struct<(ptr<i32>, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>>' does not match global type '!llvm.struct<(ptr<i32>, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>'
error: loc("./omp-threadprivate-5.f90":9:24): 'llvm.mlir.global' op initializer region type '!llvm.ptr<struct<(ptr<f32>, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>>' does not match global type '!llvm.struct<(ptr<f32>, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>'
error: Lowering to LLVM IR failed
error: loc("./omp-threadprivate-5.f90":8:29): cannot be converted to LLVM IR: missing `LLVMTranslationDialectInterface` registration for dialect for op: builtin.unrealized_conversion_cast
error: loc("./omp-threadprivate-5.f90":8:29): unemittable constant value
error: failed to create the LLVM module

Does anyone know what's going on here? Is it because some work has not been upstreamed?

Harbormaster completed remote builds in B160771: Diff 424358. Apr 21 2022, 8:12 PM

Remove the split-file check, since the IR check sometimes fails across multiple runs. The test always succeeds with the correct runtime output when the IR is not checked in a specific order.

Harbormaster completed remote builds in B160778: Diff 424375. Apr 21 2022, 10:12 PM

Disable the 128-bit integer test on Windows since it is not supported there.

Harbormaster completed remote builds in B160787: Diff 424384. Apr 22 2022, 7:28 AM

Thanks for this patch. I have a few comments and questions.

For the naming of tests, I'd like them to be descriptive of "what they test" instead of 1,2,3... but I do not have a strong opinion on this. Thoughts?

flang/lib/Lower/OpenMP.cpp
94	Please add a comment here describing what this function does.
114	Please add a comment here describing what this function does.
132	Same as below: this function doesn't do much and its uses can be easily replaced with converter.getSymbolAddress(). Can we please do that?
136	I am having trouble understanding what this function is doing. Why are we generating a thread private op when the assert checks for already existing thread private op? Are there two threadprivate ops for each symbol?
137	nit: spelling - `symOrThreadprivateValue`?
1102	this function doesn't seem to do much and is only used at two places. Can we use `converter.getSymbolAddress()` in those uses?
mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
62 ↗	(On Diff #424358)	Do we need RegionLessWithResultsOpConversion and RegionLessOpConversion both? Would it be okay if we have one `RegionLessOpConversion` (that works like current `RegionLessWithResultsOpConversion`)? Would that cause a compilation issue?

@shraiysh First of all, thanks for the detailed review.

The threadprivate directive is a data-sharing directive. The code for other OpenMP constructs and clauses currently in OpenMP.cpp only needs the result of an expression, if the construct or clause has one, so an mlir::Value is enough. Semantically, threadprivate changes the definition of a variable, so the language-specific extended value is needed (here I mean fir::ExtendedValue for Fortran code), which leads to additional and more complex handling of data entities than for other constructs and clauses. This has something in common with the private/firstprivate clauses. Common blocks in private/firstprivate are not supported yet, and they will need additional processing similar to this patch. All in all, this patch borrows some code from FIR lowering, where function and variable names are usually self-documenting. I did not add comments for that code since this is common practice in FIR lowering. I only added comments for the special handling of threadprivate variables, which does not occur in the lowering of ordinary Fortran variables.

Please feel free to point out which code needs further comments if anything is hard to understand. @kiranchandramohan can help double check. The goal is to add the necessary comments and to avoid redundant ones, such as a comment that repeats the function name.

For the naming of tests, I'd like them to be descriptive of "what they test" instead of 1,2,3... but I do not have a strong opinion on this. Thoughts?

I did that at the beginning, but it resulted in very long file names as I supported more and more data types and variable usage cases.

omp-threadprivate-2.f90 -> omp-threadprivate-real-logical-complex-derivedtype.f90
omp-threadprivate-4.f90 -> omp-threadprivate-noncharacter-nonSAVEd-nonInit-scalar.f90

So I describe what is being tested in a comment in each file. This is pretty common for Fortran test cases; see flang/test/Lower/io-statement-1/2/3.f90.

flang/lib/Lower/OpenMP.cpp
94	I use the function name to comment itself. "genCommonMember" -> "generate the member of the common block"
114	I use the function name to comment itself. "getExtValue" -> "get the extended value"
132	This and converter.getSymbolAddress() have different argument types.
136	This code is in the function `threadPrivateVars`, which is called in `createBodyOfOp`. It generates an explicit threadprivate op in the parallel region. Are there two threadprivate ops for each symbol? Yes, but they are in different scopes. This code plays a role similar to `allocate` for the `private` clause (createHostAssociateVarClone in Bridge.cpp).
137	No. This is `symOriThreadprivateValue`, i.e., the original threadprivate value of the symbol. As explained above, this is generating a new and second threadprivate op.
1102	I can refactor this code and add a new function `getSymbolAddress` with a different argument type once there are three uses, following the "Rule of Three" (code duplication). The lowering of the copyin clause will probably need this as well, so I can refactor at that time. I am working on that.
mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
62 ↗	(On Diff #424358)	Yes, I tried to use `RegionLessOpConversion`, but I got a compilation error because atomic read/write operations have no result, so `curOp.getType()` fails for them. Adding an empty result to the atomic read/write ops should work, but that seems rather strange.

rriddle added inline comments. Apr 24 2022, 11:58 PM

mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
62 ↗	(On Diff #424358)	Can't you get around that by converting all of the result types instead of anchoring on `getType()` (using `->getResultTypes()`)? For the zero-result case the result type range should just be empty, for the single result there would of course be one.
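
To illustrate the point about result-type ranges without depending on MLIR, here is a toy model. It is not the code from OpenMPToLLVM.cpp: `Type`, `convertType`, `convertResultTypes`, and `convertResultTypesExample` are hypothetical stand-ins, with types represented as strings. The idea it sketches is that converting the whole list of result types handles the zero-result and single-result cases with one code path.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Type = std::string; // hypothetical stand-in for a real type class

// Toy single-type conversion: map one FIR-like reference type to an
// LLVM-like pointer type and leave every other type unchanged.
Type convertType(const Type &type) {
  if (type == "!fir.ref<i32>")
    return Type("!llvm.ptr<i32>");
  return type;
}

// Convert the whole result-type list. An operation without results simply
// yields an empty list, so no assumption is made about the result count.
std::vector<Type> convertResultTypes(const std::vector<Type> &resultTypes) {
  std::vector<Type> converted;
  converted.reserve(resultTypes.size());
  for (const Type &type : resultTypes)
    converted.push_back(convertType(type));
  return converted;
}

// Usage: the same function serves ops with zero results and with one result.
void convertResultTypesExample() {
  assert(convertResultTypes({}).empty());
  assert(convertResultTypes({"!fir.ref<i32>"}).size() == 1);
}
```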

Replace using getType with using getResultTypes.

peixin added inline comments. Apr 25 2022, 4:34 AM

mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
62 ↗	(On Diff #424358)	Thanks. It works, and this is fixed.

Harbormaster completed remote builds in B161137: Diff 424867. Apr 25 2022, 4:50 AM

rriddle added inline comments. Apr 25 2022, 9:40 AM

mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
56–60 ↗	(On Diff #424867)	Sorry, this is what I meant before. This removes any assumptions about number of types.

Replace using convertType with using convertTypes.

Harbormaster completed remote builds in B161308: Diff 425096. Apr 25 2022, 8:17 PM

Fix the clang-format issue reported on Debian.

Thanks for your patience and addressing the comments. It is a big patch and I am trying my best to understand and review it as soon as possible. I will review the test cases soon.

I didn't know what we meant by an "extended value" but if the comments are just going to be function names, I guess we can avoid that :)

For the naming of tests, I'd like them to be descriptive of "what they test" instead of 1,2,3... but I do not have a strong opinion on this. Thoughts?

I did that at the beginning, but it results in very long file name as I supports more and more data types and variable usage cases.

omp-threadprivate-2.f90 -> omp-threadprivate-real-logical-complex-derivedtype.f90
omp-threadprivate-4.f90 -> omp-threadprivate-noncharacter-nonSAVEd-nonInit-scalar.f90

So, I add what is testing in the comment of the file. This is pretty common for fortran test cases. Check flang/test/Lower/io-statement-1/2/3.f90.

Is there any issue with long file names, as long as they are descriptive but not complete sentences? On the other hand, can we add all the relevant tests in one file and separate them with comments describing each of them individually (like loop.f90 in D124277)?

Also, the tests for this minor change are huge. This is primarily because these tests try to check four things at once - thus making it integration/regression test instead of unit test. A test failure in such a scenario does not help us much. A suggestion - can we please divide the tests into individual stages so that they are more targeted and provide helpful feedback in case of failures:

flang/test/Lower/OpenMP/*.f90 - check generation of FIR and OpenMPDialect only (these tests would be added in this patch)
flang/test/Fir/convert-to-llvm.fir - check FIR to LLVM IR Dialect only.
mlir/test/Target/LLVMIR/llvmir.mlir - Check LLVM IR Dialect to LLVM IR
mlir/test/Target/LLVMIR/openmp-llvm.mlir - check OpenMP Dialect to LLVM IR

Only the first kind of test needs to be added in this patch, and maybe tests up to the LLVM Dialect, but going beyond that just makes the tests lengthy and hard to understand. We can, however, discuss having some integration tests separate from the unit tests (I am open to ideas about this).

shraiysh added inline comments. Apr 26 2022, 12:19 AM

flang/lib/Lower/OpenMP.cpp
132	I tried replacing the uses of this function with converter.getSymbolAddress and `ninja check-flang` passed for all tests. Am I missing something?
136	Okay understood. Thanks
137	Okay, I misread it.
1102	It is not just about the code duplication, this function is simply replacing getSymbolAddress <-> converter.getSymbolAddress. If it is absolutely necessary to use this lambda, it would be very unfortunate and we should comment why a simple replacement could not work.

Harbormaster completed remote builds in B161330: Diff 425132. Apr 26 2022, 12:43 AM

Is there any issue with long file names, as long as they are descriptive but not complete sentences? On the other hand, can we add all the relevant tests in one file and separate them with comments describing each of them individually (like loop.f90 in D124277)?

Either is OK with me. I guess there is no strict rule for adding test cases. I usually follow one of the existing test cases.

Also, the tests for this minor change are huge. This is primarily because these tests try to check four things at once - thus making it integration/regression test instead of unit test. A test failure in such a scenario does not help us much. A suggestion - can we please divide the tests into individual stages so that they are more targeted and provide helpful feedback in case of failures:

flang/test/Lower/OpenMP/*.f90 - check generation of FIR and OpenMPDialect only (these tests would be added in this patch)
flang/test/Fir/convert-to-llvm.fir - check FIR to LLVM IR Dialect only.
mlir/test/Target/LLVMIR/llvmir.mlir - Check LLVM IR Dialect to LLVM IR
mlir/test/Target/LLVMIR/openmp-llvm.mlir - check OpenMP Dialect to LLVM IR
Only the first kind of test needs to be added in this patch, and maybe tests up to the LLVM Dialect, but going beyond that just makes the tests lengthy and hard to understand. We can, however, discuss having some integration tests separate from the unit tests (I am open to ideas about this).

You raise a good question. When I joined the OpenMP development group, I was told to follow this process:

1. OMPIRBuilder: unittest and IR check for clang
2. MLIR Op Def: check dialect
3. lowering from MLIR to LLVMIR: check MLIR to LLVMIR
4. lowering from parse-tree to MLIR: check end-to-end (was using bbc and tco)

Is there a rule for adding test cases? @kiranchandramohan

I think there is a benefit to checking end-to-end in step 4. When I developed the code for the ordered clause, I found several bugs in work-sharing loop lowering that had gone unnoticed because end-to-end testing was missing. Usually, the MLIR checks cannot cover the full combination of cases, especially for data-related constructs and clauses. Of course, we could add them all, but it takes much more time to add the test cases there. I guess end-to-end checking was placed in step 4, using bbc and tco, because it is the least time-consuming approach.

flang/lib/Lower/OpenMP.cpp
114	// Get the extended value for \p val by extracting additional variable information from \p base. Is this reasonable to you? @shraiysh
132	Well, I didn't try that. I usually pass arguments of the same type when I use the provided functions, in case CI turns a warning into an error. Let me try this and check whether CI passes.

(like loop.f90 in D124277).

https://reviews.llvm.org/D124277#3473895. Andrzej prefers multiple files. Everyone has their own preference :).

In D124226#3473962, @peixin wrote:

(like loop.f90 in D124277).

https://reviews.llvm.org/D124277#3473895. Andrzej prefers multiple files. Everyone has their own preference :).

In D124277, I suggested multiple files as that particular file is already quite dense and to me DO and DO WHILE are different enough to merit a dedicated file each :) There is no harm in doing that, it will make the test files more specialised and potentially make testing faster (e.g. "do_loop.f90" and "do_while_loop.f90" can be run in parallel).

In general, I do prefer test files that:

Test one thing that can be uniquely determined through the file name (within reason, file name can be/should be complemented with comments inside every file). In general, it really helps when it is immediately clear "what" is being tested in a particular file.
Make triaging failures easy (good labels really help, so does MLIR's -split-input-file). For example, when you see a buildbot failure (especially if you are unfamiliar with the failing code), descriptive file names like "omp-threadprivate-real-logical-complex-derivedtype.f90" (instead of e.g. "omp-threadprivate-2.f90") can make a huge difference.
Avoid covering too many loosely related cases. In many situations, one failing case in a file means that no other cases in that particular file are being tested (one test case failing = whole file is failing).

In practice, easier said than done! And you will have slightly different priorities in different scenarios.

This is pretty common for fortran test cases. Check flang/test/Lower/io-statement-1/2/3.f90.

I don't find this naming scheme to be particularly helpful - there is no way to understand the difference between these files without opening and reading them. I really wish we avoided that.

On a related note - if you feel that test file names are too long, then all OpenMP test files could drop the leading "omp-". Since the test files are located in "flang/test/Lower/OpenMP/" (i.e. it's clear these are OpenMP tests), having the "omp-" prefix adds no new information. As for "omp-threadprivate-real-logical-complex-derivedtype.f90" vs "omp-threadprivate-2.f90" - how about "threadprivate-different-types.f90"? IIUC, that file is about testing threadprivate with different types. Is listing the types in the filename necessary? I don't mind, but if you want to make the file name shorter then there are various ways to achieve that while keeping the name quite informative.

@awarzynski Thanks for the explanations. It is good to hear a strong rationale for standardizing the test cases. Giving each file a descriptive name also seems more reasonable to me.

threadprivate-different-types.f90 doesn't represent its true intention. I will use threadprivate-real-logical-complex-derivedtype.f90 if no one opposes the long file name. Will also fix other file names.

These were the steps followed initially when we were gaining experience with the working of flang, mlir, openmpirbuilder. Having all of them in a single place helped to understand/prove that it all works. And as @peixin says it can help uncover bugs and missing steps (like adding the operation to the openmp-to-llvm conversion pass) and forces the developer to write the tests. I guess it is easier to write tests from source to LLVM IR than from FIR+OpenMP to LLVM IR. While for other dialects, the LLVM dialect is a good stop, this is not the case for us due to our slightly different flow (involving the OpenMPIRBuilder). So I think it is good to check the LLVM IR also. But I agree with @shraiysh that having all these in a single test makes the test difficult to read.

I think we can restrict the Lower directory to check for parse-tree to MLIR (FIR + OpenMP) in line with FIR lowering and move the rest to a different directory (flang/test/Integration/OpenMPLLVM) which will be fortran+openmp to OpenMP+LLVMDialect and LLVM IR. Now that we have some execution support, we should also have some execution tests somewhere (https://github.com/llvm/llvm-test-suite).

The omp prefix for tests was because they were initially in the Lower directory with the other FIR tests. When we moved it to the OpenMP directory we did not remove the prefix. Feel free to remove it.
In general, we should balance how much information can be added to a file name vs adding the same information clearly as a comment in the test. I don't have a strong opinion.

Remove getSymbolAddress.
Change the file names.
Split the test cases.

Harbormaster completed remote builds in B161375: Diff 425186. Apr 26 2022, 6:23 AM

It was mentioned to me in a patch for OpenACC that it is not desirable to have end-to-end tests (or Fortran to LLVM IR). Tests should test a single component (Fortran to FIR, FIR to LLVM IR).

In D124226#3474102, @peixin wrote:

@awarzynski Thanks for the explanations. Good to hear someone has the strong reason to make the test cases standard. It is also more reasonable for me to give one descriptive file name.

threadprivate-different-types.f90 doesn't represent its true intention. I will use threadprivate-real-logical-complex-derivedtype.f90 if no one opposes the long file name. Will also fix other file names.

In D124226#3474504, @clementval wrote:

It was mentioned to me in a patch for OpenACC that it is not desirable to have end to end tests (or Fortran to LLVM-IR). Test should test a single component (Fortran to FIR, FIR to LLVM IR).

Was that "end-to-end _instead_ of component-based tests" or "end-to-end _on top of_ component-based tests"? I agree that first and foremost we ought to be adding minimal tests for every component separately. I'm not opposed to supplementing them with more holistic/integration tests. However, I would appreciate some justification that would explain what is gained by testing multiple components (perhaps there's something that cannot be tested otherwise?).

In D124226#3474232, @kiranchandramohan wrote:

I think we can restrict the Lower directory to check for parse-tree to MLIR (FIR + OpenMP) in line with FIR lowering.

In D124226#3474623, @awarzynski wrote:

In D124226#3474102, @peixin wrote:

@awarzynski Thanks for the explanations. Good to hear someone has the strong reason to make the test cases standard. It is also more reasonable for me to give one descriptive file name.

Was that "end-to-end _instead_ of component-based tests" or "end-to-end _on top of_ component-based tests"? I agree that first and foremost we ought to be adding minimal tests for every component separately. I'm not opposed to supplementing them with more holistic/integration tests. However, I would appreciate some justification that would explain what is gained by testing multiple components (perhaps there's something that cannot be tested otherwise?).

End-to-end tests are normally hosted in the llvm-test-suite and not directly here. Component-based tests make debugging easier and point directly to the failing component.

It was mentioned to me in a patch for OpenACC that it is not desirable to have end to end tests (or Fortran to LLVM-IR). Test should test a single component (Fortran to FIR, FIR to LLVM IR).

@clementval I like this strategy. To be honest, adding so many LLVM IR checks really takes time and I really don't like to do it.

However, I would appreciate some justification that would explain what is gained by testing multiple components (perhaps there's something that cannot be tested otherwise?).

@awarzynski Some bugs stay hidden without end-to-end tests. For example, I found two (D120294 and D116073) when doing end-to-end testing. For D120294, it is not easy to write such a complex MLIR test case by hand instead of translating it from Fortran, so the case was not tested and the bug stayed hidden. For D116073, the operation argument types can mismatch if they are not handled carefully.

Anyway, it is OK with me to include only component tests in a patch. Bugs found when testing locally can be filed as issues. I would rather create an issue like this one (https://github.com/flang-compiler/f18-llvm-project/issues/1136) to show what I have tested than write so many LLVM IR checks. For that, I only need to copy and paste what I tested locally. In addition, reading the source code is easier, and checking the execution results is more useful. Also, writing an LLVM IR check for use association is not easy.

One more problem here is how to test the code in mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp? Previously, it was tested by end-to-end test for OpenMP test cases.

Well, I found a place to test it. The FIR team uses flang/test/Fir to test FIR to LLVM IR with tco or fir-opt. There was no OpenMP test there. We can add the test cases under flang/test/Fir/OpenMP/threadprivate-op.mlir to test the operation conversion, or pick a better directory name or place. @kiranchandramohan @shraiysh What do you think? Do we need end-to-end tests here?

There are tests in mlir/test/Target/LLVMIR/openmp-llvm.mlir. I guess if it is to test FIR/OpenMP integration then of course it makes sense to have where FIR is.

In my opinion, these tests belong to MLIR since OpenMP is a core dialect. It is supposed to be usable by frontends other than Flang.

I agree with this. They test (LLVM IR Dialect + OpenMP Dialect) to LLVM IR. If we accurately test F90 to (OpenMP Dialect + FIR Dialect) and (FIR Dialect + OpenMP Dialect) to (LLVM IR Dialect + OpenMP Dialect) then we probably would not need any other tests (except the end-to-end/execution tests hosted in the llvm test suite).

Previously, it was tested by end-to-end test for OpenMP test cases.

https://github.com/llvm/llvm-project/blob/main/mlir/test/Target/LLVMIR/openmp-llvm.mlir tests for them.

In my opinion, these tests belong to MLIR since OpenMP is a core dialect. It is supposed to be usable by frontends other than Flang.

That's true. But the problem is that there seems to be no tool that can test it currently. Is Flang the only user of the OpenMP dialect for now?

https://github.com/llvm/llvm-project/blob/main/mlir/test/Target/LLVMIR/openmp-llvm.mlir tests for them.

It is not. As we can see, the tests for atomic read/write and threadprivate are successful in openmp-llvm.mlir without changes in OpenMPToLLVM.cpp. The code in mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp needs to be tested by tco or fir-opt, which are both under flang. For example, the code is tested for atomic read/write using fir-opt in D122725.

There are tests in mlir/test/Target/LLVMIR/openmp-llvm.mlir. I guess if it is to test FIR/OpenMP integration then of course it makes sense to have where FIR is.

mlir/test/Target/LLVMIR/openmp-llvm.mlir is used to test translation from MLIR to LLVM IR. There is no FIR involved. The code under mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp is used to translate OpenMP MLIR (on the language side, here meaning Fortran) into OpenMP MLIR (on the MLIR side). For threadprivate, it is mainly for type conversion of arguments and results.

It is not. As we can see, the tests for atomic read/write and threadprivate are successful in openmp-llvm.mlir without changes in OpenMPToLLVM.cpp. The code in mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp needs to be tested by tco or fir-opt, which are both under flang. For example, the code is tested for atomic read/write using fir-opt in D122725.

Hmm, this is a strange situation, the tests for changes in mlir/ should reside under mlir/ and not under flang/. I think we can add tests for such changes by adding tests for these while converting OpenMP Dialect + X-dialect (any other dialect) to OpenMP Dialect + LLVM Dialect in mlir/. @rriddle @ftynse what would be the best place/combination of dialects for testing the code in mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp?

OpenMP Dialect + FIR-Dialect  -> OpenMP Dialect + LLVM Dialect
 (flang side, FIR data type)        (mlir side, LLVM data type)

BTW, these two uses of the OpenMP dialect have minor differences, mainly in data types? Currently, this translation is tested on the flang side by using tco or fir-opt.

In D124226#3476532, @shraiysh wrote:

It is not. As we can see, the tests for atomic read/write and threadprivate are successful in openmp-llvm.mlir without changes in OpenMPToLLVM.cpp. The code in mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp needs to be tested by tco or fir-opt, which are both under flang. For example, the code is tested for atomic read/write using fir-opt in D122725.

Hmm, this is a strange situation, the tests for changes in mlir/ should reside under mlir/ and not under flang/. I think we can add tests for such changes by adding tests for these while converting OpenMP Dialect + X-dialect (any other dialect) to OpenMP Dialect + LLVM Dialect in mlir/. @rriddle @ftynse what would be the best place/combination of dialects for testing the code in mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp?

Usually these are tested in https://github.com/llvm/llvm-project/blob/main/mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir. But I think that for the atomic operations and the threadprivate case there is currently no conversion from memref to the LLVM dialect that is suitable for the OpenMP dialect. If there is such a conversion, then it can be tested in https://github.com/llvm/llvm-project/blob/main/mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir.

In D124226#3476548, @peixin wrote:
OpenMP Dialect + FIR-Dialect  -> OpenMP Dialect + LLVM Dialect
 (flang side, FIR data type)        (mlir side, LLVM data type)
BTW, these two uses of the OpenMP dialect have minor differences, mainly in data types? Currently, this translation is tested on the flang side by using tco or fir-opt.

This can be achieved by FIR to LLVM Dialect conversion tests + OpenMPToLLVM conversion tests. There are similar examples in OpenACC.
https://github.com/llvm/llvm-project/blob/4a8c13a6f42e9c58af64421790509cc58208859b/mlir/test/Conversion/OpenACCToLLVM/convert-data-operands-to-llvmir.mlir#L8
https://github.com/flang-compiler/f18-llvm-project/pull/915/commits/e0a401d1490084880e468ef7b7693c2712631f5f#diff-99ef956db9470e0f2d2c079bbd6efa99698fd01d738ceef9fc422946062a3900

There are some integration tests in MLIR that seem to be switched OFF by default and only enabled by building with a CMake option. There is probably a CI somewhere testing them. This could possibly be an option for us.
https://discourse.llvm.org/t/vectorops-rfc-add-suite-of-integration-tests-for-vector-dialect-operations/1213

This can be achieved by FIR to LLVM Dialect conversion tests + OpenMPToLLVM conversion tests. There are similar examples in OpenACC.
https://github.com/llvm/llvm-project/blob/4a8c13a6f42e9c58af64421790509cc58208859b/mlir/test/Conversion/OpenACCToLLVM/convert-data-operands-to-llvmir.mlir#L8
https://github.com/flang-compiler/f18-llvm-project/pull/915/commits/e0a401d1490084880e468ef7b7693c2712631f5f#diff-99ef956db9470e0f2d2c079bbd6efa99698fd01d738ceef9fc422946062a3900

This sounds more reasonable. And I know why this patch fails for pointers/allocatables when lowering to LLVM IR by running tco, while it succeeds in fir-dev. There are still some important patches not upstreamed.

Even if we can support integration tests here, threadprivate-pointer-allocatable.f90 still needs some patches to be upstreamed before it can go through the integration test. Also, this patch blocks the review of the lowering of the default clause and the development of the copyin clause. Considering that the type conversion work has not been fully upstreamed, can I change this patch to only support lowering parse-tree to MLIR for threadprivate? At that time, we can follow the style of what OpenACC does.

In D124226#3476752, @peixin wrote:

Even if we can support integration tests here, threadprivate-pointer-allocatable.f90 still needs some patches to be upstreamed before it can go through the integration test. Also, this patch blocks the review of the lowering of the default clause and the development of the copyin clause. Considering that the type conversion work has not been fully upstreamed, can I change this patch to only support lowering parse-tree to MLIR for threadprivate? At that time, we can follow the style of what OpenACC does.

Sure that is fine with me.

Remove the operation conversion. Will support it after the upstreaming is done.

Harbormaster completed remote builds in B161579: Diff 425482. Apr 27 2022, 5:14 AM

Update this since the patch it depends on is updated.

Harbormaster completed remote builds in B161599: Diff 425509. Apr 27 2022, 8:07 AM

If you want to write a Fortran with OpenMP test that tests more of the pipeline, say to LLVM IR, then flang/test/Driver seems to be one possibility of where to put such a test. That directory is not just driver tests, but more a sundry of tests.

In D124226#3477399, @schweitz wrote:

If you want to write a Fortran with OpenMP test that tests more of the pipeline, say to LLVM IR, then flang/test/Driver seems to be one possibility of where to put such a test. That directory is not just driver tests, but more a sundry of tests.

I looked through the tests under flang/test/Driver, but adding OpenMP integration tests there does not seem appropriate. As @clementval suggested, adding component tests in llvm-project and adding the execution tests in https://github.com/llvm/llvm-test-suite in the future seems more reasonable.

In D124226#3479018, @peixin wrote:

I looked through the tests under flang/test/Driver, but adding OpenMP integration tests there does not seem appropriate.

I agree. Every test file in that directory is meant to verify a particular driver functionality. As the driver integrates various components, some tests may feel like end-to-end or integration tests. The goal, however, is just to test the driver.

As @clementval suggested, adding component tests in llvm-project and adding the execution tests in https://github.com/llvm/llvm-test-suite in the future seems more reasonable.

+1.

As a more general suggestion, I would really appreciate an RFC for this. The idea of "end-to-end" tests was brought up in the past (see e.g. https://discourse.llvm.org/t/flang-test-questions), but we never reached a conclusion. It is still unclear to me whether we actually need integration tests for this change. To me, that's a workflow question for OpenMP development and I feel that we are digressing here a bit. If this patch is for "lowering", then that's only one component for me and there should only be tests for lowering. Also, apart from llvm-test-suite, there's cross-project-tests too.

@awarzynski Thanks for the suggestions. Let's move this discussion to D124610. I would like to push this patch forward for the lowering work only so that we can move on to other OpenMP work based on this patch.

In D124226#3479937, @peixin wrote:

@awarzynski Thanks for the suggestions. Let's move this discussion to D124610. I would like to push this patch forward for the lowering work only so that we can move on to other OpenMP work based on this patch.

SGTM, thanks!

rebase and ping

peixin mentioned this in D124610: [OpenMP] Support operation conversion to LLVM for threadprivate directive. May 11 2022, 2:53 AM

Harbormaster completed remote builds in B163860: Diff 428597. May 11 2022, 3:17 AM

ping @kiranchandramohan @jeanPerier @schweitz

rebase

Herald added a subscriber: bzcheeseman. May 24 2022, 4:07 AM

Harbormaster completed remote builds in B166024: Diff 431637. May 24 2022, 4:22 AM

rebase and ping

Harbormaster completed remote builds in B167453: Diff 433660. Jun 1 2022, 11:46 PM

Thanks @peixin for this work, and apologies that the review of this patch was delayed for so long. Mostly LG. Requesting several added comments, a refactoring (if possible), and moving a portion of this patch into another one.

I would recommend adding some more information to the summary of the patch, with an example like the one below. Also briefly mention the following:
-> A threadprivate operation is created for all references to a threadprivate variable. The runtime will appropriately return a threadprivate var or its copy depending on whether it is outside or inside a parallel region.
-> For threadprivate access outside the parallel region, the threadprivate operation is created in instantiateVar.
-> Inside the parallel region, it is created in createBodyOfOp.
-> A few utility functions are created, e.g., for collecting all the variables with a property.

integer, save :: i
!$omp threadprivate(i)
call sb(i)
!$omp parallel
call sb(i)
!$omp end parallel
end

func.func @_QQmain() {
  %0 = fir.address_of(@_QFEi) : !fir.ref<i32>
  %1 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
  fir.call @_QPsb(%1) : (!fir.ref<i32>) -> ()
  omp.parallel   {
    %2 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
    fir.call @_QPsb(%2) : (!fir.ref<i32>) -> ()
    omp.terminator
  }
  return
}
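
To make the utility mentioned above (collecting all the variables with a property) more concrete, here is a minimal standalone sketch of the idea. It does not use Flang's real classes: `Symbol`, `Flag`, and `collectFlagged` are hypothetical stand-ins, and an ordered `std::vector` with a duplicate check takes the place of `llvm::SetVector`. The only behaviour it tries to mirror is "resolve each referenced symbol to its ultimate symbol, keep it if the flag is set, and keep each one once in first-seen order".

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical flag set; the real code tests Fortran::semantics::Symbol::Flag.
enum class Flag { None, OmpThreadprivate };

// Hypothetical symbol: `ultimate` points at the symbol this one is
// associated with (e.g. via use association), or is null for itself.
struct Symbol {
  Flag flag = Flag::None;
  const Symbol *ultimate = nullptr;
  const Symbol &getUltimate() const {
    return ultimate ? ultimate->getUltimate() : *this;
  }
};

// Collect the ultimate symbols carrying `flag`, without duplicates and in
// first-seen order (the property a SetVector would provide).
std::vector<const Symbol *>
collectFlagged(const std::vector<const Symbol *> &referenced, Flag flag) {
  std::vector<const Symbol *> result;
  for (const Symbol *sym : referenced) {
    const Symbol &ult = sym->getUltimate();
    if (ult.flag != flag)
      continue;
    if (std::find(result.begin(), result.end(), &ult) == result.end())
      result.push_back(&ult);
  }
  return result;
}
```

With one flagged symbol, an alias of it, and an unflagged symbol as input, the result contains only the flagged ultimate symbol, once.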

flang/include/flang/Lower/AbstractConverter.h
81	Nit: Would spelling out be better? `getSymbolExtendedValue`.
flang/lib/Lower/OpenMP.cpp
94	Nit: Would `genCommonBlockMember` be better?
94	Nit: I assume this shares some common code with other places where common block members are accessed like `instantiateCommon` in `ConvertVariable.cpp` and possibly in other places too. Can we do some refactoring to share the code? If not can you add a comment similar to `instantiateCommon`?
114	Nit: Would getExtendedValue be better? The description sounds good to me. Do you know whether additional handling is required for strings? Is this function specially made for the threadprivate case or is it generally usable?
134	Nit: Can you add a comment here saying that in this function we get the original ThreadPrivateOp corresponding to the symbol and use the appropriate field from that Operation to create the threadprivate Copy operation inside the parallel region?
139	Is this the same as `op->sym_addr()` ?
149	Nit: Can you add a comment explaining why we have to do some additional steps (like binding the Symbol) for commonblocks?
1120	Can we add this `else if` in a follow up patch? It will simplify this patch.

This revision now requires changes to proceed. Jun 3 2022, 7:39 AM

Address the comments.

Thanks @kiranchandramohan for the comments. Addressed them all.

flang/include/flang/Lower/AbstractConverter.h
81	Fixed.
flang/lib/Lower/OpenMP.cpp
94	`instantiateCommon` has multiple purposes, which stops me from reusing it here. Added a comment. In the long term, `instantiateCommon` may be split into multiple small functions when supporting more Fortran features, so I also added a FIXME here.
94	Yes, fixed.
114	Yes, fixed the function name. The additional handling is not required for non-pointer, non-allocatable strings. For pointer or allocatable strings, `box.nonDeferredLenParams()` is used to get the non-deferred length info. This function is designed for general usage. I assume future data-related constructs or clauses, such as the declare target construct, may need it. Of course, this function may be refactored to support more features such as mutable properties.
139	Thanks for the notice. Fixed it since `sym_addr` is more readable.
149	Done.
1120	Sure. Good point. Single-function patch is easier to review and maintain.

peixin added a child revision: D127047: [flang][OpenMP] Support lowering of threadprivate directive for non global variables. Jun 4 2022, 4:48 AM

Harbormaster completed remote builds in B167892: Diff 434259. Jun 4 2022, 4:52 AM

Can you check the behaviour in a nested OpenMP region, like the following?

PROGRAM MAIN
INTEGER, SAVE :: A
!$OMP THREADPRIVATE(A)
!$OMP PARALLEL
PRINT *, A
!$OMP DO
DO I=1,100
 PRINT *, A
END DO
!$OMP END DO
!$OMP END PARALLEL
END PROGRAM

You might need a change from,

if (std::is_same_v<Op, omp::ParallelOp>)
  threadPrivatizeVars(converter, eval);

to something like

if(std::is_same_v<Op, omp::ParallelOp> || getRegion().getParentOfType<mlir::omp::ParallelOp>())
  threadPrivatizeVars(converter, eval);
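
As a rough illustration of the kind of check being discussed, the following standalone sketch shows how a function templated on the operation type can combine a compile-time `std::is_same_v` test with a runtime "nested inside a parallel region" test. `ParallelOp`, `WsLoopOp`, and `shouldPrivatize` are hypothetical stand-ins, not the MLIR classes or the code in this patch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the OpenMP dialect operation classes.
struct ParallelOp {};
struct WsLoopOp {};

// Decide whether threadprivate variables should be privatized while
// generating the body of an operation of type Op. The first operand is
// resolved at compile time; the second models a runtime query such as
// "does this region have an enclosing parallel op?".
template <typename Op>
bool shouldPrivatize(bool nestedInParallel) {
  return std::is_same_v<Op, ParallelOp> || nestedInParallel;
}
```

In this sketch, `shouldPrivatize<ParallelOp>(false)` and `shouldPrivatize<WsLoopOp>(true)` are both true, while `shouldPrivatize<WsLoopOp>(false)` is false.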

The generated FIR is as follows:

func.func @_QQmain() {
  %0 = fir.address_of(@_QFEa) : !fir.ref<i32>
  %1 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
  %2 = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFEi"}
  omp.parallel   {
    %3 = fir.alloca i32 {adapt.valuebyref, pinned}
    %4 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
    ...
    %8 = fir.load %4 : !fir.ref<i32>
    %9 = fir.call @_FortranAioOutputInteger32(%7, %8) : (!fir.ref<i8>, i32) -> i1
    omp.wsloop   for  (%arg0) : i32 = (%c1_i32) to (%c100_i32) inclusive step (%c1_i32_0) {
      fir.store %arg0 to %3 : !fir.ref<i32>
      ...
      %14 = fir.load %4 : !fir.ref<i32>
      %15 = fir.call @_FortranAioOutputInteger32(%13, %14) : (!fir.ref<i8>, i32) -> i1
      omp.yield
    }
    omp.terminator
  }
  return
}

It seems to be correct. The threadprivate runtime call depends only on the thread. It gets the value of the variables according to the thread info in the runtime library. The worksharing loop distributes the loop iterations among multiple threads, but a runtime call such as __kmpc_for_static_init_4u also stores and gets the loop info (such as the index) according to the thread info in the runtime library. All in all, the thread info lives in the runtime library, so the OpenMP runtime calls can be implemented independently. Please let me know if I have misunderstood.

Check the following example:

PROGRAM MAIN
  use omp_lib
  INTEGER, SAVE :: A
  !$OMP THREADPRIVATE(A)
  !$OMP PARALLEL num_threads(10)
  A = omp_get_thread_num()
  !$OMP DO
  DO I=1,10
   PRINT *, "index = ", I, ", threadprivate value = ", A
  END DO
  !$OMP END DO
  !$OMP END PARALLEL
END PROGRAM

$ flang-new -fopenmp -flang-experimental-exec test.f90 && ./a.out
 index =  1, threadprivate value =  0
 index =  7, threadprivate value =  6
 index =  6, threadprivate value =  5
 index =  5, threadprivate value =  4
 index =  9, threadprivate value =  8
 index =  3, threadprivate value =  2
 index =  8, threadprivate value =  7
 index =  2, threadprivate value =  1
 index =  4, threadprivate value =  3
 index =  10, threadprivate value =  9
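
The same per-thread behaviour can be sketched outside Fortran with a C++ `thread_local` variable. This is only an analogy under stated assumptions: `a`, `readFromNestedWork`, and `runWorkers` are hypothetical names, `std::thread` stands in for the OpenMP team, and nothing here reflects how the OpenMP runtime is actually implemented. The point is just that each thread sees its own copy, including from code called inside the region.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the threadprivate variable A: every thread
// gets its own copy, initialised to -1.
thread_local int a = -1;

// Models work nested inside the parallel region (e.g. a loop body):
// it reads whichever copy belongs to the calling thread.
int readFromNestedWork() { return a; }

// Start numThreads workers; each writes its id into its own copy of `a`
// and records what the nested work observes.
std::vector<int> runWorkers(int numThreads) {
  std::vector<int> seen(numThreads, -2);
  std::vector<std::thread> workers;
  for (int id = 0; id < numThreads; ++id)
    workers.emplace_back([id, &seen] {
      a = id;                          // writes this thread's copy only
      seen[id] = readFromNestedWork(); // distinct element per thread
    });
  for (std::thread &w : workers)
    w.join();
  return seen;
}
```

After `runWorkers(4)`, element `i` of the result holds `i`, and the calling thread's own copy of `a` is still `-1`.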

Thanks @peixin for the clarification. For some reason, I thought that the threadprivate operation was not created if the threadprivate variable is not directly accessed in the immediate parallel region.

What about the following case where the threadprivate access is inside a function called in the parallel region?

PROGRAM MAIN
  use omp_lib
  INTEGER, SAVE :: A
  !$OMP THREADPRIVATE(A)
  !$OMP PARALLEL
  call sub()
  !$OMP end parallel
contains
  subroutine sub
  !$omp do
  DO I=1,10
   A = omp_get_thread_num()
   print *, I, A
  END DO
  !$OMP END do
  end subroutine
END PROGRAM

You hit on a good example. This is a host-associated threadprivate variable, which is not supported yet. Check https://github.com/flang-compiler/f18-llvm-project/issues/1136#issuecomment-1070900183. Supporting host-associated threadprivate variables will require some changes in FIR lowering. I will delay it until I know more about FIR lowering. Anyway, it definitely should be in a separate patch since it will need some discussion with the FIR team.

@kiranchandramohan BTW, only common blocks are allowed in threadprivate in OMP 1.0, and common blocks are the most common use of the threadprivate directive. So there is no hurry to support host-associated threadprivate variables.

Thanks @peixin for the explanation and detailed testing. Yes, we can handle host-association in a separate patch. In general, for OpenMP we might have some issues there.

LGTM. You can submit after a day if there are no further requests for change.

kiranchandramohan accepted this revision. Jun 6 2022, 5:51 AM

This revision is now accepted and ready to land. Jun 6 2022, 5:51 AM

Closed by commit rG411bd2d40788: [flang][OpenMP] Support lowering parse-tree to MLIR for threadprivate directive (authored by peixin). Jun 7 2022, 12:10 AM

This revision was automatically updated to reflect the committed changes.

peixin added a commit: rG411bd2d40788: [flang][OpenMP] Support lowering parse-tree to MLIR for threadprivate directive.

peixin removed a child revision: D127047: [flang][OpenMP] Support lowering of threadprivate directive for non global variables. Aug 22 2022, 11:41 PM

Revision Contents

Path                                                        Size
flang/
  include/flang/Lower/
    AbstractConverter.h                                     10 lines
    OpenMP.h                                                2 lines
    PFTBuilder.h                                            5 lines
  lib/Lower/
    Bridge.cpp                                              23 lines
    OpenMP.cpp                                              139 lines
    PFTBuilder.cpp                                          9 lines
  test/Lower/OpenMP/
    Todo/
      omp-threadprivate.f90
    threadprivate-char-array-chararray.f90                  46 lines
    threadprivate-commonblock.f90                           91 lines
    threadprivate-integer-different-kinds.f90               67 lines
    threadprivate-pointer-allocatable.f90                   51 lines
    threadprivate-real-logical-complex-derivedtype.f90      58 lines
    threadprivate-use-association.f90                       74 lines

Diff 434704

flang/include/flang/Lower/AbstractConverter.h

	Show All 10 Lines
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef FORTRAN_LOWER_ABSTRACTCONVERTER_H			#ifndef FORTRAN_LOWER_ABSTRACTCONVERTER_H
	#define FORTRAN_LOWER_ABSTRACTCONVERTER_H			#define FORTRAN_LOWER_ABSTRACTCONVERTER_H

	#include "flang/Common/Fortran.h"			#include "flang/Common/Fortran.h"
	#include "flang/Lower/PFTDefs.h"			#include "flang/Lower/PFTDefs.h"
	#include "flang/Optimizer/Builder/BoxValue.h"			#include "flang/Optimizer/Builder/BoxValue.h"
				#include "flang/Semantics/symbol.h"
	#include "mlir/IR/BuiltinOps.h"			#include "mlir/IR/BuiltinOps.h"
	#include "llvm/ADT/ArrayRef.h"			#include "llvm/ADT/ArrayRef.h"

	namespace fir {			namespace fir {
	class KindMapping;			class KindMapping;
	class FirOpBuilder;			class FirOpBuilder;
	} // namespace fir			} // namespace fir

	▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines
	public:			public:
	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	// Symbols			// Symbols
	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//

	/// Get the mlir instance of a symbol.			/// Get the mlir instance of a symbol.
	virtual mlir::Value getSymbolAddress(SymbolRef sym) = 0;			virtual mlir::Value getSymbolAddress(SymbolRef sym) = 0;

				virtual fir::ExtendedValue
				getSymbolExtendedValue(const Fortran::semantics::Symbol &sym) = 0;
				kiranchandramohan: Nit: Would spelling out be better? `getSymbolExtendedValue`.
				peixin: Fixed.

	/// Get the binding of an implied do variable by name.			/// Get the binding of an implied do variable by name.
	virtual mlir::Value impliedDoBinding(llvm::StringRef name) = 0;			virtual mlir::Value impliedDoBinding(llvm::StringRef name) = 0;

	/// Copy the binding of src to target symbol.			/// Copy the binding of src to target symbol.
	virtual void copySymbolBinding(SymbolRef src, SymbolRef target) = 0;			virtual void copySymbolBinding(SymbolRef src, SymbolRef target) = 0;

	/// Binds the symbol to an fir extended value. The symbol binding will be			/// Binds the symbol to an fir extended value. The symbol binding will be
	/// added or replaced at the inner-most level of the local symbol map.			/// added or replaced at the inner-most level of the local symbol map.
	virtual void bindSymbol(SymbolRef sym, const fir::ExtendedValue &exval) = 0;			virtual void bindSymbol(SymbolRef sym, const fir::ExtendedValue &exval) = 0;

	/// Get the label set associated with a symbol.			/// Get the label set associated with a symbol.
	virtual bool lookupLabelSet(SymbolRef sym, pft::LabelSet &labelSet) = 0;			virtual bool lookupLabelSet(SymbolRef sym, pft::LabelSet &labelSet) = 0;

	/// Get the code defined by a label			/// Get the code defined by a label
	virtual pft::Evaluation *lookupLabel(pft::Label label) = 0;			virtual pft::Evaluation *lookupLabel(pft::Label label) = 0;

	/// For a given symbol which is host-associated, create a clone using			/// For a given symbol which is host-associated, create a clone using
	/// parameters from the host-associated symbol.			/// parameters from the host-associated symbol.
	virtual bool			virtual bool
	createHostAssociateVarClone(const Fortran::semantics::Symbol &sym) = 0;			createHostAssociateVarClone(const Fortran::semantics::Symbol &sym) = 0;

	virtual void copyHostAssociateVar(const Fortran::semantics::Symbol &sym) = 0;			virtual void copyHostAssociateVar(const Fortran::semantics::Symbol &sym) = 0;

				/// Collect the set of symbols flagged as \p flag in \p eval region.
				virtual void collectSymbolSet(
				pft::Evaluation &eval,
				llvm::SetVector<const Fortran::semantics::Symbol *> &symbolSet,
				Fortran::semantics::Symbol::Flag flag) = 0;

	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	// Expressions			// Expressions
	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//

	/// Generate the address of the location holding the expression, \p expr.			/// Generate the address of the location holding the expression, \p expr.
	/// If \p expr is a Designator that is not compile time contiguous, the			/// If \p expr is a Designator that is not compile time contiguous, the
	/// address returned is the one of a contiguous temporary storage holding the			/// address returned is the one of a contiguous temporary storage holding the
	/// expression value. The clean-up for this temporary is added to \p context.			/// expression value. The clean-up for this temporary is added to \p context.
	▲ Show 20 Lines • Show All 111 Lines • Show Last 20 Lines

flang/include/flang/Lower/OpenMP.h

	Show All 23 Lines
	} // namespace parser			} // namespace parser

	namespace lower {			namespace lower {

	class AbstractConverter;			class AbstractConverter;

	namespace pft {			namespace pft {
	struct Evaluation;			struct Evaluation;
				struct Variable;
	} // namespace pft			} // namespace pft

	void genOpenMPConstruct(AbstractConverter &, pft::Evaluation &,			void genOpenMPConstruct(AbstractConverter &, pft::Evaluation &,
	const parser::OpenMPConstruct &);			const parser::OpenMPConstruct &);
	void genOpenMPDeclarativeConstruct(AbstractConverter &, pft::Evaluation &,			void genOpenMPDeclarativeConstruct(AbstractConverter &, pft::Evaluation &,
	const parser::OpenMPDeclarativeConstruct &);			const parser::OpenMPDeclarativeConstruct &);
	int64_t getCollapseValue(const Fortran::parser::OmpClauseList &clauseList);			int64_t getCollapseValue(const Fortran::parser::OmpClauseList &clauseList);
				void genThreadprivateOp(AbstractConverter &, const pft::Variable &);

	} // namespace lower			} // namespace lower
	} // namespace Fortran			} // namespace Fortran

	#endif // FORTRAN_LOWER_OPENMP_H			#endif // FORTRAN_LOWER_OPENMP_H

flang/include/flang/Lower/PFTBuilder.h

Show First 20 Lines • Show All 776 Lines • ▼ Show 20 Lines	return node.parent.visit(common::visitors{
[](auto &p) -> ParentType * { return getAncestor<ParentType>(p); }});		[](auto &p) -> ParentType * { return getAncestor<ParentType>(p); }});
}		}

/// Call the provided \p callBack on all symbols that are referenced inside \p		/// Call the provided \p callBack on all symbols that are referenced inside \p
/// funit.		/// funit.
void visitAllSymbols(const FunctionLikeUnit &funit,		void visitAllSymbols(const FunctionLikeUnit &funit,
std::function<void(const semantics::Symbol &)> callBack);		std::function<void(const semantics::Symbol &)> callBack);

		/// Call the provided \p callBack on all symbols that are referenced inside \p
		/// eval region.
		void visitAllSymbols(const Evaluation &eval,
		std::function<void(const semantics::Symbol &)> callBack);

} // namespace Fortran::lower::pft		} // namespace Fortran::lower::pft

namespace Fortran::lower {		namespace Fortran::lower {
/// Create a PFT (Pre-FIR Tree) from the parse tree.		/// Create a PFT (Pre-FIR Tree) from the parse tree.
///		///
/// A PFT is a light weight tree over the parse tree that is used to create FIR.		/// A PFT is a light weight tree over the parse tree that is used to create FIR.
/// The PFT captures pointers back into the parse tree, so the parse tree must		/// The PFT captures pointers back into the parse tree, so the parse tree must
/// not be changed between the construction of the PFT and its last use. The		/// not be changed between the construction of the PFT and its last use. The
Show All 12 Lines

flang/lib/Lower/Bridge.cpp

Show First 20 Lines • Show All 307 Lines • ▼ Show 20 Lines	public:
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// AbstractConverter overrides		// AbstractConverter overrides
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

mlir::Value getSymbolAddress(Fortran::lower::SymbolRef sym) override final {		mlir::Value getSymbolAddress(Fortran::lower::SymbolRef sym) override final {
return lookupSymbol(sym).getAddr();		return lookupSymbol(sym).getAddr();
}		}

		fir::ExtendedValue
		getSymbolExtendedValue(const Fortran::semantics::Symbol &sym) override final {
		Fortran::lower::SymbolBox sb = localSymbols.lookupSymbol(sym);
		assert(sb && "symbol box not found");
		return sb.toExtendedValue();
		}

mlir::Value impliedDoBinding(llvm::StringRef name) override final {		mlir::Value impliedDoBinding(llvm::StringRef name) override final {
mlir::Value val = localSymbols.lookupImpliedDo(name);		mlir::Value val = localSymbols.lookupImpliedDo(name);
if (!val)		if (!val)
fir::emitFatalError(toLocation(), "ac-do-variable has no binding");		fir::emitFatalError(toLocation(), "ac-do-variable has no binding");
return val;		return val;
}		}

void copySymbolBinding(Fortran::lower::SymbolRef src,		void copySymbolBinding(Fortran::lower::SymbolRef src,
▲ Show 20 Lines • Show All 171 Lines • ▼ Show 20 Lines	if (auto seqTy = symType.dyn_cast<fir::SequenceType>()) {
builder->create<fir::StoreOp>(loc, loadVal, fir::getBase(exv));
}
}

//===--------------------------------------------------------------------===//
// Utility methods
//===--------------------------------------------------------------------===//

		void collectSymbolSet(
		Fortran::lower::pft::Evaluation &eval,
		llvm::SetVector<const Fortran::semantics::Symbol *> &symbolSet,
		Fortran::semantics::Symbol::Flag flag) override final {
		auto addToList = [&](const Fortran::semantics::Symbol &sym) {
		const Fortran::semantics::Symbol &ultimate = sym.GetUltimate();
		if (ultimate.test(flag))
		symbolSet.insert(&ultimate);
		};
		Fortran::lower::pft::visitAllSymbols(eval, addToList);
		}

mlir::Location getCurrentLocation() override final { return toLocation(); }

/// Generate a dummy location.
mlir::Location genUnknownLocation() override final {
// Note: builder may not be instantiated yet
return mlir::UnknownLoc::get(&getMLIRContext());
}

▲ Show 20 Lines • Show All 1,931 Lines • ▼ Show 20 Lines	void mapDummiesAndResults(Fortran::lower::pft::FunctionLikeUnit &funit,
}
}

/// Instantiate variable \p var and add it to the symbol map.
/// See ConvertVariable.cpp.
void instantiateVar(const Fortran::lower::pft::Variable &var,
Fortran::lower::AggregateStoreMap &storeMap) {
Fortran::lower::instantiateVariable(*this, var, localSymbols, storeMap);
		if (var.hasSymbol() &&
		var.getSymbol().test(
		Fortran::semantics::Symbol::Flag::OmpThreadprivate))
		Fortran::lower::genThreadprivateOp(*this, var);
}

/// Prepare to translate a new function
void startNewFunction(Fortran::lower::pft::FunctionLikeUnit &funit) {
assert(!builder && "expected nullptr");
Fortran::lower::CalleeInterface callee(funit, *this);
mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments();
builder = new fir::FirOpBuilder(func, bridge.getKindMap());
▲ Show 20 Lines • Show All 599 Lines • Show Last 20 Lines

flang/lib/Lower/OpenMP.cpp

Show First 20 Lines • Show All 85 Lines • ▼ Show 20 Lines	if (const auto &privateClause =
std::get_if<Fortran::parser::OmpClause::Firstprivate>(
&clause.u)) {
createPrivateVarSyms(converter, firstPrivateClause);
}
}
firOpBuilder.restoreInsertionPoint(insPt);
}

		/// The COMMON block is a global structure. \p commonValue is the base address
		shraiysh: Please add a comment here describing what this function does.
		peixin: I use the function name to comment itself. "genCommonMember" -> "generate the member of the common block"
		kiranchandramohan: Nit: Would `genCommonBlockMember` be better?
		peixin: Yes, fixed.
		kiranchandramohan: Nit: I assume this shares some common code with other places where common block members are accessed like `instantiateCommon` in `ConvertVariable.cpp` and possibly in other places too. Can we do some refactoring to share the code? If not can you add a comment similar to `instantiateCommon`?
		peixin: The `instantiateCommon` has multiple purposes which stops me reusing that here. Add some comment. In the long term, the `instantiateCommon` may be split into multiple small functions when supporting more fortran features. So also add one FIXME here.
		/// of the COMMON block. As the offset from the symbol \p sym, generate the
		/// COMMON block member value (commonValue + offset) for the symbol.
		/// FIXME: Share the code with `instantiateCommon` in ConvertVariable.cpp.
		static mlir::Value
		genCommonBlockMember(Fortran::lower::AbstractConverter &converter,
		const Fortran::semantics::Symbol &sym,
		mlir::Value commonValue) {
		auto &firOpBuilder = converter.getFirOpBuilder();
		mlir::Location currentLocation = converter.getCurrentLocation();
		mlir::IntegerType i8Ty = firOpBuilder.getIntegerType(8);
		mlir::Type i8Ptr = firOpBuilder.getRefType(i8Ty);
		mlir::Type seqTy = firOpBuilder.getRefType(firOpBuilder.getVarLenSeqTy(i8Ty));
		mlir::Value base =
		firOpBuilder.createConvert(currentLocation, seqTy, commonValue);
		std::size_t byteOffset = sym.GetUltimate().offset();
		mlir::Value offs = firOpBuilder.createIntegerConstant(
		currentLocation, firOpBuilder.getIndexType(), byteOffset);
		mlir::Value varAddr = firOpBuilder.create<fir::CoordinateOp>(
		currentLocation, i8Ptr, base, mlir::ValueRange{offs});
		mlir::Type symType = converter.genType(sym);
		shraiysh: Please add a comment here describing what this function does.
		peixin: I use the function name to comment itself. "getExtValue" -> "get the extended value"
		peixin: `// Get the extended value for \p val by extracting additional variable information from \p base.` Is this reasonable to you? @shraiysh
		kiranchandramohan: Nit: Would getExtendedValue be better? The description sounds good to me. Do you know whether additional handling is required for strings? Is this function specially made for the threadprivate case or is it generally usable?
		peixin: Yes, fixed the function name. The additional handling is not required for non-pointer non-allocatable strings. For pointer or allocatable strings, `box.nonDeferredLenParams()` is used to get the non deferred length info. This function is designed targeting for general usage. I assume future data-related construct or clauses may need this such as declare target construct. Of course, this function may be refactored to support more functions such as mutable properties.
		return firOpBuilder.createConvert(currentLocation,
		firOpBuilder.getRefType(symType), varAddr);
		}
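
The addressing scheme in `genCommonBlockMember` (view the block as bytes, step to the member's byte offset, then reinterpret as the member type) can be illustrated outside MLIR. The following is a minimal sketch in plain C++, not the patch's code: `CommonBlock`, `storeMember`, and `loadMember` are hypothetical names, and `std::memcpy` stands in for the final typed conversion so the example stays well-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a COMMON block: one contiguous byte buffer.
struct CommonBlock {
  unsigned char storage[16];
};

// Write `value` at `byteOffset` bytes from the block base, i.e. the
// "commonValue + offset" address computed for a member.
template <typename T>
void storeMember(CommonBlock &blk, std::size_t byteOffset, const T &value) {
  assert(byteOffset + sizeof(T) <= sizeof(blk.storage) && "member out of range");
  std::memcpy(blk.storage + byteOffset, &value, sizeof(T));
}

// Read a member of type T back from the same base-plus-offset address.
template <typename T>
T loadMember(const CommonBlock &blk, std::size_t byteOffset) {
  assert(byteOffset + sizeof(T) <= sizeof(blk.storage) && "member out of range");
  T value;
  std::memcpy(&value, blk.storage + byteOffset, sizeof(T));
  return value;
}
```

With an `int` at offset 0 and a `float` at offset 4, `storeMember<int>(blk, 0, 7)` followed by `loadMember<int>(blk, 0)` returns 7 without disturbing the `float` stored next to it.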

		// Get the extended value for \p val by extracting additional variable
		// information from \p base.
		static fir::ExtendedValue getExtendedValue(fir::ExtendedValue base,
		mlir::Value val) {
		return base.match(
		[&](const fir::MutableBoxValue &box) -> fir::ExtendedValue {
		return fir::MutableBoxValue(val, box.nonDeferredLenParams(), {});
		},
		[&](const auto &) -> fir::ExtendedValue {
		return fir::substBase(base, val);
		});
		}
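
The `match` call in `getExtendedValue` dispatches on the kind of extended value: one handler for mutable boxes, one generic handler for everything else. A rough analogue using `std::variant` is sketched below. The types `PlainValue`, `MutableBox`, `ExtValue` and the function `withNewBase` are simplified, hypothetical stand-ins and do not reflect the real `fir::ExtendedValue` API.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified value kinds (not the real FIR classes).
struct PlainValue {
  std::string addr;
};
struct MutableBox {
  std::string addr;
  std::vector<int> lenParams; // stands in for the non-deferred length params
};
using ExtValue = std::variant<PlainValue, MutableBox>;

// Standard C++17 helper to build one visitor from several lambdas.
template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Return a copy of `base` whose address is replaced by `newAddr`.
ExtValue withNewBase(const ExtValue &base, const std::string &newAddr) {
  return std::visit(
      Overloaded{
          // Mutable boxes are rebuilt, carrying over only the length params.
          [&](const MutableBox &box) -> ExtValue {
            return MutableBox{newAddr, box.lenParams};
          },
          // Every other kind keeps its extra data and swaps the base.
          [&](const auto &other) -> ExtValue {
            auto copy = other;
            copy.addr = newAddr;
            return copy;
          }},
      base);
}
```

The non-generic lambda wins overload resolution for `MutableBox`, which mirrors how the specific `fir::MutableBoxValue` handler takes precedence over the `const auto &` fallback.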

		static void threadPrivatizeVars(Fortran::lower::AbstractConverter &converter,
		shraiysh: Same as below: this function doesn't do much and its uses can be easily replaced with converter.getSymbolAddress(). Can we please do that?
		peixin: This and converter.getSymbolAddress() have different argument types.
		shraiysh: I tried replacing the uses of this function with converter.getSymbolAddress and `ninja check-flang` passed for all tests. Am I missing something?
		peixin: Well, I didn't try that. Usually I try to pass the same type argument when I use those provided functions in case CI gives one warning-to-error problem. Let me try this and test if CI can pass.
		Fortran::lower::pft::Evaluation &eval) {
		auto &firOpBuilder = converter.getFirOpBuilder();
		kiranchandramohan: Nit: Can you add a comment here saying that in this function we get the original ThreadPrivateOp corresponding to the symbol and use the appropriate field from that Operation to create the threadprivate Copy operation inside the parallel region?
		mlir::Location currentLocation = converter.getCurrentLocation();
		auto insPt = firOpBuilder.saveInsertionPoint();
		shraiysh: I am having trouble understanding what this function is doing. Why are we generating a thread private op when the assert checks for already existing thread private op? Are there two threadprivate ops for each symbol?
		peixin: This code is in the function `threadPrivateVars`, which is called in `createBodyOfOp`. This is generating one explicit threadprivate op in parallel region. Are there two threadprivate ops for each symbol? Yes. But they are in different scope. This code has the similar function as `allocate` for `private` clause (createHostAssociateVarClone in Bridge.cpp).
		shraiysh: Okay understood. Thanks
		firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock());
		shraiysh: nit: spelling - `symOrThreadprivateValue`?
		peixin: No. This is `symOriThreadprivateValue`, i.e., the original threadprivate value of the symbol. As explained above, this is generating a new and second threadprivate op.
		shraiysh: Okay, I misread it.

		// Get the original ThreadprivateOp corresponding to the symbol and use the
		kiranchandramohan: Is this the same as `op->sym_addr()`?
		peixin: Thanks for the notice. Fixed it since `sym_addr` is more readable.
		// symbol value from that operation to create one ThreadprivateOp copy
		// operation inside the parallel region.
		auto genThreadprivateOp = [&](Fortran::lower::SymbolRef sym) -> mlir::Value {
		mlir::Value symOriThreadprivateValue = converter.getSymbolAddress(sym);
		mlir::Operation *op = symOriThreadprivateValue.getDefiningOp();
		assert(mlir::isa<mlir::omp::ThreadprivateOp>(op) &&
		"The threadprivate operation not created");
		mlir::Value symValue =
		mlir::dyn_cast<mlir::omp::ThreadprivateOp>(op).sym_addr();
		return firOpBuilder.create<mlir::omp::ThreadprivateOp>(
		kiranchandramohan: Nit: Can you add a comment explaining why we have to do some additional steps (like binding the Symbol) for commonblocks?
		peixin: Done.
		currentLocation, symValue.getType(), symValue);
		};

		llvm::SetVector<const Fortran::semantics::Symbol *> threadprivateSyms;
		converter.collectSymbolSet(
		eval, threadprivateSyms,
		Fortran::semantics::Symbol::Flag::OmpThreadprivate);

		// For a COMMON block, the ThreadprivateOp is generated for itself instead of
		// its members, so only bind the value of the new copied ThreadprivateOp
		// inside the parallel region to the common block symbol only once for
		// multiple members in one COMMON block.
		llvm::SetVector<const Fortran::semantics::Symbol *> commonSyms;
		for (std::size_t i = 0; i < threadprivateSyms.size(); i++) {
		auto sym = threadprivateSyms[i];
		mlir::Value symThreadprivateValue;
		if (const Fortran::semantics::Symbol *common =
		Fortran::semantics::FindCommonBlockContaining(sym->GetUltimate())) {
		mlir::Value commonThreadprivateValue;
		if (commonSyms.contains(common)) {
		commonThreadprivateValue = converter.getSymbolAddress(*common);
		} else {
		commonThreadprivateValue = genThreadprivateOp(*common);
		converter.bindSymbol(*common, commonThreadprivateValue);
		commonSyms.insert(common);
		}
		symThreadprivateValue =
		genCommonBlockMember(converter, *sym, commonThreadprivateValue);
		} else {
		symThreadprivateValue = genThreadprivateOp(*sym);
		}

		fir::ExtendedValue sexv = converter.getSymbolExtendedValue(*sym);
		fir::ExtendedValue symThreadprivateExv =
		getExtendedValue(sexv, symThreadprivateValue);
		converter.bindSymbol(*sym, symThreadprivateExv);
		}

		firOpBuilder.restoreInsertionPoint(insPt);
		}
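
The loop in `threadPrivatizeVars` creates one threadprivate copy per standalone symbol but only one per COMMON block, however many of its members are referenced in the region. The bookkeeping can be sketched as follows. This is a simplified model under stated assumptions: `Sym`, `Region`, `makeThreadprivate`, and `threadPrivatize` are hypothetical names, and strings stand in for MLIR values.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical symbol: `common` is empty when the symbol is not in a COMMON block.
struct Sym {
  std::string name;
  std::string common;
  std::size_t offset;
};

// Hypothetical parallel region: records bindings and counts created ops.
struct Region {
  std::map<std::string, std::string> bindings;
  int opsCreated = 0;

  std::string makeThreadprivate(const std::string &global) {
    ++opsCreated;
    return "tp(" + global + ")#" + std::to_string(opsCreated);
  }
};

void threadPrivatize(Region &region, const std::vector<Sym> &syms) {
  std::set<std::string> seenCommons;
  for (const Sym &sym : syms) {
    std::string value;
    if (!sym.common.empty()) {
      // First member seen: create and bind the copy for the whole block.
      if (seenCommons.insert(sym.common).second)
        region.bindings[sym.common] = region.makeThreadprivate(sym.common);
      // Every member is addressed relative to the block's copy.
      value = region.bindings[sym.common] + "+" + std::to_string(sym.offset);
    } else {
      value = region.makeThreadprivate(sym.name);
    }
    region.bindings[sym.name] = value;
  }
}
```

For members `a` and `b` of one block plus a standalone `x`, only two ops are created: one for the block and one for `x`.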

static void genObjectList(const Fortran::parser::OmpObjectList &objectList,
Fortran::lower::AbstractConverter &converter,
llvm::SmallVectorImpl<Value> &operands) {
auto addOperands = [&](Fortran::lower::SymbolRef sym) {
const mlir::Value variable = converter.getSymbolAddress(sym);
if (variable) {
operands.push_back(variable);
} else {
▲ Show 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	createBodyOfOp(Op &op, Fortran::lower::AbstractConverter &converter,
if (storeOp)
firOpBuilder.setInsertionPointAfter(storeOp);
else
firOpBuilder.setInsertionPointToStart(&block);

// Handle privatization. Do not privatize if this is the outer operation.
if (clauses && !outerCombined)
privatizeVars(converter, *clauses);

		if (std::is_same_v<Op, omp::ParallelOp>)
		threadPrivatizeVars(converter, eval);
}
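
The `std::is_same_v<Op, omp::ParallelOp>` test in `createBodyOfOp` selects the extra threadprivate step by operation type. A sketch of the same type-based selection is shown below, written with `if constexpr` as one possible variant (the patch itself uses a plain `if`). `ParallelOp`, `WsLoopOp`, and `bodySteps` are hypothetical stand-ins.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical operation tags standing in for the OpenMP dialect ops.
struct ParallelOp {};
struct WsLoopOp {};

// List the steps taken when building the body of an operation of type Op.
template <typename Op>
std::vector<std::string> bodySteps() {
  std::vector<std::string> steps{"privatize"};
  // Only parallel regions get the extra threadprivate copies.
  if constexpr (std::is_same_v<Op, ParallelOp>)
    steps.push_back("threadprivatize");
  return steps;
}
```

With `if constexpr` the discarded branch is not instantiated, which matters only when that branch would fail to compile for other `Op` types; for a call that compiles for every `Op`, as in the patch, both forms behave the same.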

static void genOMP(Fortran::lower::AbstractConverter &converter,
Fortran::lower::pft::Evaluation &eval,
const Fortran::parser::OpenMPSimpleStandaloneConstruct
&simpleStandaloneConstruct) {
const auto &directive =
std::get<Fortran::parser::OmpSimpleStandaloneDirective>(
▲ Show 20 Lines • Show All 734 Lines • ▼ Show 20 Lines	std::visit(
[&](const Fortran::parser::OpenMPCriticalConstruct
&criticalConstruct) {
genOMP(converter, eval, criticalConstruct);
},
},
ompConstruct.u);
}

		void Fortran::lower::genThreadprivateOp(
		Fortran::lower::AbstractConverter &converter,
		const Fortran::lower::pft::Variable &var) {
		fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
		mlir::Location currentLocation = converter.getCurrentLocation();

		const Fortran::semantics::Symbol &sym = var.getSymbol();
		shraiysh: this function doesn't seem to do much and is only used at two places. Can we use `converter.getSymbolAddress()` in those uses?
		peixin: I can refactor this code and add one new function `getSymbolAddress` with different argument type when it gets three according to "Rule of Three" (code duplication). It seems that the lowering of copyin clause will also need this, I can refactor this at that time. I am working on that.
		shraiysh: It is not just about the code duplication, this function is simply replacing getSymbolAddress <-> converter.getSymbolAddress. If it is absolutely necessary to use this lambda, it would be very unfortunate and we should comment why a simple replacement could not work.
		mlir::Value symThreadprivateValue;
		if (const Fortran::semantics::Symbol *common =
		Fortran::semantics::FindCommonBlockContaining(sym.GetUltimate())) {
		mlir::Value commonValue = converter.getSymbolAddress(*common);
		if (mlir::isa<mlir::omp::ThreadprivateOp>(commonValue.getDefiningOp())) {
		// Generate ThreadprivateOp for a common block instead of its members and
		// only do it once for a common block.
		return;
		}
		// Generate ThreadprivateOp and rebind the common block.
		mlir::Value commonThreadprivateValue =
		firOpBuilder.create<mlir::omp::ThreadprivateOp>(
		currentLocation, commonValue.getType(), commonValue);
		converter.bindSymbol(*common, commonThreadprivateValue);
		// Generate the threadprivate value for the common block member.
		symThreadprivateValue =
		genCommonBlockMember(converter, sym, commonThreadprivateValue);
		} else {
		kiranchandramohan: Can we add this `else if` in a follow up patch? It will simplify this patch.
		peixin: Sure. Good point. Single-function patch is easier to review and maintain.
		mlir::Value symValue = converter.getSymbolAddress(sym);
		symThreadprivateValue = firOpBuilder.create<mlir::omp::ThreadprivateOp>(
		currentLocation, symValue.getType(), symValue);
		}

		fir::ExtendedValue sexv = converter.getSymbolExtendedValue(sym);
		fir::ExtendedValue symThreadprivateExv =
		getExtendedValue(sexv, symThreadprivateValue);
		converter.bindSymbol(sym, symThreadprivateExv);
		}

void Fortran::lower::genOpenMPDeclarativeConstruct(
Fortran::lower::AbstractConverter &converter,
Fortran::lower::pft::Evaluation &eval,
const Fortran::parser::OpenMPDeclarativeConstruct &ompDeclConstruct) {

std::visit(
common::visitors{
[&](const Fortran::parser::OpenMPDeclarativeAllocate
Show All 10 Lines	std::visit(
TODO(converter.getCurrentLocation(), "OpenMPDeclareSimdConstruct");
},
[&](const Fortran::parser::OpenMPDeclareTargetConstruct
&declareTargetConstruct) {
TODO(converter.getCurrentLocation(),
"OpenMPDeclareTargetConstruct");
},
[&](const Fortran::parser::OpenMPThreadprivate &threadprivate) {
TODO(converter.getCurrentLocation(), "OpenMPThreadprivate");		// The directive is lowered when instantiating the variable to
		// support the case of threadprivate variable declared in module.
},
},
ompDeclConstruct.u);
}

flang/lib/Lower/PFTBuilder.cpp

	Show First 20 Lines • Show All 1,803 Lines • ▼ Show 20 Lines
	void Fortran::lower::pft::visitAllSymbols(
	const Fortran::lower::pft::FunctionLikeUnit &funit,
	const std::function<void(const Fortran::semantics::Symbol &)> callBack) {
	SymbolVisitor visitor{callBack};
	funit.visit([&](const auto &functionParserNode) {
	parser::Walk(functionParserNode, visitor);
	});
	}

				void Fortran::lower::pft::visitAllSymbols(
				const Fortran::lower::pft::Evaluation &eval,
				const std::function<void(const Fortran::semantics::Symbol &)> callBack) {
				SymbolVisitor visitor{callBack};
				eval.visit([&](const auto &functionParserNode) {
				parser::Walk(functionParserNode, visitor);
				});
				}
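
The new `visitAllSymbols` overload is what `collectSymbolSet` in Bridge.cpp builds on: walk every symbol under an evaluation, and let a callback keep the ones carrying a given flag, each at most once. A self-contained sketch of that pattern follows. `Symbol`, `Node`, and `collectThreadprivate` are hypothetical, and a `std::vector` with a linear search stands in for `llvm::SetVector` (insertion-ordered, no duplicates).

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical symbol with a single flag of interest.
struct Symbol {
  std::string name;
  bool threadprivate;
};

// Hypothetical tree node referencing symbols, standing in for an evaluation.
struct Node {
  std::vector<const Symbol *> symbols;
  std::vector<Node> children;
};

// Invoke `callBack` on every symbol referenced in the subtree.
void visitAllSymbols(const Node &node,
                     const std::function<void(const Symbol &)> &callBack) {
  for (const Symbol *sym : node.symbols)
    callBack(*sym);
  for (const Node &child : node.children)
    visitAllSymbols(child, callBack);
}

// Collect flagged symbols once each, in first-seen order.
std::vector<const Symbol *> collectThreadprivate(const Node &root) {
  std::vector<const Symbol *> result;
  visitAllSymbols(root, [&](const Symbol &sym) {
    bool seen = std::find(result.begin(), result.end(), &sym) != result.end();
    if (sym.threadprivate && !seen)
      result.push_back(&sym);
  });
  return result;
}
```

A symbol referenced several times in the region therefore yields a single entry, which is what lets the lowering create one threadprivate copy per symbol.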

flang/test/Lower/OpenMP/Todo/omp-threadprivate.f90

This file was deleted.

	! This test checks lowering of OpenMP threadprivate Directive.

	// RUN: not flang-new -fc1 -emit-fir -fopenmp %s 2>&1 | FileCheck %s

	program main
	integer, save :: x, y

	// CHECK: not yet implemented: OpenMPThreadprivate
	!$omp threadprivate(x, y)
	end

flang/test/Lower/OpenMP/threadprivate-char-array-chararray.f90

This file was added.

				! This test checks lowering of OpenMP Threadprivate Directive.
				! Test for character, array, and character array.

				!RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				module test
				character :: x
				integer :: y(5)
				character(5) :: z(5)

				!$omp threadprivate(x, y, z)

				!CHECK-DAG: fir.global @_QMtestEx : !fir.char<1> {
				!CHECK-DAG: fir.global @_QMtestEy : !fir.array<5xi32> {
				!CHECK-DAG: fir.global @_QMtestEz : !fir.array<5x!fir.char<1,5>> {

				contains
				subroutine sub()
				!CHECK-DAG: [[ADDR0:%.*]] = fir.address_of(@_QMtestEx) : !fir.ref<!fir.char<1>>
				!CHECK-DAG: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.char<1>> -> !fir.ref<!fir.char<1>>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.address_of(@_QMtestEy) : !fir.ref<!fir.array<5xi32>>
				!CHECK-DAG: [[NEWADDR1:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<!fir.array<5xi32>> -> !fir.ref<!fir.array<5xi32>>
				!CHECK-DAG: [[ADDR2:%.*]] = fir.address_of(@_QMtestEz) : !fir.ref<!fir.array<5x!fir.char<1,5>>>
				!CHECK-DAG: [[NEWADDR2:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<!fir.array<5x!fir.char<1,5>>> -> !fir.ref<!fir.array<5x!fir.char<1,5>>>
				!CHECK-DAG: %{{.*}} = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.embox [[NEWADDR1]](%{{.*}}) : (!fir.ref<!fir.array<5xi32>>, !fir.shape<1>) -> !fir.box<!fir.array<5xi32>>
				!CHECK-DAG: %{{.*}} = fir.embox [[NEWADDR2]](%{{.*}}) : (!fir.ref<!fir.array<5x!fir.char<1,5>>>, !fir.shape<1>) -> !fir.box<!fir.array<5x!fir.char<1,5>>>
				print *, x, y, z

				!$omp parallel
				!CHECK-DAG: [[ADDR33:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.char<1>> -> !fir.ref<!fir.char<1>>
				!CHECK-DAG: [[ADDR34:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<!fir.array<5xi32>> -> !fir.ref<!fir.array<5xi32>>
				!CHECK-DAG: [[ADDR35:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<!fir.array<5x!fir.char<1,5>>> -> !fir.ref<!fir.array<5x!fir.char<1,5>>>
				!CHECK-DAG: %{{.*}} = fir.convert [[ADDR33]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR34]](%{{.*}}) : (!fir.ref<!fir.array<5xi32>>, !fir.shape<1>) -> !fir.box<!fir.array<5xi32>>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR35]](%{{.*}}) : (!fir.ref<!fir.array<5x!fir.char<1,5>>>, !fir.shape<1>) -> !fir.box<!fir.array<5x!fir.char<1,5>>>
				print *, x, y, z
				!$omp end parallel

				!CHECK-DAG: %{{.*}} = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.char<1>>) -> !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.embox [[NEWADDR1]](%{{.*}}) : (!fir.ref<!fir.array<5xi32>>, !fir.shape<1>) -> !fir.box<!fir.array<5xi32>>
				!CHECK-DAG: %{{.*}} = fir.embox [[NEWADDR2]](%{{.*}}) : (!fir.ref<!fir.array<5x!fir.char<1,5>>>, !fir.shape<1>) -> !fir.box<!fir.array<5x!fir.char<1,5>>>
				print *, x, y, z

				end
				end

flang/test/Lower/OpenMP/threadprivate-commonblock.f90

This file was added.

				! This test checks lowering of OpenMP Threadprivate Directive.
				! Test for common block.

				!RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				module test
				integer:: a
				real :: b(2)
				complex, pointer :: c, d(:)
				character(5) :: e, f(2)
				common /blk/ a, b, c, d, e, f

				!$omp threadprivate(/blk/)

				!CHECK: fir.global common @_QBblk(dense<0> : vector<103xi8>) : !fir.array<103xi8>

				contains
				subroutine sub()
				!CHECK: [[ADDR0:%.*]] = fir.address_of(@_QBblk) : !fir.ref<!fir.array<103xi8>>
				!CHECK: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.array<103xi8>> -> !fir.ref<!fir.array<103xi8>>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[C0:%.*]] = arith.constant 0 : index
				!CHECK-DAG: [[ADDR2:%.*]] = fir.coordinate_of [[ADDR1]], [[C0]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR3:%.*]] = fir.convert [[ADDR2]] : (!fir.ref<i8>) -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR4:%.*]] = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[C1:%.*]] = arith.constant 4 : index
				!CHECK-DAG: [[ADDR5:%.*]] = fir.coordinate_of [[ADDR4]], [[C1]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR6:%.*]] = fir.convert [[ADDR5]] : (!fir.ref<i8>) -> !fir.ref<!fir.array<2xf32>>
				!CHECK-DAG: [[ADDR7:%.*]] = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[C2:%.*]] = arith.constant 16 : index
				!CHECK-DAG: [[ADDR8:%.*]] = fir.coordinate_of [[ADDR7]], [[C2]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR9:%.*]] = fir.convert [[ADDR8]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.complex<4>>>>
				!CHECK-DAG: [[ADDR10:%.*]] = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[C3:%.*]] = arith.constant 40 : index
				!CHECK-DAG: [[ADDR11:%.*]] = fir.coordinate_of [[ADDR10]], [[C3]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR12:%.*]] = fir.convert [[ADDR11]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?x!fir.complex<4>>>>>
				!CHECK-DAG: [[ADDR13:%.*]] = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[C4:%.*]] = arith.constant 88 : index
				!CHECK-DAG: [[ADDR14:%.*]] = fir.coordinate_of [[ADDR13]], [[C4]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR15:%.*]] = fir.convert [[ADDR14]] : (!fir.ref<i8>) -> !fir.ref<!fir.char<1,5>>
				!CHECK-DAG: [[ADDR16:%.*]] = fir.convert [[NEWADDR0]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[C5:%.*]] = arith.constant 93 : index
				!CHECK-DAG: [[ADDR17:%.*]] = fir.coordinate_of [[ADDR16]], [[C5]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR18:%.*]] = fir.convert [[ADDR17]] : (!fir.ref<i8>) -> !fir.ref<!fir.array<2x!fir.char<1,5>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR3]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR6]](%{{.*}}) : (!fir.ref<!fir.array<2xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<2xf32>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR9]] : !fir.ref<!fir.box<!fir.ptr<!fir.complex<4>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR12]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?x!fir.complex<4>>>>>
				!CHECK-DAG: %{{.*}} = fir.convert [[ADDR15]] : (!fir.ref<!fir.char<1,5>>) -> !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR18]](%{{.*}}) : (!fir.ref<!fir.array<2x!fir.char<1,5>>>, !fir.shape<1>) -> !fir.box<!fir.array<2x!fir.char<1,5>>>
				print *, a, b, c, d, e, f

				!$omp parallel
				!CHECK: [[ADDR77:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.array<103xi8>> -> !fir.ref<!fir.array<103xi8>>
				!CHECK-DAG: [[ADDR78:%.*]] = fir.convert [[ADDR77]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR79:%.*]] = fir.coordinate_of [[ADDR78]], [[C0:%.*]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR80:%.*]] = fir.convert [[ADDR79:%.*]] : (!fir.ref<i8>) -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR81:%.*]] = fir.convert [[ADDR77]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR82:%.*]] = fir.coordinate_of [[ADDR81]], [[C1:%.*]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR83:%.*]] = fir.convert [[ADDR82:%.*]] : (!fir.ref<i8>) -> !fir.ref<!fir.array<2xf32>>
				!CHECK-DAG: [[ADDR84:%.*]] = fir.convert [[ADDR77]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR85:%.*]] = fir.coordinate_of [[ADDR84]], [[C2:%.*]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR86:%.*]] = fir.convert [[ADDR85:%.*]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.complex<4>>>>
				!CHECK-DAG: [[ADDR87:%.*]] = fir.convert [[ADDR77]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR88:%.*]] = fir.coordinate_of [[ADDR87]], [[C3:%.*]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR89:%.*]] = fir.convert [[ADDR88:%.*]] : (!fir.ref<i8>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?x!fir.complex<4>>>>>
				!CHECK-DAG: [[ADDR90:%.*]] = fir.convert [[ADDR77]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR91:%.*]] = fir.coordinate_of [[ADDR90]], [[C4:%.*]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR92:%.*]] = fir.convert [[ADDR91:%.*]] : (!fir.ref<i8>) -> !fir.ref<!fir.char<1,5>>
				!CHECK-DAG: [[ADDR93:%.*]] = fir.convert [[ADDR77]] : (!fir.ref<!fir.array<103xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR94:%.]] = fir.coordinate_of [[ADDR93]], [[C5:%.]] : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR95:%.]] = fir.convert [[ADDR94:%.]] : (!fir.ref<i8>) -> !fir.ref<!fir.array<2x!fir.char<1,5>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR80]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR83]](%{{.*}}) : (!fir.ref<!fir.array<2xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<2xf32>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR86]] : !fir.ref<!fir.box<!fir.ptr<!fir.complex<4>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR89]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?x!fir.complex<4>>>>>
				!CHECK-DAG: %{{.*}} = fir.convert [[ADDR92]] : (!fir.ref<!fir.char<1,5>>) -> !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR95]](%{{.*}}) : (!fir.ref<!fir.array<2x!fir.char<1,5>>>, !fir.shape<1>) -> !fir.box<!fir.array<2x!fir.char<1,5>>>
				print *, a, b, c, d, e, f
				!$omp end parallel

				!CHECK-DAG: %{{.*}} = fir.load [[ADDR3]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR6]](%{{.*}}) : (!fir.ref<!fir.array<2xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<2xf32>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR9]] : !fir.ref<!fir.box<!fir.ptr<!fir.complex<4>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR12]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?x!fir.complex<4>>>>>
				!CHECK-DAG: %{{.*}} = fir.convert [[ADDR15]] : (!fir.ref<!fir.char<1,5>>) -> !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR18]](%{{.*}}) : (!fir.ref<!fir.array<2x!fir.char<1,5>>>, !fir.shape<1>) -> !fir.box<!fir.array<2x!fir.char<1,5>>>
				print *, a, b, c, d, e, f

				end
				end

flang/test/Lower/OpenMP/threadprivate-integer-different-kinds.f90

This file was added.

				! This test checks lowering of OpenMP Threadprivate Directive.
				! Test for variables with different kind.

				!REQUIRES: shell
				!RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				program test
				integer, save :: i
				integer(kind=1), save :: i1
				integer(kind=2), save :: i2
				integer(kind=4), save :: i4
				integer(kind=8), save :: i8
				integer(kind=16), save :: i16

				!CHECK-DAG: [[ADDR0:%.*]] = fir.address_of(@_QFEi) : !fir.ref<i32>
				!CHECK-DAG: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<i32> -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.address_of(@_QFEi1) : !fir.ref<i8>
				!CHECK-DAG: [[NEWADDR1:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<i8> -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR2:%.*]] = fir.address_of(@_QFEi16) : !fir.ref<i128>
				!CHECK-DAG: [[NEWADDR2:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<i128> -> !fir.ref<i128>
				!CHECK-DAG: [[ADDR3:%.*]] = fir.address_of(@_QFEi2) : !fir.ref<i16>
				!CHECK-DAG: [[NEWADDR3:%.*]] = omp.threadprivate [[ADDR3]] : !fir.ref<i16> -> !fir.ref<i16>
				!CHECK-DAG: [[ADDR4:%.*]] = fir.address_of(@_QFEi4) : !fir.ref<i32>
				!CHECK-DAG: [[NEWADDR4:%.*]] = omp.threadprivate [[ADDR4]] : !fir.ref<i32> -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR5:%.*]] = fir.address_of(@_QFEi8) : !fir.ref<i64>
				!CHECK-DAG: [[NEWADDR5:%.*]] = omp.threadprivate [[ADDR5]] : !fir.ref<i64> -> !fir.ref<i64>
				!$omp threadprivate(i, i1, i2, i4, i8, i16)

				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR0]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR1]] : !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR2]] : !fir.ref<i128>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR3]] : !fir.ref<i16>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR4]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR5]] : !fir.ref<i64>
				print *, i, i1, i2, i4, i8, i16

				!$omp parallel
				!CHECK-DAG: [[ADDR39:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<i32> -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR40:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<i8> -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR41:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<i128> -> !fir.ref<i128>
				!CHECK-DAG: [[ADDR42:%.*]] = omp.threadprivate [[ADDR3]] : !fir.ref<i16> -> !fir.ref<i16>
				!CHECK-DAG: [[ADDR43:%.*]] = omp.threadprivate [[ADDR4]] : !fir.ref<i32> -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR44:%.*]] = omp.threadprivate [[ADDR5]] : !fir.ref<i64> -> !fir.ref<i64>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR39]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR40]] : !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR41]] : !fir.ref<i128>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR42]] : !fir.ref<i16>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR43]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR44]] : !fir.ref<i64>
				print *, i, i1, i2, i4, i8, i16
				!$omp end parallel

				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR0]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR1]] : !fir.ref<i8>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR2]] : !fir.ref<i128>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR3]] : !fir.ref<i16>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR4]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR5]] : !fir.ref<i64>
				print *, i, i1, i2, i4, i8, i16

				!CHECK-DAG: fir.global internal @_QFEi : i32 {
				!CHECK-DAG: fir.global internal @_QFEi1 : i8 {
				!CHECK-DAG: fir.global internal @_QFEi16 : i128 {
				!CHECK-DAG: fir.global internal @_QFEi2 : i16 {
				!CHECK-DAG: fir.global internal @_QFEi4 : i32 {
				!CHECK-DAG: fir.global internal @_QFEi8 : i64 {
				end
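
The CHECK lines above expect each integer kind to lower to a FIR integer type of matching bit width (`i8`, `i16`, `i32`, `i64`, `i128`). The following is a minimal sketch, not part of the patch, that prints the storage width of each kind so the mapping can be compared by hand. It assumes a compiler where the kind number equals the byte size and where `integer(kind=16)` is supported; the program name `kind_widths` is made up for illustration.

```fortran
! Illustrative only; not part of D124226.
program kind_widths
  implicit none
  integer          :: i     ! default integer, expected to lower to i32
  integer(kind=1)  :: i1
  integer(kind=2)  :: i2
  integer(kind=4)  :: i4
  integer(kind=8)  :: i8
  integer(kind=16) :: i16   ! assumption: the compiler supports kind=16

  ! storage_size returns the size of one element in bits.
  print *, 'i  :', storage_size(i)
  print *, 'i1 :', storage_size(i1)
  print *, 'i2 :', storage_size(i2)
  print *, 'i4 :', storage_size(i4)
  print *, 'i8 :', storage_size(i8)
  print *, 'i16:', storage_size(i16)
end program kind_widths
```

Under those assumptions the reported widths should line up with the `!fir.ref<iN>` types in the `omp.threadprivate` checks.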

flang/test/Lower/OpenMP/threadprivate-pointer-allocatable.f90

This file was added.

				! This test checks lowering of OpenMP Threadprivate Directive.
				! Test for allocatable and pointer variables.

				!RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				module test
				integer, pointer :: x(:), m
				real, allocatable :: y(:), n

				!$omp threadprivate(x, y, m, n)

				!CHECK-DAG: fir.global @_QMtestEm : !fir.box<!fir.ptr<i32>> {
				!CHECK-DAG: fir.global @_QMtestEn : !fir.box<!fir.heap<f32>> {
				!CHECK-DAG: fir.global @_QMtestEx : !fir.box<!fir.ptr<!fir.array<?xi32>>> {
				!CHECK-DAG: fir.global @_QMtestEy : !fir.box<!fir.heap<!fir.array<?xf32>>> {

				contains
				subroutine sub()
				!CHECK-DAG: [[ADDR0:%.*]] = fir.address_of(@_QMtestEm) : !fir.ref<!fir.box<!fir.ptr<i32>>>
				!CHECK-DAG: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.box<!fir.ptr<i32>>> -> !fir.ref<!fir.box<!fir.ptr<i32>>>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.address_of(@_QMtestEn) : !fir.ref<!fir.box<!fir.heap<f32>>>
				!CHECK-DAG: [[NEWADDR1:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<!fir.box<!fir.heap<f32>>> -> !fir.ref<!fir.box<!fir.heap<f32>>>
				!CHECK-DAG: [[ADDR2:%.*]] = fir.address_of(@_QMtestEx) : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>>
				!CHECK-DAG: [[NEWADDR2:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>> -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>>
				!CHECK-DAG: [[ADDR3:%.*]] = fir.address_of(@_QMtestEy) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
				!CHECK-DAG: [[NEWADDR3:%.*]] = omp.threadprivate [[ADDR3]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>> -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR2]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR3]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR0]] : !fir.ref<!fir.box<!fir.ptr<i32>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR1]] : !fir.ref<!fir.box<!fir.heap<f32>>>
				print *, x, y, m, n

				!$omp parallel
				!CHECK-DAG: [[ADDR54:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.box<!fir.ptr<i32>>> -> !fir.ref<!fir.box<!fir.ptr<i32>>>
				!CHECK-DAG: [[ADDR55:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<!fir.box<!fir.heap<f32>>> -> !fir.ref<!fir.box<!fir.heap<f32>>>
				!CHECK-DAG: [[ADDR56:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>> -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>>
				!CHECK-DAG: [[ADDR57:%.*]] = omp.threadprivate [[ADDR3]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>> -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR56]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR57]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR54]] : !fir.ref<!fir.box<!fir.ptr<i32>>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR55]] : !fir.ref<!fir.box<!fir.heap<f32>>>
				print *, x, y, m, n
				!$omp end parallel

				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR2]] : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xi32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR3]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR0]] : !fir.ref<!fir.box<!fir.ptr<i32>>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR1]] : !fir.ref<!fir.box<!fir.heap<f32>>>
				print *, x, y, m, n
				end
				end

flang/test/Lower/OpenMP/threadprivate-real-logical-complex-derivedtype.f90

This file was added.

				! This test checks lowering of OpenMP Threadprivate Directive.
				! Test for real, logical, complex, and derived type.

				!RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				module test
				type my_type
				integer :: t_i
				real :: t_arr(5)
				end type my_type
				real :: x
				complex :: y
				logical :: z
				type(my_type) :: t

				!$omp threadprivate(x, y, z, t)

				!CHECK-DAG: fir.global @_QMtestEt : !fir.type<_QMtestTmy_type{t_i:i32,t_arr:!fir.array<5xf32>}> {
				!CHECK-DAG: fir.global @_QMtestEx : f32 {
				!CHECK-DAG: fir.global @_QMtestEy : !fir.complex<4> {
				!CHECK-DAG: fir.global @_QMtestEz : !fir.logical<4> {

				contains
				subroutine sub()
				!CHECK-DAG: [[ADDR0:%.*]] = fir.address_of(@_QMtestEt) : !fir.ref<!fir.type<_QMtestTmy_type{t_i:i32,t_arr:!fir.array<5xf32>}>>
				!CHECK-DAG: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.type<_QMtestTmy_type{t_i:i32,t_arr:!fir.array<5xf32>}>> -> !fir.ref<!fir.type<_QMtestTmy_type{t_i:i32,t_arr:!fir.array<5xf32>}>>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.address_of(@_QMtestEx) : !fir.ref<f32>
				!CHECK-DAG: [[NEWADDR1:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<f32> -> !fir.ref<f32>
				!CHECK-DAG: [[ADDR2:%.*]] = fir.address_of(@_QMtestEy) : !fir.ref<!fir.complex<4>>
				!CHECK-DAG: [[NEWADDR2:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<!fir.complex<4>> -> !fir.ref<!fir.complex<4>>
				!CHECK-DAG: [[ADDR3:%.*]] = fir.address_of(@_QMtestEz) : !fir.ref<!fir.logical<4>>
				!CHECK-DAG: [[NEWADDR3:%.*]] = omp.threadprivate [[ADDR3]] : !fir.ref<!fir.logical<4>> -> !fir.ref<!fir.logical<4>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR1]] : !fir.ref<f32>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR2]] : !fir.ref<!fir.complex<4>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR3]] : !fir.ref<!fir.logical<4>>
				!CHECK-DAG: %{{.*}} = fir.coordinate_of [[NEWADDR0]]
				print *, x, y, z, t%t_i

				!$omp parallel
				!CHECK-DAG: [[ADDR38:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.type<_QMtestTmy_type{t_i:i32,t_arr:!fir.array<5xf32>}>> -> !fir.ref<!fir.type<_QMtestTmy_type{t_i:i32,t_arr:!fir.array<5xf32>}>>
				!CHECK-DAG: [[ADDR39:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<f32> -> !fir.ref<f32>
				!CHECK-DAG: [[ADDR40:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<!fir.complex<4>> -> !fir.ref<!fir.complex<4>>
				!CHECK-DAG: [[ADDR41:%.*]] = omp.threadprivate [[ADDR3]] : !fir.ref<!fir.logical<4>> -> !fir.ref<!fir.logical<4>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR39]] : !fir.ref<f32>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR40]] : !fir.ref<!fir.complex<4>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR41]] : !fir.ref<!fir.logical<4>>
				!CHECK-DAG: %{{.*}} = fir.coordinate_of [[ADDR38]]
				print *, x, y, z, t%t_i
				!$omp end parallel

				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR1]] : !fir.ref<f32>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR2]] : !fir.ref<!fir.complex<4>>
				!CHECK-DAG: %{{.*}} = fir.load [[NEWADDR3]] : !fir.ref<!fir.logical<4>>
				!CHECK-DAG: %{{.*}} = fir.coordinate_of [[NEWADDR0]]
				print *, x, y, z, t%t_i

				end
				end

flang/test/Lower/OpenMP/threadprivate-use-association.f90

This file was added.

				! This test checks lowering of OpenMP Threadprivate Directive.
				! Test for threadprivate variable in use association.

				!RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				!CHECK-DAG: fir.global common @_QBblk(dense<0> : vector<24xi8>) : !fir.array<24xi8>
				!CHECK-DAG: fir.global @_QMtestEy : f32 {

				module test
				integer :: x
				real :: y, z(5)
				common /blk/ x, z

				!$omp threadprivate(y, /blk/)

				contains
				subroutine sub()
				! CHECK-LABEL: @_QMtestPsub
				!CHECK-DAG: [[ADDR0:%.*]] = fir.address_of(@_QBblk) : !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.array<24xi8>> -> !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.address_of(@_QMtestEy) : !fir.ref<f32>
				!CHECK-DAG: [[NEWADDR1:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<f32> -> !fir.ref<f32>

				!$omp parallel
				!CHECK-DAG: [[ADDR2:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.array<24xi8>> -> !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[ADDR3:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<f32> -> !fir.ref<f32>
				!CHECK-DAG: [[ADDR4:%.*]] = fir.convert [[ADDR2]] : (!fir.ref<!fir.array<24xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR5:%.*]] = fir.coordinate_of [[ADDR4]], %{{.*}} : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR6:%.*]] = fir.convert [[ADDR5:%.*]] : (!fir.ref<i8>) -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR7:%.*]] = fir.convert [[ADDR2]] : (!fir.ref<!fir.array<24xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR8:%.*]] = fir.coordinate_of [[ADDR7]], %{{.*}} : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR9:%.*]] = fir.convert [[ADDR8:%.*]] : (!fir.ref<i8>) -> !fir.ref<!fir.array<5xf32>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR6]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR3]] : !fir.ref<f32>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR9]](%{{.*}}) : (!fir.ref<!fir.array<5xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<5xf32>>
				print *, x, y, z
				!$omp end parallel
				end
				end

				program main
				use test
				integer :: x1
				real :: z1(5)
				common /blk/ x1, z1

				!$omp threadprivate(/blk/)

				call sub()

				! CHECK-LABEL: @_QQmain()
				!CHECK-DAG: [[ADDR0:%.*]] = fir.address_of(@_QBblk) : !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[NEWADDR0:%.*]] = omp.threadprivate [[ADDR0]] : !fir.ref<!fir.array<24xi8>> -> !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[ADDR1:%.*]] = fir.address_of(@_QBblk) : !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[NEWADDR1:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<!fir.array<24xi8>> -> !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[ADDR2:%.*]] = fir.address_of(@_QMtestEy) : !fir.ref<f32>
				!CHECK-DAG: [[NEWADDR2:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<f32> -> !fir.ref<f32>

				!$omp parallel
				!CHECK-DAG: [[ADDR4:%.*]] = omp.threadprivate [[ADDR1]] : !fir.ref<!fir.array<24xi8>> -> !fir.ref<!fir.array<24xi8>>
				!CHECK-DAG: [[ADDR5:%.*]] = omp.threadprivate [[ADDR2]] : !fir.ref<f32> -> !fir.ref<f32>
				!CHECK-DAG: [[ADDR6:%.*]] = fir.convert [[ADDR4]] : (!fir.ref<!fir.array<24xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR7:%.*]] = fir.coordinate_of [[ADDR6]], %{{.*}} : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR8:%.*]] = fir.convert [[ADDR7:%.*]] : (!fir.ref<i8>) -> !fir.ref<i32>
				!CHECK-DAG: [[ADDR9:%.*]] = fir.convert [[ADDR4]] : (!fir.ref<!fir.array<24xi8>>) -> !fir.ref<!fir.array<?xi8>>
				!CHECK-DAG: [[ADDR10:%.*]] = fir.coordinate_of [[ADDR9]], %{{.*}} : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
				!CHECK-DAG: [[ADDR11:%.*]] = fir.convert [[ADDR10:%.*]] : (!fir.ref<i8>) -> !fir.ref<!fir.array<5xf32>>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR8]] : !fir.ref<i32>
				!CHECK-DAG: %{{.*}} = fir.load [[ADDR5]] : !fir.ref<f32>
				!CHECK-DAG: %{{.*}} = fir.embox [[ADDR11]](%{{.*}}) : (!fir.ref<!fir.array<5xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<5xf32>>
				print *, x1, y, z1
				!$omp end parallel

				end
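
The `!fir.array<24xi8>` type checked for `@_QBblk` in this test is the common block's total storage in bytes: one default integer followed by five default reals. The sketch below, which is not part of the patch, reproduces that arithmetic. It assumes 4-byte default `integer` and `real` and no padding between the two members; the program name `blk_bytes` is made up for illustration.

```fortran
! Illustrative only; not part of D124226.
program blk_bytes
  implicit none
  integer :: x       ! first member of /blk/
  real    :: z(5)    ! second member of /blk/
  integer :: nbytes

  ! storage_size gives bits per element; size(z) gives the element count.
  nbytes = (storage_size(x) + size(z) * storage_size(z)) / 8
  print *, 'bytes in /blk/:', nbytes
end program blk_bytes
```

With 4-byte defaults this comes to 4 + 5 * 4 = 24 bytes, which is why the members are reached through `fir.coordinate_of` byte offsets into the block followed by a `fir.convert` to the member type.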

Diff 434704