This patch adds lowering of TargetOps for the host. The lowering outlines the
target region function and uses the OpenMPIRBuilder support functions to emit
the function and call. Code generation for offloading will be done in
later patches. The tests only use scalars for now to reduce the size of the initial
patches for offloading. Runtime tests will be added in a separate patch since 'lit'
is currently not set up to handle Fortran runtime tests.
Details
Diff Detail
Event Timeline
Please split this patch into three patches (or two).
- Containing just the OpenMPIRBuilder changes.
- Containing the translation from OPenMP + LLVM Dialect to LLVM IR.
- The conversion from Fortran to FIR + OpenMP and the test from FIR + OpenMP to LLVM + OpenMP.
See comments inline. We would like to have the translation layer thin and have most of the code in the OpenMP IRBuilder. Please refer to the translation of other OpenMP Ops as detailed in the inline comment. If there is anything special to this Op that prevents it then please explain.
flang/test/Lower/OpenMP/target_region_to_llvmir.f90 | ||
---|---|---|
1 ↗ | (On Diff #509427) | We generally do not test lowering to LLVM IR. It is usually done in multiple steps:
|
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1664 | We would generally like to keep the Translation code thin and do the heavy-lifting in the OpenMP IRBuilder. I think this should be rewritten with more code sunk into the OpenMP IRBuilder. Can this be converted to other functions like convertOmpSingle or convertOmpMaster or convertOmpTaskOp or convertOmpParallel? | |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
1328–1330 | Nit: Braces not required. | |
1332 | Nit: What are the three false options? |
flang/test/Lower/OpenMP/target_region_to_llvmir.f90 | ||
---|---|---|
1 ↗ | (On Diff #509427) | Yes, I will split up the patch into 3 separate ones. What is the expectation on the various clauses? Can these be implemented in separate sets of patches? The patches going from OMP+LLVM Dialect -> LLVM IR could be very large if we do everything at once. |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1664 | I agree we want to put as much as possible in the OpenMPIRBuilder, but unless both clang and MLIR can use it, it shouldn't be in the OpenMPIRBuilder, because it would expose two different (possibly incompatible) implementations of the same thing to either clang or mlir, which in my opinion is undesirable. When we did the migration of code from clang to OpenMPIRBuilder we tried to migrate as much as possible, the problem with generating code for the target regions specifically is that there are data structures that need to be maintained which all use clang data types like mapping device pointers to values etc. There are also issues with clauses affecting the logic for how the code should be generated, these two things combined are not easy to get around. Parameterizing the code and using callback functions (which OpenMPIRBuilder does today) didn't seem feasible, because the common code would be trivial, and the parameterization would just make things less readable/maintainable. Another idea would be to use templates to be able to ignore the specific types, which might be a way forward, but unless we know what the MLIR implementation looks like it is hard to know if it would work. Another potential alternative would be to have some kind of interface for dealing with values/statements/clauses which the OpenMPIRBuilder would define and clang/mlir implement. But after some discussions with @jdoerfert we concluded that this would probably not be a good option (too complicated basically). Hopefully something will emerge while we are working on the MLIR side. If sharing code through the OpenMPIRBuilder is not a good possible, then it might be better to keep it in MLIR,. What are your thoughts about this? | |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
1332 |
Because we don't have the flags/dierctives implemented in the frontend yet. |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
---|---|---|
1664 | Thanks for the detailed explanation here. Unless there are issues for Flang/MLIR, I would still recommend moving the code to the OpenMPIRBuilder for a few reasons:
| |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
1332 | I mean't, please add a comment for each of the options so that someone who reads the code knows what they are. |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
---|---|---|
1554 | Is it generally expected for the compilation to fail if the input is given via stdin? It looks like clang currently does implement this behavior but it's not communicated to the user: $ echo -e "int main(){return 0;}" | clang -cc1 -fopenmp - -emit-llvm -o - ; ModuleID = '-' source_filename = "-" target datalayout = ... ... $ echo -e "int main(){\n#pragma omp target\n{} return 0;}" | clang -cc1 -fopenmp - -emit-llvm -o - fatal error: cannot open file '<stdin>': No such file or directory Currently flang-new seems to support compiling from stdin. Or at least that's the case for the simplest program: $ echo "end program" | flang-new -fc1 -x f95-cpp-input - -emit-mlir -o - module attributes {dlti.dl_spec ... ... |
flang/test/Lower/OpenMP/target_region_to_llvmir.f90 | ||
---|---|---|
1 ↗ | (On Diff #509427) | Yes, you can split the clauses into a series of patches. |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
---|---|---|
1664 |
Yes, there should be a way to use the same pattern, I can work on creating a convertOmpTarget function in the OpenMPIRBuilder that will do this.
My initial thinking was to keep the code structure similar between MLIR and Clang similar to make it easier to have a shared OpenMPIRBuilder interface. Given 3. there isn't much choice than to add new functions in OpenMPIRBuilder to minimize the code in MLIR.
Good to know! |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
---|---|---|
1554 | Yes, the assumption in clang is that there is a file, which is needed to generate a unique name for outlined target region functions. The code above would make flang-new fail the same way if target regions are used. |
@kiranchandramohan I moved the outlining to the OpenMPIRBuilder, is this more in line of what you were thinking? There is a patch up by @TIFitis for Fortran->Fir lowering https://reviews.llvm.org/D147339. I will prune this patch to only have the OMP+LLVM Dialect -> LLVMIR portions and appropriate tests.
For IRBuilder lowering unit tests need to be added here: llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
And for MLIR Translation tests need to be added to: mlir/test/Target/LLVMIR/omptarget-llvm.mlir
Thanks @jsjodin for the changes. Yes, moving the code to the OpenMPIRBuilder looks fine. I have not looked at the patch in detail. But I see that you have not used the OpenMPIRBuilder's outlining facility (OutlineInfo(s) and the PostOutlineCB). Are there any issues in using these here?
I have reworked the patch a bit to simplify things and added tests. The reason why the OutlineInfos etc are not used for the target regions is probably because the outlining is more complicated. With offloading we have to generate an if-then-else that tries to offload via the openmp runtime and if the runtime fails to launch the kernel the fallback is executed instead.
Simplify the code by only passing the inputs, since they will be the same values in the outlined code. Add tests and improve comments.
Also since the outlining is shared with the device codegen, that is also a reason why the OutlineInfos etc is not used.
The if-then-else is not related to the if clause, it is only for the host fallback. The code in this patch will be updated once the kernel argument handling is implemented for the offloading case. We will have to add the if clause handling as well later on.
mlir/test/Target/LLVMIR/omptarget-llvm.mlir | ||
---|---|---|
178 | It has to stay because of omp.is_device = #omp.isdevice<is_device = false> |
TL;DR, we should try to converge to one impl. in OpenMPIRBuilder but that might take a while and we should not force it where it doesn't make sense (yet).
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h | ||
---|---|---|
1828 | Nit: newline and docs above; copy&paste error below; it's not the default number but clause value; | |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
4166 | ||
4180 | FWIW, you should already (be able to) emit the kernel launch call and fallback handling. | |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1554 | @skatrak You can file a bug. We might be able to handle that fine, but it has not come up yet. |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
4180 |
Should I add the code to build the kernel args, or do you mean empty kernel args? |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
4180 | I forgot to mention that I am working on a patch to handle the kernel args which adds the kernel launch call. So I could pull that merge that patch into this or keep it separate. @kiranchandramohan, @jdoerfert you have a preference? |
Thanks @jsjodin for the changes. Yes, moving the code to the OpenMPIRBuilder looks fine. I have not looked at the patch in detail. But I see that you have not used the OpenMPIRBuilder's outlining facility (OutlineInfo(s) and the PostOutlineCB). Are there any issues in using these here?
I have reworked the patch a bit to simplify things and added tests. The reason why the OutlineInfos etc are not used for the target regions is probably because the outlining is more complicated. With offloading we have to generate an if-then-else that tries to offload via the openmp runtime and if the runtime fails to launch the kernel the fallback is executed instead.
Also since the outlining is shared with the device codegen, that is also a reason why the OutlineInfos etc is not used.
We have handling for the if clause in both task and parallel. Will the handling in Target be substantially different from these?
Task : D130615
Parallel : D138495 (The latest revision uses a runtime extension, so that we do not have to generate IR for both sequential and parallel version).The if-then-else is not related to the if clause, it is only for the host fallback. The code in this patch will be updated once the kernel argument handling is implemented for the offloading case. We will have to add the if clause handling as well later on.
I was making a general point that, we already have handling in the OpenMPIRBuilder that is capable of conditionally calling a parallel or sequential version. I was hoping that this can be used to generate the if-then-else for the offload.
In general, the concern about not using the OutlineInfo mechanism is whether this will fail to compose with existing. code that uses the OutlineInfo Mechanism.
But if you and @jdoerfert feel this is the right way to go then I don't have any issues.
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
4180 | No. I don't have a preference. |
I think it can be done.
I am unsure about how target regions can compose with the other directives that use the OutlineInfo. It might be possible to use the OutlineInfo mechanism, but I don't think it will work with some of the other existing utilities in the OpenMPIRBuilder e.g. emitTargetRegionFunction, which is used in this patch.
I think that reusing the offloading utilities is the better option right now. It will allow us to not re-implement a bunch of logic to make faster progress. Also if anything changes with respect to the omp runtime we wouldn't have to modify the code in two places. Since everything will be in the OpenMPIRBuilder we could do some refactoring later of if it seems worth doing.
Thanks for the changes. I have a few comments or questions. See inline.
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h | ||
---|---|---|
1835–1837 | Nit: Please add a full stop/dot at the end for all the three above to keep it consistent. | |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
4118 | Nit: Is 4 significant here? | |
4119–4121 | Nit: braces not required. | |
4120 | In the past we have seen that if the number of live-ins/parameters go above a certain number that leads to ABI issues on some platforms. | |
4141–4142 | Nit: spell the types. | |
4144–4151 | Nit: braces not required. | |
4196 | Nit: Is this leftover debug code? | |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1571 | Is this required? If so, please add it to other if statements as well. | |
1602 | Return success or failure? | |
1612–1614 | Nit: braces not required. | |
mlir/test/Target/LLVMIR/omptarget-llvm.mlir | ||
177 | Please also add a trivial test for an OpenMP construct that can occur inside the target region, just to ensure that the target region can contain other constructs and it works OK. Something very simple like the following. omp.target { omp.parallel { .... omp.terminator } omp.terminator } | |
178 | Nit: If the dlti is not required for this patch, then you can consider omitting it. |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
4118 |
No, I picked something small that hopefully doesn't require a resize. Not sure if there is a better value to pick? | |
4120 |
Clang does not pack arguments into a struct for the outlined fallback (host) for target regions. It could be that the number of parameters are fewer compared to other outlining cases. | |
4196 | Yes, thanks for finding it! | |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1571 | Yes, I think it is okay to error out. I will add it to the other cases. | |
1602 | Success for now. Basically ignore this so we can get through the driver. There is a follow up patch https://reviews.llvm.org/D147940 that will allow us to enable device codegen. | |
mlir/test/Target/LLVMIR/omptarget-llvm.mlir | ||
177 |
Sure, I can add a test for this. | |
178 |
We can remove dlti it for this patch. |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
4118 | Doesn't the programmers manual recommend omitting if the value is not significant? In the absence of a well-motivated choice for the number of inlined elements N, it is recommended to use SmallVector<T> (that is, omitting the N). This will choose a default number of inlined elements reasonable for allocation on the stack (for example, trying to keep sizeof(SmallVector<T>) around 64 bytes). https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h | |
4120 | OK. It is fine for now. |
A few comments/questions.
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
60 | Is this required? | |
88 | Is this required? | |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1643–1644 | Can this function be sunk into the OpenMPIRBuilder? If it is just a call like this then please consider, otherwise ignore. | |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
1327–1335 | Ideally, this code should be in OpenMPToLLVMIRTranslation.cpp. Can this be moved there? |
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
60 |
No, thanks for pointing this out. I think it was needed in an earlier iteration. | |
88 |
Not needed here either. Thanks! | |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
1643–1644 |
No, unfortunately it cannot be replaced with a simple function call. | |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
1327–1335 |
I'm not sure how to do that. The code in OpenMPToLLVMIRTranslation code implements an interface and does not have an internal state, and we probably don't want to set the config in every function. The object that holds the state is ModuleTranslation. We could move this code to the constructor of the ModuleTranslation class if you prefer. |
Removed useless dependencies in cmake file. Moved setting of config to the ModuleTranslation constructor.
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
---|---|---|
460 | This will always create the OpenMPIRBuilder, even for non-OpenMP flows which is not what we want. getOpenMPBuilder creates the Builder if it is not yet created. We can consider setting the config there. |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
---|---|---|
460 |
Yes, that is a much better option. Moved the code there instead. |
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | ||
---|---|---|
1269 | I will remove this recursive call. |
LGTM. Please check with @jdoerfert before you submit.
It will be good if someone in your team can go through the tests in detail.
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
42 | Nit: Please reconfirm that this is required. | |
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
14 | Is this include required? |
Thank you for taking the time to review this patch!
I will let @jdoerfert review, and have someone look at the test.
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
4118 |
Looked over everything one final time, including the tests, it all LGTM.
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | ||
---|---|---|
1581 | I have a patch that moves this into the OMP IR builder: https://reviews.llvm.org/D149162 so it might be possible to replace this at some point in the near future with it! It's used for declare target variables as well, and perhaps some other things. I can likely create a patch to change it to utilise the OMPIRBuilder version if/when it lands. I bring it up as a future TODO, for myself (or whomever) rather than something to do in this patch. |
Looks like this broke our sanitizer bot: https://lab.llvm.org/buildbot/#/builders/5/builds/33303. Could you please revert or fix forward?
Thanks!
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
42 | This seems problematic. OpenMP dialect should not be a dependency on the generic LLVMIR export. It seems there is a layering issue here. You need to add your own LLVMIR export target with the OpenMP dialect registered. |
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
42 | @jsjodin If the link error is coming due to the use of the OffloadInterface then you can remove its usage and access attributes directly. If accessing attributes still causes a link error then you can consider initializing the OpenMPIRBuilder using the amdendOperation flow. You might have to move convertModule to somewhere earlier for this to work. If that is also not possible and there are no other ideas then please revert the patch. |
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
42 | Sorry for the delayed response. I am looking into this issue. |
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
42 | A fix is in this patch: https://reviews.llvm.org/D151745 |
mlir/lib/Target/LLVMIR/CMakeLists.txt | ||
---|---|---|
42 |
This was committed now. |
Nit: newline and docs above; copy&paste error below; it's not the default number but clause value;