Patch: https://reviews.llvm.org/D128940 added support for simd if clause for MLIR dialect. This patch adds support for lowering MLIR simd if clause to LLVM IR.
peixin kiranchandramohan kiranktp shraiysh psoni2628 dpalermo sscalpone ftynse jdoerfert Meinersbur
- Group Reviewers
Restricted Project Restricted Project
- rGd90b7bf2c53d: Add support for lowering simd if clause to LLVM IR
Other options include,
-> Adding the if option to applySimd in the OpenMPIRBuilder. CreateParallel takes in the ifClause as an argument. But it can be argued that it does this since ifClause for parallel requires the insertion of runtime calls. https://github.com/llvm/llvm-project/blob/6858a17f66f6d609e2d801c443da951c391636a0/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp#L833
-> Handling the if clause in the OpenMPToLLVMConversion pass. https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
-> Or adding an OpenMPtoSCF pass like what OpenACC does and handling the if clause there. https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/OpenACCToSCF/OpenACCToSCF.cpp
|4 ↗||(On Diff #443226)|
It might be better to not mix test like that. Normally here fir is tested. You can use tco to test fir to LLVM IR translation.
This (and the following lines) is better if it's done in the OpenMPIRBuilder.
Nit : Please use spaces instead of Tab
I'm not sure I follow your question here. I personally did not opt for anything in this patch since it is yours. In OpenACC, for standalone data movement ops it's straight forward to do it in MLIR directly.
I think that can be done here too (like Kiran suggested). In that case, these lines would be eliminated. There are pros and cons to both the approaches 1 and 3 in Kiran's reply. If we follow the OpenACC approach, this same thing will have to be done inside clang, but it should be straightforward for OpenMP Dialect. If we follow the OpenMP IR Builder, it will probably be a bit trickier, but it will be a nice abstraction for both OpenMP Dialect and Clang.
I think what Roger meant was why choose one over another? Or, in other words - which one of these should we do here? (I don't have a strong preference for one or another)
Apologies for taking the wrong name.
In general, we should consider lowering the clauses or operation in translation as the last possible option. Translation is intended to be a light-weight transformation and indeed it is for LLVM dialect. For using the OpenMPIRBuilder we do some more stuff in translation but we should try to keep it as lightweight as possible and implement things in the OpenMPIRBuilder or if that is not possible, in a conversion pass. So preference will be,
Also, it will be good to have things implemented in a uniform fashion for various OpenMP Ops and also among OpenACC and OpenMP.
[serious] This is really ugly from a software architecture point of view. Where previously all could expect a CanonicalLoopInfo has a preheader has to exist, it now any loop-handling code has to check for it.
The description of CanonicalLoopInfo needs to be updated.
What this function does should be achievable with 3 elementary steps:
Step 1 does not yet exist as an OpenMPIRBuilder method but but maybe should be made available independent of applySimd to by usable for anything that implements an if clause.
I would not object to have such a helper function doing all 3 steps together, but the description should reflect that.
Please avoid copy&paste code. The common code can be extracted into a separate function.
However, I don't think a version taking a Loop info is even necessary. See comment below.
[serious] As far as I understand, the only reason this instantiates a pass manger is to use the cloneLoopWithPreheader function. However, it is sufficient to clone all block from CanonicalLoopInfo::getHeader() to CanonicalLoopInfo::getAfter(). Since the ExitBlock is guaranteed to be the only exit out of the loop (at least until I can introduce breaks; D124823 is a step into that direction), all blocks in the loop can be simply enumerated and cloned without cloneLoopWithPreheader.
This functionality is probably useful for implementing any loop versioning for if-clauses and should probably made its own method.
I decided to modify createCanonicalLoop interface, because omp loop trip count was calculated outside the loop preheader.
I needed to calculate trip count inside loop preheader because I used cloneLoopWithPreheader function. This function needs a BasicBlock which dominates. If we create loop trip count outside loop preheader then we cannot use loop preheader as dominator block.
According to your comments we can omit cloneLoopWithPreheader function call. In consequence there is no need to modify createCanonicalLoop function.
Implemented simd if clause- applying review remarks:
- Use simplified cloning mechanism (without cloneLoopWithPreheader).
- Removed addLoopMetadata(Loop *L, Metadata) function.
- Added more tests.
I do not use cloneLoopWithPreheader function any more, but I still need pass manager to define Loop blocks. Original applySimd function also uses pass manager for defining loop blocks. Please let me know if you have better idea, how to extract all body blocks.
As mentioned, I think this can be integrated into the existing applySimd methods. All it does need is the IfCond argument, and if it is set, implement the loop versioning. After that, use the same code to attach the metadata.
collapseLoops can be called by the translation layer to get a single CanonicalLoopInfo handle to be passed to this function.
Enumerating loop block without LoopInfo will be possible after D124823. Instantiation of an entire pass manager infrastructure does not scale well and should not be a long-term solution. It runs DominatorTree, LoopInfo etc on the entire function. n loops in the function have a complexity of at O(n³), because DominatorTree is already O(n²).
Do not collapse loops inside function responsible for applying SIMD metadirectives
Added special function for loop versioning
LGTM, this turned out really nice. Thank you!
Some nitpicks left. Please apply before committing.
This could be made private as front-ends are encouraged to use the IfCond argument for the relevant instead that may implement specialized behavior (like passing the value to the runtime).
It still might be useful for front-ends (eg implementing dynamic metadirective), but I'd wait to such use cases to occur before adding this to the public API.
Edit: I noticed it is tested in the unit test, which requires it to be public.
Could you add a TODO/FIXME comment mentioning that we should not depend on a pass manger?
operator may modify the underlaying hashtable.
The name MappedLatch is more meaningful than MappedValue.
I moved it to private and I removed corresponding unit test. This functionality is also tested by applySimdLoopIf unit test.