This is an archive of the discontinued LLVM Phabricator instance.

[mlir][OpenMP] Lower simd if clause to LLVM IR

Authored by domada on Jul 8 2022, 6:00 AM.



Patch: added support for simd if clause for MLIR dialect. This patch adds support for lowering MLIR simd if clause to LLVM IR.

Diff Detail

Event Timeline

domada created this revision.Jul 8 2022, 6:00 AM
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
domada requested review of this revision.Jul 8 2022, 6:00 AM

Other options include,
-> Adding the if option to applySimd in the OpenMPIRBuilder. CreateParallel takes in the ifClause as an argument. But it can be argued that it does this since ifClause for parallel requires the insertion of runtime calls.
-> Handling the if clause in the OpenMPToLLVMConversion pass.
-> Or adding an OpenMPtoSCF pass like what OpenACC does and handling the if clause there.

clementval added inline comments.
4 ↗(On Diff #443226)

It might be better to not mix test like that. Normally here fir is tested. You can use tco to test fir to LLVM IR translation.


This (and the following lines) is better if it's done in the OpenMPIRBuilder.

raghavendhra added inline comments.

Nit : Please use spaces instead of Tab

domada added inline comments.Jul 11 2022, 12:26 PM

I would like to ask you why you opt for modification of OpenMPIRBuilder instead of adding an OpenMPtoSCF pass like what OpenACC does and handling the if clause there. ?

clementval added inline comments.Jul 11 2022, 12:57 PM

I'm not sure I follow your question here. I personally did not opt for anything in this patch since it is yours. In OpenACC, for standalone data movement ops it's straight forward to do it in MLIR directly.

shraiysh added inline comments.Jul 11 2022, 9:23 PM

I think that can be done here too (like Kiran suggested). In that case, these lines would be eliminated. There are pros and cons to both the approaches 1 and 3 in Kiran's reply. If we follow the OpenACC approach, this same thing will have to be done inside clang, but it should be straightforward for OpenMP Dialect. If we follow the OpenMP IR Builder, it will probably be a bit trickier, but it will be a nice abstraction for both OpenMP Dialect and Clang.

I think what Roger meant was why choose one over another? Or, in other words - which one of these should we do here? (I don't have a strong preference for one or another)

shraiysh added inline comments.Jul 12 2022, 12:41 AM

Apologies for taking the wrong name.
I think what Roger Dominik meant was why choose one over another? Or, in other words - which one of these should we do here? (I don't have a strong preference for one or another)


In general, we should consider lowering the clauses or operation in translation as the last possible option. Translation is intended to be a light-weight transformation and indeed it is for LLVM dialect. For using the OpenMPIRBuilder we do some more stuff in translation but we should try to keep it as lightweight as possible and implement things in the OpenMPIRBuilder or if that is not possible, in a conversion pass. So preference will be,

  1. OpenMPIRBuilder
  2. OpenMPToSCF
  3. OpenMPToLLVM
  4. Translation

Also, it will be good to have things implemented in a uniform fashion for various OpenMP Ops and also among OpenACC and OpenMP.

domada updated this revision to Diff 447324.Jul 25 2022, 7:14 AM
domada marked an inline comment as not done.

Applied remarks. Moved generation of if clause to OMPIRBuilder.

Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 7:14 AM
domada updated this revision to Diff 447342.Jul 25 2022, 7:57 AM
domada marked an inline comment as done.

Fix build failures

Meinersbur requested changes to this revision.Jul 25 2022, 2:28 PM
Meinersbur added inline comments.

[serious] This is really ugly from a software architecture point of view. Where previously all could expect a CanonicalLoopInfo has a preheader has to exist, it now any loop-handling code has to check for it.

The description of CanonicalLoopInfo needs to be updated.


What this function does should be achievable with 3 elementary steps:

  1. Code versioning (IfCond)
  2. collapseLoops
  3. applySimd(CanonicalLoopInfo, SimLen)

Step 1 does not yet exist as an OpenMPIRBuilder method but but maybe should be made available independent of applySimd to by usable for anything that implements an if clause.

I would not object to have such a helper function doing all 3 steps together, but the description should reflect that.


[nit] unrelated


Please avoid copy&paste code. The common code can be extracted into a separate function.

However, I don't think a version taking a Loop info is even necessary. See comment below.


[serious] As far as I understand, the only reason this instantiates a pass manger is to use the cloneLoopWithPreheader function. However, it is sufficient to clone all block from CanonicalLoopInfo::getHeader() to CanonicalLoopInfo::getAfter(). Since the ExitBlock is guaranteed to be the only exit out of the loop (at least until I can introduce breaks; D124823 is a step into that direction), all blocks in the loop can be simply enumerated and cloned without cloneLoopWithPreheader.

This functionality is probably useful for implementing any loop versioning for if-clauses and should probably made its own method.

This revision now requires changes to proceed.Jul 25 2022, 2:28 PM
domada added inline comments.Jul 26 2022, 12:22 AM

thanks for the remarks.

I decided to modify createCanonicalLoop interface, because omp loop trip count was calculated outside the loop preheader.

I needed to calculate trip count inside loop preheader because I used cloneLoopWithPreheader function. This function needs a BasicBlock which dominates. If we create loop trip count outside loop preheader then we cannot use loop preheader as dominator block.

According to your comments we can omit cloneLoopWithPreheader function call. In consequence there is no need to modify createCanonicalLoop function.

domada updated this revision to Diff 448311.Jul 28 2022, 5:41 AM

Implemented simd if clause- applying review remarks:

  1. Use simplified cloning mechanism (without cloneLoopWithPreheader).
  2. Removed addLoopMetadata(Loop *L, Metadata) function.
  3. Added more tests.
domada added inline comments.Jul 28 2022, 5:53 AM

I do not use cloneLoopWithPreheader function any more, but I still need pass manager to define Loop blocks. Original applySimd function also uses pass manager for defining loop blocks. Please let me know if you have better idea, how to extract all body blocks.

Meinersbur added inline comments.Jul 28 2022, 8:46 AM

As mentioned, I think this can be integrated into the existing applySimd methods. All it does need is the IfCond argument, and if it is set, implement the loop versioning. After that, use the same code to attach the metadata.

collapseLoops can be called by the translation layer to get a single CanonicalLoopInfo handle to be passed to this function.


Enumerating loop block without LoopInfo will be possible after D124823. Instantiation of an entire pass manager infrastructure does not scale well and should not be a long-term solution. It runs DominatorTree, LoopInfo etc on the entire function. n loops in the function have a complexity of at O(n³), because DominatorTree is already O(n²).

domada updated this revision to Diff 448666.Jul 29 2022, 9:55 AM

Applied remarks:
Do not collapse loops inside function responsible for applying SIMD metadirectives
Added special function for loop versioning

domada marked 2 inline comments as done.Jul 29 2022, 9:58 AM
Meinersbur accepted this revision.Jul 29 2022, 11:36 AM

LGTM, this turned out really nice. Thank you!

Some nitpicks left. Please apply before committing.


This could be made private as front-ends are encouraged to use the IfCond argument for the relevant instead that may implement specialized behavior (like passing the value to the runtime).

It still might be useful for front-ends (eg implementing dynamic metadirective), but I'd wait to such use cases to occur before adding this to the public API.

Edit: I noticed it is tested in the unit test, which requires it to be public.


Could you add a TODO/FIXME comment mentioning that we should not depend on a pass manger?


operator[] may modify the underlaying hashtable.

The name MappedLatch is more meaningful than MappedValue.

This revision is now accepted and ready to land.Jul 29 2022, 11:36 AM
domada marked 3 inline comments as done.Aug 1 2022, 1:23 AM
domada added inline comments.

I moved it to private and I removed corresponding unit test. This functionality is also tested by applySimdLoopIf unit test.

This revision was landed with ongoing or failed builds.Aug 1 2022, 2:50 AM
This revision was automatically updated to reflect the committed changes.
domada marked an inline comment as done.
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 2:50 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript