Page MenuHomePhabricator

[MLIR,OpenMP] Added support for lowering MasterOp to LLVMIR
Needs ReviewPublic

Authored by SouraVX on Sep 7 2020, 10:35 AM.

Details

Summary

Some Ops in OMP dialect have regions associated with them i.e
ParallelOp MasterOp. Lowering of these regions involves interfacing
with OMPIRBuilder using callbacks, yet there still exist opportunities
for sharing common code in between.

This patch factors out common code into a separate function and adds
support for lowering MasterOp using that. Lowering of ParallelOp is
also modified appropriately.

Diff Detail

Event Timeline

SouraVX created this revision.Sep 7 2020, 10:35 AM
Herald added a project: Restricted Project. · View Herald Transcript
SouraVX requested review of this revision.Sep 7 2020, 10:35 AM

Thanks @SouraVX for working on this. Looks OK. A few comments to address.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
447–448

Should valueMapping and blockMapping be passed? Aren't these available here since convertOmpOpRegions is a member function? Or did you want this as a static function?

492–496

Can this loop also go into convertOmpOpRegions?

505

lint warning?

mlir/test/Target/openmp-llvm.mlir
225–227
  1. Can you add this inside a parallel here or in another test?
  2. Nit: I think other tests here use the pretty print syntax.
kiranktp added inline comments.Sep 7 2020, 6:03 PM
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
491

convertOmpMaster and convertOmpParallel function body is 99% same except the casting type (omp::MasterOp/omp::ParallelOp).
Will it be good to have a template function (eg: convertOmpOpWithRegions(..)) and call it with explicit type.
convertOmpOpWithRegions<omp::MasterOp>(opInst, builder);
convertOmpOpWithRegions<omp::ParallelOp>(opInst, builder);
This way we can reduce code duplication.
You can choose to not follow this.

SouraVX marked 2 inline comments as done.Sep 8 2020, 8:33 AM

Thanks @kiranktp ,@kiranchandramohan for reviewing this!
When adding a test case for a MasterOp inside the region of ParallelOp(typical use case.)
Test snippet:

llvm.func @test_omp_master() -> () {
  omp.parallel {
  "omp.master" ()({
    omp.terminator
  }):()->()
    omp.terminator
}
  llvm.return
}

This produces following crash/assertion failure in OMPIRBuilder:

mlir-translate: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:717: llvm::IRBuilderBase::InsertPoint llvm::OpenMPIRBuilder::CreateParallel(const llvm::OpenMPIRBuilder::LocationDescription&, llvm::OpenMPIRBuilder::InsertPointTy, llvm::OpenMPIRBuilder::BodyGenCallbackTy, llvm::OpenMPIRBuilder::PrivatizeCallbackTy, llvm::OpenMPIRBuilder::FinalizeCallbackTy, llvm::Value*, llvm::Value*, llvm::omp::ProcBindKind, bool): Assertion `Outputs.empty() && "OpenMP outlining should not produce live-out values!"' failed.

digging in further, I realized this Outputs is getting populated with(below instruction) and finally asserting at Outputs.empty().

%omp_global_thread_num2 = call i32 @__kmpc_global_thread_num(%struct.ident_t* @3), !dbg !9

I'll dig further, BTW if you guys have any inputs WRT this. Please share!

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
447–448

Yes, initially planned for static function. But later realized convertBlock (used here) is private within ModuleTranslation.

And I didn't find enough motivation to expose that as public, if time demands we'll do that.
I'll take care of the NIT's when revising.

491

Thought of this approach too, templatizing this won't help since Op is passed around as generic Operation so it seems to me we can't get away with this.
I even thought pulling out the Callback itself, but that will also incur if/else ladder and so forth.

492–496

It can be, but keeping it separate seems more reasonable.
Since we have to create block with Op Name(this can be done there also but that will incur one more parameter passing and maybe changing the function name itself to convey the intent).

505

Thanks for catching this! Sure, it's only needed for converting ParallelOp.

mlir/test/Target/openmp-llvm.mlir
225–227

Nit: I think other tests here use the pretty print syntax.

MasterOp doesn't has custom assembly registered.

error: custom op 'omp.master' has no custom assembly form

Can you add this inside a parallel here or in another test?

Thanks for this, I discovered one crash.
Details follows:

This produces following crash/assertion failure in OMPIRBuilder:

mlir-translate: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:717: llvm::IRBuilderBase::InsertPoint llvm::OpenMPIRBuilder::CreateParallel(const llvm::OpenMPIRBuilder::LocationDescription&, llvm::OpenMPIRBuilder::InsertPointTy, llvm::OpenMPIRBuilder::BodyGenCallbackTy, llvm::OpenMPIRBuilder::PrivatizeCallbackTy, llvm::OpenMPIRBuilder::FinalizeCallbackTy, llvm::Value*, llvm::Value*, llvm::omp::ProcBindKind, bool): Assertion `Outputs.empty() && "OpenMP outlining should not produce live-out values!"' failed.

I don't have any immediate inputs regarding this crash. Maybe @jdoerfert or @fghanim.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
492–496

No strong opinion here. Let us see whether anyone else has an opinion here before making any change.
But what is the change in name of the function you are hinting here? Creating the llvm blocks is part of the conversion right?

mlir/test/Target/openmp-llvm.mlir
225–227

Can you add an assembly format to print/parse this operation in custom syntax as part of this patch? I think it will look a bit odd if this operation only has the generic syntax.
From the latest mlir news discourse post I see that they have added support for region parsing also in assembly format. master looks like a simple op to try this out.
https://github.com/llvm/llvm-project/commit/eaeadce9bd11

I don't know enough about the OMPIR for me to be helpful, however, I noticed a couple of things that I wanted to ask about for better understanding. :)

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
483

I don't see where you set the location where alloca should be codegened. how do you handle variable declarations inside the omp master scope? when do you codegen the relevant alloca?

505

All llvm passes expect Allocas to be in the entry block of the function. In this case, the soon-to-be-outlined region. is builder's current insertion point in the entry block? Also, is it guaranteed to not be empty?

@kiranchandramohan This error happens when there are things within the OMP region still alive/are used outside the region, I cannot comment on what causes this here. Usually, this suggests that "something" that should be contained within the parallel region is not being detected as such, and so when the region is outlined, this User remains as part of the original function.

The first thing that comes to mind, make sure continuationBB post dominate everything within OMP parallel

jdoerfert added inline comments.Sep 10 2020, 8:16 AM
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
489

Why do you set the insertion point to something other than codeGenIP?

505

Where is allocaIP used?

jdoerfert added inline comments.Sep 10 2020, 8:40 AM
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
489

Do not insert code or modify anything else. Basically you get this:

block:
  ... 
  %...      <- codeGenIP
  ...
  terminator

And if you want to insert control flow, split it:

block.pre:
  ... 
  %...      <- codeGenIP
  br %block.post

block.post:
  ...
  terminator

No you can remove the branch that was introduced and create instructions and control flow that converge at block.post.

ftynse added inline comments.Sep 10 2020, 9:08 AM
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
460

It's a good time to do this.

503

Nit: a more comprehensible variable name is welcome.

mlir/test/Target/openmp-llvm.mlir
225–227

Have created a trivial patch for adding the assembly format.
https://reviews.llvm.org/D87549

SouraVX updated this revision to Diff 299326.EditedTue, Oct 20, 3:55 AM
SouraVX marked 2 inline comments as done.

At https://github.com/llvm/llvm-project/blob/master/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp#L918 @jdoerfert could you please explain the need for passing empty AllocaIP ? I meant this will be empty? while interfacing MasterOp from flang/mlir. Why is that.

That looks to me like a mistake which needs to be fixed. And, by glancing at it, it seems we now need to do what I feared in D82470, we need to add the AllocaIP to a lot more APIs so it can be passed down properly. Maybe there is no way around it.

It reuses BodyGenCallbackTy which has an AllocaIP argument as part of the type. But since this is an inlined region it doesn't benefit from it. Why? because inlined-regions should be using the AllocaIP of the innermost outlined region they belong to, for any allocations that happen inside them. Hence, you don't need to update the AllocaIP for inlined regions and so it is left empty. I wrote this with a design like clang's in mind, where an AllocaIP is always maintained in the frontend, and BodyCB is a place for the frontend to generate whatever non-OMP code it wants according to its own rules.