This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Add conversion from SCF parallel loops to OpenMP
ClosedPublic

Authored by ftynse on Nov 23 2020, 10:45 AM.

Details

Summary

Introduce a conversion pass from SCF parallel loops to OpenMP dialect
constructs: the parallel region and the workshare loop. Loops with reductions are not
supported because the OpenMP dialect cannot model them yet.

The conversion currently targets only one level of parallelism, i.e. only
one top-level omp.parallel operation is produced even if there are nested
scf.parallel operations that could be mapped to omp.wsloop. Nested
parallelism support is left for future work.
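
For illustration, a rough before/after sketch of the rewrite (the textual syntax of both dialects is approximate and has evolved since this revision; function and value names are made up):

  // Before: a parallel loop in the SCF dialect (no reductions).
  func @example(%lb: index, %ub: index, %step: index) {
    scf.parallel (%i) = (%lb) to (%ub) step (%step) {
      // ... loop body ...
      scf.yield
    }
    return
  }

  // After: one top-level parallel region wrapping a workshare loop.
  func @example(%lb: index, %ub: index, %step: index) {
    omp.parallel {
      omp.wsloop (%i) : index = (%lb) to (%ub) step (%step) {
        // ... same loop body ...
        omp.yield
      }
      omp.terminator
    }
    return
  }

The conversion is exposed as a regular conversion pass, registered under the -convert-scf-to-openmp flag in mlir-opt.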

Diff Detail

Event Timeline

ftynse created this revision.Nov 23 2020, 10:45 AM
ftynse requested review of this revision.Nov 23 2020, 10:45 AM
ftynse updated this revision to Diff 307123.Nov 23 2020, 10:54 AM

Add cmake

ftynse updated this revision to Diff 307124.Nov 23 2020, 11:00 AM

Sprinkle around a bit more documentation

Nice. Looks like you missed adding test cases.

mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
25

Brief doc comment please.

58

Missing doc comment.

78

Comment here or somewhere in the block.

88

func.getContext() and make this method static?

ftynse updated this revision to Diff 307133.Nov 23 2020, 11:26 AM

Forgotten git add

ftynse updated this revision to Diff 307134.Nov 23 2020, 11:29 AM
ftynse marked 3 inline comments as done.

Address (most of) the review

ftynse updated this revision to Diff 307201.Nov 23 2020, 2:32 PM
ftynse marked an inline comment as done.

Address review

Thanks for this patch.

We should be able to do something similar for the Fortran "Do concurrent" loop. @schweitz

mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
25

Nit: OpenMP parallel loop, or OpenMP parallel + workshare loop.

69

Are you not interested in nested parallelism?

SouraVX added a comment.EditedNov 24 2020, 7:46 AM

Please ignore if this is out of context.
Flang lowers all loops into fir.do_loop (which is essentially a superset of scf.for), and may then lower that to scf.for (depending on feasibility). So the question I have is: what is the intended lowering targeted here for the OpenMP worksharing loop?

FORTRAN -> fir.do_loop -> scf.for -> omp.wsloop
OR
FORTRAN -> fir.do_loop -> omp.wsloop

Or something else?

Has this already been discussed/finalized in an RFC? If this patch intends to implement some part of that RFC discussion, would you mind adding the link here?
Thanks!

This is fully orthogonal to Flang. I have code that comes from higher-level abstractions in core MLIR (Linalg) down to SCF, and I need this code to run in parallel on a platform that supports OpenMP. Flang can choose to target the SCF dialect or the OpenMP dialect regardless of what is done here. If you think this needs an RFC within the core MLIR context, I can send one, but to me this looks like relatively straightforward plumbing work.

mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
25

This pattern does not create the OpenMP parallel operation, only the workshare loop, so I maintain that the original comment is correct and the suggested replacement would be misleading.

69

I don't have a use case for it as of yet. We can add more features when we need them.

LGTM.

Do you need the lowering to LLVM IR also sometime soon?

Previously, did you mention something about a parallelisation strategy, with SCF ops carrying an attribute to determine whether they should be lowered to OpenMP parallel loops?

@SouraVX we can discuss the lowering separately for Flang. Since we decided to represent the work-sharing loop as a loop-like operation (omp.wsloop), lowering directly to the work-sharing loop from the parse tree will be the straightforward method. If there are difficulties with that, we can think of lowering to fir.do_loop and then converting to omp.wsloop, but I think this might need modification of fir.do_loop or the addition of a directive-like operation in FIR.

mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
69

OK. Can the commit message/description carry this info?

This revision is now accepted and ready to land.Nov 24 2020, 9:15 AM
ftynse edited the summary of this revision. (Show Details)Nov 24 2020, 9:24 AM
ftynse marked 2 inline comments as done.Nov 24 2020, 9:34 AM

LGTM.

Do you need the lowering to LLVM IR also sometime soon?

Previously, did you mention something about a parallelisation strategy, with SCF ops carrying an attribute to determine whether they should be lowered to OpenMP parallel loops?

@SouraVX we can discuss the lowering separately for Flang. Since we decided to represent the work-sharing loop as a loop-like operation (omp.wsloop), lowering directly to the work-sharing loop from the parse tree will be the straightforward method. If there are difficulties with that, we can think of lowering to fir.do_loop and then converting to omp.wsloop, but I think this might need modification of fir.do_loop or the addition of a directive-like operation in FIR.

Thanks for the quick review!

I am in the process of writing a translation to LLVM IR, happy to use anything that is already available.

We discussed parallelization strategies in several ODMs, but there was no firm decision. Most of the discussion was in the GPU context, I believe, where we need not only to say which loops remain parallel but also how they are mapped to the device. We can add annotations that this pass would consume. Alternatively, it is possible to first convert any scf.parallel that should not become an OpenMP loop into scf.for, and then just run a blanket conversion of all remaining parallel operations. We already have the former conversion implemented, and we can just as well add a way to control it at a finer grain.
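
To make that alternative concrete, a schematic IR-level sketch (illustrative only, not the output of any particular existing pass; loop bodies elided):

  // Input: two parallel loops; only the first should become an OpenMP loop.
  scf.parallel (%i) = (%lb) to (%ub) step (%step) { ... }
  scf.parallel (%j) = (%lb) to (%ub) step (%step) { ... }

  // Step 1: serialize the loop that should stay sequential.
  scf.parallel (%i) = (%lb) to (%ub) step (%step) { ... }
  scf.for %j = %lb to %ub step %step { ... }

  // Step 2: blanket conversion of the remaining scf.parallel ops to OpenMP.
  omp.parallel {
    omp.wsloop (%i) : index = (%lb) to (%ub) step (%step) { ... }
    omp.terminator
  }
  scf.for %j = %lb to %ub step %step { ... }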

We can expose any parts of this transformation somewhere in Utils.h if you find them useful for Flang.

mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
69

Sure, will put this into the commit message and add a TODO here.

LGTM.

Do you need the lowering to LLVM IR also sometime soon?

Previously, did you mention something about a parallelisation strategy, with SCF ops carrying an attribute to determine whether they should be lowered to OpenMP parallel loops?

@SouraVX we can discuss the lowering separately for Flang. Since we decided to represent the work-sharing loop as a loop-like operation (omp.wsloop), lowering directly to the work-sharing loop from the parse tree will be the straightforward method. If there are difficulties with that, we can think of lowering to fir.do_loop and then converting to omp.wsloop, but I think this might need modification of fir.do_loop or the addition of a directive-like operation in FIR.

Thanks for the quick review!

I am in the process of writing a translation to LLVM IR, happy to use anything that is already available.

@Meinersbur added a canonical loop in the OpenMPIRBuilder (https://reviews.llvm.org/D90830). The idea was to create the worksharing loop using this canonical loop.
FYI, we have a call (4pm BST / 11am ET) for the OpenMP for Flang work where the OpenMP dialect work is also discussed. The link to the call and the minutes can be found below.
https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZjhiMzZiOWMtNTU5Ni00YzljLWEwNDctYjZkNTZhODg2ZTIx%40thread.v2/0?context=%7b%22Tid%22%3a%223dd8961f-e488-4e60-8e11-a82d994e183d%22%2c%22Oid%22%3a%22ce0d0f4a-6fbf-4ba7-b0c7-20184e3bcb53%22%7d
https://docs.google.com/document/d/1yA-MeJf6RYY-ZXpdol0t7YoDoqtwAyBhFLr5thu5pFI/edit#

We discussed parallelization strategies in several ODMs, but there was no firm decision. Most of the discussion was in the GPU context, I believe, where we need not only to say which loops remain parallel but also how they are mapped to the device. We can add annotations that this pass would consume. Alternatively, it is possible to first convert any scf.parallel that should not become an OpenMP loop into scf.for, and then just run a blanket conversion of all remaining parallel operations. We already have the former conversion implemented, and we can just as well add a way to control it at a finer grain.

We can expose any parts of this transformation somewhere in Utils.h if you find them useful for Flang.

OK, thanks for the explanation. We might need it (not sure though).

This revision was automatically updated to reflect the committed changes.
ftynse marked 2 inline comments as done.

LGTM.

Do you need the lowering to LLVM IR also sometime soon?

I have something that works here https://reviews.llvm.org/D92055 but it needs some discussion.

Previously, did you mention something about a parallelisation strategy, with SCF ops carrying an attribute to determine whether they should be lowered to OpenMP parallel loops?

@SouraVX we can discuss the lowering separately for Flang. Since we decided to represent the work-sharing loop as a loop-like operation (omp.wsloop), lowering directly to the work-sharing loop from the parse tree will be the straightforward method. If there are difficulties with that, we can think of lowering to fir.do_loop and then converting to omp.wsloop, but I think this might need modification of fir.do_loop or the addition of a directive-like operation in FIR.

Thanks for the quick review!

I am in the process of writing a translation to LLVM IR, happy to use anything that is already available.

@Meinersbur added a canonical loop in the OpenMPIRBuilder (https://reviews.llvm.org/D90830). The idea was to create the worksharing loop using this canonical loop.

Familiar names :) I indeed intended to use the canonical loop creation functionality.

FYI, we have a call (4pm BST / 11am ET) for the OpenMP for Flang work where the OpenMP dialect work is also discussed. The link to the call and the minutes can be found below.
https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZjhiMzZiOWMtNTU5Ni00YzljLWEwNDctYjZkNTZhODg2ZTIx%40thread.v2/0?context=%7b%22Tid%22%3a%223dd8961f-e488-4e60-8e11-a82d994e183d%22%2c%22Oid%22%3a%22ce0d0f4a-6fbf-4ba7-b0c7-20184e3bcb53%22%7d
https://docs.google.com/document/d/1yA-MeJf6RYY-ZXpdol0t7YoDoqtwAyBhFLr5thu5pFI/edit#

Good to know, thanks! I may drop by some time.

We discussed parallelization strategies in several ODMs, but there was no firm decision. Most of the discussion was in the GPU context, I believe, where we need not only to say which loops remain parallel but also how they are mapped to the device. We can add annotations that this pass would consume. Alternatively, it is possible to first convert any scf.parallel that should not become an OpenMP loop into scf.for, and then just run a blanket conversion of all remaining parallel operations. We already have the former conversion implemented, and we can just as well add a way to control it at a finer grain.

We can expose any parts of this transformation somewhere in Utils.h if you find them useful for Flang.

OK, thanks for the explanation. We might need it (not sure though).

Don't hesitate to ping me.