This is an archive of the discontinued LLVM Phabricator instance.


Differential D158294

[Flang][OpenMP] Fix for atomic lowering with HLFIR
Closed, Public

Authored by kiranchandramohan on Aug 18 2023, 10:08 AM.

Details

Reviewers

NimishMishra
tblah
jdoerfert
sscalpone
razvanlupusoru

Commits

rG6163d66e73cf: [Flang][OpenMP] Fix for atomic lowering with HLFIR

Summary

The atomic update operation is modelled in the OpenMP dialect as
an operation that takes a reference to the location being
updated. It also contains a region that performs the
update. The block argument of the region represents the value
loaded from the update location, and the operand of the Yield
operation is the value that should be stored by the update.

OpenMP FIR lowering binds the value loaded from the update
address to the SymbolAddress. HLFIR lowering does not permit
a SymbolAddress to be a value. To work around this, the
lowering is now performed in two steps. First, the body of
the atomic update is lowered into an SCF execute_region
operation. Second, its contents are copied into an
omp.atomic.update operation as follows:
-> Create an omp.atomic.update with a block argument of
the correct type.
-> Copy the operations from the SCF execute_region. Convert
the scf.yield to an omp.yield.
-> Remove the loads of the update location and replace all
their uses with the block argument.
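
As a rough illustration of the second step, here is a minimal, self-contained C++ sketch on a toy IR. It does not use the MLIR API: `ToyOp` and `buildAtomicUpdateBody` are hypothetical names invented for this example, and SSA values are modelled as plain strings. Unlike the C++ in this revision, which only redirects the uses of the load, the sketch also drops the load itself, as the summary describes.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an MLIR operation (hypothetical, not mlir::Operation).
struct ToyOp {
  std::string name;                  // e.g. "fir.load", "arith.addi"
  std::string result;                // SSA name produced; empty for yields
  std::vector<std::string> operands; // SSA names consumed
};

// Build the body of a toy omp.atomic.update from the body of a toy
// scf.execute_region. `updateVar` is the address being updated and
// `blockArg` is the name of the new region's block argument.
std::vector<ToyOp> buildAtomicUpdateBody(const std::vector<ToyOp> &tempBody,
                                         const std::string &updateVar,
                                         const std::string &blockArg) {
  std::vector<ToyOp> body;
  std::vector<std::string> replaced; // results of loads from updateVar
  for (ToyOp op : tempBody) {        // copy each op so it can be edited
    // Redirect uses of an already-removed load to the block argument.
    for (std::string &operand : op.operands)
      if (std::find(replaced.begin(), replaced.end(), operand) !=
          replaced.end())
        operand = blockArg;
    // Drop loads of the update location and remember their results.
    if (op.name == "fir.load" && op.operands.size() == 1 &&
        op.operands[0] == updateVar) {
      replaced.push_back(op.result);
      continue;
    }
    // The scf.yield terminator becomes an omp.yield.
    if (op.name == "scf.yield")
      op.name = "omp.yield";
    body.push_back(op);
  }
  return body;
}
```

Given the four operations of the execute_region from this review (two loads, the add, and the yield) with `%0` as the update address, the sketch keeps the load of `%1`, rewrites the add to use `%arg0`, and turns the terminator into an `omp.yield`.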

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kiranchandramohan created this revision. Aug 18 2023, 10:08 AM

Herald added projects: Restricted Project, Restricted Project. Aug 18 2023, 10:08 AM

Herald added subscribers: mehdi_amini, guansong, yaxunl.

kiranchandramohan requested review of this revision. Aug 18 2023, 10:08 AM

Herald added a reviewer: jdoerfert. Aug 18 2023, 10:08 AM

Herald added subscribers: jplehr, sstefan1, jdoerfert.

TODO:
-> Add a detailed explanation
-> Add a test for HLFIR.

flang/lib/Lower/OpenMP.cpp

3113

Lowering is in three steps, shown here for the following subroutine:

subroutine sb
  integer :: a, b
  !$omp atomic update
    a = a + b
end subroutine

Lower to scf.execute_region_op

func.func @_QPsb() {
  %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"}
  %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"}
  %2 = scf.execute_region -> i32 {
    %3 = fir.load %0 : !fir.ref<i32>
    %4 = fir.load %1 : !fir.ref<i32>
    %5 = arith.addi %3, %4 : i32
    scf.yield %5 : i32
  }
  return
}

Move out all the non-update loads.

func.func @_QPsb() {
  %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"}
  %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"}
  %2 = fir.load %1 : !fir.ref<i32>
  %3 = scf.execute_region -> i32 {
    %4 = fir.load %0 : !fir.ref<i32>
    %5 = arith.addi %4, %2 : i32
    scf.yield %5 : i32
  }
  return
}

Convert to atomic.update.

func.func @_QPsb() {
  %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"}
  %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"}
  %2 = fir.load %1 : !fir.ref<i32>
  omp.atomic.update %0 : !fir.ref<i32> {
  ^bb0(%arg0: i32):
    %3 = arith.addi %arg0, %2 : i32
    omp.yield(%3 : i32)
  }
  return
}
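
The intermediate "move out all the non-update loads" step can be sketched on a toy IR as well. This is only an illustration: a later update in this review defers hoisting to a separate patch, and the names `ToyOp` and `hoistNonUpdateLoads` are hypothetical. The sketch assumes a load may be hoisted when it reads an address other than the update address and that address is not defined inside the region. A real implementation would also have to consider aliasing and side effects, and would renumber SSA values when printing.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an MLIR operation (hypothetical, not mlir::Operation).
struct ToyOp {
  std::string name;                  // e.g. "fir.load", "arith.addi"
  std::string result;                // SSA name produced; empty for yields
  std::vector<std::string> operands; // SSA names consumed
};

// Split a region body into (hoisted loads, remaining operations).
std::pair<std::vector<ToyOp>, std::vector<ToyOp>>
hoistNonUpdateLoads(const std::vector<ToyOp> &regionBody,
                    const std::string &updateVar) {
  // Collect every SSA name defined inside the region.
  std::set<std::string> definedInside;
  for (const ToyOp &op : regionBody)
    if (!op.result.empty())
      definedInside.insert(op.result);

  std::vector<ToyOp> hoisted, remaining;
  for (const ToyOp &op : regionBody) {
    bool isLoad = op.name == "fir.load" && op.operands.size() == 1;
    // Hoist loads that neither read the update location nor depend on a
    // value computed inside the region.
    if (isLoad && op.operands[0] != updateVar &&
        definedInside.count(op.operands[0]) == 0)
      hoisted.push_back(op);
    else
      remaining.push_back(op);
  }
  return {hoisted, remaining};
}
```

For the example above, only the load of `%1` (variable `b`) is hoisted; the load of `%0` stays in the region so that the final conversion can replace it with the block argument.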

Harbormaster completed remote builds in B253528: Diff 551562. Aug 18 2023, 11:38 AM

tblah added inline comments. Aug 21 2023, 2:42 AM

flang/lib/Lower/OpenMP.cpp
3113	Why did you decide to lower to scf.execute_region first then convert to an atomic.update?

Reduce the scope of changes to only make omp.atomic_update work with
HLFIR. (Hoisting can come in a separate patch)
Add an hlfir test.

Herald added a reviewer: sscalpone. Aug 21 2023, 4:29 AM

kiranchandramohan edited the summary of this revision. Aug 21 2023, 4:30 AM

kiranchandramohan marked an inline comment as done. Aug 21 2023, 4:34 AM

kiranchandramohan added inline comments.

flang/lib/Lower/OpenMP.cpp
3113	HLFIR lowering only permits a reference to be bound to SymbolAddresses. omp.atomic_update has a block argument that models the value loaded from the Address and this is used inside the atomic_update region. This was previously achieved by mapping the SymbolAddress to the block argument. This does not work anymore, so it is first lowered temporarily to an scf.execute_region and then the atomic_update operation is carefully constructed from the scf.execute_region. Note: Have added some additional explanation to the summary.
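
The constraint described in this reply can be pictured with a small toy model. The sketch below is not Flang's symbol map: `ToyValue` and `ToySymbolMap` are hypothetical types, and the single `isReference` flag is an assumed simplification standing in for the difference between a `!fir.ref<i32>` address and a plain `i32` value.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy SSA value: its name, and whether its type is a reference
// (think !fir.ref<i32>) or a plain value (think i32).
struct ToyValue {
  std::string name;
  bool isReference;
};

// Toy symbol table that, like the HLFIR path described above, refuses to
// bind a symbol to anything other than a reference.
class ToySymbolMap {
public:
  // Returns true when the binding is accepted.
  bool bind(const std::string &symbol, const ToyValue &value) {
    if (!value.isReference)
      return false; // a loaded value cannot stand in for the symbol
    bindings[symbol] = value;
    return true;
  }

  bool isBound(const std::string &symbol) const {
    return bindings.count(symbol) != 0;
  }

private:
  std::map<std::string, ToyValue> bindings;
};
```

In this model, binding `a` to the block argument (a value) is rejected while binding it to the address is accepted, which is why the body is first lowered against the address and the load is swapped for the block argument afterwards.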

Harbormaster completed remote builds in B253817: Diff 551979. Aug 21 2023, 4:41 AM

LG from a HLFIR perspective

This revision is now accepted and ready to land. Aug 21 2023, 5:29 AM

@razvanlupusoru, just FYI

Innovative approach with the intermediate step!

The change seems reasonable to me - but I think your comment showing the multiple step transformation belongs as a comment in the code instead of just summarized in the description. Also, you may want to further break up genOmpAtomicUpdateStatement into multiple functions - feels like too much is happening in a single spot now.

(I am currently looking into reusing this logic for OpenACC so I will build my change on top of this one) :)

Thanks! Great work!

Add code comments inline as suggested by @razvanlupusoru.
Will split the code in a follow-up patch.

Thank you!

Harbormaster completed remote builds in B254376: Diff 552755. Aug 23 2023, 10:14 AM

clementval mentioned this in D158772: [flang][openacc] Enable lowering support for OpenACC atomic operations. Aug 25 2023, 12:44 PM

Closed by commit rG6163d66e73cf: [Flang][OpenMP] Fix for atomic lowering with HLFIR (authored by kiranchandramohan). Aug 31 2023, 10:36 AM

This revision was automatically updated to reflect the committed changes.

kiranchandramohan added a commit: rG6163d66e73cf: [Flang][OpenMP] Fix for atomic lowering with HLFIR.

Revision Contents

Path                                               Size
flang/lib/Lower/OpenMP.cpp                         101 lines
flang/test/Lower/OpenMP/atomic-update-hlfir.f90    23 lines

Diff 555098

flang/lib/Lower/OpenMP.cpp

[20 lines not shown]
 #include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Builder/Todo.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
 #include "flang/Parser/dump-parse-tree.h"
 #include "flang/Parser/parse-tree.h"
 #include "flang/Semantics/openmp-directive-sets.h"
 #include "flang/Semantics/tools.h"
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
+#include "mlir/Dialect/SCF/IR/SCF.h"
 #include "llvm/Frontend/OpenMP/OMPConstants.h"

 using DeclareTargetCapturePair =
     std::pair<mlir::omp::DeclareTargetCaptureClause,
               Fortran::semantics::Symbol>;

 //===----------------------------------------------------------------------===//
 // Common helper functions
[3,052 lines not shown; the next hunk is inside static void genOmpAtomicUpdateStatement(]
   mlir::IntegerAttr hint = nullptr;
   mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr;
   if (leftHandClauseList)
     genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint,
                                           memoryOrder);
   if (rightHandClauseList)
     genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint,
                                           memoryOrder);
-  auto atomicUpdateOp = firOpBuilder.create<mlir::omp::AtomicUpdateOp>(
-      currentLocation, lhsAddr, hint, memoryOrder);
-
-  //// Generate body of Atomic Update operation
-  // If an argument for the region is provided then create the block with that
-  // argument. Also update the symbol's address with the argument mlir value.
-  llvm::SmallVector<mlir::Type> varTys = {varType};
-  llvm::SmallVector<mlir::Location> locs = {currentLocation};
-  firOpBuilder.createBlock(&atomicUpdateOp.getRegion(), {}, varTys, locs);
-  mlir::Value val =
-      fir::getBase(atomicUpdateOp.getRegion().front().getArgument(0));
   const auto *varDesignator =
       std::get_if<Fortran::common::Indirection<Fortran::parser::Designator>>(
           &assignmentStmtVariable.u);
   assert(varDesignator && "Variable designator for atomic update assignment "
                           "statement does not exist");
   const Fortran::parser::Name *name =
       Fortran::semantics::getDesignatorNameIfDataRef(varDesignator->value());
   if (!name)
     TODO(converter.getCurrentLocation(),
          "Array references as atomic update variable");
   assert(name && name->symbol &&
          "No symbol attached to atomic update variable");
-  converter.bindSymbol(*name->symbol, val);
-  // Set the insert for the terminator operation to go at the end of the
-  // block.
-  mlir::Block &block = atomicUpdateOp.getRegion().back();
-  firOpBuilder.setInsertionPointToEnd(&block);
+  if (Fortran::semantics::IsAllocatableOrPointer(name->symbol->GetUltimate()))
+    converter.bindSymbol(*name->symbol, lhsAddr);

+  // Lowering is in two steps :
+  // subroutine sb
+  // integer :: a, b
+  // !$omp atomic update
+  // a = a + b
+  // end subroutine
+  //
+  // 1. Lower to scf.execute_region_op
+  //
+  // func.func @_QPsb() {
+  // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"}
+  // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"}
+  // %2 = scf.execute_region -> i32 {
+  // %3 = fir.load %0 : !fir.ref<i32>
+  // %4 = fir.load %1 : !fir.ref<i32>
+  // %5 = arith.addi %3, %4 : i32
+  // scf.yield %5 : i32
+  // }
+  // return
+  // }
+  auto tempOp =
+      firOpBuilder.create<mlir::scf::ExecuteRegionOp>(currentLocation, varType);
+  firOpBuilder.createBlock(&tempOp.getRegion());
+  mlir::Block &block = tempOp.getRegion().back();
+  firOpBuilder.setInsertionPointToEnd(&block);
   Fortran::lower::StatementContext stmtCtx;
   mlir::Value rhsExpr = fir::getBase(converter.genExprValue(
       *Fortran::semantics::GetExpr(assignmentStmtExpr), stmtCtx));
   mlir::Value convertResult =
       firOpBuilder.createConvert(currentLocation, varType, rhsExpr);
   // Insert the terminator: YieldOp.
-  firOpBuilder.create<mlir::omp::YieldOp>(currentLocation, convertResult);
-  // Reset the insert point to before the terminator.
+  firOpBuilder.create<mlir::scf::YieldOp>(currentLocation, convertResult);
   firOpBuilder.setInsertionPointToStart(&block);
+
+  // 2. Create the omp.atomic.update Operation using the Operations in the
+  // temporary scf.execute_region Operation.
+  //
+  // func.func @_QPsb() {
+  // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"}
+  // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"}
+  // %2 = fir.load %1 : !fir.ref<i32>
+  // omp.atomic.update %0 : !fir.ref<i32> {
+  // ^bb0(%arg0: i32):
+  // %3 = fir.load %1 : !fir.ref<i32>
+  // %4 = arith.addi %arg0, %3 : i32
+  // omp.yield(%3 : i32)
+  // }
+  // return
+  // }
+  mlir::Value updateVar = converter.getSymbolAddress(*name->symbol);
+  if (auto decl = updateVar.getDefiningOp<hlfir::DeclareOp>())
+    updateVar = decl.getBase();
+
+  firOpBuilder.setInsertionPointAfter(tempOp);
+  auto atomicUpdateOp = firOpBuilder.create<mlir::omp::AtomicUpdateOp>(
+      currentLocation, updateVar, hint, memoryOrder);
+
+  llvm::SmallVector<mlir::Type> varTys = {varType};
+  llvm::SmallVector<mlir::Location> locs = {currentLocation};
+  firOpBuilder.createBlock(&atomicUpdateOp.getRegion(), {}, varTys, locs);
+  mlir::Value val =
+      fir::getBase(atomicUpdateOp.getRegion().front().getArgument(0));
+
+  llvm::SmallVector<mlir::Operation *> ops;
+  for (mlir::Operation &op : tempOp.getRegion().getOps())
+    ops.push_back(&op);
+
+  // SCF Yield is converted to OMP Yield. All other operations are copied
+  for (mlir::Operation *op : ops) {
+    if (auto y = mlir::dyn_cast<mlir::scf::YieldOp>(op)) {
+      firOpBuilder.setInsertionPointToEnd(&atomicUpdateOp.getRegion().front());
+      firOpBuilder.create<mlir::omp::YieldOp>(currentLocation, y.getResults());
+      op->erase();
+    } else {
+      op->remove();
+      atomicUpdateOp.getRegion().front().push_back(op);
+    }
+  }
+
+  // Remove the load and replace all uses of load with the block argument
+  for (mlir::Operation &op : atomicUpdateOp.getRegion().getOps()) {
+    fir::LoadOp y = mlir::dyn_cast<fir::LoadOp>(&op);
+    if (y && y.getMemref() == updateVar)
+      y.getRes().replaceAllUsesWith(val);
+  }
+
+  tempOp.erase();
 }

 static void
 genOmpAtomicWrite(Fortran::lower::AbstractConverter &converter,
                   Fortran::lower::pft::Evaluation &eval,
                   const Fortran::parser::OmpAtomicWrite &atomicWrite) {
   // Get the value and address of atomic write operands.
   const Fortran::parser::OmpAtomicClauseList &rightHandClauseList =
[692 lines not shown]

flang/test/Lower/OpenMP/atomic-update-hlfir.f90

This file was added.

! This test checks lowering of atomic and atomic update constructs with HLFIR
! RUN: bbc -hlfir -fopenmp -emit-hlfir %s -o - | FileCheck %s
! RUN: %flang_fc1 -flang-experimental-hlfir -emit-hlfir -fopenmp %s -o - | FileCheck %s

subroutine sb
  integer :: x, y

  !$omp atomic update
    x = x + y
end subroutine

!CHECK-LABEL: @_QPsb
!CHECK: %[[X_REF:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsbEx"}
!CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {uniq_name = "_QFsbEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
!CHECK: %[[Y_REF:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFsbEy"}
!CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %[[Y_REF]] {uniq_name = "_QFsbEy"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
!CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref<i32> {
!CHECK: ^bb0(%[[ARG_X:.*]]: i32):
!CHECK: %[[Y_VAL:.*]] = fir.load %[[Y_DECL]]#0 : !fir.ref<i32>
!CHECK: %[[X_UPDATE_VAL:.*]] = arith.addi %[[ARG_X]], %[[Y_VAL]] : i32
!CHECK: omp.yield(%[[X_UPDATE_VAL]] : i32)
!CHECK: }
!CHECK: return
