This is an archive of the discontinued LLVM Phabricator instance.

Differential D158471

[flang][hlfir] Fixed associate-op codegen for optimized HLFIR.
Closed, Public

Authored by vzakhari on Aug 21 2023, 5:24 PM.

Details

Reviewers

tblah
jeanPerier

Commits

rGb2a9501080d0: [flang][hlfir] Fixed associate-op codegen for optimized HLFIR.

Summary

This effectively reverts D154715.

The issue shows up as a dialect conversion error, because we try to
erase an op that has already been erased. See the added LIT test case
with HLFIR that may appear as a result of CSE.
adaptor.getSource() is produced by an operation creating a tuple,
which does not have users, so allOtherUsesAreSafeForAssociate
just looks at an empty list of users, and we get completely wrong
answers from it. This causes problems with the following
eraseAllUsesInDestroys, which tries to remove the DestroyOp twice,
once for each hlfir.associate being processed.
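
To illustrate the failure mode described above, here is a minimal self-contained sketch. It does not use MLIR: ToyOp, ToyValue and allOtherUsesAreSafe are hypothetical stand-ins, and the check is a simplified assumption about what allOtherUsesAreSafeForAssociate does (it ignores hlfir.shape_of and hlfir.get_length). It shows that a value with no users passes the check vacuously for every associate, while the original value with two associates does not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for MLIR operations and values; these are NOT the MLIR classes.
struct ToyOp {
  std::string name; // e.g. "hlfir.associate" or "hlfir.destroy"
};

struct ToyValue {
  std::vector<const ToyOp *> users; // operations that use this value
};

// Hypothetical simplification of allOtherUsesAreSafeForAssociate():
// every user other than `current` must be an hlfir.destroy.
bool allOtherUsesAreSafe(const ToyValue &value, const ToyOp *current) {
  assert(current != nullptr && "the associate being converted is required");
  for (const ToyOp *user : value.users)
    if (user != current && user->name != "hlfir.destroy")
      return false;
  // Vacuously true when `value` has no users at all.
  return true;
}
```

With two associates and one destroy using the original value, the check fails for both associates. With the user-less converted value it succeeds for both, so each conversion would believe it owns the buffer and would try to erase the same destroy.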

But we also cannot use associate.getSource() *efficiently*, because
the original users may still hang around: one example is the original body
of hlfir.elemental (see D154715), another example is other already converted
AssociateOps that are pending removal in the rewriter
(that is why we have a temporary created for each hlfir.associate
in the newly added LIT case).
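
The "pending removal" effect can be sketched with a toy model. This is not the MLIR rewriter: ToyDeferredRewriter, ToyValue and countOtherUsers are hypothetical names, and the only assumption carried over from the patch is that erasure is recorded first and applied later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyValue {
  std::vector<int> users; // ids of the operations that use this value
};

// Toy rewriter that only records erasures and applies them later.
class ToyDeferredRewriter {
public:
  void eraseOp(int opId) { pendingErase.push_back(opId); }

  // Apply all recorded erasures to the use list of `value`.
  void finalize(ToyValue &value) {
    for (int opId : pendingErase)
      value.users.erase(
          std::remove(value.users.begin(), value.users.end(), opId),
          value.users.end());
    pendingErase.clear();
  }

private:
  std::vector<int> pendingErase;
};

// Number of users of `value` other than the operation `current`.
std::size_t countOtherUsers(const ToyValue &value, int current) {
  return static_cast<std::size_t>(
      std::count_if(value.users.begin(), value.users.end(),
                    [current](int id) { return id != current; }));
}
```

If associate 1 has been "erased" but the rewriter has not been finalized, associate 2 still sees it as another user of the expression, so it cannot conclude that it is the last use and falls back to creating a temporary.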

This patch just fixes the correctness issue. I think we have to separate
the buffer reuse analysis from the conversion itself.

I also tried to address the issues with the cloned bodies of hlfir.elemental,
but this should not matter since D155778: if hlfir.associate is inside
hlfir.elemental, it will end up inside a do-loop body region, so the early
exit added in D155778 will prevent the buffer reuse.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

vzakhari created this revision. Aug 21 2023, 5:24 PM

Herald added a project: Restricted Project. Aug 21 2023, 5:24 PM

Herald added subscribers: mehdi_amini, jdoerfert.

vzakhari requested review of this revision. Aug 21 2023, 5:24 PM

Harbormaster completed remote builds in B253958: Diff 552175. Aug 21 2023, 5:58 PM

Thank you for picking up these bugs from adding CSE to the pipeline. This patch looks good to me. The comments are just questions and are not intended to block the patch.

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp
538	I wonder if using the conversion pattern rewriter driver is creating more trouble than it is worth in this patch. Personally, I find these sorts of things make this pass very difficult to debug. How would you feel about using the greedy pattern rewriter API instead? This way updates would be immediately visible.
flang/test/HLFIR/associate-codegen.fir
242	Have you observed performance regressions as a result of this? If you have, I don't think that should block the patch; but it would be helpful to know.

This revision is now accepted and ready to land. Aug 22 2023, 2:38 AM

I added the dominance check to avoid TODO failures on cases like this:

subroutine test(i)
  integer, intent(in) :: i(:, :)
  associate (j => transpose(i))
    if (j(1,1) /= 1) then
       stop
    end if
  end associate
end subroutine test
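
For readers unfamiliar with dominance, here is a minimal sketch of a block-level dominance computation on a toy CFG. It is not mlir::DominanceInfo: ToyCfg and computeDominators are hypothetical, all blocks are assumed reachable from block 0, and the iterative algorithm is the textbook one rather than whatever MLIR uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// preds[b] lists the predecessors of block b; block 0 is the entry block.
using ToyCfg = std::vector<std::vector<std::size_t>>;

// Returns dom, where dom[b][a] is true when block a dominates block b.
// Assumes every block is reachable from the entry block.
std::vector<std::vector<bool>> computeDominators(const ToyCfg &preds) {
  const std::size_t n = preds.size();
  assert(n > 0 && "the CFG must have an entry block");
  // Start from "everything dominates everything" and shrink to a fixpoint.
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true;
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors of b.
      std::vector<bool> next(n, true);
      for (std::size_t p : preds[b])
        for (std::size_t a = 0; a < n; ++a)
          next[a] = next[a] && dom[p][a];
      next[b] = true; // every block dominates itself
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}
```

For a CFG shaped like the one such code lowers to (an entry block branching to a block that stops and a block that continues), the entry block dominates the continuing block, so a use in the entry block is executed before an end_associate placed in the continuing block.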

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

538

The greedy pattern match driver usage is not straightforward either. One of the problems is that the operands are not converted before the users, so for the LIT case from flang/test/HLFIR/associate-codegen.fir the very first pattern rewrite (DestroyOpConversion) produces this:

"func.func"() <{arg_attrs = [{fir.bindc_name = "a"}], function_type = (!fir.boxchar<1>) -> (), sym_name = "_QPtest"}> ({
^bb0(%arg0: !fir.boxchar<1>):
...
  %3 = "hlfir.elemental"(%2, %0) ({
  ^bb0(%arg1: index):
    %6 = "fir.undefined"() : () -> !fir.ref<!fir.char<1>>
    "hlfir.yield_element"(%6) : (!fir.ref<!fir.char<1>>) -> ()
  }) {operandSegmentSizes = array<i32: 1, 0, 1>, unordered} : (!fir.shape<1>, index) -> !hlfir.expr<10x!fir.char<1>>
...
  "fir.if"(%3) ({
    %6 = "fir.convert"(%3) : (!hlfir.expr<10x!fir.char<1>>) -> !fir.heap<!fir.array<10x!fir.char<1>>>
    "fir.freemem"(%6) : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> ()
    "fir.result"() : () -> ()
  }, {
  }) : (!hlfir.expr<10x!fir.char<1>>) -> ()
  "func.return"() : () -> ()
}) : () -> ()

I guess we can "legalize" the operand of DestroyOp before generating the if block with freemem, e.g. do something like this:

"func.func"() <{arg_attrs = [{fir.bindc_name = "a"}], function_type = (!fir.boxchar<1>) -> (), sym_name = "_QPtest"}> ({
^bb0(%arg0: !fir.boxchar<1>):
...
  %3 = "hlfir.elemental"(%2, %0) ({
  ^bb0(%arg1: index):
    %6 = "fir.undefined"() : () -> !fir.ref<!fir.char<1>>
    "hlfir.yield_element"(%6) : (!fir.ref<!fir.char<1>>) -> ()
  }) {operandSegmentSizes = array<i32: 1, 0, 1>, unordered} : (!fir.shape<1>, index) -> !hlfir.expr<10x!fir.char<1>>
...
  %tuple:2 = hlfir.unpack_expr %3 : (!hlfir.expr<10x!fir.char<1>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>, i1
  "fir.if"(%tuple#1) ({
    %6 = "fir.convert"(%tuple#0) : (!fir.ref<!fir.array<10x!fir.char<1>>>) -> !fir.heap<!fir.array<10x!fir.char<1>>>
    "fir.freemem"(%6) : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> ()
    "fir.result"() : () -> ()
  }, {
  }) : (i1) -> ()
  "func.return"() : () -> ()
}) : () -> ()

Then after hlfir.elemental has been replaced with a tuple producing fir.insert_value we would have to pattern-match and eliminate fir.insert_value->hlfir.unpack_expr chains.

But this does not seem to help to recognize the "allOtherUsesAreSafeForAssociate(): last use of hlfir.expr" at all: different users of the hlfir.elemental may introduce their own hlfir.unpack_expr, so there will be multiple uses of hlfir.elemental that we will have to chase starting from a hlfir.unpack_expr operand of a hlfir.associate; and the users are "unrecognizable" after some conversions have happened (like above, the DestroyOp has become an if operation with type casts and freemem).

So I think pre-conversion analysis for deciding about the buffer reuse is the way to go here. I would like to discuss this more with @jeanPerier. In the meantime I will try to fix this for correctness.
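
A pre-conversion analysis of this kind could look roughly like the sketch below. This is a toy model, not a proposed implementation: ToyUse, ToyUseMap and findReusableAssociates are hypothetical, and the rule used (exactly one associate user, every other user is a destroy) is a simplifying assumption that ignores hlfir.shape_of, hlfir.get_length and dominance.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToyUse {
  int opId;         // non-negative id of the using operation
  std::string kind; // e.g. "hlfir.associate" or "hlfir.destroy"
};

// Maps an expression id to its uses in the original, unconverted IR.
using ToyUseMap = std::map<int, std::vector<ToyUse>>;

// Returns the ids of associate ops that may take over the buffer of their
// source expression. Simplified rule (an assumption of this sketch): the
// expression has exactly one associate user and all other users are destroys.
std::set<int> findReusableAssociates(const ToyUseMap &exprUses) {
  std::set<int> reusable;
  for (const auto &entry : exprUses) {
    int associate = -1; // -1 means "no associate seen yet"
    bool ok = true;
    for (const ToyUse &use : entry.second) {
      assert(use.opId >= 0 && "operation ids must be non-negative");
      if (use.kind == "hlfir.destroy")
        continue;
      if (use.kind == "hlfir.associate" && associate == -1) {
        associate = use.opId;
        continue;
      }
      ok = false; // second associate or an unrecognized user
    }
    if (ok && associate != -1)
      reusable.insert(associate);
  }
  return reusable;
}
```

Because the decision is made on the original IR, it does not depend on which ops are pending erasure or already converted; the conversion patterns would only consult the resulting set.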

flang/test/HLFIR/associate-codegen.fir

242

There are no regressions on exchange2, but I will run more benchmarks to be on the safe side.

Harbormaster completed remote builds in B254154: Diff 552452. Aug 22 2023, 12:40 PM

This patch did not cause any performance regressions on x86 for SPEC CPU2000/2006/2017 or Polyhedron benchmarks.

tblah added inline comments. Aug 23 2023, 2:55 AM

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp
538	+1, thanks for trying this

Closed by commit rGb2a9501080d0: [flang][hlfir] Fixed associate-op codegen for optimized HLFIR. (authored by vzakhari). Aug 23 2023, 9:48 AM

This revision was automatically updated to reflect the committed changes.

vzakhari added a commit: rGb2a9501080d0: [flang][hlfir] Fixed associate-op codegen for optimized HLFIR..

Revision Contents

Path	Size
flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp	33 lines
flang/test/HLFIR/associate-codegen.fir	132 lines

Diff 552767

flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp

[19 lines not shown]
#include "flang/Optimizer/Builder/Todo.h"		#include "flang/Optimizer/Builder/Todo.h"
#include "flang/Optimizer/Dialect/FIRDialect.h"		#include "flang/Optimizer/Dialect/FIRDialect.h"
#include "flang/Optimizer/Dialect/FIROps.h"		#include "flang/Optimizer/Dialect/FIROps.h"
#include "flang/Optimizer/Dialect/FIRType.h"		#include "flang/Optimizer/Dialect/FIRType.h"
#include "flang/Optimizer/Dialect/Support/FIRContext.h"		#include "flang/Optimizer/Dialect/Support/FIRContext.h"
#include "flang/Optimizer/HLFIR/HLFIRDialect.h"		#include "flang/Optimizer/HLFIR/HLFIRDialect.h"
#include "flang/Optimizer/HLFIR/HLFIROps.h"		#include "flang/Optimizer/HLFIR/HLFIROps.h"
#include "flang/Optimizer/HLFIR/Passes.h"		#include "flang/Optimizer/HLFIR/Passes.h"
		#include "mlir/IR/Dominance.h"
#include "mlir/IR/PatternMatch.h"		#include "mlir/IR/PatternMatch.h"
#include "mlir/Pass/Pass.h"		#include "mlir/Pass/Pass.h"
#include "mlir/Pass/PassManager.h"		#include "mlir/Pass/PassManager.h"
#include "mlir/Support/LogicalResult.h"		#include "mlir/Support/LogicalResult.h"
#include "mlir/Transforms/DialectConversion.h"		#include "mlir/Transforms/DialectConversion.h"
#include "llvm/ADT/TypeSwitch.h"		#include "llvm/ADT/TypeSwitch.h"

namespace hlfir {		namespace hlfir {
[407 lines not shown]	if (!mlir::isa<hlfir::DestroyOp>(useOp) && useOp != currentUse) {
// hlfir.shape_of and hlfir.get_length will not disrupt cleanup so it is		// hlfir.shape_of and hlfir.get_length will not disrupt cleanup so it is
// safe for hlfir.associate. These operations might read from the box and		// safe for hlfir.associate. These operations might read from the box and
// so they need to come before the hflir.end_associate (which may		// so they need to come before the hflir.end_associate (which may
// deallocate).		// deallocate).
if (mlir::isa<hlfir::ShapeOfOp>(useOp) ||		if (mlir::isa<hlfir::ShapeOfOp>(useOp) ||
mlir::isa<hlfir::GetLengthOp>(useOp)) {		mlir::isa<hlfir::GetLengthOp>(useOp)) {
if (!endAssociate)		if (!endAssociate)
continue;		continue;
// not known to occur in practice:		// If useOp dominates the endAssociate, then it is definitely safe.
if (useOp->getBlock() != endAssociate->getBlock())		if (useOp->getBlock() != endAssociate->getBlock())
TODO(endAssociate->getLoc(), "Associate split over multiple blocks");		if (mlir::DominanceInfo{}.dominates(useOp, endAssociate))
		continue;
if (useOp->isBeforeInBlock(endAssociate))		if (useOp->isBeforeInBlock(endAssociate))
continue;		continue;
}		}
return false;		return false;
}		}
return true;		return true;
}		}

[63 lines not shown]	auto replaceWith = [&](mlir::Value hlfirVar, mlir::Value firVar,
mlir::Type associateHlfirVarType = associate.getResultTypes()[0];		mlir::Type associateHlfirVarType = associate.getResultTypes()[0];
hlfirVar = adjustVar(hlfirVar, associateHlfirVarType);		hlfirVar = adjustVar(hlfirVar, associateHlfirVarType);
associate.getResult(0).replaceAllUsesWith(hlfirVar);		associate.getResult(0).replaceAllUsesWith(hlfirVar);

mlir::Type associateFirVarType = associate.getResultTypes()[1];		mlir::Type associateFirVarType = associate.getResultTypes()[1];
firVar = adjustVar(firVar, associateFirVarType);		firVar = adjustVar(firVar, associateFirVarType);
associate.getResult(1).replaceAllUsesWith(firVar);		associate.getResult(1).replaceAllUsesWith(firVar);
associate.getResult(2).replaceAllUsesWith(flag);		associate.getResult(2).replaceAllUsesWith(flag);
rewriter.replaceOp(associate, {hlfirVar, firVar, flag});		// FIXME: note that the AssociateOp that is being erased
		// here will continue to be a user of the original Source
		// operand (e.g. a result of hlfir.elemental), because
		// the erasure is not immediate in the rewriter.
		// In case there are multiple uses of the Source operand,
		// the allOtherUsesAreSafeForAssociate() below will always
		// see them, so there is no way to reuse the buffer.
		// I think we have to run this analysis before doing
		// the conversions, so that we can analyze HLFIR in its
		// original form and decide which of the AssociateOp
		// users of hlfir.expr can reuse the buffer (if it can).
		rewriter.eraseOp(associate);
};		};

// If this is the last use of the expression value and this is an hlfir.expr		// If this is the last use of the expression value and this is an hlfir.expr
// that was bufferized, re-use the storage.		// that was bufferized, re-use the storage.
// Otherwise, create a temp and assign the storage to it.		// Otherwise, create a temp and assign the storage to it.
		//
		// WARNING: it is important to use the original Source operand
		// of the AssociateOp to look for the users, because its replacement
		// has zero materialized users at this point.
		// So allOtherUsesAreSafeForAssociate() may incorrectly return
		// true here.
if (!isTrivialValue && allOtherUsesAreSafeForAssociate(		if (!isTrivialValue && allOtherUsesAreSafeForAssociate(
adaptor.getSource(), associate.getOperation(),		associate.getSource(), associate.getOperation(),
getEndAssociate(associate))) {		getEndAssociate(associate))) {
// Re-use hlfir.expr buffer if this is the only use of the hlfir.expr		// Re-use hlfir.expr buffer if this is the only use of the hlfir.expr
// outside of the hlfir.destroy. Take on the cleaning-up responsibility		// outside of the hlfir.destroy. Take on the cleaning-up responsibility
// for the related hlfir.end_associate, and erase the hlfir.destroy (if		// for the related hlfir.end_associate, and erase the hlfir.destroy (if
// any).		// any).
mlir::Value mustFree = getBufferizedExprMustFreeFlag(adaptor.getSource());		mlir::Value mustFree = getBufferizedExprMustFreeFlag(adaptor.getSource());
mlir::Value firBase = hlfir::Entity{bufferizedExpr}.getFirBase();		mlir::Value firBase = hlfir::Entity{bufferizedExpr}.getFirBase();
replaceWith(bufferizedExpr, firBase, mustFree);		replaceWith(bufferizedExpr, firBase, mustFree);
[217 lines not shown]	matchAndRewrite(hlfir::ElementalOp elemental, OpAdaptor adaptor,
// loop, this will ensure the buffer properly deallocated.		// loop, this will ensure the buffer properly deallocated.
if (elementValue.getType().isa<hlfir::ExprType>() &&		if (elementValue.getType().isa<hlfir::ExprType>() &&
wasCreatedInCurrentBlock(elementValue, builder))		wasCreatedInCurrentBlock(elementValue, builder))
builder.create<hlfir::DestroyOp>(loc, elementValue);		builder.create<hlfir::DestroyOp>(loc, elementValue);
builder.restoreInsertionPoint(insPt);		builder.restoreInsertionPoint(insPt);

mlir::Value bufferizedExpr =		mlir::Value bufferizedExpr =
packageBufferizedExpr(loc, builder, temp, cleanup);		packageBufferizedExpr(loc, builder, temp, cleanup);
		// Explicitly delete the body of the elemental to get rid
		// of any users of hlfir.expr values inside the body as early
		// as possible.
		rewriter.startRootUpdate(elemental);
		rewriter.eraseBlock(elemental.getBody());
		rewriter.finalizeRootUpdate(elemental);
rewriter.replaceOp(elemental, bufferizedExpr);		rewriter.replaceOp(elemental, bufferizedExpr);
return mlir::success();		return mlir::success();
}		}
};		};
struct CharExtremumOpConversion		struct CharExtremumOpConversion
: public mlir::OpConversionPattern<hlfir::CharExtremumOp> {		: public mlir::OpConversionPattern<hlfir::CharExtremumOp> {
using mlir::OpConversionPattern<hlfir::CharExtremumOp>::OpConversionPattern;		using mlir::OpConversionPattern<hlfir::CharExtremumOp>::OpConversionPattern;
explicit CharExtremumOpConversion(mlir::MLIRContext *ctx)		explicit CharExtremumOpConversion(mlir::MLIRContext *ctx)
[93 lines not shown]

flang/test/HLFIR/associate-codegen.fir

	[233 lines not shown]
	// CHECK: %[[VAL_15:.*]] = fir.insert_value %[[VAL_14]], %[[VAL_5]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<3x4xi32>>, i1>, !fir.heap<!fir.array<3x4xi32>>) -> tuple<!fir.heap<!fir.array<3x4xi32>>, i1>			// CHECK: %[[VAL_15:.*]] = fir.insert_value %[[VAL_14]], %[[VAL_5]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<3x4xi32>>, i1>, !fir.heap<!fir.array<3x4xi32>>) -> tuple<!fir.heap<!fir.array<3x4xi32>>, i1>
	// CHECK: %[[VAL_16:.*]] = fir.convert %[[VAL_5]]#0 : (!fir.heap<!fir.array<3x4xi32>>) -> !fir.ref<!fir.array<3x4xi32>>			// CHECK: %[[VAL_16:.*]] = fir.convert %[[VAL_5]]#0 : (!fir.heap<!fir.array<3x4xi32>>) -> !fir.ref<!fir.array<3x4xi32>>
	// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_5]]#1 : (!fir.heap<!fir.array<3x4xi32>>) -> !fir.ref<!fir.array<3x4xi32>>			// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_5]]#1 : (!fir.heap<!fir.array<3x4xi32>>) -> !fir.ref<!fir.array<3x4xi32>>
	// CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref<!fir.array<3x4xi32>>) -> !fir.heap<!fir.array<3x4xi32>>			// CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref<!fir.array<3x4xi32>>) -> !fir.heap<!fir.array<3x4xi32>>
	// CHECK: fir.freemem %[[VAL_18]] : !fir.heap<!fir.array<3x4xi32>>			// CHECK: fir.freemem %[[VAL_18]] : !fir.heap<!fir.array<3x4xi32>>
	// CHECK: return			// CHECK: return
	// CHECK: }			// CHECK: }

	// The hlfir.associate op is cloned when the elemental is bufferized into the
	// fir.do_loop. When the associate op conversion is run, if the source of the
	// assoicate is used directly (not accessing the bufferized version through
	// the adaptor) then both the associate inside the elemental and the associate
	// inside the fir.do_loop are found as uses. Therefore being erroneously
	// flagged as an associate with more than one use
	func.func @test_cloned_associate() {
	%false = arith.constant false
	%c1 = arith.constant 1 : index
	%c10 = arith.constant 10 : index
	%0 = fir.alloca !fir.char<1>
	%2 = fir.shape %c10 : (index) -> !fir.shape<1>
	%4 = hlfir.as_expr %0 move %false : (!fir.ref<!fir.char<1>>, i1) -> !hlfir.expr<!fir.char<1>>
	%5 = hlfir.elemental %2 unordered : (!fir.shape<1>) -> !hlfir.expr<10x!fir.logical<4>> {
	^bb0(%arg0: index):
	%8:3 = hlfir.associate %4 typeparams %c1 {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<!fir.char<1>>, index) -> (!fir.ref<!fir.char<1>>, !fir.ref<!fir.char<1>>, i1)
	hlfir.end_associate %8#1, %8#2 : !fir.ref<!fir.char<1>>, i1
	%15 = fir.convert %false : (i1) -> !fir.logical<4>
	hlfir.yield_element %15 : !fir.logical<4>
	}
	hlfir.destroy %5 : !hlfir.expr<10x!fir.logical<4>>
	hlfir.destroy %4 : !hlfir.expr<!fir.char<1>>
	return
	}
	// CHECK-LABEL: func.func @test_cloned_associate() {
	// CHECK: %[[VAL_0:.*]] = arith.constant false
	// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index
	// CHECK: %[[VAL_2:.*]] = arith.constant 10 : index
	// CHECK: %[[VAL_3:.*]] = fir.alloca !fir.char<1>
	// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
	// CHECK: %[[VAL_5:.*]] = fir.undefined tuple<!fir.ref<!fir.char<1>>, i1>
	// CHECK: %[[VAL_6:.*]] = fir.insert_value %[[VAL_5]], %[[VAL_0]], [1 : index] : (tuple<!fir.ref<!fir.char<1>>, i1>, i1) -> tuple<!fir.ref<!fir.char<1>>, i1>
	// CHECK: %[[VAL_7:.*]] = fir.insert_value %[[VAL_6]], %[[VAL_3]], [0 : index] : (tuple<!fir.ref<!fir.char<1>>, i1>, !fir.ref<!fir.char<1>>) -> tuple<!fir.ref<!fir.char<1>>, i1>
	// CHECK: %[[VAL_8:.*]] = fir.allocmem !fir.array<10x!fir.logical<4>> {bindc_name = ".tmp.array", uniq_name = ""}
	// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_8]](%[[VAL_4]]) {uniq_name = ".tmp.array"} : (!fir.heap<!fir.array<10x!fir.logical<4>>>, !fir.shape<1>) -> (!fir.heap<!fir.array<10x!fir.logical<4>>>, !fir.heap<!fir.array<10x!fir.logical<4>>>)
	// CHECK: %[[VAL_10:.*]] = arith.constant true
	// CHECK: %[[VAL_11:.*]] = arith.constant 1 : index
	// CHECK: fir.do_loop %[[VAL_12:.*]] = %[[VAL_11]] to %[[VAL_2]] step %[[VAL_11]] unordered {
	// CHECK: %[[VAL_13:.*]] = fir.convert %[[VAL_0]] : (i1) -> !fir.logical<4>
	// CHECK: %[[VAL_14:.*]] = hlfir.designate %[[VAL_9]]#0 (%[[VAL_12]]) : (!fir.heap<!fir.array<10x!fir.logical<4>>>, index) -> !fir.ref<!fir.logical<4>>
	// CHECK: hlfir.assign %[[VAL_13]] to %[[VAL_14]] temporary_lhs : !fir.logical<4>, !fir.ref<!fir.logical<4>>
	// CHECK: }
	// CHECK: %[[VAL_15:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>
	// CHECK: %[[VAL_16:.*]] = fir.insert_value %[[VAL_15]], %[[VAL_10]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>
	// CHECK: %[[VAL_17:.*]] = fir.insert_value %[[VAL_16]], %[[VAL_9]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>, !fir.heap<!fir.array<10x!fir.logical<4>>>) -> tuple<!fir.heap<!fir.array<10x!fir.logical<4>>>, i1>
	// CHECK: fir.freemem %[[VAL_9]]#1 : !fir.heap<!fir.array<10x!fir.logical<4>>>
	// CHECK: return
	// CHECK: }

	func.func @test_multiple_associations(%arg0: !hlfir.expr<1x2xi32>) {			func.func @test_multiple_associations(%arg0: !hlfir.expr<1x2xi32>) {
	%c1 = arith.constant 1 : index			%c1 = arith.constant 1 : index
	%c2 = arith.constant 2 : index			%c2 = arith.constant 2 : index
	%shape = fir.shape %c1, %c2 : (index, index) -> !fir.shape<2>			%shape = fir.shape %c1, %c2 : (index, index) -> !fir.shape<2>
	%0:3 = hlfir.associate %arg0(%shape) {uniq_name = "associate 0"} : (!hlfir.expr<1x2xi32>, !fir.shape<2>) -> (!fir.ref<!fir.array<1x2xi32>>, !fir.ref<!fir.array<1x2xi32>>, i1)			%0:3 = hlfir.associate %arg0(%shape) {uniq_name = "associate 0"} : (!hlfir.expr<1x2xi32>, !fir.shape<2>) -> (!fir.ref<!fir.array<1x2xi32>>, !fir.ref<!fir.array<1x2xi32>>, i1)
	%1:3 = hlfir.associate %arg0(%shape) {uniq_name = "associate 1"} : (!hlfir.expr<1x2xi32>, !fir.shape<2>) -> (!fir.ref<!fir.array<1x2xi32>>, !fir.ref<!fir.array<1x2xi32>>, i1)			%1:3 = hlfir.associate %arg0(%shape) {uniq_name = "associate 1"} : (!hlfir.expr<1x2xi32>, !fir.shape<2>) -> (!fir.ref<!fir.array<1x2xi32>>, !fir.ref<!fir.array<1x2xi32>>, i1)
	hlfir.end_associate %0#1, %0#2 : !fir.ref<!fir.array<1x2xi32>>, i1			hlfir.end_associate %0#1, %0#2 : !fir.ref<!fir.array<1x2xi32>>, i1
	hlfir.end_associate %1#1, %1#2 : !fir.ref<!fir.array<1x2xi32>>, i1			hlfir.end_associate %1#1, %1#2 : !fir.ref<!fir.array<1x2xi32>>, i1
	[116 lines not shown]
	// CHECK: %[[VAL_24:.*]] = hlfir.designate %[[VAL_12]]#0 (%[[VAL_15]]) : (!fir.box<!fir.array<?x!fir.logical<4>>>, index) -> !fir.ref<!fir.logical<4>>			// CHECK: %[[VAL_24:.*]] = hlfir.designate %[[VAL_12]]#0 (%[[VAL_15]]) : (!fir.box<!fir.array<?x!fir.logical<4>>>, index) -> !fir.ref<!fir.logical<4>>
	// CHECK: hlfir.assign %[[VAL_23]] to %[[VAL_24]] temporary_lhs : !fir.logical<4>, !fir.ref<!fir.logical<4>>			// CHECK: hlfir.assign %[[VAL_23]] to %[[VAL_24]] temporary_lhs : !fir.logical<4>, !fir.ref<!fir.logical<4>>
	// CHECK: }			// CHECK: }
	// CHECK: %[[VAL_25:.*]] = fir.undefined tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>			// CHECK: %[[VAL_25:.*]] = fir.undefined tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>
	// CHECK: %[[VAL_26:.*]] = fir.insert_value %[[VAL_25]], %[[VAL_13]], [1 : index] : (tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>, i1) -> tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>			// CHECK: %[[VAL_26:.*]] = fir.insert_value %[[VAL_25]], %[[VAL_13]], [1 : index] : (tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>, i1) -> tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>
	// CHECK: %[[VAL_27:.*]] = fir.insert_value %[[VAL_26]], %[[VAL_12]]#0, [0 : index] : (tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>, !fir.box<!fir.array<?x!fir.logical<4>>>) -> tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>			// CHECK: %[[VAL_27:.*]] = fir.insert_value %[[VAL_26]], %[[VAL_12]]#0, [0 : index] : (tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>, !fir.box<!fir.array<?x!fir.logical<4>>>) -> tuple<!fir.box<!fir.array<?x!fir.logical<4>>>, i1>
	// CHECK: return			// CHECK: return
	// CHECK: }			// CHECK: }

				// Verify that we properly recognize mutliple consequent hlfir.associate using
				// the same result of hlfir.elemental.
				func.func @_QPtest_multitple_associates_for_same_expr() {
				%c1 = arith.constant 1 : index
				%c10 = arith.constant 10 : index
				%4 = fir.shape %c10 : (index) -> !fir.shape<1>
				%11 = hlfir.elemental %4 typeparams %c1 unordered : (!fir.shape<1>, index) -> !hlfir.expr<10x!fir.char<1>> {
				^bb0(%arg1: index):
				%44 = fir.undefined !fir.ref<!fir.char<1>>
				hlfir.yield_element %44 : !fir.ref<!fir.char<1>>
				}
				%12:3 = hlfir.associate %11(%4) typeparams %c1 {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<10x!fir.char<1>>, !fir.shape<1>, index) -> (!fir.ref<!fir.array<10x!fir.char<1>>>, !fir.ref<!fir.array<10x!fir.char<1>>>, i1)
				hlfir.end_associate %12#1, %12#2 : !fir.ref<!fir.array<10x!fir.char<1>>>, i1
				%31:3 = hlfir.associate %11(%4) typeparams %c1 {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<10x!fir.char<1>>, !fir.shape<1>, index) -> (!fir.ref<!fir.array<10x!fir.char<1>>>, !fir.ref<!fir.array<10x!fir.char<1>>>, i1)
				hlfir.end_associate %31#1, %31#2 : !fir.ref<!fir.array<10x!fir.char<1>>>, i1
				hlfir.destroy %11 : !hlfir.expr<10x!fir.char<1>>
				return
				}
				// CHECK-LABEL: func.func @_QPtest_multitple_associates_for_same_expr() {
				// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index
				// CHECK: %[[VAL_1:.*]] = arith.constant 10 : index
				// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1>
				// CHECK: %[[VAL_3:.*]] = fir.allocmem !fir.array<10x!fir.char<1>> {bindc_name = ".tmp.array", uniq_name = ""}
				// CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_2]]) typeparams %[[VAL_0]] {uniq_name = ".tmp.array"} : (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.shape<1>, index) -> (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>)
				// CHECK: %[[VAL_5:.*]] = arith.constant true
				// CHECK: %[[VAL_6:.*]] = arith.constant 1 : index
				// CHECK: fir.do_loop %[[VAL_7:.*]] = %[[VAL_6]] to %[[VAL_1]] step %[[VAL_6]] unordered {
				// CHECK: %[[VAL_8:.*]] = fir.undefined !fir.ref<!fir.char<1>>
				// CHECK: %[[VAL_9:.*]] = hlfir.designate %[[VAL_4]]#0 (%[[VAL_7]]) typeparams %[[VAL_0]] : (!fir.heap<!fir.array<10x!fir.char<1>>>, index, index) -> !fir.ref<!fir.char<1>>
				// CHECK: hlfir.assign %[[VAL_8]] to %[[VAL_9]] temporary_lhs : !fir.ref<!fir.char<1>>, !fir.ref<!fir.char<1>>
				// CHECK: }
				// CHECK: %[[VAL_10:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_5]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_4]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, !fir.heap<!fir.array<10x!fir.char<1>>>) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_13:.*]] = fir.allocmem !fir.array<10x!fir.char<1>> {bindc_name = ".tmp", uniq_name = ""}
				// CHECK: %[[VAL_14:.*]] = arith.constant true
				// CHECK: %[[VAL_15:.*]]:2 = hlfir.declare %[[VAL_13]](%[[VAL_2]]) typeparams %[[VAL_0]] {uniq_name = ".tmp"} : (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.shape<1>, index) -> (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>)
				// CHECK: hlfir.assign %[[VAL_4]]#0 to %[[VAL_15]]#0 temporary_lhs : !fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_16:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_17:.*]] = fir.insert_value %[[VAL_16]], %[[VAL_14]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_18:.*]] = fir.insert_value %[[VAL_17]], %[[VAL_15]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, !fir.heap<!fir.array<10x!fir.char<1>>>) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_19:.*]] = fir.convert %[[VAL_15]]#0 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_20:.*]] = fir.convert %[[VAL_15]]#1 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_21:.*]] = fir.convert %[[VAL_20]] : (!fir.ref<!fir.array<10x!fir.char<1>>>) -> !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: fir.freemem %[[VAL_21]] : !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_22:.*]] = fir.allocmem !fir.array<10x!fir.char<1>> {bindc_name = ".tmp", uniq_name = ""}
				// CHECK: %[[VAL_23:.*]] = arith.constant true
				// CHECK: %[[VAL_24:.*]]:2 = hlfir.declare %[[VAL_22]](%[[VAL_2]]) typeparams %[[VAL_0]] {uniq_name = ".tmp"} : (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.shape<1>, index) -> (!fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>)
				// CHECK: hlfir.assign %[[VAL_4]]#0 to %[[VAL_24]]#0 temporary_lhs : !fir.heap<!fir.array<10x!fir.char<1>>>, !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_25:.*]] = fir.undefined tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_26:.*]] = fir.insert_value %[[VAL_25]], %[[VAL_23]], [1 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, i1) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_27:.*]] = fir.insert_value %[[VAL_26]], %[[VAL_24]]#0, [0 : index] : (tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>, !fir.heap<!fir.array<10x!fir.char<1>>>) -> tuple<!fir.heap<!fir.array<10x!fir.char<1>>>, i1>
				// CHECK: %[[VAL_28:.*]] = fir.convert %[[VAL_24]]#0 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_29:.*]] = fir.convert %[[VAL_24]]#1 : (!fir.heap<!fir.array<10x!fir.char<1>>>) -> !fir.ref<!fir.array<10x!fir.char<1>>>
				// CHECK: %[[VAL_30:.*]] = fir.convert %[[VAL_29]] : (!fir.ref<!fir.array<10x!fir.char<1>>>) -> !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: fir.freemem %[[VAL_30]] : !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: fir.freemem %[[VAL_4]]#1 : !fir.heap<!fir.array<10x!fir.char<1>>>
				// CHECK: return
				// CHECK: }

				// Test hlfir.associate codegen, when its operand is used
				// by hlfir.shape located in a block different from the block
				// of the hlfir.end_associate.
				func.func @_QPtest(%arg0: index, %arg1: index, %arg2 : i32) {
				%c1_i32 = arith.constant 1 : i32
				%3 = fir.shape %arg0, %arg1 : (index, index) -> !fir.shape<2>
				%4 = hlfir.elemental %3 unordered : (!fir.shape<2>) -> !hlfir.expr<?x?xi32> {
				^bb0(%arg3: index, %arg4: index):
				%16 = fir.undefined i32
				hlfir.yield_element %16 : i32
				}
				%5 = hlfir.shape_of %4 : (!hlfir.expr<?x?xi32>) -> !fir.shape<2>
				%6:3 = hlfir.associate %4(%5) {uniq_name = "adapt.valuebyref"} : (!hlfir.expr<?x?xi32>, !fir.shape<2>) -> (!fir.box<!fir.array<?x?xi32>>, !fir.ref<!fir.array<?x?xi32>>, i1)
				%13 = arith.cmpi ne, %arg2, %c1_i32 : i32
				cf.cond_br %13, ^bb1, ^bb2
				^bb1: // pred: ^bb0
				fir.unreachable
				^bb2: // pred: ^bb0
				hlfir.end_associate %6#1, %6#2 : !fir.ref<!fir.array<?x?xi32>>, i1
				hlfir.destroy %4 : !hlfir.expr<?x?xi32>
				return
				}
