This is an archive of the discontinued LLVM Phabricator instance.

Differential D127553

[mlir] Replace iterOperand with a neutral element
Abandoned, Public

Authored by ayzhuang on Jun 10 2022, 6:27 PM.

Details

Reviewers

dcaballe
nicolasvasilache
bondhugula
kuhar

Summary

Fix a bug in the unroll-and-jam utility. Currently, the result is wrong if any
of the iterOperands of the loop being unroll-and-jammed is not a neutral element
for the corresponding reduction kind. This patch replaces each such iterOperand
with a neutral element and adds an op after the loop to combine the original
iterOperand with the corresponding result of the loop.

Also modify getIdentityValue to support non-scalar types.
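To illustrate what "neutral element" means in the summary above, here is a minimal standalone sketch in plain C++ (it does not use the MLIR API). The `ReductionKind` enum and the `identityFor` and `splatIdentity` helpers are hypothetical names introduced only for this illustration, and the values chosen for max/min are an assumption (NaN handling is ignored). The second helper mirrors the idea of the non-scalar case: the scalar identity is repeated for every element.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a few reduction kinds (not MLIR's AtomicRMWKind).
enum class ReductionKind { AddF, MulF, MaxF, MinF };

// Returns a neutral element e for the kind, i.e. combine(e, x) == x.
inline float identityFor(ReductionKind kind) {
  switch (kind) {
  case ReductionKind::AddF:
    return 0.0f; // 0 + x == x
  case ReductionKind::MulF:
    return 1.0f; // 1 * x == x
  case ReductionKind::MaxF:
    return -std::numeric_limits<float>::infinity(); // max(-inf, x) == x
  case ReductionKind::MinF:
    return std::numeric_limits<float>::infinity(); // min(+inf, x) == x
  }
  return 0.0f; // Not reached for the kinds listed above.
}

// Non-scalar case: repeat the scalar identity for each of the n elements.
inline std::vector<float> splatIdentity(ReductionKind kind, std::size_t n) {
  assert(n > 0 && "expected at least one element");
  return std::vector<float>(n, identityFor(kind));
}
```

For example, `splatIdentity(ReductionKind::MulF, 4)` yields four copies of `1.0f`.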

Diff Detail

Event Timeline

ayzhuang created this revision. Jun 10 2022, 6:27 PM

Herald added a project: Restricted Project. Jun 10 2022, 6:27 PM

Herald added subscribers: bzcheeseman, sdasgup3, wenzhicui and 20 others.

ayzhuang requested review of this revision. Jun 10 2022, 6:27 PM

Herald added a reviewer: nicolasvasilache. Jun 10 2022, 6:27 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

Harbormaster completed remote builds in B169196: Diff 436088. Jun 10 2022, 6:46 PM

Update comments.

Herald added a subscriber: anlunx. Jul 6 2022, 11:35 AM

Harbormaster completed remote builds in B173957: Diff 442648. Jul 6 2022, 11:58 AM

@bondhugula @dcaballe Please review, thanks.

Can you please improve the commit summary to state what the rationale for the change is? For example, is it fixing a bug, or is it doing something functionally equivalent in a different or better way? This is important to specify here.

ayzhuang edited the summary of this revision. Aug 25 2022, 6:16 PM

@bondhugula Thank you for the suggestion. I have modified the summary. Example: if the original loop performs an add reduction and its iterOperand is not 0, we modify the original loop to use 0 as its iterOperand and add the original iterOperand back to the result of the loop. As a result, the unroll-and-jammed loop uses 0 for all of its iterOperands. Otherwise, the original iterOperand would be added unroll-factor times instead of once.
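The double counting described above can be written out as a short derivation. This is a sketch under simplifying assumptions that are not part of the original comment: $s_0$ denotes the original iterOperand, $x_i$ the value added in iteration $i$, $N$ the trip count, and $U$ an unroll-jam factor that divides $N$ (so no cleanup loop is needed).

```latex
\begin{align*}
r_{\text{orig}} &= s_0 + \sum_{i=0}^{N-1} x_i \\
r_{\text{jam}}  &= \sum_{u=0}^{U-1} \Bigl( s_0 + \sum_{\substack{0 \le i < N \\ i \bmod U = u}} x_i \Bigr)
                 = U\,s_0 + \sum_{i=0}^{N-1} x_i \\
r_{\text{fix}}  &= s_0 + \sum_{u=0}^{U-1} \Bigl( 0 + \sum_{\substack{0 \le i < N \\ i \bmod U = u}} x_i \Bigr)
                 = s_0 + \sum_{i=0}^{N-1} x_i = r_{\text{orig}}
\end{align*}
```

Here $r_{\text{jam}}$ seeds every one of the $U$ accumulators with $s_0$, while $r_{\text{fix}}$ seeds them with the neutral element $0$ and adds $s_0$ back once. The two agree only when $s_0 = 0$ or $U = 1$.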

In D127553#3750796, @ayzhuang wrote:

@bondhugula Thank you for the suggestion. I have modified the summary. Example: if the original loop performs an add reduction and its iterOperand is not 0, we modify the original loop to use 0 as its iterOperand and add the original iterOperand back to the result of the loop. As a result, the unroll-and-jammed loop uses 0 for all of its iterOperands. Otherwise, the original iterOperand would be added unroll-factor times instead of once.

I still don't see why you need to replace the init value by a neutral element. Let's take this example (unroll-and-jam here is the same as unroll). Any cleanup loop will start with the return value of the main loop. Within the body, the former yield of iteration i would provide the "init" value for i + 1.

%sum = affine.for %iv = 0 to 9 iter_args(%sum_iter = %c5) -> (f32) {
  %next = arith.addf %sum_iter, %arg1 : f32
  affine.yield %next : f32
}

It looks like the change is circumventing an existing bug? What output do you see for this currently? We don't perform any such replacement for unrolling in the presence of iter args.

This revision now requires changes to proceed. Aug 26 2022, 12:10 AM

original loop:

%c5 = arith.constant 5.0 : f32
%sum = affine.for %iv = 0 to 9 iter_args(%sum_iter = %c5) -> (f32) {
    %next = arith.addf %sum_iter, %arg1 : f32
    affine.yield %next : f32
}

without this patch:

%c5 = arith.constant 5.000000e+00 : f32
%sum:4 = affine.for %iv = 0 to 8 step 4 iter_args(%sum_iter0 = %c5, %sum_iter1 = %c5, %sum_iter2 = %c5, %sum_iter3 = %c5) -> (f32, f32, f32, f32) {
  %next0 = arith.addf %sum_iter0, %arg1 : f32
  %next1 = arith.addf %sum_iter1, %arg1 : f32
  %next2 = arith.addf %sum_iter2, %arg1 : f32
  %next3 = arith.addf %sum_iter3, %arg1 : f32
  affine.yield %next0, %next1, %next2, %next3 : f32, f32, f32, f32
}
%1 = arith.addf %sum#0, %sum#3 : f32
%2 = arith.addf %1, %sum#2 : f32
%3 = arith.addf %2, %sum#1 : f32
%4 = arith.addf %3, %arg1 : f32

with this patch:

%c5 = arith.constant 5.000000e+00 : f32
%cst_0 = arith.constant 0.000000e+00 : f32
%sum:4 = affine.for %iv = 0 to 8 step 4 iter_args(%sum_iter0 = %cst_0, %sum_iter1 = %cst_0, %sum_iter2 = %cst_0, %sum_iter3 = %cst_0) -> (f32, f32, f32, f32) {
  %next0 = arith.addf %sum_iter0, %arg1 : f32
  %next1 = arith.addf %sum_iter1, %arg1 : f32
  %next2 = arith.addf %sum_iter2, %arg1 : f32
  %next3 = arith.addf %sum_iter3, %arg1 : f32
  affine.yield %next0, %next1, %next2, %next3 : f32, f32, f32, f32
}
%1 = arith.addf %sum#0, %sum#3 : f32
%2 = arith.addf %1, %sum#2 : f32
%3 = arith.addf %2, %sum#1 : f32
%4 = arith.addf %3, %arg1 : f32
%5 = arith.addf %4, %c5 : f32

In the example you provide, we add %arg1 9 times and %c5 1 time. Without this patch we add %arg1 9 times and %c5 4 times after unroll-and-jam. With this patch we add %arg1 9 times and %c5 1 time after unroll-and-jam.
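The counts in this comment can be checked with a small standalone simulation in plain C++ (not MLIR). This is only a sketch of the arithmetic, not the utility's implementation: `originalLoop`, `unrollJamLoop`, `withoutPatch`, and `withPatch` are hypothetical names, and the loop body is reduced to "add `x` to the accumulator".

```cpp
#include <cassert>
#include <vector>

// Original loop: a single accumulator seeded with `init`.
inline float originalLoop(float init, float x, int tripCount) {
  float acc = init;
  for (int i = 0; i < tripCount; ++i)
    acc += x;
  return acc;
}

// Unroll-and-jam sketch: `factor` independent accumulators, each seeded with
// `seed`, combined after the main loop; leftover iterations run as cleanup.
inline float unrollJamLoop(float seed, float x, int tripCount, int factor) {
  assert(factor > 0 && "unroll jam factor should be positive");
  std::vector<float> accs(factor, seed);
  int mainTrip = tripCount - tripCount % factor;
  for (int i = 0; i < mainTrip; i += factor)
    for (int u = 0; u < factor; ++u)
      accs[u] += x;
  // Reduction ops after the main loop.
  float result = accs[0];
  for (int u = 1; u < factor; ++u)
    result += accs[u];
  // Cleanup iterations continue from the combined result.
  for (int i = mainTrip; i < tripCount; ++i)
    result += x;
  return result;
}

// Without the patch: every accumulator starts from the original init value.
inline float withoutPatch(float init, float x, int tripCount, int factor) {
  return unrollJamLoop(init, x, tripCount, factor);
}

// With the patch: accumulators start from 0 and init is added back once.
inline float withPatch(float init, float x, int tripCount, int factor) {
  return unrollJamLoop(0.0f, x, tripCount, factor) + init;
}
```

With `init = 5`, `x = 1`, a trip count of 9, and a factor of 4, `originalLoop` and `withPatch` both give 14, while `withoutPatch` gives 29 because 5 is added four times.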

rebase

Herald added a project: Restricted Project. Mar 17 2023, 1:07 PM

Herald added subscribers: llvm-commits, Moerafaat, zero9178, thopre.

Harbormaster completed remote builds in B220137: Diff 506183. Mar 17 2023, 1:08 PM

@bondhugula @dcaballe Please review, thanks. Please read my last comment for an example. We don't need this for unroll because unroll does not widen the loop. This is the test I copied from unroll.mlir:

// UNROLL-BY-4-LABEL: unroll_with_iter_args_and_promotion
func.func @unroll_with_iter_args_and_promotion(%arg0 : f32, %arg1 : f32) -> f32 {
  %from = arith.constant 0 : index
  %to = arith.constant 10 : index
  %step = arith.constant 1 : index
  %sum = affine.for %iv = 0 to 9 iter_args(%sum_iter = %arg0) -> (f32) {
    %next = arith.addf %sum_iter, %arg1 : f32
    affine.yield %next : f32
  }
  // UNROLL-BY-4:      %[[SUM:.*]] = affine.for %{{.*}} = 0 to 8 step 4 iter_args(%[[V0:.*]] =
  // UNROLL-BY-4-NEXT:   %[[V1:.*]] = arith.addf %[[V0]]
  // UNROLL-BY-4-NEXT:   %[[V2:.*]] = arith.addf %[[V1]]
  // UNROLL-BY-4-NEXT:   %[[V3:.*]] = arith.addf %[[V2]]
  // UNROLL-BY-4-NEXT:   %[[V4:.*]] = arith.addf %[[V3]]
  // UNROLL-BY-4-NEXT:   affine.yield %[[V4]]
  // UNROLL-BY-4-NEXT: }
  // UNROLL-BY-4-NEXT: %[[RES:.*]] = arith.addf %[[SUM]],
  // UNROLL-BY-4-NEXT: return %[[RES]]
  return %sum : f32
}

There is only one iterArg which is only used once in the unrolled loop.
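For contrast, plain unrolling can be sketched in standalone C++ as follows. This is an illustration under the same simplification as the comment (the body only adds `x`), and `unrolledLoop` is a hypothetical name: a single accumulator is threaded through the unrolled copies, so the initial value enters the sum exactly once.

```cpp
#include <cassert>

// Plain unroll by `factor`: one accumulator chained through the copies, so
// `init` is consumed only by the first copy of the first iteration.
inline float unrolledLoop(float init, float x, int tripCount, int factor) {
  assert(factor > 0 && "unroll factor should be positive");
  float acc = init;
  int mainTrip = tripCount - tripCount % factor;
  for (int i = 0; i < mainTrip; i += factor)
    for (int u = 0; u < factor; ++u)
      acc += x; // The result of copy u feeds copy u + 1.
  // Cleanup iterations start from the result of the main loop.
  for (int i = mainTrip; i < tripCount; ++i)
    acc += x;
  return acc;
}
```

With an initial value of 5, `x = 1`, a trip count of 9, and a factor of 4, this returns 14, the same as the loop before unrolling.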

ayzhuang updated this revision to Diff 506613. Mar 20 2023, 9:03 AM

Harbormaster completed remote builds in B220463: Diff 506613. Mar 20 2023, 9:52 AM

bondhugula removed a reviewer: bondhugula. Jun 3 2023, 6:29 AM

Herald added a reviewer: bondhugula. Jun 3 2023, 6:29 AM

Herald added a reviewer: kuhar.

Herald added a subscriber: bviyer.

kuhar resigned from this revision. Jun 11 2023, 1:14 PM

ayzhuang abandoned this revision. Jun 23 2023, 2:07 PM

Revision Contents

Path                                           Size
mlir/include/mlir/Dialect/Arith/IR/Arith.h     3 lines
mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp    38 lines
mlir/lib/Dialect/Arith/IR/ArithOps.cpp         21 lines
mlir/test/Dialect/Affine/unroll-jam.mlir       8 lines

Diff 506613

mlir/include/mlir/Dialect/Arith/IR/Arith.h

(122 lines not shown)
 /// Returns the identity value attribute associated with an AtomicRMWKind op.
 Attribute getIdentityValueAttr(AtomicRMWKind kind, Type resultType,
                                OpBuilder &builder, Location loc);

 /// Returns the identity value associated with an AtomicRMWKind op.
 Value getIdentityValue(AtomicRMWKind op, Type resultType, OpBuilder &builder,
                        Location loc);

+/// Checks if `value` is an identity value associated with an AtomicRMWKind op.
+bool isIdentityValue(AtomicRMWKind op, Value value);
+
 /// Returns the value obtained by applying the reduction operation kind
 /// associated with a binary AtomicRMWKind op to `lhs` and `rhs`.
 Value getReductionOp(AtomicRMWKind op, OpBuilder &builder, Location loc,
                      Value lhs, Value rhs);

 arith::CmpIPredicate invertPredicate(arith::CmpIPredicate pred);
 } // namespace arith
 } // namespace mlir

 #endif // MLIR_DIALECT_ARITH_IR_ARITH_H_

mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp

(1,199 lines not shown; enclosing context: for (auto it = block.begin(), e = std::prev(block.end()); it != e;) {)
         subBlocks.emplace_back(subBlockStart, std::prev(it));
       // Process all for ops that appear next.
       while (it != e && isa<AffineForOp>(&*it))
         walk(&*it++);
     }
   }
 };

+/// Checks every iterOperand of `forOp` from `reductions`. If an iterOperand is
+/// not a neutral element for the corresponding reduction kind, replaces the
+/// iterOperand with a neutral element and adds an op after `forOp` to combine
+/// the original iterOperand and the related result of `forOp`.
+static void
+replaceIterOperandWithNeutralElement(AffineForOp forOp,
+                                     ArrayRef<LoopReduction> reductions) {
+  unsigned numIterArgs = forOp.getNumIterOperands();
+  if (numIterArgs == 0)
+    return;
+  assert(reductions.size() == numIterArgs);
+  OpBuilder builder(forOp);
+  auto loc = forOp.getLoc();
+  auto iterOperands = forOp.getIterOperands();
+  for (const LoopReduction &reduction : reductions) {
+    unsigned pos = reduction.iterArgPosition;
+    arith::AtomicRMWKind kind = reduction.kind;
+    Value iterOperand = iterOperands[pos];
+    if (isIdentityValue(kind, iterOperand))
+      continue;
+    builder.setInsertionPoint(forOp);
+    Value neutralElem =
+        getIdentityValue(kind, iterOperand.getType(), builder, loc);
+    forOp.setOperand(pos + forOp.getNumControlOperands(), neutralElem);
+    builder.setInsertionPointAfter(forOp);
+    auto res = forOp.getResult(pos);
+    // Combine the original initial value with the result.
+    Value newRes = getReductionOp(kind, builder, loc, res, iterOperand);
+    res.replaceAllUsesExcept(newRes, newRes.getDefiningOp());
+  }
+}
+
 /// Unrolls and jams this loop by the specified factor.
 LogicalResult mlir::loopUnrollJamByFactor(AffineForOp forOp,
                                           uint64_t unrollJamFactor) {
   assert(unrollJamFactor > 0 && "unroll jam factor should be positive");

   std::optional<uint64_t> mayBeConstantTripCount = getConstantTripCount(forOp);
   if (unrollJamFactor == 1) {
     if (mayBeConstantTripCount && *mayBeConstantTripCount == 1 &&
(26 lines not shown; enclosing context: LogicalResult mlir::loopUnrollJamByFactor(AffineForOp forOp,)
   SmallVector<AffineForOp, 4> loopsWithIterArgs;
   forOp.walk([&](AffineForOp aForOp) {
     if (aForOp.getNumIterOperands() > 0)
       loopsWithIterArgs.push_back(aForOp);
   });

   // Get supported reductions to be used for creating reduction ops at the end.
   SmallVector<LoopReduction> reductions;
-  if (forOp.getNumIterOperands() > 0)
+  if (forOp.getNumIterOperands() > 0) {
     getSupportedReductions(forOp, reductions);
+    // Each iterOperand of `forOp` from `reductions` should be a neutral element
+    // for the corresponding reduction kind.
+    replaceIterOperandWithNeutralElement(forOp, reductions);
+  }

   // Generate the cleanup loop if trip count isn't a multiple of
   // unrollJamFactor.
   if (getLargestDivisorOfTripCount(forOp) % unrollJamFactor != 0) {
     // Loops where the lower bound is a max expression or the upper bound is
     // a min expression and the trip count doesn't divide the unroll factor
     // can't be unrolled since the lower bound of the cleanup loop in such cases
     // cannot be expressed as an affine function or a max over affine functions.
(1,586 lines not shown)

mlir/lib/Dialect/Arith/IR/ArithOps.cpp

(2,332 lines not shown; enclosing context: default:)
     break;
   }
   return nullptr;
 }

 /// Returns the identity value associated with an AtomicRMWKind op.
 Value mlir::arith::getIdentityValue(AtomicRMWKind op, Type resultType,
                                     OpBuilder &builder, Location loc) {
-  Attribute attr = getIdentityValueAttr(op, resultType, builder, loc);
+  Type scalarTy = getElementTypeOrSelf(resultType);
+  Attribute attr = getIdentityValueAttr(op, scalarTy, builder, loc);
+  if (scalarTy != resultType) {
+    attr = DenseElementsAttr::get(resultType, attr);
+  }
   return builder.create<arith::ConstantOp>(loc, attr);
 }

+/// Checks if `value` is an identity value associated with an AtomicRMWKind op.
+bool mlir::arith::isIdentityValue(AtomicRMWKind op, Value value) {
+  auto constOp = dyn_cast_or_null<arith::ConstantOp>(value.getDefiningOp());
+  if (!constOp)
+    return false;
+  OpBuilder builder(value.getContext());
+  Type scalarTy = getElementTypeOrSelf(value);
+  Attribute valueAttr =
+      getIdentityValueAttr(op, scalarTy, builder, value.getLoc());
+  if (constOp.getValue().isa<DenseElementsAttr>())
+    return constOp.getValue() ==
+           DenseElementsAttr::get(value.getType(), valueAttr);
+  return constOp.getValue() == valueAttr;
+}
+
 /// Return the value obtained by applying the reduction operation kind
 /// associated with a binary AtomicRMWKind op to `lhs` and `rhs`.
 Value mlir::arith::getReductionOp(AtomicRMWKind op, OpBuilder &builder,
                                   Location loc, Value lhs, Value rhs) {
   switch (op) {
   case AtomicRMWKind::addf:
     return builder.create<arith::AddFOp>(loc, lhs, rhs);
   case AtomicRMWKind::addi:
(41 lines not shown)

mlir/test/Dialect/Affine/unroll-jam.mlir

(452 lines not shown; enclosing context: %0 = affine.for %arg3 = 0 to 21 iter_args(%arg4 = %init) -> (f32) {)
     }
     %2 = arith.mulf %arg4, %1 : f32
     affine.yield %2 : f32
   }
   return
 }

 // CHECK: %[[CONST0:[a-zA-Z0-9_]*]] = arith.constant 20 : index
-// CHECK-NEXT: [[RES:%[0-9]+]]:2 = affine.for %[[IV0:arg[0-9]+]] = 0 to 20 step 2 iter_args([[ACC0:%arg[0-9]+]] = [[INIT0]], [[ACC1:%arg[0-9]+]] = [[INIT0]]) -> (f32, f32) {
+// CHECK-NEXT: [[CST1:%[a-zA-Z0-9_]*]] = arith.constant 1.000000e+00 : f32
+// CHECK-NEXT: [[RES:%[0-9]+]]:2 = affine.for %[[IV0:arg[0-9]+]] = 0 to 20 step 2 iter_args([[ACC0:%arg[0-9]+]] = [[CST1]], [[ACC1:%arg[0-9]+]] = [[CST1]]) -> (f32, f32) {
 // CHECK-NEXT: [[RES1:%[0-9]+]]:2 = affine.for %[[IV1:arg[0-9]+]] = 0 to 30 iter_args([[ACC2:%arg[0-9]+]] = [[INIT1]], [[ACC3:%arg[0-9]+]] = [[INIT1]]) -> (f32, f32) {
 // CHECK-NEXT: [[LOAD1:%[0-9]+]] = affine.load {{.*}}[%[[IV0]], %[[IV1]]]
 // CHECK-NEXT: [[ADD1:%[0-9]+]] = arith.addf [[ACC2]], [[LOAD1]] : f32
 // CHECK-NEXT: %[[INC1:[0-9]+]] = affine.apply [[$MAP_PLUS_1]](%[[IV0]])
 // CHECK-NEXT: [[LOAD2:%[0-9]+]] = affine.load {{.*}}[%[[INC1]], %[[IV1]]]
 // CHECK-NEXT: [[ADD2:%[0-9]+]] = arith.addf [[ACC3]], [[LOAD2]] : f32
 // CHECK-NEXT: affine.yield [[ADD1]], [[ADD2]]
 // CHECK-NEXT: }
 // CHECK-NEXT: [[MUL1:%[0-9]+]] = arith.mulf [[ACC0]], [[RES1]]#0 : f32
 // CHECK-NEXT: affine.apply
 // CHECK-NEXT: [[MUL2:%[0-9]+]] = arith.mulf [[ACC1]], [[RES1]]#1 : f32
 // CHECK-NEXT: affine.yield [[MUL1]], [[MUL2]]
 // CHECK-NEXT: }
 // Reduction op.
 // CHECK-NEXT: [[MUL3:%[0-9]+]] = arith.mulf [[RES]]#0, [[RES]]#1 : f32
 // Cleanup loop (single iteration).
 // CHECK-NEXT: [[RES2:%[0-9]+]] = affine.for %[[IV2:arg[0-9]+]] = 0 to 30 iter_args([[ACC4:%arg[0-9]+]] = [[INIT1]]) -> (f32) {
 // CHECK-NEXT: [[LOAD3:%[0-9]+]] = affine.load {{.*}}[%[[CONST0]], %[[IV2]]]
 // CHECK-NEXT: [[ADD3:%[0-9]+]] = arith.addf [[ACC4]], [[LOAD3]] : f32
 // CHECK-NEXT: affine.yield [[ADD3]] : f32
 // CHECK-NEXT: }
 // CHECK-NEXT: [[MUL4:%[0-9]+]] = arith.mulf [[MUL3]], [[RES2]] : f32
+// CHECK-NEXT: [[MUL5:%[0-9]+]] = arith.mulf [[MUL4]], [[INIT0]] : f32
 // CHECK-NEXT: return

 // CHECK-LABEL: func @unroll_jam_iter_args_addi
 // CHECK-SAME: [[INIT0:%arg[0-9]+]]: i32
 func.func @unroll_jam_iter_args_addi(%arg0: memref<21xi32, 1>, %init : i32) {
   %0 = affine.for %arg3 = 0 to 21 iter_args(%arg4 = %init) -> (i32) {
     %1 = affine.load %arg0[%arg3] : memref<21xi32, 1>
     %2 = arith.addi %arg4, %1 : i32
     affine.yield %2 : i32
   }
   return
 }

 // CHECK: %[[CONST0:[a-zA-Z0-9_]*]] = arith.constant 20 : index
-// CHECK-NEXT: [[RES:%[0-9]+]]:2 = affine.for %[[IV0:arg[0-9]+]] = 0 to 20 step 2 iter_args([[ACC0:%arg[0-9]+]] = [[INIT0]], [[ACC1:%arg[0-9]+]] = [[INIT0]]) -> (i32, i32) {
+// CHECK-NEXT: [[CST0:%[a-zA-Z0-9_]*]] = arith.constant 0 : i32
+// CHECK-NEXT: [[RES:%[0-9]+]]:2 = affine.for %[[IV0:arg[0-9]+]] = 0 to 20 step 2 iter_args([[ACC0:%arg[0-9]+]] = [[CST0]], [[ACC1:%arg[0-9]+]] = [[CST0]]) -> (i32, i32) {
 // CHECK-NEXT: [[LOAD1:%[0-9]+]] = affine.load {{.*}}[%[[IV0]]]
 // CHECK-NEXT: [[ADD1:%[0-9]+]] = arith.addi [[ACC0]], [[LOAD1]] : i32
 // CHECK-NEXT: %[[INC1:[0-9]+]] = affine.apply [[$MAP_PLUS_1]](%[[IV0]])
 // CHECK-NEXT: [[LOAD2:%[0-9]+]] = affine.load {{.*}}[%[[INC1]]]
 // CHECK-NEXT: [[ADD2:%[0-9]+]] = arith.addi [[ACC1]], [[LOAD2]] : i32
 // CHECK-NEXT: affine.yield [[ADD1]], [[ADD2]]
 // CHECK-NEXT: }
 // Reduction op.
 // CHECK-NEXT: [[ADD3:%[0-9]+]] = arith.addi [[RES]]#0, [[RES]]#1 : i32
 // Cleanup loop (single iteration).
 // CHECK-NEXT: [[LOAD3:%[0-9]+]] = affine.load {{.*}}[%[[CONST0]]]
 // CHECK-NEXT: [[ADD4:%[0-9]+]] = arith.addi [[ADD3]], [[LOAD3]] : i32
+// CHECK-NEXT: [[ADD5:%[0-9]+]] = arith.addi [[ADD4]], [[INIT0]] : i32
 // CHECK-NEXT: return
