This is an archive of the discontinued LLVM Phabricator instance.

[mlir] properly support min/max in affine parallelization
Closed · Public

Authored by ftynse on Dec 7 2020, 6:45 AM.

Details

Reviewers

wsmoses
chelini
nicolasvasilache

Commits

rG2fe30a3534da: [mlir] properly support min/max in affine parallelization

Summary

The existing implementation of affine parallelization silently copies the
lower and upper bound maps from affine.for to affine.parallel. However, the
semantics of these maps differ between the two ops: in affine.for, the max (min)
of the results is taken as the lower (upper) bound; in affine.parallel, multiple
induction variables can be defined, and each result corresponds to one induction
variable. Thus the existing implementation could generate invalid IR, or IR that
passes the verifier but has different semantics from the original code. Fix the
parallelization utility to emit dedicated min/max operations before the
affine.parallel in such cases. Disallow parallelization if the min/max would have
to be placed in an operation without the AffineScope trait, e.g., in another loop,
since the results of these operations are not considered valid affine dimension
identifiers and may not be handled properly by the affine analyses.
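
The difference in semantics described above can be illustrated with a small standalone model. This is only a sketch and does not use any MLIR API: `BoundResults`, `forLowerBound`, `forUpperBound`, `tripCount` and `parallelIterationCount` are hypothetical names, and a bound map is assumed to be already evaluated to a list of integers, with unit step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the results of a bound map after it has been
// evaluated on concrete operands. This is not an MLIR type.
using BoundResults = std::vector<int64_t>;

// affine.for reading: the lower bound is the maximum of all map results.
int64_t forLowerBound(const BoundResults &results) {
  assert(!results.empty() && "a bound map has at least one result");
  return *std::max_element(results.begin(), results.end());
}

// affine.for reading: the upper bound is the minimum of all map results.
int64_t forUpperBound(const BoundResults &results) {
  assert(!results.empty() && "a bound map has at least one result");
  return *std::min_element(results.begin(), results.end());
}

// Number of iterations of a unit-step loop over the half-open range [lb, ub).
int64_t tripCount(int64_t lb, int64_t ub) {
  return std::max<int64_t>(0, ub - lb);
}

// affine.parallel reading of the same maps, as described in the summary:
// result i bounds induction variable i, so the iteration space is the
// product of the per-variable ranges.
int64_t parallelIterationCount(const BoundResults &lbs,
                               const BoundResults &ubs) {
  assert(lbs.size() == ubs.size() && "one lower and one upper bound per IV");
  int64_t count = 1;
  for (std::size_t i = 0; i < lbs.size(); ++i)
    count *= tripCount(lbs[i], ubs[i]);
  return count;
}
```

With lower-bound results {2, 5} and upper-bound results {10, 8}, the affine.for reading iterates over [5, 8), whereas copying the maps verbatim and reading them as two induction variables gives the ranges [2, 10) and [5, 8). Collapsing each bound to a single value first, which is what the dedicated min/max operations do, restores the original iteration space.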

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ftynse created this revision. · Dec 7 2020, 6:45 AM

Herald added subscribers: teijeong, rdzhabarov, tatianashp and 14 others. · Dec 7 2020, 6:45 AM

ftynse requested review of this revision. · Dec 7 2020, 6:45 AM

Herald added a reviewer: nicolasvasilache. · Dec 7 2020, 6:45 AM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

ftynse added a child revision: D92765: [mlir] Add an option to control the number of loops in affine parallelizer. · Dec 7 2020, 7:18 AM

Harbormaster completed remote builds in B81293: Diff 309904. · Dec 7 2020, 7:32 AM

wsmoses requested changes to this revision. · Dec 7 2020, 12:38 PM

wsmoses added inline comments.

mlir/lib/Dialect/Affine/Utils/Utils.cpp
145	While seeming correct, I am a bit worried that checking for num results > 1 here won't be equivalent to checking if there's a max (or min) in the future if AffineForOp is extended, resulting in this code getting overlooked.

This revision now requires changes to proceed. · Dec 7 2020, 12:38 PM

ftynse added inline comments. · Dec 7 2020, 1:28 PM

mlir/lib/Dialect/Affine/Utils/Utils.cpp
145	This has been a part of the op's semantics since it was created - https://mlir.llvm.org/docs/Dialects/Affine/#affinefor-affineforop - there's no other way of checking for this (well, unless you want me to introduce the condition in the op itself and update the existing users).

wsmoses accepted this revision. · Dec 7 2020, 2:10 PM

This revision is now accepted and ready to land. · Dec 7 2020, 2:10 PM

wsmoses added inline comments. · Dec 7 2020, 2:12 PM

mlir/lib/Dialect/Affine/Utils/Utils.cpp
145	I think that may be useful for clarity to add a helper wrapper to AffineForOp, but not necessary for this patch.
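
The helper discussed in this thread was not part of the patch. As a rough illustration of what such a wrapper and the surrounding decision could look like, here is a standalone sketch. Everything in it is hypothetical (`LoopBoundsModel`, `hasMaxLowerBound`, `hasMinUpperBound`, `ParallelizeAction`, `planParallelization`); it models only the result counts of the bound maps and a flag for whether the parent op has the AffineScope trait, and it does not use the MLIR API.

```cpp
#include <cstddef>

// Hypothetical model of an affine.for: only the number of results in each
// bound map matters for this decision.
struct LoopBoundsModel {
  std::size_t numLowerBoundResults;
  std::size_t numUpperBoundResults;
};

// A lower bound with more than one result means an implicit 'max'.
bool hasMaxLowerBound(const LoopBoundsModel &loop) {
  return loop.numLowerBoundResults > 1;
}

// An upper bound with more than one result means an implicit 'min'.
bool hasMinUpperBound(const LoopBoundsModel &loop) {
  return loop.numUpperBoundResults > 1;
}

enum class ParallelizeAction {
  CopyBounds, // Single-result bounds: maps can be reused as they are.
  EmitMinMax, // Materialize affine.max/affine.min before the parallel loop.
  Skip        // min/max would land outside an AffineScope: do not parallelize.
};

// Mirrors the control flow of the patch under the assumptions stated above.
ParallelizeAction planParallelization(const LoopBoundsModel &loop,
                                      bool parentIsAffineScope) {
  if (!hasMaxLowerBound(loop) && !hasMinUpperBound(loop))
    return ParallelizeAction::CopyBounds;
  if (!parentIsAffineScope)
    return ParallelizeAction::Skip;
  return ParallelizeAction::EmitMinMax;
}
```

Naming the two predicates keeps the "more than one result" convention in one place, which is the concern raised in the first inline comment.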

Closed by commit rG2fe30a3534da: [mlir] properly support min/max in affine parallelization (authored by ftynse). · Dec 8 2020, 1:43 AM

This revision was automatically updated to reflect the committed changes.

ftynse added a commit: rG2fe30a3534da: [mlir] properly support min/max in affine parallelization.

Revision Contents

Path	Size
mlir/lib/Dialect/Affine/Utils/Utils.cpp	38 lines
mlir/test/Dialect/Affine/parallelize.mlir	30 lines

Diff 310108

mlir/lib/Dialect/Affine/Utils/Utils.cpp

@@ static AffineIfOp hoistAffineIfOp(AffineIfOp ifOp, Operation *hoistOverOp) {
   return hoistedIfOp;
 }

 /// Replace affine.for with a 1-d affine.parallel and clone the former's body
 /// into the latter while remapping values.
 void mlir::affineParallelize(AffineForOp forOp) {
   Location loc = forOp.getLoc();
   OpBuilder outsideBuilder(forOp);
+
+  // If a loop has a 'max' in the lower bound, emit it outside the parallel loop
+  // as it does not have implicit 'max' behavior.
+  AffineMap lowerBoundMap = forOp.getLowerBoundMap();
+  ValueRange lowerBoundOperands = forOp.getLowerBoundOperands();
+  AffineMap upperBoundMap = forOp.getUpperBoundMap();
+  ValueRange upperBoundOperands = forOp.getUpperBoundOperands();
+
+  bool needsMax = lowerBoundMap.getNumResults() > 1;
+  bool needsMin = upperBoundMap.getNumResults() > 1;
+  AffineMap identityMap;
+  if (needsMax || needsMin) {
+    if (forOp->getParentOp() &&
+        !forOp->getParentOp()->hasTrait<OpTrait::AffineScope>())
+      return;
+
+    identityMap = AffineMap::getMultiDimIdentityMap(1, loc->getContext());
+  }
+  if (needsMax) {
+    auto maxOp = outsideBuilder.create<AffineMaxOp>(loc, lowerBoundMap,
+                                                    lowerBoundOperands);
+    lowerBoundMap = identityMap;
+    lowerBoundOperands = maxOp->getResults();
+  }
+
+  // Same for the upper bound.
+  if (needsMin) {
+    auto minOp = outsideBuilder.create<AffineMinOp>(loc, upperBoundMap,
+                                                    upperBoundOperands);
+    upperBoundMap = identityMap;
+    upperBoundOperands = minOp->getResults();
+  }
+
   // Creating empty 1-D affine.parallel op.
   AffineParallelOp newPloop = outsideBuilder.create<AffineParallelOp>(
-      loc, llvm::None, llvm::None, forOp.getLowerBoundMap(),
-      forOp.getLowerBoundOperands(), forOp.getUpperBoundMap(),
-      forOp.getUpperBoundOperands());
+      loc, llvm::None, llvm::None, lowerBoundMap, lowerBoundOperands,
+      upperBoundMap, upperBoundOperands);
   // Steal the body of the old affine for op and erase it.
   newPloop.region().takeBody(forOp.region());
   forOp.erase();
 }

 // Returns success if any hoisting happened.
 LogicalResult mlir::hoistAffineIfOp(AffineIfOp ifOp, bool *folded) {
   // Bail out early if the ifOp returns a result. TODO: Consider how to

mlir/test/Dialect/Affine/parallelize.mlir

 func @non_affine_load() {
   %0 = alloc() : memref<100 x f32>
   affine.for %i = 0 to 100 {
     // CHECK: affine.for %{{.*}} = 0 to 100 {
     load %0[%i] : memref<100 x f32>
   }
   return
 }
+
+// CHECK-LABEL: for_with_minmax
+func @for_with_minmax(%m: memref<?xf32>, %lb0: index, %lb1: index,
+                      %ub0: index, %ub1: index) {
+  // CHECK: %[[lb:.*]] = affine.max
+  // CHECK: %[[ub:.*]] = affine.min
+  // CHECK: affine.parallel (%{{.*}}) = (%[[lb]]) to (%[[ub]])
+  affine.for %i = max affine_map<(d0, d1) -> (d0, d1)>(%lb0, %lb1)
+      to min affine_map<(d0, d1) -> (d0, d1)>(%ub0, %ub1) {
+    affine.load %m[%i] : memref<?xf32>
+  }
+  return
+}
+
+// CHECK-LABEL: nested_for_with_minmax
+func @nested_for_with_minmax(%m: memref<?xf32>, %lb0: index,
+                             %ub0: index, %ub1: index) {
+  // CHECK: affine.parallel
+  affine.for %j = 0 to 10 {
+    // Cannot parallelize the inner loop because we would need to compute
+    // affine.max for its lower bound inside the loop, and that is not (yet)
+    // considered as a valid affine dimension.
+    // CHECK: affine.for
+    affine.for %i = max affine_map<(d0, d1) -> (d0, d1)>(%lb0, %j)
+        to min affine_map<(d0, d1) -> (d0, d1)>(%ub0, %ub1) {
+      affine.load %m[%i] : memref<?xf32>
+    }
+  }
+  return
+}