This is an archive of the discontinued LLVM Phabricator instance.

[mlir][vector] Improve lowering to LLVM for `minf`, `maxf` reductions
ClosedPublic

Authored by unterumarmung on Jul 20 2023, 12:17 PM.

Details

Summary

This patch improves the lowering by changing the target LLVM intrinsics from
reduce.fmax and reduce.fmin,
which have different semantics for handling NaN,
to reduce.fmaximum and reduce.fminimum.
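
For illustration, the difference shows up when the reduced vector contains a NaN. A minimal sketch (the function names and input values are hypothetical; intrinsic semantics per the LLVM LangRef):

// With an input of (1.0, NaN, 2.0, 3.0):
//   @reduce_fmax     returns 3.0 - reduce.fmax has libm-fmax-like semantics and ignores NaN elements
//   @reduce_fmaximum returns NaN - reduce.fmaximum follows IEEE-754 maximum and propagates NaN
func.func @reduce_fmax(%v: vector<4xf32>) -> f32 {
  %0 = llvm.intr.vector.reduce.fmax(%v) : (vector<4xf32>) -> f32
  return %0 : f32
}
func.func @reduce_fmaximum(%v: vector<4xf32>) -> f32 {
  %0 = llvm.intr.vector.reduce.fmaximum(%v) : (vector<4xf32>) -> f32
  return %0 : f32
}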

Fixes #63969

Depends on D155869

Diff Detail

Event Timeline

unterumarmung created this revision. Jul 20 2023, 12:17 PM
Herald added a reviewer: ftynse.
Herald added a reviewer: dcaballe.
Herald added a project: Restricted Project.
unterumarmung requested review of this revision. Jul 20 2023, 12:17 PM

Thanks for looking into this!

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
601–605

IIRC, I added this comment quite some time ago.

// Create lowering of minf/maxf op. We cannot use llvm.maximum/llvm.minimum
// with vector types.

Have you tried replacing these two ops with llvm.maximum/llvm.minimum? Maybe they are more widely supported now.
Otherwise, we would have to propagate the NaNs ourselves here.
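
For context, a minimal sketch of what propagating the NaNs manually around an fcmp + select based max could look like (hypothetical IR, not this patch's actual output; a plain "ogt" compare is false whenever an operand is NaN, so a NaN would otherwise be silently dropped):

func.func @max_propagating_nan(%a: f32, %b: f32) -> f32 {
  // Plain max: picks %b whenever the compare is false, including when %a or %b is NaN.
  %cmp = llvm.fcmp "ogt" %a, %b : f32
  %max = llvm.select %cmp, %a, %b : i1, f32
  // NaN propagation: "uno" is true iff at least one operand is NaN.
  %nan = llvm.fcmp "uno" %a, %b : f32
  %qnan = llvm.mlir.constant(0x7FC00000 : f32) : f32
  %res = llvm.select %nan, %qnan, %max : i1, f32
  return %res : f32
}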

unterumarmung added inline comments. Jul 20 2023, 1:34 PM
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
601–605

Oh, sorry, I'm new to this!
I thought that using the new intrinsics would allow us to delete this NaN-handling boilerplate.

From what I can see in the tests, llvm.minimum and llvm.maximum can be called with vector arguments too.
But I don't understand one thing: why do we need that ability here in the first place? The FCmpOp and SelectOp here operate on scalars, since they take the result of the reduction intrinsic and the accumulator value, both of which are floating-point scalars. So replacing them would result in an llvm.minimum/llvm.maximum intrinsic call with scalar arguments.

Before:

func.func @reduce_fmax_f32(%arg0: vector<16xf32>, %arg1: f32) -> f32 {
  %0 = vector.reduction <maxf>, %arg0, %arg1 : vector<16xf32> into f32
  return %0 : f32
}
// CHECK-LABEL: @reduce_fmax_f32(
// CHECK-SAME: %[[A:.*]]: vector<16xf32>, %[[B:.*]]: f32)
//      CHECK: %[[V:.*]] = llvm.intr.vector.reduce.fmaximum(%[[A]]) : (vector<16xf32>) -> f32
//      CHECK: %[[C0:.*]] = llvm.fcmp "ogt" %[[V]], %[[B]] : f32
//      CHECK: %[[S0:.*]] = llvm.select %[[C0]], %[[V]], %[[B]] : i1, f32
//      CHECK: return %[[S0]] : f32

After:

func.func @reduce_fmax_f32(%arg0: vector<16xf32>, %arg1: f32) -> f32 {
  %0 = vector.reduction <maxf>, %arg0, %arg1 : vector<16xf32> into f32
  return %0 : f32
}
// CHECK-LABEL: @reduce_fmax_f32(
// CHECK-SAME: %[[A:.*]]: vector<16xf32>, %[[B:.*]]: f32)
//      CHECK: %[[V:.*]] = llvm.intr.vector.reduce.fmaximum(%[[A]]) : (vector<16xf32>) -> f32
//      CHECK: %[[C0:.*]] = llvm.intr.maximum(%[[V]], %[[B]]) : (f32, f32) -> f32
//      CHECK: return %[[C0]] : f32

Is this the intended behavior?

dcaballe added inline comments. Jul 20 2023, 2:54 PM
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
601–605

Yes, that would be the intended behavior, thanks! The existing semantics of the fp max/min ops in MLIR match the semantics of maximum/minimum in LLVM. However, when we first implemented this, LLVM's maximum/minimum instructions were not implemented in some of the backends, so we couldn't use them directly. Instead, we generated the fcmp + select with ad-hoc NaN handling, which roughly matches the semantics of the maximum/minimum instructions. Now that they seem more generally supported by the backends, we can use the maximum/minimum ones. This shouldn't have a large impact, at least on x64, where these instructions are lowered to fcmp + select in the backend anyway.

Eventually, we may want to rename arith::fmax/fmin to arith::fmaximum/fminimum and redefine the semantics of the former ones so that they all match LLVM spec. That would also require changing the fmax/fmin reductions in the Vector dialect... Quite some work...
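
For a concrete picture, a hypothetical sketch of how the renamed ops might look (illustrative spellings only; nothing with these names existed at the time of this review):

func.func @renamed_ops(%a: f32, %b: f32) -> (f32, f32) {
  // NaN-propagating, matching LLVM's maximum intrinsic:
  %0 = arith.maximumf %a, %b : f32
  // libm-style, NaN-ignoring, matching LLVM's maxnum intrinsic:
  %1 = arith.maxnumf %a, %b : f32
  return %0, %1 : f32, f32
}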

Happy to answer any other question you may have! Thanks for helping with this!

Address review comments

dcaballe accepted this revision. Jul 21 2023, 1:53 PM

LGTM! Thanks a lot!

This revision is now accepted and ready to land. Jul 21 2023, 1:53 PM

Thank you for providing such a clear explanation! Now I understand that the issue goes beyond just lowering to LLVM IR itself.

I have a few questions about the points you mentioned:

  1. When I searched the repository, I found mentions of llvm.vector.reduce.fmaximum only in tests for the AArch64 and X86 backends. Is that an acceptable level of backend support for merging this kind of change? Perhaps I missed something, or are there fallbacks for other backends?
  2. I agree that fixing the semantics of the operations to match those in LLVM IR is extremely important. However, I'm concerned that there might be many downstream users of these MLIR dialects, such as IREE or possibly even Flang (not exactly a downstream user, but not an in-tree MLIR user either). As far as I know, MLIR has never declared IR stability, but breaking someone's code wouldn't be ideal. How does the community handle such situations? Also, I would like to participate in finding a solution, but I'm unsure if I'll have enough free time. So, please ping me when this comes up :)
dcaballe added a comment (edited). Jul 21 2023, 2:49 PM

> Thank you for providing such a clear explanation! Now I understand that the issue goes beyond just lowering to LLVM IR itself.
>
> I have a few questions about the points you mentioned:
>
>   1. When I searched the repository, I found mentions of llvm.vector.reduce.fmaximum only in tests for the AArch64 and X86 backends. Is that an acceptable level of backend support for merging this kind of change? Perhaps I missed something, or are there fallbacks for other backends?

I had quickly tested x86 and AArch64, but let's run more extensive testing that also includes RISC-V: https://github.com/openxla/iree/pull/14472

>   2. I agree that fixing the semantics of the operations to match those in LLVM IR is extremely important. However, I'm concerned that there might be many downstream users of these MLIR dialects, such as IREE or possibly even Flang (not exactly a downstream user, but not an in-tree MLIR user either). As far as I know, MLIR has never declared IR stability, but breaking someone's code wouldn't be ideal. How does the community handle such situations? Also, I would like to participate in finding a solution, but I'm unsure if I'll have enough free time. So, please ping me when this comes up :)

MLIR allows breaking changes, and in this case they would be well justified. We usually send a PSA to Discourse in advance to let the community know...
I personally don't think I have cycles right now, so you are more than welcome to champion this as time permits! :) Happy to help with questions and reviews if you feel motivated to move this forward!

I think this should be good to go. Tests are passing.

This change causes downstream breakages. On CUDA backends it generates an instruction that is valid only on SM80 and above (and therefore causes a crash on anything below SM80). Posting this here as an FYI. Looking at the patch, this is probably just exposing an issue in the NVPTX backend rather than anything here being the root cause.

It could be that the new intrinsics are not supported/implemented in the NVPTX backend for some cases? I think we can revert this if that is the case and an issue is filed against the NVPTX backend. Otherwise, it will become a blocker for a much larger effort: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671
What do you think @unterumarmung?

I believe that the issue lies within the NVPTX backend, and it should be addressed as a bug. If this change is causing significant issues and there isn't a straightforward solution to the bug, we can definitely consider reverting it. Additionally, I'd like to request @mravishankar to review the RFC mentioned, particularly from the NVPTX backend standpoint. Your insights would be greatly appreciated.

> I believe that the issue lies within the NVPTX backend, and it should be addressed as a bug. If this change is causing significant issues and there isn't a straightforward solution to the bug, we can definitely consider reverting it. Additionally, I'd like to request @mravishankar to review the RFC mentioned, particularly from the NVPTX backend standpoint. Your insights would be greatly appreciated.

I filed this issue on the NVPTX backend: https://github.com/llvm/llvm-project/issues/64606 . I'll take a look at the RFC, but my knowledge of NVPTX is pretty dated at this point. This comes down to whether these instructions are supported or not, and it looks like they are only supported on SM80 and above. If I don't hear back, then instead of reverting the change, maybe refactor it so that the use of llvm.intr.minimum can be controlled by downstream users.

@unterumarmung and @dcaballe based on the discussion here (https://github.com/llvm/llvm-project/issues/64606), this patch either needs to be reverted or needs a way to opt in/out so the issue isn't hit on CUDA backends... Maybe create a separate entry point that populates these patterns, or add a flag to enforce NaN-propagation semantics that is true by default but can be set to false on CUDA backends. This is breaking downstream tests, so a fix sooner rather than later would be appreciated.

It looks like the fix is moving forward now, right?

> It looks like the fix is moving forward now, right?

Actually, it seems like we have to make this opt-in somehow. It is failing on CUDA architectures lower than sm_80, and that doesn't seem easy to fix. I was going to update the bug to say so. Either we need to revert or make this transformation opt-in.

> It looks like the fix is moving forward now, right?
>
> Actually, it seems like we have to make this opt-in somehow. It is failing on CUDA architectures lower than sm_80, and that doesn't seem easy to fix. I was going to update the bug to say so. Either we need to revert or make this transformation opt-in.

If a workaround is required at the MLIR level, we should add expansion patterns for these ops to Arith/Transforms/ExpandOps.cpp, where we can introduce a compare + select for the NaN part and another compare + select for the +-0.0 part. However, these ops are first-class instructions/intrinsics in LLVM, so they should be supported one way or another by the backends.
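
For illustration, a minimal sketch of such an expansion for the max case (hypothetical IR in the arith dialect, not the actual ExpandOps pattern):

func.func @expand_maxf(%a: f32, %b: f32) -> f32 {
  // NaN part: "ugt" is true for a NaN %a, and the extra select catches a NaN %b,
  // so a NaN on either side propagates.
  %cmp = arith.cmpf ugt, %a, %b : f32
  %sel = arith.select %cmp, %a, %b : f32
  %bnan = arith.cmpf uno, %b, %b : f32
  %m = arith.select %bnan, %b, %sel : f32
  // +-0.0 part: max(+0.0, -0.0) must be +0.0, but cmpf treats the two zeros as
  // equal. When both inputs are zeros, compare their bit patterns as signed
  // integers instead: +0.0 is 0x00000000 and -0.0 is 0x80000000, so the signed
  // max picks +0.0.
  %zero = arith.constant 0.0 : f32
  %az = arith.cmpf oeq, %a, %zero : f32
  %bz = arith.cmpf oeq, %b, %zero : f32
  %bothz = arith.andi %az, %bz : i1
  %ai = arith.bitcast %a : f32 to i32
  %bi = arith.bitcast %b : f32 to i32
  %imax = arith.maxsi %ai, %bi : i32
  %zmax = arith.bitcast %imax : i32 to f32
  %res = arith.select %bothz, %zmax, %m : f32
  return %res : f32
}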