This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
Closed, Public

Authored by spatel on May 24 2019, 11:34 AM.


Details

Reviewers

arsenm
efriedma
lebedev.ri
nikic
cameron.mcinally
craig.topper

Commits

rG706b48251f6a: [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
rL364721: [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics

Summary

This is the opposite direction of D62158 (we have to choose one form or the other). Now that we have fast-math flags (FMF) on the select, this becomes more palatable. The benefits of having a single IR instruction for this operation (fewer chances of missing folds because of extra uses, etc.) outweigh my previous comments about the potential advantage of larger pattern matching/analysis. I'll abandon the other patch if there's general agreement.
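As a rough illustration of why the fold is gated on 'nnan', here is a minimal C++ sketch rather than anything from the patch. The helper names `select_min` and `minnum_model` are hypothetical; `select_min` stands in for `select (fcmp olt a, b), a, b`, and `minnum_model` uses libm's `fmin`, which the LangRef describes as the behavior that llvm.minnum matches.

```cpp
#include <cmath>

// Hypothetical helper: models `select (fcmp olt a, b), a, b`.
// An ordered compare is false when either operand is NaN, so b is chosen.
float select_min(float a, float b) { return a < b ? a : b; }

// Hypothetical helper: models llvm.minnum through libm's fmin, which
// returns the non-NaN operand when exactly one input is NaN.
float minnum_model(float a, float b) { return std::fmin(a, b); }
```

For ordinary inputs such as 1.0f and 2.0f the two forms agree. With a NaN in the second operand, `select_min(1.0f, NAN)` yields NaN while `minnum_model(1.0f, NAN)` yields 1.0f, so the two forms are only interchangeable when NaN inputs are excluded.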

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. May 24 2019, 11:34 AM

Herald added a project: Restricted Project. May 24 2019, 11:34 AM

Herald added subscribers: hiraditya, wdng, mcrosier.

Ping.

spatel mentioned this in D51701: ValueTracking: Report fast math flags for fcmp/select. Jun 5 2019, 9:53 AM

Ping * 2.

cameron.mcinally added inline comments. Jun 11 2019, 7:37 AM

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	Do we want NSZ here? I see that this was an artifact of the old fcmp+select lowering, but I'm not sure if it is correct. Definitely not IEEE-754 compliant...

cameron.mcinally added inline comments. Jun 11 2019, 7:40 AM

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	Now that I think about it, maybe it would be better to skip the fcmp+sel step and lower the libm call directly into the intrinsic?

spatel marked an inline comment as done. Jun 11 2019, 8:15 AM

spatel added inline comments.

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	Can you explain or show an example of the compliance problem? Reference for our current docs: http://llvm.org/docs/LangRef.html#llvm-minnum-intrinsic

cameron.mcinally added inline comments. Jun 11 2019, 8:41 AM

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	Ah, right. This is ok in IEEE-754 2008. My copy of IEEE-754 2018 (a draft) says:

"minimumNumber(x, y) is x if x < y, y if y < x, and the number if one operand is a number and the other is a NaN. For this operation, −0 compares less than +0. [emphasis added] If x = y and signs are the same it is either x or y. If both operands are NaNs, a quiet NaN is returned, according to 6.2. If either operand is a signaling NaN, an invalid operation exception is signaled, but unless both operands are NaNs, the signaling NaN is otherwise ignored and not converted to a quiet NaN as stated in 6.2 for other operations."

So adding NSZ there wouldn't be correct for the min(-0,+0) and friends cases.
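The quoted minimumNumber rules can be sketched in C++ as follows. This is only an illustration of the text above, not code from the patch or from any library: the function name `minimum_number` is hypothetical, and signaling NaNs and floating-point exceptions are not modeled.

```cpp
#include <cmath>

// Hypothetical sketch of the quoted minimumNumber rules.
// Signaling-NaN behavior and exception flags are deliberately not modeled.
double minimum_number(double x, double y) {
  if (std::isnan(x)) return y;  // one NaN: return the number; both NaN: a NaN
  if (std::isnan(y)) return x;
  if (x < y) return x;
  if (y < x) return y;
  // x == y here. The only case where the signs can differ is -0 vs +0,
  // and for this operation -0 compares less than +0.
  return std::signbit(x) ? x : y;
}
```

Under this sketch both `minimum_number(0.0, -0.0)` and `minimum_number(-0.0, 0.0)` return -0.0, which is the sign-of-zero guarantee that an 'nsz' flag would waive.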

spatel marked an inline comment as done. Jun 11 2019, 8:53 AM

spatel added inline comments.

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	I might still be missing some subtlety. My interpretation of the LangRef text (and so also IEEE-754 2008) is that 'nsz' ("Allow optimizations to treat the sign of a zero argument or result as insignificant") is implicit in the definition of minnum/maxnum. By explicitly propagating the 'nsz' in this case, we are future-proofing the behavior even for the new standard. The original fcmp+select said that sign of zero is a 'don't care', and we want the intrinsic call to have that same freedom.
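To make the "sign of zero is a don't care" point concrete, here is a small C++ sketch under the assumption that `a < b ? a : b` behaves like `select (fcmp olt a, b), a, b`. The helper names are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: models `select (fcmp olt a, b), a, b`.
// -0.0 and +0.0 compare equal, so the compare is false and b is chosen.
double select_min(double a, double b) { return a < b ? a : b; }

// True when r is a zero with the sign bit set, i.e. -0.0.
bool is_negative_zero(double r) { return r == 0.0 && std::signbit(r); }
```

Here `select_min(-0.0, 0.0)` returns +0.0 while `select_min(0.0, -0.0)` returns -0.0, so the sign of a zero result depends only on operand order. Treating that sign as insignificant is the freedom that 'nsz' on the select expresses and that the intrinsic call is meant to inherit.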

LGTM

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	Yeah, I agree your new code is fine. I was questioning whether we should be adding NSZ during the @fmin(...) lowering, since the NSZ isn't on the original fmin(...) call. But now I don't see anything that requires @fmin*(...) to honor the sign of a zero in the Standards or libm. So this is fine too. Sorry for the noise...

This revision is now accepted and ready to land. Jun 11 2019, 9:17 AM

spatel marked an inline comment as done. Jun 11 2019, 9:23 AM

spatel added inline comments.

llvm/test/Transforms/InstCombine/fast-math.ll
884 ↗	(On Diff #201295)	Ah, I see now. Not noise at all - these are excellent and valid points. It goes back to your other comment - we could be transforming the fmin call directly to the intrinsic in these examples. So that confusion was just caused by me being lazy and not creating minimal tests for this patch. Let me do that, then I'll try to fix up that existing transform.

spatel mentioned this in D63214: [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum. Jun 12 2019, 9:02 AM

spatel mentioned this in D62158: [InstCombine] canonicalize minnum/maxnum with 'nnan' to fcmp+select. Jun 21 2019, 11:35 AM

spatel mentioned this in rL364714: [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum. Jun 29 2019, 7:32 AM

spatel mentioned this in rG77dc1e85683c: [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum.

Closed by commit rL364721: [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics (authored by spatel). Jun 30 2019, 6:42 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/InstCombine/InstCombineSelect.cpp	13 lines
llvm/trunk/test/Transforms/InstCombine/minmax-fp.ll	20 lines

Diff 207219

llvm/trunk/lib/Transforms/InstCombine/InstCombineSelect.cpp

     if (SelectPatternResult::isMinOrMax(SPF)) {
       if (Instruction *I = moveAddAfterMinMax(SPF, LHS, RHS, Builder))
         return I;

       if (Instruction *I = factorizeMinMaxTree(SPF, LHS, RHS, Builder))
         return I;
     }
   }

+  // Canonicalize select of FP values where NaN and -0.0 are not valid as
+  // minnum/maxnum intrinsics.
+  if (isa<FPMathOperator>(SI) && SI.hasNoNaNs() && SI.hasNoSignedZeros()) {
+    Value *X, *Y;
+    if (match(&SI, m_OrdFMax(m_Value(X), m_Value(Y))))
+      return replaceInstUsesWith(
+          SI, Builder.CreateBinaryIntrinsic(Intrinsic::maxnum, X, Y, &SI));
+
+    if (match(&SI, m_OrdFMin(m_Value(X), m_Value(Y))))
+      return replaceInstUsesWith(
+          SI, Builder.CreateBinaryIntrinsic(Intrinsic::minnum, X, Y, &SI));
+  }
+
   // See if we can fold the select into a phi node if the condition is a select.
   if (auto *PN = dyn_cast<PHINode>(SI.getCondition()))
     // The true/false values have to be live in the PHI predecessor's blocks.
     if (canSelectOperandBeMappingIntoPredBlock(TrueVal, SI) &&
         canSelectOperandBeMappingIntoPredBlock(FalseVal, SI))
       if (Instruction *NV = foldOpIntoPhi(SI, PN))
         return NV;
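The gating logic of the added hunk can be modeled outside LLVM with a toy sketch. Everything here is a hypothetical stand-in rather than an LLVM type or API: `Pred`, `SelectModel`, `Result`, and `canonicalize` are invented for illustration, and the model covers only the unswapped shape `select (fcmp pred X, Y), X, Y` with the fast-math flags read from the select itself.

```cpp
// Toy model only; none of these names are LLVM APIs.
enum class Pred { OGT, OGE, OLT, OLE, Other };
enum class Result { None, MaxNum, MinNum };

// Models `select (fcmp pred X, Y), X, Y` with flags taken from the select.
struct SelectModel {
  Pred pred;
  bool nnan;
  bool nsz;
};

Result canonicalize(const SelectModel &s) {
  // Both flags are required, mirroring hasNoNaNs() && hasNoSignedZeros().
  if (!s.nnan || !s.nsz)
    return Result::None;
  switch (s.pred) {
  case Pred::OGT:
  case Pred::OGE:
    return Result::MaxNum;  // ordered max pattern
  case Pred::OLT:
  case Pred::OLE:
    return Result::MinNum;  // ordered min pattern
  default:
    return Result::None;
  }
}
```

In this model a select carrying only 'nsz' is left alone, while one carrying both 'nnan' and 'nsz' over an ogt or oge compare maps to the maxnum result.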

llvm/trunk/test/Transforms/InstCombine/minmax-fp.ll

 ;
   %n2 = fneg double %y
   %cond = fcmp nsz nnan ule double %n1, %n2
   %max = select i1 %cond, double %n1, double %n2
   ret double %max
 }

 define float @maxnum_ogt_fmf_on_select(float %a, float %b) {
 ; CHECK-LABEL: @maxnum_ogt_fmf_on_select(
-; CHECK-NEXT:    [[COND:%.*]] = fcmp ogt float [[A:%.*]], [[B:%.*]]
-; CHECK-NEXT:    [[F:%.*]] = select nnan nsz i1 [[COND]], float [[A]], float [[B]]
-; CHECK-NEXT:    ret float [[F]]
+; CHECK-NEXT:    [[TMP1:%.*]] = call nnan nsz float @llvm.maxnum.f32(float [[A:%.*]], float [[B:%.*]])
+; CHECK-NEXT:    ret float [[TMP1]]
 ;
   %cond = fcmp ogt float %a, %b
   %f = select nnan nsz i1 %cond, float %a, float %b
   ret float %f
 }

 define <2 x float> @maxnum_oge_fmf_on_select(<2 x float> %a, <2 x float> %b) {
 ; CHECK-LABEL: @maxnum_oge_fmf_on_select(
-; CHECK-NEXT:    [[COND:%.*]] = fcmp oge <2 x float> [[A:%.*]], [[B:%.*]]
-; CHECK-NEXT:    [[F:%.*]] = select nnan ninf nsz <2 x i1> [[COND]], <2 x float> [[A]], <2 x float> [[B]]
-; CHECK-NEXT:    ret <2 x float> [[F]]
+; CHECK-NEXT:    [[TMP1:%.*]] = call nnan ninf nsz <2 x float> @llvm.maxnum.v2f32(<2 x float> [[A:%.*]], <2 x float> [[B:%.*]])
+; CHECK-NEXT:    ret <2 x float> [[TMP1]]
 ;
   %cond = fcmp oge <2 x float> %a, %b
   %f = select ninf nnan nsz <2 x i1> %cond, <2 x float> %a, <2 x float> %b
   ret <2 x float> %f
 }

 define float @maxnum_ogt_fmf_on_fcmp(float %a, float %b) {
 ; CHECK-LABEL: @maxnum_ogt_fmf_on_fcmp(
 [36 lines hidden]
 ;
   %cond = fcmp oge float %a, %b
   %f = select nsz i1 %cond, float %a, float %b
   ret float %f
 }

 define float @minnum_olt_fmf_on_select(float %a, float %b) {
 ; CHECK-LABEL: @minnum_olt_fmf_on_select(
-; CHECK-NEXT:    [[COND:%.*]] = fcmp olt float [[A:%.*]], [[B:%.*]]
-; CHECK-NEXT:    [[F:%.*]] = select nnan nsz i1 [[COND]], float [[A]], float [[B]]
-; CHECK-NEXT:    ret float [[F]]
+; CHECK-NEXT:    [[TMP1:%.*]] = call nnan nsz float @llvm.minnum.f32(float [[A:%.*]], float [[B:%.*]])
+; CHECK-NEXT:    ret float [[TMP1]]
 ;
   %cond = fcmp olt float %a, %b
   %f = select nnan nsz i1 %cond, float %a, float %b
   ret float %f
 }

 define <2 x float> @minnum_ole_fmf_on_select(<2 x float> %a, <2 x float> %b) {
 ; CHECK-LABEL: @minnum_ole_fmf_on_select(
-; CHECK-NEXT:    [[COND:%.*]] = fcmp ole <2 x float> [[A:%.*]], [[B:%.*]]
-; CHECK-NEXT:    [[F:%.*]] = select nnan ninf nsz <2 x i1> [[COND]], <2 x float> [[A]], <2 x float> [[B]]
-; CHECK-NEXT:    ret <2 x float> [[F]]
+; CHECK-NEXT:    [[TMP1:%.*]] = call nnan ninf nsz <2 x float> @llvm.minnum.v2f32(<2 x float> [[A:%.*]], <2 x float> [[B:%.*]])
+; CHECK-NEXT:    ret <2 x float> [[TMP1]]
 ;
   %cond = fcmp ole <2 x float> %a, %b
   %f = select ninf nnan nsz <2 x i1> %cond, <2 x float> %a, <2 x float> %b
   ret <2 x float> %f
 }

 define float @minnum_olt_fmf_on_fcmp(float %a, float %b) {
 ; CHECK-LABEL: @minnum_olt_fmf_on_fcmp(
