This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] fold 'fneg undef' to undef
ClosedPublic

Authored by spatel on May 3 2019, 8:37 AM.

Details

Reviewers

cameron.mcinally
arsenm
efriedma
craig.topper

Commits

rZORG5e534c398315: [SelectionDAG] fold 'fneg undef' to undef
rZORG1dbcd69d0332: [SelectionDAG] fold 'fneg undef' to undef
rG5e534c398315: [SelectionDAG] fold 'fneg undef' to undef
rG1dbcd69d0332: [SelectionDAG] fold 'fneg undef' to undef
rG902b3ecdad89: [SelectionDAG] fold 'fneg undef' to undef
rL360296: [SelectionDAG] fold 'fneg undef' to undef

Summary

This is extracted from the original draft of D61419 with some additional tests.
We don't currently get this in IR (it's conservatively turned into a NaN), but presumably that'll get updated as we add real IR support for 'fneg' rather than 'fsub -0.0, x'.

The x86-32 run shows the following, and I haven't looked further to see why, but that seems to be independent:
Legalizing: t1: f32 = undef
Trying to expand node
Creating fp constant: t4: f32 = ConstantFP<0.000000e+00>

Diff Detail

Event Timeline

spatel created this revision. May 3 2019, 8:37 AM

Herald added a project: Restricted Project. May 3 2019, 8:37 AM

Herald added subscribers: hiraditya, wdng, mcrosier.

arsenm added inline comments. May 3 2019, 8:40 AM

llvm/test/CodeGen/X86/vec_fneg.ll
52	It seems problematic that the DAG lowering is still producing an fneg for this

spatel marked an inline comment as done. May 3 2019, 9:46 AM

spatel added inline comments.

llvm/test/CodeGen/X86/vec_fneg.ll
52	I'm not following. Is there some place before/after getNode() that we should also fix?

arsenm added inline comments. May 3 2019, 9:50 AM

llvm/test/CodeGen/X86/vec_fneg.ll
52	I mean I don't expect fsub -0, x to be equivalent to fneg x anymore. This test shouldn't be hitting this path, and should be using the fneg instruction?

spatel marked an inline comment as done. May 3 2019, 10:12 AM

spatel added inline comments.

llvm/test/CodeGen/X86/vec_fneg.ll
52	Ah, I see. So we need to get our patches in order. I don't think we're ready to pull the plug on SDAG converting 'fsub -0.0, x' to fneg yet because we don't have that canonicalization in IR yet, but let me know if I'm wrong. Either way, I should have added tests with fneg in the IR, so we don't lose coverage when we do flip that switch.

arsenm added inline comments. May 3 2019, 10:16 AM

llvm/test/CodeGen/X86/vec_fneg.ll
52	Yes, I assumed this compatibility hack was still in here somewhere, but we need to start adding tests for the pure fneg

mcberg2017 added a subscriber: mcberg2017. May 3 2019, 10:18 AM

Patch updated:
Rebased after adding tests with the real fneg IR instructions (currently, results in the same test diffs).

cameron.mcinally added inline comments. May 3 2019, 4:23 PM

llvm/test/CodeGen/X86/vec_fneg.ll
52	> I mean I don't expect fsub -0, x to be equivalent to fneg x anymore. This test shouldn't be hitting this path, and should be using the fneg instruction?

That's actually OK to do. What isn't OK is FNEG(X) -> FSUB(-0.0, X). FNEG(X) has clearly defined outputs for some edge cases, e.g. NaNs. FSUB(-0.0, X) does not.

spatel marked an inline comment as done. May 6 2019, 6:49 AM

spatel added inline comments.

llvm/test/CodeGen/X86/vec_fneg.ll
52	I suspect the subtlety of the NaN behavior difference is unknown to, or forgotten by, most people. Should I add a blurb to the LangRef and/or SDAG node code comments about that?

cameron.mcinally added inline comments. May 6 2019, 8:03 AM

llvm/test/CodeGen/X86/vec_fneg.ll
52	Sure, or I can do it. I have some time to work on LLVM-specific projects right now.

The problematic case is X=+/-NaN. This only applies to FNEG(X) -> FSUB(-0.0, X) transforms, since IEEE-754 does not specify the sign bit of a NaN result for FSUB(-0.0, +/-NaN). IEEE-754 does specify the sign bit for FNEG(+/-NaN).

That said, IIRC, some architectures make mistakes in practice with FSUB where X=+/-0.0, even though that case is well defined by IEEE-754. I can brush up on this case if you'd like more detail.

I'll also add that FSUB(-0.0, X) -> FNEG(X) may not be safe for the constrained intrinsics when a rounding mode is in effect. I haven't studied that closely enough yet, but I've seen enough verbiage in IEEE-754 to know I should be worried about it.
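To make the NaN and signed-zero cases discussed in this comment concrete, here is a minimal standalone C++ sketch. It is not LLVM code: `bitsOf`, `fnegBits`, and `fsubNegZero` are hypothetical helper names, and the sketch assumes that `float` is IEEE-754 binary32 and that the default round-to-nearest mode is active. It models fneg as a pure sign-bit flip and compares it with the arithmetic form `-0.0 - x`. The two agree bit for bit on ordinary values and on both zeros; for a NaN input only the sign-bit flip has a predictable sign, so the sketch deliberately makes no claim about the sign of the subtraction's NaN result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is IEEE-754 binary32.
static_assert(std::numeric_limits<float>::is_iec559, "IEEE-754 float required");

// Reinterpret a float's storage as a 32-bit integer (no value conversion).
inline std::uint32_t bitsOf(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Hypothetical model of fneg: flip only the sign bit, whatever the payload.
inline float fnegBits(float x) {
  std::uint32_t u = bitsOf(x) ^ 0x80000000u;
  float r;
  std::memcpy(&r, &u, sizeof r);
  return r;
}

// Arithmetic form: -0.0 - x. The volatile locals keep the compiler from
// folding the subtraction at compile time. The sign of a NaN result is
// intentionally not relied upon anywhere.
inline float fsubNegZero(float x) {
  volatile float negZero = -0.0f;
  volatile float operand = x;
  return negZero - operand;
}
```

Comparing `bitsOf(fnegBits(x))` with `bitsOf(fsubNegZero(x))` for `x = +0.0f` and `x = -0.0f` shows why the constant has to be -0.0 rather than +0.0: with +0.0 the result for `x = +0.0f` would be +0.0, not -0.0.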

spatel marked an inline comment as done. May 6 2019, 8:19 AM

spatel added inline comments.

llvm/test/CodeGen/X86/vec_fneg.ll
52	Great - please go ahead with additional examples/docs. IMO, we can always use more of that. Anything left to do with this patch?

cameron.mcinally marked an inline comment as done. May 6 2019, 8:39 AM

cameron.mcinally added inline comments.

llvm/test/CodeGen/X86/vec_fneg.ll
52	> Great - please go ahead with additional examples/docs. IMO, we can always use more of that.

No problem. On my TODO list.

> Anything left to do with this patch?

If there are no other objections, this LGTM.

This revision is now accepted and ready to land. May 8 2019, 5:37 AM

Closed by commit rL360296: [SelectionDAG] fold 'fneg undef' to undef (authored by spatel). May 8 2019, 3:18 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp	4 lines
llvm/test/CodeGen/X86/vec_fneg.ll	53 lines

Diff 198016

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

 case ISD::SCALAR_TO_VECTOR:
   // scalar_to_vector(extract_vector_elt V, 0) -> V, top bits are undefined.
   if (OpOpcode == ISD::EXTRACT_VECTOR_ELT &&
       isa<ConstantSDNode>(Operand.getOperand(1)) &&
       Operand.getConstantOperandVal(1) == 0 &&
       Operand.getOperand(0).getValueType() == VT)
     return Operand.getOperand(0);
   break;
 case ISD::FNEG:
+  // Negation of an unknown bag of bits is still completely undefined.
+  if (OpOpcode == ISD::UNDEF)
+    return getUNDEF(VT);
+
   // -(X-Y) -> (Y-X) is unsafe because when X==Y, -0.0 != +0.0
   if ((getTarget().Options.UnsafeFPMath || Flags.hasNoSignedZeros()) &&
       OpOpcode == ISD::FSUB)
     return getNode(ISD::FSUB, DL, VT, Operand.getOperand(1),
                    Operand.getOperand(0), Flags);
   if (OpOpcode == ISD::FNEG) // --X -> X
     return Operand.getOperand(0);
   break;

llvm/test/CodeGen/X86/vec_fneg.ll

 ; X64-SSE-NEXT: retq
   %tmp = fsub <4 x float> < float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00 >, %Q
   ret <4 x float> %tmp
 }

 ; Possibly misplaced test, but since we're checking undef scenarios...

 define float @scalar_fsub_neg0_undef(float %x) nounwind {
-; X32-SSE1-LABEL: scalar_fsub_neg0_undef:
-; X32-SSE1: # %bb.0:
-; X32-SSE1-NEXT: pushl %eax
-; X32-SSE1-NEXT: xorps {{\.LCPI.*}}, %xmm0
-; X32-SSE1-NEXT: movss %xmm0, (%esp)
-; X32-SSE1-NEXT: flds (%esp)
-; X32-SSE1-NEXT: popl %eax
-; X32-SSE1-NEXT: retl
-;
-; X32-SSE2-LABEL: scalar_fsub_neg0_undef:
-; X32-SSE2: # %bb.0:
-; X32-SSE2-NEXT: pushl %eax
-; X32-SSE2-NEXT: movss %xmm0, (%esp)
-; X32-SSE2-NEXT: flds (%esp)
-; X32-SSE2-NEXT: popl %eax
-; X32-SSE2-NEXT: retl
-;
-; X64-SSE1-LABEL: scalar_fsub_neg0_undef:
-; X64-SSE1: # %bb.0:
-; X64-SSE1-NEXT: xorps {{.*}}(%rip), %xmm0
-; X64-SSE1-NEXT: retq
+; X32-SSE-LABEL: scalar_fsub_neg0_undef:
+; X32-SSE: # %bb.0:
+; X32-SSE-NEXT: fldz
+; X32-SSE-NEXT: retl
 ;
-; X64-SSE2-LABEL: scalar_fsub_neg0_undef:
-; X64-SSE2: # %bb.0:
-; X64-SSE2-NEXT: retq
+; X64-SSE-LABEL: scalar_fsub_neg0_undef:
+; X64-SSE: # %bb.0:
+; X64-SSE-NEXT: retq
   %r = fsub float -0.0, undef
   ret float %r
 }

 define <4 x float> @fsub_neg0_undef(<4 x float> %Q) nounwind {
-; X32-SSE1-LABEL: fsub_neg0_undef:
-; X32-SSE1: # %bb.0:
-; X32-SSE1-NEXT: xorps {{\.LCPI.*}}, %xmm0
-; X32-SSE1-NEXT: retl
-;
-; X32-SSE2-LABEL: fsub_neg0_undef:
-; X32-SSE2: # %bb.0:
-; X32-SSE2-NEXT: retl
-;
-; X64-SSE1-LABEL: fsub_neg0_undef:
-; X64-SSE1: # %bb.0:
-; X64-SSE1-NEXT: xorps {{.*}}(%rip), %xmm0
-; X64-SSE1-NEXT: retq
+; X32-SSE-LABEL: fsub_neg0_undef:
+; X32-SSE: # %bb.0:
+; X32-SSE-NEXT: retl
 ;
-; X64-SSE2-LABEL: fsub_neg0_undef:
-; X64-SSE2: # %bb.0:
-; X64-SSE2-NEXT: retq
+; X64-SSE-LABEL: fsub_neg0_undef:
+; X64-SSE: # %bb.0:
+; X64-SSE-NEXT: retq
   %tmp = fsub <4 x float> < float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00 >, undef
   ret <4 x float> %tmp
 }

 define <4 x float> @fsub_neg0_undef_elts_undef(<4 x float> %x) {
 ; X32-SSE-LABEL: fsub_neg0_undef_elts_undef:
 ; X32-SSE: # %bb.0:
 ; X32-SSE-NEXT: movaps {{.*#+}} xmm0 = <NaN,u,u,NaN>
