This is an archive of the discontinued LLVM Phabricator instance.

DAG: Recognize no-signed-zeros-fp-math attribute
Closed · Public

Authored by arsenm on Jan 18 2017, 5:22 PM.


Details

Reviewers

tstellarAMD
echristo
jlebar

Summary

clang already emits this with -cl-no-signed-zeros, but codegen
doesn't do anything with it. Treat it like the other fast math
attributes, and change one place to use it.
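The "one place" is the fneg-of-fsub check in DAGCombiner.cpp. A minimal sketch of that gate, assuming hypothetical stand-in types (`FPOptions`, `NodeFlags`, and `canNegateFSubForFree` are illustrative names, not LLVM's API): the fold is permitted when either the function-wide option or the per-instruction `nsz` flag allows ignoring the sign of zero.

```cpp
// Hypothetical stand-in for the function-wide options (not LLVM's TargetOptions).
struct FPOptions {
  bool NoSignedZerosFPMath = false;
};

// Hypothetical stand-in for per-node fast-math flags.
struct NodeFlags {
  bool NoSignedZeros = false;  // models the `nsz` flag on an instruction
};

// Returns true when -(A - B) may be rewritten as (B - A).
// Either source of permission is sufficient on its own.
bool canNegateFSubForFree(const FPOptions &Opts, const NodeFlags &Flags) {
  return Opts.NoSignedZerosFPMath || Flags.NoSignedZeros;
}
```

With both fields left at their defaults the function returns false, which corresponds to the case where the negation has to be kept.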

Diff Detail

Event Timeline

arsenm created this revision. Jan 18 2017, 5:22 PM

Herald added a reviewer: tstellarAMD. Jan 18 2017, 5:22 PM

Herald added subscribers: nhaehnle, wdng, mehdi_amini.

arsenm added a child revision: D28885: AMDGPU: Disable some fneg combines unless nsz. Jan 18 2017, 6:22 PM

kzhuravl added a subscriber: kzhuravl. Jan 18 2017, 9:12 PM

kzhuravl added inline comments. Jan 18 2017, 9:43 PM

include/llvm/Target/TargetOptions.h
163	rezo->zero.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
632	line exceeds 80 chars.
lib/Target/TargetMachine.cpp
81	This line duplicates the previous one.

Address feedback

Justin has been looking at these things lately.

jlebar added inline comments. Jan 24 2017, 6:20 PM

include/llvm/Target/TargetOptions.h
161	"when the -foo flag is specified" or "when -foo is specified"
lib/Target/TargetMachine.cpp
91	Personally I'd prefer to do this in a separate patch, and to make a reasonable attempt at updating our documentation to say that this implies the others. It was only after working on these flags for a few weeks that I even realized that UnsafeFPMath implies the others in IR (and therefore that it should imply the others elsewhere). I even went as far as to add a separate unsafe-fp-math flag to XLA separate from the fast-math flag. Oops. :)
test/CodeGen/AMDGPU/fsub.ll
115	Can we add a (brief ok) comment explaining why this one gets an xor but the others don't?

jlebar added inline comments. Jan 24 2017, 6:23 PM

lib/Target/TargetMachine.cpp
91	Oh, I see why you're doing it here -- you don't want to regress existing codegen that sets UnsafeFPMath but not this flag (which is new). I guess you have to do it this way. We should still do this with the intent to update the docs and set the other flags, though.

arsenm marked an inline comment as done. Jan 24 2017, 9:55 PM

arsenm added inline comments.

lib/Target/TargetMachine.cpp
91	I would actually prefer to someday split unsafe math into an unsafe algebra flag which does not imply the others for places where you can reassociate without changing nan/inf behavior.

Oddly, I checked what clang does for just -funsafe-math-optimizations. It sets, unsurprisingly:
"unsafe-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="true"

But I was surprised it still emits:
"less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false"

but these are enabled by -ffast-math. So from a practical point of view in the current world it doesn't really need to imply it, but it also confuses me about what unsafe-fp-math is really supposed to mean if before it was a stand-in for no-signed-zeros.

jlebar added inline comments. Jan 24 2017, 9:59 PM

lib/Target/TargetMachine.cpp
91	> I would actually prefer to someday split unsafe math into an unsafe algebra flag which does not imply the others for places where you can reassociate without changing nan/inf behavior.

I mean, me too. At least, I think I want to be able to reassociate / lose precision without assuming that I never produce a NaN or Inf. I think that's rather useful, but maybe someone can learn me why it's not. And clearly the flags are a mess, yikes. But at an IR level, the only way to set "unsafe-fp-math" on an instruction is with the "fast" attr, and that implies all the things. I'm also not a fan of pretending that we have more configurability than we actually do...

Don't have unsafe-fp-math imply no-signed-zeros. Surprisingly, only one other test failed besides the ones I was already editing for this.

lgtm with the other outstanding comments addressed.

This revision is now accepted and ready to land. Jan 24 2017, 10:05 PM

r293024

Revision Contents

Path	Size
include/llvm/CodeGen/CommandFlags.h	7 lines
include/llvm/Target/TargetOptions.h	10 lines
lib/CodeGen/SelectionDAG/DAGCombiner.cpp	2 lines
lib/Target/TargetMachine.cpp	7 lines
test/CodeGen/AMDGPU/enable-no-signed-zeros-fp-math.ll	22 lines
test/CodeGen/AMDGPU/fsub.ll	56 lines

Diff 84916

include/llvm/CodeGen/CommandFlags.h

[...]	EnableNoInfsFPMath("enable-no-infs-fp-math",
cl::init(false));		cl::init(false));

cl::opt<bool>		cl::opt<bool>
EnableNoNaNsFPMath("enable-no-nans-fp-math",		EnableNoNaNsFPMath("enable-no-nans-fp-math",
cl::desc("Enable FP math optimizations that assume no NaNs"),		cl::desc("Enable FP math optimizations that assume no NaNs"),
cl::init(false));		cl::init(false));

cl::opt<bool>		cl::opt<bool>
		EnableNoSignedZerosFPMath("enable-no-signed-zeros-fp-math",
		cl::desc("Enable FP math optimizations that assume "
		"the sign of 0 is insignificant"),
		cl::init(false));

		cl::opt<bool>
EnableNoTrappingFPMath("enable-no-trapping-fp-math",		EnableNoTrappingFPMath("enable-no-trapping-fp-math",
cl::desc("Enable setting the FP exceptions build "		cl::desc("Enable setting the FP exceptions build "
"attribute not to use exceptions"),		"attribute not to use exceptions"),
cl::init(false));		cl::init(false));

cl::opt<llvm::FPDenormal::DenormalMode>		cl::opt<llvm::FPDenormal::DenormalMode>
DenormalMode("denormal-fp-math",		DenormalMode("denormal-fp-math",
cl::desc("Select which denormal numbers the code is permitted to require"),		cl::desc("Select which denormal numbers the code is permitted to require"),
[...]
// a TargetOptions object with CodeGen flags and returns it.		// a TargetOptions object with CodeGen flags and returns it.
static inline TargetOptions InitTargetOptionsFromCodeGenFlags() {		static inline TargetOptions InitTargetOptionsFromCodeGenFlags() {
TargetOptions Options;		TargetOptions Options;
Options.LessPreciseFPMADOption = EnableFPMAD;		Options.LessPreciseFPMADOption = EnableFPMAD;
Options.AllowFPOpFusion = FuseFPOps;		Options.AllowFPOpFusion = FuseFPOps;
Options.UnsafeFPMath = EnableUnsafeFPMath;		Options.UnsafeFPMath = EnableUnsafeFPMath;
Options.NoInfsFPMath = EnableNoInfsFPMath;		Options.NoInfsFPMath = EnableNoInfsFPMath;
Options.NoNaNsFPMath = EnableNoNaNsFPMath;		Options.NoNaNsFPMath = EnableNoNaNsFPMath;
		Options.NoSignedZerosFPMath = EnableNoSignedZerosFPMath;
Options.NoTrappingFPMath = EnableNoTrappingFPMath;		Options.NoTrappingFPMath = EnableNoTrappingFPMath;
Options.FPDenormalMode = DenormalMode;		Options.FPDenormalMode = DenormalMode;
Options.HonorSignDependentRoundingFPMathOption =		Options.HonorSignDependentRoundingFPMathOption =
EnableHonorSignDependentRoundingFPMath;		EnableHonorSignDependentRoundingFPMath;
if (FloatABIForCalls != FloatABI::Default)		if (FloatABIForCalls != FloatABI::Default)
Options.FloatABIType = FloatABIForCalls;		Options.FloatABIType = FloatABIForCalls;
Options.NoZerosInBSS = DontPlaceZerosInBSS;		Options.NoZerosInBSS = DontPlaceZerosInBSS;
Options.GuaranteedTailCallOpt = EnableGuaranteedTailCallOpt;		Options.GuaranteedTailCallOpt = EnableGuaranteedTailCallOpt;
[...]

include/llvm/Target/TargetOptions.h

[...]	public:
unsigned NoInfsFPMath : 1;		unsigned NoInfsFPMath : 1;

/// NoNaNsFPMath - This flag is enabled when the		/// NoNaNsFPMath - This flag is enabled when the
/// -enable-no-nans-fp-math flag is specified on the command line. When		/// -enable-no-nans-fp-math flag is specified on the command line. When
/// this flag is off (the default), the code generator is not allowed to		/// this flag is off (the default), the code generator is not allowed to
/// assume the FP arithmetic arguments and results are never NaNs.		/// assume the FP arithmetic arguments and results are never NaNs.
unsigned NoNaNsFPMath : 1;		unsigned NoNaNsFPMath : 1;

/// NoTrappingFPMath - This flag is enabled when the		/// NoTrappingFPMath - This flag is enabled when the
/// -enable-no-trapping-fp-math is specified on the command line. This		/// -enable-no-trapping-fp-math is specified on the command line. This
/// specifies that there are no trap handlers to handle exceptions.		/// specifies that there are no trap handlers to handle exceptions.
unsigned NoTrappingFPMath : 1;		unsigned NoTrappingFPMath : 1;

		/// NoSignedZerosFPMath - This flag is enabled when the
		/// -enable-no-signed-zeros-fp-math is specified on the command line. This
		/// specifies that optimizations are allowed to treat the sign of a rezo
		/// argument or result as insignificant.
		unsigned NoSignedZerosFPMath : 1;

/// HonorSignDependentRoundingFPMath - This returns true when the		/// HonorSignDependentRoundingFPMath - This returns true when the
/// -enable-sign-dependent-rounding-fp-math is specified. If this returns		/// -enable-sign-dependent-rounding-fp-math is specified. If this returns
/// false (the default), the code generator is allowed to assume that the		/// false (the default), the code generator is allowed to assume that the
/// rounding behavior is the default (round-to-zero for all floating point		/// rounding behavior is the default (round-to-zero for all floating point
/// to integer conversions, and round-to-nearest for all other arithmetic		/// to integer conversions, and round-to-nearest for all other arithmetic
/// truncations). If this is enabled (set to true), the code generator must		/// truncations). If this is enabled (set to true), the code generator must
/// assume that the rounding mode may dynamically change.		/// assume that the rounding mode may dynamically change.
unsigned HonorSignDependentRoundingFPMathOption : 1;		unsigned HonorSignDependentRoundingFPMathOption : 1;
[...]

lib/CodeGen/SelectionDAG/DAGCombiner.cpp


[...]	case ISD::FADD:
if (char V = isNegatibleForFree(Op.getOperand(0), LegalOperations, TLI,		if (char V = isNegatibleForFree(Op.getOperand(0), LegalOperations, TLI,
Options, Depth + 1))		Options, Depth + 1))
return V;		return V;
// fold (fneg (fadd A, B)) -> (fsub (fneg B), A)		// fold (fneg (fadd A, B)) -> (fsub (fneg B), A)
return isNegatibleForFree(Op.getOperand(1), LegalOperations, TLI, Options,		return isNegatibleForFree(Op.getOperand(1), LegalOperations, TLI, Options,
Depth + 1);		Depth + 1);
case ISD::FSUB:		case ISD::FSUB:
// We can't turn -(A-B) into B-A when we honor signed zeros.		// We can't turn -(A-B) into B-A when we honor signed zeros.
if (!Options->UnsafeFPMath && !Op.getNode()->getFlags()->hasNoSignedZeros())		if (!Options->NoSignedZerosFPMath && !Op.getNode()->getFlags()->hasNoSignedZeros())
return 0;		return 0;

// fold (fneg (fsub A, B)) -> (fsub B, A)		// fold (fneg (fsub A, B)) -> (fsub B, A)
return 1;		return 1;

case ISD::FMUL:		case ISD::FMUL:
case ISD::FDIV:		case ISD::FDIV:
if (Options->HonorSignDependentRoundingFPMath()) return 0;		if (Options->HonorSignDependentRoundingFPMath()) return 0;
[...]
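The comment in the FSUB case ("We can't turn -(A-B) into B-A when we honor signed zeros") can be checked with ordinary IEEE-754 arithmetic. The sketch below is illustrative only (the function names are hypothetical) and assumes the default round-to-nearest mode and a compiler that is not itself run with fast-math flags: when A equals B, A - B is +0, so -(A - B) is -0, while B - A is +0.

```cpp
#include <cmath>

// The original expression: negate the difference.
float negOfSub(float A, float B) { return -(A - B); }

// The rewritten expression the fold would produce.
float swappedSub(float A, float B) { return B - A; }

// True when two floats agree in value and in sign bit, so +0 and -0
// (which compare equal with ==) are told apart.
bool sameIncludingSign(float X, float Y) {
  return X == Y && std::signbit(X) == std::signbit(Y);
}
```

For inputs such as A = 3 and B = 1 both forms give -2 and agree exactly; they only diverge in the sign of a zero result, which is what the no-signed-zeros permission says may be ignored.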

lib/Target/TargetMachine.cpp

[...]	do { \
else \		else \
Options.X = DefaultOptions.X; \		Options.X = DefaultOptions.X; \
} while (0)		} while (0)

RESET_OPTION(LessPreciseFPMADOption, "less-precise-fpmad");		RESET_OPTION(LessPreciseFPMADOption, "less-precise-fpmad");
RESET_OPTION(UnsafeFPMath, "unsafe-fp-math");		RESET_OPTION(UnsafeFPMath, "unsafe-fp-math");
RESET_OPTION(NoInfsFPMath, "no-infs-fp-math");		RESET_OPTION(NoInfsFPMath, "no-infs-fp-math");
RESET_OPTION(NoNaNsFPMath, "no-nans-fp-math");		RESET_OPTION(NoNaNsFPMath, "no-nans-fp-math");
		RESET_OPTION(NoNaNsFPMath, "no-nans-fp-math");
RESET_OPTION(NoTrappingFPMath, "no-trapping-math");		RESET_OPTION(NoTrappingFPMath, "no-trapping-math");
		RESET_OPTION(NoSignedZerosFPMath, "no-signed-zeros-fp-math");

		if (Options.UnsafeFPMath) {
		// Should this imply the others?
		Options.NoSignedZerosFPMath = true;
		}

StringRef Denormal =		StringRef Denormal =
F.getFnAttribute("denormal-fp-math").getValueAsString();		F.getFnAttribute("denormal-fp-math").getValueAsString();
if (Denormal == "ieee")		if (Denormal == "ieee")
Options.FPDenormalMode = FPDenormal::IEEE;		Options.FPDenormalMode = FPDenormal::IEEE;
else if (Denormal == "preserve-sign")		else if (Denormal == "preserve-sign")
Options.FPDenormalMode = FPDenormal::PreserveSign;		Options.FPDenormalMode = FPDenormal::PreserveSign;
else if (Denormal == "positive-zero")		else if (Denormal == "positive-zero")
Options.FPDenormalMode = FPDenormal::PositiveZero;		Options.FPDenormalMode = FPDenormal::PositiveZero;
else		else
Options.FPDenormalMode = DefaultOptions.FPDenormalMode;		Options.FPDenormalMode = DefaultOptions.FPDenormalMode;
[...]
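The RESET_OPTION lines above resolve each option from a function attribute, falling back to the target's default when the attribute is absent. A rough, self-contained model of that pattern follows. The `if` branch of the macro is elided in this diff, so treating the string "true" as enabled and anything else as disabled is an assumption here, and `FnAttrs`, `resolveOption`, and `resolveFPOptions` are hypothetical names rather than LLVM's API.

```cpp
#include <map>
#include <string>

// Hypothetical model of a function's string attributes.
using FnAttrs = std::map<std::string, std::string>;

// Hypothetical subset of the floating-point options.
struct FPOptions {
  bool UnsafeFPMath = false;
  bool NoSignedZerosFPMath = false;
};

// If the attribute is present it overrides the default; otherwise the
// default is kept. Assumption: only the exact value "true" enables it.
bool resolveOption(const FnAttrs &Attrs, const std::string &Name,
                   bool Default) {
  auto It = Attrs.find(Name);
  if (It == Attrs.end())
    return Default;
  return It->second == "true";
}

// Resolve each option independently from its own attribute.
FPOptions resolveFPOptions(const FnAttrs &Attrs, const FPOptions &Defaults) {
  FPOptions Opts;
  Opts.UnsafeFPMath =
      resolveOption(Attrs, "unsafe-fp-math", Defaults.UnsafeFPMath);
  Opts.NoSignedZerosFPMath = resolveOption(Attrs, "no-signed-zeros-fp-math",
                                           Defaults.NoSignedZerosFPMath);
  return Opts;
}
```

The sketch deliberately leaves out the `if (Options.UnsafeFPMath)` implication that this diff adds, since the thread records that a later update stopped having unsafe-fp-math imply no-signed-zeros.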

test/CodeGen/AMDGPU/enable-no-signed-zeros-fp-math.ll

This file was added.

				; RUN: llc -march=amdgcn -enable-no-signed-zeros-fp-math=0 < %s | FileCheck -check-prefix=GCN -check-prefix=GCN-SAFE %s
				; RUN: llc -march=amdgcn -enable-no-signed-zeros-fp-math=1 < %s | FileCheck -check-prefix=GCN -check-prefix=GCN-UNSAFE %s
				; RUN: llc -march=amdgcn -enable-unsafe-fp-math < %s | FileCheck -check-prefix=GCN -check-prefix=GCN-UNSAFE %s

				; Test that the -enable-no-signed-zeros-fp-math flag works

				; GCN-LABEL: {{^}}fneg_fsub_f32:
				; GCN: v_subrev_f32_e32 [[SUB:v[0-9]+]], {{v[0-9]+}}, {{v[0-9]+}}
				; GCN-SAFE: v_xor_b32_e32 v{{[0-9]+}}, 0x80000000, [[SUB]]

				; GCN-UNSAFE-NOT: xor
				define void @fneg_fsub_f32(float addrspace(1)* %out, float addrspace(1)* %in) #0 {
				%b_ptr = getelementptr float, float addrspace(1)* %in, i32 1
				%a = load float, float addrspace(1)* %in, align 4
				%b = load float, float addrspace(1)* %b_ptr, align 4
				%result = fsub float %a, %b
				%neg.result = fsub float -0.0, %result
				store float %neg.result, float addrspace(1)* %out, align 4
				ret void
				}

				attributes #0 = { nounwind }

test/CodeGen/AMDGPU/fsub.ll

	[...]
	; SI: v_subrev_f32_e32 {{v[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}			; SI: v_subrev_f32_e32 {{v[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}
	; SI: v_subrev_f32_e32 {{v[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}			; SI: v_subrev_f32_e32 {{v[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}
	; SI: s_endpgm			; SI: s_endpgm
	define void @s_fsub_v4f32(<4 x float> addrspace(1)* %out, <4 x float> %a, <4 x float> %b) {			define void @s_fsub_v4f32(<4 x float> addrspace(1)* %out, <4 x float> %a, <4 x float> %b) {
	%result = fsub <4 x float> %a, %b			%result = fsub <4 x float> %a, %b
	store <4 x float> %result, <4 x float> addrspace(1)* %out, align 16			store <4 x float> %result, <4 x float> addrspace(1)* %out, align 16
	ret void			ret void
	}			}


				; FUNC-LABEL: {{^}}v_fneg_fsub_f32:
				; SI: v_subrev_f32_e32 [[SUB:v[0-9]+]], {{v[0-9]+}}, {{v[0-9]+}}
				; SI: v_xor_b32_e32 v{{[0-9]+}}, 0x80000000, [[SUB]]
				define void @v_fneg_fsub_f32(float addrspace(1)* %out, float addrspace(1)* %in) {
				%b_ptr = getelementptr float, float addrspace(1)* %in, i32 1
				%a = load float, float addrspace(1)* %in, align 4
				%b = load float, float addrspace(1)* %b_ptr, align 4
				%result = fsub float %a, %b
				%neg.result = fsub float -0.0, %result
				store float %neg.result, float addrspace(1)* %out, align 4
				ret void
				}

				; FUNC-LABEL: {{^}}v_fneg_fsub_nsz_f32:
				; SI: v_subrev_f32_e32 [[SUB:v[0-9]+]], {{v[0-9]+}}, {{v[0-9]+}}
				; SI-NOT: xor
				define void @v_fneg_fsub_nsz_f32(float addrspace(1)* %out, float addrspace(1)* %in) {
				%b_ptr = getelementptr float, float addrspace(1)* %in, i32 1
				%a = load float, float addrspace(1)* %in, align 4
				%b = load float, float addrspace(1)* %b_ptr, align 4
				%result = fsub nsz float %a, %b
				%neg.result = fsub float -0.0, %result
				store float %neg.result, float addrspace(1)* %out, align 4
				ret void
				}

				; FUNC-LABEL: {{^}}v_fneg_fsub_nsz_attribute_f32:
				; SI: v_subrev_f32_e32 [[SUB:v[0-9]+]], {{v[0-9]+}}, {{v[0-9]+}}
				; SI-NOT: xor
				define void @v_fneg_fsub_nsz_attribute_f32(float addrspace(1)* %out, float addrspace(1)* %in) #0 {
				%b_ptr = getelementptr float, float addrspace(1)* %in, i32 1
				%a = load float, float addrspace(1)* %in, align 4
				%b = load float, float addrspace(1)* %b_ptr, align 4
				%result = fsub float %a, %b
				%neg.result = fsub float -0.0, %result
				store float %neg.result, float addrspace(1)* %out, align 4
				ret void
				}

				; FUNC-LABEL: {{^}}v_fneg_fsub_nsz_false_attribute_f32:
				; SI: v_subrev_f32_e32 [[SUB:v[0-9]+]], {{v[0-9]+}}, {{v[0-9]+}}
				; SI: v_xor_b32_e32 v{{[0-9]+}}, 0x80000000, [[SUB]]
				define void @v_fneg_fsub_nsz_false_attribute_f32(float addrspace(1)* %out, float addrspace(1)* %in) #1 {
				%b_ptr = getelementptr float, float addrspace(1)* %in, i32 1
				%a = load float, float addrspace(1)* %in, align 4
				%b = load float, float addrspace(1)* %b_ptr, align 4
				%result = fsub float %a, %b
				%neg.result = fsub float -0.0, %result
				store float %neg.result, float addrspace(1)* %out, align 4
				ret void
				}

				attributes #0 = { nounwind "no-signed-zeros-fp-math"="true" }
				attributes #1 = { nounwind "no-signed-zeros-fp-math"="false" }
