This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] remove constant folding limitations based on FP exceptions
Closed, Public

Authored by spatel on Apr 30 2019, 11:47 AM.

Details

Reviewers

cameron.mcinally
kpn
greened
mcberg2017
efriedma
andrew.w.kaylor
scanon

Commits

rG284472be6da3: [SelectionDAG] remove constant folding limitations based on FP exceptions
rL359791: [SelectionDAG] remove constant folding limitations based on FP exceptions

Summary

We don't have any FP exception limits in the IR constant folder (apart from strict ops), so I don't think it makes sense to have them here in the DAG either. Nothing else in the backend tries to preserve those (again outside of strict ops), so I don't see how this could have ever worked for real code that cares about FP exceptions.

Diff Detail

Event Timeline

spatel created this revision. Apr 30 2019, 11:47 AM

Herald added a project: Restricted Project. Apr 30 2019, 11:47 AM

Herald added subscribers: aheejin, hiraditya, jgravelle-google and 7 others.

craig.topper added a reviewer: andrew.w.kaylor. Apr 30 2019, 12:01 PM

LGTM; this matches the direction we're going, where we use different DAG nodes for functions which care about FP exceptions. (Please wait a couple days in case I'm missing something obvious.)

This revision is now accepted and ready to land. Apr 30 2019, 12:19 PM

Rather than get rid of this, can we update it to handle strict FP nodes correctly? Or at least put a skeleton in place to prepare to handle it correctly?

Right now I think we're throwing away the rounding mode and exception semantics arguments when we convert constrained intrinsics to strict FP nodes, but if we still had that information here we could use it to make a smart decision. For instance, suppose we had an intrinsic like this:

%t = call double @llvm.experimental.constrained.fadd.f64(double 4.3, double 6.5,
                                                         metadata !"round.upward",
                                                         metadata !"fpexcept.ignore")

If that got to the selection DAG we would be able to fold it.
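To make the role of the rounding mode concrete, here is a minimal standalone sketch in standard C++ (it is not LLVM code, and the helper name `addWithRounding` is hypothetical). It evaluates an addition under a caller-chosen `<cfenv>` rounding mode, which is roughly the information a folder for a constrained intrinsic like the one above would have to honor instead of assuming round-to-nearest.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper, for illustration only: add two doubles under the
// given <cfenv> rounding mode and restore the previous mode afterwards.
// The volatile locals discourage the compiler from folding the addition
// at compile time; strictly conforming code would also need
// "#pragma STDC FENV_ACCESS ON", which not every compiler supports.
double addWithRounding(double A, double B, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile double X = A;
  volatile double Y = B;
  volatile double Sum = X + Y;
  std::fesetround(Saved);
  return Sum;
}
```

Calling `addWithRounding(4.3, 6.5, FE_UPWARD)` can never give a smaller result than the same call with `FE_TONEAREST`; whether the two results actually differ depends on whether the exact sum is representable.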

Yes, we could possibly constant-fold STRICT_FADD etc. using similar logic, but I'm not sure we would actually want to reuse the existing codepath in foldConstantFPMath. In any case, it'll be in the revision history if someone does end up wanting to revive the code for that.

Why wouldn't we want to reuse this code path?

We made the decision that we are initially willing to live with poor optimization in order to get correctness for the constrained FP implementation, but the plan all along has been that after we had correct results we wanted to go back to re-enable optimizations. I'm just trying to avoid making that path harder.

What I'm asking for here is something like this:

SDValue SelectionDAG::foldConstantFPMath(unsigned Opcode, const SDLoc &DL,
                                         EVT VT, SDValue N1, SDValue N2) {
  auto *N1CFP = dyn_cast<ConstantFPSDNode>(N1.getNode());
  auto *N2CFP = dyn_cast<ConstantFPSDNode>(N2.getNode());
  bool HasFPExceptions = false;
  APFloat::roundingMode RoundingMode = APFloat::rmNearestTiesToEven;
  // FIXME: Add support for strict FP rounding and exception semantics.

That would produce the same effect as this patch. To support it we'd just need extra arguments to provide the rounding mode and exception semantics. I think those are probably even available not far up the stack from where this is called.

In D61331#1484982, @andrew.w.kaylor wrote:
Why wouldn't we want to reuse this code path?

We made the decision that we are initially willing to live with poor optimization in order to get correctness for the constrained FP implementation, but the plan all along has been that after we had correct results we wanted to go back to re-enable optimizations. I'm just trying to avoid making that path harder.

What I'm asking for here is something like this:
SDValue SelectionDAG::foldConstantFPMath(unsigned Opcode, const SDLoc &DL,
                                         EVT VT, SDValue N1, SDValue N2) {
  auto *N1CFP = dyn_cast<ConstantFPSDNode>(N1.getNode());
  auto *N2CFP = dyn_cast<ConstantFPSDNode>(N2.getNode());
  bool HasFPExceptions = false;
  APFloat::roundingMode RoundingMode = APFloat::rmNearestTiesToEven;
  // FIXME: Add support for strict FP rounding and exception semantics.

That would produce the same effect as this patch. To support it we'd just need extra arguments to provide the rounding mode and exception semantics. I think those are probably even available not far up the stack from where this is called.

You also need to pass in the input chain and probably emit a MERGE_VALUES to connect it to the expected output chain. Probably need to call this from a different getNode function with the right number of input arguments and add STRICT_FP nodes to the switch in that function to call this.

We generally try to avoid keeping dead code in-tree, but I guess I'm not strongly opposed to it here.

In D61331#1485000, @efriedma wrote:

We generally try to avoid keeping dead code in-tree, but I guess I'm not strongly opposed to it here.

I honestly don't have strong feelings about this particular change either. I'm more concerned about the way we think about strict FP handling. We've still got a lot of work to do to finish strict FP handling, and I would hate to see any code that currently has logic to handle strict exception semantics lose it. On the other hand, looking closer at the current code, I see that the logic there today isn't actually correct for strict exception semantics. So I guess if we decide to go with my suggestion then every place that's currently checking "Status != APFloat::opInvalidOp" should really be checking "Status == APFloat::opOK".

BTW, in addition to what Craig pointed out above with regard to the chain, he visited me and mentioned additional complications related to the fact that we aren't keeping the rounding mode and exception semantics in the DAG. We really need a way to fix that. I'm inclined to believe constant folding will still end up going through this function, but it'll be more work to get here than I realized.
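As a rough illustration of the difference between the two checks mentioned above, the sketch below uses a stand-in enum that mirrors the bit-flag layout of `llvm::APFloat::opStatus`; the enum copy and both predicate names are assumptions made for this example, not code from the patch.

```cpp
#include <cassert>

// Stand-in for llvm::APFloat::opStatus (assumed to mirror its bit flags).
// Several flags may be OR'ed together, e.g. overflow plus inexact.
enum OpStatus : unsigned {
  opOK = 0x00,
  opInvalidOp = 0x01,
  opDivByZero = 0x02,
  opOverflow = 0x04,
  opUnderflow = 0x08,
  opInexact = 0x10
};

// The check removed by this patch: fold unless the status is exactly
// "invalid operation". Other exceptions do not block the fold.
constexpr bool foldUnlessInvalid(unsigned Status) {
  return Status != opInvalidOp;
}

// The stricter check suggested for strict FP semantics: fold only when
// the operation raised no exception at all.
constexpr bool foldOnlyIfNoException(unsigned Status) {
  return Status == opOK;
}
```

Under the first predicate an inexact, overflowing, or divide-by-zero result is still folded, which is why it cannot preserve strict exception semantics; the second predicate rejects all of those.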

mcberg2017 added a reviewer: scanon. Apr 30 2019, 1:49 PM

In D61331#1485112, @andrew.w.kaylor wrote:

In D61331#1485000, @efriedma wrote:

We generally try to avoid keeping dead code in-tree, but I guess I'm not strongly opposed to it here.

I honestly don't have strong feelings about this particular change either. I'm more concerned about the way we think about strict FP handling. We've still got a lot of work to do to finish strict FP handling, and I would hate to see any code that currently has logic to handle strict exception semantics lose it. On the other hand, looking closer at the current code, I see that the logic there today isn't actually correct for strict exception semantics. So I guess if we decide to go with my suggestion then every place that's currently checking "Status != APFloat::opInvalidOp" should really be checking "Status == APFloat::opOK".

That would definitely be an improvement, but I'm not sure if that's enough. For example, if we intend to track denormals here, then we'd have to check the returned constant value rather than the status value?

It seemed to me that the strict and regular cases would end up sharing just the 1 line of code that computes the constant, so it might make more sense to group strict nodes together rather than interleave those cases with the regular ops.
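The point about denormals can be sketched in standard C++: the decision is made by classifying the folded value itself rather than by looking at a status code. The function name `isSubnormalResult` is hypothetical, and this is only an illustration of the idea, not what the DAG folder does.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical check on a folded constant: returns true when the value
// is subnormal (denormal), independent of any operation status flags.
bool isSubnormalResult(double V) {
  return std::fpclassify(V) == FP_SUBNORMAL;
}
```

For example, `std::numeric_limits<double>::denorm_min()` (the `4.940660e-324` constant used in the frem test of this revision) is classified as subnormal, while `1.0` and `0.0` are not.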

BTW, in addition to what Craig pointed out above with regard to the chain, he visited me and mentioned additional complications related to the fact that we aren't keeping the rounding mode and exception semantics in the DAG. We really need a way to fix that. I'm inclined to believe constant folding will still end up going through this function, but it'll be more work to get here than I realized.

Yes, I think there's a lot of plumbing to do...

cameron.mcinally added inline comments. May 1 2019, 10:09 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Line 4819: These break statements (FADD to FREM) are unreachable.

Patch updated:

Added a TODO comment about handling strict opcodes.
Removed unreachable 'break' statements in the switch.

I'm OK with this. Thanks for the TODO comment!

LGTM, thanks Sanjay

cameron.mcinally added inline comments. May 1 2019, 5:43 PM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Line 4221: Happened to stumble across more of these exception checks. Just a heads up...

spatel marked an inline comment as done. May 2 2019, 6:39 AM

spatel added inline comments.

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Line 4221: Thanks. Also, I just noticed that we do not constant fold the FMA intrinsic in IR. Haven't looked at the unary ops in IR yet, but I'll try to make this consistent.

Closed by commit rL359791: [SelectionDAG] remove constant folding limitations based on FP exceptions (authored by spatel). May 2 2019, 7:45 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/TargetLowering.h	15 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp	33 lines
llvm/lib/CodeGen/TargetLoweringBase.cpp	1 line
llvm/lib/Target/AMDGPU/SIISelLowering.cpp	5 lines
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp	3 lines
llvm/test/CodeGen/AArch64/fp-const-fold.ll	27 lines

Diff 197388

llvm/include/llvm/CodeGen/TargetLowering.h

Show First 20 Lines • Show All 569 Lines • ▼ Show 20 Lines	public:
   }

   /// Return true if inserting a scalar into a variable element of an undef
   /// vector is more efficiently handled by splatting the scalar instead.
   virtual bool shouldSplatInsEltVarIndex(EVT) const {
     return false;
   }

-  /// Return true if target supports floating point exceptions.
-  bool hasFloatingPointExceptions() const {
-    return HasFloatingPointExceptions;
-  }
-
   /// Return true if target always beneficiates from combining into FMA for a
   /// given value type. This must typically return false on targets where FMA
   /// takes more cycles to execute than FADD.
   virtual bool enableAggressiveFMAFusion(EVT VT) const {
     return false;
   }

   /// Return the ValueType of the result of SETCC operations.
▲ Show 20 Lines • Show All 1,293 Lines • ▼ Show 20 Lines	void setHasExtractBitsInsn(bool hasExtractInsn = true) {
     HasExtractBitsInsn = hasExtractInsn;
   }

   /// Tells the code generator not to expand logic operations on comparison
   /// predicates into separate sequences that increase the amount of flow
   /// control.
   void setJumpIsExpensive(bool isExpensive = true);

-  /// Tells the code generator that this target supports floating point
-  /// exceptions and cares about preserving floating point exception behavior.
-  void setHasFloatingPointExceptions(bool FPExceptions = true) {
-    HasFloatingPointExceptions = FPExceptions;
-  }
-
   /// Tells the code generator which bitwidths to bypass.
   void addBypassSlowDiv(unsigned int SlowBitWidth, unsigned int FastBitWidth) {
     BypassSlowDivWidths[SlowBitWidth] = FastBitWidth;
   }

   /// Add the specified register class as an available regclass for the
   /// specified value type. This indicates the selector can handle values of
   /// that class natively.
▲ Show 20 Lines • Show All 643 Lines • ▼ Show 20 Lines	private:
   /// div/rem when the operands are positive and less than 256.
   DenseMap <unsigned int, unsigned int> BypassSlowDivWidths;

   /// Tells the code generator that it shouldn't generate extra flow control
   /// instructions and should attempt to combine flow control instructions via
   /// predication.
   bool JumpIsExpensive;

-  /// Whether the target supports or cares about preserving floating point
-  /// exception behavior.
-  bool HasFloatingPointExceptions;
-
   /// This target prefers to use _setjmp to implement llvm.setjmp.
   ///
   /// Defaults to false.
   bool UseUnderscoreSetJmp;

   /// This target prefers to use _longjmp to implement llvm.longjmp.
   ///
   /// Defaults to false.
▲ Show 20 Lines • Show All 1,426 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show First 20 Lines • Show All 4,212 Lines • ▼ Show 20 Lines	case ISD::FTRUNC: {
       APFloat::opStatus fs = V.roundToIntegral(APFloat::rmTowardZero);
       if (fs == APFloat::opOK || fs == APFloat::opInexact)
         return getConstantFP(V, DL, VT);
       break;
     }
     case ISD::FFLOOR: {
       APFloat::opStatus fs = V.roundToIntegral(APFloat::rmTowardNegative);
       if (fs == APFloat::opOK || fs == APFloat::opInexact)
         return getConstantFP(V, DL, VT);
       break;
     }
     case ISD::FP_EXTEND: {
       bool ignored;
       // This can return overflow, underflow, or inexact; we don't care.
       // FIXME need to be more flexible about rounding mode.
       (void)V.convert(EVTToAPFloatSemantics(VT),
                       APFloat::rmNearestTiesToEven, &ignored);
▲ Show 20 Lines • Show All 559 Lines • ▼ Show 20 Lines	SDValue SelectionDAG::FoldConstantVectorArithmetic(unsigned Opcode,
   NewSDValueDbgMsg(V, "New node fold constant vector: ", this);
   return V;
 }

 SDValue SelectionDAG::foldConstantFPMath(unsigned Opcode, const SDLoc &DL,
                                          EVT VT, SDValue N1, SDValue N2) {
   auto *N1CFP = dyn_cast<ConstantFPSDNode>(N1.getNode());
   auto *N2CFP = dyn_cast<ConstantFPSDNode>(N2.getNode());
-  bool HasFPExceptions = TLI->hasFloatingPointExceptions();
   if (N1CFP && N2CFP) {
     APFloat C1 = N1CFP->getValueAPF(), C2 = N2CFP->getValueAPF();
-    APFloat::opStatus Status;
     switch (Opcode) {
     case ISD::FADD:
-      Status = C1.add(C2, APFloat::rmNearestTiesToEven);
-      if (!HasFPExceptions || Status != APFloat::opInvalidOp)
-        return getConstantFP(C1, DL, VT);
+      C1.add(C2, APFloat::rmNearestTiesToEven);
+      return getConstantFP(C1, DL, VT);
       break;
     case ISD::FSUB:
-      Status = C1.subtract(C2, APFloat::rmNearestTiesToEven);
-      if (!HasFPExceptions || Status != APFloat::opInvalidOp)
-        return getConstantFP(C1, DL, VT);
+      C1.subtract(C2, APFloat::rmNearestTiesToEven);
+      return getConstantFP(C1, DL, VT);
       break;
     case ISD::FMUL:
-      Status = C1.multiply(C2, APFloat::rmNearestTiesToEven);
-      if (!HasFPExceptions || Status != APFloat::opInvalidOp)
-        return getConstantFP(C1, DL, VT);
+      C1.multiply(C2, APFloat::rmNearestTiesToEven);
+      return getConstantFP(C1, DL, VT);
       break;
     case ISD::FDIV:
-      Status = C1.divide(C2, APFloat::rmNearestTiesToEven);
-      if (!HasFPExceptions || Status != APFloat::opInvalidOp)
-        return getConstantFP(C1, DL, VT);
+      C1.divide(C2, APFloat::rmNearestTiesToEven);
+      return getConstantFP(C1, DL, VT);
       break;
     case ISD::FREM:
-      Status = C1.mod(C2);
-      if (!HasFPExceptions || Status != APFloat::opInvalidOp)
-        return getConstantFP(C1, DL, VT);
+      C1.mod(C2);
+      return getConstantFP(C1, DL, VT);
       break;
     case ISD::FCOPYSIGN:
       C1.copySign(C2);
       return getConstantFP(C1, DL, VT);
     default: break;
     }
   }
   if (N1CFP && Opcode == ISD::FP_ROUND) {
     APFloat C1 = N1CFP->getValueAPF(); // make copy
▲ Show 20 Lines • Show All 459 Lines • ▼ Show 20 Lines	assert(N1.getValueType() == VT && N2.getValueType() == VT &&
            N3.getValueType() == VT && "FMA types must match!");
     ConstantFPSDNode *N1CFP = dyn_cast<ConstantFPSDNode>(N1);
     ConstantFPSDNode *N2CFP = dyn_cast<ConstantFPSDNode>(N2);
     ConstantFPSDNode *N3CFP = dyn_cast<ConstantFPSDNode>(N3);
     if (N1CFP && N2CFP && N3CFP) {
       APFloat V1 = N1CFP->getValueAPF();
       const APFloat &V2 = N2CFP->getValueAPF();
       const APFloat &V3 = N3CFP->getValueAPF();
-      APFloat::opStatus s =
-        V1.fusedMultiplyAdd(V2, V3, APFloat::rmNearestTiesToEven);
-      if (!TLI->hasFloatingPointExceptions() || s != APFloat::opInvalidOp)
-        return getConstantFP(V1, DL, VT);
+      V1.fusedMultiplyAdd(V2, V3, APFloat::rmNearestTiesToEven);
+      return getConstantFP(V1, DL, VT);
     }
     break;
   }
   case ISD::BUILD_VECTOR: {
     // Attempt to simplify BUILD_VECTOR.
     SDValue Ops[] = {N1, N2, N3};
     if (SDValue V = FoldBUILD_VECTOR(DL, VT, Ops, *this))
       return V;
▲ Show 20 Lines • Show All 4,103 Lines • Show Last 20 Lines

llvm/lib/CodeGen/TargetLoweringBase.cpp

Show First 20 Lines • Show All 539 Lines • ▼ Show 20 Lines	MaxStoresPerMemsetOptSize = MaxStoresPerMemcpyOptSize =
     MaxStoresPerMemmoveOptSize = MaxLoadsPerMemcmpOptSize = 4;
   UseUnderscoreSetJmp = false;
   UseUnderscoreLongJmp = false;
   HasMultipleConditionRegisters = false;
   HasExtractBitsInsn = false;
   JumpIsExpensive = JumpIsExpensiveOverride;
   PredictableSelectIsExpensive = false;
   EnableExtLdPromotion = false;
-  HasFloatingPointExceptions = true;
   StackPointerRegisterToSaveRestore = 0;
   BooleanContents = UndefinedBooleanContent;
   BooleanFloatContents = UndefinedBooleanContent;
   BooleanVectorContents = UndefinedBooleanContent;
   SchedPreferenceInfo = Sched::ILP;
   JumpBufSize = 0;
   JumpBufAlignment = 0;
   MinFunctionAlignment = 0;
▲ Show 20 Lines • Show All 1,337 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/SIISelLowering.cpp

Show First 20 Lines • Show All 723 Lines • ▼ Show 20 Lines	#endif
   setTargetDAGCombine(ISD::ATOMIC_LOAD_NAND);
   setTargetDAGCombine(ISD::ATOMIC_LOAD_MIN);
   setTargetDAGCombine(ISD::ATOMIC_LOAD_MAX);
   setTargetDAGCombine(ISD::ATOMIC_LOAD_UMIN);
   setTargetDAGCombine(ISD::ATOMIC_LOAD_UMAX);
   setTargetDAGCombine(ISD::ATOMIC_LOAD_FADD);

   setSchedulingPreference(Sched::RegPressure);
-
-  // SI at least has hardware support for floating point exceptions, but no way
-  // of using or handling them is implemented. They are also optional in OpenCL
-  // (Section 7.3)
-  setHasFloatingPointExceptions(Subtarget->hasFPExceptions());
 }

 const GCNSubtarget *SITargetLowering::getSubtarget() const {
   return Subtarget;
 }

 //===----------------------------------------------------------------------===//
 // TargetLowering queries
▲ Show 20 Lines • Show All 9,271 Lines • Show Last 20 Lines

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

Show All 39 Lines	WebAssemblyTargetLowering::WebAssemblyTargetLowering(
     const TargetMachine &TM, const WebAssemblySubtarget &STI)
     : TargetLowering(TM), Subtarget(&STI) {
   auto MVTPtr = Subtarget->hasAddr64() ? MVT::i64 : MVT::i32;

   // Booleans always contain 0 or 1.
   setBooleanContents(ZeroOrOneBooleanContent);
   // Except in SIMD vectors
   setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);
-  // WebAssembly does not produce floating-point exceptions on normal floating
-  // point operations.
-  setHasFloatingPointExceptions(false);
   // We don't know the microarchitecture here, so just reduce register pressure.
   setSchedulingPreference(Sched::RegPressure);
   // Tell ISel that we have a stack pointer.
   setStackPointerRegisterToSaveRestore(
       Subtarget->hasAddr64() ? WebAssembly::SP64 : WebAssembly::SP32);
   // Set up the register classes.
   addRegisterClass(MVT::i32, &WebAssembly::I32RegClass);
   addRegisterClass(MVT::i64, &WebAssembly::I64RegClass);
▲ Show 20 Lines • Show All 1,324 Lines • Show Last 20 Lines

llvm/test/CodeGen/AArch64/fp-const-fold.ll

Show All 12 Lines	; CHECK-NEXT: ret
   ret double %r
 }

 ; frem by 0.0 --> NaN

 define double @constant_fold_frem_by_zero(double* %p) {
 ; CHECK-LABEL: constant_fold_frem_by_zero:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov x8, #1
-; CHECK-NEXT: fmov d1, xzr
+; CHECK-NEXT: mov x8, #9221120237041090560
 ; CHECK-NEXT: fmov d0, x8
-; CHECK-NEXT: b fmod
+; CHECK-NEXT: ret
   %r = frem double 4.940660e-324, 0.0
   ret double %r
 }

 ; Inf * 0.0 --> NaN

 define double @constant_fold_fmul_nan(double* %p) {
 ; CHECK-LABEL: constant_fold_fmul_nan:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov x8, #9218868437227405312
-; CHECK-NEXT: fmov d0, xzr
-; CHECK-NEXT: fmov d1, x8
-; CHECK-NEXT: fmul d0, d1, d0
+; CHECK-NEXT: mov x8, #9221120237041090560
+; CHECK-NEXT: fmov d0, x8
 ; CHECK-NEXT: ret
   %r = fmul double 0x7ff0000000000000, 0.0
   ret double %r
 }

 ; Inf + -Inf --> NaN

 define double @constant_fold_fadd_nan(double* %p) {
 ; CHECK-LABEL: constant_fold_fadd_nan:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov x8, #-4503599627370496
-; CHECK-NEXT: mov x9, #9218868437227405312
+; CHECK-NEXT: mov x8, #9221120237041090560
 ; CHECK-NEXT: fmov d0, x8
-; CHECK-NEXT: fmov d1, x9
-; CHECK-NEXT: fadd d0, d1, d0
 ; CHECK-NEXT: ret
   %r = fadd double 0x7ff0000000000000, 0xfff0000000000000
   ret double %r
 }

 ; Inf - Inf --> NaN

 define double @constant_fold_fsub_nan(double* %p) {
 ; CHECK-LABEL: constant_fold_fsub_nan:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov x8, #9218868437227405312
+; CHECK-NEXT: mov x8, #9221120237041090560
 ; CHECK-NEXT: fmov d0, x8
-; CHECK-NEXT: fsub d0, d0, d0
 ; CHECK-NEXT: ret
   %r = fsub double 0x7ff0000000000000, 0x7ff0000000000000
   ret double %r
 }

 ; Inf * 0.0 + ? --> NaN

 define double @constant_fold_fma_nan(double* %p) {
 ; CHECK-LABEL: constant_fold_fma_nan:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov x8, #4631107791820423168
-; CHECK-NEXT: mov x9, #9218868437227405312
-; CHECK-NEXT: fmov d0, xzr
-; CHECK-NEXT: fmov d1, x8
-; CHECK-NEXT: fmov d2, x9
-; CHECK-NEXT: fmadd d0, d2, d0, d1
+; CHECK-NEXT: mov x8, #9221120237041090560
+; CHECK-NEXT: fmov d0, x8
 ; CHECK-NEXT: ret
   %r = call double @llvm.fma.f64(double 0x7ff0000000000000, double 0.0, double 42.0)
   ret double %r
 }

 declare double @llvm.fma.f64(double, double, double)
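The decimal immediates in the CHECK lines of this test are the raw IEEE-754 bit patterns of the double constants involved. A small standalone C++ sketch (the helper `bitsToDouble` is hypothetical and not part of the patch) decodes them, assuming a platform where `double` is a 64-bit IEEE-754 binary64 value:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as a double. memcpy is the portable way to
// do this before C++20's std::bit_cast.
double bitsToDouble(std::uint64_t Bits) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}
```

With this helper, `9221120237041090560` (0x7ff8000000000000) decodes to a quiet NaN, which is the folded result in every updated test; `9218868437227405312` (0x7ff0000000000000) is +Inf, `-4503599627370496` reinterpreted as unsigned (0xfff0000000000000) is -Inf, and `4631107791820423168` is 42.0.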
