This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] lower math intrinsics to finite version of libcalls when possible (PR35672)
Closed, Public

Authored by spatel on Dec 17 2017, 4:53 PM.

Details

Reviewers

hfinkel
efriedma
andrew.w.kaylor

Commits

rG37e28e40cbec: [SelectionDAG] lower math intrinsics to finite version of libcalls when…
rL322087: [SelectionDAG] lower math intrinsics to finite version of libcalls when…

Summary

This is a partial implementation of a fix for PR35672:
https://bugs.llvm.org/show_bug.cgi?id=35672

If this is on the right track, then I can add similar code for other transcendentals (exp2, log, log10, log2, pow).

Some questions:

Do the finite calls need the double-leading underscores? I saw an existing test with __sqrt_finite, so I assume we want those, but I'm not sure if/how the regular calls acquire the underscores.
Does this make sense for ISD::STRICT_FEXP (the strict version of the node)?
Does the mathlib actually support the long double variants?

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. Dec 17 2017, 4:53 PM

Herald added a subscriber: mcrosier. Dec 17 2017, 4:53 PM

Do the finite calls need the double-leading underscores? I saw an existing test with __sqrt_finite, so I assume we want those, but I'm not sure if/how the regular calls acquire the underscores.

That's just how they're named; the names come from the glibc headers.

Does this make sense for ISD::STRICT_FEXP (the strict version of the node)?

I would guess strict and fast math don't really mix...

Does the mathlib actually support the long double variants?

long double variants should exist, as far as I know... at least for the "long double" which is native for the platform.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4049 ↗	(On Diff #127295)	"getLibInfo().has(LibFunc_exp_finite)" probably isn't the right check. There are three __exp*_finite variants; each of them may or may not be legal, and you need to check that the long double variant actually accepts the input float type. Also, it looks like TargetLibraryInfo doesn't contain the appropriate checks to disable them (it assumes any non-Windows platform has __exp_finite, which is clearly false).

In D41338#958611, @efriedma wrote:

Do the finite calls need the double-leading underscores? I saw an existing test with __sqrt_finite, so I assume we want those, but I'm not sure if/how the regular calls acquire the underscores.

That's just how they're named; the names come from the glibc headers.

Does this make sense for ISD::STRICT_FEXP (the strict version of the node)?

I would guess strict and fast math don't really mix...

I agree.

Does the mathlib actually support the long double variants?

long double variants should exist, as far as I know... at least for the "long double" which is native for the platform.

Correct.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4049 ↗	(On Diff #127295)

and you need to check that the long double variant actually accepts the input float type.

How would we check this?

(it assumes any non-Windows platform has __exp_finite, which is clearly false).

Hrmm. We could start with disabling them for any non-GNU triple.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4049 ↗	(On Diff #127295)	For figuring out whether the long-double variant is the one we need, I guess TargetLibraryInfo should know? We need to avoid generating a call with undefined behavior somehow. And we probably need to eventually teach this code how to generate a call to __expf128_finite when it's available.

hfinkel added inline comments. Dec 20 2017, 12:37 PM

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
4049 ↗	(On Diff #127295)

For figuring out whether the long-double variant is the one we need, I guess TargetLibraryInfo should know? We need to avoid generating a call with undefined behavior somehow.

I don't think that anything knows right now (except Clang). I think that we can generate <foo>l/__<foo>l_finite calls for ppcf128/f80 and <foo>f128/__<foo>f128_finite for f128 for x86 and ppc specifically (as we have special types, ppcf128 and f80, for long double on those platforms), and calls to <foo>l/__<foo>l_finite for f128 everywhere else. No?

And we probably need to eventually teach this code how to generate a call to __expf128_finite when it's available.

I agree. glibc added a whole bunch of these now.
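The type-to-libcall mapping proposed in the comment above can be sketched in standalone C++. This is only an illustration: the FPType and Arch enums and both helper functions are hypothetical names, not LLVM APIs, and the committed patch does not implement this scheme (its RuntimeLibcalls.def entries map F128 to the "l" names on every target).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the IR floating-point types discussed above.
enum class FPType { F32, F64, F80, F128, PPCF128 };
// Hypothetical target classification; only x86 and PPC are special-cased.
enum class Arch { X86, PPC, Other };

// Returns the libm name suffix for a type under the proposed scheme.
std::string libcallSuffix(FPType Ty, Arch A) {
  switch (Ty) {
  case FPType::F32:
    return "f";
  case FPType::F64:
    return "";
  case FPType::F80:
  case FPType::PPCF128:
    // These types are the native long double where they occur.
    return "l";
  case FPType::F128:
    // On x86 and PPC, long double is f80 or ppcf128, so f128 needs its own
    // entry point; elsewhere f128 is assumed to be long double itself.
    return (A == Arch::X86 || A == Arch::PPC) ? "f128" : "l";
  }
  assert(false && "unhandled floating-point type");
  return "";
}

// Builds either the regular name ("expl") or the finite name
// ("__expl_finite") for a base function such as "exp".
std::string libcallName(const std::string &Base, FPType Ty, Arch A,
                        bool UseFinite) {
  assert(!Base.empty() && "expected a base name such as \"exp\"");
  std::string Name = Base + libcallSuffix(Ty, A);
  return UseFinite ? "__" + Name + "_finite" : Name;
}
```

Under these assumptions, libcallName("exp", FPType::F128, Arch::X86, true) yields "__expf128_finite", while the same call with Arch::Other yields "__expl_finite".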

and calls to <foo>l/__<foo>l_finite for f128 everywhere else. No?

There are four ways to lower C long double to LLVM IR: fp128, ppc_f128, x86_fp80, and double. If you assume the IR doesn't contain any intrinsic calls which can't be lowered, the logic becomes simpler, but I'm not sure that's a safe assumption.

In D41338#961372, @efriedma wrote:

and calls to <foo>l/__<foo>l_finite for f128 everywhere else. No?

There are four ways to lower C long double to LLVM IR: fp128, ppc_f128, x86_fp80, and double. If you assume the IR doesn't contain any intrinsic calls which can't be lowered, the logic becomes simpler, but I'm not sure that's a safe assumption.

Historically, I believe that we've made this assumption. That's, in part, why all of the intrinsics say "Not all targets support all types however." in the LangRef.

That's, in part, why all of the intrinsics say "Not all targets support all types however." in the LangRef.

I think most people would assume that means you'll get some sort of error from the backend, not a silent miscompile. But I guess it isn't any worse than what we do now for other math library calls.

In D41338#960487, @hfinkel wrote:

In D41338#958611, @efriedma wrote:

Does this make sense for ISD::STRICT_FEXP (the strict version of the node)?

I would guess strict and fast math don't really mix...

I agree.

Sorry for the delay. Let me address this comment first. I would agree too, but I was informed that there's a clang customer in the gaming world that wants to compile with -ffast-math and enable div-by-zero FP exceptions as a way to sanitize their data (at least in development builds).

I assume that we're still a long way from realizing this dream (optimized FP + some subset of FP exceptions enabled) in LLVM, but if there's no correctness issue with allowing this transform, then I think we should treat these nodes the same in this patch.

Patch updated:

Fixed the availability checking for finite functions in TargetLibraryInfoImpl - these only exist on Linux.
Added a darwin triple to the test file to show that is behaving as expected (no finite calls there).
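The gating described in this update can be sketched as standalone C++. This is a simplified model, not the patch itself: TargetOS, FPMathOptions, hasFiniteLibcalls, and selectLibcall are hypothetical names, and the real availability check lives in TargetLibraryInfoImpl rather than in a three-value enum.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the OS component of the target triple.
enum class TargetOS { Linux, Windows, Darwin };

// Hypothetical stand-in for the two TargetOptions flags the patch consults.
struct FPMathOptions {
  bool NoInfsFPMath;
  bool NoNaNsFPMath;
};

// Assumption taken from the update above: the finite entry points
// exist only on Linux.
bool hasFiniteLibcalls(TargetOS OS) { return OS == TargetOS::Linux; }

// Picks the finite libcall only when both fast-math conditions hold and
// the target's math library provides it; otherwise keeps the regular call.
std::string selectLibcall(const std::string &Name,
                          const std::string &FiniteName, TargetOS OS,
                          const FPMathOptions &Opts) {
  assert(!Name.empty() && !FiniteName.empty() && "expected libcall names");
  bool CanUseFiniteLibCall = Opts.NoInfsFPMath && Opts.NoNaNsFPMath;
  if (CanUseFiniteLibCall && hasFiniteLibcalls(OS))
    return FiniteName;
  return Name;
}
```

With both flags set, the Linux case selects "__expf_finite" and the Windows and Darwin cases keep "expf", which matches the GNU, WIN, and MAC check lines in the test file (the leading underscore in the MAC lines is the platform's symbol prefix, not a different function).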

IIUC, if we're going to improve the long double availability / lowering, then we'll want to do that for all of the math functions (not just the finite variants). So I deferred making any changes related to that here. Also, I haven't added {exp2, log, log10, log2, pow} yet just to keep the patch small until we're sure this is correct. If it is correct, then the other nodes should be simple edits of how we handle exp, and I'll add them.

Ping.

In D41338#964520, @spatel wrote:

In D41338#960487, @hfinkel wrote:

In D41338#958611, @efriedma wrote:

Does this make sense for ISD::STRICT_FEXP (the strict version of the node)?

I would guess strict and fast math don't really mix...

I agree.

Sorry for the delay. Let me address this comment first. I would agree too, but I was informed that there's a clang customer in the gaming world that wants to compile with -ffast-math and enable div-by-zero FP exceptions as a way to sanitize their data (at least in development builds).

I assume that we're still a long way from realizing this dream (optimized FP + some subset of FP exceptions enabled) in LLVM, but if there's no correctness issue with allowing this transform, then I think we should treat these nodes the same in this patch.

I feel like there are still a lot of things that need to be figured out with regard to StrictFP and ISel. What's there now is just kind of a "get something working" solution, and I think it will need to be beefed up later.

Regarding the original question about mixing fast math and strict FP, I think what the gaming customer is asking for sounds reasonable. The strict FP intrinsics handle two things that are at least potentially separate: value safety and exception behavior. The fast math flags mostly affect value safety, but there are some transformations that could change exception behavior. That will have to be thought through when individual transformations are taught to recognize the constrained intrinsics. More generally, it's certainly reasonable to indicate via the intrinsics that you want to preserve exception behavior but that you don't care about rounding mode. How the user would get the front end to do that is another question, because I think it will have to be something other than using the FENV_ACCESS pragma and -ffast-math together (which I would hope would trigger a warning saying that one of them was going to be ignored).

After thinking through this particular situation a bit more with regard to the STRICT_FEXP node, I think what you've chosen to do here is probably correct. I'm not entirely certain what the exp_finite implementation does, but I would expect that with regard to rounding it will produce the same result as the normal function as long as the input is finite. Similarly, I think that the exception behavior of exp_finite should be the same as the non-finite version as long as the input is finite. If the input is non-finite, then I would expect that the appropriate exception was raised or status flag set by whatever produced the value. I don't think either exp or exp_finite will produce an exception for non-finite values. We'll get the wrong answer with exp_finite of course, but the user signed up for that when using the fast math flags.
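One way to probe the exception claim for the ordinary libm exp is a small standalone program that clears the floating-point status flags, calls exp, and reads the flags back. This is a sketch under assumptions: probeExp and ProbeResult are hypothetical names, it exercises std::exp rather than __exp_finite (which may not be linkable on a given system), and the flags actually observed depend on the C library and on compiler settings.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Result of one probe: the value returned by exp and the status flags
// that were set while it ran.
struct ProbeResult {
  double Value;
  int RaisedFlags;
};

// Calls std::exp on Input with all floating-point status flags cleared
// beforehand, then reports which flags are set afterwards.
ProbeResult probeExp(double Input) {
  // volatile keeps the compiler from folding the call at compile time.
  volatile double X = Input;
  int ClearFailed = std::feclearexcept(FE_ALL_EXCEPT);
  assert(ClearFailed == 0 && "could not clear floating-point flags");
  (void)ClearFailed;
  volatile double Result = std::exp(X);
  int Raised = std::fetestexcept(FE_ALL_EXCEPT);
  return {Result, Raised};
}
```

Comparing RaisedFlags for an infinite or NaN input against a finite one (for example by masking with FE_INVALID or FE_OVERFLOW) would show whether exp itself raises an exception in those cases on the system under test.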

In general, as this implementation approach is propagated to other functions, I think these are the questions we need to ask. Will the finite version hide exceptions that would have been raised by the normal function call or produce exceptions that wouldn't have been produced otherwise? I'd need to look into each function implementation, but I can imagine cases where assuming finite inputs could lead to changes in exception behavior.

LGTM

This revision is now accepted and ready to land. Jan 6 2018, 6:33 PM

I noticed that we can expose the availability bug via constant propagation, so I split that part of the patch off here:
rL322010

Closed by commit rL322087: [SelectionDAG] lower math intrinsics to finite version of libcalls when… (authored by spatel). Jan 9 2018, 7:42 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                        Size
llvm/trunk/include/llvm/CodeGen/RuntimeLibcalls.def         30 lines
llvm/trunk/include/llvm/CodeGen/SelectionDAG.h              5 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp         80 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp        3 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp    2 lines
llvm/trunk/test/CodeGen/X86/finite-libcalls.ll              399 lines

Diff 129086

llvm/trunk/include/llvm/CodeGen/RuntimeLibcalls.def

	[124 lines not shown]
	HANDLE_LIBCALL(SQRT_F80, "sqrtl")			HANDLE_LIBCALL(SQRT_F80, "sqrtl")
	HANDLE_LIBCALL(SQRT_F128, "sqrtl")			HANDLE_LIBCALL(SQRT_F128, "sqrtl")
	HANDLE_LIBCALL(SQRT_PPCF128, "sqrtl")			HANDLE_LIBCALL(SQRT_PPCF128, "sqrtl")
	HANDLE_LIBCALL(LOG_F32, "logf")			HANDLE_LIBCALL(LOG_F32, "logf")
	HANDLE_LIBCALL(LOG_F64, "log")			HANDLE_LIBCALL(LOG_F64, "log")
	HANDLE_LIBCALL(LOG_F80, "logl")			HANDLE_LIBCALL(LOG_F80, "logl")
	HANDLE_LIBCALL(LOG_F128, "logl")			HANDLE_LIBCALL(LOG_F128, "logl")
	HANDLE_LIBCALL(LOG_PPCF128, "logl")			HANDLE_LIBCALL(LOG_PPCF128, "logl")
				HANDLE_LIBCALL(LOG_FINITE_F32, "__logf_finite")
				HANDLE_LIBCALL(LOG_FINITE_F64, "__log_finite")
				HANDLE_LIBCALL(LOG_FINITE_F80, "__logl_finite")
				HANDLE_LIBCALL(LOG_FINITE_F128, "__logl_finite")
				HANDLE_LIBCALL(LOG_FINITE_PPCF128, "__logl_finite")
	HANDLE_LIBCALL(LOG2_F32, "log2f")			HANDLE_LIBCALL(LOG2_F32, "log2f")
	HANDLE_LIBCALL(LOG2_F64, "log2")			HANDLE_LIBCALL(LOG2_F64, "log2")
	HANDLE_LIBCALL(LOG2_F80, "log2l")			HANDLE_LIBCALL(LOG2_F80, "log2l")
	HANDLE_LIBCALL(LOG2_F128, "log2l")			HANDLE_LIBCALL(LOG2_F128, "log2l")
	HANDLE_LIBCALL(LOG2_PPCF128, "log2l")			HANDLE_LIBCALL(LOG2_PPCF128, "log2l")
				HANDLE_LIBCALL(LOG2_FINITE_F32, "__log2f_finite")
				HANDLE_LIBCALL(LOG2_FINITE_F64, "__log2_finite")
				HANDLE_LIBCALL(LOG2_FINITE_F80, "__log2l_finite")
				HANDLE_LIBCALL(LOG2_FINITE_F128, "__log2l_finite")
				HANDLE_LIBCALL(LOG2_FINITE_PPCF128, "__log2l_finite")
	HANDLE_LIBCALL(LOG10_F32, "log10f")			HANDLE_LIBCALL(LOG10_F32, "log10f")
	HANDLE_LIBCALL(LOG10_F64, "log10")			HANDLE_LIBCALL(LOG10_F64, "log10")
	HANDLE_LIBCALL(LOG10_F80, "log10l")			HANDLE_LIBCALL(LOG10_F80, "log10l")
	HANDLE_LIBCALL(LOG10_F128, "log10l")			HANDLE_LIBCALL(LOG10_F128, "log10l")
	HANDLE_LIBCALL(LOG10_PPCF128, "log10l")			HANDLE_LIBCALL(LOG10_PPCF128, "log10l")
				HANDLE_LIBCALL(LOG10_FINITE_F32, "__log10f_finite")
				HANDLE_LIBCALL(LOG10_FINITE_F64, "__log10_finite")
				HANDLE_LIBCALL(LOG10_FINITE_F80, "__log10l_finite")
				HANDLE_LIBCALL(LOG10_FINITE_F128, "__log10l_finite")
				HANDLE_LIBCALL(LOG10_FINITE_PPCF128, "__log10l_finite")
	HANDLE_LIBCALL(EXP_F32, "expf")			HANDLE_LIBCALL(EXP_F32, "expf")
	HANDLE_LIBCALL(EXP_F64, "exp")			HANDLE_LIBCALL(EXP_F64, "exp")
	HANDLE_LIBCALL(EXP_F80, "expl")			HANDLE_LIBCALL(EXP_F80, "expl")
	HANDLE_LIBCALL(EXP_F128, "expl")			HANDLE_LIBCALL(EXP_F128, "expl")
	HANDLE_LIBCALL(EXP_PPCF128, "expl")			HANDLE_LIBCALL(EXP_PPCF128, "expl")
				HANDLE_LIBCALL(EXP_FINITE_F32, "__expf_finite")
				HANDLE_LIBCALL(EXP_FINITE_F64, "__exp_finite")
				HANDLE_LIBCALL(EXP_FINITE_F80, "__expl_finite")
				HANDLE_LIBCALL(EXP_FINITE_F128, "__expl_finite")
				HANDLE_LIBCALL(EXP_FINITE_PPCF128, "__expl_finite")
	HANDLE_LIBCALL(EXP2_F32, "exp2f")			HANDLE_LIBCALL(EXP2_F32, "exp2f")
	HANDLE_LIBCALL(EXP2_F64, "exp2")			HANDLE_LIBCALL(EXP2_F64, "exp2")
	HANDLE_LIBCALL(EXP2_F80, "exp2l")			HANDLE_LIBCALL(EXP2_F80, "exp2l")
	HANDLE_LIBCALL(EXP2_F128, "exp2l")			HANDLE_LIBCALL(EXP2_F128, "exp2l")
	HANDLE_LIBCALL(EXP2_PPCF128, "exp2l")			HANDLE_LIBCALL(EXP2_PPCF128, "exp2l")
				HANDLE_LIBCALL(EXP2_FINITE_F32, "__exp2f_finite")
				HANDLE_LIBCALL(EXP2_FINITE_F64, "__exp2_finite")
				HANDLE_LIBCALL(EXP2_FINITE_F80, "__exp2l_finite")
				HANDLE_LIBCALL(EXP2_FINITE_F128, "__exp2l_finite")
				HANDLE_LIBCALL(EXP2_FINITE_PPCF128, "__exp2l_finite")
	HANDLE_LIBCALL(SIN_F32, "sinf")			HANDLE_LIBCALL(SIN_F32, "sinf")
	HANDLE_LIBCALL(SIN_F64, "sin")			HANDLE_LIBCALL(SIN_F64, "sin")
	HANDLE_LIBCALL(SIN_F80, "sinl")			HANDLE_LIBCALL(SIN_F80, "sinl")
	HANDLE_LIBCALL(SIN_F128, "sinl")			HANDLE_LIBCALL(SIN_F128, "sinl")
	HANDLE_LIBCALL(SIN_PPCF128, "sinl")			HANDLE_LIBCALL(SIN_PPCF128, "sinl")
	HANDLE_LIBCALL(COS_F32, "cosf")			HANDLE_LIBCALL(COS_F32, "cosf")
	HANDLE_LIBCALL(COS_F64, "cos")			HANDLE_LIBCALL(COS_F64, "cos")
	HANDLE_LIBCALL(COS_F80, "cosl")			HANDLE_LIBCALL(COS_F80, "cosl")
	HANDLE_LIBCALL(COS_F128, "cosl")			HANDLE_LIBCALL(COS_F128, "cosl")
	HANDLE_LIBCALL(COS_PPCF128, "cosl")			HANDLE_LIBCALL(COS_PPCF128, "cosl")
	HANDLE_LIBCALL(SINCOS_F32, nullptr)			HANDLE_LIBCALL(SINCOS_F32, nullptr)
	HANDLE_LIBCALL(SINCOS_F64, nullptr)			HANDLE_LIBCALL(SINCOS_F64, nullptr)
	HANDLE_LIBCALL(SINCOS_F80, nullptr)			HANDLE_LIBCALL(SINCOS_F80, nullptr)
	HANDLE_LIBCALL(SINCOS_F128, nullptr)			HANDLE_LIBCALL(SINCOS_F128, nullptr)
	HANDLE_LIBCALL(SINCOS_PPCF128, nullptr)			HANDLE_LIBCALL(SINCOS_PPCF128, nullptr)
	HANDLE_LIBCALL(SINCOS_STRET_F32, nullptr)			HANDLE_LIBCALL(SINCOS_STRET_F32, nullptr)
	HANDLE_LIBCALL(SINCOS_STRET_F64, nullptr)			HANDLE_LIBCALL(SINCOS_STRET_F64, nullptr)
	HANDLE_LIBCALL(POW_F32, "powf")			HANDLE_LIBCALL(POW_F32, "powf")
	HANDLE_LIBCALL(POW_F64, "pow")			HANDLE_LIBCALL(POW_F64, "pow")
	HANDLE_LIBCALL(POW_F80, "powl")			HANDLE_LIBCALL(POW_F80, "powl")
	HANDLE_LIBCALL(POW_F128, "powl")			HANDLE_LIBCALL(POW_F128, "powl")
	HANDLE_LIBCALL(POW_PPCF128, "powl")			HANDLE_LIBCALL(POW_PPCF128, "powl")
				HANDLE_LIBCALL(POW_FINITE_F32, "__powf_finite")
				HANDLE_LIBCALL(POW_FINITE_F64, "__pow_finite")
				HANDLE_LIBCALL(POW_FINITE_F80, "__powl_finite")
				HANDLE_LIBCALL(POW_FINITE_F128, "__powl_finite")
				HANDLE_LIBCALL(POW_FINITE_PPCF128, "__powl_finite")
	HANDLE_LIBCALL(CEIL_F32, "ceilf")			HANDLE_LIBCALL(CEIL_F32, "ceilf")
	HANDLE_LIBCALL(CEIL_F64, "ceil")			HANDLE_LIBCALL(CEIL_F64, "ceil")
	HANDLE_LIBCALL(CEIL_F80, "ceill")			HANDLE_LIBCALL(CEIL_F80, "ceill")
	HANDLE_LIBCALL(CEIL_F128, "ceill")			HANDLE_LIBCALL(CEIL_F128, "ceill")
	HANDLE_LIBCALL(CEIL_PPCF128, "ceill")			HANDLE_LIBCALL(CEIL_PPCF128, "ceill")
	HANDLE_LIBCALL(TRUNC_F32, "truncf")			HANDLE_LIBCALL(TRUNC_F32, "truncf")
	HANDLE_LIBCALL(TRUNC_F64, "trunc")			HANDLE_LIBCALL(TRUNC_F64, "trunc")
	HANDLE_LIBCALL(TRUNC_F80, "truncl")			HANDLE_LIBCALL(TRUNC_F80, "truncl")
	[313 lines not shown]

llvm/trunk/include/llvm/CodeGen/SelectionDAG.h

	[67 lines not shown]
	class LLVMContext;			class LLVMContext;
	class MachineBasicBlock;			class MachineBasicBlock;
	class MachineConstantPoolValue;			class MachineConstantPoolValue;
	class MCSymbol;			class MCSymbol;
	class OptimizationRemarkEmitter;			class OptimizationRemarkEmitter;
	class SDDbgValue;			class SDDbgValue;
	class SelectionDAG;			class SelectionDAG;
	class SelectionDAGTargetInfo;			class SelectionDAGTargetInfo;
				class TargetLibraryInfo;
	class TargetLowering;			class TargetLowering;
	class TargetMachine;			class TargetMachine;
	class TargetSubtargetInfo;			class TargetSubtargetInfo;
	class Value;			class Value;

	class SDVTListNode : public FoldingSetNode {			class SDVTListNode : public FoldingSetNode {
	friend struct FoldingSetTrait<SDVTListNode>;			friend struct FoldingSetTrait<SDVTListNode>;

	[121 lines not shown]
	/// representation, which has some similarities to the GCC RTL representation,			/// representation, which has some similarities to the GCC RTL representation,
	/// but is significantly more simple, powerful, and is a graph form instead of a			/// but is significantly more simple, powerful, and is a graph form instead of a
	/// linear form.			/// linear form.
	///			///
	class SelectionDAG {			class SelectionDAG {
	const TargetMachine &TM;			const TargetMachine &TM;
	const SelectionDAGTargetInfo *TSI = nullptr;			const SelectionDAGTargetInfo *TSI = nullptr;
	const TargetLowering *TLI = nullptr;			const TargetLowering *TLI = nullptr;
				const TargetLibraryInfo *LibInfo = nullptr;
	MachineFunction *MF;			MachineFunction *MF;
	Pass *SDAGISelPass = nullptr;			Pass *SDAGISelPass = nullptr;
	LLVMContext *Context;			LLVMContext *Context;
	CodeGenOpt::Level OptLevel;			CodeGenOpt::Level OptLevel;

	/// The function-level optimization remark emitter. Used to emit remarks			/// The function-level optimization remark emitter. Used to emit remarks
	/// whenever manipulating the DAG.			/// whenever manipulating the DAG.
	OptimizationRemarkEmitter *ORE;			OptimizationRemarkEmitter *ORE;
	[150 lines not shown]
	public:			public:
	explicit SelectionDAG(const TargetMachine &TM, CodeGenOpt::Level);			explicit SelectionDAG(const TargetMachine &TM, CodeGenOpt::Level);
	SelectionDAG(const SelectionDAG &) = delete;			SelectionDAG(const SelectionDAG &) = delete;
	SelectionDAG &operator=(const SelectionDAG &) = delete;			SelectionDAG &operator=(const SelectionDAG &) = delete;
	~SelectionDAG();			~SelectionDAG();

	/// Prepare this SelectionDAG to process code in the given MachineFunction.			/// Prepare this SelectionDAG to process code in the given MachineFunction.
	void init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE,			void init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE,
	Pass *PassPtr);			Pass *PassPtr, const TargetLibraryInfo *LibraryInfo);

	/// Clear state and free memory necessary to make this			/// Clear state and free memory necessary to make this
	/// SelectionDAG ready to process a new block.			/// SelectionDAG ready to process a new block.
	void clear();			void clear();

	MachineFunction &getMachineFunction() const { return *MF; }			MachineFunction &getMachineFunction() const { return *MF; }
	const Pass *getPass() const { return SDAGISelPass; }			const Pass *getPass() const { return SDAGISelPass; }

	const DataLayout &getDataLayout() const { return MF->getDataLayout(); }			const DataLayout &getDataLayout() const { return MF->getDataLayout(); }
	const TargetMachine &getTarget() const { return TM; }			const TargetMachine &getTarget() const { return TM; }
	const TargetSubtargetInfo &getSubtarget() const { return MF->getSubtarget(); }			const TargetSubtargetInfo &getSubtarget() const { return MF->getSubtarget(); }
	const TargetLowering &getTargetLoweringInfo() const { return *TLI; }			const TargetLowering &getTargetLoweringInfo() const { return *TLI; }
				const TargetLibraryInfo &getLibInfo() const { return *LibInfo; }
	const SelectionDAGTargetInfo &getSelectionDAGInfo() const { return *TSI; }			const SelectionDAGTargetInfo &getSelectionDAGInfo() const { return *TSI; }
	LLVMContext *getContext() const {return Context; }			LLVMContext *getContext() const {return Context; }
	OptimizationRemarkEmitter &getORE() const { return *ORE; }			OptimizationRemarkEmitter &getORE() const { return *ORE; }

	/// Pop up a GraphViz/gv window with the DAG rendered using 'dot'.			/// Pop up a GraphViz/gv window with the DAG rendered using 'dot'.
	void viewGraph(const std::string &Title);			void viewGraph(const std::string &Title);
	void viewGraph();			void viewGraph();

	[1,202 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

[3,926 lines not shown]
ReplaceNode(Node, Results.data());		ReplaceNode(Node, Results.data());
return true;		return true;
}		}

void SelectionDAGLegalize::ConvertNodeToLibcall(SDNode *Node) {		void SelectionDAGLegalize::ConvertNodeToLibcall(SDNode *Node) {
DEBUG(dbgs() << "Trying to convert node to libcall\n");		DEBUG(dbgs() << "Trying to convert node to libcall\n");
SmallVector<SDValue, 8> Results;		SmallVector<SDValue, 8> Results;
SDLoc dl(Node);		SDLoc dl(Node);
		// FIXME: Check flags on the node to see if we can use a finite call.
		bool CanUseFiniteLibCall = TM.Options.NoInfsFPMath && TM.Options.NoNaNsFPMath;
unsigned Opc = Node->getOpcode();		unsigned Opc = Node->getOpcode();
switch (Opc) {		switch (Opc) {
case ISD::ATOMIC_FENCE: {		case ISD::ATOMIC_FENCE: {
// If the target didn't lower this, lower it to '__sync_synchronize()' call		// If the target didn't lower this, lower it to '__sync_synchronize()' call
// FIXME: handle "fence singlethread" more efficiently.		// FIXME: handle "fence singlethread" more efficiently.
TargetLowering::ArgListTy Args;		TargetLowering::ArgListTy Args;

TargetLowering::CallLoweringInfo CLI(DAG);		TargetLowering::CallLoweringInfo CLI(DAG);
[78 lines not shown]	Results.push_back(ExpandFPLibCall(Node, RTLIB::COS_F32, RTLIB::COS_F64,
RTLIB::COS_PPCF128));		RTLIB::COS_PPCF128));
break;		break;
case ISD::FSINCOS:		case ISD::FSINCOS:
// Expand into sincos libcall.		// Expand into sincos libcall.
ExpandSinCosLibCall(Node, Results);		ExpandSinCosLibCall(Node, Results);
break;		break;
case ISD::FLOG:		case ISD::FLOG:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
		if (CanUseFiniteLibCall && DAG.getLibInfo().has(LibFunc_log_finite))
		Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG_FINITE_F32,
		RTLIB::LOG_FINITE_F64,
		RTLIB::LOG_FINITE_F80,
		RTLIB::LOG_FINITE_F128,
		RTLIB::LOG_FINITE_PPCF128));
		else
Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG_F32, RTLIB::LOG_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG_F32, RTLIB::LOG_F64,
RTLIB::LOG_F80, RTLIB::LOG_F128,		RTLIB::LOG_F80, RTLIB::LOG_F128,
RTLIB::LOG_PPCF128));		RTLIB::LOG_PPCF128));
break;		break;
case ISD::FLOG2:		case ISD::FLOG2:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
		if (CanUseFiniteLibCall && DAG.getLibInfo().has(LibFunc_log2_finite))
		Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG2_FINITE_F32,
		RTLIB::LOG2_FINITE_F64,
		RTLIB::LOG2_FINITE_F80,
		RTLIB::LOG2_FINITE_F128,
		RTLIB::LOG2_FINITE_PPCF128));
		else
Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG2_F32, RTLIB::LOG2_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG2_F32, RTLIB::LOG2_F64,
RTLIB::LOG2_F80, RTLIB::LOG2_F128,		RTLIB::LOG2_F80, RTLIB::LOG2_F128,
RTLIB::LOG2_PPCF128));		RTLIB::LOG2_PPCF128));
break;		break;
case ISD::FLOG10:		case ISD::FLOG10:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
		if (CanUseFiniteLibCall && DAG.getLibInfo().has(LibFunc_log10_finite))
		Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG10_FINITE_F32,
		RTLIB::LOG10_FINITE_F64,
		RTLIB::LOG10_FINITE_F80,
		RTLIB::LOG10_FINITE_F128,
		RTLIB::LOG10_FINITE_PPCF128));
		else
Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG10_F32, RTLIB::LOG10_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::LOG10_F32, RTLIB::LOG10_F64,
RTLIB::LOG10_F80, RTLIB::LOG10_F128,		RTLIB::LOG10_F80, RTLIB::LOG10_F128,
RTLIB::LOG10_PPCF128));		RTLIB::LOG10_PPCF128));
break;		break;
case ISD::FEXP:		case ISD::FEXP:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
		if (CanUseFiniteLibCall && DAG.getLibInfo().has(LibFunc_exp_finite))
		Results.push_back(ExpandFPLibCall(Node, RTLIB::EXP_FINITE_F32,
		RTLIB::EXP_FINITE_F64,
		RTLIB::EXP_FINITE_F80,
		RTLIB::EXP_FINITE_F128,
		RTLIB::EXP_FINITE_PPCF128));
		else
Results.push_back(ExpandFPLibCall(Node, RTLIB::EXP_F32, RTLIB::EXP_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::EXP_F32, RTLIB::EXP_F64,
RTLIB::EXP_F80, RTLIB::EXP_F128,		RTLIB::EXP_F80, RTLIB::EXP_F128,
RTLIB::EXP_PPCF128));		RTLIB::EXP_PPCF128));
break;		break;
case ISD::FEXP2:		case ISD::FEXP2:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
		if (CanUseFiniteLibCall && DAG.getLibInfo().has(LibFunc_exp2_finite))
		Results.push_back(ExpandFPLibCall(Node, RTLIB::EXP2_FINITE_F32,
		RTLIB::EXP2_FINITE_F64,
		RTLIB::EXP2_FINITE_F80,
		RTLIB::EXP2_FINITE_F128,
		RTLIB::EXP2_FINITE_PPCF128));
		else
Results.push_back(ExpandFPLibCall(Node, RTLIB::EXP2_F32, RTLIB::EXP2_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::EXP2_F32, RTLIB::EXP2_F64,
RTLIB::EXP2_F80, RTLIB::EXP2_F128,		RTLIB::EXP2_F80, RTLIB::EXP2_F128,
RTLIB::EXP2_PPCF128));		RTLIB::EXP2_PPCF128));
break;		break;
case ISD::FTRUNC:		case ISD::FTRUNC:
Results.push_back(ExpandFPLibCall(Node, RTLIB::TRUNC_F32, RTLIB::TRUNC_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::TRUNC_F32, RTLIB::TRUNC_F64,
RTLIB::TRUNC_F80, RTLIB::TRUNC_F128,		RTLIB::TRUNC_F80, RTLIB::TRUNC_F128,
RTLIB::TRUNC_PPCF128));		RTLIB::TRUNC_PPCF128));
break;		break;
case ISD::FFLOOR:		case ISD::FFLOOR:
Results.push_back(ExpandFPLibCall(Node, RTLIB::FLOOR_F32, RTLIB::FLOOR_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::FLOOR_F32, RTLIB::FLOOR_F64,
[29 lines not shown]	void SelectionDAGLegalize::ConvertNodeToLibcall(SDNode *Node) {
case ISD::FPOWI:		case ISD::FPOWI:
case ISD::STRICT_FPOWI:		case ISD::STRICT_FPOWI:
Results.push_back(ExpandFPLibCall(Node, RTLIB::POWI_F32, RTLIB::POWI_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::POWI_F32, RTLIB::POWI_F64,
RTLIB::POWI_F80, RTLIB::POWI_F128,		RTLIB::POWI_F80, RTLIB::POWI_F128,
RTLIB::POWI_PPCF128));		RTLIB::POWI_PPCF128));
break;		break;
case ISD::FPOW:		case ISD::FPOW:
case ISD::STRICT_FPOW:		case ISD::STRICT_FPOW:
		if (CanUseFiniteLibCall && DAG.getLibInfo().has(LibFunc_pow_finite))
		Results.push_back(ExpandFPLibCall(Node, RTLIB::POW_FINITE_F32,
		RTLIB::POW_FINITE_F64,
		RTLIB::POW_FINITE_F80,
		RTLIB::POW_FINITE_F128,
		RTLIB::POW_FINITE_PPCF128));
		else
Results.push_back(ExpandFPLibCall(Node, RTLIB::POW_F32, RTLIB::POW_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::POW_F32, RTLIB::POW_F64,
RTLIB::POW_F80, RTLIB::POW_F128,		RTLIB::POW_F80, RTLIB::POW_F128,
RTLIB::POW_PPCF128));		RTLIB::POW_PPCF128));
break;		break;
case ISD::FDIV:		case ISD::FDIV:
Results.push_back(ExpandFPLibCall(Node, RTLIB::DIV_F32, RTLIB::DIV_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::DIV_F32, RTLIB::DIV_F64,
RTLIB::DIV_F80, RTLIB::DIV_F128,		RTLIB::DIV_F80, RTLIB::DIV_F128,
RTLIB::DIV_PPCF128));		RTLIB::DIV_PPCF128));
break;		break;
case ISD::FREM:		case ISD::FREM:
Results.push_back(ExpandFPLibCall(Node, RTLIB::REM_F32, RTLIB::REM_F64,		Results.push_back(ExpandFPLibCall(Node, RTLIB::REM_F32, RTLIB::REM_F64,
[597 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

[897 lines not shown]	: TM(tm), OptLevel(OL),
EntryNode(ISD::EntryToken, 0, DebugLoc(), getVTList(MVT::Other)),		EntryNode(ISD::EntryToken, 0, DebugLoc(), getVTList(MVT::Other)),
Root(getEntryNode()) {		Root(getEntryNode()) {
InsertNode(&EntryNode);		InsertNode(&EntryNode);
DbgInfo = new SDDbgInfo();		DbgInfo = new SDDbgInfo();
}		}

void SelectionDAG::init(MachineFunction &NewMF,		void SelectionDAG::init(MachineFunction &NewMF,
OptimizationRemarkEmitter &NewORE,		OptimizationRemarkEmitter &NewORE,
Pass *PassPtr) {		Pass *PassPtr, const TargetLibraryInfo *LibraryInfo) {
MF = &NewMF;		MF = &NewMF;
SDAGISelPass = PassPtr;		SDAGISelPass = PassPtr;
ORE = &NewORE;		ORE = &NewORE;
TLI = getSubtarget().getTargetLowering();		TLI = getSubtarget().getTargetLowering();
TSI = getSubtarget().getSelectionDAGInfo();		TSI = getSubtarget().getSelectionDAGInfo();
		LibInfo = LibraryInfo;
Context = &MF->getFunction().getContext();		Context = &MF->getFunction().getContext();
}		}

SelectionDAG::~SelectionDAG() {		SelectionDAG::~SelectionDAG() {
assert(!UpdateListeners && "Dangling registered DAGUpdateListeners");		assert(!UpdateListeners && "Dangling registered DAGUpdateListeners");
allnodes_clear();		allnodes_clear();
OperandRecycler.clear(OperandAllocator);		OperandRecycler.clear(OperandAllocator);
delete DbgInfo;		delete DbgInfo;
[7,353 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp

[408 lines not shown]	bool SelectionDAGISel::runOnMachineFunction(MachineFunction &mf) {
DominatorTree *DT = DTWP ? &DTWP->getDomTree() : nullptr;		DominatorTree *DT = DTWP ? &DTWP->getDomTree() : nullptr;
auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();		auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();
LoopInfo *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;		LoopInfo *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;

DEBUG(dbgs() << "\n\n\n=== " << Fn.getName() << "\n");		DEBUG(dbgs() << "\n\n\n=== " << Fn.getName() << "\n");

SplitCriticalSideEffectEdges(const_cast<Function &>(Fn), DT, LI);		SplitCriticalSideEffectEdges(const_cast<Function &>(Fn), DT, LI);

CurDAG->init(MF, ORE, this);		CurDAG->init(MF, ORE, this, LibInfo);
FuncInfo->set(Fn, *MF, CurDAG);		FuncInfo->set(Fn, *MF, CurDAG);

// Now get the optional analyzes if we want to.		// Now get the optional analyzes if we want to.
// This is based on the possibly changed OptLevel (after optnone is taken		// This is based on the possibly changed OptLevel (after optnone is taken
// into account). That's unfortunate but OK because it just means we won't		// into account). That's unfortunate but OK because it just means we won't
// ask for passes that have been required anyway.		// ask for passes that have been required anyway.

if (UseMBPI && OptLevel != CodeGenOpt::None)		if (UseMBPI && OptLevel != CodeGenOpt::None)
[3,396 lines not shown]

llvm/trunk/test/CodeGen/X86/finite-libcalls.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu | FileCheck %s --check-prefix=CHECK --check-prefix=GNU			; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu | FileCheck %s --check-prefix=CHECK --check-prefix=GNU
	; RUN: llc < %s -mtriple=x86_64-pc-windows-msvc | FileCheck %s --check-prefix=CHECK --check-prefix=WIN			; RUN: llc < %s -mtriple=x86_64-pc-windows-msvc | FileCheck %s --check-prefix=CHECK --check-prefix=WIN
				; RUN: llc < %s -mtriple=x86_64-apple-darwin | FileCheck %s --check-prefix=CHECK --check-prefix=MAC

	; PR35672 - https://bugs.llvm.org/show_bug.cgi?id=35672			; PR35672 - https://bugs.llvm.org/show_bug.cgi?id=35672
	; FIXME: We would not need the function-level attributes if FMF were propagated to DAG nodes for this case.			; FIXME: We would not need the function-level attributes if FMF were propagated to DAG nodes for this case.

	define float @exp_f32(float %x) #0 {			define float @exp_f32(float %x) #0 {
	; CHECK-LABEL: exp_f32:			; GNU-LABEL: exp_f32:
	; CHECK: # %bb.0:			; GNU: # %bb.0:
	; CHECK-NEXT: jmp expf # TAILCALL			; GNU-NEXT: jmp __expf_finite # TAILCALL
	%exp = tail call nnan ninf float @llvm.exp.f32(float %x)			;
	ret float %exp			; WIN-LABEL: exp_f32:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp expf # TAILCALL
				;
				; MAC-LABEL: exp_f32:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _expf ## TAILCALL
				%r = tail call nnan ninf float @llvm.exp.f32(float %x)
				ret float %r
	}			}

	define double @exp_f64(double %x) #0 {			define double @exp_f64(double %x) #0 {
	; CHECK-LABEL: exp_f64:			; GNU-LABEL: exp_f64:
	; CHECK: # %bb.0:			; GNU: # %bb.0:
	; CHECK-NEXT: jmp exp # TAILCALL			; GNU-NEXT: jmp __exp_finite # TAILCALL
	%exp = tail call nnan ninf double @llvm.exp.f64(double %x)			;
	ret double %exp			; WIN-LABEL: exp_f64:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp exp # TAILCALL
				;
				; MAC-LABEL: exp_f64:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _exp ## TAILCALL
				%r = tail call nnan ninf double @llvm.exp.f64(double %x)
				ret double %r
	}			}

	define x86_fp80 @exp_f80(x86_fp80 %x) #0 {			define x86_fp80 @exp_f80(x86_fp80 %x) #0 {
	; GNU-LABEL: exp_f80:			; GNU-LABEL: exp_f80:
	; GNU: # %bb.0:			; GNU: # %bb.0:
	; GNU-NEXT: subq $24, %rsp			; GNU-NEXT: subq $24, %rsp
	; GNU-NEXT: fldt {{[0-9]+}}(%rsp)			; GNU-NEXT: fldt {{[0-9]+}}(%rsp)
	; GNU-NEXT: fstpt (%rsp)			; GNU-NEXT: fstpt (%rsp)
	; GNU-NEXT: callq expl			; GNU-NEXT: callq __expl_finite
	; GNU-NEXT: addq $24, %rsp			; GNU-NEXT: addq $24, %rsp
	; GNU-NEXT: retq			; GNU-NEXT: retq
	;			;
	; WIN-LABEL: exp_f80:			; WIN-LABEL: exp_f80:
	; WIN: # %bb.0:			; WIN: # %bb.0:
	; WIN-NEXT: subq $56, %rsp			; WIN-NEXT: subq $56, %rsp
	; WIN-NEXT: fldt {{[0-9]+}}(%rsp)			; WIN-NEXT: fldt {{[0-9]+}}(%rsp)
	; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)			; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
	; WIN-NEXT: callq expl			; WIN-NEXT: callq expl
	; WIN-NEXT: addq $56, %rsp			; WIN-NEXT: addq $56, %rsp
	; WIN-NEXT: retq			; WIN-NEXT: retq
	%exp = tail call nnan ninf x86_fp80 @llvm.exp.f80(x86_fp80 %x)			;
	ret x86_fp80 %exp			; MAC-LABEL: exp_f80:
				; MAC: ## %bb.0:
				; MAC-NEXT: subq $24, %rsp
				; MAC-NEXT: fldt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fstpt (%rsp)
				; MAC-NEXT: callq _expl
				; MAC-NEXT: addq $24, %rsp
				; MAC-NEXT: retq
				%r = tail call nnan ninf x86_fp80 @llvm.exp.f80(x86_fp80 %x)
				ret x86_fp80 %r
				}

				define float @exp2_f32(float %x) #0 {
				; GNU-LABEL: exp2_f32:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __exp2f_finite # TAILCALL
				;
				; WIN-LABEL: exp2_f32:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp exp2f # TAILCALL
				;
				; MAC-LABEL: exp2_f32:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _exp2f ## TAILCALL
				%r = tail call nnan ninf float @llvm.exp2.f32(float %x)
				ret float %r
				}

				define double @exp2_f64(double %x) #0 {
				; GNU-LABEL: exp2_f64:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __exp2_finite # TAILCALL
				;
				; WIN-LABEL: exp2_f64:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp exp2 # TAILCALL
				;
				; MAC-LABEL: exp2_f64:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _exp2 ## TAILCALL
				%r = tail call nnan ninf double @llvm.exp2.f64(double %x)
				ret double %r
				}

				define x86_fp80 @exp2_f80(x86_fp80 %x) #0 {
				; GNU-LABEL: exp2_f80:
				; GNU: # %bb.0:
				; GNU-NEXT: subq $24, %rsp
				; GNU-NEXT: fldt {{[0-9]+}}(%rsp)
				; GNU-NEXT: fstpt (%rsp)
				; GNU-NEXT: callq __exp2l_finite
				; GNU-NEXT: addq $24, %rsp
				; GNU-NEXT: retq
				;
				; WIN-LABEL: exp2_f80:
				; WIN: # %bb.0:
				; WIN-NEXT: subq $56, %rsp
				; WIN-NEXT: fldt {{[0-9]+}}(%rsp)
				; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
				; WIN-NEXT: callq exp2l
				; WIN-NEXT: addq $56, %rsp
				; WIN-NEXT: retq
				;
				; MAC-LABEL: exp2_f80:
				; MAC: ## %bb.0:
				; MAC-NEXT: subq $24, %rsp
				; MAC-NEXT: fldt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fstpt (%rsp)
				; MAC-NEXT: callq _exp2l
				; MAC-NEXT: addq $24, %rsp
				; MAC-NEXT: retq
				%r = tail call nnan ninf x86_fp80 @llvm.exp2.f80(x86_fp80 %x)
				ret x86_fp80 %r
				}

				define float @log_f32(float %x) #0 {
				; GNU-LABEL: log_f32:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __logf_finite # TAILCALL
				;
				; WIN-LABEL: log_f32:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp logf # TAILCALL
				;
				; MAC-LABEL: log_f32:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _logf ## TAILCALL
				%r = tail call nnan ninf float @llvm.log.f32(float %x)
				ret float %r
				}

				define double @log_f64(double %x) #0 {
				; GNU-LABEL: log_f64:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __log_finite # TAILCALL
				;
				; WIN-LABEL: log_f64:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp log # TAILCALL
				;
				; MAC-LABEL: log_f64:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _log ## TAILCALL
				%r = tail call nnan ninf double @llvm.log.f64(double %x)
				ret double %r
				}

				define x86_fp80 @log_f80(x86_fp80 %x) #0 {
				; GNU-LABEL: log_f80:
				; GNU: # %bb.0:
				; GNU-NEXT: subq $24, %rsp
				; GNU-NEXT: fldt {{[0-9]+}}(%rsp)
				; GNU-NEXT: fstpt (%rsp)
				; GNU-NEXT: callq __logl_finite
				; GNU-NEXT: addq $24, %rsp
				; GNU-NEXT: retq
				;
				; WIN-LABEL: log_f80:
				; WIN: # %bb.0:
				; WIN-NEXT: subq $56, %rsp
				; WIN-NEXT: fldt {{[0-9]+}}(%rsp)
				; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
				; WIN-NEXT: callq logl
				; WIN-NEXT: addq $56, %rsp
				; WIN-NEXT: retq
				;
				; MAC-LABEL: log_f80:
				; MAC: ## %bb.0:
				; MAC-NEXT: subq $24, %rsp
				; MAC-NEXT: fldt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fstpt (%rsp)
				; MAC-NEXT: callq _logl
				; MAC-NEXT: addq $24, %rsp
				; MAC-NEXT: retq
				%r = tail call nnan ninf x86_fp80 @llvm.log.f80(x86_fp80 %x)
				ret x86_fp80 %r
				}

				define float @log2_f32(float %x) #0 {
				; GNU-LABEL: log2_f32:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __log2f_finite # TAILCALL
				;
				; WIN-LABEL: log2_f32:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp log2f # TAILCALL
				;
				; MAC-LABEL: log2_f32:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _log2f ## TAILCALL
				%r = tail call nnan ninf float @llvm.log2.f32(float %x)
				ret float %r
				}

				define double @log2_f64(double %x) #0 {
				; GNU-LABEL: log2_f64:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __log2_finite # TAILCALL
				;
				; WIN-LABEL: log2_f64:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp log2 # TAILCALL
				;
				; MAC-LABEL: log2_f64:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _log2 ## TAILCALL
				%r = tail call nnan ninf double @llvm.log2.f64(double %x)
				ret double %r
				}

				define x86_fp80 @log2_f80(x86_fp80 %x) #0 {
				; GNU-LABEL: log2_f80:
				; GNU: # %bb.0:
				; GNU-NEXT: subq $24, %rsp
				; GNU-NEXT: fldt {{[0-9]+}}(%rsp)
				; GNU-NEXT: fstpt (%rsp)
				; GNU-NEXT: callq __log2l_finite
				; GNU-NEXT: addq $24, %rsp
				; GNU-NEXT: retq
				;
				; WIN-LABEL: log2_f80:
				; WIN: # %bb.0:
				; WIN-NEXT: subq $56, %rsp
				; WIN-NEXT: fldt {{[0-9]+}}(%rsp)
				; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
				; WIN-NEXT: callq log2l
				; WIN-NEXT: addq $56, %rsp
				; WIN-NEXT: retq
				;
				; MAC-LABEL: log2_f80:
				; MAC: ## %bb.0:
				; MAC-NEXT: subq $24, %rsp
				; MAC-NEXT: fldt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fstpt (%rsp)
				; MAC-NEXT: callq _log2l
				; MAC-NEXT: addq $24, %rsp
				; MAC-NEXT: retq
				%r = tail call nnan ninf x86_fp80 @llvm.log2.f80(x86_fp80 %x)
				ret x86_fp80 %r
				}

				define float @log10_f32(float %x) #0 {
				; GNU-LABEL: log10_f32:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __log10f_finite # TAILCALL
				;
				; WIN-LABEL: log10_f32:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp log10f # TAILCALL
				;
				; MAC-LABEL: log10_f32:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _log10f ## TAILCALL
				%r = tail call nnan ninf float @llvm.log10.f32(float %x)
				ret float %r
				}

				define double @log10_f64(double %x) #0 {
				; GNU-LABEL: log10_f64:
				; GNU: # %bb.0:
				; GNU-NEXT: jmp __log10_finite # TAILCALL
				;
				; WIN-LABEL: log10_f64:
				; WIN: # %bb.0:
				; WIN-NEXT: jmp log10 # TAILCALL
				;
				; MAC-LABEL: log10_f64:
				; MAC: ## %bb.0:
				; MAC-NEXT: jmp _log10 ## TAILCALL
				%r = tail call nnan ninf double @llvm.log10.f64(double %x)
				ret double %r
				}

				define x86_fp80 @log10_f80(x86_fp80 %x) #0 {
				; GNU-LABEL: log10_f80:
				; GNU: # %bb.0:
				; GNU-NEXT: subq $24, %rsp
				; GNU-NEXT: fldt {{[0-9]+}}(%rsp)
				; GNU-NEXT: fstpt (%rsp)
				; GNU-NEXT: callq __log10l_finite
				; GNU-NEXT: addq $24, %rsp
				; GNU-NEXT: retq
				;
				; WIN-LABEL: log10_f80:
				; WIN: # %bb.0:
				; WIN-NEXT: subq $56, %rsp
				; WIN-NEXT: fldt {{[0-9]+}}(%rsp)
				; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
				; WIN-NEXT: callq log10l
				; WIN-NEXT: addq $56, %rsp
				; WIN-NEXT: retq
				;
				; MAC-LABEL: log10_f80:
				; MAC: ## %bb.0:
				; MAC-NEXT: subq $24, %rsp
				; MAC-NEXT: fldt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fstpt (%rsp)
				; MAC-NEXT: callq _log10l
				; MAC-NEXT: addq $24, %rsp
				; MAC-NEXT: retq
				%r = tail call nnan ninf x86_fp80 @llvm.log10.f80(x86_fp80 %x)
				ret x86_fp80 %r
				}

				define float @pow_f32(float %x) #0 {
				; GNU-LABEL: pow_f32:
				; GNU: # %bb.0:
				; GNU-NEXT: movaps %xmm0, %xmm1
				; GNU-NEXT: jmp __powf_finite # TAILCALL
				;
				; WIN-LABEL: pow_f32:
				; WIN: # %bb.0:
				; WIN-NEXT: movaps %xmm0, %xmm1
				; WIN-NEXT: jmp powf # TAILCALL
				;
				; MAC-LABEL: pow_f32:
				; MAC: ## %bb.0:
				; MAC-NEXT: movaps %xmm0, %xmm1
				; MAC-NEXT: jmp _powf ## TAILCALL
				%r = tail call nnan ninf float @llvm.pow.f32(float %x, float %x)
				ret float %r
				}

				define double @pow_f64(double %x) #0 {
				; GNU-LABEL: pow_f64:
				; GNU: # %bb.0:
				; GNU-NEXT: movaps %xmm0, %xmm1
				; GNU-NEXT: jmp __pow_finite # TAILCALL
				;
				; WIN-LABEL: pow_f64:
				; WIN: # %bb.0:
				; WIN-NEXT: movaps %xmm0, %xmm1
				; WIN-NEXT: jmp pow # TAILCALL
				;
				; MAC-LABEL: pow_f64:
				; MAC: ## %bb.0:
				; MAC-NEXT: movaps %xmm0, %xmm1
				; MAC-NEXT: jmp _pow ## TAILCALL
				%r = tail call nnan ninf double @llvm.pow.f64(double %x, double %x)
				ret double %r
				}

				define x86_fp80 @pow_f80(x86_fp80 %x) #0 {
				; GNU-LABEL: pow_f80:
				; GNU: # %bb.0:
				; GNU-NEXT: subq $40, %rsp
				; GNU-NEXT: fldt {{[0-9]+}}(%rsp)
				; GNU-NEXT: fld %st(0)
				; GNU-NEXT: fstpt {{[0-9]+}}(%rsp)
				; GNU-NEXT: fstpt (%rsp)
				; GNU-NEXT: callq __powl_finite
				; GNU-NEXT: addq $40, %rsp
				; GNU-NEXT: retq
				;
				; WIN-LABEL: pow_f80:
				; WIN: # %bb.0:
				; WIN-NEXT: subq $72, %rsp
				; WIN-NEXT: fldt {{[0-9]+}}(%rsp)
				; WIN-NEXT: fld %st(0)
				; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
				; WIN-NEXT: fstpt {{[0-9]+}}(%rsp)
				; WIN-NEXT: callq powl
				; WIN-NEXT: addq $72, %rsp
				; WIN-NEXT: retq
				;
				; MAC-LABEL: pow_f80:
				; MAC: ## %bb.0:
				; MAC-NEXT: subq $40, %rsp
				; MAC-NEXT: fldt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fld %st(0)
				; MAC-NEXT: fstpt {{[0-9]+}}(%rsp)
				; MAC-NEXT: fstpt (%rsp)
				; MAC-NEXT: callq _powl
				; MAC-NEXT: addq $40, %rsp
				; MAC-NEXT: retq
				%r = tail call nnan ninf x86_fp80 @llvm.pow.f80(x86_fp80 %x, x86_fp80 %x)
				ret x86_fp80 %r
	}			}

	declare float @llvm.exp.f32(float) #1			declare float @llvm.exp.f32(float) #1
	declare double @llvm.exp.f64(double) #1			declare double @llvm.exp.f64(double) #1
	declare x86_fp80 @llvm.exp.f80(x86_fp80) #1			declare x86_fp80 @llvm.exp.f80(x86_fp80) #1

				declare float @llvm.exp2.f32(float) #1
				declare double @llvm.exp2.f64(double) #1
				declare x86_fp80 @llvm.exp2.f80(x86_fp80) #1

				declare float @llvm.log.f32(float) #1
				declare double @llvm.log.f64(double) #1
				declare x86_fp80 @llvm.log.f80(x86_fp80) #1

				declare float @llvm.log2.f32(float) #1
				declare double @llvm.log2.f64(double) #1
				declare x86_fp80 @llvm.log2.f80(x86_fp80) #1

				declare float @llvm.log10.f32(float) #1
				declare double @llvm.log10.f64(double) #1
				declare x86_fp80 @llvm.log10.f80(x86_fp80) #1

				declare float @llvm.pow.f32(float, float) #1
				declare double @llvm.pow.f64(double, double) #1
				declare x86_fp80 @llvm.pow.f80(x86_fp80, x86_fp80) #1

	attributes #0 = { nounwind "no-infs-fp-math"="true" "no-nans-fp-math"="true" }			attributes #0 = { nounwind "no-infs-fp-math"="true" "no-nans-fp-math"="true" }
	attributes #1 = { nounwind readnone speculatable }			attributes #1 = { nounwind readnone speculatable }


Revision Contents

Diff 129086

llvm/trunk/include/llvm/CodeGen/RuntimeLibcalls.def

llvm/trunk/include/llvm/CodeGen/SelectionDAG.h

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp

llvm/trunk/test/CodeGen/X86/finite-libcalls.ll
