This is an archive of the discontinued LLVM Phabricator instance.


Differential D28335

[WIP] [RFC] Don't lower floating point intrinsics to libcalls which modify errno
Abandoned, Public

Authored by efriedma on Jan 4 2017, 4:43 PM.

Details

Reviewers

RKSimon
spatel
andreadb
rob.lougher
javed.absar

Summary

sqrt(), sin(), cos(), pow(), exp(), exp2(), log(), log2(), and log10() are not available in variants which don't set errno on glibc (and maybe other more obscure platforms), so don't claim they're universally available in TargetLowering.cpp. Then fix LangRef to make it obvious that they in fact have no side-effects, and get rid of the silly sqrt() special-case.

TODO:

Make the backend error message print out the actual name of the intrinsic rather than a number.
Provide libcall lowerings for these intrinsics on platforms where they're available.
Provide a target API to check whether these intrinsics are available, and use it (SimplifyLibCalls, vectorizer)
Fix ConstantFolding so it doesn't assume the operand to llvm.sqrt() is positive
Fix regression tests (60-ish failures at the moment).
Fix clang so it doesn't generate calls to llvm.sqrt() and llvm.pow() when they aren't available.
Fix clang so it doesn't mark calls which have side-effects readnone.
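As a rough illustration of the first and third TODO items, the sketch below keeps a table of libcall symbols in which a null entry means the routine is not known to be available, and reports a readable operation name when lowering fails. Everything in it is hypothetical: the `Libcall` enum, the `kSymbol` and `kOpName` tables, and `libcallSymbolOrThrow` are stand-ins rather than LLVM's `RTLIB` machinery, and an exception stands in for `report_fatal_error`.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical miniature of a libcall table; the enum, the arrays and the
// function below are illustrative and are not LLVM's RTLIB definitions.
enum class Libcall : std::size_t { FMA_F64, SQRT_F64, POW_F64, Count };

constexpr std::size_t kNumLibcalls = static_cast<std::size_t>(Libcall::Count);

// Symbol to call for each operation; nullptr means no errno-free
// implementation is assumed to exist on this (imaginary) platform.
constexpr std::array<const char *, kNumLibcalls> kSymbol = {"fma", nullptr,
                                                            nullptr};

// Human-readable operation names, used only for diagnostics.
constexpr std::array<const char *, kNumLibcalls> kOpName = {
    "FMA_F64", "SQRT_F64", "POW_F64"};

// Returns the symbol for LC, or throws with the operation's name (not its
// numeric value) when the table has no entry for it.
const char *libcallSymbolOrThrow(Libcall LC) {
  const auto I = static_cast<std::size_t>(LC);
  if (!kSymbol[I])
    throw std::runtime_error(
        std::string("No implementation available for libcall ") + kOpName[I]);
  return kSymbol[I];
}
```

A caller that only wants to ask whether an operation is available, such as an optimization pass, could test the table entry for null instead of catching the exception.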

Sort of a response to https://reviews.llvm.org/D27618 and other related discussion lately. I'm not sure when I'll have time to finish this.

Diff Detail

Repository: rL LLVM

Event Timeline

efriedma updated this revision to Diff 83161. Jan 4 2017, 4:43 PM

efriedma retitled this revision from to [WIP] [RFC] Don't lower floating point intrinsics to libcalls which modify errno.

efriedma updated this object.

efriedma set the repository for this revision to rL LLVM.

efriedma added subscribers: llvm-commits, spatel, avt77 and 2 others.

sqrt(), sin(), cos(), pow(), exp(), exp2(), log(), log2(), and log10() are not available in variants which don't set errno on glibc (and maybe other more obscure platforms), so don't claim they're universally available in TargetLowering.cpp. Then fix LangRef to make it obvious that they in fact have no side-effects, and get rid of the silly sqrt() special-case.

I fully support work in this direction; we definitely need to fix this up.

We also do need to model errno, however, or provide compiler-rt wrappers, etc. so that we can really use -fno-math-errno on systems where the math functions really do set errno and not miscompile code by reordering calls to math functions with calls to other functions (e.g. open, read) that set errno and where errno is later used (e.g. by perror).

Yes, that's what I was getting at with "Fix clang so it doesn't mark calls which have side-effects readnone." It's hard to come up with a good solution here, though. The options I can think of:

Make compiler-rt implement a bunch of libm functions with sane semantics. Straightforward, but writing fast, correct libm routines is tricky.
Instead of marking the math functions readnone, mark them "dead_errno_write" or something like that. That allows correct modeling, but it probably ends up blocking a substantial number of optimizations because we have to assume almost any pointer could alias errno.
Some solution which involves teaching the compiler or compiler-rt how to compute the address of errno. This gets nasty fast because it's different for every libc implementation.

That's largely orthogonal to this patch, though.

RKSimon added a subscriber: davide. Jan 5 2017, 3:26 AM

In D28335#636356, @efriedma wrote:

We also do need to model errno, however, or provide compiler-rt wrappers, etc. so that we can really use -fno-math-errno on systems where the math functions really do set errno and not miscompile code by reordering calls to math functions with calls to other functions (e.g. open, read) that set errno and where errno is later used (e.g. by perror).

Yes, that's what I was getting at with "Fix clang so it doesn't mark calls which have side-effects readnone." It's hard to come up with a good solution here, though. The options I can think of:

Make compiler-rt implement a bunch of libm functions with sane semantics. Straightforward, but writing fast, correct libm routines is tricky.

No, you just need to wrap them such that you save/restore errno around the calls to the libm implementation.
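A minimal sketch of such a wrapper, assuming a hosted C++ environment with a standard `errno`; the name `sqrt_preserving_errno` is hypothetical and is not an existing compiler-rt entry point:

```cpp
#include <cerrno>
#include <cmath>

// Hypothetical wrapper: calls the platform's sqrt but hides any errno write
// from the caller by saving errno before the call and restoring it after.
double sqrt_preserving_errno(double x) {
  const int saved = errno;        // whatever the caller last observed
  const double r = std::sqrt(x);  // may set errno (e.g. EDOM) on some libms
  errno = saved;                  // undo any change made by the libm call
  return r;
}
```

With a wrapper like this, a caller that sets or inspects `errno` around the call sees the same value before and after, whether or not the underlying libm reports domain errors through `errno`.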

Instead of marking the math functions readnone, mark them "dead_errno_write" or something like that. That allows correct modeling, but it probably ends up blocking a substantial number of optimizations because we have to assume almost any pointer could alias errno.

Some solution which involves teaching the compiler or compiler-rt how to compute the address of errno. This gets nasty fast because it's different for every libc implementation.

I know. I posted an RFC at some point where I detailed how errno is implemented across a wide range of implementations. It is nasty, at least in theory, but practically, it is not as bad as you might fear.

That's largely orthogonal to this patch, though.

It is in some sense, but understanding how we want to proceed here is an important part of this overall space, and directly affects whether it is reasonable to assume that the intrinsics will have no side effects.

I know. I posted an RFC at some point where I detailed how errno is implemented across a wide range of implementations. It is nasty, at least in theory, but practically, it is not as bad as you might fear.

http://lists.llvm.org/pipermail/llvm-dev/2013-November/068154.html ? It looks like the conclusion was actually that it's worse than that: matherr() exists on both glibc and MSVC, which means sqrt(-1) can do anything.

Sort of a response to https://reviews.llvm.org/D27618 and other related discussion lately. I'm not sure when I'll have time to finish this.

Should D27618 be abandoned and we instead pursue this approach?

Probably, yes...

Actually, just fixing sqrt() on its own is substantially simpler than fixing all the other functions: we can lower llvm.sqrt() to "x < 0 ? NaN : sqrt(x)" if necessary, since that never sets errno. (We probably want to try to avoid generating that sequence if possible, but it would only be relevant on soft-float Linux targets, so maybe not a big deal.)

jlebar added a subscriber: jlebar. Jan 9 2017, 11:53 PM

@efriedma Any more thoughts on this?

I haven't really been looking at this lately.

The only angle that isn't mentioned here is that it might make sense to get rid of some of these intrinsics. llvm.sin(), llvm.cos(), and llvm.pow() are lowered to libcalls for every in-tree target.

Doesn't X86 lower llvm.sin/llvm.cos to fsin/fcos instructions with unsafe math and no sse?

In D28335#832364, @efriedma wrote:

I haven't really been looking at this lately.

The only angle that isn't mentioned here is that it might make sense to get rid of some of these intrinsics. llvm.sin(), llvm.cos(), and llvm.pow() are lowered to libcalls for every in-tree target.

This isn't true. On AMDGPU we select sin and cos at least to a sequence involving a hardware instruction (but we use these as the fast versions)

Huh, I guess we do use the x87 sin/cos. It's probably a terrible idea (the microcoded routines are both slow and wildly inaccurate for large inputs), but we do it.

And it looks like AMDGPU lowers llvm.pow using the identity pow(x,y) = exp2(y * log2(x)). So never mind. :)
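For reference, the identity can be tried out in plain C++. This is only a sketch of the arithmetic, not AMDGPU's actual lowering code, and the helper name `pow_via_exp2_log2` is hypothetical:

```cpp
#include <cmath>

// Hypothetical helper: computes x^y as 2^(y * log2(x)).
// Only meaningful for x > 0. For x <= 0, log2(x) is -inf or NaN, so the
// special cases of pow (for example pow(-2, 2) == 4) are not reproduced.
double pow_via_exp2_log2(double x, double y) {
  return std::exp2(y * std::log2(x));
}
```

The result can also differ from a correctly rounded `pow` in the last few bits, because rounding error in `log2(x)` is scaled by `y` before `exp2` is applied.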

In D28335#832394, @efriedma wrote:

Huh, I guess we do use the x87 sin/cos. It's probably a terrible idea (the microcoded routines are both slow and wildly inaccurate for large inputs), but we do it.

fsin/fcos can be wildly inaccurate even for relatively modest arguments (https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/). We really shouldn’t ever generate them unless the user specifically asks for them.

craig.topper mentioned this in D36344: [X86] Don't use fsin/fcos/fsincos instructions ever. Aug 4 2017, 5:10 PM

Diffusion mentioned this in rL310762: [X86] Don't use fsin/fcos/fsincos instructions ever. Aug 11 2017, 1:56 PM

spatel mentioned this in D39160: [CodeGen] __builtin_sqrt should map to the compiler's intrinsic sqrt function. Oct 21 2017, 8:12 AM

spatel mentioned this in D39204: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set. Oct 24 2017, 9:25 AM

spatel mentioned this in D39304: [IR] redefine 'reassoc' fast-math-flag and add 'trans' fast-math-flag. Oct 31 2017, 8:47 AM

spatel mentioned this in D27618: Failure to vectorize __builtin_sqrt/__builtin_sqrtf. Dec 1 2017, 6:47 AM

I think we have clang doing the expected thing after:

https://reviews.llvm.org/rL317031 ( https://reviews.llvm.org/D39204 )
https://reviews.llvm.org/rL317265 ( https://reviews.llvm.org/D39481 )
https://reviews.llvm.org/rL317407 ( https://reviews.llvm.org/D39615 )
https://reviews.llvm.org/rL318093 ( https://reviews.llvm.org/D39641 )
https://reviews.llvm.org/rL318598 ( https://reviews.llvm.org/D39611 )
https://reviews.llvm.org/rL319593 ( https://reviews.llvm.org/D40044 )
https://reviews.llvm.org/rL319619

...if you still see bugs up there, let me know.

Instead of:
"Fix clang so it doesn't generate calls to llvm.sqrt() and llvm.pow() when they aren't available."
We're deferring that to LLVM to save/restore errno somehow, but that's not done yet.

efriedma abandoned this revision. Aug 30 2018, 2:13 PM

Herald added a reviewer: javed.absar. Aug 30 2018, 2:13 PM

Revision Contents

Path                                                Size
docs/LangRef.rst                                    50 lines
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp            7 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    33 lines
lib/CodeGen/SelectionDAG/TargetLowering.cpp         3 lines
lib/CodeGen/TargetLoweringBase.cpp                  58 lines
lib/Target/AArch64/AArch64FastISel.cpp              41 lines

Diff 83161

docs/LangRef.rst

[10,052 lines not shown]	::
declare double @llvm.sqrt.f64(double %Val)		declare double @llvm.sqrt.f64(double %Val)
declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)		declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
declare fp128 @llvm.sqrt.f128(fp128 %Val)		declare fp128 @llvm.sqrt.f128(fp128 %Val)
declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)		declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:		Overview:
"""""""""		"""""""""

The '``llvm.sqrt``' intrinsics return the sqrt of the specified operand,		The '``llvm.sqrt``' intrinsics return the square root of the operand.
returning the same value as the libm '``sqrt``' functions would. Unlike
``sqrt`` in libm, however, ``llvm.sqrt`` has undefined behavior for
negative numbers other than -0.0 (which allows for better optimization,
because there is no need to worry about errno being set).
``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt.

Arguments:		Arguments:
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the sqrt of the specified operand if it is a		This is equivalent to the IEEE 754 ``squareRoot()`` function. It has no
nonnegative floating point number.		side-effects.

'``llvm.powi.*``' Intrinsic		'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any		This is an overloaded intrinsic. You can use ``llvm.powi`` on any
[55 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the sine of the specified operand, returning the		This is equivalent to the IEEE 754 ``sin()`` function. It has no
same values as the libm ``sin`` functions would, and handles error		side-effects.
conditions in the same way.

'``llvm.cos.*``' Intrinsic		'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any		This is an overloaded intrinsic. You can use ``llvm.cos`` on any
[17 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the cosine of the specified operand, returning the		This is equivalent to the IEEE 754 ``cos()`` function. It has no
same values as the libm ``cos`` functions would, and handles error		side-effects.
conditions in the same way.

'``llvm.pow.*``' Intrinsic		'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any		This is an overloaded intrinsic. You can use ``llvm.pow`` on any
[18 lines not shown]
""""""""""		""""""""""

The second argument is a floating point power, and the first is a value		The second argument is a floating point power, and the first is a value
to raise to that power.		to raise to that power.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the first value raised to the second power,		This is equivalent to the IEEE 754 ``pow()`` function. It has no
returning the same values as the libm ``pow`` functions would, and		side-effects.
handles error conditions in the same way.

'``llvm.exp.*``' Intrinsic		'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any		This is an overloaded intrinsic. You can use ``llvm.exp`` on any
[17 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the same values as the libm ``exp`` functions		This is equivalent to the IEEE 754 ``exp()`` function. It has no
would, and handles error conditions in the same way.		side-effects.

'``llvm.exp2.*``' Intrinsic		'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any		This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
[17 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the same values as the libm ``exp2`` functions		This is equivalent to the IEEE 754 ``exp2()`` function. It has no
would, and handles error conditions in the same way.		side-effects.

'``llvm.log.*``' Intrinsic		'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any		This is an overloaded intrinsic. You can use ``llvm.log`` on any
[17 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the same values as the libm ``log`` functions		This is equivalent to the IEEE 754 ``log()`` function. It has no
would, and handles error conditions in the same way.		side-effects.

'``llvm.log10.*``' Intrinsic		'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any		This is an overloaded intrinsic. You can use ``llvm.log10`` on any
[17 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the same values as the libm ``log10`` functions		This is equivalent to the IEEE 754 ``log10()`` function. It has no
would, and handles error conditions in the same way.		side-effects.

'``llvm.log2.*``' Intrinsic		'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any		This is an overloaded intrinsic. You can use ``llvm.log2`` on any
[17 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the same values as the libm ``log2`` functions		This is equivalent to the IEEE 754 ``log2()`` function. It has no
would, and handles error conditions in the same way.		side-effects.

'``llvm.fma.*``' Intrinsic		'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any		This is an overloaded intrinsic. You can use ``llvm.fma`` on any
[18 lines not shown]
""""""""""		""""""""""

The argument and return value are floating point numbers of the same		The argument and return value are floating point numbers of the same
type.		type.

Semantics:		Semantics:
""""""""""		""""""""""

This function returns the same values as the libm ``fma`` functions		This is equivalent to the IEEE 754 ``fma()`` function. It has no
would, and does not set errno.		side-effects.

'``llvm.fabs.*``' Intrinsic		'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:		Syntax:
"""""""		"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any		This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
[2,296 lines not shown]

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

[9 lines not shown]
// This file implements the SelectionDAG::Legalize method.		// This file implements the SelectionDAG::Legalize method.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineJumpTableInfo.h"		#include "llvm/CodeGen/MachineJumpTableInfo.h"
#include "llvm/CodeGen/SelectionDAG.h"		#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/SelectionDAGNodes.h"		#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/IR/CallingConv.h"		#include "llvm/IR/CallingConv.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
[1,891 lines not shown]
}		}

// Expand a node into a call to a libcall. If the result value		// Expand a node into a call to a libcall. If the result value
// does not fit into a register, return the lo part and set the hi part to the		// does not fit into a register, return the lo part and set the hi part to the
// by-reg argument. If it does fit into a single register, return the result		// by-reg argument. If it does fit into a single register, return the result
// and leave the Hi part unset.		// and leave the Hi part unset.
SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, SDNode *Node,		SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, SDNode *Node,
bool isSigned) {		bool isSigned) {
		if (!TLI.getLibcallName(LC))
		report_fatal_error("No implementation available for libcall " + utostr(LC));

TargetLowering::ArgListTy Args;		TargetLowering::ArgListTy Args;
TargetLowering::ArgListEntry Entry;		TargetLowering::ArgListEntry Entry;
for (const SDValue &Op : Node->op_values()) {		for (const SDValue &Op : Node->op_values()) {
EVT ArgVT = Op.getValueType();		EVT ArgVT = Op.getValueType();
Type *ArgTy = ArgVT.getTypeForEVT(*DAG.getContext());		Type *ArgTy = ArgVT.getTypeForEVT(*DAG.getContext());
Entry.Node = Op;		Entry.Node = Op;
Entry.Ty = ArgTy;		Entry.Ty = ArgTy;
Entry.isSExt = isSigned;		Entry.isSExt = isSigned;
[35 lines not shown]	SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, SDNode *Node,
return CallInfo.first;		return CallInfo.first;
}		}

/// Generate a libcall taking the given operands as arguments		/// Generate a libcall taking the given operands as arguments
/// and returning a result of type RetVT.		/// and returning a result of type RetVT.
SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, EVT RetVT,		SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, EVT RetVT,
const SDValue *Ops, unsigned NumOps,		const SDValue *Ops, unsigned NumOps,
bool isSigned, const SDLoc &dl) {		bool isSigned, const SDLoc &dl) {
		if (!TLI.getLibcallName(LC))
		report_fatal_error("No implementation available for libcall " + utostr(LC));

TargetLowering::ArgListTy Args;		TargetLowering::ArgListTy Args;
Args.reserve(NumOps);		Args.reserve(NumOps);

TargetLowering::ArgListEntry Entry;		TargetLowering::ArgListEntry Entry;
for (unsigned i = 0; i != NumOps; ++i) {		for (unsigned i = 0; i != NumOps; ++i) {
Entry.Node = Ops[i];		Entry.Node = Ops[i];
Entry.Ty = Entry.Node.getValueType().getTypeForEVT(*DAG.getContext());		Entry.Ty = Entry.Node.getValueType().getTypeForEVT(*DAG.getContext());
Entry.isSExt = isSigned;		Entry.isSExt = isSigned;
[2,607 lines not shown]

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

[6,392 lines not shown]	if (!I.isNoBuiltin() && !F->hasLocalLinkage() && F->hasName() &&
return;		return;
break;		break;
case LibFunc::fmax:		case LibFunc::fmax:
case LibFunc::fmaxf:		case LibFunc::fmaxf:
case LibFunc::fmaxl:		case LibFunc::fmaxl:
if (visitBinaryFloatCall(I, ISD::FMAXNUM))		if (visitBinaryFloatCall(I, ISD::FMAXNUM))
return;		return;
break;		break;
case LibFunc::sin:
case LibFunc::sinf:
case LibFunc::sinl:
if (visitUnaryFloatCall(I, ISD::FSIN))
return;
break;
case LibFunc::cos:
case LibFunc::cosf:
case LibFunc::cosl:
if (visitUnaryFloatCall(I, ISD::FCOS))
return;
break;
case LibFunc::sqrt:
case LibFunc::sqrtf:
case LibFunc::sqrtl:
case LibFunc::sqrt_finite:
case LibFunc::sqrtf_finite:
case LibFunc::sqrtl_finite:
if (visitUnaryFloatCall(I, ISD::FSQRT))
return;
break;
case LibFunc::floor:		case LibFunc::floor:
case LibFunc::floorf:		case LibFunc::floorf:
case LibFunc::floorl:		case LibFunc::floorl:
if (visitUnaryFloatCall(I, ISD::FFLOOR))		if (visitUnaryFloatCall(I, ISD::FFLOOR))
return;		return;
break;		break;
case LibFunc::nearbyint:		case LibFunc::nearbyint:
case LibFunc::nearbyintf:		case LibFunc::nearbyintf:
[20 lines not shown]	if (!I.isNoBuiltin() && !F->hasLocalLinkage() && F->hasName() &&
return;		return;
break;		break;
case LibFunc::trunc:		case LibFunc::trunc:
case LibFunc::truncf:		case LibFunc::truncf:
case LibFunc::truncl:		case LibFunc::truncl:
if (visitUnaryFloatCall(I, ISD::FTRUNC))		if (visitUnaryFloatCall(I, ISD::FTRUNC))
return;		return;
break;		break;
case LibFunc::log2:
case LibFunc::log2f:
case LibFunc::log2l:
if (visitUnaryFloatCall(I, ISD::FLOG2))
return;
break;
case LibFunc::exp2:
case LibFunc::exp2f:
case LibFunc::exp2l:
if (visitUnaryFloatCall(I, ISD::FEXP2))
return;
break;
case LibFunc::memcmp:		case LibFunc::memcmp:
if (visitMemCmpCall(I))		if (visitMemCmpCall(I))
return;		return;
break;		break;
case LibFunc::mempcpy:		case LibFunc::mempcpy:
if (visitMemPCpyCall(I))		if (visitMemPCpyCall(I))
return;		return;
break;		break;
[2,898 lines not shown]

lib/CodeGen/SelectionDAG/TargetLowering.cpp

//===-- TargetLowering.cpp - Implement the TargetLowering class -----------===//		//===-- TargetLowering.cpp - Implement the TargetLowering class -----------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This implements the TargetLowering class.		// This implements the TargetLowering class.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Target/TargetLowering.h"		#include "llvm/Target/TargetLowering.h"
#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
		#include "llvm/ADT/StringExtras.h"
#include "llvm/CodeGen/CallingConvLower.h"		#include "llvm/CodeGen/CallingConvLower.h"
#include "llvm/CodeGen/MachineFrameInfo.h"		#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineJumpTableInfo.h"		#include "llvm/CodeGen/MachineJumpTableInfo.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/SelectionDAG.h"		#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
[102 lines not shown]	for (SDValue Op : Ops) {
Entry.Ty = Entry.Node.getValueType().getTypeForEVT(*DAG.getContext());		Entry.Ty = Entry.Node.getValueType().getTypeForEVT(*DAG.getContext());
Entry.isSExt = shouldSignExtendTypeInLibCall(Op.getValueType(), isSigned);		Entry.isSExt = shouldSignExtendTypeInLibCall(Op.getValueType(), isSigned);
Entry.isZExt = !shouldSignExtendTypeInLibCall(Op.getValueType(), isSigned);		Entry.isZExt = !shouldSignExtendTypeInLibCall(Op.getValueType(), isSigned);
Args.push_back(Entry);		Args.push_back(Entry);
}		}

if (LC == RTLIB::UNKNOWN_LIBCALL)		if (LC == RTLIB::UNKNOWN_LIBCALL)
report_fatal_error("Unsupported library call operation!");		report_fatal_error("Unsupported library call operation!");
		if (!getLibcallName(LC))
		report_fatal_error("No implementation available for libcall " + utostr(LC));
SDValue Callee = DAG.getExternalSymbol(getLibcallName(LC),		SDValue Callee = DAG.getExternalSymbol(getLibcallName(LC),
getPointerTy(DAG.getDataLayout()));		getPointerTy(DAG.getDataLayout()));

Type *RetTy = RetVT.getTypeForEVT(*DAG.getContext());		Type *RetTy = RetVT.getTypeForEVT(*DAG.getContext());
TargetLowering::CallLoweringInfo CLI(DAG);		TargetLowering::CallLoweringInfo CLI(DAG);
bool signExtend = shouldSignExtendTypeInLibCall(RetVT, isSigned);		bool signExtend = shouldSignExtendTypeInLibCall(RetVT, isSigned);
CLI.setDebugLoc(dl).setChain(DAG.getEntryNode())		CLI.setDebugLoc(dl).setChain(DAG.getEntryNode())
.setCallee(getLibcallCallingConv(LC), RetTy, Callee, std::move(Args))		.setCallee(getLibcallCallingConv(LC), RetTy, Callee, std::move(Args))
[3,663 lines not shown]

lib/CodeGen/TargetLoweringBase.cpp

[124 lines not shown]	static void InitLibcallNames(const char **Names, const Triple &TT) {
Names[RTLIB::MUL_F80] = "__mulxf3";		Names[RTLIB::MUL_F80] = "__mulxf3";
Names[RTLIB::MUL_F128] = "__multf3";		Names[RTLIB::MUL_F128] = "__multf3";
Names[RTLIB::MUL_PPCF128] = "__gcc_qmul";		Names[RTLIB::MUL_PPCF128] = "__gcc_qmul";
Names[RTLIB::DIV_F32] = "__divsf3";		Names[RTLIB::DIV_F32] = "__divsf3";
Names[RTLIB::DIV_F64] = "__divdf3";		Names[RTLIB::DIV_F64] = "__divdf3";
Names[RTLIB::DIV_F80] = "__divxf3";		Names[RTLIB::DIV_F80] = "__divxf3";
Names[RTLIB::DIV_F128] = "__divtf3";		Names[RTLIB::DIV_F128] = "__divtf3";
Names[RTLIB::DIV_PPCF128] = "__gcc_qdiv";		Names[RTLIB::DIV_PPCF128] = "__gcc_qdiv";
Names[RTLIB::REM_F32] = "fmodf";
Names[RTLIB::REM_F64] = "fmod";
Names[RTLIB::REM_F80] = "fmodl";
Names[RTLIB::REM_F128] = "fmodl";
Names[RTLIB::REM_PPCF128] = "fmodl";
Names[RTLIB::FMA_F32] = "fmaf";		Names[RTLIB::FMA_F32] = "fmaf";
Names[RTLIB::FMA_F64] = "fma";		Names[RTLIB::FMA_F64] = "fma";
Names[RTLIB::FMA_F80] = "fmal";		Names[RTLIB::FMA_F80] = "fmal";
Names[RTLIB::FMA_F128] = "fmal";		Names[RTLIB::FMA_F128] = "fmal";
Names[RTLIB::FMA_PPCF128] = "fmal";		Names[RTLIB::FMA_PPCF128] = "fmal";
Names[RTLIB::POWI_F32] = "__powisf2";		Names[RTLIB::POWI_F32] = "__powisf2";
Names[RTLIB::POWI_F64] = "__powidf2";		Names[RTLIB::POWI_F64] = "__powidf2";
Names[RTLIB::POWI_F80] = "__powixf2";		Names[RTLIB::POWI_F80] = "__powixf2";
Names[RTLIB::POWI_F128] = "__powitf2";		Names[RTLIB::POWI_F128] = "__powitf2";
Names[RTLIB::POWI_PPCF128] = "__powitf2";		Names[RTLIB::POWI_PPCF128] = "__powitf2";
Names[RTLIB::SQRT_F32] = "sqrtf";
Names[RTLIB::SQRT_F64] = "sqrt";
Names[RTLIB::SQRT_F80] = "sqrtl";
Names[RTLIB::SQRT_F128] = "sqrtl";
Names[RTLIB::SQRT_PPCF128] = "sqrtl";
Names[RTLIB::LOG_F32] = "logf";
Names[RTLIB::LOG_F64] = "log";
Names[RTLIB::LOG_F80] = "logl";
Names[RTLIB::LOG_F128] = "logl";
Names[RTLIB::LOG_PPCF128] = "logl";
Names[RTLIB::LOG2_F32] = "log2f";
Names[RTLIB::LOG2_F64] = "log2";
Names[RTLIB::LOG2_F80] = "log2l";
Names[RTLIB::LOG2_F128] = "log2l";
Names[RTLIB::LOG2_PPCF128] = "log2l";
Names[RTLIB::LOG10_F32] = "log10f";
Names[RTLIB::LOG10_F64] = "log10";
Names[RTLIB::LOG10_F80] = "log10l";
Names[RTLIB::LOG10_F128] = "log10l";
Names[RTLIB::LOG10_PPCF128] = "log10l";
Names[RTLIB::EXP_F32] = "expf";
Names[RTLIB::EXP_F64] = "exp";
Names[RTLIB::EXP_F80] = "expl";
Names[RTLIB::EXP_F128] = "expl";
Names[RTLIB::EXP_PPCF128] = "expl";
Names[RTLIB::EXP2_F32] = "exp2f";
Names[RTLIB::EXP2_F64] = "exp2";
Names[RTLIB::EXP2_F80] = "exp2l";
Names[RTLIB::EXP2_F128] = "exp2l";
Names[RTLIB::EXP2_PPCF128] = "exp2l";
Names[RTLIB::SIN_F32] = "sinf";
Names[RTLIB::SIN_F64] = "sin";
Names[RTLIB::SIN_F80] = "sinl";
Names[RTLIB::SIN_F128] = "sinl";
Names[RTLIB::SIN_PPCF128] = "sinl";
Names[RTLIB::COS_F32] = "cosf";
Names[RTLIB::COS_F64] = "cos";
Names[RTLIB::COS_F80] = "cosl";
Names[RTLIB::COS_F128] = "cosl";
Names[RTLIB::COS_PPCF128] = "cosl";
Names[RTLIB::POW_F32] = "powf";
Names[RTLIB::POW_F64] = "pow";
Names[RTLIB::POW_F80] = "powl";
Names[RTLIB::POW_F128] = "powl";
Names[RTLIB::POW_PPCF128] = "powl";
Names[RTLIB::CEIL_F32] = "ceilf";		Names[RTLIB::CEIL_F32] = "ceilf";
Names[RTLIB::CEIL_F64] = "ceil";		Names[RTLIB::CEIL_F64] = "ceil";
Names[RTLIB::CEIL_F80] = "ceill";		Names[RTLIB::CEIL_F80] = "ceill";
Names[RTLIB::CEIL_F128] = "ceill";		Names[RTLIB::CEIL_F128] = "ceill";
Names[RTLIB::CEIL_PPCF128] = "ceill";		Names[RTLIB::CEIL_PPCF128] = "ceill";
Names[RTLIB::TRUNC_F32] = "truncf";		Names[RTLIB::TRUNC_F32] = "truncf";
Names[RTLIB::TRUNC_F64] = "trunc";		Names[RTLIB::TRUNC_F64] = "trunc";
Names[RTLIB::TRUNC_F80] = "truncl";		Names[RTLIB::TRUNC_F80] = "truncl";
[...]	static void InitLibcallNames(const char **Names, const Triple &TT) {
Names[RTLIB::ATOMIC_FETCH_XOR_8] = "__atomic_fetch_xor_8";		Names[RTLIB::ATOMIC_FETCH_XOR_8] = "__atomic_fetch_xor_8";
Names[RTLIB::ATOMIC_FETCH_XOR_16] = "__atomic_fetch_xor_16";		Names[RTLIB::ATOMIC_FETCH_XOR_16] = "__atomic_fetch_xor_16";
Names[RTLIB::ATOMIC_FETCH_NAND_1] = "__atomic_fetch_nand_1";		Names[RTLIB::ATOMIC_FETCH_NAND_1] = "__atomic_fetch_nand_1";
Names[RTLIB::ATOMIC_FETCH_NAND_2] = "__atomic_fetch_nand_2";		Names[RTLIB::ATOMIC_FETCH_NAND_2] = "__atomic_fetch_nand_2";
Names[RTLIB::ATOMIC_FETCH_NAND_4] = "__atomic_fetch_nand_4";		Names[RTLIB::ATOMIC_FETCH_NAND_4] = "__atomic_fetch_nand_4";
Names[RTLIB::ATOMIC_FETCH_NAND_8] = "__atomic_fetch_nand_8";		Names[RTLIB::ATOMIC_FETCH_NAND_8] = "__atomic_fetch_nand_8";
Names[RTLIB::ATOMIC_FETCH_NAND_16] = "__atomic_fetch_nand_16";		Names[RTLIB::ATOMIC_FETCH_NAND_16] = "__atomic_fetch_nand_16";

if (TT.isGNUEnvironment()) {
Names[RTLIB::SINCOS_F32] = "sincosf";
Names[RTLIB::SINCOS_F64] = "sincos";
Names[RTLIB::SINCOS_F80] = "sincosl";
Names[RTLIB::SINCOS_F128] = "sincosl";
Names[RTLIB::SINCOS_PPCF128] = "sincosl";
}

if (!TT.isOSOpenBSD()) {		if (!TT.isOSOpenBSD()) {
Names[RTLIB::STACKPROTECTOR_CHECK_FAIL] = "__stack_chk_fail";		Names[RTLIB::STACKPROTECTOR_CHECK_FAIL] = "__stack_chk_fail";
}		}

Names[RTLIB::DEOPTIMIZE] = "__llvm_deoptimize";		Names[RTLIB::DEOPTIMIZE] = "__llvm_deoptimize";
}		}

/// Set default libcall CallingConvs.		/// Set default libcall CallingConvs.

lib/Target/AArch64/AArch64FastISel.cpp

[...]	private:
bool selectIntToFP(const Instruction *I, bool Signed);		bool selectIntToFP(const Instruction *I, bool Signed);
bool selectRem(const Instruction *I, unsigned ISDOpcode);		bool selectRem(const Instruction *I, unsigned ISDOpcode);
bool selectRet(const Instruction *I);		bool selectRet(const Instruction *I);
bool selectTrunc(const Instruction *I);		bool selectTrunc(const Instruction *I);
bool selectIntExt(const Instruction *I);		bool selectIntExt(const Instruction *I);
bool selectMul(const Instruction *I);		bool selectMul(const Instruction *I);
bool selectShift(const Instruction *I);		bool selectShift(const Instruction *I);
bool selectBitCast(const Instruction *I);		bool selectBitCast(const Instruction *I);
bool selectFRem(const Instruction *I);
bool selectSDiv(const Instruction *I);		bool selectSDiv(const Instruction *I);
bool selectGetElementPtr(const Instruction *I);		bool selectGetElementPtr(const Instruction *I);
bool selectAtomicCmpXchg(const AtomicCmpXchgInst *I);		bool selectAtomicCmpXchg(const AtomicCmpXchgInst *I);

// Utility helper routines.		// Utility helper routines.
bool isTypeLegal(Type *Ty, MVT &VT);		bool isTypeLegal(Type *Ty, MVT &VT);
bool isTypeSupported(Type *Ty, MVT &VT, bool IsVectorAllowed = false);		bool isTypeSupported(Type *Ty, MVT &VT, bool IsVectorAllowed = false);
bool isValueAvailable(const Value *V) const;		bool isValueAvailable(const Value *V) const;
[...]	bool AArch64FastISel::selectBitCast(const Instruction *I) {

if (!ResultReg)		if (!ResultReg)
return false;		return false;

updateValueMap(I, ResultReg);		updateValueMap(I, ResultReg);
return true;		return true;
}		}

bool AArch64FastISel::selectFRem(const Instruction *I) {
MVT RetVT;
if (!isTypeLegal(I->getType(), RetVT))
return false;

RTLIB::Libcall LC;
switch (RetVT.SimpleTy) {
default:
return false;
case MVT::f32:
LC = RTLIB::REM_F32;
break;
case MVT::f64:
LC = RTLIB::REM_F64;
break;
}

ArgListTy Args;
Args.reserve(I->getNumOperands());

// Populate the argument list.
for (auto &Arg : I->operands()) {
ArgListEntry Entry;
Entry.Val = Arg;
Entry.Ty = Arg->getType();
Args.push_back(Entry);
}

CallLoweringInfo CLI;
MCContext &Ctx = MF->getContext();
CLI.setCallee(DL, Ctx, TLI.getLibcallCallingConv(LC), I->getType(),
TLI.getLibcallName(LC), std::move(Args));
if (!lowerCallTo(CLI))
return false;
updateValueMap(I, CLI.ResultReg);
return true;
}

bool AArch64FastISel::selectSDiv(const Instruction *I) {		bool AArch64FastISel::selectSDiv(const Instruction *I) {
MVT VT;		MVT VT;
if (!isTypeLegal(I->getType(), VT))		if (!isTypeLegal(I->getType(), VT))
return false;		return false;

if (!isa<ConstantInt>(I->getOperand(1)))		if (!isa<ConstantInt>(I->getOperand(1)))
return selectBinaryOp(I, ISD::SDIV);		return selectBinaryOp(I, ISD::SDIV);

[...]	case Instruction::Store:
return selectStore(I);		return selectStore(I);
case Instruction::FCmp:		case Instruction::FCmp:
case Instruction::ICmp:		case Instruction::ICmp:
return selectCmp(I);		return selectCmp(I);
case Instruction::Select:		case Instruction::Select:
return selectSelect(I);		return selectSelect(I);
case Instruction::Ret:		case Instruction::Ret:
return selectRet(I);		return selectRet(I);
case Instruction::FRem:
return selectFRem(I);
case Instruction::GetElementPtr:		case Instruction::GetElementPtr:
return selectGetElementPtr(I);		return selectGetElementPtr(I);
case Instruction::AtomicCmpXchg:		case Instruction::AtomicCmpXchg:
return selectAtomicCmpXchg(cast<AtomicCmpXchgInst>(I));		return selectAtomicCmpXchg(cast<AtomicCmpXchgInst>(I));
}		}

// fall-back to target-independent instruction selection.		// fall-back to target-independent instruction selection.
return selectOperator(I, I->getOpcode());		return selectOperator(I, I->getOpcode());