This is an archive of the discontinued LLVM Phabricator instance.

Differential D71742

Added intrinsics for access to FP environment
ClosedPublic

Authored by sepavloff on Dec 19 2019, 10:45 PM.

Details

Reviewers

hfinkel
kpn
andrew.w.kaylor
efriedma
bkramer
cameron.mcinally
uweigand
arsenm
jdoerfert

Commits

rGeecaeb6f100a: [FPEnv] Intrinsics for access to FP environment

Summary

The change implements the intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'.
They are used to read the floating-point environment, set it, or reset it to
a default state. They perform the same actions as the C library functions
'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls
to these functions.
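
As a rough illustration of the C library calls that these intrinsics mirror, the sketch below saves the environment, perturbs it, and then restores or resets it. The helper names `roundTripPreservesRounding` and `resetRestoresDefaultRounding` are made up for this example and are not part of the patch, and the mapping noted in the comments is only an analogy.

```cpp
#include <cassert>
#include <cfenv>

// Save the environment, change the rounding mode, then restore the saved
// environment. Returns true if the original rounding mode is back in effect.
bool roundTripPreservesRounding() {
  const int before = std::fegetround();
  std::fenv_t saved;
  int rc = std::fegetenv(&saved); // roughly what llvm.get.fpenv models
  assert(rc == 0);
  rc = std::fesetround(FE_TOWARDZERO); // perturb the environment
  assert(rc == 0);
  rc = std::fesetenv(&saved); // roughly what llvm.set.fpenv models
  assert(rc == 0);
  (void)rc;
  return std::fegetround() == before;
}

// Switch to a non-default rounding mode, then install the default
// environment. Returns true if round-to-nearest is in effect afterwards.
bool resetRestoresDefaultRounding() {
  int rc = std::fesetround(FE_UPWARD);
  assert(rc == 0);
  rc = std::fesetenv(FE_DFL_ENV); // roughly what llvm.reset.fpenv models
  assert(rc == 0);
  (void)rc;
  return std::fegetround() == FE_TONEAREST;
}
```

Both helpers are expected to return true on a hosted implementation that supports these rounding modes.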

The new intrinsics specify the FP environment as a value of integer type, which
is convenient for most targets, where the FP state is the content of some
register. Some targets, however, use long representations. On X86 the size
of the FP environment is 256 bits, and even half of this size is not a legal
integer type. To facilitate legalization in such cases, two sets of DAG
nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP
environment may be represented by a legal integer type. Nodes
GET_FPENV_MEM and SET_FPENV_MEM treat the FP environment as a region in
memory, much like fesetenv and fegetenv do. They are used when the
target has a long representation for the floating-point state.
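
The choice between the two node families can be pictured with the simplified sketch below. It assumes the decision reduces to comparing the environment size with the widest legal integer width, which is an approximation of what a backend actually checks. `FPEnvNodeKind`, `hostFPEnvBits` and `chooseNodeKind` are hypothetical names, and the host `fenv_t` size is only a proxy for the size LLVM uses for a given target.

```cpp
#include <cassert>
#include <cfenv>
#include <climits>
#include <cstddef>

// Hypothetical classification named after the two node families in the summary.
enum class FPEnvNodeKind { Register, Memory };

// Size in bits of the host C library's fenv_t. This is only a proxy: the size
// a compiler backend uses for a target need not match the host libc.
constexpr std::size_t hostFPEnvBits() {
  return sizeof(std::fenv_t) * CHAR_BIT;
}

// Simplified assumption: use the register-based nodes when the environment
// fits in the widest legal integer, and the memory-based nodes otherwise.
FPEnvNodeKind chooseNodeKind(std::size_t envBits,
                             std::size_t widestLegalIntBits) {
  assert(envBits > 0 && widestLegalIntBits > 0);
  return envBits <= widestLegalIntBits ? FPEnvNodeKind::Register
                                       : FPEnvNodeKind::Memory;
}
```

With the numbers from the summary, `chooseNodeKind(256, 64)` selects the memory-based kind, while a 32-bit environment on the same target selects the register-based kind.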

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sepavloff created this revision.Dec 19 2019, 10:45 PM

Herald added a project: Restricted Project. · Dec 19 2019, 10:45 PM

Herald added subscribers: jdoerfert, hiraditya.

sepavloff added a parent revision: D71741: Add size of FP environment to DataLayout.Dec 19 2019, 10:51 PM

Harbormaster completed remote builds in B42813: Diff 234828.Dec 19 2019, 10:53 PM

sepavloff added a child revision: D69798: Implement inlining of strictfp functions.Dec 19 2019, 10:57 PM

pengfei added a subscriber: pengfei.Dec 20 2019, 12:02 AM

sepavloff mentioned this in D69798: Implement inlining of strictfp functions.Dec 20 2019, 8:25 AM

I don't see the need. Changing the FP environment in a mixed environment program is the responsibility of the programmer, and standard calls already exist for this.

It is expected that targets will implement custom lowering to proper machine instructions for better performance.

Do you have any performance numbers?

In D71742#1793243, @kpn wrote:

I don't see the need. Changing the FP environment in a mixed environment program is the responsibility of the programmer, and standard calls already exist for this.

This is about inlining. In the code like this:

double f1(double x, double y) {
  return x + y;
}
double f2(double x, double y) {
  #pragma STDC FENV_ACCESS ON
  ...
  return f1(x, y);
}

the compiler might inline the call to f1 into f2. However, the inlined function f1 expects the default FP environment but is called in some other one.
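
To make the concern concrete, here is a small sketch showing that the same addition can give different results depending on the rounding mode in effect when it runs. `addInCurrentEnv` stands in for f1 and `addUnderRounding` for f2; both names are invented for this example. The FENV_ACCESS pragma is left out because some compilers ignore it, so volatile variables are used instead to keep the addition from being folded or moved across the mode changes.

```cpp
#include <cassert>
#include <cfenv>

// Stand-in for f1: volatile variables keep the compiler from folding the
// addition at compile time or moving it across the fesetround calls.
double addInCurrentEnv(double x, double y) {
  volatile double a = x;
  volatile double b = y;
  volatile double sum = a + b;
  return sum;
}

// Stand-in for f2: runs the addition under the given rounding mode and then
// puts the previous mode back.
double addUnderRounding(int mode, double x, double y) {
  const int previous = std::fegetround();
  int rc = std::fesetround(mode);
  assert(rc == 0);
  const double result = addInCurrentEnv(x, y);
  rc = std::fesetround(previous);
  assert(rc == 0);
  (void)rc;
  return result;
}
```

For example, `addUnderRounding(FE_TONEAREST, 1.0, 1e-30)` rounds back to 1.0, whereas the same call with `FE_UPWARD` yields a value slightly greater than 1.0.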

In D71742#1793447, @efriedma wrote:

It is expected that targets will implement custom lowering to proper machine instructions for better performance.

Do you have any performance numbers?

I don't have them. But there are some considerations why optimization may be desirable:

For some targets the FP environment is the content of 1 or 2 registers. In this case custom lowering can keep the environment in registers and avoid the cost of function calls and memory operations.
Some targets encode the FP environment in instructions. In this case these intrinsics should not produce any code.
We probably need to reset the FP environment before calls to external functions in a strictfp function and restore it upon return. In this case the number of calls to such intrinsics may be large.
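
The last consideration (reset before an external call, restore afterwards) can be sketched with the C library equivalents as follows. `DefaultFPEnvScope`, `externalCallee` and `calleeSeesDefaultEnv` are hypothetical names used only for this illustration; the patch itself does not contain such a helper.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical RAII helper: switches to the default environment on entry and
// restores the caller's environment on exit.
class DefaultFPEnvScope {
public:
  DefaultFPEnvScope() {
    int rc = std::fegetenv(&Saved);
    assert(rc == 0);
    rc = std::fesetenv(FE_DFL_ENV);
    assert(rc == 0);
    (void)rc;
  }
  ~DefaultFPEnvScope() { std::fesetenv(&Saved); }
  DefaultFPEnvScope(const DefaultFPEnvScope &) = delete;
  DefaultFPEnvScope &operator=(const DefaultFPEnvScope &) = delete;

private:
  std::fenv_t Saved;
};

// Stand-in for an external function that expects the default environment;
// it just reports the rounding mode it observes.
int externalCallee() { return std::fegetround(); }

// Calls the callee inside the scope and checks that the callee saw the
// default mode while the caller's mode survived the call.
bool calleeSeesDefaultEnv() {
  const int original = std::fegetround();
  const int callerMode = FE_DOWNWARD;
  std::fesetround(callerMode);
  int seen;
  {
    DefaultFPEnvScope Scope;
    seen = externalCallee();
  }
  const bool restored = std::fegetround() == callerMode;
  std::fesetround(original);
  return seen == FE_TONEAREST && restored;
}
```

Every guarded call pays for one save, one reset and one restore, which is the reason the comment above expects the number of such operations to grow.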

In D71742#1794638, @sepavloff wrote:
This is about inlining. In the code like this:
double f1(double x, double y) {
  return x + y;
}
double f2(double x, double y) {
  #pragma STDC FENV_ACCESS ON
  ...
  return f1(x, y);
}
the compiler might inline the call to f1 into f2. However, the inlined function f1 expects the default FP environment but is called in some other one.

Nothing here changes my statement. The compiler does _not_ change the FP environment because of the #pragma. So f1() here would behave the same whether it was inlined or not.

In D71742#1794901, @kpn wrote:
Nothing here changes my statement. The compiler does _not_ change the FP environment because of the #pragma. So f1() here would behave the same whether it was inlined or not.

When f1 is defined, no pragma is in effect, so its body is executed in the default FP environment. f2 contains the #pragma, so the FP environment in its body may differ from the default. When f1 is inlined into f2, the body of f1 becomes a part of the body of f2. Basic blocks of f1 would be executed in the environment set in f2. To keep the semantics of f1 we must execute its BBs in the default environment.

In D71742#1795013, @sepavloff wrote:
When f1 is defined, no pragma is in effect, so its body is executed in the default FP environment. f2 contains the #pragma, so the FP environment in its body may differ from the default. When f1 is inlined into f2, the body of f1 becomes a part of the body of f2. Basic blocks of f1 would be executed in the environment set in f2. To keep the semantics of f1 we must execute its BBs in the default environment.

I don’t see how inlining is any different than f2 calling f1 without it being inlined. f1 would execute with the f2 environment in that case too.

arsenm added a subscriber: arsenm.Dec 23 2019, 9:01 AM

arsenm added inline comments.

llvm/docs/LangRef.rst
25427	Why a pointer? Why not an i64 argument? The lowering of this through memory seems problematic for any target that doesn't implement this as the fe libcalls

In D71742#1795014, @craig.topper wrote:
I don’t see how inlining is any different than f2 calling f1 without it being inlined. f1 would execute with the f2 environment in that case too.

Exactly. It is the responsibility of the programmer to ensure f1 is being called with the correct environment set. C11 7.6.1.2 explicitly says:

If part of a program [...] runs under non-default mode settings, but was translated with the state for the FENV_ACCESS pragma "off", the behavior is undefined.

The only issue for the compiler w.r.t. inlining is the case where the programmer writes correct code in f2, i.e. resets the environment to the default state before calling f1, but then f1 is inlined into f2. In this scenario, the compiler must preserve correctness by not scheduling any part of the inlined f1 before the instruction in f2 that resets the environment. This can be achieved e.g. by the inliner replacing all regular FP instructions with constrained intrinsics (which may specify the default mode).

sepavloff removed a child revision: D69798: Implement inlining of strictfp functions.Feb 3 2020, 5:37 AM

Updated patch

Rebased,
Used more consistent naming.

sepavloff edited the summary of this revision. · Jun 12 2020, 10:43 AM

Harbormaster failed remote builds in B60142: Diff 270451!Jun 12 2020, 11:26 AM

jdoerfert added inline comments.Jun 13 2020, 9:15 AM

llvm/include/llvm/IR/Intrinsics.td
1077	[Drive by][Unrelated] `nosync`, `nofree` missing. Should be `readonly` or `writeonly` I suspect. Arguments can be `nocapture`.

arsenm requested changes to this revision.Jun 13 2020, 9:57 AM

arsenm added inline comments.

llvm/include/llvm/IR/Intrinsics.td
1073–1075	Using a pointer for this is problematic, and one with a hardcoded 0 address space doubly so

This revision now requires changes to proceed.Jun 13 2020, 9:57 AM

Herald added a subscriber: wdng. · Jun 13 2020, 9:57 AM

Updated patch

fixed some formatting issues,
added missing attributes to the new intrinsics,
pointer arguments support address spaces.

sepavloff marked 3 inline comments as done.Jun 15 2020, 2:04 AM

sepavloff added inline comments.

llvm/include/llvm/IR/Intrinsics.td
1073–1075	Changed type to `llvm_anyptr_ty`, which must support arbitrary address spaces.
1077	Fixed, thank you.

Harbormaster failed remote builds in B60270: Diff 270683!Jun 15 2020, 3:14 AM

sepavloff added a child revision: D81833: [X86][FPEnv] Lowering of {get,set,reset}_fpenv.Jun 15 2020, 3:33 AM

arsenm added inline comments.Jun 15 2020, 5:08 AM

llvm/include/llvm/IR/Intrinsics.td
1073–1075	Accepting any address space is only a half-fix. Why not make this return llvm_anyint_ty, and define it as zext or truncated to the expected target size in the backend? This wouldn't require lowering to introduce stack usage for example for something that's usually read directly out of a register
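
The "zext or truncated to the expected target size" idea from this comment can be illustrated on plain integers. The sketch below only covers widths up to 64 bits, and `lowMask` and `zextOrTrunc` are hypothetical helper names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `bits` bits set. The 64-bit case is handled separately
// because shifting by the full width would be undefined.
std::uint64_t lowMask(unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  return bits == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Reinterpret a value that is `fromBits` wide as a value `toBits` wide:
// widening zero-extends, narrowing drops the high bits.
std::uint64_t zextOrTrunc(std::uint64_t value, unsigned fromBits,
                          unsigned toBits) {
  return value & lowMask(fromBits) & lowMask(toBits);
}
```

Narrowing a 16-bit value to 8 bits keeps only the low byte, while widening an 8-bit value to 32 bits leaves it unchanged.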

sepavloff added a child revision: D81843: [ARM][FPEnv] Lowering of {get,set,reset}_fpenv.Jun 15 2020, 8:05 AM

sepavloff marked 2 inline comments as done.Jun 15 2020, 9:52 AM

sepavloff added inline comments.

llvm/include/llvm/IR/Intrinsics.td
1073–1075	> Why not make this return llvm_anyint_ty, and define it as zext or truncated to the expected target size in the backend?
In this case X86 represents the FP environment as `i256`. It is not clear how to legalize this scalar type. Passing FPEnv through memory avoids issues in legalization. For targets that get/set the environment by simple register moves, the intermediate stack slot is eliminated (see D81843). Well, almost eliminated: only the write to the stack slot remains, but that is a DCE deficiency.

Removed dependency on DataLayout

For the task of the current patch availability of FPEnv size in
DataLayout is not strictly necessary.

Herald added a reviewer: jdoerfert. · Jun 22 2020, 2:52 AM

sepavloff removed a parent revision: D71741: Add size of FP environment to DataLayout.Jun 22 2020, 2:53 AM

sepavloff mentioned this in D71741: Add size of FP environment to DataLayout.Jun 22 2020, 2:56 AM

Harbormaster failed remote builds in B61199: Diff 272363!Jun 22 2020, 4:48 AM

arsenm added inline comments.Jun 23 2020, 8:50 AM

llvm/include/llvm/IR/Intrinsics.td
1073–1075	I'm not sure what you mean. You could legalize i256 by storing to the stack and reloading parts for all operations? This probably already works.
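
A loose analogy for "storing to the stack and reloading parts" is shown below on a 256-bit value modeled as raw bytes. It is not how the type legalizer is implemented; `Env256`, `splitIntoParts`, `joinParts` and `splitAndJoinRoundTrips` are names invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// A 256-bit value modeled as 32 raw bytes (hypothetical stand-in for an i256).
using Env256 = std::array<unsigned char, 32>;
using Parts = std::array<std::uint64_t, 4>;

// Copy the wide value into a scratch buffer, then read it back as four
// 64-bit pieces.
Parts splitIntoParts(const Env256 &env) {
  unsigned char scratch[32];
  std::memcpy(scratch, env.data(), sizeof(scratch));
  Parts parts{};
  for (std::size_t i = 0; i < parts.size(); ++i)
    std::memcpy(&parts[i], scratch + i * sizeof(std::uint64_t),
                sizeof(std::uint64_t));
  return parts;
}

// Inverse operation: write the four pieces back to back and read the bytes
// as one wide value.
Env256 joinParts(const Parts &parts) {
  static_assert(sizeof(Parts) == 32, "expected four 8-byte parts");
  Env256 env{};
  std::memcpy(env.data(), parts.data(), env.size());
  return env;
}

// Checks that splitting and joining reproduces the original bytes.
bool splitAndJoinRoundTrips() {
  Env256 env{};
  for (std::size_t i = 0; i < env.size(); ++i)
    env[i] = static_cast<unsigned char>(i * 7 + 1);
  return joinParts(splitIntoParts(env)) == env;
}
```

The scratch buffer plays the role of the stack slot, which is exactly the memory traffic the later comments in this thread try to avoid for targets that keep the environment in a register.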

sepavloff marked an inline comment as done.Jun 26 2020, 4:09 AM

sepavloff added inline comments.

llvm/include/llvm/IR/Intrinsics.td
1073–1075	I tried to implement a solution in which `get.fpenv` returns an integer value. The solution seems to work, however it has its own drawbacks:

In the case of x86, `get.fpenv` returns `i256`. This is an illegal type, so we cannot set custom lowering through `setOperationAction`. `TLI.getOperationAction` always returns `Expand` as the type is extended. To cope with this difficulty, `DAGTypeLegalizer::ExpandIntegerResult` has to call `TLI.ReplaceNodeResults` directly.
It is necessary to use a special node class to represent `GET_FPENV`, otherwise `TLI.ReplaceNodeResults` has no access to the chain token argument.
The code created for the source:

define void @func_01(i256* %fpenv) {
entry:
  %fpe = call i256 @llvm.get.fpenv.i256()
  store i256 %fpe, i256* %fpenv
  ret void
}

uses a stack variable, although it should use the pointer argument directly. An optimization that would eliminate it is currently absent.

This way is possible, but it requires more effort to implement. There must be serious reasons why we prefer this way over using a pointer argument. Could you please explain the concern with using a pointer in the intrinsic? What is wrong with the implementation used in D81843? Why can it not be used by targets where the FP environment is simply the content of a register?

arsenm added inline comments.Jul 17 2020, 12:55 PM

llvm/include/llvm/IR/Intrinsics.td
1073–1075	It could be, but it forces the lowering to introduce stack usage. Now an alloca has to stick around for the scratch pad until codegen, and SROA can't eliminate it. For AMDGPU for example, we change ABI lowering and other optimization strategies based on whether there is stack usage in the incoming IR. We don't really have other intrinsics that force you into this situation. I also noticed we have llvm.flt.rounds already, so I'm confused why this is different.

sepavloff marked an inline comment as done.Jul 20 2020, 5:00 AM

sepavloff added inline comments.

llvm/include/llvm/IR/Intrinsics.td
1073–1075	Thank you for the explanation. I'll try to work out the other way.
> I also noticed we have llvm.flt.rounds already, so I'm confused why this is different
llvm.flt.rounds only reads the current rounding mode. These intrinsics save/restore the entire FP environment.
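
The difference can be seen with the C library counterparts: the environment carries the exception status flags as well as the rounding mode. The sketch below, with the made-up helper name `envRoundTripRestoresFlags`, raises a flag, saves the environment, clears the flags, and checks that restoring the environment brings the flag back.

```cpp
#include <cassert>
#include <cfenv>

// Returns true if an exception flag that was set when the environment was
// saved is visible again after the flags are cleared and the saved
// environment is reinstalled.
bool envRoundTripRestoresFlags() {
  std::fenv_t original;
  std::fegetenv(&original); // remember the caller's state

  std::feclearexcept(FE_ALL_EXCEPT);
  std::feraiseexcept(FE_DIVBYZERO);
  std::fenv_t withFlag;
  int rc = std::fegetenv(&withFlag); // snapshot includes the raised flag
  assert(rc == 0);

  std::feclearexcept(FE_ALL_EXCEPT);
  const bool cleared = std::fetestexcept(FE_DIVBYZERO) == 0;

  rc = std::fesetenv(&withFlag); // reinstall the snapshot
  assert(rc == 0);
  (void)rc;
  const bool restored = std::fetestexcept(FE_DIVBYZERO) != 0;

  std::fesetenv(&original); // put the caller's state back
  return cleared && restored;
}
```

A rounding-mode query alone could not capture the flag state that this round trip preserves.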

jdoerfert resigned from this revision.Jul 20 2020, 6:38 AM

Intrinsics now do not use pointers

Harbormaster completed remote builds in B71530: Diff 291501.Sep 14 2020, 1:35 AM

sepavloff mentioned this in D99083: [RISCV] Introduce floating point control and state registers.Apr 7 2021, 8:47 PM

Is this still relevant?

Herald added a project: Restricted Project. · Sep 21 2022, 5:28 PM

Updated the patch

Harbormaster completed remote builds in B188188: Diff 462193.Sep 22 2022, 9:05 AM

Matt added a subscriber: Matt.Sep 27 2022, 11:41 AM

Updated patch

Use special DAG nodes if FP environment is passed in memory,
Various cleanups.

sepavloff edited the summary of this revision. · May 10 2023, 8:36 AM

Harbormaster completed remote builds in B231104: Diff 521004.May 10 2023, 10:03 AM

Updated patch after test started using autogenerated assertions in another commit.

Harbormaster completed remote builds in B231547: Diff 521579.May 12 2023, 2:28 AM

sepavloff added a child revision: D150437: [FPEnv] Get rid of extra moves in fpenv calls.May 12 2023, 3:56 AM

Add memoperand to GET_FPENV_MEM and SET_FPENV_MEM

Harbormaster completed remote builds in B232570: Diff 522994.May 17 2023, 5:10 AM

arsenm added inline comments.May 19 2023, 11:09 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6578–6580	Can this just go into an Expand action like normal? We randomly expand certain things in the DAG builder but the inconsistency is annoying

arsenm added inline comments.May 19 2023, 11:11 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
12423	If not getting it from the actual declaration, should probably use the default program address space (don't really care if you fix it here, 90% of the uses of this are wrong as it is)
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6618	This should use the alignment for EnvVT, not the datalayout's stack alignment

Updated patch

Use type alignment rather than stack one,
Use getPointerMemTy instead of getPointerTy.

sepavloff added inline comments.May 25 2023, 5:57 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
12423	IIRC the default program address space is for function code. I used `getPointerMemTy` instead of `getPointerTy`, although for now they behave the same way. I don't know how to put the correct address space here, as the declaration of the function is unavailable.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6578–6580	This code does the same job as the other cases in this function: it creates the relevant DAG nodes. Their expansion occurs in `SelectionDAG::Legalize`, as for other nodes. The only difference is the use of two kinds of nodes, register-based and memory-based ones. Using only one kind of node could simplify the code in this function but complicates code in other parts. On some targets the FP environment is long and the corresponding integer type is illegal (i256 on x86). In such cases the DAG nodes have to be expanded early, when the DAG legalizes types. If the FP environment can be represented by a legal type, the expansion occurs late, in `SelectionDAG::Legalize`. So there were two points at which the expansion occurred. This approach was used in previous versions of this patch; it complicated the implementation and made it fragile. Using separate nodes for operations that involve memory allows all FP environment operations to be treated in a uniform manner, at the cost of a small complication of this function.
6618	Fixed.

Harbormaster completed remote builds in B234479: Diff 525563.May 25 2023, 7:31 AM

Ping.

LGTM with the getPointerMemTy thing reverted

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
12423	pointer mem ty makes less sense
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6578–6580	This hack doesn't require repeating in globalisel
6588	CreateStackTemporary should be fixed to directly take Align input

This revision is now accepted and ready to land.Jun 2 2023, 12:57 PM

This revision was landed with ongoing or failed builds.Jun 4 2023, 11:11 PM

Closed by commit rGeecaeb6f100a: [FPEnv] Intrinsics for access to FP environment (authored by sepavloff).

This revision was automatically updated to reflect the committed changes.

sepavloff added a commit: rGeecaeb6f100a: [FPEnv] Intrinsics for access to FP environment.

Revision Contents

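Path                                                     Size
llvm/docs/LangRef.rst                                    75 lines
llvm/include/llvm/CodeGen/ISDOpcodes.h                   24 lines
llvm/include/llvm/CodeGen/SelectionDAG.h                 8 lines
llvm/include/llvm/CodeGen/SelectionDAGNodes.h            19 lines
llvm/include/llvm/IR/Intrinsics.td                       3 lines
llvm/include/llvm/IR/RuntimeLibcalls.def                 4 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp            27 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp           86 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    58 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp     5 lines
llvm/lib/CodeGen/TargetLoweringBase.cpp                  8 lines
llvm/test/CodeGen/ARM/fpenv.ll                           54 lines
llvm/test/CodeGen/X86/fpenv.ll                           230 lines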

Diff 528295

llvm/docs/LangRef.rst

[...]
 """"""""""

 The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
 similar to C library function 'fesetround', however this intrinsic does not
 return any value and uses platform-independent representation of IEEE rounding
 modes.

+
+'``llvm.get.fpenv``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <integer_type> @llvm.get.fpenv()
+
+Overview:
+"""""""""
+
+The '``llvm.get.fpenv``' intrinsic returns bits of the current floating-point
+environment. The return value type is platform-specific.
+
+Semantics:
+""""""""""
+
+The '``llvm.get.fpenv``' intrinsic reads the current floating-point environment
+and returns it as an integer value.
+
+
+'``llvm.set.fpenv``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare void @llvm.set.fpenv(<integer_type> <val>)
+
+Overview:
+"""""""""
+
+The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment.
+
+Arguments:
+""""""""""
+
+The argument is an integer representing the new floating-point environment. The
+integer type is platform-specific.
+
+Semantics:
+""""""""""
+
+The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment
+to the state specified by the argument. The state may be previously obtained by a
+call to '``llvm.get.fpenv``' or synthesised in a platform-dependent way.
+
+
+'``llvm.reset.fpenv``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare void @llvm.reset.fpenv()
+
+Overview:
+"""""""""
+
+The '``llvm.reset.fpenv``' intrinsic sets the default floating-point environment.
+
+Semantics:
+""""""""""
+
+The '``llvm.reset.fpenv``' intrinsic sets the current floating-point environment
+to default state. It is similar to the call 'fesetenv(FE_DFL_ENV)', except it
+does not return any value.
+

 Floating-Point Test Intrinsics
 ------------------------------

 These functions get properties of floating-point values.


 .. _llvm.is.fpclass:

[...]

 Syntax:
 """""""

 ::

       declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32 <int>)

 Overview:
 """""""""

 The '``llvm.var.annotation``' intrinsic.

 Arguments:
 """"""""""

 The first argument is a pointer to a value, the second is a pointer to a
[...]

llvm/include/llvm/CodeGen/ISDOpcodes.h

[...] enum NodeType {
   /// as less than 0.0. While FMINNUM_IEEE/FMAXNUM_IEEE follow IEEE 754-2008
   /// semantics, FMINIMUM/FMAXIMUM follow IEEE 754-2018 draft semantics.
   FMINIMUM,
   FMAXIMUM,

   /// FSINCOS - Compute both fsin and fcos as a single operation.
   FSINCOS,

+  /// Gets the current floating-point environment. The first operand is a token
+  /// chain. The results are FP environment, represented by an integer value,
+  /// and a token chain.
+  GET_FPENV,
+
+  /// Sets the current floating-point environment. The first operand is a token
+  /// chain, the second is FP environment, represented by an integer value. The
+  /// result is a token chain.
+  SET_FPENV,
+
+  /// Set floating-point environment to default state. The first operand and the
+  /// result are token chains.
+  RESET_FPENV,
+
+  /// Gets the current floating-point environment. The first operand is a token
+  /// chain, the second is a pointer to memory, where FP environment is stored
+  /// to. The result is a token chain.
+  GET_FPENV_MEM,
+
+  /// Sets the current floating point environment. The first operand is a token
+  /// chain, the second is a pointer to memory, where FP environment is loaded
+  /// from. The result is a token chain.
+  SET_FPENV_MEM,
+
   /// LOAD and STORE have token chains as their first operand, then the same
   /// operands as an LLVM load/store instruction, then an offset node that
   /// is added / subtracted from the base pointer to form the address (for
   /// indexed memory ops).
   LOAD,
   STORE,

   /// DYNAMIC_STACKALLOC - Allocate some number of bytes on the stack aligned
[...]

llvm/include/llvm/CodeGen/SelectionDAG.h

[...]
   SDValue getMaskedGather(SDVTList VTs, EVT MemVT, const SDLoc &dl,
                           ArrayRef<SDValue> Ops, MachineMemOperand *MMO,
                           ISD::MemIndexType IndexType, ISD::LoadExtType ExtTy);
   SDValue getMaskedScatter(SDVTList VTs, EVT MemVT, const SDLoc &dl,
                            ArrayRef<SDValue> Ops, MachineMemOperand *MMO,
                            ISD::MemIndexType IndexType,
                            bool IsTruncating = false);

+  SDValue getGetFPEnv(SDValue Chain, const SDLoc &dl, SDValue Ptr, EVT MemVT,
+                      MachineMemOperand *MMO);
+  SDValue getSetFPEnv(SDValue Chain, const SDLoc &dl, SDValue Ptr, EVT MemVT,
+                      MachineMemOperand *MMO);
+
   /// Construct a node to track a Value* through the backend.
   SDValue getSrcValue(const Value *v);

   /// Return an MDNodeSDNode which holds an MDNode.
   SDValue getMDNode(const MDNode *MD);

   /// Return a bitcast using the SDLoc of the value operand, and casting to the
   /// provided type. Use getNode to set a custom SDLoc.
[...] bool isSafeToSpeculativelyExecute(unsigned Opcode) const {
     case ISD::UREM:
     case ISD::UDIVREM:
       return false;
     default:
       return true;
     }
   }

+  SDValue makeStateFunctionCall(unsigned LibFunc, SDValue Ptr, SDValue InChain,
+                                const SDLoc &DLoc);
+
 private:
   void InsertNode(SDNode *N);
   bool RemoveNodeFromCSEMaps(SDNode *N);
   void AddModifiedNodeToCSEMaps(SDNode *N);
   SDNode *FindModifiedNodeSlot(SDNode *N, SDValue Op, void *&InsertPos);
   SDNode *FindModifiedNodeSlot(SDNode *N, SDValue Op1, SDValue Op2,
                                void *&InsertPos);
   SDNode *FindModifiedNodeSlot(SDNode *N, ArrayRef<SDValue> Ops,
[...]

llvm/include/llvm/CodeGen/SelectionDAGNodes.h

[...] static bool classof(const SDNode *N) {
     case ISD::MGATHER:
     case ISD::MSCATTER:
     case ISD::VP_LOAD:
     case ISD::VP_STORE:
     case ISD::VP_GATHER:
     case ISD::VP_SCATTER:
     case ISD::EXPERIMENTAL_VP_STRIDED_LOAD:
     case ISD::EXPERIMENTAL_VP_STRIDED_STORE:
+    case ISD::GET_FPENV_MEM:
+    case ISD::SET_FPENV_MEM:
       return true;
     default:
       return N->isMemIntrinsic() || N->isTargetMemoryOpcode();
     }
   }
 };

 /// This is an SDNode representing atomic operations.
[...] public:

   const SDValue &getValue() const { return getOperand(1); }

   static bool classof(const SDNode *N) {
     return N->getOpcode() == ISD::MSCATTER;
   }
 };

+class FPStateAccessSDNode : public MemSDNode {
+public:
+  friend class SelectionDAG;
+
+  FPStateAccessSDNode(unsigned NodeTy, unsigned Order, const DebugLoc &dl,
+                      SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
+      : MemSDNode(NodeTy, Order, dl, VTs, MemVT, MMO) {
+    assert((NodeTy == ISD::GET_FPENV_MEM || NodeTy == ISD::SET_FPENV_MEM) &&
+           "Expected FP state access node");
+  }
+
+  static bool classof(const SDNode *N) {
+    return N->getOpcode() == ISD::GET_FPENV_MEM ||
+           N->getOpcode() == ISD::SET_FPENV_MEM;
+  }
+};
+
 /// An SDNode that represents everything that will be needed
 /// to construct a MachineInstr. These nodes are created during the
 /// instruction selection proper phase.
 ///
 /// Note that the only supported way to set the `memoperands` is by calling the
 /// `SelectionDAG::setNodeMemRefs` function as the memory management happens
 /// inside the DAG rather than in the node.
 class MachineSDNode : public SDNode {
[...]

llvm/include/llvm/IR/Intrinsics.td

Show First 20 Lines • Show All 1,064 Lines • ▼ Show 20 Lines	def int_objectsize : DefaultAttrsIntrinsic<[llvm_anyint_ty],
ImmArg<ArgIndex<3>>]>,		ImmArg<ArgIndex<3>>]>,
ClangBuiltin<"__builtin_object_size">;		ClangBuiltin<"__builtin_object_size">;

//===--------------- Access to Floating Point Environment -----------------===//		//===--------------- Access to Floating Point Environment -----------------===//
//		//

let IntrProperties = [IntrInaccessibleMemOnly, IntrWillReturn] in {		let IntrProperties = [IntrInaccessibleMemOnly, IntrWillReturn] in {
def int_get_rounding : DefaultAttrsIntrinsic<[llvm_i32_ty], []>;		def int_get_rounding : DefaultAttrsIntrinsic<[llvm_i32_ty], []>;
def int_set_rounding : DefaultAttrsIntrinsic<[], [llvm_i32_ty]>;		def int_set_rounding : DefaultAttrsIntrinsic<[], [llvm_i32_ty]>;
		def int_get_fpenv : DefaultAttrsIntrinsic<[llvm_anyint_ty], []>;
		def int_set_fpenv : DefaultAttrsIntrinsic<[], [llvm_anyint_ty]>;
		arsenm: Using a pointer for this is problematic, and one with a hardcoded 0 address space doubly so.
		sepavloff: Changed type to `llvm_anyptr_ty`, which must support arbitrary address spaces.
		arsenm: Accepting any address space is only a half-fix. Why not make this return llvm_anyint_ty, and define it as zext or truncated to the expected target size in the backend? This wouldn't require lowering to introduce stack usage, for example for something that's usually read directly out of a register.
		sepavloff:
		> Why not make this return llvm_anyint_ty, and define it as zext or truncated to the expected target size in the backend?
		In this case X86 represents the FP environment as `i256`. It is not clear how to legalize this scalar type. Passing FPEnv through memory avoids the legalization issues. For targets that get/set the environment by simple register moves, the intermediate stack slot is eliminated (see D81843). Well, almost eliminated: only the write to the stack slot remains, but that is a DCE deficiency.
		arsenm: I'm not sure what you mean. You could legalize i256 by storing to the stack and reloading parts for all operations? This probably already works.
		sepavloff: I tried to implement a solution in which `get.fpenv` returns an integer value. The solution seems to work, however it has its own drawbacks:
		- In the case of x86, `get.fpenv` returns `i256`. This is an illegal type, so we cannot set custom lowering through `setOperationAction`. `TLI.getOperationAction` always returns `Expand` as the type is extended. To cope with this difficulty, `DAGTypeLegalizer::ExpandIntegerResult` has to call `TLI.ReplaceNodeResults` directly.
		- It is necessary to use a special node class to represent `GET_FPENV`, otherwise `TLI.ReplaceNodeResults` has no access to the chain token argument.
		- The code created for the source:
		    define void @func_01(i256* %fpenv) {
		    entry:
		      %fpe = call i256 @llvm.get.fpenv.i256()
		      store i256 %fpe, i256* %fpenv
		      ret void
		    }
		  uses a stack variable, although it should use the pointer argument directly. An optimization that would eliminate it is currently absent.
		This way is possible, but it requires more effort to implement. There must be serious reasons to prefer it over using a pointer argument. Could you please explain the concern with using a pointer in the intrinsic? What is wrong with the implementation used in D81843? Why can it not be used by targets where the FP environment is simply the content of a register?
		arsenm: It could be, but it forces the lowering to introduce stack usage. Now an alloca has to stick around for the scratch pad until codegen, and SROA can't eliminate it. For AMDGPU for example, we change ABI lowering and other optimization strategies based on whether there is stack usage in the incoming IR. We don't really have other intrinsics that force you into this situation. I also noticed we have llvm.flt.rounds already, so I'm confused why this is different.
		sepavloff: Thank you for the explanation. I'll try to elaborate the other way.
		> I also noticed we have llvm.flt.rounds already, so I'm confused why this is different
		llvm.flt.rounds only reads the current rounding mode. These intrinsics save/restore the entire FP environment.
		def int_reset_fpenv : DefaultAttrsIntrinsic<[], []>;
}		}
		jdoerfert: [Drive by][Unrelated] `nosync`, `nofree` missing. Should be `readonly` or `writeonly` I suspect. Arguments can be `nocapture`.
		sepavloff: Fixed, thank you.

//===--------------- Floating Point Properties ----------------------------===//		//===--------------- Floating Point Properties ----------------------------===//
//		//

def int_is_fpclass		def int_is_fpclass
: DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],		: DefaultAttrsIntrinsic<[LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>],
[llvm_anyfloat_ty, llvm_i32_ty],		[llvm_anyfloat_ty, llvm_i32_ty],
[IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<1>>]>;		[IntrNoMem, IntrSpeculatable, ImmArg<ArgIndex<1>>]>;
▲ Show 20 Lines • Show All 1,440 Lines • Show Last 20 Lines
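
The inline thread above distinguishes `llvm.flt.rounds`, which only reads the rounding mode, from the new intrinsics, which save and restore the entire FP environment. A minimal C++ sketch of that difference at the C library level follows, using `fegetenv`/`fesetenv`, which the summary names as the default lowering. The helper `perturbAndRestore` and the struct `EnvProbe` are hypothetical names introduced only for this illustration, and the sketch assumes a target that defines `FE_UPWARD` and `FE_INEXACT`.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical probe: state observed after a save/perturb/restore cycle.
struct EnvProbe {
  int roundingMode;   // std::fegetround() after the restore
  bool inexactRaised; // whether FE_INEXACT is still set after the restore
};

// Hypothetical helper: saves the whole environment, changes both the control
// part (rounding mode) and the status part (exception flags), then restores.
EnvProbe perturbAndRestore() {
  std::feclearexcept(FE_ALL_EXCEPT);
  std::fenv_t saved;
  std::fegetenv(&saved);          // library analogue of llvm.get.fpenv

  std::fesetround(FE_UPWARD);     // perturb the rounding mode
  std::feraiseexcept(FE_INEXACT); // perturb the exception flags

  std::fesetenv(&saved);          // library analogue of llvm.set.fpenv
  return {std::fegetround(), std::fetestexcept(FE_INEXACT) != 0};
}
```

Restoring with `fesetenv` brings back both the rounding mode and the exception flags, which a rounding-mode-only interface cannot do.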

llvm/include/llvm/IR/RuntimeLibcalls.def

	Show First 20 Lines • Show All 274 Lines • ▼ Show 20 Lines
	HANDLE_LIBCALL(LRINT_F128, "lrintl")			HANDLE_LIBCALL(LRINT_F128, "lrintl")
	HANDLE_LIBCALL(LRINT_PPCF128, "lrintl")			HANDLE_LIBCALL(LRINT_PPCF128, "lrintl")
	HANDLE_LIBCALL(LLRINT_F32, "llrintf")			HANDLE_LIBCALL(LLRINT_F32, "llrintf")
	HANDLE_LIBCALL(LLRINT_F64, "llrint")			HANDLE_LIBCALL(LLRINT_F64, "llrint")
	HANDLE_LIBCALL(LLRINT_F80, "llrintl")			HANDLE_LIBCALL(LLRINT_F80, "llrintl")
	HANDLE_LIBCALL(LLRINT_F128, "llrintl")			HANDLE_LIBCALL(LLRINT_F128, "llrintl")
	HANDLE_LIBCALL(LLRINT_PPCF128, "llrintl")			HANDLE_LIBCALL(LLRINT_PPCF128, "llrintl")

				// Floating point environment
				HANDLE_LIBCALL(FEGETENV, "fegetenv")
				HANDLE_LIBCALL(FESETENV, "fesetenv")

	// Conversion			// Conversion
	HANDLE_LIBCALL(FPEXT_F32_PPCF128, "__gcc_stoq")			HANDLE_LIBCALL(FPEXT_F32_PPCF128, "__gcc_stoq")
	HANDLE_LIBCALL(FPEXT_F64_PPCF128, "__gcc_dtoq")			HANDLE_LIBCALL(FPEXT_F64_PPCF128, "__gcc_dtoq")
	HANDLE_LIBCALL(FPEXT_F80_F128, "__extendxftf2")			HANDLE_LIBCALL(FPEXT_F80_F128, "__extendxftf2")
	HANDLE_LIBCALL(FPEXT_F64_F128, "__extenddftf2")			HANDLE_LIBCALL(FPEXT_F64_F128, "__extenddftf2")
	HANDLE_LIBCALL(FPEXT_F32_F128, "__extendsftf2")			HANDLE_LIBCALL(FPEXT_F32_F128, "__extendsftf2")
	HANDLE_LIBCALL(FPEXT_F16_F128, "__extendhftf2")			HANDLE_LIBCALL(FPEXT_F16_F128, "__extendhftf2")
	HANDLE_LIBCALL(FPEXT_F16_F80, "__extendhfxf2")			HANDLE_LIBCALL(FPEXT_F16_F80, "__extendhfxf2")
	▲ Show 20 Lines • Show All 304 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Show First 20 Lines • Show All 993 Lines • ▼ Show 20 Lines	Action = TLI.getOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
break;		break;
case ISD::VAARG:		case ISD::VAARG:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
if (Action != TargetLowering::Promote)		if (Action != TargetLowering::Promote)
Action = TLI.getOperationAction(Node->getOpcode(), MVT::Other);		Action = TLI.getOperationAction(Node->getOpcode(), MVT::Other);
break;		break;
		case ISD::SET_FPENV:
		Action = TLI.getOperationAction(Node->getOpcode(),
		Node->getOperand(1).getValueType());
		break;
case ISD::FP_TO_FP16:		case ISD::FP_TO_FP16:
case ISD::FP_TO_BF16:		case ISD::FP_TO_BF16:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
case ISD::LROUND:		case ISD::LROUND:
case ISD::LLROUND:		case ISD::LLROUND:
case ISD::LRINT:		case ISD::LRINT:
▲ Show 20 Lines • Show All 3,445 Lines • ▼ Show 20 Lines	case ISD::CTLZ_ZERO_UNDEF:
case MVT::i64:		case MVT::i64:
Results.push_back(ExpandLibCall(RTLIB::CTLZ_I64, Node, false));		Results.push_back(ExpandLibCall(RTLIB::CTLZ_I64, Node, false));
break;		break;
case MVT::i128:		case MVT::i128:
Results.push_back(ExpandLibCall(RTLIB::CTLZ_I128, Node, false));		Results.push_back(ExpandLibCall(RTLIB::CTLZ_I128, Node, false));
break;		break;
}		}
break;		break;
		case ISD::RESET_FPENV: {
		// It is legalized to call 'fesetenv(FE_DFL_ENV)'. On most targets
		// FE_DFL_ENV is defined as '((const fenv_t *) -1)' in glibc.
		SDValue Ptr = DAG.getIntPtrConstant(-1LL, dl);
		SDValue Chain = Node->getOperand(0);
		Results.push_back(
		DAG.makeStateFunctionCall(RTLIB::FESETENV, Ptr, Chain, dl));
		break;
		}
		case ISD::GET_FPENV_MEM: {
		SDValue Chain = Node->getOperand(0);
		SDValue EnvPtr = Node->getOperand(1);
		Results.push_back(
		DAG.makeStateFunctionCall(RTLIB::FEGETENV, EnvPtr, Chain, dl));
		break;
		}
		case ISD::SET_FPENV_MEM: {
		SDValue Chain = Node->getOperand(0);
		SDValue EnvPtr = Node->getOperand(1);
		Results.push_back(
		DAG.makeStateFunctionCall(RTLIB::FESETENV, EnvPtr, Chain, dl));
		break;
		}
}		}

// Replace the original node with the legalized result.		// Replace the original node with the legalized result.
if (!Results.empty()) {		if (!Results.empty()) {
LLVM_DEBUG(dbgs() << "Successfully converted node to libcall\n");		LLVM_DEBUG(dbgs() << "Successfully converted node to libcall\n");
ReplaceNode(Node, Results.data());		ReplaceNode(Node, Results.data());
} else		} else
LLVM_DEBUG(dbgs() << "Could not convert node to libcall\n");		LLVM_DEBUG(dbgs() << "Could not convert node to libcall\n");
▲ Show 20 Lines • Show All 689 Lines • Show Last 20 Lines
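
The `RESET_FPENV` expansion above is described in its comment as a call to `fesetenv(FE_DFL_ENV)`. A minimal C++ sketch of the same call made directly from user code is given here for orientation. The helper name `resetAndReadRounding` is hypothetical, and the expectation that the default rounding mode is `FE_TONEAREST` is an assumption that holds on common platforms rather than something the patch states.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: moves away from the default rounding mode, resets the
// environment the way the RESET_FPENV libcall expansion does, and reports the
// rounding mode that is in effect afterwards.
int resetAndReadRounding() {
  std::fesetround(FE_DOWNWARD); // leave the default state
  std::fesetenv(FE_DFL_ENV);    // same library call the expansion emits
  return std::fegetround();
}
```

Note that the legalizer passes the constant -1 as the pointer because, per the comment in the diff, glibc defines `FE_DFL_ENV` that way on most targets; portable user code should use the macro instead of the raw value.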

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp


Show First 20 Lines • Show All 9,216 Lines • ▼ Show 20 Lines	SDValue SelectionDAG::getMaskedScatter(SDVTList VTs, EVT MemVT, const SDLoc &dl,

CSEMap.InsertNode(N, IP);		CSEMap.InsertNode(N, IP);
InsertNode(N);		InsertNode(N);
SDValue V(N, 0);		SDValue V(N, 0);
NewSDValueDbgMsg(V, "Creating new node: ", this);		NewSDValueDbgMsg(V, "Creating new node: ", this);
return V;		return V;
}		}

		SDValue SelectionDAG::getGetFPEnv(SDValue Chain, const SDLoc &dl, SDValue Ptr,
		EVT MemVT, MachineMemOperand *MMO) {
		assert(Chain.getValueType() == MVT::Other && "Invalid chain type");
		SDVTList VTs = getVTList(MVT::Other);
		SDValue Ops[] = {Chain, Ptr};
		FoldingSetNodeID ID;
		AddNodeIDNode(ID, ISD::GET_FPENV_MEM, VTs, Ops);
		ID.AddInteger(MemVT.getRawBits());
		ID.AddInteger(getSyntheticNodeSubclassData<FPStateAccessSDNode>(
		ISD::GET_FPENV_MEM, dl.getIROrder(), VTs, MemVT, MMO));
		ID.AddInteger(MMO->getPointerInfo().getAddrSpace());
		ID.AddInteger(MMO->getFlags());
		void *IP = nullptr;
		if (SDNode *E = FindNodeOrInsertPos(ID, dl, IP))
		return SDValue(E, 0);

		auto *N = newSDNode<FPStateAccessSDNode>(ISD::GET_FPENV_MEM, dl.getIROrder(),
		dl.getDebugLoc(), VTs, MemVT, MMO);
		createOperands(N, Ops);

		CSEMap.InsertNode(N, IP);
		InsertNode(N);
		SDValue V(N, 0);
		NewSDValueDbgMsg(V, "Creating new node: ", this);
		return V;
		}

		SDValue SelectionDAG::getSetFPEnv(SDValue Chain, const SDLoc &dl, SDValue Ptr,
		EVT MemVT, MachineMemOperand *MMO) {
		assert(Chain.getValueType() == MVT::Other && "Invalid chain type");
		SDVTList VTs = getVTList(MVT::Other);
		SDValue Ops[] = {Chain, Ptr};
		FoldingSetNodeID ID;
		AddNodeIDNode(ID, ISD::SET_FPENV_MEM, VTs, Ops);
		ID.AddInteger(MemVT.getRawBits());
		ID.AddInteger(getSyntheticNodeSubclassData<FPStateAccessSDNode>(
		ISD::SET_FPENV_MEM, dl.getIROrder(), VTs, MemVT, MMO));
		ID.AddInteger(MMO->getPointerInfo().getAddrSpace());
		ID.AddInteger(MMO->getFlags());
		void *IP = nullptr;
		if (SDNode *E = FindNodeOrInsertPos(ID, dl, IP))
		return SDValue(E, 0);

		auto *N = newSDNode<FPStateAccessSDNode>(ISD::SET_FPENV_MEM, dl.getIROrder(),
		dl.getDebugLoc(), VTs, MemVT, MMO);
		createOperands(N, Ops);

		CSEMap.InsertNode(N, IP);
		InsertNode(N);
		SDValue V(N, 0);
		NewSDValueDbgMsg(V, "Creating new node: ", this);
		return V;
		}

SDValue SelectionDAG::simplifySelect(SDValue Cond, SDValue T, SDValue F) {		SDValue SelectionDAG::simplifySelect(SDValue Cond, SDValue T, SDValue F) {
// select undef, T, F --> T (if T is a constant), otherwise F		// select undef, T, F --> T (if T is a constant), otherwise F
// select, ?, undef, F --> F		// select, ?, undef, F --> F
// select, ?, T, undef --> T		// select, ?, T, undef --> T
if (Cond.isUndef())		if (Cond.isUndef())
return isConstantValueOfAnyType(T) ? T : F;		return isConstantValueOfAnyType(T) ? T : F;
if (T.isUndef())		if (T.isUndef())
return F;		return F;
▲ Show 20 Lines • Show All 3,104 Lines • ▼ Show 20 Lines	case ISD::FMAXNUM: {
if (Opcode == ISD::FMAXNUM)		if (Opcode == ISD::FMAXNUM)
NeutralAF.changeSign();		NeutralAF.changeSign();

return getConstantFP(NeutralAF, DL, VT);		return getConstantFP(NeutralAF, DL, VT);
}		}
}		}
}		}

		/// Helper used to make a call to a library function that has one argument of
		/// pointer type.
		///
		/// Such functions include 'fegetmode', 'fesetenv' and some others, which are
		/// used to get or set floating-point state. They have one argument of pointer
		/// type, which points to the memory region containing bits of the
		/// floating-point state. The value returned by such function is ignored in the
		/// created call.
		///
		/// \param LibFunc Reference to library function (value of RTLIB::Libcall).
		/// \param Ptr Pointer used to save/load state.
		/// \param InChain Ingoing token chain.
		/// \returns Outgoing chain token.
		SDValue SelectionDAG::makeStateFunctionCall(unsigned LibFunc, SDValue Ptr,
		SDValue InChain,
		const SDLoc &DLoc) {
		assert(InChain.getValueType() == MVT::Other && "Expected token chain");
		TargetLowering::ArgListTy Args;
		TargetLowering::ArgListEntry Entry;
		Entry.Node = Ptr;
		Entry.Ty = Ptr.getValueType().getTypeForEVT(*getContext());
		Args.push_back(Entry);
		RTLIB::Libcall LC = static_cast<RTLIB::Libcall>(LibFunc);
		SDValue Callee = getExternalSymbol(TLI->getLibcallName(LC),
		TLI->getPointerTy(getDataLayout()));
		arsenm: If not getting it from the actual declaration, should probably use the default program address space (don't really care if you fix it here, 90% of the uses of this are wrong as it is).
		sepavloff: IIRC the default program address space is for function code. I put `getPointerMemTy` instead of `getPointerTy`, although for now it behaves in the same way. I don't know how to put the correct address space here, as the declaration of the function is unavailable.
		arsenm: pointer mem ty makes less sense
		TargetLowering::CallLoweringInfo CLI(*this);
		CLI.setDebugLoc(DLoc).setChain(InChain).setLibCallee(
		TLI->getLibcallCallingConv(LC), Type::getVoidTy(*getContext()), Callee,
		std::move(Args));
		return TLI->LowerCallTo(CLI).second;
		}

void SelectionDAG::copyExtraInfo(SDNode *From, SDNode *To) {		void SelectionDAG::copyExtraInfo(SDNode *From, SDNode *To) {
assert(From && To && "Invalid SDNode; empty source SDValue?");		assert(From && To && "Invalid SDNode; empty source SDValue?");
auto I = SDEI.find(From);		auto I = SDEI.find(From);
if (I == SDEI.end())		if (I == SDEI.end())
return;		return;

// Use of operator[] on the DenseMap may cause an insertion, which invalidates		// Use of operator[] on the DenseMap may cause an insertion, which invalidates
// the iterator, hence the need to make a copy to prevent a use-after-free.		// the iterator, hence the need to make a copy to prevent a use-after-free.
▲ Show 20 Lines • Show All 132 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


Show First 20 Lines • Show All 6,565 Lines • ▼ Show 20 Lines	if (!TLI.isOperationLegalOrCustom(ISD::IS_FPCLASS, ArgVT)) {
return;		return;
}		}

SDValue Check = DAG.getTargetConstant(Test, sdl, MVT::i32);		SDValue Check = DAG.getTargetConstant(Test, sdl, MVT::i32);
SDValue V = DAG.getNode(ISD::IS_FPCLASS, sdl, DestVT, {Op, Check}, Flags);		SDValue V = DAG.getNode(ISD::IS_FPCLASS, sdl, DestVT, {Op, Check}, Flags);
setValue(&I, V);		setValue(&I, V);
return;		return;
}		}
		case Intrinsic::get_fpenv: {
		const DataLayout DLayout = DAG.getDataLayout();
		EVT EnvVT = TLI.getValueType(DLayout, I.getType());
		Align TempAlign = DAG.getEVTAlign(EnvVT);
		SDValue Chain = DAG.getRoot();
		// Use GET_FPENV if it is legal or custom. Otherwise use memory-based node
		// and temporary storage in stack.
		arsenm: Can this just go into an Expand action like normal? We randomly expand certain things in the DAG builder but the inconsistency is annoying.
		sepavloff: This code does the same job as the other cases in this function: it creates the relevant DAG nodes. Their expansion occurs in `SelectionDAG::Legalize`, as for other nodes. The only difference is the use of two kinds of nodes, register-based and memory-based ones. Using only one kind of node could simplify the code in this function but would complicate code in other parts. On some targets the FP environment is long and the corresponding integer type is illegal (i256 on x86). In such a case DAG nodes have to be expanded early, when the DAG legalizes types. If the FP environment can be represented by a legal type, the expansion occurs late, in `SelectionDAG::Legalize`. So there were two points at which the expansion occurred. This approach was used in previous versions of this patch; it complicated the implementation and made it fragile. Using separate nodes for operations that involve memory allows all FP environment operations to be treated in a uniform manner, at the cost of a small complication of this function.
		arsenm: This hack doesn't require repeating in globalisel
		if (TLI.isOperationLegalOrCustom(ISD::GET_FPENV, EnvVT)) {
		Res = DAG.getNode(
		ISD::GET_FPENV, sdl,
		DAG.getVTList(TLI.getValueType(DAG.getDataLayout(), I.getType()),
		MVT::Other),
		Chain);
		} else {
		SDValue Temp = DAG.CreateStackTemporary(EnvVT, TempAlign.value());
		arsenm: CreateStackTemporary should be fixed to directly take Align input.
		int SPFI = cast<FrameIndexSDNode>(Temp.getNode())->getIndex();
		auto MPI =
		MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), SPFI);
		MachineMemOperand *MMO = DAG.getMachineFunction().getMachineMemOperand(
		MPI, MachineMemOperand::MOStore, MemoryLocation::UnknownSize,
		TempAlign);
		Chain = DAG.getGetFPEnv(Chain, sdl, Temp, EnvVT, MMO);
		Res = DAG.getLoad(EnvVT, sdl, Chain, Temp, MPI);
		}
		setValue(&I, Res);
		DAG.setRoot(Res.getValue(1));
		return;
		}
		case Intrinsic::set_fpenv: {
		const DataLayout DLayout = DAG.getDataLayout();
		SDValue Env = getValue(I.getArgOperand(0));
		EVT EnvVT = Env.getValueType();
		Align TempAlign = DAG.getEVTAlign(EnvVT);
		SDValue Chain = getRoot();
		// If SET_FPENV is custom or legal, use it. Otherwise use loading
		// environment from memory.
		if (TLI.isOperationLegalOrCustom(ISD::SET_FPENV, EnvVT)) {
		Chain = DAG.getNode(ISD::SET_FPENV, sdl, MVT::Other, Chain, Env);
		} else {
		// Allocate space in stack, copy environment bits into it and use this
		// memory in SET_FPENV_MEM.
		SDValue Temp = DAG.CreateStackTemporary(EnvVT, TempAlign.value());
		int SPFI = cast<FrameIndexSDNode>(Temp.getNode())->getIndex();
		auto MPI =
		MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), SPFI);
		arsenm: This should use the alignment for EnvVT, not the datalayout's stack alignment.
		sepavloff: Fixed.
		Chain = DAG.getStore(Chain, sdl, Env, Temp, MPI, TempAlign,
		MachineMemOperand::MOStore);
		MachineMemOperand *MMO = DAG.getMachineFunction().getMachineMemOperand(
		MPI, MachineMemOperand::MOLoad, MemoryLocation::UnknownSize,
		TempAlign);
		Chain = DAG.getSetFPEnv(Chain, sdl, Temp, EnvVT, MMO);
		}
		DAG.setRoot(Chain);
		return;
		}
		case Intrinsic::reset_fpenv:
		DAG.setRoot(DAG.getNode(ISD::RESET_FPENV, sdl, MVT::Other, getRoot()));
		return;
case Intrinsic::pcmarker: {		case Intrinsic::pcmarker: {
SDValue Tmp = getValue(I.getArgOperand(0));		SDValue Tmp = getValue(I.getArgOperand(0));
DAG.setRoot(DAG.getNode(ISD::PCMARKER, sdl, MVT::Other, getRoot(), Tmp));		DAG.setRoot(DAG.getNode(ISD::PCMARKER, sdl, MVT::Other, getRoot(), Tmp));
return;		return;
}		}
case Intrinsic::readcyclecounter: {		case Intrinsic::readcyclecounter: {
SDValue Op = getRoot();		SDValue Op = getRoot();
Res = DAG.getNode(ISD::READCYCLECOUNTER, sdl,		Res = DAG.getNode(ISD::READCYCLECOUNTER, sdl,
▲ Show 20 Lines • Show All 5,265 Lines • Show Last 20 Lines
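
When the register-based nodes are not legal or custom, the builder code above routes the environment through a stack temporary: `get_fpenv` fills the temporary via `GET_FPENV_MEM` and then loads it, while `set_fpenv` stores into the temporary and then hands it to `SET_FPENV_MEM`. A user-level C++ sketch of that pattern follows. It is not LLVM code: `FPEnvBits`, `getFPEnvBits` and `setFPEnvBits` are hypothetical names, and a byte array sized to `fenv_t` stands in for the integer value the intrinsics use.

```cpp
#include <array>
#include <cassert>
#include <cfenv>
#include <cstring>

// Hypothetical stand-in for the integer value carried by the intrinsics.
using FPEnvBits = std::array<unsigned char, sizeof(std::fenv_t)>;

// Memory path for "get": the library call writes into a temporary, then the
// value is loaded from that temporary.
FPEnvBits getFPEnvBits() {
  std::fenv_t temp;                              // plays the stack temporary
  std::fegetenv(&temp);                          // default GET_FPENV_MEM expansion
  FPEnvBits bits;
  std::memcpy(bits.data(), &temp, sizeof(temp)); // the load of the result
  return bits;
}

// Memory path for "set": the value is stored into a temporary, then the
// library call reads the environment from it.
void setFPEnvBits(const FPEnvBits &bits) {
  std::fenv_t temp;                              // plays the stack temporary
  std::memcpy(&temp, bits.data(), sizeof(temp)); // the store of the argument
  std::fesetenv(&temp);                          // default SET_FPENV_MEM expansion
}
```

The ARM checks in `fpenv.ll` have the same shape: an address on the stack is passed to `fegetenv` and the result is then loaded from that slot.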

llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 425 Lines • ▼ Show 20 Lines	#endif
case ISD::PREALLOCATED_SETUP:		case ISD::PREALLOCATED_SETUP:
return "call_setup";		return "call_setup";
case ISD::PREALLOCATED_ARG:		case ISD::PREALLOCATED_ARG:
return "call_alloc";		return "call_alloc";

// Floating point environment manipulation		// Floating point environment manipulation
case ISD::GET_ROUNDING: return "get_rounding";		case ISD::GET_ROUNDING: return "get_rounding";
case ISD::SET_ROUNDING: return "set_rounding";		case ISD::SET_ROUNDING: return "set_rounding";
		case ISD::GET_FPENV: return "get_fpenv";
		case ISD::SET_FPENV: return "set_fpenv";
		case ISD::RESET_FPENV: return "reset_fpenv";
		case ISD::GET_FPENV_MEM: return "get_fpenv_mem";
		case ISD::SET_FPENV_MEM: return "set_fpenv_mem";

// Bit manipulation		// Bit manipulation
case ISD::ABS: return "abs";		case ISD::ABS: return "abs";
case ISD::BITREVERSE: return "bitreverse";		case ISD::BITREVERSE: return "bitreverse";
case ISD::BSWAP: return "bswap";		case ISD::BSWAP: return "bswap";
case ISD::CTPOP: return "ctpop";		case ISD::CTPOP: return "ctpop";
case ISD::CTTZ: return "cttz";		case ISD::CTTZ: return "cttz";
case ISD::CTTZ_ZERO_UNDEF: return "cttz_zero_undef";		case ISD::CTTZ_ZERO_UNDEF: return "cttz_zero_undef";
▲ Show 20 Lines • Show All 645 Lines • Show Last 20 Lines

llvm/lib/CodeGen/TargetLoweringBase.cpp

Show First 20 Lines • Show All 873 Lines • ▼ Show 20 Lines	#include "llvm/IR/ConstrainedOps.def"

// Named vector shuffles default to expand.		// Named vector shuffles default to expand.
setOperationAction(ISD::VECTOR_SPLICE, VT, Expand);		setOperationAction(ISD::VECTOR_SPLICE, VT, Expand);

// VP operations default to expand.		// VP operations default to expand.
#define BEGIN_REGISTER_VP_SDNODE(SDOPC, ...) \		#define BEGIN_REGISTER_VP_SDNODE(SDOPC, ...) \
setOperationAction(ISD::SDOPC, VT, Expand);		setOperationAction(ISD::SDOPC, VT, Expand);
#include "llvm/IR/VPIntrinsics.def"		#include "llvm/IR/VPIntrinsics.def"

		// FP environment operations default to expand.
		setOperationAction(ISD::GET_FPENV, VT, Expand);
		setOperationAction(ISD::SET_FPENV, VT, Expand);
		setOperationAction(ISD::RESET_FPENV, VT, Expand);
}		}

// Most targets ignore the @llvm.prefetch intrinsic.		// Most targets ignore the @llvm.prefetch intrinsic.
setOperationAction(ISD::PREFETCH, MVT::Other, Expand);		setOperationAction(ISD::PREFETCH, MVT::Other, Expand);

// Most targets also ignore the @llvm.readcyclecounter intrinsic.		// Most targets also ignore the @llvm.readcyclecounter intrinsic.
setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);		setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);

Show All 14 Lines	#include "llvm/IR/VPIntrinsics.def"
// Default ISD::TRAP to expand (which turns it into abort).		// Default ISD::TRAP to expand (which turns it into abort).
setOperationAction(ISD::TRAP, MVT::Other, Expand);		setOperationAction(ISD::TRAP, MVT::Other, Expand);

// On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"		// On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"
// here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.		// here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.
setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);		setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);

setOperationAction(ISD::UBSANTRAP, MVT::Other, Expand);		setOperationAction(ISD::UBSANTRAP, MVT::Other, Expand);

		setOperationAction(ISD::GET_FPENV_MEM, MVT::Other, Expand);
		setOperationAction(ISD::SET_FPENV_MEM, MVT::Other, Expand);
}		}

MVT TargetLoweringBase::getScalarShiftAmountTy(const DataLayout &DL,		MVT TargetLoweringBase::getScalarShiftAmountTy(const DataLayout &DL,
EVT) const {		EVT) const {
return MVT::getIntegerVT(DL.getPointerSizeInBits(0));		return MVT::getIntegerVT(DL.getPointerSizeInBits(0));
}		}

EVT TargetLoweringBase::getShiftAmountTy(EVT LHSTy, const DataLayout &DL,		EVT TargetLoweringBase::getShiftAmountTy(EVT LHSTy, const DataLayout &DL,
▲ Show 20 Lines • Show All 1,452 Lines • Show Last 20 Lines

llvm/test/CodeGen/ARM/fpenv.ll

	Show First 20 Lines • Show All 55 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: bic r0, r0, #12582912			; CHECK-NEXT: bic r0, r0, #12582912
	; CHECK-NEXT: orr r0, r0, #8388608			; CHECK-NEXT: orr r0, r0, #8388608
	; CHECK-NEXT: vmsr fpscr, r0			; CHECK-NEXT: vmsr fpscr, r0
	; CHECK-NEXT: mov pc, lr			; CHECK-NEXT: mov pc, lr
	call void @llvm.set.rounding(i32 3)			call void @llvm.set.rounding(i32 3)
	ret void			ret void
	}			}

				define i32 @get_fpenv_01() #0 {
				; CHECK-LABEL: get_fpenv_01:
				; CHECK: @ %bb.0: @ %entry
				; CHECK-NEXT: .save {r11, lr}
				; CHECK-NEXT: push {r11, lr}
				; CHECK-NEXT: .pad #8
				; CHECK-NEXT: sub sp, sp, #8
				; CHECK-NEXT: add r0, sp, #4
				; CHECK-NEXT: bl fegetenv
				; CHECK-NEXT: ldr r0, [sp, #4]
				; CHECK-NEXT: add sp, sp, #8
				; CHECK-NEXT: pop {r11, lr}
				; CHECK-NEXT: mov pc, lr
				entry:
				%fpenv = call i32 @llvm.get.fpenv.i32()
				ret i32 %fpenv
				}

				define void @set_fpenv_01(i32 %fpenv) #0 {
				; CHECK-LABEL: set_fpenv_01:
				; CHECK: @ %bb.0: @ %entry
				; CHECK-NEXT: .save {r11, lr}
				; CHECK-NEXT: push {r11, lr}
				; CHECK-NEXT: .pad #8
				; CHECK-NEXT: sub sp, sp, #8
				; CHECK-NEXT: str r0, [sp, #4]
				; CHECK-NEXT: add r0, sp, #4
				; CHECK-NEXT: bl fesetenv
				; CHECK-NEXT: add sp, sp, #8
				; CHECK-NEXT: pop {r11, lr}
				; CHECK-NEXT: mov pc, lr
				entry:
				call void @llvm.set.fpenv.i32(i32 %fpenv)
				ret void
				}

				define void @reset_fpenv_01() #0 {
				; CHECK-LABEL: reset_fpenv_01:
				; CHECK: @ %bb.0: @ %entry
				; CHECK-NEXT: .save {r11, lr}
				; CHECK-NEXT: push {r11, lr}
				; CHECK-NEXT: mvn r0, #0
				; CHECK-NEXT: bl fesetenv
				; CHECK-NEXT: pop {r11, lr}
				; CHECK-NEXT: mov pc, lr
				entry:
				call void @llvm.reset.fpenv()
				ret void
				}

				attributes #0 = { nounwind "use-soft-float"="true" }

	declare void @llvm.set.rounding(i32)			declare void @llvm.set.rounding(i32)
				declare i32 @llvm.get.fpenv.i32()
				declare void @llvm.set.fpenv.i32(i32 %fpenv)
				declare void @llvm.reset.fpenv()

llvm/test/CodeGen/X86/fpenv.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=-sse -verify-machineinstrs < %s | FileCheck %s -check-prefix=X86-NOSSE			; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=-sse -verify-machineinstrs < %s | FileCheck %s -check-prefix=X86-NOSSE
	; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+sse -verify-machineinstrs < %s | FileCheck %s -check-prefix=X86-SSE			; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+sse -verify-machineinstrs < %s | FileCheck %s -check-prefix=X86-SSE
	; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs < %s | FileCheck %s -check-prefix=X64			; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs < %s | FileCheck %s -check-prefix=X64

	declare void @llvm.set.rounding(i32 %x)			declare void @llvm.set.rounding(i32 %x)
				declare i256 @llvm.get.fpenv.i256()
				declare void @llvm.set.fpenv.i256(i256 %fpenv)
				declare void @llvm.reset.fpenv()

	define void @func_01() nounwind {			define void @func_01() nounwind {
	; X86-NOSSE-LABEL: func_01:			; X86-NOSSE-LABEL: func_01:
	; X86-NOSSE: # %bb.0:			; X86-NOSSE: # %bb.0:
	; X86-NOSSE-NEXT: pushl %eax			; X86-NOSSE-NEXT: pushl %eax
	; X86-NOSSE-NEXT: fnstcw (%esp)			; X86-NOSSE-NEXT: fnstcw (%esp)
	; X86-NOSSE-NEXT: orb $12, {{[0-9]+}}(%esp)			; X86-NOSSE-NEXT: orb $12, {{[0-9]+}}(%esp)
	; X86-NOSSE-NEXT: fldcw (%esp)			; X86-NOSSE-NEXT: fldcw (%esp)
	▲ Show 20 Lines • Show All 222 Lines • ▼ Show 20 Lines
	; X64-NEXT: andl -{{[0-9]+}}(%rsp), %ecx			; X64-NEXT: andl -{{[0-9]+}}(%rsp), %ecx
	; X64-NEXT: leal (%rcx,%rax,8), %eax			; X64-NEXT: leal (%rcx,%rax,8), %eax
	; X64-NEXT: movl %eax, -{{[0-9]+}}(%rsp)			; X64-NEXT: movl %eax, -{{[0-9]+}}(%rsp)
	; X64-NEXT: ldmxcsr -{{[0-9]+}}(%rsp)			; X64-NEXT: ldmxcsr -{{[0-9]+}}(%rsp)
	; X64-NEXT: retq			; X64-NEXT: retq
	call void @llvm.set.rounding(i32 %x) ; Downward			call void @llvm.set.rounding(i32 %x) ; Downward
	ret void			ret void
	}			}

				define void @get_fpenv_01(ptr %ptr) #0 {
				; X86-NOSSE-LABEL: get_fpenv_01:
				; X86-NOSSE: # %bb.0: # %entry
				; X86-NOSSE-NEXT: pushl %ebp
				; X86-NOSSE-NEXT: pushl %ebx
				; X86-NOSSE-NEXT: pushl %edi
				; X86-NOSSE-NEXT: pushl %esi
				; X86-NOSSE-NEXT: subl $60, %esp
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NOSSE-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NOSSE-NEXT: movl %eax, (%esp)
				; X86-NOSSE-NEXT: calll fegetenv
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NOSSE-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NOSSE-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ebp
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NOSSE-NEXT: movl %ecx, 24(%esi)
				; X86-NOSSE-NEXT: movl %eax, 28(%esi)
				; X86-NOSSE-NEXT: movl %ebp, 16(%esi)
				; X86-NOSSE-NEXT: movl %ebx, 20(%esi)
				; X86-NOSSE-NEXT: movl %edi, 8(%esi)
				; X86-NOSSE-NEXT: movl %edx, 12(%esi)
				; X86-NOSSE-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NOSSE-NEXT: movl %eax, (%esi)
				; X86-NOSSE-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NOSSE-NEXT: movl %eax, 4(%esi)
				; X86-NOSSE-NEXT: addl $60, %esp
				; X86-NOSSE-NEXT: popl %esi
				; X86-NOSSE-NEXT: popl %edi
				; X86-NOSSE-NEXT: popl %ebx
				; X86-NOSSE-NEXT: popl %ebp
				; X86-NOSSE-NEXT: retl
				;
				; X86-SSE-LABEL: get_fpenv_01:
				; X86-SSE: # %bb.0: # %entry
				; X86-SSE-NEXT: pushl %ebp
				; X86-SSE-NEXT: pushl %ebx
				; X86-SSE-NEXT: pushl %edi
				; X86-SSE-NEXT: pushl %esi
				; X86-SSE-NEXT: subl $60, %esp
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-SSE-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-SSE-NEXT: movl %eax, (%esp)
				; X86-SSE-NEXT: calll fegetenv
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-SSE-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-SSE-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %ebp
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-SSE-NEXT: movl %ecx, 24(%esi)
				; X86-SSE-NEXT: movl %eax, 28(%esi)
				; X86-SSE-NEXT: movl %ebp, 16(%esi)
				; X86-SSE-NEXT: movl %ebx, 20(%esi)
				; X86-SSE-NEXT: movl %edi, 8(%esi)
				; X86-SSE-NEXT: movl %edx, 12(%esi)
				; X86-SSE-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-SSE-NEXT: movl %eax, (%esi)
				; X86-SSE-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-SSE-NEXT: movl %eax, 4(%esi)
				; X86-SSE-NEXT: addl $60, %esp
				; X86-SSE-NEXT: popl %esi
				; X86-SSE-NEXT: popl %edi
				; X86-SSE-NEXT: popl %ebx
				; X86-SSE-NEXT: popl %ebp
				; X86-SSE-NEXT: retl
				;
				; X64-LABEL: get_fpenv_01:
				; X64: # %bb.0: # %entry
				; X64-NEXT: pushq %rbx
				; X64-NEXT: subq $32, %rsp
				; X64-NEXT: movq %rdi, %rbx
				; X64-NEXT: movq %rsp, %rdi
				; X64-NEXT: callq fegetenv@PLT
				; X64-NEXT: movq (%rsp), %rax
				; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
				; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
				; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi
				; X64-NEXT: movq %rsi, 16(%rbx)
				; X64-NEXT: movq %rdx, 24(%rbx)
				; X64-NEXT: movq %rax, (%rbx)
				; X64-NEXT: movq %rcx, 8(%rbx)
				; X64-NEXT: addq $32, %rsp
				; X64-NEXT: popq %rbx
				; X64-NEXT: retq
				entry:
				%env = call i256 @llvm.get.fpenv.i256()
				store i256 %env, ptr %ptr
				ret void
				}

				define void @set_fpenv_01(ptr %ptr) #0 {
				; X86-NOSSE-LABEL: set_fpenv_01:
				; X86-NOSSE: # %bb.0: # %entry
				; X86-NOSSE-NEXT: pushl %ebp
				; X86-NOSSE-NEXT: pushl %ebx
				; X86-NOSSE-NEXT: pushl %edi
				; X86-NOSSE-NEXT: pushl %esi
				; X86-NOSSE-NEXT: subl $44, %esp
				; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NOSSE-NEXT: movl (%eax), %ecx
				; X86-NOSSE-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NOSSE-NEXT: movl 4(%eax), %edx
				; X86-NOSSE-NEXT: movl 12(%eax), %esi
				; X86-NOSSE-NEXT: movl 8(%eax), %edi
				; X86-NOSSE-NEXT: movl 20(%eax), %ebx
				; X86-NOSSE-NEXT: movl 16(%eax), %ebp
				; X86-NOSSE-NEXT: movl 28(%eax), %ecx
				; X86-NOSSE-NEXT: movl 24(%eax), %eax
				; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl %ebp, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl %ebx, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl %edi, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
				; X86-NOSSE-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NOSSE-NEXT: movl %eax, (%esp)
				; X86-NOSSE-NEXT: calll fesetenv
				; X86-NOSSE-NEXT: addl $44, %esp
				; X86-NOSSE-NEXT: popl %esi
				; X86-NOSSE-NEXT: popl %edi
				; X86-NOSSE-NEXT: popl %ebx
				; X86-NOSSE-NEXT: popl %ebp
				; X86-NOSSE-NEXT: retl
				;
				; X86-SSE-LABEL: set_fpenv_01:
				; X86-SSE: # %bb.0: # %entry
				; X86-SSE-NEXT: pushl %ebp
				; X86-SSE-NEXT: pushl %ebx
				; X86-SSE-NEXT: pushl %edi
				; X86-SSE-NEXT: pushl %esi
				; X86-SSE-NEXT: subl $44, %esp
				; X86-SSE-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-SSE-NEXT: movl (%eax), %ecx
				; X86-SSE-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-SSE-NEXT: movl 4(%eax), %edx
				; X86-SSE-NEXT: movl 12(%eax), %esi
				; X86-SSE-NEXT: movl 8(%eax), %edi
				; X86-SSE-NEXT: movl 20(%eax), %ebx
				; X86-SSE-NEXT: movl 16(%eax), %ebp
				; X86-SSE-NEXT: movl 28(%eax), %ecx
				; X86-SSE-NEXT: movl 24(%eax), %eax
				; X86-SSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl %ebp, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl %ebx, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl %edi, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-SSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
				; X86-SSE-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-SSE-NEXT: movl %eax, (%esp)
				; X86-SSE-NEXT: calll fesetenv
				; X86-SSE-NEXT: addl $44, %esp
				; X86-SSE-NEXT: popl %esi
				; X86-SSE-NEXT: popl %edi
				; X86-SSE-NEXT: popl %ebx
				; X86-SSE-NEXT: popl %ebp
				; X86-SSE-NEXT: retl
				;
				; X64-LABEL: set_fpenv_01:
				; X64: # %bb.0: # %entry
				; X64-NEXT: subq $40, %rsp
				; X64-NEXT: movq (%rdi), %rax
				; X64-NEXT: movq 8(%rdi), %rcx
				; X64-NEXT: movq 24(%rdi), %rdx
				; X64-NEXT: movq 16(%rdi), %rsi
				; X64-NEXT: movq %rsi, {{[0-9]+}}(%rsp)
				; X64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)
				; X64-NEXT: movq %rax, {{[0-9]+}}(%rsp)
				; X64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)
				; X64-NEXT: leaq {{[0-9]+}}(%rsp), %rdi
				; X64-NEXT: callq fesetenv@PLT
				; X64-NEXT: addq $40, %rsp
				; X64-NEXT: retq
				entry:
				%env = load i256, ptr %ptr
				call void @llvm.set.fpenv.i256(i256 %env)
				ret void
				}


				define void @reset_fpenv_01() #0 {
				; X86-NOSSE-LABEL: reset_fpenv_01:
				; X86-NOSSE: # %bb.0: # %entry
				; X86-NOSSE-NEXT: subl $12, %esp
				; X86-NOSSE-NEXT: movl $-1, (%esp)
				; X86-NOSSE-NEXT: calll fesetenv
				; X86-NOSSE-NEXT: addl $12, %esp
				; X86-NOSSE-NEXT: retl
				;
				; X86-SSE-LABEL: reset_fpenv_01:
				; X86-SSE: # %bb.0: # %entry
				; X86-SSE-NEXT: subl $12, %esp
				; X86-SSE-NEXT: movl $-1, (%esp)
				; X86-SSE-NEXT: calll fesetenv
				; X86-SSE-NEXT: addl $12, %esp
				; X86-SSE-NEXT: retl
				;
				; X64-LABEL: reset_fpenv_01:
				; X64: # %bb.0: # %entry
				; X64-NEXT: pushq %rax
				; X64-NEXT: movq $-1, %rdi
				; X64-NEXT: callq fesetenv@PLT
				; X64-NEXT: popq %rax
				; X64-NEXT: retq
				entry:
				call void @llvm.reset.fpenv()
				ret void
				}

				attributes #0 = { nounwind "use-soft-float"="true" }


Diff 528295
