This is an archive of the discontinued LLVM Phabricator instance.


Differential D80315

Fix CC1 command line options mapping into fast-math flags.
Closed · Public

Authored by michele.scandale on May 20 2020, 10:39 AM.


Details

Reviewers

rjmccall
mibintc
Anastasia
scanon
rsmith
jdoerfert

Summary

This fixes the mapping from CC1 command-line options to the
properties in LangOptions that describe the floating-point optimization
configuration.
The default fast-math flags for the IR builder are now derived from these
properties to avoid inconsistencies.
Finally, some of the floating-point optimization properties in CodeGenOptions
have been removed, since they now exist in LangOptions.
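
The idea in the summary can be sketched in a few lines: one set of language-level floating-point properties is the single source, and the builder's default flags are read from it field by field. This is only an illustration; `FPProps`, `BuilderFlags` and `deriveBuilderFlags` are hypothetical stand-ins, not Clang's or LLVM's classes.

```cpp
#include <cassert>

// Hypothetical stand-in for the floating-point fields of LangOptions.
struct FPProps {
  bool AllowFPReassoc = false;
  bool NoHonorNaNs = false;
  bool NoHonorInfs = false;
  bool NoSignedZero = false;
  bool AllowRecip = false;
  bool ApproxFunc = false;
  bool ContractFast = false; // models a default contraction mode of "fast"
};

// Hypothetical stand-in for the IR builder's fast-math flags.
struct BuilderFlags {
  bool Reassoc = false;
  bool NoNaNs = false;
  bool NoInfs = false;
  bool NoSignedZeros = false;
  bool Recip = false;
  bool ApproxFunc = false;
  bool Contract = false;
};

// Every builder flag is read from exactly one property, so the two
// descriptions cannot drift apart.
BuilderFlags deriveBuilderFlags(const FPProps &P) {
  BuilderFlags F;
  F.Reassoc = P.AllowFPReassoc;
  F.NoNaNs = P.NoHonorNaNs;
  F.NoInfs = P.NoHonorInfs;
  F.NoSignedZeros = P.NoSignedZero;
  F.Recip = P.AllowRecip;
  F.ApproxFunc = P.ApproxFunc;
  F.Contract = P.ContractFast;
  return F;
}
```

Setting only `NoHonorNaNs` and `ContractFast` in this model yields builder flags with just `NoNaNs` and `Contract` set; nothing else is implied.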

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

michele.scandale created this revision. May 20 2020, 10:39 AM

Herald added a project: Restricted Project. May 20 2020, 10:39 AM

Herald added subscribers: cfe-commits, jvesely.

michele.scandale marked an inline comment as done. May 20 2020, 10:49 AM

michele.scandale added inline comments.

clang/test/CodeGenOpenCL/relaxed-fpmath.cl
14	This change is based on the following: `-cl-fast-relaxed-math` = `-cl-unsafe-math-optimizations` + `-cl-finite-math-only`. The GCC option `-funsafe-math-optimizations` and `-cl-unsafe-math-optimizations` are described with very similar wording, and the GCC description explicitly mentions that no signed zeros, reassociation and reciprocals are enabled, but it does not mention assuming that NaNs do not exist. See https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/clBuildProgram.html and https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
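
A rough model of the decomposition described in this comment, assuming the reading given there (unsafe math enables reassociation, reciprocals and no signed zeros; finite math assumes no NaNs and no infinities). The struct and function names are hypothetical, and the sketch does not claim to cover every property the real options touch.

```cpp
#include <cassert>

// Hypothetical property set; field names are illustrative.
struct MathProps {
  bool Reassoc = false;
  bool Recip = false;
  bool NoSignedZero = false;
  bool NoNaNs = false;
  bool NoInfs = false;
};

// Union of two property sets.
MathProps combine(const MathProps &A, const MathProps &B) {
  MathProps R;
  R.Reassoc = A.Reassoc || B.Reassoc;
  R.Recip = A.Recip || B.Recip;
  R.NoSignedZero = A.NoSignedZero || B.NoSignedZero;
  R.NoNaNs = A.NoNaNs || B.NoNaNs;
  R.NoInfs = A.NoInfs || B.NoInfs;
  return R;
}

// Models -cl-unsafe-math-optimizations: no assumption about NaNs.
MathProps unsafeMathOptimizations() {
  MathProps P;
  P.Reassoc = P.Recip = P.NoSignedZero = true;
  return P;
}

// Models -cl-finite-math-only: values are neither NaN nor infinite.
MathProps finiteMathOnly() {
  MathProps P;
  P.NoNaNs = P.NoInfs = true;
  return P;
}

// Models -cl-fast-relaxed-math as the sum of the two options above.
MathProps fastRelaxedMath() {
  return combine(unsafeMathOptimizations(), finiteMathOnly());
}
```

In this model `unsafeMathOptimizations().NoNaNs` stays false, which is the point of the test change on line 14, while `fastRelaxedMath()` sets all five properties.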

michele.scandale marked 2 inline comments as done. May 20 2020, 10:52 AM

michele.scandale added inline comments.

clang/test/CodeGen/libcalls.c
11–21	For CUDA the default FP contract mode is `fast`, therefore the `contract` FMF is emitted.
clang/test/CodeGenCUDA/builtins-amdgcn.cu
13	For CUDA the default FP contract mode is `fast`, therefore the `contract` FMF is emitted.

Harbormaster failed remote builds in B57416: Diff 265301! May 20 2020, 12:02 PM

Fix 'clang/test/CodeGenCUDA/library-builtin.cu'

Thanks for cleaning this up!

Harbormaster failed remote builds in B57423: Diff 265333! May 20 2020, 2:18 PM

Fix formatting issues.

Harbormaster completed remote builds in B57470: Diff 265405. May 20 2020, 8:57 PM

michele.scandale added a reviewer: scanon. May 21 2020, 8:33 AM

Anastasia added inline comments. May 21 2020, 1:13 PM

clang/test/CodeGenOpenCL/relaxed-fpmath.cl
14	Makes sense!

michele.scandale edited the summary of this revision. May 21 2020, 4:58 PM

michele.scandale added a reviewer: rsmith. May 22 2020, 3:57 PM

michele.scandale added a child revision: D80462: Fix floating point math function attributes definition. May 22 2020, 5:33 PM

The code cleanups all seem reasonable. The actual code-generation changes are because we were failing to set these reliably?

In D80315#2058549, @rjmccall wrote:

> The code cleanups all seem reasonable. The actual code-generation changes are because we were failing to set these reliably?

Most of them, yes.

In the CUDA test there is now contract because we honor the default contraction mode, which for CUDA is set to fast.

In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line because -ffast-math at the CC1 level does not change the default contraction mode. We might want to treat -ffast-math similarly to -cl-fast-relaxed-math, i.e. have it imply the "fast" contraction mode by default. Before this change the behavior was accidentally the same as "-ffast-math changes the default contraction mode" because there was:

if (CGM.getLangOpts().FastMath)
  FMF.setFast();

and so the bit for AllowContract was enabled in the fast-math flags.
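
The accidental coupling described here can be modelled with a small bitmask, assuming an umbrella "set everything" operation that behaves like `setFast()`. The bit layout, the `Opts` struct and both functions are hypothetical; `FastMath` in the second function merely stands in for the six non-contract properties all being enabled.

```cpp
#include <cassert>

// Bit layout is illustrative only; it is not LLVM's actual encoding.
enum : unsigned {
  Reassoc = 1u << 0,
  NoNaNs = 1u << 1,
  NoInfs = 1u << 2,
  NoSignedZeros = 1u << 3,
  Recip = 1u << 4,
  Contract = 1u << 5,
  ApproxFunc = 1u << 6,
  AllFlags = (1u << 7) - 1
};

// Hypothetical option state.
struct Opts {
  bool FastMath = false;     // umbrella fast-math option
  bool ContractFast = false; // default contraction mode is "fast"
};

// Old shape: the umbrella bit turns on every flag, Contract included.
unsigned oldBaseFlags(const Opts &O) { return O.FastMath ? AllFlags : 0u; }

// New shape (simplified): Contract follows the contraction mode only.
unsigned newBaseFlags(const Opts &O) {
  unsigned F = O.FastMath ? (AllFlags & ~Contract) : 0u;
  if (O.ContractFast)
    F |= Contract;
  return F;
}
```

With `FastMath` set and `ContractFast` clear, `oldBaseFlags` includes `Contract` while `newBaseFlags` does not, which matches the change in behavior that required adding -ffp-contract=fast to the complex-math.c test.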

In D80315#2058735, @michele.scandale wrote:

> In D80315#2058549, @rjmccall wrote:
>
>> The code cleanups all seem reasonable. The actual code-generation changes are because we were failing to set these reliably?
>
> Most of them, yes.
>
> In the CUDA test there is now contract because we honor the default contraction mode, which for CUDA is set to fast.

Right, and we weren't honoring that mode before?

> In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line because -ffast-math at the CC1 level does not change the default contraction mode.

Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.

In D80315#2058914, @rjmccall wrote:

> In D80315#2058735, @michele.scandale wrote:
>
>> In D80315#2058549, @rjmccall wrote:
>>
>>> The code cleanups all seem reasonable. The actual code-generation changes are because we were failing to set these reliably?
>>
>> Most of them, yes.
>>
>> In the CUDA test there is now contract because we honor the default contraction mode, which for CUDA is set to fast.
>
> Right, and we weren't honoring that mode before?

Not in the setup of the base fast-math flags inside CodeGenFunction. However, when emitting code for expressions with FPOptions stored in the AST nodes, contract was set correctly.

>> In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line because -ffast-math at the CC1 level does not change the default contraction mode.
>
> Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.

Do you think we should handle -ffast-math like -cl-fast-relaxed-math, i.e. have it imply that the default contraction mode is "fast"?

In D80315#2059160, @michele.scandale wrote:

> In D80315#2058914, @rjmccall wrote:
>
>> In D80315#2058735, @michele.scandale wrote:
>>
>>> In D80315#2058549, @rjmccall wrote:
>>>
>>>> The code cleanups all seem reasonable. The actual code-generation changes are because we were failing to set these reliably?
>>>
>>> Most of them, yes.
>>>
>>> In the CUDA test there is now contract because we honor the default contraction mode, which for CUDA is set to fast.
>>
>> Right, and we weren't honoring that mode before?
>
> Not in the setup of the base fast-math flags inside CodeGenFunction. However, when emitting code for expressions with FPOptions stored in the AST nodes, contract was set correctly.

Okay, that seems justifiable.

>>> In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line because -ffast-math at the CC1 level does not change the default contraction mode.
>>
>> Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.
>
> Do you think we should handle -ffast-math like -cl-fast-relaxed-math, i.e. have it imply that the default contraction mode is "fast"?

I'm actually surprised it doesn't. I can't imagine why someone enabling fast math would want contraction to be disabled.

In D80315#2061164, @rjmccall wrote:

> In D80315#2059160, @michele.scandale wrote:
>
>> In D80315#2058914, @rjmccall wrote:
>>
>>> In D80315#2058735, @michele.scandale wrote:
>>>
>>>> In D80315#2058549, @rjmccall wrote:
>>>>
>>>>> The code cleanups all seem reasonable. The actual code-generation changes are because we were failing to set these reliably?
>>>>
>>>> Most of them, yes.
>>>>
>>>> In the CUDA test there is now contract because we honor the default contraction mode, which for CUDA is set to fast.
>>>
>>> Right, and we weren't honoring that mode before?
>>
>> Not in the setup of the base fast-math flags inside CodeGenFunction. However, when emitting code for expressions with FPOptions stored in the AST nodes, contract was set correctly.
>
> Okay, that seems justifiable.
>
>>>> In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line because -ffast-math at the CC1 level does not change the default contraction mode.
>>>
>>> Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.
>>
>> Do you think we should handle -ffast-math like -cl-fast-relaxed-math, i.e. have it imply that the default contraction mode is "fast"?
>
> I'm actually surprised it doesn't. I can't imagine why someone enabling fast math would want contraction to be disabled.

Just to be clear, the clang driver does the right thing.
If you run clang -ffast-math, the CC1 invocation has both -ffast-math and -ffp-contract=fast (and other options as well).

Here specifically I'm just considering the behavior of clang -cc1 -ffast-math.

Rebase, and make -ffast-math change the default contraction mode to fast.

Fix up more tests with the now-redundant -ffp-contract=fast option.

Herald added a reviewer: jdoerfert. May 28 2020, 11:03 PM

Herald added a subscriber: sstefan1.

Harbormaster completed remote builds in B58375: Diff 267121. May 28 2020, 11:57 PM

Harbormaster completed remote builds in B58378: Diff 267124. May 29 2020, 1:03 AM

>> I'm actually surprised it doesn't. I can't imagine why someone enabling fast math would want contraction to be disabled.
>
> Just to be clear, the clang driver does the right thing.
> If you run clang -ffast-math, the CC1 invocation has both -ffast-math and -ffp-contract=fast (and other options as well).
>
> Here specifically I'm just considering the behavior of clang -cc1 -ffast-math.

Okay. It is probably best for testing purposes if flags like this stay as close as possible to their driver behavior. The driver can just canonicalize different spellings down to the interface that -cc1 wants.

This revision is now accepted and ready to land. May 29 2020, 12:32 PM

I've just realized it might be incorrect to have the CC1 option -ffast-math changing the default contraction mode. The clang driver generates -ffast-math based on conditions that do not involve the contraction mode state at all:

// -ffast-math enables the __FAST_MATH__ preprocessor macro, but check for the
// individual features enabled by -ffast-math instead of the option itself as
// that's consistent with gcc's behaviour.
if (!HonorINFs && !HonorNaNs && !MathErrno && AssociativeMath &&
    ReciprocalMath && !SignedZeros && !TrappingMath && !RoundingFPMath) {
  CmdArgs.push_back("-ffast-math");
  if (FPModel.equals("fast")) {
    if (FPContract.equals("fast"))
      // All set, do nothing.
      ;
    else if (FPContract.empty())
      // Enable -ffp-contract=fast
      CmdArgs.push_back(Args.MakeArgString("-ffp-contract=fast"));
    else
      D.Diag(clang::diag::warn_drv_overriding_flag_option)
        << "-ffp-model=fast"
        << Args.MakeArgString("-ffp-contract=" + FPContract);
  }
}

For example, running `clang -### -funsafe-math-optimizations -ffinite-math-only -x c -` leads to a CC1 command line without any -ffp-contract= option, relying on the fact that the default value for the contraction mode in the compiler is OFF.

I will revert the modification on this aspect.
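
The driver logic quoted above can be exercised in isolation with a small model. This is a sketch under stated assumptions: `FPState`, `fastMathCC1Args` and `relaxedState` are hypothetical names, the warning branch is left out, and `relaxedState` simply assumes every listed condition holds rather than reproducing how real driver options set them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical snapshot of the driver's floating-point state.
struct FPState {
  bool HonorINFs = true, HonorNaNs = true, MathErrno = true;
  bool AssociativeMath = false, ReciprocalMath = false;
  bool SignedZeros = true, TrappingMath = true, RoundingFPMath = false;
  std::string FPModel;    // empty when no -ffp-model= was given
  std::string FPContract; // empty when no -ffp-contract= was given
};

// Mirrors the quoted condition: -ffast-math is emitted from the individual
// features, and contraction is only touched under -ffp-model=fast.
std::vector<std::string> fastMathCC1Args(const FPState &S) {
  std::vector<std::string> Args;
  if (!S.HonorINFs && !S.HonorNaNs && !S.MathErrno && S.AssociativeMath &&
      S.ReciprocalMath && !S.SignedZeros && !S.TrappingMath &&
      !S.RoundingFPMath) {
    Args.push_back("-ffast-math");
    // The diagnostic branch of the original is omitted in this sketch.
    if (S.FPModel == "fast" && S.FPContract.empty())
      Args.push_back("-ffp-contract=fast");
  }
  return Args;
}

// Hypothetical helper: a state in which every listed condition holds.
FPState relaxedState() {
  FPState S;
  S.HonorINFs = S.HonorNaNs = S.MathErrno = false;
  S.AssociativeMath = S.ReciprocalMath = true;
  S.SignedZeros = S.TrappingMath = false;
  return S;
}
```

In this model, a relaxed state with no -ffp-model yields only -ffast-math, so a CC1 that treated -ffast-math as implying fast contraction would change behavior for such command lines.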

michele.scandale planned changes to this revision. May 29 2020, 2:48 PM

Revert the last change that made -ffast-math imply the "fast" contraction mode by default in CC1.
Rebase.

Per a private discussion, it seems like the right thing to do here would be to stop recognizing the -ffast-math flag in cc1 and just put the driver in charge of all these individual flags. That may necessitate adding a -fdefine-fast-math cc1 option to control the #define, but the logic for that seems complex enough that it ought to be left to the driver.

Separately, we should consider making some more driver options imply -ffp-contract=fast. Fast contraction is the default behavior in GCC, and while Clang has chosen not to follow that precedent, it seems to be in keeping with the spirit of options like -funsafe-math-optimizations that they should broadly enable contraction if that policy isn't otherwise specified.

Those sorts of changes should be kept separate from this kind of refactor, though. This is still approved.

rjmccall accepted this revision. May 29 2020, 5:51 PM

This revision is now accepted and ready to land. May 29 2020, 5:51 PM

Harbormaster completed remote builds in B58513: Diff 267401. May 29 2020, 6:02 PM

Thanks John.

Would you be able to land this on my behalf?

8a8d703be0986dd6785cba0b610c9c4708b83e89

michele.scandale removed a child revision: D80462: Fix floating point math function attributes definition. Jun 2 2020, 9:40 AM

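Revision Contents

Path                                                  Size
clang/include/clang/Basic/CodeGenOptions.def          7 lines
clang/include/clang/Basic/LangOptions.h               10 lines
clang/lib/CodeGen/BackendUtil.cpp                     6 lines
clang/lib/CodeGen/CGCall.cpp                          11 lines
clang/lib/CodeGen/CGExprScalar.cpp                    35 lines
clang/lib/CodeGen/CodeGenFunction.h                   3 lines
clang/lib/CodeGen/CodeGenFunction.cpp                 33 lines
clang/lib/Frontend/CompilerInvocation.cpp             64 lines
clang/test/CodeGen/builtins-nvptx-ptx60.cu            8 lines
clang/test/CodeGen/complex-math.c                     2 lines
clang/test/CodeGen/fp-options-to-fast-math-flags.c    42 lines
clang/test/CodeGen/libcalls.c                         6 lines
clang/test/CodeGenCUDA/builtins-amdgcn.cu             2 lines
clang/test/CodeGenCUDA/library-builtin.cu             4 lines
clang/test/CodeGenOpenCL/relaxed-fpmath.cl            4 lines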

Diff 265405

clang/include/clang/Basic/CodeGenOptions.def

@@ 144 lines hidden @@
 CODEGENOPT(FatalWarnings , 1, 0) ///< Set when -Wa,--fatal-warnings is
     ///< enabled.
 CODEGENOPT(NoWarn , 1, 0) ///< Set when -Wa,--no-warn is enabled.
 CODEGENOPT(EnableSegmentedStacks , 1, 0) ///< Set when -fsplit-stack is enabled.
 CODEGENOPT(NoInlineLineTables, 1, 0) ///< Whether debug info should contain
     ///< inline line tables.
 CODEGENOPT(StackClashProtector, 1, 0) ///< Set when -fstack-clash-protection is enabled.
 CODEGENOPT(NoImplicitFloat , 1, 0) ///< Set when -mno-implicit-float is enabled.
-CODEGENOPT(NoInfsFPMath , 1, 0) ///< Assume FP arguments, results not +-Inf.
-CODEGENOPT(NoSignedZeros , 1, 0) ///< Allow ignoring the signedness of FP zero
 CODEGENOPT(NullPointerIsValid , 1, 0) ///< Assume Null pointer deference is defined.
-CODEGENOPT(Reassociate , 1, 0) ///< Allow reassociation of FP math ops
-CODEGENOPT(ReciprocalMath , 1, 0) ///< Allow FP divisions to be reassociated.
-CODEGENOPT(NoTrappingMath , 1, 0) ///< Set when -fno-trapping-math is enabled.
-CODEGENOPT(NoNaNsFPMath , 1, 0) ///< Assume FP arguments, results not NaN.
 CODEGENOPT(CorrectlyRoundedDivSqrt, 1, 0) ///< -cl-fp32-correctly-rounded-divide-sqrt
 CODEGENOPT(UniqueInternalLinkageNames, 1, 0) ///< Internal Linkage symbols get unique names.

 /// When false, this attempts to generate code as if the result of an
 /// overflowing conversion matches the overflowing behavior of a target's native
 /// float-to-int conversion instructions.
 CODEGENOPT(StrictFloatCastOverflow, 1, 1)

@@ 75 lines hidden @@
 CODEGENOPT(StrictVTablePointers, 1, 0) ///< Optimize based on the strict vtable pointers
 CODEGENOPT(TimePasses , 1, 0) ///< Set when -ftime-report is enabled.
 CODEGENOPT(TimeTrace , 1, 0) ///< Set when -ftime-trace is enabled.
 VALUE_CODEGENOPT(TimeTraceGranularity, 32, 500) ///< Minimum time granularity (in microseconds),
     ///< traced by time profiler
 CODEGENOPT(UnrollLoops , 1, 0) ///< Control whether loops are unrolled.
 CODEGENOPT(RerollLoops , 1, 0) ///< Control whether loops are rerolled.
 CODEGENOPT(NoUseJumpTables , 1, 0) ///< Set when -fno-jump-tables is enabled.
-CODEGENOPT(UnsafeFPMath , 1, 0) ///< Allow unsafe floating point optzns.
 CODEGENOPT(UnwindTables , 1, 0) ///< Emit unwind tables.
 CODEGENOPT(VectorizeLoop , 1, 0) ///< Run loop vectorizer.
 CODEGENOPT(VectorizeSLP , 1, 0) ///< Run SLP vectorizer.
 CODEGENOPT(ProfileSampleAccurate, 1, 0) ///< Sample profile is accurate.
 CODEGENOPT(CallGraphProfile , 1, 0) ///< Run call graph profile.

 /// Attempt to use register sized accesses to bit-fields in structures, when
 /// possible.
@@ 138 lines hidden @@

clang/include/clang/Basic/LangOptions.h

@@ 378 lines hidden @@ public:
   // Used for serializing.
   explicit FPOptions(unsigned I) { getFromOpaqueInt(I); }

   explicit FPOptions(const LangOptions &LangOpts)
       : fp_contract(LangOpts.getDefaultFPContractMode()),
         fenv_access(LangOptions::FPM_Off),
         rounding(static_cast<unsigned>(LangOpts.getFPRoundingMode())),
         exceptions(LangOpts.getFPExceptionMode()),
-        allow_reassoc(LangOpts.FastMath || LangOpts.AllowFPReassoc),
-        no_nans(LangOpts.FastMath || LangOpts.NoHonorNaNs),
-        no_infs(LangOpts.FastMath || LangOpts.NoHonorInfs),
-        no_signed_zeros(LangOpts.FastMath || LangOpts.NoSignedZero),
-        allow_reciprocal(LangOpts.FastMath || LangOpts.AllowRecip),
-        approx_func(LangOpts.FastMath || LangOpts.ApproxFunc) {}
+        allow_reassoc(LangOpts.AllowFPReassoc), no_nans(LangOpts.NoHonorNaNs),
+        no_infs(LangOpts.NoHonorInfs), no_signed_zeros(LangOpts.NoSignedZero),
+        allow_reciprocal(LangOpts.AllowRecip),
+        approx_func(LangOpts.ApproxFunc) {}
   // FIXME: Use getDefaultFEnvAccessMode() when available.

   void setFastMath(bool B = true) {
     allow_reassoc = no_nans = no_infs = no_signed_zeros = approx_func =
         allow_reciprocal = B;
   }

   /// Return the default value of FPOptions that's used when trailing
@@ 149 lines hidden @@

clang/lib/CodeGen/BackendUtil.cpp

@@ 476 lines hidden @@ if (LangOpts.SjLjExceptions)
     Options.ExceptionModel = llvm::ExceptionHandling::SjLj;
   if (LangOpts.SEHExceptions)
     Options.ExceptionModel = llvm::ExceptionHandling::WinEH;
   if (LangOpts.DWARFExceptions)
     Options.ExceptionModel = llvm::ExceptionHandling::DwarfCFI;
   if (LangOpts.WasmExceptions)
     Options.ExceptionModel = llvm::ExceptionHandling::Wasm;

-  Options.NoInfsFPMath = CodeGenOpts.NoInfsFPMath;
-  Options.NoNaNsFPMath = CodeGenOpts.NoNaNsFPMath;
+  Options.NoInfsFPMath = LangOpts.NoHonorInfs;
+  Options.NoNaNsFPMath = LangOpts.NoHonorNaNs;
   Options.NoZerosInBSS = CodeGenOpts.NoZeroInitializedInBSS;
-  Options.UnsafeFPMath = CodeGenOpts.UnsafeFPMath;
+  Options.UnsafeFPMath = LangOpts.UnsafeFPMath;
   Options.StackAlignmentOverride = CodeGenOpts.StackAlignment;
   Options.FunctionSections = CodeGenOpts.FunctionSections;
   Options.DataSections = CodeGenOpts.DataSections;
   Options.UniqueSectionNames = CodeGenOpts.UniqueSectionNames;
   Options.TLSSize = CodeGenOpts.TLSSize;
   Options.EmulatedTLS = CodeGenOpts.EmulatedTLS;
   Options.ExplicitEmulatedTLS = CodeGenOpts.ExplicitEmulatedTLS;
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
@@ 1,184 lines hidden @@

clang/lib/CodeGen/CGCall.cpp

@@ 1,751 lines hidden @@ if (CodeGenOpts.FPDenormalMode != llvm::DenormalMode::getIEEE())
                            CodeGenOpts.FPDenormalMode.str());
     if (CodeGenOpts.FP32DenormalMode != CodeGenOpts.FPDenormalMode) {
       FuncAttrs.addAttribute(
           "denormal-fp-math-f32",
           CodeGenOpts.FP32DenormalMode.str());
     }

     FuncAttrs.addAttribute("no-trapping-math",
-                           llvm::toStringRef(CodeGenOpts.NoTrappingMath));
+                           llvm::toStringRef(LangOpts.getFPExceptionMode() ==
+                                             LangOptions::FPE_Ignore));

     // Strict (compliant) code is the default, so only add this attribute to
     // indicate that we are trying to workaround a problem case.
     if (!CodeGenOpts.StrictFloatCastOverflow)
       FuncAttrs.addAttribute("strict-float-cast-overflow", "false");

     // TODO: Are these all needed?
     // unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
     FuncAttrs.addAttribute("no-infs-fp-math",
-                           llvm::toStringRef(CodeGenOpts.NoInfsFPMath));
+                           llvm::toStringRef(LangOpts.NoHonorInfs));
     FuncAttrs.addAttribute("no-nans-fp-math",
-                           llvm::toStringRef(CodeGenOpts.NoNaNsFPMath));
+                           llvm::toStringRef(LangOpts.NoHonorNaNs));
     FuncAttrs.addAttribute("unsafe-fp-math",
-                           llvm::toStringRef(CodeGenOpts.UnsafeFPMath));
+                           llvm::toStringRef(LangOpts.UnsafeFPMath));
     FuncAttrs.addAttribute("use-soft-float",
                            llvm::toStringRef(CodeGenOpts.SoftFloat));
     FuncAttrs.addAttribute("stack-protector-buffer-size",
                            llvm::utostr(CodeGenOpts.SSPBufferSize));
     FuncAttrs.addAttribute("no-signed-zeros-fp-math",
-                           llvm::toStringRef(CodeGenOpts.NoSignedZeros));
+                           llvm::toStringRef(LangOpts.NoSignedZero));
     FuncAttrs.addAttribute(
         "correctly-rounded-divide-sqrt-fp-math",
         llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt));

     // TODO: Reciprocal estimate codegen options should apply to instructions?
     const std::vector<std::string> &Recips = CodeGenOpts.Reciprocals;
     if (!Recips.empty())
       FuncAttrs.addAttribute("reciprocal-estimates",
@@ 3,356 lines hidden @@
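
As a rough illustration of what the CGCall.cpp hunk does, the sketch below builds the same string-valued function attributes from a single hypothetical property struct, including deriving "no-trapping-math" from the exception mode rather than from a separate flag. `FPProps`, `FPExceptionMode`, `boolStr` and `fpFunctionAttrs` are stand-ins invented for this example, not Clang APIs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical mirror of the exception-mode enumeration.
enum class FPExceptionMode { Ignore, MayTrap, Strict };

// Hypothetical stand-in for the LangOptions fields read by the hunk.
struct FPProps {
  bool NoHonorInfs = false;
  bool NoHonorNaNs = false;
  bool UnsafeFPMath = false;
  bool NoSignedZero = false;
  FPExceptionMode Exceptions = FPExceptionMode::Ignore;
};

// Attribute values are the strings "true" and "false".
static const char *boolStr(bool B) { return B ? "true" : "false"; }

// Builds the FP-related attributes from the one property struct.
std::map<std::string, std::string> fpFunctionAttrs(const FPProps &P) {
  return {
      {"no-trapping-math", boolStr(P.Exceptions == FPExceptionMode::Ignore)},
      {"no-infs-fp-math", boolStr(P.NoHonorInfs)},
      {"no-nans-fp-math", boolStr(P.NoHonorNaNs)},
      {"unsafe-fp-math", boolStr(P.UnsafeFPMath)},
      {"no-signed-zeros-fp-math", boolStr(P.NoSignedZero)},
  };
}
```

Because the attributes and any instruction-level flags would be read from the same struct in this model, they cannot disagree about, say, whether NaNs are honored.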

clang/lib/CodeGen/CGExprScalar.cpp

Show First 20 Lines • Show All 208 Lines • ▼ Show 20 Lines	static bool CanElideOverflowCheck(const ASTContext &Ctx, const BinOpInfo &Op) {

// For unsigned multiplication the overflow check can be elided if either one		// For unsigned multiplication the overflow check can be elided if either one
// of the unpromoted types are less than half the size of the promoted type.		// of the unpromoted types are less than half the size of the promoted type.
unsigned PromotedSize = Ctx.getTypeSize(Op.E->getType());		unsigned PromotedSize = Ctx.getTypeSize(Op.E->getType());
return (2 * Ctx.getTypeSize(LHSTy)) < PromotedSize \|\|		return (2 * Ctx.getTypeSize(LHSTy)) < PromotedSize \|\|
(2 * Ctx.getTypeSize(RHSTy)) < PromotedSize;		(2 * Ctx.getTypeSize(RHSTy)) < PromotedSize;
}		}

/// Update the FastMathFlags of LLVM IR from the FPOptions in LangOptions.
static void updateFastMathFlags(llvm::FastMathFlags &FMF,
FPOptions FPFeatures) {
FMF.setAllowReassoc(FPFeatures.allowAssociativeMath());
FMF.setNoNaNs(FPFeatures.noHonorNaNs());
FMF.setNoInfs(FPFeatures.noHonorInfs());
FMF.setNoSignedZeros(FPFeatures.noSignedZeros());
FMF.setAllowReciprocal(FPFeatures.allowReciprocalMath());
FMF.setApproxFunc(FPFeatures.allowApproximateFunctions());
FMF.setAllowContract(FPFeatures.allowFPContractAcrossStatement());
}

/// Propagate fast-math flags from \p Op to the instruction in \p V.
static Value propagateFMFlags(Value V, const BinOpInfo &Op) {
if (auto *I = dyn_cast<llvm::Instruction>(V)) {
llvm::FastMathFlags FMF = I->getFastMathFlags();
updateFastMathFlags(FMF, Op.FPFeatures);
I->setFastMathFlags(FMF);
}
return V;
}

static void setBuilderFlagsFromFPFeatures(CGBuilderTy &Builder,		static void setBuilderFlagsFromFPFeatures(CGBuilderTy &Builder,
CodeGenFunction &CGF,		CodeGenFunction &CGF,
FPOptions FPFeatures) {		FPOptions FPFeatures) {
auto NewRoundingBehavior = FPFeatures.getRoundingMode();		auto NewRoundingBehavior = FPFeatures.getRoundingMode();
Builder.setDefaultConstrainedRounding(NewRoundingBehavior);		Builder.setDefaultConstrainedRounding(NewRoundingBehavior);
auto NewExceptionBehavior =		auto NewExceptionBehavior =
ToConstrainedExceptMD(FPFeatures.getExceptionMode());		ToConstrainedExceptMD(FPFeatures.getExceptionMode());
Builder.setDefaultConstrainedExcept(NewExceptionBehavior);		Builder.setDefaultConstrainedExcept(NewExceptionBehavior);
auto FMF = Builder.getFastMathFlags();		CGF.SetFastMathFlags(FPFeatures);
updateFastMathFlags(FMF, FPFeatures);
Builder.setFastMathFlags(FMF);
assert((CGF.CurFuncDecl == nullptr \|\| Builder.getIsFPConstrained() \|\|		assert((CGF.CurFuncDecl == nullptr \|\| Builder.getIsFPConstrained() \|\|
isa<CXXConstructorDecl>(CGF.CurFuncDecl) \|\|		isa<CXXConstructorDecl>(CGF.CurFuncDecl) \|\|
isa<CXXDestructorDecl>(CGF.CurFuncDecl) \|\|		isa<CXXDestructorDecl>(CGF.CurFuncDecl) \|\|
(NewExceptionBehavior == llvm::fp::ebIgnore &&		(NewExceptionBehavior == llvm::fp::ebIgnore &&
NewRoundingBehavior == llvm::RoundingMode::NearestTiesToEven)) &&		NewRoundingBehavior == llvm::RoundingMode::NearestTiesToEven)) &&
"FPConstrained should be enabled on entire function");		"FPConstrained should be enabled on entire function");
}		}

▲ Show 20 Lines • Show All 509 Lines • ▼ Show 20 Lines	if (Ops.Ty->isUnsignedIntegerType() &&
CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&		CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
!CanElideOverflowCheck(CGF.getContext(), Ops))		!CanElideOverflowCheck(CGF.getContext(), Ops))
return EmitOverflowCheckedBinOp(Ops);		return EmitOverflowCheckedBinOp(Ops);

if (Ops.LHS->getType()->isFPOrFPVectorTy()) {		if (Ops.LHS->getType()->isFPOrFPVectorTy()) {
// Preserve the old values		// Preserve the old values
llvm::IRBuilder<>::FastMathFlagGuard FMFG(Builder);		llvm::IRBuilder<>::FastMathFlagGuard FMFG(Builder);
setBuilderFlagsFromFPFeatures(Builder, CGF, Ops.FPFeatures);		setBuilderFlagsFromFPFeatures(Builder, CGF, Ops.FPFeatures);
Value *V = Builder.CreateFMul(Ops.LHS, Ops.RHS, "mul");		return Builder.CreateFMul(Ops.LHS, Ops.RHS, "mul");
return propagateFMFlags(V, Ops);
}		}
if (Ops.isFixedPointOp())		if (Ops.isFixedPointOp())
return EmitFixedPointBinOp(Ops);		return EmitFixedPointBinOp(Ops);
return Builder.CreateMul(Ops.LHS, Ops.RHS, "mul");		return Builder.CreateMul(Ops.LHS, Ops.RHS, "mul");
}		}
/// Create a binary op that checks for overflow.		/// Create a binary op that checks for overflow.
/// Currently only supports +, - and *.		/// Currently only supports +, - and *.
Value *EmitOverflowCheckedBinOp(const BinOpInfo &Ops);		Value *EmitOverflowCheckedBinOp(const BinOpInfo &Ops);
▲ Show 20 Lines • Show All 2,758 Lines • ▼ Show 20 Lines	Value *ScalarExprEmitter::EmitAdd(const BinOpInfo &op) {

if (op.LHS->getType()->isFPOrFPVectorTy()) {		if (op.LHS->getType()->isFPOrFPVectorTy()) {
llvm::IRBuilder<>::FastMathFlagGuard FMFG(Builder);		llvm::IRBuilder<>::FastMathFlagGuard FMFG(Builder);
setBuilderFlagsFromFPFeatures(Builder, CGF, op.FPFeatures);		setBuilderFlagsFromFPFeatures(Builder, CGF, op.FPFeatures);
// Try to form an fmuladd.		// Try to form an fmuladd.
if (Value *FMulAdd = tryEmitFMulAdd(op, CGF, Builder))		if (Value *FMulAdd = tryEmitFMulAdd(op, CGF, Builder))
return FMulAdd;		return FMulAdd;

Value *V = Builder.CreateFAdd(op.LHS, op.RHS, "add");		return Builder.CreateFAdd(op.LHS, op.RHS, "add");
return propagateFMFlags(V, op);
}		}

if (op.isFixedPointOp())		if (op.isFixedPointOp())
return EmitFixedPointBinOp(op);		return EmitFixedPointBinOp(op);

return Builder.CreateAdd(op.LHS, op.RHS, "add");		return Builder.CreateAdd(op.LHS, op.RHS, "add");
}		}

▲ Show 20 Lines • Show All 165 Lines • ▼ Show 20 Lines	if (op.Ty->isUnsignedIntegerType() &&
return EmitOverflowCheckedBinOp(op);		return EmitOverflowCheckedBinOp(op);

if (op.LHS->getType()->isFPOrFPVectorTy()) {		if (op.LHS->getType()->isFPOrFPVectorTy()) {
llvm::IRBuilder<>::FastMathFlagGuard FMFG(Builder);		llvm::IRBuilder<>::FastMathFlagGuard FMFG(Builder);
setBuilderFlagsFromFPFeatures(Builder, CGF, op.FPFeatures);		setBuilderFlagsFromFPFeatures(Builder, CGF, op.FPFeatures);
// Try to form an fmuladd.		// Try to form an fmuladd.
if (Value *FMulAdd = tryEmitFMulAdd(op, CGF, Builder, true))		if (Value *FMulAdd = tryEmitFMulAdd(op, CGF, Builder, true))
return FMulAdd;		return FMulAdd;
Value *V = Builder.CreateFSub(op.LHS, op.RHS, "sub");		return Builder.CreateFSub(op.LHS, op.RHS, "sub");
return propagateFMFlags(V, op);
}		}

if (op.isFixedPointOp())		if (op.isFixedPointOp())
return EmitFixedPointBinOp(op);		return EmitFixedPointBinOp(op);

return Builder.CreateSub(op.LHS, op.RHS, "sub");		return Builder.CreateSub(op.LHS, op.RHS, "sub");
}		}


clang/lib/CodeGen/CodeGenFunction.h

public:

/// SetFPAccuracy - Set the minimum required accuracy of the given floating		/// SetFPAccuracy - Set the minimum required accuracy of the given floating
/// point operation, expressed as the maximum relative error in ulp.		/// point operation, expressed as the maximum relative error in ulp.
void SetFPAccuracy(llvm::Value *Val, float Accuracy);		void SetFPAccuracy(llvm::Value *Val, float Accuracy);

/// SetFPModel - Control floating point behavior via fp-model settings.		/// SetFPModel - Control floating point behavior via fp-model settings.
void SetFPModel();		void SetFPModel();

		/// Set the codegen fast-math flags.
		void SetFastMathFlags(FPOptions FPFeatures);

private:		private:
llvm::MDNode *getRangeForLoadFromType(QualType Ty);		llvm::MDNode *getRangeForLoadFromType(QualType Ty);
void EmitReturnOfRValue(RValue RV, QualType Ty);		void EmitReturnOfRValue(RValue RV, QualType Ty);

void deferPlaceholderReplacement(llvm::Instruction *Old, llvm::Value *New);		void deferPlaceholderReplacement(llvm::Instruction *Old, llvm::Value *New);

llvm::SmallVector<std::pair<llvm::Instruction *, llvm::Value *>, 4>		llvm::SmallVector<std::pair<llvm::Instruction *, llvm::Value *>, 4>
DeferredReplacements;		DeferredReplacements;

clang/lib/CodeGen/CodeGenFunction.cpp

: CodeGenTypeCache(cgm), CGM(cgm), Target(cgm.getTarget()),
Builder(cgm, cgm.getModule().getContext(), llvm::ConstantFolder(),		Builder(cgm, cgm.getModule().getContext(), llvm::ConstantFolder(),
CGBuilderInserterTy(this)),		CGBuilderInserterTy(this)),
SanOpts(CGM.getLangOpts().Sanitize), DebugInfo(CGM.getModuleDebugInfo()),		SanOpts(CGM.getLangOpts().Sanitize), DebugInfo(CGM.getModuleDebugInfo()),
PGO(cgm), ShouldEmitLifetimeMarkers(shouldEmitLifetimeMarkers(		PGO(cgm), ShouldEmitLifetimeMarkers(shouldEmitLifetimeMarkers(
CGM.getCodeGenOpts(), CGM.getLangOpts())) {		CGM.getCodeGenOpts(), CGM.getLangOpts())) {
if (!suppressNewContext)		if (!suppressNewContext)
CGM.getCXXABI().getMangleContext().startNewFunction();		CGM.getCXXABI().getMangleContext().startNewFunction();

llvm::FastMathFlags FMF;		SetFastMathFlags(FPOptions(CGM.getLangOpts()));
if (CGM.getLangOpts().FastMath)
FMF.setFast();
if (CGM.getLangOpts().FiniteMathOnly) {
FMF.setNoNaNs();
FMF.setNoInfs();
}
if (CGM.getCodeGenOpts().NoNaNsFPMath) {
FMF.setNoNaNs();
}
if (CGM.getCodeGenOpts().NoSignedZeros) {
FMF.setNoSignedZeros();
}
if (CGM.getCodeGenOpts().ReciprocalMath) {
FMF.setAllowReciprocal();
}
if (CGM.getCodeGenOpts().Reassociate) {
FMF.setAllowReassoc();
}
Builder.setFastMathFlags(FMF);
SetFPModel();		SetFPModel();
}		}

CodeGenFunction::~CodeGenFunction() {		CodeGenFunction::~CodeGenFunction() {
assert(LifetimeExtendedCleanupStack.empty() && "failed to emit a cleanup");		assert(LifetimeExtendedCleanupStack.empty() && "failed to emit a cleanup");

// If there are any unclaimed block infos, go ahead and destroy them		// If there are any unclaimed block infos, go ahead and destroy them
// now. This can happen if IR-gen gets clever and skips evaluating		// now. This can happen if IR-gen gets clever and skips evaluating
auto fpExceptionBehavior = ToConstrainedExceptMD(
getLangOpts().getFPExceptionMode());		getLangOpts().getFPExceptionMode());

Builder.setDefaultConstrainedRounding(RM);		Builder.setDefaultConstrainedRounding(RM);
Builder.setDefaultConstrainedExcept(fpExceptionBehavior);		Builder.setDefaultConstrainedExcept(fpExceptionBehavior);
Builder.setIsFPConstrained(fpExceptionBehavior != llvm::fp::ebIgnore \|\|		Builder.setIsFPConstrained(fpExceptionBehavior != llvm::fp::ebIgnore \|\|
RM != llvm::RoundingMode::NearestTiesToEven);		RM != llvm::RoundingMode::NearestTiesToEven);
}		}

		void CodeGenFunction::SetFastMathFlags(FPOptions FPFeatures) {
		llvm::FastMathFlags FMF;
		FMF.setAllowReassoc(FPFeatures.allowAssociativeMath());
		FMF.setNoNaNs(FPFeatures.noHonorNaNs());
		FMF.setNoInfs(FPFeatures.noHonorInfs());
		FMF.setNoSignedZeros(FPFeatures.noSignedZeros());
		FMF.setAllowReciprocal(FPFeatures.allowReciprocalMath());
		FMF.setApproxFunc(FPFeatures.allowApproximateFunctions());
		FMF.setAllowContract(FPFeatures.allowFPContractAcrossStatement());
		Builder.setFastMathFlags(FMF);
		}

LValue CodeGenFunction::MakeNaturalAlignAddrLValue(llvm::Value *V, QualType T) {		LValue CodeGenFunction::MakeNaturalAlignAddrLValue(llvm::Value *V, QualType T) {
LValueBaseInfo BaseInfo;		LValueBaseInfo BaseInfo;
TBAAAccessInfo TBAAInfo;		TBAAAccessInfo TBAAInfo;
CharUnits Alignment = CGM.getNaturalTypeAlignment(T, &BaseInfo, &TBAAInfo);		CharUnits Alignment = CGM.getNaturalTypeAlignment(T, &BaseInfo, &TBAAInfo);
return LValue::MakeAddr(Address(V, Alignment), T, getContext(), BaseInfo,		return LValue::MakeAddr(Address(V, Alignment), T, getContext(), BaseInfo,
TBAAInfo);		TBAAInfo);
}		}
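
The `SetFastMathFlags` helper added in this file copies seven `FPOptions` queries into an `llvm::FastMathFlags` value. The following is a minimal standalone sketch of that idea, not the Clang implementation: `FPOptionsSketch`, `spellFastMathFlags` and `checkSpellings` are hypothetical names, and the keyword order is an assumption based on how LLVM prints fast-math flags in textual IR (`fast` when all seven are set, otherwise individual keywords). It reproduces the flag strings that the CHECK lines in this revision expect.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the FPOptions queries that SetFastMathFlags reads.
struct FPOptionsSketch {
  bool AllowReassoc = false;
  bool NoNaNs = false;
  bool NoInfs = false;
  bool NoSignedZeros = false;
  bool AllowReciprocal = false;
  bool ApproxFunc = false;
  bool AllowContract = false;
};

// Spell the flags the way textual LLVM IR is assumed to: `fast` when every
// flag is set, otherwise the individual keywords in a fixed order.
std::string spellFastMathFlags(const FPOptionsSketch &FP) {
  if (FP.AllowReassoc && FP.NoNaNs && FP.NoInfs && FP.NoSignedZeros &&
      FP.AllowReciprocal && FP.ApproxFunc && FP.AllowContract)
    return "fast";

  std::string Out;
  auto Add = [&Out](bool IsSet, const char *Keyword) {
    if (!IsSet)
      return;
    if (!Out.empty())
      Out += ' ';
    Out += Keyword;
  };
  Add(FP.AllowReassoc, "reassoc");
  Add(FP.NoNaNs, "nnan");
  Add(FP.NoInfs, "ninf");
  Add(FP.NoSignedZeros, "nsz");
  Add(FP.AllowReciprocal, "arcp");
  Add(FP.AllowContract, "contract");
  Add(FP.ApproxFunc, "afn");
  return Out;
}

// Self-check against two flag strings that appear in this revision's tests.
void checkSpellings() {
  FPOptionsSketch Unsafe;
  Unsafe.AllowReassoc = Unsafe.NoSignedZeros = true;
  Unsafe.AllowReciprocal = Unsafe.ApproxFunc = true;
  assert(spellFastMathFlags(Unsafe) == "reassoc nsz arcp afn");

  FPOptionsSketch Fast = Unsafe;
  Fast.NoNaNs = Fast.NoInfs = true;
  assert(spellFastMathFlags(Fast) == "reassoc nnan ninf nsz arcp afn");
}
```

Because the contract flag is derived from the FP contract mode rather than from `-ffast-math`, the sketch yields `fast` only when `AllowContract` is also set, which is consistent with `-ffp-contract=fast` being added to the `AARCH64-FASTMATH` RUN line in complex-math.c.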


clang/lib/Frontend/CompilerInvocation.cpp

static bool ParseCodeGenArgs(CodeGenOptions &Opts, ArgList &Args, InputKind IK,
Opts.NoEscapingBlockTailCalls =		Opts.NoEscapingBlockTailCalls =
Args.hasArg(OPT_fno_escaping_block_tail_calls);		Args.hasArg(OPT_fno_escaping_block_tail_calls);
Opts.FloatABI = std::string(Args.getLastArgValue(OPT_mfloat_abi));		Opts.FloatABI = std::string(Args.getLastArgValue(OPT_mfloat_abi));
Opts.LessPreciseFPMAD = Args.hasArg(OPT_cl_mad_enable) \|\|		Opts.LessPreciseFPMAD = Args.hasArg(OPT_cl_mad_enable) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|		Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math);		Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.LimitFloatPrecision =		Opts.LimitFloatPrecision =
std::string(Args.getLastArgValue(OPT_mlimit_float_precision));		std::string(Args.getLastArgValue(OPT_mlimit_float_precision));
Opts.NoInfsFPMath = (Args.hasArg(OPT_menable_no_infinities) \|\|
Args.hasArg(OPT_cl_finite_math_only) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math));
Opts.NoNaNsFPMath = (Args.hasArg(OPT_menable_no_nans) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_finite_math_only) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math));
Opts.NoSignedZeros = (Args.hasArg(OPT_fno_signed_zeros) \|\|
Args.hasArg(OPT_cl_no_signed_zeros) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math));
Opts.Reassociate = Args.hasArg(OPT_mreassociate);
Opts.CorrectlyRoundedDivSqrt =		Opts.CorrectlyRoundedDivSqrt =
Args.hasArg(OPT_cl_fp32_correctly_rounded_divide_sqrt);		Args.hasArg(OPT_cl_fp32_correctly_rounded_divide_sqrt);
Opts.UniformWGSize =		Opts.UniformWGSize =
Args.hasArg(OPT_cl_uniform_work_group_size);		Args.hasArg(OPT_cl_uniform_work_group_size);
Opts.Reciprocals = Args.getAllArgValues(OPT_mrecip_EQ);		Opts.Reciprocals = Args.getAllArgValues(OPT_mrecip_EQ);
Opts.ReciprocalMath = Args.hasArg(OPT_freciprocal_math);
Opts.NoTrappingMath = Args.hasArg(OPT_fno_trapping_math);
Opts.StrictFloatCastOverflow =		Opts.StrictFloatCastOverflow =
!Args.hasArg(OPT_fno_strict_float_cast_overflow);		!Args.hasArg(OPT_fno_strict_float_cast_overflow);

Opts.NoZeroInitializedInBSS = Args.hasArg(OPT_mno_zero_initialized_in_bss);		Opts.NoZeroInitializedInBSS = Args.hasArg(OPT_mno_zero_initialized_in_bss);
Opts.NumRegisterParameters = getLastArgIntValue(Args, OPT_mregparm, 0, Diags);		Opts.NumRegisterParameters = getLastArgIntValue(Args, OPT_mregparm, 0, Diags);
Opts.NoExecStack = Args.hasArg(OPT_mno_exec_stack);		Opts.NoExecStack = Args.hasArg(OPT_mno_exec_stack);
Opts.SmallDataLimit =		Opts.SmallDataLimit =
getLastArgIntValue(Args, OPT_msmall_data_limit, 0, Diags);		getLastArgIntValue(Args, OPT_msmall_data_limit, 0, Diags);
Opts.FatalWarnings = Args.hasArg(OPT_massembler_fatal_warnings);		Opts.FatalWarnings = Args.hasArg(OPT_massembler_fatal_warnings);
Opts.NoWarn = Args.hasArg(OPT_massembler_no_warn);		Opts.NoWarn = Args.hasArg(OPT_massembler_no_warn);
Opts.EnableSegmentedStacks = Args.hasArg(OPT_split_stacks);		Opts.EnableSegmentedStacks = Args.hasArg(OPT_split_stacks);
Opts.RelaxAll = Args.hasArg(OPT_mrelax_all);		Opts.RelaxAll = Args.hasArg(OPT_mrelax_all);
Opts.IncrementalLinkerCompatible =		Opts.IncrementalLinkerCompatible =
Args.hasArg(OPT_mincremental_linker_compatible);		Args.hasArg(OPT_mincremental_linker_compatible);
Opts.PIECopyRelocations =		Opts.PIECopyRelocations =
Args.hasArg(OPT_mpie_copy_relocations);		Args.hasArg(OPT_mpie_copy_relocations);
Opts.NoPLT = Args.hasArg(OPT_fno_plt);		Opts.NoPLT = Args.hasArg(OPT_fno_plt);
Opts.SaveTempLabels = Args.hasArg(OPT_msave_temp_labels);		Opts.SaveTempLabels = Args.hasArg(OPT_msave_temp_labels);
Opts.NoDwarfDirectoryAsm = Args.hasArg(OPT_fno_dwarf_directory_asm);		Opts.NoDwarfDirectoryAsm = Args.hasArg(OPT_fno_dwarf_directory_asm);
Opts.SoftFloat = Args.hasArg(OPT_msoft_float);		Opts.SoftFloat = Args.hasArg(OPT_msoft_float);
Opts.StrictEnums = Args.hasArg(OPT_fstrict_enums);		Opts.StrictEnums = Args.hasArg(OPT_fstrict_enums);
Opts.StrictReturn = !Args.hasArg(OPT_fno_strict_return);		Opts.StrictReturn = !Args.hasArg(OPT_fno_strict_return);
Opts.StrictVTablePointers = Args.hasArg(OPT_fstrict_vtable_pointers);		Opts.StrictVTablePointers = Args.hasArg(OPT_fstrict_vtable_pointers);
Opts.ForceEmitVTables = Args.hasArg(OPT_fforce_emit_vtables);		Opts.ForceEmitVTables = Args.hasArg(OPT_fforce_emit_vtables);
Opts.UnsafeFPMath = Args.hasArg(OPT_menable_unsafe_fp_math) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.UnwindTables = Args.hasArg(OPT_munwind_tables);		Opts.UnwindTables = Args.hasArg(OPT_munwind_tables);
Opts.RelocationModel = getRelocModel(Args, Diags);		Opts.RelocationModel = getRelocModel(Args, Diags);
Opts.ThreadModel =		Opts.ThreadModel =
std::string(Args.getLastArgValue(OPT_mthread_model, "posix"));		std::string(Args.getLastArgValue(OPT_mthread_model, "posix"));
if (Opts.ThreadModel != "posix" && Opts.ThreadModel != "single")		if (Opts.ThreadModel != "posix" && Opts.ThreadModel != "single")
Diags.Report(diag::err_drv_invalid_value)		Diags.Report(diag::err_drv_invalid_value)
<< Args.getLastArg(OPT_mthread_model)->getAsString(Args)		<< Args.getLastArg(OPT_mthread_model)->getAsString(Args)
<< Opts.ThreadModel;		<< Opts.ThreadModel;
+ Args.hasFlag(OPT_frecovery_ast_type, OPT_fno_recovery_ast_type, false);
// inlining enabled.		// inlining enabled.
Opts.NoInlineDefine = !Opts.Optimize;		Opts.NoInlineDefine = !Opts.Optimize;
if (Arg *InlineArg = Args.getLastArg(		if (Arg *InlineArg = Args.getLastArg(
options::OPT_finline_functions, options::OPT_finline_hint_functions,		options::OPT_finline_functions, options::OPT_finline_hint_functions,
options::OPT_fno_inline_functions, options::OPT_fno_inline))		options::OPT_fno_inline_functions, options::OPT_fno_inline))
if (InlineArg->getOption().matches(options::OPT_fno_inline))		if (InlineArg->getOption().matches(options::OPT_fno_inline))
Opts.NoInlineDefine = true;		Opts.NoInlineDefine = true;

Opts.FastMath = Args.hasArg(OPT_ffast_math) \|\|		Opts.FastMath =
Args.hasArg(OPT_cl_fast_relaxed_math);		Args.hasArg(OPT_ffast_math) \|\| Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.FiniteMathOnly = Args.hasArg(OPT_ffinite_math_only) \|\|		Opts.FiniteMathOnly = Args.hasArg(OPT_ffinite_math_only) \|\|
		Args.hasArg(OPT_ffast_math) \|\|
Args.hasArg(OPT_cl_finite_math_only) \|\|		Args.hasArg(OPT_cl_finite_math_only) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math);		Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.UnsafeFPMath = Args.hasArg(OPT_menable_unsafe_fp_math) \|\|		Opts.UnsafeFPMath = Args.hasArg(OPT_menable_unsafe_fp_math) \|\|
		Args.hasArg(OPT_ffast_math) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|		Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math);		Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.AllowFPReassoc = Opts.FastMath \|\| Args.hasArg(OPT_mreassociate);		Opts.AllowFPReassoc = Args.hasArg(OPT_mreassociate) \|\|
Opts.NoHonorNaNs = Opts.FastMath \|\| Opts.FiniteMathOnly \|\|		Args.hasArg(OPT_menable_unsafe_fp_math) \|\|
Args.hasArg(OPT_menable_no_nans) \|\|		Args.hasArg(OPT_ffast_math) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|		Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_finite_math_only) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math);		Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.NoHonorInfs = Opts.FastMath \|\| Opts.FiniteMathOnly \|\|		Opts.NoHonorNaNs =
Args.hasArg(OPT_menable_no_infinities) \|\|		Args.hasArg(OPT_menable_no_nans) \|\| Args.hasArg(OPT_ffinite_math_only) \|\|
		Args.hasArg(OPT_ffast_math) \|\| Args.hasArg(OPT_cl_finite_math_only) \|\|
		Args.hasArg(OPT_cl_fast_relaxed_math);
		Opts.NoHonorInfs = Args.hasArg(OPT_menable_no_infinities) \|\|
		Args.hasArg(OPT_ffinite_math_only) \|\|
		Args.hasArg(OPT_ffast_math) \|\|
Args.hasArg(OPT_cl_finite_math_only) \|\|		Args.hasArg(OPT_cl_finite_math_only) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math);		Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.NoSignedZero = Opts.FastMath \|\| (Args.hasArg(OPT_fno_signed_zeros) \|\|		Opts.NoSignedZero = Args.hasArg(OPT_fno_signed_zeros) \|\|
		Args.hasArg(OPT_menable_unsafe_fp_math) \|\|
		Args.hasArg(OPT_ffast_math) \|\|
Args.hasArg(OPT_cl_no_signed_zeros) \|\|		Args.hasArg(OPT_cl_no_signed_zeros) \|\|
Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|		Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
Args.hasArg(OPT_cl_fast_relaxed_math));		Args.hasArg(OPT_cl_fast_relaxed_math);
Opts.AllowRecip = Opts.FastMath \|\| Args.hasArg(OPT_freciprocal_math);		Opts.AllowRecip = Args.hasArg(OPT_freciprocal_math) \|\|
		Args.hasArg(OPT_menable_unsafe_fp_math) \|\|
		Args.hasArg(OPT_ffast_math) \|\|
		Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
		Args.hasArg(OPT_cl_fast_relaxed_math);
// Currently there's no clang option to enable this individually		// Currently there's no clang option to enable this individually
Opts.ApproxFunc = Opts.FastMath;		Opts.ApproxFunc = Args.hasArg(OPT_menable_unsafe_fp_math) \|\|
		Args.hasArg(OPT_ffast_math) \|\|
		Args.hasArg(OPT_cl_unsafe_math_optimizations) \|\|
		Args.hasArg(OPT_cl_fast_relaxed_math);

if (Arg *A = Args.getLastArg(OPT_ffp_contract)) {		if (Arg *A = Args.getLastArg(OPT_ffp_contract)) {
StringRef Val = A->getValue();		StringRef Val = A->getValue();
if (Val == "fast")		if (Val == "fast")
Opts.setDefaultFPContractMode(LangOptions::FPM_Fast);		Opts.setDefaultFPContractMode(LangOptions::FPM_Fast);
else if (Val == "on")		else if (Val == "on")
Opts.setDefaultFPContractMode(LangOptions::FPM_On);		Opts.setDefaultFPContractMode(LangOptions::FPM_On);
else if (Val == "off")		else if (Val == "off")

clang/test/CodeGen/builtins-nvptx-ptx60.cu

__device__ void nvvm_sync(unsigned mask, int i, float f, int a, int b,
__nvvm_barrier_sync_cnt(mask, i);		__nvvm_barrier_sync_cnt(mask, i);

//		//
// SHFL.SYNC		// SHFL.SYNC
//		//
// CHECK: call i32 @llvm.nvvm.shfl.sync.down.i32(i32 {{%[0-9]+}}, i32		// CHECK: call i32 @llvm.nvvm.shfl.sync.down.i32(i32 {{%[0-9]+}}, i32
// expected-error@+1 {{'__nvvm_shfl_sync_down_i32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_down_i32' needs target feature ptx60}}
__nvvm_shfl_sync_down_i32(mask, i, a, b);		__nvvm_shfl_sync_down_i32(mask, i, a, b);
// CHECK: call float @llvm.nvvm.shfl.sync.down.f32(i32 {{%[0-9]+}}, float		// CHECK: call contract float @llvm.nvvm.shfl.sync.down.f32(i32 {{%[0-9]+}}, float
// expected-error@+1 {{'__nvvm_shfl_sync_down_f32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_down_f32' needs target feature ptx60}}
__nvvm_shfl_sync_down_f32(mask, f, a, b);		__nvvm_shfl_sync_down_f32(mask, f, a, b);
// CHECK: call i32 @llvm.nvvm.shfl.sync.up.i32(i32 {{%[0-9]+}}, i32		// CHECK: call i32 @llvm.nvvm.shfl.sync.up.i32(i32 {{%[0-9]+}}, i32
// expected-error@+1 {{'__nvvm_shfl_sync_up_i32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_up_i32' needs target feature ptx60}}
__nvvm_shfl_sync_up_i32(mask, i, a, b);		__nvvm_shfl_sync_up_i32(mask, i, a, b);
// CHECK: call float @llvm.nvvm.shfl.sync.up.f32(i32 {{%[0-9]+}}, float		// CHECK: call contract float @llvm.nvvm.shfl.sync.up.f32(i32 {{%[0-9]+}}, float
// expected-error@+1 {{'__nvvm_shfl_sync_up_f32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_up_f32' needs target feature ptx60}}
__nvvm_shfl_sync_up_f32(mask, f, a, b);		__nvvm_shfl_sync_up_f32(mask, f, a, b);
// CHECK: call i32 @llvm.nvvm.shfl.sync.bfly.i32(i32 {{%[0-9]+}}, i32		// CHECK: call i32 @llvm.nvvm.shfl.sync.bfly.i32(i32 {{%[0-9]+}}, i32
// expected-error@+1 {{'__nvvm_shfl_sync_bfly_i32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_bfly_i32' needs target feature ptx60}}
__nvvm_shfl_sync_bfly_i32(mask, i, a, b);		__nvvm_shfl_sync_bfly_i32(mask, i, a, b);
// CHECK: call float @llvm.nvvm.shfl.sync.bfly.f32(i32 {{%[0-9]+}}, float		// CHECK: call contract float @llvm.nvvm.shfl.sync.bfly.f32(i32 {{%[0-9]+}}, float
// expected-error@+1 {{'__nvvm_shfl_sync_bfly_f32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_bfly_f32' needs target feature ptx60}}
__nvvm_shfl_sync_bfly_f32(mask, f, a, b);		__nvvm_shfl_sync_bfly_f32(mask, f, a, b);
// CHECK: call i32 @llvm.nvvm.shfl.sync.idx.i32(i32 {{%[0-9]+}}, i32		// CHECK: call i32 @llvm.nvvm.shfl.sync.idx.i32(i32 {{%[0-9]+}}, i32
// expected-error@+1 {{'__nvvm_shfl_sync_idx_i32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_idx_i32' needs target feature ptx60}}
__nvvm_shfl_sync_idx_i32(mask, i, a, b);		__nvvm_shfl_sync_idx_i32(mask, i, a, b);
// CHECK: call float @llvm.nvvm.shfl.sync.idx.f32(i32 {{%[0-9]+}}, float		// CHECK: call contract float @llvm.nvvm.shfl.sync.idx.f32(i32 {{%[0-9]+}}, float
// expected-error@+1 {{'__nvvm_shfl_sync_idx_f32' needs target feature ptx60}}		// expected-error@+1 {{'__nvvm_shfl_sync_idx_f32' needs target feature ptx60}}
__nvvm_shfl_sync_idx_f32(mask, f, a, b);		__nvvm_shfl_sync_idx_f32(mask, f, a, b);

//		//
// VOTE.SYNC		// VOTE.SYNC
//		//

// CHECK: call i1 @llvm.nvvm.vote.all.sync(i32		// CHECK: call i1 @llvm.nvvm.vote.all.sync(i32

clang/test/CodeGen/complex-math.c

	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple x86_64-unknown-unknown -o - \| FileCheck %s --check-prefix=X86			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple x86_64-unknown-unknown -o - \| FileCheck %s --check-prefix=X86
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple x86_64-pc-win64 -o - \| FileCheck %s --check-prefix=X86			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple x86_64-pc-win64 -o - \| FileCheck %s --check-prefix=X86
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple i686-unknown-unknown -o - \| FileCheck %s --check-prefix=X86			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple i686-unknown-unknown -o - \| FileCheck %s --check-prefix=X86
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple powerpc-unknown-unknown -o - \| FileCheck %s --check-prefix=PPC			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple powerpc-unknown-unknown -o - \| FileCheck %s --check-prefix=PPC
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple armv7-none-linux-gnueabi -o - \| FileCheck %s --check-prefix=ARM			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple armv7-none-linux-gnueabi -o - \| FileCheck %s --check-prefix=ARM
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple armv7-none-linux-gnueabihf -o - \| FileCheck %s --check-prefix=ARMHF			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple armv7-none-linux-gnueabihf -o - \| FileCheck %s --check-prefix=ARMHF
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple thumbv7k-apple-watchos2.0 -o - -target-abi aapcs16 \| FileCheck %s --check-prefix=ARM7K			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple thumbv7k-apple-watchos2.0 -o - -target-abi aapcs16 \| FileCheck %s --check-prefix=ARM7K
	// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple aarch64-unknown-unknown -ffast-math -o - \| FileCheck %s --check-prefix=AARCH64-FASTMATH			// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm -triple aarch64-unknown-unknown -ffast-math -ffp-contract=fast -o - \| FileCheck %s --check-prefix=AARCH64-FASTMATH

	float _Complex add_float_rr(float a, float b) {			float _Complex add_float_rr(float a, float b) {
	// X86-LABEL: @add_float_rr(			// X86-LABEL: @add_float_rr(
	// X86: fadd			// X86: fadd
	// X86-NOT: fadd			// X86-NOT: fadd
	// X86: ret			// X86: ret
	return a + b;			return a + b;
	}			}

clang/test/CodeGen/fp-options-to-fast-math-flags.c

This file was added.

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-PRECISE %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -menable-no-nans -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-NO-NANS %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -menable-no-infs -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-NO-INFS %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -ffinite-math-only -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-FINITE %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fno-signed-zeros -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-NO-SIGNED-ZEROS %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -mreassociate -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-REASSOC %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -freciprocal-math -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-RECIP %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -menable-unsafe-fp-math -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-UNSAFE %s
				// RUN: %clang_cc1 -triple x86_64-unknown-unknown -ffast-math -emit-llvm -o - %s \| FileCheck -check-prefix CHECK-FAST %s

				float fn(float);

				float test(float a) {
				return a + fn(a);
				}

				// CHECK-PRECISE: [[CALL_RES:%.+]] = call float @fn(float {{%.+}})
				// CHECK-PRECISE: {{%.+}} = fadd float {{%.+}}, [[CALL_RES]]

				// CHECK-NO-NANS: [[CALL_RES:%.+]] = call nnan float @fn(float {{%.+}})
				// CHECK-NO-NANS: {{%.+}} = fadd nnan float {{%.+}}, [[CALL_RES]]

				// CHECK-NO-INFS: [[CALL_RES:%.+]] = call ninf float @fn(float {{%.+}})
				// CHECK-NO-INFS: {{%.+}} = fadd ninf float {{%.+}}, [[CALL_RES]]

				// CHECK-FINITE: [[CALL_RES:%.+]] = call nnan ninf float @fn(float {{%.+}})
				// CHECK-FINITE: {{%.+}} = fadd nnan ninf float {{%.+}}, [[CALL_RES]]

				// CHECK-NO-SIGNED-ZEROS: [[CALL_RES:%.+]] = call nsz float @fn(float {{%.+}})
				// CHECK-NO-SIGNED-ZEROS: {{%.+}} = fadd nsz float {{%.+}}, [[CALL_RES]]

				// CHECK-REASSOC: [[CALL_RES:%.+]] = call reassoc float @fn(float {{%.+}})
				// CHECK-REASSOC: {{%.+}} = fadd reassoc float {{%.+}}, [[CALL_RES]]

				// CHECK-RECIP: [[CALL_RES:%.+]] = call arcp float @fn(float {{%.+}})
				// CHECK-RECIP: {{%.+}} = fadd arcp float {{%.+}}, [[CALL_RES]]

				// CHECK-UNSAFE: [[CALL_RES:%.+]] = call reassoc nsz arcp afn float @fn(float {{%.+}})
				// CHECK-UNSAFE: {{%.+}} = fadd reassoc nsz arcp afn float {{%.+}}, [[CALL_RES]]

				// CHECK-FAST: [[CALL_RES:%.+]] = call reassoc nnan ninf nsz arcp afn float @fn(float {{%.+}})
				// CHECK-FAST: {{%.+}} = fadd reassoc nnan ninf nsz arcp afn float {{%.+}}, [[CALL_RES]]
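
The RUN lines of this test pin down which CC1 option enables which property. As a rough sketch of the updated mapping in CompilerInvocation.cpp (new side of the diff), the block below models the argument list as a set of option spellings. `FPLangOptsSketch`, `parseFPArgs` and `checkMapping` are hypothetical names, the option strings are the ones used in the RUN lines, and the OpenCL `-cl-*` options are left out for brevity.

```cpp
#include <cassert>
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical mirror of the LangOptions properties touched by this revision.
struct FPLangOptsSketch {
  bool AllowFPReassoc = false;
  bool NoHonorNaNs = false;
  bool NoHonorInfs = false;
  bool NoSignedZero = false;
  bool AllowRecip = false;
  bool ApproxFunc = false;
};

// Each property is the OR of the options that imply it, following the
// right-hand side of the CompilerInvocation.cpp diff (non-OpenCL options only).
FPLangOptsSketch parseFPArgs(const std::set<std::string> &Args) {
  auto HasAny = [&Args](std::initializer_list<const char *> Names) {
    for (const char *Name : Names)
      if (Args.count(Name) != 0)
        return true;
    return false;
  };

  FPLangOptsSketch Opts;
  Opts.AllowFPReassoc =
      HasAny({"-mreassociate", "-menable-unsafe-fp-math", "-ffast-math"});
  Opts.NoHonorNaNs =
      HasAny({"-menable-no-nans", "-ffinite-math-only", "-ffast-math"});
  Opts.NoHonorInfs =
      HasAny({"-menable-no-infs", "-ffinite-math-only", "-ffast-math"});
  Opts.NoSignedZero =
      HasAny({"-fno-signed-zeros", "-menable-unsafe-fp-math", "-ffast-math"});
  Opts.AllowRecip =
      HasAny({"-freciprocal-math", "-menable-unsafe-fp-math", "-ffast-math"});
  Opts.ApproxFunc = HasAny({"-menable-unsafe-fp-math", "-ffast-math"});
  return Opts;
}

// Self-check mirroring the CHECK-UNSAFE and CHECK-FINITE expectations.
void checkMapping() {
  FPLangOptsSketch Unsafe =
      parseFPArgs(std::set<std::string>{"-menable-unsafe-fp-math"});
  assert(Unsafe.AllowFPReassoc && Unsafe.NoSignedZero && Unsafe.AllowRecip &&
         Unsafe.ApproxFunc);
  assert(!Unsafe.NoHonorNaNs && !Unsafe.NoHonorInfs);

  FPLangOptsSketch Finite =
      parseFPArgs(std::set<std::string>{"-ffinite-math-only"});
  assert(Finite.NoHonorNaNs && Finite.NoHonorInfs && !Finite.AllowFPReassoc);
}
```

Listing every implying option per property, rather than deriving one property from another, keeps each line readable on its own and matches the style of the new code in the diff.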

clang/test/CodeGen/libcalls.c

	// RUN: %clang_cc1 -fmath-errno -emit-llvm -o - %s -triple i386-unknown-unknown \| FileCheck -check-prefix CHECK-YES %s			// RUN: %clang_cc1 -fmath-errno -emit-llvm -o - %s -triple i386-unknown-unknown \| FileCheck -check-prefix CHECK-YES %s
	// RUN: %clang_cc1 -emit-llvm -o - %s -triple i386-unknown-unknown \| FileCheck -check-prefix CHECK-NO %s			// RUN: %clang_cc1 -emit-llvm -o - %s -triple i386-unknown-unknown \| FileCheck -check-prefix CHECK-NO %s
	// RUN: %clang_cc1 -menable-unsafe-fp-math -emit-llvm -o - %s -triple i386-unknown-unknown \| FileCheck -check-prefix CHECK-FAST %s			// RUN: %clang_cc1 -menable-unsafe-fp-math -emit-llvm -o - %s -triple i386-unknown-unknown \| FileCheck -check-prefix CHECK-FAST %s

	// CHECK-YES-LABEL: define void @test_sqrt			// CHECK-YES-LABEL: define void @test_sqrt
	// CHECK-NO-LABEL: define void @test_sqrt			// CHECK-NO-LABEL: define void @test_sqrt
	// CHECK-FAST-LABEL: define void @test_sqrt			// CHECK-FAST-LABEL: define void @test_sqrt
	void test_sqrt(float a0, double a1, long double a2) {			void test_sqrt(float a0, double a1, long double a2) {
	// CHECK-YES: call float @sqrtf			// CHECK-YES: call float @sqrtf
	// CHECK-NO: call float @llvm.sqrt.f32(float			// CHECK-NO: call float @llvm.sqrt.f32(float
	// CHECK-FAST: call float @llvm.sqrt.f32(float			// CHECK-FAST: call reassoc nsz arcp afn float @llvm.sqrt.f32(float
	float l0 = sqrtf(a0);			float l0 = sqrtf(a0);

	// CHECK-YES: call double @sqrt			// CHECK-YES: call double @sqrt
	// CHECK-NO: call double @llvm.sqrt.f64(double			// CHECK-NO: call double @llvm.sqrt.f64(double
	// CHECK-FAST: call double @llvm.sqrt.f64(double			// CHECK-FAST: call reassoc nsz arcp afn double @llvm.sqrt.f64(double
	double l1 = sqrt(a1);			double l1 = sqrt(a1);

	// CHECK-YES: call x86_fp80 @sqrtl			// CHECK-YES: call x86_fp80 @sqrtl
	// CHECK-NO: call x86_fp80 @llvm.sqrt.f80(x86_fp80			// CHECK-NO: call x86_fp80 @llvm.sqrt.f80(x86_fp80
	// CHECK-FAST: call x86_fp80 @llvm.sqrt.f80(x86_fp80			// CHECK-FAST: call reassoc nsz arcp afn x86_fp80 @llvm.sqrt.f80(x86_fp80
				michele.scandale: For CUDA the default FP contract mode is `fast`, therefore the `contract` FMF is emitted.
	long double l2 = sqrtl(a2);			long double l2 = sqrtl(a2);
	}			}

	// CHECK-YES: declare float @sqrtf(float)			// CHECK-YES: declare float @sqrtf(float)
	// CHECK-YES: declare double @sqrt(double)			// CHECK-YES: declare double @sqrt(double)
	// CHECK-YES: declare x86_fp80 @sqrtl(x86_fp80)			// CHECK-YES: declare x86_fp80 @sqrtl(x86_fp80)
	// CHECK-NO: declare float @llvm.sqrt.f32(float)			// CHECK-NO: declare float @llvm.sqrt.f32(float)
	// CHECK-NO: declare double @llvm.sqrt.f64(double)			// CHECK-NO: declare double @llvm.sqrt.f64(double)

clang/test/CodeGenCUDA/builtins-amdgcn.cu

	// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -emit-llvm %s -o - \| FileCheck %s			// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -emit-llvm %s -o - \| FileCheck %s
	#include "Inputs/cuda.h"			#include "Inputs/cuda.h"

	// CHECK-LABEL: @_Z16use_dispatch_ptrPi(			// CHECK-LABEL: @_Z16use_dispatch_ptrPi(
	// CHECK: %[[PTR:.*]] = call align 4 dereferenceable(64) i8 addrspace(4)* @llvm.amdgcn.dispatch.ptr()			// CHECK: %[[PTR:.*]] = call align 4 dereferenceable(64) i8 addrspace(4)* @llvm.amdgcn.dispatch.ptr()
	// CHECK: %{{.*}} = addrspacecast i8 addrspace(4)* %[[PTR]] to i8*			// CHECK: %{{.*}} = addrspacecast i8 addrspace(4)* %[[PTR]] to i8*
	__global__ void use_dispatch_ptr(int* out) {			__global__ void use_dispatch_ptr(int* out) {
	const int* dispatch_ptr = (const int*)__builtin_amdgcn_dispatch_ptr();			const int* dispatch_ptr = (const int*)__builtin_amdgcn_dispatch_ptr();
	*out = *dispatch_ptr;			*out = *dispatch_ptr;
	}			}

	// CHECK-LABEL: @_Z12test_ds_fmaxf(			// CHECK-LABEL: @_Z12test_ds_fmaxf(
	// CHECK: call float @llvm.amdgcn.ds.fmax(float addrspace(3)* @_ZZ12test_ds_fmaxfE6shared, float %{{[^,]*}}, i32 0, i32 0, i1 false)			// CHECK: call contract float @llvm.amdgcn.ds.fmax(float addrspace(3)* @_ZZ12test_ds_fmaxfE6shared, float %{{[^,]*}}, i32 0, i32 0, i1 false)
				michele.scandale: For CUDA the default FP contract mode is `fast`, therefore the `contract` FMF is emitted.
	__global__			__global__
	void test_ds_fmax(float src) {			void test_ds_fmax(float src) {
	__shared__ float shared;			__shared__ float shared;
	volatile float x = __builtin_amdgcn_ds_fmaxf(&shared, src, 0, 0, false);			volatile float x = __builtin_amdgcn_ds_fmaxf(&shared, src, 0, 0, false);
	}			}

clang/test/CodeGenCUDA/library-builtin.cu

	// REQUIRES: x86-registered-target			// REQUIRES: x86-registered-target
	// REQUIRES: nvptx-registered-target			// REQUIRES: nvptx-registered-target

	// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \| \			// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \| \
	// RUN: FileCheck --check-prefixes=HOST,BOTH %s			// RUN: FileCheck --check-prefixes=HOST,BOTH %s
	// RUN: %clang_cc1 -fcuda-is-device -triple nvptx64-nvidia-cuda \			// RUN: %clang_cc1 -fcuda-is-device -triple nvptx64-nvidia-cuda \
	// RUN: -emit-llvm -o - %s \| FileCheck %s --check-prefixes=DEVICE,BOTH			// RUN: -emit-llvm -o - %s \| FileCheck %s --check-prefixes=DEVICE,BOTH

	// BOTH-LABEL: define float @logf(float			// BOTH-LABEL: define float @logf(float

	// logf() should be calling itself recursively as we don't have any standard			// logf() should be calling itself recursively as we don't have any standard
	// library on device side.			// library on device side.
	// DEVICE: call float @logf(float			// DEVICE: call contract float @logf(float
	extern "C" __attribute__((device)) float logf(float __x) { return logf(__x); }			extern "C" __attribute__((device)) float logf(float __x) { return logf(__x); }

	// NOTE: this case is to illustrate the expected differences in behavior between			// NOTE: this case is to illustrate the expected differences in behavior between
	// the host and device. In general we do not mess with host-side standard			// the host and device. In general we do not mess with host-side standard
	// library.			// library.
	//			//
	// Host is assumed to have standard library, so logf() calls LLVM intrinsic.			// Host is assumed to have standard library, so logf() calls LLVM intrinsic.
	// HOST: call float @llvm.log.f32(float			// HOST: call contract float @llvm.log.f32(float
	extern "C" float logf(float __x) { return logf(__x); }			extern "C" float logf(float __x) { return logf(__x); }

clang/test/CodeGenOpenCL/relaxed-fpmath.cl

	// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s -check-prefix=NORMAL			// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s -check-prefix=NORMAL
	// RUN: %clang_cc1 %s -emit-llvm -cl-fast-relaxed-math -o - | FileCheck %s -check-prefix=FAST			// RUN: %clang_cc1 %s -emit-llvm -cl-fast-relaxed-math -o - | FileCheck %s -check-prefix=FAST
	// RUN: %clang_cc1 %s -emit-llvm -cl-finite-math-only -o - | FileCheck %s -check-prefix=FINITE			// RUN: %clang_cc1 %s -emit-llvm -cl-finite-math-only -o - | FileCheck %s -check-prefix=FINITE
	// RUN: %clang_cc1 %s -emit-llvm -cl-unsafe-math-optimizations -o - | FileCheck %s -check-prefix=UNSAFE			// RUN: %clang_cc1 %s -emit-llvm -cl-unsafe-math-optimizations -o - | FileCheck %s -check-prefix=UNSAFE
	// RUN: %clang_cc1 %s -emit-llvm -cl-mad-enable -o - | FileCheck %s -check-prefix=MAD			// RUN: %clang_cc1 %s -emit-llvm -cl-mad-enable -o - | FileCheck %s -check-prefix=MAD
	// RUN: %clang_cc1 %s -emit-llvm -cl-no-signed-zeros -o - | FileCheck %s -check-prefix=NOSIGNED			// RUN: %clang_cc1 %s -emit-llvm -cl-no-signed-zeros -o - | FileCheck %s -check-prefix=NOSIGNED

	float spscalardiv(float a, float b) {			float spscalardiv(float a, float b) {
	// CHECK: @spscalardiv(			// CHECK: @spscalardiv(

	// NORMAL: fdiv float			// NORMAL: fdiv float
	// FAST: fdiv fast float			// FAST: fdiv fast float
	// FINITE: fdiv nnan ninf float			// FINITE: fdiv nnan ninf float
	// UNSAFE: fdiv nnan nsz float			// UNSAFE: fdiv reassoc nsz arcp afn float
				michele.scandale (Author): This change is based on the following: `-cl-fast-relaxed-math` = `-cl-unsafe-math-optimizations` + `-cl-finite-math-only`. The GCC option `-funsafe-math-optimizations` and `-cl-unsafe-math-optimizations` are described with very similar wording, and the GCC description explicitly mentions that no signed zeros, reassociation, and reciprocals are enabled, but it does not mention assuming that NaNs do not exist. See https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/clBuildProgram.html and https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
				Anastasia: Makes sense!
	// MAD: fdiv float			// MAD: fdiv float
	// NOSIGNED: fdiv nsz float			// NOSIGNED: fdiv nsz float
	return a / b;			return a / b;
	}			}
	// CHECK: attributes			// CHECK: attributes

	// NORMAL: "less-precise-fpmad"="false"			// NORMAL: "less-precise-fpmad"="false"
	// NORMAL: "no-infs-fp-math"="false"			// NORMAL: "no-infs-fp-math"="false"
	// FINITE: "less-precise-fpmad"="false"			// FINITE: "less-precise-fpmad"="false"
	// FINITE: "no-infs-fp-math"="true"			// FINITE: "no-infs-fp-math"="true"
	// FINITE: "no-nans-fp-math"="true"			// FINITE: "no-nans-fp-math"="true"
	// FINITE: "no-signed-zeros-fp-math"="false"			// FINITE: "no-signed-zeros-fp-math"="false"
	// FINITE: "unsafe-fp-math"="false"			// FINITE: "unsafe-fp-math"="false"

	// UNSAFE: "less-precise-fpmad"="true"			// UNSAFE: "less-precise-fpmad"="true"
	// UNSAFE: "no-infs-fp-math"="false"			// UNSAFE: "no-infs-fp-math"="false"
	// UNSAFE: "no-nans-fp-math"="true"			// UNSAFE: "no-nans-fp-math"="false"
	// UNSAFE: "no-signed-zeros-fp-math"="true"			// UNSAFE: "no-signed-zeros-fp-math"="true"
	// UNSAFE: "unsafe-fp-math"="true"			// UNSAFE: "unsafe-fp-math"="true"

	// MAD: "less-precise-fpmad"="true"			// MAD: "less-precise-fpmad"="true"
	// MAD: "no-infs-fp-math"="false"			// MAD: "no-infs-fp-math"="false"
	// MAD: "no-nans-fp-math"="false"			// MAD: "no-nans-fp-math"="false"
	// MAD: "no-signed-zeros-fp-math"="false"			// MAD: "no-signed-zeros-fp-math"="false"
	// MAD: "unsafe-fp-math"="false"			// MAD: "unsafe-fp-math"="false"

	// NOSIGNED: "less-precise-fpmad"="false"			// NOSIGNED: "less-precise-fpmad"="false"
	// NOSIGNED: "no-infs-fp-math"="false"			// NOSIGNED: "no-infs-fp-math"="false"
	// NOSIGNED: "no-nans-fp-math"="false"			// NOSIGNED: "no-nans-fp-math"="false"
	// NOSIGNED: "no-signed-zeros-fp-math"="true"			// NOSIGNED: "no-signed-zeros-fp-math"="true"
	// NOSIGNED: "unsafe-fp-math"="false"			// NOSIGNED: "unsafe-fp-math"="false"
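The option-to-flag mapping that the updated check lines in relaxed-fpmath.cl expect can be sketched as a small standalone model. This is not the code from the patch: `ClMathOptions`, `FastMathFlags`, `deriveFlags`, and `render` are hypothetical names introduced only for illustration, and `-cl-mad-enable` is left out because the test shows it does not affect the `fdiv` flags. Setting `contract` under `-cl-fast-relaxed-math` is an assumption inferred from the `FAST: fdiv fast float` line, on the understanding that LLVM prints `fast` only when every fast-math flag is set.

```cpp
#include <string>

// Hypothetical model of the OpenCL math options exercised by the test.
// Field names are illustrative and are not Clang's actual option names.
struct ClMathOptions {
  bool fastRelaxedMath = false;          // -cl-fast-relaxed-math
  bool unsafeMathOptimizations = false;  // -cl-unsafe-math-optimizations
  bool finiteMathOnly = false;           // -cl-finite-math-only
  bool noSignedZeros = false;            // -cl-no-signed-zeros
};

// Hypothetical stand-in for the IR-level fast-math flags.
struct FastMathFlags {
  bool reassoc = false, nnan = false, ninf = false, nsz = false;
  bool arcp = false, contract = false, afn = false;
};

FastMathFlags deriveFlags(const ClMathOptions &opts) {
  // -cl-fast-relaxed-math = -cl-unsafe-math-optimizations + -cl-finite-math-only
  const bool unsafeMath = opts.unsafeMathOptimizations || opts.fastRelaxedMath;
  const bool finiteMath = opts.finiteMathOnly || opts.fastRelaxedMath;

  FastMathFlags flags;
  // Unsafe math enables reassociation, reciprocals, approximate functions,
  // and no signed zeros, but does NOT assume that NaNs are absent.
  flags.reassoc = unsafeMath;
  flags.arcp = unsafeMath;
  flags.afn = unsafeMath;
  flags.nsz = unsafeMath || opts.noSignedZeros;
  // Only finite-math-only assumes there are no NaNs or infinities.
  flags.nnan = finiteMath;
  flags.ninf = finiteMath;
  // Assumption: inferred from the FAST check line, not stated in the review.
  flags.contract = opts.fastRelaxedMath;
  return flags;
}

// Renders the flags in the order they appear in the check lines.
std::string render(const FastMathFlags &f) {
  if (f.reassoc && f.nnan && f.ninf && f.nsz && f.arcp && f.contract && f.afn)
    return "fast";
  std::string out;
  auto append = [&out](bool set, const char *name) {
    if (!set)
      return;
    if (!out.empty())
      out += ' ';
    out += name;
  };
  append(f.reassoc, "reassoc");
  append(f.nnan, "nnan");
  append(f.ninf, "ninf");
  append(f.nsz, "nsz");
  append(f.arcp, "arcp");
  append(f.contract, "contract");
  append(f.afn, "afn");
  return out;
}
```

Under this model, enabling only `unsafeMathOptimizations` renders as `reassoc nsz arcp afn`, matching the new UNSAFE line, whereas the old expectation (`nnan nsz`) treated unsafe math as implying the absence of NaNs.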