This is an archive of the discontinued LLVM Phabricator instance.

Differential D39304

[IR] redefine 'reassoc' fast-math-flag and add 'trans' fast-math-flag
ClosedPublic

Authored by spatel on Oct 25 2017, 1:49 PM.

Details

Reviewers

hfinkel
andrew.w.kaylor
wristow
nhaehnle
mehdi_amini
efriedma
anemet
yuyichao

Commits

rG629c41153876: [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast…
rL317488: [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast…

Summary

As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR.

As proposed in the above threads, I've replaced the 'UnsafeAlgebra' bit (which had the 'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic reassociation - 'AllowReassoc'.

I've also added a bit to allow relaxed precision for transcendental functions called 'AllowTrans' (this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), but that's apparently already used for other purposes. Also, I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated. We'll defer movement of FMF to another day.

I'm not sure, but I may be diverging from the last proposal by keeping the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract trans float %op1, %op2
...made me think we want to keep the shortcut synonym.
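
To make the bit bookkeeping above concrete, here is a minimal standalone sketch (not the actual `FastMathFlags` class). The type `FMFSketch`, the helper `withFlag`, and the specific bit positions are assumptions made for illustration; only the seven flag names and the idea that 'fast' is a shortcut for "all flags set" come from the description above.

```cpp
// Standalone sketch; it does not use or mirror the real LLVM headers.
// Bit positions are illustrative assumptions, not taken from the patch.
enum FMFBit : unsigned {
  AllowReassoc    = 1u << 0, // 'reassoc': takes over from the old umbrella bit
  NoNaNs          = 1u << 1, // 'nnan'
  NoInfs          = 1u << 2, // 'ninf'
  NoSignedZeros   = 1u << 3, // 'nsz'
  AllowReciprocal = 1u << 4, // 'arcp'
  AllowContract   = 1u << 5, // 'contract'
  AllowTrans      = 1u << 6, // 'trans': the flag added by this revision
  AllFlags        = (1u << 7) - 1
};

struct FMFSketch {
  unsigned Bits;

  constexpr bool has(FMFBit B) const { return (Bits & B) != 0; }

  // 'fast' is not a bit of its own: it is shorthand for "all seven set".
  constexpr bool isFast() const { return (Bits & AllFlags) == AllFlags; }
};

// Returns a copy of F with flag B added (hypothetical helper).
constexpr FMFSketch withFlag(FMFSketch F, FMFBit B) {
  return FMFSketch{F.Bits | B};
}

// Reassociation alone no longer implies 'fast'.
static_assert(FMFSketch{AllowReassoc}.has(AllowReassoc), "reassoc is set");
static_assert(!FMFSketch{AllowReassoc}.isFast(), "reassoc alone is not fast");
// Setting all seven flags is what the 'fast' keyword abbreviates.
static_assert(FMFSketch{AllFlags}.isFast(), "all flags means fast");
```

Under this model, a pass that only needs reassociation would query `has(AllowReassoc)` rather than `isFast()`, which is the kind of audit the TODO comments are meant to drive.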

I've also gone ahead and renamed the getter/setters, and mechanically added 'TODO' comments where we need to review how the old setUnsafeAlgebra() was used. In some cases, it's obvious that should be translated to setAllowReassoc(), but others may need to be discussed.

Finally, this change is binary incompatible with existing IR as seen in the compatibility tests. I'm hoping this:
"Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR." ( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR version. I.e., I don't think we're ever loosening the FP strictness of existing IR. At worst, we will fail to optimize some previously 'fast' code because it's no longer recognized as 'fast'. This should get fixed as we squash all of the TODO comments.

Event Timeline

spatel created this revision. Oct 25 2017, 1:49 PM

Herald added subscribers: mcrosier, arsenm. Oct 25 2017, 1:49 PM

Patch updated:
I forgot to add assembler tests for the new keywords which revealed that I had forgotten to add 'reassoc' to the parser. Fixed the code and added tests.

efriedma added inline comments. Oct 25 2017, 3:07 PM

docs/LangRef.rst
2305	Could you mark up the LLVM intrinsics affected by the "trans" flag with the meaning with/without the flag? Please clarify the language here to indicate this affects the semantics of both LLVM intrinsics and known library functions. And include a more complete description of what counts as a known library function. And explain what "relaxed" means, given that libm generally doesn't provide correctly rounded versions of transcendental functions. Also, do we want to optimize sqrt() based on this flag? It technically isn't transcendental, but we currently generate an approximation in some cases based on fast-math flags.

spatel added inline comments. Oct 25 2017, 4:12 PM

docs/LangRef.rst
2305	I think we'd include sqrt() in the 'trans' bucket (so maybe 'libm' was the better name). But looking back through the dev thread, I don't see an actual definition of that term or what this flag would map to as a clang command-line param. @hfinkel / @wristow - suggestions?

spatel mentioned this in D39319: Set contract flag when setting unsafe algebra flag. Oct 26 2017, 6:28 AM

D39319 (Set contract flag when setting unsafe algebra flag) proposes to fix a bug introduced with D31164 (Add AllowContract to FastMathFlags). That bug is also fixed here - but I should add that test.

This might suggest that we should split 'trans' into its own small follow-up patch. That would also allow the unsafe TODO fixes to proceed independently of 'trans' in case that takes longer to sort out. Let me know if splitting this patch up seems better.

I don't see a lot of value in having:

// TODO: This should use hasAllowReassoc()?

before every call to isFast. I can grep for isFast more easily than I can grep for the comment. Moreover, if something really needs 'isFast' because the transformation really needs "every possible liberty", then there should definitely be a comment explaining that.

Also, unless we make hasAllowReassoc() also imply hasNoSignedZeros() (as GCC does, FWIW), we'll almost certainly need that check as well in many places. Many need NoNans and NoInfs too.

hfinkel added inline comments.

docs/LangRef.rst
2305	You're correct, sqrt is not a transcendental function (it is an algebraic function because it is the root of a polynomial equation). The problem is that algebraic functions also include things that we don't want to include here (e.g., addition, division). I don't like 'libm' because it refers to a very specific set of functions, but maybe that's the best we can do. The best description I have of this set is: "All of the mathematical operations that generally produce irrational numbers and are not in the set of functions specified by the IEEE specification (e.g., +,-,*,/,%,sqrt)."

wristow added inline comments. Oct 26 2017, 11:42 AM

docs/LangRef.rst
2305	For context, the original suggestion of the flag-name 'libm' was changed to 'trans' based on an observation that OpenCL has an option '-cl-fast-relaxed-math' that includes the semantics "This option also relaxes the precision of commonly used math functions" (see http://man.opencl.org/clCompileProgram.html), and so 'libm' may incorrectly imply it only applies to a more limited set of functions (math operations) only implemented in "libm.a". That led to the suggestion of 'trans', for transcendental functions. I originally liked the 'trans' suggestion, moving away from the libm.a implication. But I do feel that sqrt() should also be optimized on this flag, so as people said, 'trans' isn't perfect either. I don't have a precise formal definition of what things would be controlled by this, but loosely, I'd say it's "All mathematical operations that have runtime library support on many platforms." In practice in LLVM, I think this means many (most?) of the math operations handled by "SimplifyLibCalls.cpp". So for me, in addition to transcendental functions, it would include sqrt(), and even simpler things like fmin() and fmax(). So after giving it more thought, I'm now back to preferring 'libm' over 'trans'. If someone has a better suggestion, that would be great. But I haven't thought of one.

In D39304#907880, @spatel wrote:

...
This might suggest that we should split 'trans' into its own small follow-up patch. That would also allow the unsafe TODO fixes to proceed independently of 'trans' in case that takes longer to sort out. Let me know if splitting this patch up seems better.

I'd like to see the new enumerations of the FastMathFlags bits all defined in one patch. So removing 'UnsafeAlgebra', and adding 'AllowReassoc' and 'AllowTrans' (along with all the changes required since 'UnsafeAlgebra' can no longer be referenced) in the first patch, is my preference. (As an aside, probably 'AllowTrans' will be changed to something like 'AllowMathLib', due to other discussions here.)

In D39304#908340, @wristow wrote:

In D39304#907880, @spatel wrote:

...
This might suggest that we should split 'trans' into its own small follow-up patch. That would also allow the unsafe TODO fixes to proceed independently of 'trans' in case that takes longer to sort out. Let me know if splitting this patch up seems better.

I'd like to see the new enumerations of the FastMathFlags bits all defined in one patch. So removing 'UnsafeAlgebra', and adding 'AllowReassoc' and 'AllowTrans' (along with all the changes required since 'UnsafeAlgebra' can no longer be referenced) in the first patch, is my preference. (As an aside, probably 'AllowTrans' will be changed to something like 'AllowMathLib', due to other discussions here.)

Makes sense to me as well.

wristow added inline comments. Oct 26 2017, 12:15 PM

docs/LangRef.rst
2305	> So after giving it more thought, I'm now back to preferring 'libm' over 'trans'. If someone has a better suggestion, that would be great. But I haven't thought of one.

In the spirit of the flag 'arcp' for 'AllowReciprocal', and the possibility of 'AllowMathLib' for the internal enumeration name, how about 'amlib'? With that, there's no direct implication of "this is only for libm.a operations", or "this is only for transcendental functions". That said, the 'am' part does give it an awkward "morning library" feeling. Maybe 'amathlib' instead?

andrew.w.kaylor added inline comments. Oct 26 2017, 12:25 PM

docs/LangRef.rst
2305	It seems that we're allowing something kind of open-ended here. That is, we don't seem to have an exact set of functions that will be covered. If that's the case then we should probably document it as such -- something like "allow substitution of approximate calculations for functions whose meaning are recognized by the optimizer." And maybe the flag could be "approx".
2308	Since reciprocal is an algebraically-equivalent transformation, this documentation isn't quite correct. Does this enable anything other than reassociation?

hfinkel added inline comments. Oct 26 2017, 12:35 PM

docs/LangRef.rst
2305	I like this suggestion.

wristow added inline comments. Oct 26 2017, 10:20 PM

docs/LangRef.rst
2305	Yes, I think it's open-ended. And I very much like the description "allow substitution of approximate calculations for functions whose meaning are recognized by the optimizer.". I'm less enthusiastic about the flag-name 'approx', although I'm not horribly opposed to it (especially since I cannot come up with anything I really like). On its own, a flag named 'approx' sounds too wide in scope. To me, it sounds like it might be describing all of what is enabled by -ffast-math. In short, it doesn't explicitly convey the concept of it being approximate calculations for functions whose meanings are recognized. To describe my concern "from the other direction", the reciprocal transformation is also an approximation (as is virtually everything else enabled by fast-math), and we don't intend to control the reciprocal transformation via this flag.
2308	Good point. I don't think 'reassoc' intended to enable anything else. Maybe: Allow algebraic reassociation transformations that may dramatically change results in floating point. For that matter, I think 'algebraic' could also be removed, so just "Allow reassociation transformations that..." would work.
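
As a concrete illustration of how reassociation "may dramatically change results in floating point", the sketch below evaluates the same three-term sum under two groupings. The function names are made up for this example, and the checks assume `double` is IEEE-754 binary64 evaluated without extended precision, which is an assumption about the target rather than something stated in this review.

```cpp
// Two groupings of the same sum; 'reassoc' would permit rewriting one
// into the other.
constexpr double sumLeftFirst(double A, double B, double C) {
  return (A + B) + C;
}
constexpr double sumRightFirst(double A, double B, double C) {
  return A + (B + C);
}

// 1e20 and -1e20 cancel exactly, so the small term survives.
static_assert(sumLeftFirst(1e20, -1e20, 1.0) == 1.0,
              "(a + b) + c keeps the 1.0");
// -1e20 + 1.0 rounds back to -1e20 (doubles near 1e20 are spaced 16384
// apart), so the small term is lost before the cancellation happens.
static_assert(sumRightFirst(1e20, -1e20, 1.0) == 0.0,
              "a + (b + c) loses the 1.0");
```

Nothing here involves NaNs, infinities, or signed zeros, so the difference comes from regrouping alone.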

hfinkel added inline comments. Oct 27 2017, 8:11 AM

docs/LangRef.rst
2305	We could call it funcapprox or approxfunc or something like that. We could use something shorter too (apfn perhaps).

wristow added inline comments. Oct 27 2017, 10:36 AM

docs/LangRef.rst
2305	I like it. I'm happy with any of those three. Given the brevity of many of the others (eg, 'arcp', 'nnan'), I lean toward 'apfn'.

Patch updated:

Changed 'AllowTrans' to 'AllowMathLib' and 'trans' to 'aml'. The IR abbreviation is more alphabet soup, but less "awkward morning" than 'amlib' - still open to suggestions :)
Updated LangRef with suggested fixes. Trying to toe the weasel-word line here with enough ambiguity to accomplish transforms but not be completely meaningless.
Removed 'TODO' comments at uses of isFast() / setFast(). We can just grep those calls and fix them.
Added a variant of the 'fast' means all flags test from D39319.

Sadly, I didn't get email notifications for the last couple of name suggestions before I posted the updated patch. I agree that 'ApproxFunc' and 'afn' are better than 'AllowMathLib' and 'aml', so I'll change that unless there are objections or better suggestions.

Patch updated:
'AllowMathLib' --> 'ApproxFunc'
'aml' --> 'afn'

hfinkel added inline comments. Oct 27 2017, 7:08 PM

docs/LangRef.rst
2304	I think that we should essentially use Andrew's proposed definition here: Allow substitution of approximate calculations for functions (e.g., sin, log, sqrt).
10498	This conformance language is only necessary for sqrt. For the other functions, there is no standard for their accuracy/precision. You might say that with 'afn' the result may not match what would have been returned by the system's libm implementation.

Patch updated - edited LangRef phrasing.

hfinkel added inline comments. Oct 30 2017, 3:05 PM

docs/LangRef.rst
10866	I don't expect that afn would affect fma. I recommend removing the statement here. It might be true that fma is computed differently under -ffast-math because of denormal handling, but that applies to everything. The same is true of the conversion functions below (floor, rint, etc.). maxnum/minnum too.

Patch updated:
We don't need the 'afn' warning label on every FP intrinsic. The note is now only applied to these 10:
sqrt, powi, sin, cos, pow, exp, exp2, log, log10, log2

efriedma added inline comments. Oct 30 2017, 5:19 PM

docs/LangRef.rst
10498	We probably want different language here than for the transcendental functions; libm sqrt() is precisely the IEEE754 squareRoot().
10538	Does afn actually do anything here? I think "unspecified sequence of rounding operations" implies it isn't exact anyway. (And __powisf2 isn't really part of libm.)
10574	This language might give the wrong idea. Even without 'afn', the result may not match the target's libm: we constant-fold using a different implementation. The part to call out is that we might substitute an implementation which is less accurate.
10866	We might want to transform fma(a,b,c) to a*b+c in fast-math mode on targets which don't have a native fma instruction? Not sure if that makes sense.

hfinkel added inline comments. Oct 30 2017, 5:48 PM

docs/LangRef.rst
10498	True. On the other hand, the system's libm sqrt should be IEEE compliant, so saying that this differs from the libm result covers that (and also covers other cases, such as PPC long double, which aren't IEEE).
10574	> we constant-fold using a different implementation.

Which, indeed, might be less accurate -- just hopefully not by much ;) I realize that having these notes on the intrinsics seems like it could be helpful, but I'm leaning toward recommending that we don't have them at all.
10866	Good point. This would be relatively easy to implement as well:

afn fma(a, b, c) -> [afn] fmuladd(a, b, c)

fmuladd lowers either to an fma, or not, depending on target preferences.

wristow added inline comments. Oct 30 2017, 6:37 PM

include/llvm/IR/Operator.h
220	One loose end that needs to be taken care of more or less simultaneously is a Clang change. Specifically, the constructor for `CodeGenFunction` (in "CodeGenFunction.cpp") invokes `FastMathFlags::setUnsafeAlgebra()`, so it will need to be changed to `setFast()`.

spatel added inline comments. Oct 31 2017, 6:39 AM

include/llvm/IR/Operator.h
220	This is correct - I didn't post it, but I have that one line patch in place locally, so I was planning to submit it to the clang repo as close as possible after this patch and reference this commit (if there's a way to avoid the build breakage cleanly, please let me know).

Patch updated - just changes to the LangRef text (this now attempts to subsume the similar changes in D28335) :

Add an extra line to the 'sqrt' semantics because that has well-defined behavior...unless the type is not IEEE.
Remove the 'afn' blurb for 'powi' because that's loose by definition.
Add the 'afn' blurb back to 'fma' because that actually could be affected.
Try to soften the existing libm language for all 10 'afn'-affected intrinsics while retaining some shred of accuracy. :)

wristow added inline comments. Oct 31 2017, 7:08 PM

include/llvm/IR/Operator.h
187	We'll need to add flags to `SDNodeFlags` that are analogous to `AllowReassoc` and `ApproxFunc`. Adding them in a separate patch seems fine, but in case the lack of that change in this patch was an oversight, I wanted to raise the point here.
220	I don't know of a clean way. Definitely fine with me to submit it right after this patch is submitted.
lib/Transforms/Scalar/Reassociate.cpp
2018	Very minor point/question: Since the test is no longer `hasUnsafeAlgebra()`, are we OK with the comment still saying `unsafe algebra`? Or do we want to change the comment above to something like: `// Don't optimize floating point instructions that don't have fast-math.` I'm fine leaving it as-is, but I've found these sorts of things in a handful of places, so if we want to change them, I'll look through the patch more thoroughly, and identify each one I find.
test/Assembler/fast-math-flags.ll
97	Is testing the 'reassoc' flag on the 'call' instruction really what you intended here? I would have thought add/sub/mul/div, but 'call' surprised me.

I'm happy; please go ahead when the other reviewers are too.

spatel added inline comments. Nov 2 2017, 7:34 AM

include/llvm/IR/Operator.h
187	I want to fix the backend too, but that's separate patches. For example, we don't propagate the IR flags properly yet (see D37686). Also IIRC, the backend does not treat the global "-enable-unsafe-fp-math" as an umbrella for other flags. So that setting does not imply "-enable-no-nans-fp-math" or other FP relaxations.
lib/Transforms/Scalar/Reassociate.cpp
2018	I'll fix the comment to match the current code. Since this is the reassociation pass, I would guess that 'reassoc' is all we need to enable transforms here, but we'll have to verify that that is correct.
test/Assembler/fast-math-flags.ll
97	It is surprising but intentional. I want to highlight the fact that we allow any FMF on any FPMathOperator. So things like this or 'fadd arcp ...' are legal but I'm not sure how that would be used for optimization. We may want to refine that someday?

wristow added inline comments. Nov 2 2017, 10:55 AM

include/llvm/IR/Operator.h
187	I see. Thanks for clarifying.
lib/Transforms/Scalar/Reassociate.cpp
2018	To be clear, I'm not suggesting that in this patch we change the code here to check for 'reassoc' (i.e., I'm not suggesting we change the `isFast()` call to `hasAllowReassoc()` at this time). I view that as a separate piece of work, where we go through and carefully audit existing FMF-related checks, and decide how to use the more precise flags. Possibly it's just 'reassoc' that is needed for this case, or possibly it's 'reassoc' and some other conditions. All I was suggesting by my comment/question, is that code-comments referring to a no longer-existing "unsafe algebra" umbrella flag, are a bit misleading. So I wondered whether we wanted to change those comments to better match the new implementation. It's a pretty minor point, in my view. From my POV, the main purpose of this patch is to fix the underlying implementation to allow us to go through and do that audit, and fix issues like this "use `isFast()` or use some finer check?" example, here.
test/Assembler/fast-math-flags.ll
97	OK, thanks for explaining.

Patch updated:
Change code comments that reference 'unsafe' to the new 'fast' vocabulary. This is not condoning what may be a wrongful use of 'fast'; it's just trying to keep our code and comments in sync. I didn't go out of my way to look beyond a few lines in the diffs, so there may still be 'unsafe' refs and wrapper functions out there, but we'll squash those as we audit the usage of 'isFast'.

In D39304#915601, @spatel wrote:

Patch updated:
Change code comments that reference 'unsafe' to the new 'fast' vocabulary. ...

That covers my last questions.
LGTM.

This revision is now accepted and ready to land. Nov 3 2017, 2:42 PM

Closed by commit rL317488: [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast… (authored by spatel). Nov 6 2017, 8:28 AM

This revision was automatically updated to reflect the committed changes.

As an LLVM vendor for Apple targets, we have actually run into the incompatibility between the old 'fast' and the current 'fast'. We elected to use version markers in the IR for in-house code; they tell the bitcode reader that the old 'fast' is relevant when it auto-upgrades IR into the new 'fast' format.

In D39304#1001295, @mcberg2017 wrote:

As an LLVM vendor for Apple targets, we have actually run into the incompatibility between the old 'fast' and the current 'fast'. We elected to use version markers in the IR for in-house code; they tell the bitcode reader that the old 'fast' is relevant when it auto-upgrades IR into the new 'fast' format.

Thanks for letting me know. Did this manifest as a performance problem only or was there a visible functional difference too?

Hi Sanjay,

Did this manifest as a performance problem only or was there a visible functional difference too?

I guess it depends how you qualify visible functional difference :).
With the previous layout of the bits, reassociation was performed, but with the new layout, it is not.
The reason is that reassociation checks isFast, which with the new layout is true only if all 7 bits are set, and that doesn't happen when the upgrade path is used (we get only 5 of them).

So this is a performance problem that is exposed by a functional difference in the compiler (an optimization was triggered and now it is not).

Given this new format was not released yet, do you see a way we could make the autoupgrade path to preserve the isFast semantic from the old format to the new format?

I haven't audited the code, but potentially we are going to regress all the code that relied on isFast, and given this is not publicly available yet (unless I am mistaken), I was wondering if there is a way to fix that.

Cheers,
-Quentin
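
The arithmetic behind "we get only 5 of them" can be sketched as follows. The masks assume that an old-format 'fast' value occupies the five low bits and that the new format needs all seven. `upgradeWithMarker` is a hypothetical helper showing the kind of version-marker upgrade mentioned earlier in the thread; it is not the code of any actual patch.

```cpp
// Assumed masks (illustrative): old 'fast' set 5 bits, new 'fast' needs 7.
constexpr unsigned OldFastMask = 0x1F;
constexpr unsigned NewFastMask = 0x7F;

constexpr bool isFastNew(unsigned Bits) {
  return (Bits & NewFastMask) == NewFastMask;
}

// Counts set bits (recursive so it stays valid as C++11 constexpr).
constexpr int countBits(unsigned V) {
  return V == 0 ? 0 : static_cast<int>(V & 1u) + countBits(V >> 1);
}

// Hypothetical upgrade: only when a marker says the bits use the old layout
// is an old 'fast' value widened to the full new mask.
constexpr unsigned upgradeWithMarker(unsigned Bits, bool IsOldLayout) {
  return (IsOldLayout && (Bits & OldFastMask) == OldFastMask) ? NewFastMask
                                                             : Bits;
}

// Read as-is, an old 'fast' value carries 5 flags and fails the new check.
static_assert(countBits(OldFastMask) == 5, "old 'fast' has five bits");
static_assert(!isFastNew(OldFastMask), "not 'fast' in the new layout");
// With the marker, the old value is recognized as 'fast' again.
static_assert(isFastNew(upgradeWithMarker(OldFastMask, true)), "upgraded");
```

Without some marker of the producing version, the reader cannot tell an old 'fast' value from new bitcode that deliberately sets only those five flags, which is why the helper takes the extra argument.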

In D39304#1002599, @qcolombet wrote:

Hi Sanjay,

Did this manifest as a performance problem only or was there a visible functional difference too?

I guess it depends how you qualify visible functional difference :).
With the previous layout of the bits, reassociation was performed, but with the new layout, it is not.
The reason is that reassociation checks isFast, which with the new layout is true only if all 7 bits are set, and that doesn't happen when the upgrade path is used (we get only 5 of them).

So this is a performance problem that is exposed by a functional difference in the compiler (an optimization was triggered and now it is not).

Right - we expected that could happen with existing IR compiled with -ffast-math. I don't think we could go wrong in that scenario given that 'fast' gives us license to do all kinds of transforms, but doesn't require it - although people have different expectations once they get accustomed to the optimizations. :)

Given this new format was not released yet, do you see a way we could make the autoupgrade path to preserve the isFast semantic from the old format to the new format?
I haven't audited the code, but potentially we are going to regress all the code that relied on isFast, and given this is not publicly available yet (unless I am mistaken), I was wondering if there is a way to fix that.

Michael has this patch already, right? I think we have to create a new version of the IR since the bits changed meaning (we can't just flip 'on' new bits).

I have no objection to that, but that use case wasn't important to me, so that's why I didn't bother in this patch. AFAIK, this is new for v6.0, so yes, it's still possible to do this before that window closes.

I think we have to create a new version of the IR since the bits changed meaning (we can't just flip 'on' new bits).

Yeah, exactly.

I have no objection to that, but that use case wasn't important to me, so that's why I didn't bother in this patch. AFAIK, this is new for v6.0, so yes, it's still possible to do this before that window closes.

That would be ideal, but on the other hand, like you said, the new code won't be wrong.
I don't know what it takes to bump the bitcode version nor what are the implications, so maybe that's the right call, but let us have this conversation on LLVM dev.

Let me start a thread on that.

That's http://lists.llvm.org/pipermail/llvm-dev/2018-February/121114.html

BTW, while writing the RFC, I realized that we could potentially generate incorrect code if we were to silently downgrade post-r317488 bitcode with a pre-r317488 compiler (i.e., running fast-math optimizations when we only wanted reassoc).
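
The downgrade hazard can be sketched in the same spirit. This assumes 'reassoc' reuses the bit position that previously meant 'UnsafeAlgebra' (the summary says one bit replaced the other) and models the old umbrella meaning with a hypothetical `readWithOldCompiler`; an actual reader may differ in detail.

```cpp
// Assumption: the new 'reassoc' bit sits where 'UnsafeAlgebra' used to be.
constexpr unsigned SharedBit = 1u << 0;
// Assumption: the flags an old compiler treats as implied by 'UnsafeAlgebra'
// (nnan, ninf, nsz, arcp in bits 1..4).
constexpr unsigned OldImpliedMask = 0x1E;

// Hypothetical model of a pre-r317488 reader: the umbrella bit implies
// the rest.
constexpr unsigned readWithOldCompiler(unsigned Bits) {
  return (Bits & SharedBit) ? (Bits | OldImpliedMask) : Bits;
}

// New bitcode that asks only for reassociation...
constexpr unsigned ReassocOnly = SharedBit;
// ...is seen by the old compiler as having every old fast-math flag.
static_assert(readWithOldCompiler(ReassocOnly) == 0x1F,
              "treated as old 'fast'");
// Flags that don't touch the shared bit are read back unchanged.
static_assert(readWithOldCompiler(0x02) == 0x02, "nnan-only is unaffected");
```

Unlike the upgrade direction, this one loosens FP strictness, so it is a correctness concern rather than a missed optimization.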

spatel mentioned this in rL324967: [InstSimplify] allow exp/log simplifications with only 'reassoc' FMF. Feb 12 2018, 3:53 PM

spatel mentioned this in D43253: bitcode support change for fast flags compatibility. Feb 15 2018, 3:39 PM

aemerson mentioned this in D57359: [GlobalISel] Introduce a G_FSQRT generic instruction. Jan 29 2019, 4:42 PM

Revision Contents

Path                                                 Size
docs/LangRef.rst                                     52 lines
include/llvm/IR/Instruction.h                        24 lines
include/llvm/IR/Operator.h                           113 lines
include/llvm/Transforms/Utils/LoopUtils.h            6 lines
lib/AsmParser/LLLexer.cpp                            2 lines
lib/AsmParser/LLParser.h                             4 lines
lib/AsmParser/LLToken.h                              2 lines
lib/Bitcode/Reader/BitcodeReader.cpp                 6 lines
lib/Bitcode/Writer/BitcodeWriter.cpp                 6 lines
lib/CodeGen/ExpandReductions.cpp                     2 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp     10 lines
lib/IR/AsmWriter.cpp                                 8 lines
lib/IR/Instruction.cpp                               30 lines
lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp           2 lines
lib/Target/AMDGPU/AMDGPULibCalls.cpp                 2 lines
lib/Transforms/InstCombine/InstCombineAddSub.cpp     8 lines
lib/Transforms/InstCombine/InstCombineCalls.cpp      2 lines
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp  11 lines
lib/Transforms/Scalar/Reassociate.cpp                10 lines
lib/Transforms/Utils/LoopUtils.cpp                   10 lines
lib/Transforms/Utils/SimplifyLibCalls.cpp            31 lines
lib/Transforms/Vectorize/LoopVectorize.cpp           6 lines
lib/Transforms/Vectorize/SLPVectorizer.cpp           4 lines
test/Assembler/fast-math-flags.ll                    32 lines
test/Bitcode/compatibility-3.6.ll                    4 lines
test/Bitcode/compatibility-3.7.ll                    4 lines
test/Bitcode/compatibility-3.8.ll                    8 lines
test/Bitcode/compatibility-3.9.ll                    8 lines
test/Bitcode/compatibility-4.0.ll                    8 lines
test/Bitcode/compatibility-5.0.ll                    8 lines
test/Bitcode/compatibility.ll                        4 lines
unittests/IR/IRBuilderTest.cpp                       51 lines

Diff 120909

docs/LangRef.rst

	seq\_cst total orderings of other operations that are not marked			seq\_cst total orderings of other operations that are not marked
	``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.			``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

	.. _fastmath:			.. _fastmath:

	Fast-Math Flags			Fast-Math Flags
	---------------			---------------

	LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,			LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
	:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,			:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
	:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`			:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
	instructions have the following flags that can be set to enable			may use the following flags to enable otherwise unsafe
	otherwise unsafe floating point transformations.			floating-point transformations.

	``nnan``			``nnan``
	No NaNs - Allow optimizations to assume the arguments and result are not			No NaNs - Allow optimizations to assume the arguments and result are not
	NaN. Such optimizations are required to retain defined behavior over			NaN. Such optimizations are required to retain defined behavior over
	NaNs, but the value of the result is undefined.			NaNs, but the value of the result is undefined.

	``ninf``			``ninf``
	No Infs - Allow optimizations to assume the arguments and result are not			No Infs - Allow optimizations to assume the arguments and result are not
	+/-Inf. Such optimizations are required to retain defined behavior over			+/-Inf. Such optimizations are required to retain defined behavior over
	+/-Inf, but the value of the result is undefined.			+/-Inf, but the value of the result is undefined.

	``nsz``			``nsz``
	No Signed Zeros - Allow optimizations to treat the sign of a zero			No Signed Zeros - Allow optimizations to treat the sign of a zero
	argument or result as insignificant.			argument or result as insignificant.

	``arcp``			``arcp``
	Allow Reciprocal - Allow optimizations to use the reciprocal of an			Allow Reciprocal - Allow optimizations to use the reciprocal of an
	argument rather than perform division.			argument rather than perform division.

	``contract``			``contract``
	Allow floating-point contraction (e.g. fusing a multiply followed by an			Allow floating-point contraction (e.g. fusing a multiply followed by an
	addition into a fused multiply-and-add).			addition into a fused multiply-and-add).

				``afn``
				Approximate functions - Allow substitution of approximate calculations for
				functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
				efriedma: Could you mark up the LLVM intrinsics affected by the "trans" flag with the meaning with/without the flag? Please clarify the language here to indicate this affects the semantics of both LLVM intrinsics and known library functions. And include a more complete description of what counts as a known library function. And explain what "relaxed" means, given that libm generally doesn't provide correctly rounded versions of transcendental functions. Also, do we want to optimize sqrt() based on this flag? It technically isn't transcendental, but we currently generate an approximation in some cases based on fast-math flags.
				spatel (author): I think we'd include sqrt() in the 'trans' bucket (so maybe 'libm' was the better name). But looking back through the dev thread, I don't see an actual definition of that term or what this flag would map to as a clang command-line param. @hfinkel / @wristow - suggestions?
				hfinkel: You're correct, sqrt is not a transcendental function (it is an algebraic function because it is the root of a polynomial equation). The problem is that algebraic functions also include things that we don't want to include here (e.g., addition, division). I don't like 'libm' because it refers to a very specific set of functions, but maybe that's the best we can do. The best description I have of this set is: "All of the mathematical operations that generally produce irrational numbers and are not in the set of functions specified by the IEEE specification (e.g., +, -, *, /, %, sqrt)."
				wristow: For context, the original suggestion of the flag-name 'libm' was changed to 'trans' based on an observation that OpenCL has an option '-cl-fast-relaxed-math' that includes the semantics "This option also relaxes the precision of commonly used math functions" (see http://man.opencl.org/clCompileProgram.html), and so 'libm' may incorrectly imply it only applies to a more limited set of functions (math operations) only implemented in "libm.a". That led to the suggestion of 'trans', for transcendental functions. I originally liked the 'trans' suggestion, moving away from the libm.a implication. But I do feel that sqrt() should also be optimized on this flag, so as people said, 'trans' isn't perfect either. I don't have a precise formal definition of what things would be controlled by this, but loosely, I'd say it's "All mathematical operations that have runtime library support on many platforms." In practice in LLVM, I think this means many (most?) of the math operations handled by "SimplifyLibCalls.cpp". So for me, in addition to transcendental functions, it would include sqrt(), and even simpler things like fmin() and fmax(). So after giving it more thought, I'm now back to preferring 'libm' over 'trans'. If someone has a better suggestion, that would be great. But I haven't thought of one.
				wristow (quoting the previous comment): "So after giving it more thought, I'm now back to preferring 'libm' over 'trans'. If someone has a better suggestion, that would be great. But I haven't thought of one." In the spirit of the flag 'arcp' for 'AllowReciprocal', and the possibility of 'AllowMathLib' for the internal enumeration name, how about 'amlib'? With that, there's no direct implication of "this is only for libm.a operations", or "this is only for transcendental functions". That said, the 'am' part does give it an awkward "morning library" feeling. Maybe 'amathlib' instead?
				andrew.w.kaylor: It seems that we're allowing something kind of open-ended here. That is, we don't seem to have an exact set of functions that will be covered. If that's the case then we should probably document it as such -- something like "allow substitution of approximate calculations for functions whose meanings are recognized by the optimizer." And maybe the flag could be "approx".
				hfinkel: I like this suggestion.
				wristow: Yes, I think it's open-ended. And I very much like the description "allow substitution of approximate calculations for functions whose meanings are recognized by the optimizer." I'm less enthusiastic about the flag-name 'approx', although I'm not horribly opposed to it (especially since I cannot come up with anything I really like). On its own, a flag named 'approx' sounds too wide in scope. To me, it sounds like it might be describing all of what is enabled by -ffast-math. In short, it doesn't explicitly convey the concept of it being approximate calculations for functions whose meanings are recognized. To describe my concern "from the other direction", the reciprocal transformation is also an approximation (as is virtually everything else enabled by fast-math), and we don't intend to control the reciprocal transformation via this flag.
				hfinkel: We could call it funcapprox or approxfunc or something like that. We could use something shorter too (apfn perhaps).
				wristow: I like it. I'm happy with any of those three. Given the brevity of many of the others (e.g., 'arcp', 'nnan'), I lean toward 'apfn'.
				for places where this can apply to LLVM's intrinsic math functions.

				``reassoc``
				andrew.w.kaylor: Since reciprocal is an algebraically-equivalent transformation, this documentation isn't quite correct. Does this enable anything other than reassociation?
				wristow: Good point. I don't think 'reassoc' was intended to enable anything else. Maybe: "Allow algebraic reassociation transformations that may dramatically change results in floating point." For that matter, I think 'algebraic' could also be removed, so just "Allow reassociation transformations that..." would work.
				Allow reassociation transformations for floating-point instructions.
				This may dramatically change results in floating point.
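
To make "may dramatically change results" concrete, here is a minimal C++ sketch that is not part of the patch. It evaluates the same three-term sum under two groupings. The function names and the chosen inputs are illustrative assumptions, and `volatile` is used only to keep the compiler from simplifying the arithmetic.

```cpp
#include <cassert>

// Sum three floats as (a + b) + c, rounding after the first add.
inline float sumLeftToRight(float a, float b, float c) {
  volatile float t = a + b;
  return t + c;
}

// Sum the same floats as a + (b + c), i.e. the reassociated form.
inline float sumReassociated(float a, float b, float c) {
  volatile float t = b + c;
  return a + t;
}

// One hand-picked input where the two groupings disagree.
inline void demoReassociation() {
  assert(sumLeftToRight(1e20f, -1e20f, 1.0f) == 1.0f);
  assert(sumReassociated(1e20f, -1e20f, 1.0f) == 0.0f);
}
```

In the second grouping the 1.0f is absorbed when it is added to -1e20f, so the final result is 0.0f instead of 1.0f.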

	``fast``			``fast``
	Fast - Allow algebraically equivalent transformations that may			This flag implies all of the others.
	dramatically change results in floating point (e.g. reassociate). This
	flag implies all the others.

	.. _uselistorder:			.. _uselistorder:

	Use-list Order Directives			Use-list Order Directives
	-------------------------			-------------------------

	Use-list directives encode the in-memory order of each use-list, allowing the			Use-list directives encode the in-memory order of each use-list, allowing the
	order to be recreated. ``<order-indexes>`` is a comma-separated list of			order to be recreated. ``<order-indexes>`` is a comma-separated list of
	▲ Show 20 Lines • Show All 8,167 Lines • ▼ Show 20 Lines
	""""""""""			""""""""""

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the square root of the operand if it is a nonnegative			This function returns the square root of the operand if it is a nonnegative
	floating point number.			floating point number. When specified with the fast-math-flag 'afn', the
				result may not match the system's libm implementation.
				hfinkel: This conformance language is only necessary for sqrt. For the other functions, there is no standard for their accuracy/precision. You might say that with 'afn' the result may not match what would have been returned by the system's libm implementation.
				efriedma: We probably want different language here than for the transcendental functions; libm sqrt() is precisely the IEEE754 squareRoot().
				hfinkel: True. On the other hand, the system's libm sqrt should be IEEE compliant, so saying that this differs from the libm result covers that (and also covers other cases, such as PPC long double, which aren't IEEE).

	'``llvm.powi.*``' Intrinsic			'``llvm.powi.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.powi`` on any			This is an overloaded intrinsic. You can use ``llvm.powi`` on any
	Show All 21 Lines

	The second argument is an integer power, and the first is a value to			The second argument is an integer power, and the first is a value to
	raise to that power.			raise to that power.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the first value raised to the second power with an			This function returns the first value raised to the second power with an
	unspecified sequence of rounding operations.			unspecified sequence of rounding operations. When specified with the
				fast-math-flag 'afn', the result may not match the system's libm
				implementation.
				efriedma: Does afn actually do anything here? I think "unspecified sequence of rounding operations" implies it isn't exact anyway. (And __powisf2 isn't really part of libm.)

	'``llvm.sin.*``' Intrinsic			'``llvm.sin.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.sin`` on any			This is an overloaded intrinsic. You can use ``llvm.sin`` on any
	Show All 18 Lines

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the sine of the specified operand, returning the			This function returns the sine of the specified operand, returning the
	same values as the libm ``sin`` functions would, and handles error			same values as the libm ``sin`` functions would, and handles error
	conditions in the same way.			conditions in the same way. When specified with the fast-math-flag 'afn',
				the result may not match the system's libm implementation.
				efriedma: This language might give the wrong idea. Even without 'afn', the result may not match the target's libm: we constant-fold using a different implementation. The part to call out is that we might substitute an implementation which is less accurate.
				hfinkel (quoting efriedma): "we constant-fold using a different implementation." Which, indeed, might be less accurate -- just hopefully not by much ;) I realize that having these notes on the intrinsics seems like it could be helpful, but I'm leaning toward recommending that we don't have them at all.
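
As a rough illustration of "substitute an implementation which is less accurate", the sketch below compares `std::sin` with a degree-5 Taylor polynomial. The polynomial and the names `approxSin` and `approxSinError` are hypothetical and chosen only for demonstration; nothing in this review says LLVM or any libm uses this particular approximation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cheap stand-in for sin(x): x - x^3/6 + x^5/120.
// Illustrative only; not the approximation used by LLVM or any libm.
inline double approxSin(double x) {
  double x2 = x * x;
  return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0));
}

// Absolute difference between the stand-in and the library result.
inline double approxSinError(double x) {
  return std::fabs(approxSin(x) - std::sin(x));
}

// The stand-in is close near zero and poor far from it.
inline void demoApproxSin() {
  assert(approxSinError(0.5) < 1e-5);
  assert(approxSinError(3.0) > 0.1);
}
```

The point of the sketch is only that an approximation can be acceptable on part of the domain and badly wrong elsewhere, which is why the reviewers want the documentation to say what a flag like 'afn' permits.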

	'``llvm.cos.*``' Intrinsic			'``llvm.cos.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.cos`` on any			This is an overloaded intrinsic. You can use ``llvm.cos`` on any
	Show All 18 Lines

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the cosine of the specified operand, returning the			This function returns the cosine of the specified operand, returning the
	same values as the libm ``cos`` functions would, and handles error			same values as the libm ``cos`` functions would, and handles error
	conditions in the same way.			conditions in the same way. When specified with the fast-math-flag 'afn',
				the result may not match the system's libm implementation.

	'``llvm.pow.*``' Intrinsic			'``llvm.pow.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.pow`` on any			This is an overloaded intrinsic. You can use ``llvm.pow`` on any
	Show All 13 Lines

	The '``llvm.pow.*``' intrinsics return the first operand raised to the			The '``llvm.pow.*``' intrinsics return the first operand raised to the
	specified (positive or negative) power.			specified (positive or negative) power.

	Arguments:			Arguments:
	""""""""""			""""""""""

	The second argument is a floating point power, and the first is a value			The second argument is a floating point power, and the first is a value
	to raise to that power.			to raise to that power. When specified with the fast-math-flag 'afn', the
				result may not match the system's libm implementation.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the first value raised to the second power,			This function returns the first value raised to the second power,
	returning the same values as the libm ``pow`` functions would, and			returning the same values as the libm ``pow`` functions would, and
	handles error conditions in the same way.			handles error conditions in the same way.

	Show All 25 Lines
	""""""""""			""""""""""

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``exp`` functions			This function returns the same values as the libm ``exp`` functions
	would, and handles error conditions in the same way.			would, and handles error conditions in the same way. When specified with the
				fast-math-flag 'afn', the result may not match the system's libm implementation.

	'``llvm.exp2.*``' Intrinsic			'``llvm.exp2.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.exp2`` on any			This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
	Show All 18 Lines
	""""""""""			""""""""""

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``exp2`` functions			This function returns the same values as the libm ``exp2`` functions
	would, and handles error conditions in the same way.			would, and handles error conditions in the same way. When specified with
				the fast-math-flag 'afn', the result may not match the system's libm
				implementation.

	'``llvm.log.*``' Intrinsic			'``llvm.log.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.log`` on any			This is an overloaded intrinsic. You can use ``llvm.log`` on any
	Show All 18 Lines
	""""""""""			""""""""""

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``log`` functions			This function returns the same values as the libm ``log`` functions
	would, and handles error conditions in the same way.			would, and handles error conditions in the same way. When specified with the
				fast-math-flag 'afn', the result may not match the system's libm implementation.

	'``llvm.log10.*``' Intrinsic			'``llvm.log10.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.log10`` on any			This is an overloaded intrinsic. You can use ``llvm.log10`` on any
	Show All 18 Lines
	""""""""""			""""""""""

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``log10`` functions			This function returns the same values as the libm ``log10`` functions
	would, and handles error conditions in the same way.			would, and handles error conditions in the same way. When specified with
				the fast-math-flag 'afn', the result may not match the system's libm
				implementation.

	'``llvm.log2.*``' Intrinsic			'``llvm.log2.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.log2`` on any			This is an overloaded intrinsic. You can use ``llvm.log2`` on any
	Show All 18 Lines
	""""""""""			""""""""""

	The argument and return value are floating point numbers of the same type.			The argument and return value are floating point numbers of the same type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``log2`` functions			This function returns the same values as the libm ``log2`` functions
	would, and handles error conditions in the same way.			would, and handles error conditions in the same way. When specified with the
				fast-math-flag 'afn', the result may not match the system's libm implementation.

	'``llvm.fma.*``' Intrinsic			'``llvm.fma.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.fma`` on any			This is an overloaded intrinsic. You can use ``llvm.fma`` on any
	Show All 19 Lines

	The argument and return value are floating point numbers of the same			The argument and return value are floating point numbers of the same
	type.			type.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``fma`` functions			This function returns the same values as the libm ``fma`` functions
	would, and does not set errno.			would, and does not set errno.
				hfinkel: I don't expect that afn would affect fma. I recommend removing the statement here. It might be true that fma is computed differently under -ffast-math because of denormal handling, but that applies to everything. The same is true of the conversion functions below (floor, rint, etc.). maxnum/minnum too.
				efriedma: We might want to transform fma(a,b,c) to a*b+c in fast-math mode on targets which don't have a native fma instruction? Not sure if that makes sense.
				hfinkel: Good point. This would be relatively easy to implement as well: afn fma(a, b, c) -> [afn] fmuladd(a, b, c). fmuladd lowers either to an fma, or not, depending on target preferences.
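
The numerical difference between a fused multiply-add and a separate multiply followed by an add can be sketched in plain C++. This is not the proposed LLVM transform itself; the helper names and inputs are assumptions picked so that the product rounds differently in the two forms.

```cpp
#include <cassert>
#include <cmath>

// a*b + c with the product rounded to double before the add.
// 'volatile' keeps the compiler from contracting this into a fused op.
inline double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add from the standard library: one rounding at the end.
inline double fusedMulAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}

// (1 + 2^-27) * (1 - 2^-27) = 1 - 2^-54, which rounds to 1.0 as a double.
inline void demoFmaDifference() {
  const double a = 1.0 + std::ldexp(1.0, -27);
  const double b = 1.0 - std::ldexp(1.0, -27);
  assert(mulThenAdd(a, b, -1.0) == 0.0);
  assert(fusedMulAdd(a, b, -1.0) == -std::ldexp(1.0, -54));
}
```

For these inputs the unfused form loses the low-order term entirely, while the fused form keeps it.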

	'``llvm.fabs.*``' Intrinsic			'``llvm.fabs.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	This is an overloaded intrinsic. You can use ``llvm.fabs`` on any			This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
	▲ Show 20 Lines • Show All 3,598 Lines • Show Last 20 Lines

include/llvm/IR/Instruction.h

Show First 20 Lines • Show All 302 Lines • ▼ Show 20 Lines	public:

/// Drops flags that may cause this instruction to evaluate to poison despite		/// Drops flags that may cause this instruction to evaluate to poison despite
/// having non-poison inputs.		/// having non-poison inputs.
void dropPoisonGeneratingFlags();		void dropPoisonGeneratingFlags();

/// Determine whether the exact flag is set.		/// Determine whether the exact flag is set.
bool isExact() const;		bool isExact() const;

/// Set or clear the unsafe-algebra flag on this instruction, which must be an		/// Set or clear all fast-math-flags on this instruction, which must be an
/// operator which supports this flag. See LangRef.html for the meaning of		/// operator which supports this flag. See LangRef.html for the meaning of
/// this flag.		/// this flag.
void setHasUnsafeAlgebra(bool B);		void setFast(bool B);

		/// Set or clear the reassociation flag on this instruction, which must be
		/// an operator which supports this flag. See LangRef.html for the meaning of
		/// this flag.
		void setHasAllowReassoc(bool B);

/// Set or clear the no-nans flag on this instruction, which must be an		/// Set or clear the no-nans flag on this instruction, which must be an
/// operator which supports this flag. See LangRef.html for the meaning of		/// operator which supports this flag. See LangRef.html for the meaning of
/// this flag.		/// this flag.
void setHasNoNaNs(bool B);		void setHasNoNaNs(bool B);

/// Set or clear the no-infs flag on this instruction, which must be an		/// Set or clear the no-infs flag on this instruction, which must be an
/// operator which supports this flag. See LangRef.html for the meaning of		/// operator which supports this flag. See LangRef.html for the meaning of
/// this flag.		/// this flag.
void setHasNoInfs(bool B);		void setHasNoInfs(bool B);

/// Set or clear the no-signed-zeros flag on this instruction, which must be		/// Set or clear the no-signed-zeros flag on this instruction, which must be
/// an operator which supports this flag. See LangRef.html for the meaning of		/// an operator which supports this flag. See LangRef.html for the meaning of
/// this flag.		/// this flag.
void setHasNoSignedZeros(bool B);		void setHasNoSignedZeros(bool B);

/// Set or clear the allow-reciprocal flag on this instruction, which must be		/// Set or clear the allow-reciprocal flag on this instruction, which must be
/// an operator which supports this flag. See LangRef.html for the meaning of		/// an operator which supports this flag. See LangRef.html for the meaning of
/// this flag.		/// this flag.
void setHasAllowReciprocal(bool B);		void setHasAllowReciprocal(bool B);

		/// Set or clear the approximate-math-functions flag on this instruction,
		/// which must be an operator which supports this flag. See LangRef.html for
		/// the meaning of this flag.
		void setHasApproxFunc(bool B);

/// Convenience function for setting multiple fast-math flags on this		/// Convenience function for setting multiple fast-math flags on this
/// instruction, which must be an operator which supports these flags. See		/// instruction, which must be an operator which supports these flags. See
/// LangRef.html for the meaning of these flags.		/// LangRef.html for the meaning of these flags.
void setFastMathFlags(FastMathFlags FMF);		void setFastMathFlags(FastMathFlags FMF);

/// Convenience function for transferring all fast-math flag values to this		/// Convenience function for transferring all fast-math flag values to this
/// instruction, which must be an operator which supports these flags. See		/// instruction, which must be an operator which supports these flags. See
/// LangRef.html for the meaning of these flags.		/// LangRef.html for the meaning of these flags.
void copyFastMathFlags(FastMathFlags FMF);		void copyFastMathFlags(FastMathFlags FMF);

/// Determine whether the unsafe-algebra flag is set.		/// Determine whether all fast-math-flags are set.
bool hasUnsafeAlgebra() const;		bool isFast() const;

		/// Determine whether the allow-reassociation flag is set.
		bool hasAllowReassoc() const;

/// Determine whether the no-NaNs flag is set.		/// Determine whether the no-NaNs flag is set.
bool hasNoNaNs() const;		bool hasNoNaNs() const;

/// Determine whether the no-infs flag is set.		/// Determine whether the no-infs flag is set.
bool hasNoInfs() const;		bool hasNoInfs() const;

/// Determine whether the no-signed-zeros flag is set.		/// Determine whether the no-signed-zeros flag is set.
bool hasNoSignedZeros() const;		bool hasNoSignedZeros() const;

/// Determine whether the allow-reciprocal flag is set.		/// Determine whether the allow-reciprocal flag is set.
bool hasAllowReciprocal() const;		bool hasAllowReciprocal() const;

/// Determine whether the allow-contract flag is set.		/// Determine whether the allow-contract flag is set.
bool hasAllowContract() const;		bool hasAllowContract() const;

		/// Determine whether the approximate-math-functions flag is set.
		bool hasApproxFunc() const;

/// Convenience function for getting all the fast-math flags, which must be an		/// Convenience function for getting all the fast-math flags, which must be an
/// operator which supports these flags. See LangRef.html for the meaning of		/// operator which supports these flags. See LangRef.html for the meaning of
/// these flags.		/// these flags.
FastMathFlags getFastMathFlags() const;		FastMathFlags getFastMathFlags() const;

/// Copy I's fast-math flags		/// Copy I's fast-math flags
void copyFastMathFlags(const Instruction *I);		void copyFastMathFlags(const Instruction *I);

▲ Show 20 Lines • Show All 306 Lines • Show Last 20 Lines

include/llvm/IR/Operator.h

Show First 20 Lines • Show All 157 Lines • ▼ Show 20 Lines

/// Convenience struct for specifying and reasoning about fast-math flags.		/// Convenience struct for specifying and reasoning about fast-math flags.
class FastMathFlags {		class FastMathFlags {
private:		private:
friend class FPMathOperator;		friend class FPMathOperator;

unsigned Flags = 0;		unsigned Flags = 0;

FastMathFlags(unsigned F) : Flags(F) { }		FastMathFlags(unsigned F) {
		// If all 7 bits are set, turn this into -1. If the number of bits grows,
		// this must be updated. This is intended to provide some forward binary
		// compatibility insurance for the meaning of 'fast' in case bits are added.
		if (F == 0x7F) Flags = ~0U;
		else Flags = F;
		}

public:		public:
/// This is how the bits are used in Value::SubclassOptionalData so they		// This is how the bits are used in Value::SubclassOptionalData so they
/// should fit there too.		// should fit there too.
		// WARNING: We're out of space. SubclassOptionalData only has 7 bits. New
		// functionality will require a change in how this information is stored.
enum {		enum {
UnsafeAlgebra = (1 << 0),		AllowReassoc = (1 << 0),
NoNaNs = (1 << 1),		NoNaNs = (1 << 1),
NoInfs = (1 << 2),		NoInfs = (1 << 2),
NoSignedZeros = (1 << 3),		NoSignedZeros = (1 << 3),
AllowReciprocal = (1 << 4),		AllowReciprocal = (1 << 4),
AllowContract = (1 << 5)		AllowContract = (1 << 5),
		ApproxFunc = (1 << 6)
};		};
		wristow: We'll need to add flags to `SDNodeFlags` that are analogous to `AllowReassoc` and `ApproxFunc`. Adding them in a separate patch seems fine, but in case the lack of that change in this patch was an oversight, I wanted to raise the point here.
		spatel (author): I want to fix the backend too, but that's separate patches. For example, we don't propagate the IR flags properly yet (see D37686). Also IIRC, the backend does not treat the global "-enable-unsafe-fp-math" as an umbrella for other flags. So that setting does not imply "-enable-no-nans-fp-math" or other FP relaxations.
		wristow: I see. Thanks for clarifying.

FastMathFlags() = default;		FastMathFlags() = default;

/// Whether any flag is set
bool any() const { return Flags != 0; }		bool any() const { return Flags != 0; }
		bool none() const { return Flags == 0; }
		bool all() const { return Flags == ~0U; }

/// Set all the flags to false
void clear() { Flags = 0; }		void clear() { Flags = 0; }
		void set() { Flags = ~0U; }

/// Flag queries		/// Flag queries
		bool allowReassoc() const { return 0 != (Flags & AllowReassoc); }
bool noNaNs() const { return 0 != (Flags & NoNaNs); }		bool noNaNs() const { return 0 != (Flags & NoNaNs); }
bool noInfs() const { return 0 != (Flags & NoInfs); }		bool noInfs() const { return 0 != (Flags & NoInfs); }
bool noSignedZeros() const { return 0 != (Flags & NoSignedZeros); }		bool noSignedZeros() const { return 0 != (Flags & NoSignedZeros); }
bool allowReciprocal() const { return 0 != (Flags & AllowReciprocal); }		bool allowReciprocal() const { return 0 != (Flags & AllowReciprocal); }
bool allowContract() const { return 0 != (Flags & AllowContract); }		bool allowContract() const { return 0 != (Flags & AllowContract); }
bool unsafeAlgebra() const { return 0 != (Flags & UnsafeAlgebra); }		bool approxFunc() const { return 0 != (Flags & ApproxFunc); }
		/// 'Fast' means all bits are set.
		bool isFast() const { return all(); }

/// Flag setters		/// Flag setters
		void setAllowReassoc() { Flags |= AllowReassoc; }
void setNoNaNs() { Flags |= NoNaNs; }		void setNoNaNs() { Flags |= NoNaNs; }
void setNoInfs() { Flags |= NoInfs; }		void setNoInfs() { Flags |= NoInfs; }
void setNoSignedZeros() { Flags |= NoSignedZeros; }		void setNoSignedZeros() { Flags |= NoSignedZeros; }
void setAllowReciprocal() { Flags |= AllowReciprocal; }		void setAllowReciprocal() { Flags |= AllowReciprocal; }
		// TODO: Change the other set* functions to take a parameter?
void setAllowContract(bool B) {		void setAllowContract(bool B) {
Flags = (Flags & ~AllowContract) | B * AllowContract;		Flags = (Flags & ~AllowContract) | B * AllowContract;
}		}
void setUnsafeAlgebra() {		void setApproxFunc() { Flags |= ApproxFunc; }
Flags |= UnsafeAlgebra;		void setFast() { set(); }
		wristow: One loose end that needs to be taken care of more or less simultaneously is a Clang change. Specifically, the constructor for `CodeGenFunction` (in "CodeGenFunction.cpp") invokes `FastMathFlags::setUnsafeAlgebra()`, so it will need to be changed to `setFast()`.
		spatel (author): This is correct - I didn't post it, but I have that one line patch in place locally, so I was planning to submit it to the clang repo as close as possible after this patch and reference this commit (if there's a way to avoid the build breakage cleanly, please let me know).
		wristow: I don't know of a clean way. Definitely fine with me to submit it right after this patch is submitted.
setNoNaNs();
setNoInfs();
setNoSignedZeros();
setAllowReciprocal();
setAllowContract(true);
}

void operator&=(const FastMathFlags &OtherFlags) {		void operator&=(const FastMathFlags &OtherFlags) {
Flags &= OtherFlags.Flags;		Flags &= OtherFlags.Flags;
}		}
};		};
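
The "all seven bits means all bits" normalization in the new constructor can be modelled in isolation. The class below is a standalone sketch, not the real `llvm::FastMathFlags`: the name `FlagsModel`, the `AllSevenBits` enumerator, and the public raw-bits constructor are assumptions made for demonstration (the patch keeps that constructor private).

```cpp
#include <cassert>

// Standalone model of the flag word; not the real llvm::FastMathFlags.
class FlagsModel {
  unsigned Flags = 0;

public:
  enum : unsigned {
    AllowReassoc = 1u << 0,
    NoNaNs = 1u << 1,
    NoInfs = 1u << 2,
    NoSignedZeros = 1u << 3,
    AllowReciprocal = 1u << 4,
    AllowContract = 1u << 5,
    ApproxFunc = 1u << 6,
    AllSevenBits = 0x7F // hypothetical name for the patch's literal 0x7F
  };

  FlagsModel() = default;

  // Mirrors the patch's constructor: seven set bits are widened to ~0U.
  explicit FlagsModel(unsigned F) : Flags(F == AllSevenBits ? ~0U : F) {}

  bool allowReassoc() const { return (Flags & AllowReassoc) != 0; }
  bool approxFunc() const { return (Flags & ApproxFunc) != 0; }
  bool isFast() const { return Flags == ~0U; } // 'fast' means all bits set
  void setFast() { Flags = ~0U; }
};

// Raw 0x7F is treated as 'fast'; dropping ApproxFunc (0x3F) is not.
inline void demoFlagsModel() {
  assert(FlagsModel(0x7F).isFast());
  assert(!FlagsModel(0x3F).isFast());
}
```

Widening to `~0U` is what the constructor comment calls forward-compatibility insurance: a value stored as 'fast' keeps comparing equal to "all bits set" even if more flag bits are added later.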

/// Utility class for floating point operations which can have		/// Utility class for floating point operations which can have
/// information about relaxed accuracy requirements attached to them.		/// information about relaxed accuracy requirements attached to them.
class FPMathOperator : public Operator {		class FPMathOperator : public Operator {
private:		private:
friend class Instruction;		friend class Instruction;

void setHasUnsafeAlgebra(bool B) {		/// 'Fast' means all bits are set.
SubclassOptionalData =		void setFast(bool B) {
(SubclassOptionalData & ~FastMathFlags::UnsafeAlgebra) |		setHasAllowReassoc(B);
(B * FastMathFlags::UnsafeAlgebra);		setHasNoNaNs(B);
		setHasNoInfs(B);
// Unsafe algebra implies all the others		setHasNoSignedZeros(B);
if (B) {		setHasAllowReciprocal(B);
setHasNoNaNs(true);		setHasAllowContract(B);
setHasNoInfs(true);		setHasApproxFunc(B);
setHasNoSignedZeros(true);
setHasAllowReciprocal(true);
}		}

		void setHasAllowReassoc(bool B) {
		SubclassOptionalData =
		(SubclassOptionalData & ~FastMathFlags::AllowReassoc) |
		(B * FastMathFlags::AllowReassoc);
}		}

void setHasNoNaNs(bool B) {		void setHasNoNaNs(bool B) {
SubclassOptionalData =		SubclassOptionalData =
(SubclassOptionalData & ~FastMathFlags::NoNaNs) |		(SubclassOptionalData & ~FastMathFlags::NoNaNs) |
(B * FastMathFlags::NoNaNs);		(B * FastMathFlags::NoNaNs);
}		}

Show All 16 Lines	private:
}		}

void setHasAllowContract(bool B) {		void setHasAllowContract(bool B) {
SubclassOptionalData =		SubclassOptionalData =
(SubclassOptionalData & ~FastMathFlags::AllowContract) |		(SubclassOptionalData & ~FastMathFlags::AllowContract) |
(B * FastMathFlags::AllowContract);		(B * FastMathFlags::AllowContract);
}		}

		void setHasApproxFunc(bool B) {
		SubclassOptionalData =
		(SubclassOptionalData & ~FastMathFlags::ApproxFunc) |
		(B * FastMathFlags::ApproxFunc);
		}

/// Convenience function for setting multiple fast-math flags.		/// Convenience function for setting multiple fast-math flags.
/// FMF is a mask of the bits to set.		/// FMF is a mask of the bits to set.
void setFastMathFlags(FastMathFlags FMF) {		void setFastMathFlags(FastMathFlags FMF) {
SubclassOptionalData |= FMF.Flags;		SubclassOptionalData |= FMF.Flags;
}		}

/// Convenience function for copying all fast-math flags.		/// Convenience function for copying all fast-math flags.
/// All values in FMF are transferred to this operator.		/// All values in FMF are transferred to this operator.
void copyFastMathFlags(FastMathFlags FMF) {		void copyFastMathFlags(FastMathFlags FMF) {
SubclassOptionalData = FMF.Flags;		SubclassOptionalData = FMF.Flags;
}		}

public:		public:
/// Test whether this operation is permitted to be		/// Test if this operation allows all non-strict floating-point transforms.
/// algebraically transformed, aka the 'A' fast-math property.		bool isFast() const {
bool hasUnsafeAlgebra() const {		return ((SubclassOptionalData & FastMathFlags::AllowReassoc) != 0 &&
return (SubclassOptionalData & FastMathFlags::UnsafeAlgebra) != 0;		(SubclassOptionalData & FastMathFlags::NoNaNs) != 0 &&
		(SubclassOptionalData & FastMathFlags::NoInfs) != 0 &&
		(SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0 &&
		(SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0 &&
		(SubclassOptionalData & FastMathFlags::AllowContract) != 0 &&
		(SubclassOptionalData & FastMathFlags::ApproxFunc) != 0);
		}

		/// Test if this operation may be simplified with reassociative transforms.
		bool hasAllowReassoc() const {
		return (SubclassOptionalData & FastMathFlags::AllowReassoc) != 0;
}		}

/// Test whether this operation's arguments and results are to be		/// Test if this operation's arguments and results are assumed not-NaN.
/// treated as non-NaN, aka the 'N' fast-math property.
bool hasNoNaNs() const {		bool hasNoNaNs() const {
return (SubclassOptionalData & FastMathFlags::NoNaNs) != 0;		return (SubclassOptionalData & FastMathFlags::NoNaNs) != 0;
}		}

/// Test whether this operation's arguments and results are to be		/// Test if this operation's arguments and results are assumed not-infinite.
/// treated as NoN-Inf, aka the 'I' fast-math property.
bool hasNoInfs() const {		bool hasNoInfs() const {
return (SubclassOptionalData & FastMathFlags::NoInfs) != 0;		return (SubclassOptionalData & FastMathFlags::NoInfs) != 0;
}		}

/// Test whether this operation can treat the sign of zero		/// Test if this operation can ignore the sign of zero.
/// as insignificant, aka the 'S' fast-math property.
bool hasNoSignedZeros() const {		bool hasNoSignedZeros() const {
return (SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0;		return (SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0;
}		}

/// Test whether this operation is permitted to use		/// Test if this operation can use reciprocal multiply instead of division.
/// reciprocal instead of division, aka the 'R' fast-math property.
bool hasAllowReciprocal() const {		bool hasAllowReciprocal() const {
return (SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0;		return (SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0;
}		}

/// Test whether this operation is permitted to		/// Test if this operation can be floating-point contracted (FMA).
/// be floating-point contracted.
bool hasAllowContract() const {		bool hasAllowContract() const {
return (SubclassOptionalData & FastMathFlags::AllowContract) != 0;		return (SubclassOptionalData & FastMathFlags::AllowContract) != 0;
}		}

		/// Test if this operation allows approximations of math library functions or
		/// intrinsics.
		bool hasApproxFunc() const {
		return (SubclassOptionalData & FastMathFlags::ApproxFunc) != 0;
		}

/// Convenience function for getting all the fast-math flags		/// Convenience function for getting all the fast-math flags
FastMathFlags getFastMathFlags() const {		FastMathFlags getFastMathFlags() const {
return FastMathFlags(SubclassOptionalData);		return FastMathFlags(SubclassOptionalData);
}		}

/// Get the maximum error permitted by this operation in ULPs. An accuracy of		/// Get the maximum error permitted by this operation in ULPs. An accuracy of
/// 0.0 means that the operation should be performed with the default		/// 0.0 means that the operation should be performed with the default
/// precision.		/// precision.
▲ Show 20 Lines • Show All 213 Lines • Show Last 20 Lines
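
The setter and query changes in `Operator.h` can be summarized with a small standalone sketch. It is not the LLVM class: `FlagWord` is a hypothetical stand-in for `SubclassOptionalData`, and the bit positions are assumed for illustration only. It shows the clear-then-set idiom used by each `setHas...` method and the new meaning of 'fast' as "every flag is set" rather than a bit of its own.

```cpp
// Illustrative flag bits. The positions are an assumption made for this
// sketch; they are not copied from the patch.
enum : unsigned {
  AllowReassoc    = 1u << 0,
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6,
  AllFlags        = (1u << 7) - 1
};

// Hypothetical stand-in for an operator's SubclassOptionalData word.
struct FlagWord {
  unsigned Data = 0;

  // Clear the bit(s), then OR them back in only when B is true.
  // B * Flag evaluates to either 0 or Flag.
  void set(unsigned Flag, bool B) { Data = (Data & ~Flag) | (B * Flag); }

  bool has(unsigned Flag) const { return (Data & Flag) != 0; }

  // 'fast' has no bit of its own: it sets or clears every flag.
  void setFast(bool B) { set(AllFlags, B); }

  // 'fast' holds only when every individual flag is set.
  bool isFast() const { return (Data & AllFlags) == AllFlags; }
};
```

One consequence visible in the diff: the old `setHasUnsafeAlgebra(false)` cleared only the umbrella bit, whereas `setFast(false)` passes `B` to every setter and so clears all of the flags.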

include/llvm/Transforms/Utils/LoopUtils.h

Show First 20 Lines • Show All 325 Lines • ▼ Show 20 Lines	public:
static bool isInductionPHI(PHINode *Phi, const Loop *L,		static bool isInductionPHI(PHINode *Phi, const Loop *L,
PredicatedScalarEvolution &PSE,		PredicatedScalarEvolution &PSE,
InductionDescriptor &D, bool Assume = false);		InductionDescriptor &D, bool Assume = false);

/// Returns true if the induction type is FP and the binary operator does		/// Returns true if the induction type is FP and the binary operator does
/// not have the "fast-math" property. Such operation requires a relaxed FP		/// not have the "fast-math" property. Such operation requires a relaxed FP
/// mode.		/// mode.
bool hasUnsafeAlgebra() {		bool hasUnsafeAlgebra() {
return InductionBinOp &&		return InductionBinOp && !cast<FPMathOperator>(InductionBinOp)->isFast();
!cast<FPMathOperator>(InductionBinOp)->hasUnsafeAlgebra();
}		}

/// Returns induction operator that does not have "fast-math" property		/// Returns induction operator that does not have "fast-math" property
/// and requires FP unsafe mode.		/// and requires FP unsafe mode.
Instruction *getUnsafeAlgebraInst() {		Instruction *getUnsafeAlgebraInst() {
if (!InductionBinOp ||		if (!InductionBinOp || cast<FPMathOperator>(InductionBinOp)->isFast())
cast<FPMathOperator>(InductionBinOp)->hasUnsafeAlgebra())
return nullptr;		return nullptr;
return InductionBinOp;		return InductionBinOp;
}		}

/// Returns binary opcode of the induction operator.		/// Returns binary opcode of the induction operator.
Instruction::BinaryOps getInductionOpcode() const {		Instruction::BinaryOps getInductionOpcode() const {
return InductionBinOp ? InductionBinOp->getOpcode() :		return InductionBinOp ? InductionBinOp->getOpcode() :
Instruction::BinaryOpsEnd;		Instruction::BinaryOpsEnd;
▲ Show 20 Lines • Show All 212 Lines • Show Last 20 Lines

lib/AsmParser/LLLexer.cpp

Show First 20 Lines • Show All 546 Lines • ▼ Show 20 Lines	#define KEYWORD(STR) \
KEYWORD(seq_cst);		KEYWORD(seq_cst);
KEYWORD(syncscope);		KEYWORD(syncscope);

KEYWORD(nnan);		KEYWORD(nnan);
KEYWORD(ninf);		KEYWORD(ninf);
KEYWORD(nsz);		KEYWORD(nsz);
KEYWORD(arcp);		KEYWORD(arcp);
KEYWORD(contract);		KEYWORD(contract);
		KEYWORD(reassoc);
		KEYWORD(afn);
KEYWORD(fast);		KEYWORD(fast);
KEYWORD(nuw);		KEYWORD(nuw);
KEYWORD(nsw);		KEYWORD(nsw);
KEYWORD(exact);		KEYWORD(exact);
KEYWORD(inbounds);		KEYWORD(inbounds);
KEYWORD(inrange);		KEYWORD(inrange);
KEYWORD(align);		KEYWORD(align);
KEYWORD(addrspace);		KEYWORD(addrspace);
▲ Show 20 Lines • Show All 463 Lines • Show Last 20 Lines

lib/AsmParser/LLParser.h

Show First 20 Lines • Show All 187 Lines • ▼ Show 20 Lines	bool EatIfPresent(lltok::Kind T) {
Lex.Lex();		Lex.Lex();
return true;		return true;
}		}

FastMathFlags EatFastMathFlagsIfPresent() {		FastMathFlags EatFastMathFlagsIfPresent() {
FastMathFlags FMF;		FastMathFlags FMF;
while (true)		while (true)
switch (Lex.getKind()) {		switch (Lex.getKind()) {
case lltok::kw_fast: FMF.setUnsafeAlgebra(); Lex.Lex(); continue;		case lltok::kw_fast: FMF.setFast(); Lex.Lex(); continue;
case lltok::kw_nnan: FMF.setNoNaNs(); Lex.Lex(); continue;		case lltok::kw_nnan: FMF.setNoNaNs(); Lex.Lex(); continue;
case lltok::kw_ninf: FMF.setNoInfs(); Lex.Lex(); continue;		case lltok::kw_ninf: FMF.setNoInfs(); Lex.Lex(); continue;
case lltok::kw_nsz: FMF.setNoSignedZeros(); Lex.Lex(); continue;		case lltok::kw_nsz: FMF.setNoSignedZeros(); Lex.Lex(); continue;
case lltok::kw_arcp: FMF.setAllowReciprocal(); Lex.Lex(); continue;		case lltok::kw_arcp: FMF.setAllowReciprocal(); Lex.Lex(); continue;
case lltok::kw_contract:		case lltok::kw_contract:
FMF.setAllowContract(true);		FMF.setAllowContract(true);
Lex.Lex();		Lex.Lex();
continue;		continue;
		case lltok::kw_reassoc: FMF.setAllowReassoc(); Lex.Lex(); continue;
		case lltok::kw_afn: FMF.setApproxFunc(); Lex.Lex(); continue;
default: return FMF;		default: return FMF;
}		}
return FMF;		return FMF;
}		}

bool ParseOptionalToken(lltok::Kind T, bool &Present,		bool ParseOptionalToken(lltok::Kind T, bool &Present,
LocTy *Loc = nullptr) {		LocTy *Loc = nullptr) {
if (Lex.getKind() != T) {		if (Lex.getKind() != T) {
▲ Show 20 Lines • Show All 312 Lines • Show Last 20 Lines
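
As a rough illustration of what `EatFastMathFlagsIfPresent` does after this change, the sketch below consumes leading fast-math keywords from a token list and stops at the first token that is not one. The function name `eatFastMathFlags`, the string-based tokens, and the bit positions are all assumptions for this example; the real parser works on `lltok::Kind` values from the lexer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative flag bits (positions assumed, not taken from the patch).
enum : unsigned {
  AllowReassoc    = 1u << 0,
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6,
  AllFlags        = (1u << 7) - 1
};

// Hypothetical helper: accumulate flags for each leading keyword in Toks,
// advancing Pos past every keyword consumed.
unsigned eatFastMathFlags(const std::vector<std::string> &Toks,
                          std::size_t &Pos) {
  unsigned Flags = 0;
  for (; Pos < Toks.size(); ++Pos) {
    const std::string &T = Toks[Pos];
    if (T == "fast")          Flags |= AllFlags;   // shorthand for all flags
    else if (T == "nnan")     Flags |= NoNaNs;
    else if (T == "ninf")     Flags |= NoInfs;
    else if (T == "nsz")      Flags |= NoSignedZeros;
    else if (T == "arcp")     Flags |= AllowReciprocal;
    else if (T == "contract") Flags |= AllowContract;
    else if (T == "reassoc")  Flags |= AllowReassoc; // new keyword
    else if (T == "afn")      Flags |= ApproxFunc;   // new keyword
    else break; // first non-flag token ends the run; Pos is left on it
  }
  return Flags;
}
```

Because the loop keeps going until a non-flag token appears, the keywords may come in any order and may repeat without changing the result.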

lib/AsmParser/LLToken.h

Show First 20 Lines • Show All 96 Lines • ▼ Show 20 Lines	enum Kind {
kw_acq_rel,		kw_acq_rel,
kw_seq_cst,		kw_seq_cst,
kw_syncscope,		kw_syncscope,
kw_nnan,		kw_nnan,
kw_ninf,		kw_ninf,
kw_nsz,		kw_nsz,
kw_arcp,		kw_arcp,
kw_contract,		kw_contract,
		kw_reassoc,
		kw_afn,
kw_fast,		kw_fast,
kw_nuw,		kw_nuw,
kw_nsw,		kw_nsw,
kw_exact,		kw_exact,
kw_inbounds,		kw_inbounds,
kw_inrange,		kw_inrange,
kw_align,		kw_align,
kw_addrspace,		kw_addrspace,
▲ Show 20 Lines • Show All 264 Lines • Show Last 20 Lines

lib/Bitcode/Reader/BitcodeReader.cpp

Show First 20 Lines • Show All 1,038 Lines • ▼ Show 20 Lines	case bitc::COMDAT_SELECTION_KIND_NO_DUPLICATES:
return Comdat::NoDuplicates;		return Comdat::NoDuplicates;
case bitc::COMDAT_SELECTION_KIND_SAME_SIZE:		case bitc::COMDAT_SELECTION_KIND_SAME_SIZE:
return Comdat::SameSize;		return Comdat::SameSize;
}		}
}		}

static FastMathFlags getDecodedFastMathFlags(unsigned Val) {		static FastMathFlags getDecodedFastMathFlags(unsigned Val) {
FastMathFlags FMF;		FastMathFlags FMF;
if (0 != (Val & FastMathFlags::UnsafeAlgebra))		if (0 != (Val & FastMathFlags::AllowReassoc))
FMF.setUnsafeAlgebra();		FMF.setAllowReassoc();
if (0 != (Val & FastMathFlags::NoNaNs))		if (0 != (Val & FastMathFlags::NoNaNs))
FMF.setNoNaNs();		FMF.setNoNaNs();
if (0 != (Val & FastMathFlags::NoInfs))		if (0 != (Val & FastMathFlags::NoInfs))
FMF.setNoInfs();		FMF.setNoInfs();
if (0 != (Val & FastMathFlags::NoSignedZeros))		if (0 != (Val & FastMathFlags::NoSignedZeros))
FMF.setNoSignedZeros();		FMF.setNoSignedZeros();
if (0 != (Val & FastMathFlags::AllowReciprocal))		if (0 != (Val & FastMathFlags::AllowReciprocal))
FMF.setAllowReciprocal();		FMF.setAllowReciprocal();
if (0 != (Val & FastMathFlags::AllowContract))		if (0 != (Val & FastMathFlags::AllowContract))
FMF.setAllowContract(true);		FMF.setAllowContract(true);
		if (0 != (Val & FastMathFlags::ApproxFunc))
		FMF.setApproxFunc();
return FMF;		return FMF;
}		}

static void upgradeDLLImportExportLinkage(GlobalValue *GV, unsigned Val) {		static void upgradeDLLImportExportLinkage(GlobalValue *GV, unsigned Val) {
switch (Val) {		switch (Val) {
case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break;		case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break;
case 6: GV->setDLLStorageClass(GlobalValue::DLLExportStorageClass); break;		case 6: GV->setDLLStorageClass(GlobalValue::DLLExportStorageClass); break;
}		}
▲ Show 20 Lines • Show All 4,760 Lines • Show Last 20 Lines

lib/Bitcode/Writer/BitcodeWriter.cpp

Show First 20 Lines • Show All 1,313 Lines • ▼ Show 20 Lines	if (const auto *OBO = dyn_cast<OverflowingBinaryOperator>(V)) {
if (OBO->hasNoSignedWrap())		if (OBO->hasNoSignedWrap())
Flags |= 1 << bitc::OBO_NO_SIGNED_WRAP;		Flags |= 1 << bitc::OBO_NO_SIGNED_WRAP;
if (OBO->hasNoUnsignedWrap())		if (OBO->hasNoUnsignedWrap())
Flags |= 1 << bitc::OBO_NO_UNSIGNED_WRAP;		Flags |= 1 << bitc::OBO_NO_UNSIGNED_WRAP;
} else if (const auto *PEO = dyn_cast<PossiblyExactOperator>(V)) {		} else if (const auto *PEO = dyn_cast<PossiblyExactOperator>(V)) {
if (PEO->isExact())		if (PEO->isExact())
Flags |= 1 << bitc::PEO_EXACT;		Flags |= 1 << bitc::PEO_EXACT;
} else if (const auto *FPMO = dyn_cast<FPMathOperator>(V)) {		} else if (const auto *FPMO = dyn_cast<FPMathOperator>(V)) {
if (FPMO->hasUnsafeAlgebra())		if (FPMO->hasAllowReassoc())
Flags |= FastMathFlags::UnsafeAlgebra;		Flags |= FastMathFlags::AllowReassoc;
if (FPMO->hasNoNaNs())		if (FPMO->hasNoNaNs())
Flags |= FastMathFlags::NoNaNs;		Flags |= FastMathFlags::NoNaNs;
if (FPMO->hasNoInfs())		if (FPMO->hasNoInfs())
Flags |= FastMathFlags::NoInfs;		Flags |= FastMathFlags::NoInfs;
if (FPMO->hasNoSignedZeros())		if (FPMO->hasNoSignedZeros())
Flags |= FastMathFlags::NoSignedZeros;		Flags |= FastMathFlags::NoSignedZeros;
if (FPMO->hasAllowReciprocal())		if (FPMO->hasAllowReciprocal())
Flags |= FastMathFlags::AllowReciprocal;		Flags |= FastMathFlags::AllowReciprocal;
if (FPMO->hasAllowContract())		if (FPMO->hasAllowContract())
Flags |= FastMathFlags::AllowContract;		Flags |= FastMathFlags::AllowContract;
		if (FPMO->hasApproxFunc())
		Flags |= FastMathFlags::ApproxFunc;
}		}

return Flags;		return Flags;
}		}

void ModuleBitcodeWriter::writeValueAsMetadata(		void ModuleBitcodeWriter::writeValueAsMetadata(
const ValueAsMetadata *MD, SmallVectorImpl<uint64_t> &Record) {		const ValueAsMetadata *MD, SmallVectorImpl<uint64_t> &Record) {
// Mimic an MDNode with a value as one operand.		// Mimic an MDNode with a value as one operand.
▲ Show 20 Lines • Show All 2,901 Lines • Show Last 20 Lines

lib/CodeGen/ExpandReductions.cpp

Show First 20 Lines • Show All 89 Lines • ▼ Show 20 Lines	for (auto *II : Worklist) {
auto MRK = RecurrenceDescriptor::MRK_Invalid;		auto MRK = RecurrenceDescriptor::MRK_Invalid;
switch (ID) {		switch (ID) {
case Intrinsic::experimental_vector_reduce_fadd:		case Intrinsic::experimental_vector_reduce_fadd:
case Intrinsic::experimental_vector_reduce_fmul:		case Intrinsic::experimental_vector_reduce_fmul:
// FMFs must be attached to the call, otherwise it's an ordered reduction		// FMFs must be attached to the call, otherwise it's an ordered reduction
// and it can't be handled by generating this shuffle sequence.		// and it can't be handled by generating this shuffle sequence.
// TODO: Implement scalarization of ordered reductions here for targets		// TODO: Implement scalarization of ordered reductions here for targets
// without native support.		// without native support.
if (!II->getFastMathFlags().unsafeAlgebra())		if (!II->getFastMathFlags().isFast())
continue;		continue;
Vec = II->getArgOperand(1);		Vec = II->getArgOperand(1);
break;		break;
case Intrinsic::experimental_vector_reduce_add:		case Intrinsic::experimental_vector_reduce_add:
case Intrinsic::experimental_vector_reduce_mul:		case Intrinsic::experimental_vector_reduce_mul:
case Intrinsic::experimental_vector_reduce_and:		case Intrinsic::experimental_vector_reduce_and:
case Intrinsic::experimental_vector_reduce_or:		case Intrinsic::experimental_vector_reduce_or:
case Intrinsic::experimental_vector_reduce_xor:		case Intrinsic::experimental_vector_reduce_xor:
▲ Show 20 Lines • Show All 61 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 2,579 Lines • ▼ Show 20 Lines	static bool isVectorReductionOp(const User *I) {
case Instruction::Mul:		case Instruction::Mul:
case Instruction::And:		case Instruction::And:
case Instruction::Or:		case Instruction::Or:
case Instruction::Xor:		case Instruction::Xor:
break;		break;
case Instruction::FAdd:		case Instruction::FAdd:
case Instruction::FMul:		case Instruction::FMul:
if (const FPMathOperator *FPOp = dyn_cast<const FPMathOperator>(Inst))		if (const FPMathOperator *FPOp = dyn_cast<const FPMathOperator>(Inst))
if (FPOp->getFastMathFlags().unsafeAlgebra())		if (FPOp->getFastMathFlags().isFast())
break;		break;
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
default:		default:
return false;		return false;
}		}

unsigned ElemNum = Inst->getType()->getVectorNumElements();		unsigned ElemNum = Inst->getType()->getVectorNumElements();
unsigned ElemNumToReduce = ElemNum;		unsigned ElemNumToReduce = ElemNum;
Show All 29 Lines	while (!UsersToVisit.empty()) {

for (const auto &U : User->users()) {		for (const auto &U : User->users()) {
auto Inst = dyn_cast<Instruction>(U);		auto Inst = dyn_cast<Instruction>(U);
if (!Inst)		if (!Inst)
return false;		return false;

if (Inst->getOpcode() == OpCode || isa<PHINode>(U)) {		if (Inst->getOpcode() == OpCode || isa<PHINode>(U)) {
if (const FPMathOperator *FPOp = dyn_cast<const FPMathOperator>(Inst))		if (const FPMathOperator *FPOp = dyn_cast<const FPMathOperator>(Inst))
if (!isa<PHINode>(FPOp) && !FPOp->getFastMathFlags().unsafeAlgebra())		if (!isa<PHINode>(FPOp) && !FPOp->getFastMathFlags().isFast())
return false;		return false;
UsersToVisit.push_back(U);		UsersToVisit.push_back(U);
} else if (const ShuffleVectorInst *ShufInst =		} else if (const ShuffleVectorInst *ShufInst =
dyn_cast<ShuffleVectorInst>(U)) {		dyn_cast<ShuffleVectorInst>(U)) {
// Detect the following pattern: A ShuffleVector instruction together		// Detect the following pattern: A ShuffleVector instruction together
// with a reduction that do partial reduction on the first and second		// with a reduction that do partial reduction on the first and second
// ElemNumToReduce / 2 elements, and store the result in		// ElemNumToReduce / 2 elements, and store the result in
// ElemNumToReduce / 2 elements in another vector.		// ElemNumToReduce / 2 elements in another vector.
▲ Show 20 Lines • Show All 77 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitBinary(const User &I, unsigned OpCode) {
Flags.setNoSignedWrap(nsw);		Flags.setNoSignedWrap(nsw);
Flags.setNoUnsignedWrap(nuw);		Flags.setNoUnsignedWrap(nuw);
Flags.setVectorReduction(vec_redux);		Flags.setVectorReduction(vec_redux);
Flags.setAllowReciprocal(FMF.allowReciprocal());		Flags.setAllowReciprocal(FMF.allowReciprocal());
Flags.setAllowContract(FMF.allowContract());		Flags.setAllowContract(FMF.allowContract());
Flags.setNoInfs(FMF.noInfs());		Flags.setNoInfs(FMF.noInfs());
Flags.setNoNaNs(FMF.noNaNs());		Flags.setNoNaNs(FMF.noNaNs());
Flags.setNoSignedZeros(FMF.noSignedZeros());		Flags.setNoSignedZeros(FMF.noSignedZeros());
Flags.setUnsafeAlgebra(FMF.unsafeAlgebra());		Flags.setUnsafeAlgebra(FMF.isFast());

SDValue BinNodeValue = DAG.getNode(OpCode, getCurSDLoc(), Op1.getValueType(),		SDValue BinNodeValue = DAG.getNode(OpCode, getCurSDLoc(), Op1.getValueType(),
Op1, Op2, Flags);		Op1, Op2, Flags);
setValue(&I, BinNodeValue);		setValue(&I, BinNodeValue);
}		}

void SelectionDAGBuilder::visitShift(const User &I, unsigned Opcode) {		void SelectionDAGBuilder::visitShift(const User &I, unsigned Opcode) {
SDValue Op1 = getValue(I.getOperand(0));		SDValue Op1 = getValue(I.getOperand(0));
▲ Show 20 Lines • Show All 5,217 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitVectorReduce(const CallInst &I,
FastMathFlags FMF;		FastMathFlags FMF;
if (isa<FPMathOperator>(I))		if (isa<FPMathOperator>(I))
FMF = I.getFastMathFlags();		FMF = I.getFastMathFlags();
SDNodeFlags SDFlags;		SDNodeFlags SDFlags;
SDFlags.setNoNaNs(FMF.noNaNs());		SDFlags.setNoNaNs(FMF.noNaNs());

switch (Intrinsic) {		switch (Intrinsic) {
case Intrinsic::experimental_vector_reduce_fadd:		case Intrinsic::experimental_vector_reduce_fadd:
if (FMF.unsafeAlgebra())		if (FMF.isFast())
Res = DAG.getNode(ISD::VECREDUCE_FADD, dl, VT, Op2);		Res = DAG.getNode(ISD::VECREDUCE_FADD, dl, VT, Op2);
else		else
Res = DAG.getNode(ISD::VECREDUCE_STRICT_FADD, dl, VT, Op1, Op2);		Res = DAG.getNode(ISD::VECREDUCE_STRICT_FADD, dl, VT, Op1, Op2);
break;		break;
case Intrinsic::experimental_vector_reduce_fmul:		case Intrinsic::experimental_vector_reduce_fmul:
if (FMF.unsafeAlgebra())		if (FMF.isFast())
Res = DAG.getNode(ISD::VECREDUCE_FMUL, dl, VT, Op2);		Res = DAG.getNode(ISD::VECREDUCE_FMUL, dl, VT, Op2);
else		else
Res = DAG.getNode(ISD::VECREDUCE_STRICT_FMUL, dl, VT, Op1, Op2);		Res = DAG.getNode(ISD::VECREDUCE_STRICT_FMUL, dl, VT, Op1, Op2);
break;		break;
case Intrinsic::experimental_vector_reduce_add:		case Intrinsic::experimental_vector_reduce_add:
Res = DAG.getNode(ISD::VECREDUCE_ADD, dl, VT, Op1);		Res = DAG.getNode(ISD::VECREDUCE_ADD, dl, VT, Op1);
break;		break;
case Intrinsic::experimental_vector_reduce_mul:		case Intrinsic::experimental_vector_reduce_mul:
▲ Show 20 Lines • Show All 1,960 Lines • Show Last 20 Lines

lib/IR/AsmWriter.cpp

Show First 20 Lines • Show All 1,102 Lines • ▼ Show 20 Lines	static void writeAtomicRMWOperation(raw_ostream &Out,
case AtomicRMWInst::Min: Out << " min"; break;		case AtomicRMWInst::Min: Out << " min"; break;
case AtomicRMWInst::UMax: Out << " umax"; break;		case AtomicRMWInst::UMax: Out << " umax"; break;
case AtomicRMWInst::UMin: Out << " umin"; break;		case AtomicRMWInst::UMin: Out << " umin"; break;
}		}
}		}

static void WriteOptimizationInfo(raw_ostream &Out, const User *U) {		static void WriteOptimizationInfo(raw_ostream &Out, const User *U) {
if (const FPMathOperator *FPO = dyn_cast<const FPMathOperator>(U)) {		if (const FPMathOperator *FPO = dyn_cast<const FPMathOperator>(U)) {
// Unsafe algebra implies all the others, no need to write them all out		// 'Fast' is an abbreviation for all fast-math-flags.
if (FPO->hasUnsafeAlgebra())		if (FPO->isFast())
Out << " fast";		Out << " fast";
else {		else {
		if (FPO->hasAllowReassoc())
		Out << " reassoc";
if (FPO->hasNoNaNs())		if (FPO->hasNoNaNs())
Out << " nnan";		Out << " nnan";
if (FPO->hasNoInfs())		if (FPO->hasNoInfs())
Out << " ninf";		Out << " ninf";
if (FPO->hasNoSignedZeros())		if (FPO->hasNoSignedZeros())
Out << " nsz";		Out << " nsz";
if (FPO->hasAllowReciprocal())		if (FPO->hasAllowReciprocal())
Out << " arcp";		Out << " arcp";
if (FPO->hasAllowContract())		if (FPO->hasAllowContract())
Out << " contract";		Out << " contract";
		if (FPO->hasApproxFunc())
		Out << " afn";
}		}
}		}

if (const OverflowingBinaryOperator *OBO =		if (const OverflowingBinaryOperator *OBO =
dyn_cast<OverflowingBinaryOperator>(U)) {		dyn_cast<OverflowingBinaryOperator>(U)) {
if (OBO->hasNoUnsignedWrap())		if (OBO->hasNoUnsignedWrap())
Out << " nuw";		Out << " nuw";
if (OBO->hasNoSignedWrap())		if (OBO->hasNoSignedWrap())
▲ Show 20 Lines • Show All 2,534 Lines • Show Last 20 Lines
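
The printing rule in `WriteOptimizationInfo` (emit `fast` when every flag is set, otherwise list the individual flags) can be sketched as a standalone function. `flagsToString` is a hypothetical name and the bit positions are assumed; the keyword order follows the order of the checks in the diff above.

```cpp
#include <string>

// Illustrative flag bits (positions assumed, not taken from the patch).
enum : unsigned {
  AllowReassoc    = 1u << 0,
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6,
  AllFlags        = (1u << 7) - 1
};

// Hypothetical helper mirroring the writer: 'fast' abbreviates the full set,
// anything less is spelled out flag by flag.
std::string flagsToString(unsigned Flags) {
  if ((Flags & AllFlags) == AllFlags)
    return " fast";
  std::string Out;
  if (Flags & AllowReassoc)    Out += " reassoc";
  if (Flags & NoNaNs)          Out += " nnan";
  if (Flags & NoInfs)          Out += " ninf";
  if (Flags & NoSignedZeros)   Out += " nsz";
  if (Flags & AllowReciprocal) Out += " arcp";
  if (Flags & AllowContract)   Out += " contract";
  if (Flags & ApproxFunc)      Out += " afn";
  return Out;
}
```

Under this rule an instruction that carries every flag except one is printed as the remaining six keywords, not as `fast`.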

lib/IR/Instruction.cpp

Show First 20 Lines • Show All 140 Lines • ▼ Show 20 Lines	case Instruction::GetElementPtr:
break;		break;
}		}
}		}

bool Instruction::isExact() const {		bool Instruction::isExact() const {
return cast<PossiblyExactOperator>(this)->isExact();		return cast<PossiblyExactOperator>(this)->isExact();
}		}

void Instruction::setHasUnsafeAlgebra(bool B) {		void Instruction::setFast(bool B) {
assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
cast<FPMathOperator>(this)->setHasUnsafeAlgebra(B);		cast<FPMathOperator>(this)->setFast(B);
		}

		void Instruction::setHasAllowReassoc(bool B) {
		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
		cast<FPMathOperator>(this)->setHasAllowReassoc(B);
}		}

void Instruction::setHasNoNaNs(bool B) {		void Instruction::setHasNoNaNs(bool B) {
assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
cast<FPMathOperator>(this)->setHasNoNaNs(B);		cast<FPMathOperator>(this)->setHasNoNaNs(B);
}		}

void Instruction::setHasNoInfs(bool B) {		void Instruction::setHasNoInfs(bool B) {
assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
cast<FPMathOperator>(this)->setHasNoInfs(B);		cast<FPMathOperator>(this)->setHasNoInfs(B);
}		}

void Instruction::setHasNoSignedZeros(bool B) {		void Instruction::setHasNoSignedZeros(bool B) {
assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
cast<FPMathOperator>(this)->setHasNoSignedZeros(B);		cast<FPMathOperator>(this)->setHasNoSignedZeros(B);
}		}

void Instruction::setHasAllowReciprocal(bool B) {		void Instruction::setHasAllowReciprocal(bool B) {
assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
cast<FPMathOperator>(this)->setHasAllowReciprocal(B);		cast<FPMathOperator>(this)->setHasAllowReciprocal(B);
}		}

		void Instruction::setHasApproxFunc(bool B) {
		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
		cast<FPMathOperator>(this)->setHasApproxFunc(B);
		}

void Instruction::setFastMathFlags(FastMathFlags FMF) {		void Instruction::setFastMathFlags(FastMathFlags FMF) {
assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "setting fast-math flag on invalid op");
cast<FPMathOperator>(this)->setFastMathFlags(FMF);		cast<FPMathOperator>(this)->setFastMathFlags(FMF);
}		}

void Instruction::copyFastMathFlags(FastMathFlags FMF) {		void Instruction::copyFastMathFlags(FastMathFlags FMF) {
assert(isa<FPMathOperator>(this) && "copying fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "copying fast-math flag on invalid op");
cast<FPMathOperator>(this)->copyFastMathFlags(FMF);		cast<FPMathOperator>(this)->copyFastMathFlags(FMF);
}		}

bool Instruction::hasUnsafeAlgebra() const {		bool Instruction::isFast() const {
assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
return cast<FPMathOperator>(this)->hasUnsafeAlgebra();		return cast<FPMathOperator>(this)->isFast();
		}

		bool Instruction::hasAllowReassoc() const {
		assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
		return cast<FPMathOperator>(this)->hasAllowReassoc();
}		}

bool Instruction::hasNoNaNs() const {		bool Instruction::hasNoNaNs() const {
assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
return cast<FPMathOperator>(this)->hasNoNaNs();		return cast<FPMathOperator>(this)->hasNoNaNs();
}		}

bool Instruction::hasNoInfs() const {		bool Instruction::hasNoInfs() const {
Show All 11 Lines	bool Instruction::hasAllowReciprocal() const {
return cast<FPMathOperator>(this)->hasAllowReciprocal();		return cast<FPMathOperator>(this)->hasAllowReciprocal();
}		}

bool Instruction::hasAllowContract() const {		bool Instruction::hasAllowContract() const {
assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
return cast<FPMathOperator>(this)->hasAllowContract();		return cast<FPMathOperator>(this)->hasAllowContract();
}		}

		bool Instruction::hasApproxFunc() const {
		assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
		return cast<FPMathOperator>(this)->hasApproxFunc();
		}

FastMathFlags Instruction::getFastMathFlags() const {		FastMathFlags Instruction::getFastMathFlags() const {
assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");		assert(isa<FPMathOperator>(this) && "getting fast-math flag on invalid op");
return cast<FPMathOperator>(this)->getFastMathFlags();		return cast<FPMathOperator>(this)->getFastMathFlags();
}		}

void Instruction::copyFastMathFlags(const Instruction *I) {		void Instruction::copyFastMathFlags(const Instruction *I) {
copyFastMathFlags(I->getFastMathFlags());		copyFastMathFlags(I->getFastMathFlags());
}		}
▲ Show 20 Lines • Show All 352 Lines • ▼ Show 20 Lines
bool Instruction::isAssociative() const {		bool Instruction::isAssociative() const {
unsigned Opcode = getOpcode();		unsigned Opcode = getOpcode();
if (isAssociative(Opcode))		if (isAssociative(Opcode))
return true;		return true;

switch (Opcode) {		switch (Opcode) {
case FMul:		case FMul:
case FAdd:		case FAdd:
return cast<FPMathOperator>(this)->hasUnsafeAlgebra();		return cast<FPMathOperator>(this)->isFast();
default:		default:
return false;		return false;
}		}
}		}

Instruction *Instruction::cloneImpl() const {		Instruction *Instruction::cloneImpl() const {
llvm_unreachable("Subclass of Instruction failed to implement cloneImpl");		llvm_unreachable("Subclass of Instruction failed to implement cloneImpl");
}		}
▲ Show 20 Lines • Show All 109 Lines • Show Last 20 Lines
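
In this revision `Instruction::isAssociative()` switches from `hasUnsafeAlgebra()` to `isFast()`, so an `fadd` or `fmul` carrying only `reassoc` is still not reported as associative. The sketch below illustrates that behavior with a hypothetical `Opcode` enum and assumed bit positions; it is not the LLVM implementation.

```cpp
// Illustrative flag bits (positions assumed, not taken from the patch).
enum : unsigned {
  AllowReassoc = 1u << 0,
  AllFlags     = (1u << 7) - 1 // all seven fast-math flags set
};

// Hypothetical opcode set, reduced to what the example needs.
enum class Opcode { Add, Mul, FAdd, FMul, FSub };

// Integer add/mul are always associative; FP add/mul only when the
// operation is 'fast', i.e. every flag is set (as in this revision).
bool isAssociative(Opcode Op, unsigned Flags) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Mul:
    return true;
  case Opcode::FAdd:
  case Opcode::FMul:
    return (Flags & AllFlags) == AllFlags;
  default:
    return false;
  }
}
```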

lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp

Show First 20 Lines • Show All 394 Lines • ▼ Show 20 Lines	if (!FPMath)
return false;		return false;

const FPMathOperator *FPOp = cast<const FPMathOperator>(&FDiv);		const FPMathOperator *FPOp = cast<const FPMathOperator>(&FDiv);
float ULP = FPOp->getFPAccuracy();		float ULP = FPOp->getFPAccuracy();
if (ULP < 2.5f)		if (ULP < 2.5f)
return false;		return false;

FastMathFlags FMF = FPOp->getFastMathFlags();		FastMathFlags FMF = FPOp->getFastMathFlags();
bool UnsafeDiv = HasUnsafeFPMath || FMF.unsafeAlgebra() ||		bool UnsafeDiv = HasUnsafeFPMath || FMF.isFast() ||
FMF.allowReciprocal();		FMF.allowReciprocal();

// With UnsafeDiv node will be optimized to just rcp and mul.		// With UnsafeDiv node will be optimized to just rcp and mul.
if (ST->hasFP32Denormals() || UnsafeDiv)		if (ST->hasFP32Denormals() || UnsafeDiv)
return false;		return false;

IRBuilder<> Builder(FDiv.getParent(), std::next(FDiv.getIterator()), FPMath);		IRBuilder<> Builder(FDiv.getParent(), std::next(FDiv.getIterator()), FPMath);
Builder.setFastMathFlags(FMF);		Builder.setFastMathFlags(FMF);
▲ Show 20 Lines • Show All 160 Lines • Show Last 20 Lines

lib/Target/AMDGPU/AMDGPULibCalls.cpp

	Show First 20 Lines • Show All 481 Lines • ▼ Show 20 Lines

	bool AMDGPULibCalls::parseFunctionName(const StringRef& FMangledName,			bool AMDGPULibCalls::parseFunctionName(const StringRef& FMangledName,
	FuncInfo *FInfo) {			FuncInfo *FInfo) {
	return AMDGPULibFunc::parse(FMangledName, *FInfo);			return AMDGPULibFunc::parse(FMangledName, *FInfo);
	}			}

	bool AMDGPULibCalls::isUnsafeMath(const CallInst *CI) const {			bool AMDGPULibCalls::isUnsafeMath(const CallInst *CI) const {
	if (auto Op = dyn_cast<FPMathOperator>(CI))			if (auto Op = dyn_cast<FPMathOperator>(CI))
	if (Op->hasUnsafeAlgebra())			if (Op->isFast())
	return true;			return true;
	const Function *F = CI->getParent()->getParent();			const Function *F = CI->getParent()->getParent();
	Attribute Attr = F->getFnAttribute("unsafe-fp-math");			Attribute Attr = F->getFnAttribute("unsafe-fp-math");
	return Attr.getValueAsString() == "true";			return Attr.getValueAsString() == "true";
	}			}

	bool AMDGPULibCalls::useNativeFunc(const StringRef F) const {			bool AMDGPULibCalls::useNativeFunc(const StringRef F) const {
	return AllNative ||			return AllNative ||
	▲ Show 20 Lines • Show All 1,272 Lines • Show Last 20 Lines

lib/Transforms/InstCombine/InstCombineAddSub.cpp

Show First 20 Lines • Show All 476 Lines • ▼ Show 20 Lines	if (isMpy) {
AddSub0 = Opnd0_0;		AddSub0 = Opnd0_0;
AddSub1 = Opnd1_0;		AddSub1 = Opnd1_0;
}		}

if (!Factor)		if (!Factor)
return nullptr;		return nullptr;

FastMathFlags Flags;		FastMathFlags Flags;
Flags.setUnsafeAlgebra();		Flags.setFast();
if (I0) Flags &= I->getFastMathFlags();		if (I0) Flags &= I->getFastMathFlags();
if (I1) Flags &= I->getFastMathFlags();		if (I1) Flags &= I->getFastMathFlags();

// Create expression "NewAddSub = AddSub0 +/- AddsSub1"		// Create expression "NewAddSub = AddSub0 +/- AddsSub1"
Value *NewAddSub = (I->getOpcode() == Instruction::FAdd) ?		Value *NewAddSub = (I->getOpcode() == Instruction::FAdd) ?
createFAdd(AddSub0, AddSub1) :		createFAdd(AddSub0, AddSub1) :
createFSub(AddSub0, AddSub1);		createFSub(AddSub0, AddSub1);
if (ConstantFP *CFP = dyn_cast<ConstantFP>(NewAddSub)) {		if (ConstantFP *CFP = dyn_cast<ConstantFP>(NewAddSub)) {
Show All 12 Lines	Value *FAddCombine::performFactorization(Instruction *I) {

Value *RI = createFDiv(NewAddSub, Factor);		Value *RI = createFDiv(NewAddSub, Factor);
if (Instruction *II = dyn_cast<Instruction>(RI))		if (Instruction *II = dyn_cast<Instruction>(RI))
II->setFastMathFlags(Flags);		II->setFastMathFlags(Flags);
return RI;		return RI;
}		}

Value *FAddCombine::simplify(Instruction *I) {		Value *FAddCombine::simplify(Instruction *I) {
assert(I->hasUnsafeAlgebra() && "Should be in unsafe mode");		assert(I->isFast() && "Should be in unsafe mode");

// Currently we are not able to handle vector type.		// Currently we are not able to handle vector type.
if (I->getType()->isVectorTy())		if (I->getType()->isVectorTy())
return nullptr;		return nullptr;

assert((I->getOpcode() == Instruction::FAdd \|\|		assert((I->getOpcode() == Instruction::FAdd \|\|
I->getOpcode() == Instruction::FSub) && "Expect add/sub");		I->getOpcode() == Instruction::FSub) && "Expect add/sub");

▲ Show 20 Lines • Show All 858 Lines • ▼ Show 20 Lines	if (SIToFPInst *RHSConv = dyn_cast<SIToFPInst>(RHS)) {
}		}
}		}
}		}

// Handle specials cases for FAdd with selects feeding the operation		// Handle specials cases for FAdd with selects feeding the operation
if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS))		if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (I.hasUnsafeAlgebra()) {		if (I.isFast()) {
if (Value *V = FAddCombine(Builder).simplify(&I))		if (Value *V = FAddCombine(Builder).simplify(&I))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);
}		}

return Changed ? &I : nullptr;		return Changed ? &I : nullptr;
}		}

/// Optimize pointer differences into the same array into a size. Consider:		/// Optimize pointer differences into the same array into a size. Consider:
▲ Show 20 Lines • Show All 333 Lines • ▼ Show 20 Lines	if (Value *V = dyn_castFNegVal(FPEI->getOperand(0))) {
return NewI;		return NewI;
}		}
}		}

// Handle specials cases for FSub with selects feeding the operation		// Handle specials cases for FSub with selects feeding the operation
if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1))		if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (I.hasUnsafeAlgebra()) {		if (I.isFast()) {
if (Value *V = FAddCombine(Builder).simplify(&I))		if (Value *V = FAddCombine(Builder).simplify(&I))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);
}		}

return nullptr;		return nullptr;
}		}

lib/Transforms/InstCombine/InstCombineCalls.cpp

Show First 20 Lines • Show All 2,011 Lines • ▼ Show 20 Lines	if (isa<ConstantFP>(Arg0) && !isa<ConstantFP>(Arg1)) {
return II;		return II;
}		}
if (Value *V = simplifyMinnumMaxnum(*II))		if (Value *V = simplifyMinnumMaxnum(*II))
return replaceInstUsesWith(*II, V);		return replaceInstUsesWith(*II, V);
break;		break;
}		}
case Intrinsic::fmuladd: {		case Intrinsic::fmuladd: {
// Canonicalize fast fmuladd to the separate fmul + fadd.		// Canonicalize fast fmuladd to the separate fmul + fadd.
if (II->hasUnsafeAlgebra()) {		if (II->isFast()) {
BuilderTy::FastMathFlagGuard Guard(Builder);		BuilderTy::FastMathFlagGuard Guard(Builder);
Builder.setFastMathFlags(II->getFastMathFlags());		Builder.setFastMathFlags(II->getFastMathFlags());
Value *Mul = Builder.CreateFMul(II->getArgOperand(0),		Value *Mul = Builder.CreateFMul(II->getArgOperand(0),
II->getArgOperand(1));		II->getArgOperand(1));
Value *Add = Builder.CreateFAdd(Mul, II->getArgOperand(2));		Value *Add = Builder.CreateFAdd(Mul, II->getArgOperand(2));
Add->takeName(II);		Add->takeName(II);
return replaceInstUsesWith(*II, Add);		return replaceInstUsesWith(*II, Add);
}		}

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

/// Detect pattern log2(Y * 0.5) with corresponding fast math flags.		/// Detect pattern log2(Y * 0.5) with corresponding fast math flags.
static void detectLog2OfHalf(Value *&Op, Value *&Y, IntrinsicInst *&Log2) {		static void detectLog2OfHalf(Value *&Op, Value *&Y, IntrinsicInst *&Log2) {
if (!Op->hasOneUse())		if (!Op->hasOneUse())
return;		return;

IntrinsicInst *II = dyn_cast<IntrinsicInst>(Op);		IntrinsicInst *II = dyn_cast<IntrinsicInst>(Op);
if (!II)		if (!II)
return;		return;
if (II->getIntrinsicID() != Intrinsic::log2 \|\| !II->hasUnsafeAlgebra())		if (II->getIntrinsicID() != Intrinsic::log2 \|\| !II->isFast())
return;		return;
Log2 = II;		Log2 = II;

Value *OpLog2Of = II->getArgOperand(0);		Value *OpLog2Of = II->getArgOperand(0);
if (!OpLog2Of->hasOneUse())		if (!OpLog2Of->hasOneUse())
return;		return;

Instruction *I = dyn_cast<Instruction>(OpLog2Of);		Instruction *I = dyn_cast<Instruction>(OpLog2Of);
if (!I)		if (!I)
return;		return;
if (I->getOpcode() != Instruction::FMul \|\| !I->hasUnsafeAlgebra())		if (I->getOpcode() != Instruction::FMul \|\| !I->isFast())
return;		return;

if (match(I->getOperand(0), m_SpecificFP(0.5)))		if (match(I->getOperand(0), m_SpecificFP(0.5)))
Y = I->getOperand(1);		Y = I->getOperand(1);
else if (match(I->getOperand(1), m_SpecificFP(0.5)))		else if (match(I->getOperand(1), m_SpecificFP(0.5)))
Y = I->getOperand(0);		Y = I->getOperand(0);
}		}

▲ Show 20 Lines • Show All 86 Lines • ▼ Show 20 Lines	if (C0) {
Constant *F = ConstantExpr::getFDiv(C1, C);		Constant *F = ConstantExpr::getFDiv(C1, C);
if (isNormalFp(F))		if (isNormalFp(F))
R = BinaryOperator::CreateFDiv(Opnd0, F);		R = BinaryOperator::CreateFDiv(Opnd0, F);
}		}
}		}
}		}

if (R) {		if (R) {
R->setHasUnsafeAlgebra(true);		R->setFast(true);
InsertNewInstWith(R, *InsertBefore);		InsertNewInstWith(R, *InsertBefore);
}		}

return R;		return R;
}		}

Instruction *InstCombiner::visitFMul(BinaryOperator &I) {		Instruction *InstCombiner::visitFMul(BinaryOperator &I) {
bool Changed = SimplifyAssociativeOrCommutative(I);		bool Changed = SimplifyAssociativeOrCommutative(I);
Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);		Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

if (Value *V = SimplifyVectorOp(I))		if (Value *V = SimplifyVectorOp(I))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (isa<Constant>(Op0))		if (isa<Constant>(Op0))
std::swap(Op0, Op1);		std::swap(Op0, Op1);

if (Value *V = SimplifyFMulInst(Op0, Op1, I.getFastMathFlags(),		if (Value *V = SimplifyFMulInst(Op0, Op1, I.getFastMathFlags(),
SQ.getWithInstruction(&I)))		SQ.getWithInstruction(&I)))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

bool AllowReassociate = I.hasUnsafeAlgebra();		bool AllowReassociate = I.isFast();

// Simplify mul instructions with a constant RHS.		// Simplify mul instructions with a constant RHS.
if (isa<Constant>(Op1)) {		if (isa<Constant>(Op1)) {
if (Instruction *FoldedMul = foldOpWithConstantIntoOperand(I))		if (Instruction *FoldedMul = foldOpWithConstantIntoOperand(I))
return FoldedMul;		return FoldedMul;

// (fmul X, -1.0) --> (fsub -0.0, X)		// (fmul X, -1.0) --> (fsub -0.0, X)
if (match(Op1, m_SpecificFP(-1.0))) {		if (match(Op1, m_SpecificFP(-1.0))) {
▲ Show 20 Lines • Show All 702 Lines • ▼ Show 20 Lines	if (Value *V = SimplifyFDivInst(Op0, Op1, I.getFastMathFlags(),
SQ.getWithInstruction(&I)))		SQ.getWithInstruction(&I)))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (isa<Constant>(Op0))		if (isa<Constant>(Op0))
if (SelectInst *SI = dyn_cast<SelectInst>(Op1))		if (SelectInst *SI = dyn_cast<SelectInst>(Op1))
if (Instruction *R = FoldOpIntoSelect(I, SI))		if (Instruction *R = FoldOpIntoSelect(I, SI))
return R;		return R;

bool AllowReassociate = I.hasUnsafeAlgebra();		bool AllowReassociate = I.isFast();
bool AllowReciprocal = I.hasAllowReciprocal();		bool AllowReciprocal = I.hasAllowReciprocal();

if (Constant *Op1C = dyn_cast<Constant>(Op1)) {		if (Constant *Op1C = dyn_cast<Constant>(Op1)) {
if (SelectInst *SI = dyn_cast<SelectInst>(Op0))		if (SelectInst *SI = dyn_cast<SelectInst>(Op0))
if (Instruction *R = FoldOpIntoSelect(I, SI))		if (Instruction *R = FoldOpIntoSelect(I, SI))
return R;		return R;

if (AllowReassociate) {		if (AllowReassociate) {

lib/Transforms/Scalar/Reassociate.cpp

Show First 20 Lines • Show All 139 Lines • ▼ Show 20 Lines	XorOpnd::XorOpnd(Value *V) {
isOr = true;		isOr = true;
}		}

/// Return true if V is an instruction of the specified opcode and if it		/// Return true if V is an instruction of the specified opcode and if it
/// only has one use.		/// only has one use.
static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode) {		static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode) {
if (V->hasOneUse() && isa<Instruction>(V) &&		if (V->hasOneUse() && isa<Instruction>(V) &&
cast<Instruction>(V)->getOpcode() == Opcode &&		cast<Instruction>(V)->getOpcode() == Opcode &&
(!isa<FPMathOperator>(V) \|\|		(!isa<FPMathOperator>(V) \|\| cast<Instruction>(V)->isFast()))
cast<Instruction>(V)->hasUnsafeAlgebra()))
return cast<BinaryOperator>(V);		return cast<BinaryOperator>(V);
return nullptr;		return nullptr;
}		}

static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode1,		static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode1,
unsigned Opcode2) {		unsigned Opcode2) {
if (V->hasOneUse() && isa<Instruction>(V) &&		if (V->hasOneUse() && isa<Instruction>(V) &&
(cast<Instruction>(V)->getOpcode() == Opcode1 \|\|		(cast<Instruction>(V)->getOpcode() == Opcode1 \|\|
cast<Instruction>(V)->getOpcode() == Opcode2) &&		cast<Instruction>(V)->getOpcode() == Opcode2) &&
(!isa<FPMathOperator>(V) \|\|		(!isa<FPMathOperator>(V) \|\| cast<Instruction>(V)->isFast()))
cast<Instruction>(V)->hasUnsafeAlgebra()))
return cast<BinaryOperator>(V);		return cast<BinaryOperator>(V);
return nullptr;		return nullptr;
}		}

void ReassociatePass::BuildRankMap(Function &F,		void ReassociatePass::BuildRankMap(Function &F,
ReversePostOrderTraversal<Function*> &RPOT) {		ReversePostOrderTraversal<Function*> &RPOT) {
unsigned Rank = 2;		unsigned Rank = 2;

▲ Show 20 Lines • Show All 391 Lines • ▼ Show 20 Lines	#endif

// At this point we have a value which, first of all, is not a binary		// At this point we have a value which, first of all, is not a binary
// expression of the right kind, and secondly, is only used inside the		// expression of the right kind, and secondly, is only used inside the
// expression. This means that it can safely be modified. See if we		// expression. This means that it can safely be modified. See if we
// can usefully morph it into an expression of the right kind.		// can usefully morph it into an expression of the right kind.
assert((!isa<Instruction>(Op) \|\|		assert((!isa<Instruction>(Op) \|\|
cast<Instruction>(Op)->getOpcode() != Opcode		cast<Instruction>(Op)->getOpcode() != Opcode
\|\| (isa<FPMathOperator>(Op) &&		\|\| (isa<FPMathOperator>(Op) &&
!cast<Instruction>(Op)->hasUnsafeAlgebra())) &&		!cast<Instruction>(Op)->isFast())) &&
"Should have been handled above!");		"Should have been handled above!");
assert(Op->hasOneUse() && "Has uses outside the expression tree!");		assert(Op->hasOneUse() && "Has uses outside the expression tree!");

// If this is a multiply expression, turn any internal negations into		// If this is a multiply expression, turn any internal negations into
// multiplies by -1 so they can be reassociated.		// multiplies by -1 so they can be reassociated.
if (BinaryOperator *BO = dyn_cast<BinaryOperator>(Op))		if (BinaryOperator *BO = dyn_cast<BinaryOperator>(Op))
if ((Opcode == Instruction::Mul && BinaryOperator::isNeg(BO)) \|\|		if ((Opcode == Instruction::Mul && BinaryOperator::isNeg(BO)) \|\|
(Opcode == Instruction::FMul && BinaryOperator::isFNeg(BO))) {		(Opcode == Instruction::FMul && BinaryOperator::isFNeg(BO))) {
▲ Show 20 Lines • Show All 1,435 Lines • ▼ Show 20 Lines	if (Instruction *Res = canonicalizeNegConstExpr(I))
I = Res;		I = Res;

// Commute binary operators, to canonicalize the order of their operands.		// Commute binary operators, to canonicalize the order of their operands.
// This can potentially expose more CSE opportunities, and makes writing other		// This can potentially expose more CSE opportunities, and makes writing other
// transformations simpler.		// transformations simpler.
if (I->isCommutative())		if (I->isCommutative())
canonicalizeOperands(I);		canonicalizeOperands(I);

// Don't optimize floating point instructions that don't have unsafe algebra.		// Don't optimize floating point instructions that don't have unsafe algebra.
		wristow: Very minor point/question: Since the test is no longer `hasUnsafeAlgebra()`, are we OK with the comment still saying `unsafe algebra`? Or do we want to change the comment above to something like: `// Don't optimize floating point instructions that don't have fast-math.` I'm fine leaving it as-is, but I've found these sorts of things in a handful of places, so if we want to change them, I'll look through the patch more thoroughly and identify each one I find.
		spatel (author): I'll fix the comment to match the current code. Since this is the reassociation pass, I would guess that 'reassoc' is all we need to enable transforms here, but we'll have to verify that that is correct.
		wristow: To be clear, I'm not suggesting that in this patch we change the code here to check for 'reassoc' (i.e., I'm not suggesting we change the `isFast()` call to `hasAllowReassoc()` at this time). I view that as a separate piece of work, where we go through and carefully audit existing FMF-related checks, and decide how to use the more precise flags. Possibly it's just 'reassoc' that is needed for this case, or possibly it's 'reassoc' and some other conditions. All I was suggesting by my comment/question is that code comments referring to a no-longer-existing "unsafe algebra" umbrella flag are a bit misleading. So I wondered whether we wanted to change those comments to better match the new implementation. It's a pretty minor point, in my view. From my POV, the main purpose of this patch is to fix the underlying implementation to allow us to go through and do that audit, and fix issues like this "use `isFast()` or use some finer check?" example here.
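		The distinction raised in this thread, `isFast()` meaning "every flag is set" versus a single 'reassoc' bit, can be illustrated with a small stand-alone model. This is a hypothetical sketch and not LLVM's `FastMathFlags` class: the type `MiniFMF`, its bit values, and the reduced set of flags are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified model of a fast-math-flags bitmask.
// Bit positions are illustrative only and do not match LLVM's encoding.
struct MiniFMF {
  enum : unsigned {
    AllowReassoc = 1u << 0, // algebraic reassociation only
    NoNaNs       = 1u << 1,
    NoInfs       = 1u << 2,
    AllowTrans   = 1u << 3, // relaxed-precision transcendental functions
    AllFlags     = AllowReassoc | NoNaNs | NoInfs | AllowTrans
  };

  unsigned Bits = 0;

  void setAllowReassoc() { Bits |= AllowReassoc; }
  void setFast() { Bits |= AllFlags; }

  bool allowReassoc() const { return (Bits & AllowReassoc) != 0; }
  // 'fast' is not a bit of its own here: it means every flag is set.
  bool isFast() const { return (Bits & AllFlags) == AllFlags; }
};
```

		Under this model, an instruction carrying only 'reassoc' answers false to `isFast()`, which is why the existing `isFast()` checks would need the audit described above before a pass could rely on the narrower flag alone.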
if (I->getType()->isFPOrFPVectorTy() && !I->hasUnsafeAlgebra())		if (I->getType()->isFPOrFPVectorTy() && !I->isFast())
return;		return;

// Do not reassociate boolean (i1) expressions. We want to preserve the		// Do not reassociate boolean (i1) expressions. We want to preserve the
// original order of evaluation for short-circuited comparisons that		// original order of evaluation for short-circuited comparisons that
// SimplifyCFG has folded to AND/OR expressions. If the expression		// SimplifyCFG has folded to AND/OR expressions. If the expression
// is not further optimized, it is likely to be transformed back to a		// is not further optimized, it is likely to be transformed back to a
// short-circuited form for code gen, and the source order may have been		// short-circuited form for code gen, and the source order may have been
// optimized for the most likely conditions.		// optimized for the most likely conditions.
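The guard above that skips FP instructions lacking fast-math flags reflects the fact that floating-point addition is not associative. A minimal sketch of why, assuming IEEE-754 doubles and a compiler invoked without fast-math options; the function names are hypothetical and chosen only for this illustration:

```cpp
#include <cassert>

// Evaluate (a + b) + c, the order written in the source.
double sumLeftFirst(double a, double b, double c) { return (a + b) + c; }

// Evaluate a + (b + c), one possible reassociated form.
double sumRightFirst(double a, double b, double c) { return a + (b + c); }
```

With `a = 1e20`, `b = -1e20`, `c = 1.0`, the first form cancels the large terms before adding `c`, while in the second form `c` is absorbed by rounding when added to `b`, so the two orders give different results. Reassociating is therefore only valid when the instruction's flags permit it.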

lib/Transforms/Utils/LoopUtils.cpp

Show First 20 Lines • Show All 426 Lines • ▼ Show 20 Lines	RecurrenceDescriptor::isMinMaxSelectCmpPattern(Instruction *I, InstDesc &Prev) {
return InstDesc(false, I);		return InstDesc(false, I);
}		}

RecurrenceDescriptor::InstDesc		RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isRecurrenceInstr(Instruction *I, RecurrenceKind Kind,		RecurrenceDescriptor::isRecurrenceInstr(Instruction *I, RecurrenceKind Kind,
InstDesc &Prev, bool HasFunNoNaNAttr) {		InstDesc &Prev, bool HasFunNoNaNAttr) {
bool FP = I->getType()->isFloatingPointTy();		bool FP = I->getType()->isFloatingPointTy();
Instruction *UAI = Prev.getUnsafeAlgebraInst();		Instruction *UAI = Prev.getUnsafeAlgebraInst();
if (!UAI && FP && !I->hasUnsafeAlgebra())		if (!UAI && FP && !I->isFast())
UAI = I; // Found an unsafe (unvectorizable) algebra instruction.		UAI = I; // Found an unsafe (unvectorizable) algebra instruction.

switch (I->getOpcode()) {		switch (I->getOpcode()) {
default:		default:
return InstDesc(false, I);		return InstDesc(false, I);
case Instruction::PHI:		case Instruction::PHI:
return InstDesc(I, Prev.getMinMaxKind(), Prev.getUnsafeAlgebraInst());		return InstDesc(I, Prev.getMinMaxKind(), Prev.getUnsafeAlgebraInst());
case Instruction::Sub:		case Instruction::Sub:
▲ Show 20 Lines • Show All 215 Lines • ▼ Show 20 Lines	case MRK_FloatMax:
P = CmpInst::FCMP_OGT;		P = CmpInst::FCMP_OGT;
break;		break;
}		}

// We only match FP sequences with unsafe algebra, so we can unconditionally		// We only match FP sequences with unsafe algebra, so we can unconditionally
// set it on any generated instructions.		// set it on any generated instructions.
IRBuilder<>::FastMathFlagGuard FMFG(Builder);		IRBuilder<>::FastMathFlagGuard FMFG(Builder);
FastMathFlags FMF;		FastMathFlags FMF;
FMF.setUnsafeAlgebra();		FMF.setFast();
Builder.setFastMathFlags(FMF);		Builder.setFastMathFlags(FMF);

Value *Cmp;		Value *Cmp;
if (RK == MRK_FloatMin \|\| RK == MRK_FloatMax)		if (RK == MRK_FloatMin \|\| RK == MRK_FloatMax)
Cmp = Builder.CreateFCmp(P, Left, Right, "rdx.minmax.cmp");		Cmp = Builder.CreateFCmp(P, Left, Right, "rdx.minmax.cmp");
else		else
Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");		Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");

▲ Show 20 Lines • Show All 87 Lines • ▼ Show 20 Lines	assert(InductionBinOp &&
(InductionBinOp->getOpcode() == Instruction::FAdd \|\|		(InductionBinOp->getOpcode() == Instruction::FAdd \|\|
InductionBinOp->getOpcode() == Instruction::FSub) &&		InductionBinOp->getOpcode() == Instruction::FSub) &&
"Original bin op should be defined for FP induction");		"Original bin op should be defined for FP induction");

Value *StepValue = cast<SCEVUnknown>(Step)->getValue();		Value *StepValue = cast<SCEVUnknown>(Step)->getValue();

// Floating point operations had to be 'fast' to enable the induction.		// Floating point operations had to be 'fast' to enable the induction.
FastMathFlags Flags;		FastMathFlags Flags;
Flags.setUnsafeAlgebra();		Flags.setFast();

Value *MulExp = B.CreateFMul(StepValue, Index);		Value *MulExp = B.CreateFMul(StepValue, Index);
if (isa<Instruction>(MulExp))		if (isa<Instruction>(MulExp))
// We have to check, the MulExp may be a constant.		// We have to check, the MulExp may be a constant.
cast<Instruction>(MulExp)->setFastMathFlags(Flags);		cast<Instruction>(MulExp)->setFastMathFlags(Flags);

Value *BOp = B.CreateBinOp(InductionBinOp->getOpcode() , StartValue,		Value *BOp = B.CreateBinOp(InductionBinOp->getOpcode() , StartValue,
MulExp, "induction");		MulExp, "induction");
▲ Show 20 Lines • Show All 553 Lines • ▼ Show 20 Lines	Optional<unsigned> llvm::getLoopEstimatedTripCount(Loop *L) {
else		else
return (FalseVal + (TrueVal / 2)) / TrueVal;		return (FalseVal + (TrueVal / 2)) / TrueVal;
}		}

/// \brief Adds a 'fast' flag to floating point operations.		/// \brief Adds a 'fast' flag to floating point operations.
static Value *addFastMathFlag(Value *V) {		static Value *addFastMathFlag(Value *V) {
if (isa<FPMathOperator>(V)) {		if (isa<FPMathOperator>(V)) {
FastMathFlags Flags;		FastMathFlags Flags;
Flags.setUnsafeAlgebra();		Flags.setFast();
cast<Instruction>(V)->setFastMathFlags(Flags);		cast<Instruction>(V)->setFastMathFlags(Flags);
}		}
return V;		return V;
}		}

// Helper to generate a log2 shuffle reduction.		// Helper to generate a log2 shuffle reduction.
Value *		Value *
llvm::getShuffleReduction(IRBuilder<> &Builder, Value *Src, unsigned Op,		llvm::getShuffleReduction(IRBuilder<> &Builder, Value *Src, unsigned Op,
▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	Value *llvm::createSimpleTargetReduction(
assert(isa<VectorType>(Src->getType()) && "Type must be a vector");		assert(isa<VectorType>(Src->getType()) && "Type must be a vector");

Value *ScalarUdf = UndefValue::get(Src->getType()->getVectorElementType());		Value *ScalarUdf = UndefValue::get(Src->getType()->getVectorElementType());
std::function<Value*()> BuildFunc;		std::function<Value*()> BuildFunc;
using RD = RecurrenceDescriptor;		using RD = RecurrenceDescriptor;
RD::MinMaxRecurrenceKind MinMaxKind = RD::MRK_Invalid;		RD::MinMaxRecurrenceKind MinMaxKind = RD::MRK_Invalid;
// TODO: Support creating ordered reductions.		// TODO: Support creating ordered reductions.
FastMathFlags FMFUnsafe;		FastMathFlags FMFUnsafe;
FMFUnsafe.setUnsafeAlgebra();		FMFUnsafe.setFast();

switch (Opcode) {		switch (Opcode) {
case Instruction::Add:		case Instruction::Add:
BuildFunc = [&]() { return Builder.CreateAddReduce(Src); };		BuildFunc = [&]() { return Builder.CreateAddReduce(Src); };
break;		break;
case Instruction::Mul:		case Instruction::Mul:
BuildFunc = [&]() { return Builder.CreateMulReduce(Src); };		BuildFunc = [&]() { return Builder.CreateMulReduce(Src); };
break;		break;

lib/Transforms/Utils/SimplifyLibCalls.cpp

Show First 20 Lines • Show All 1,105 Lines • ▼ Show 20 Lines	Value *LibCallSimplifier::optimizePow(CallInst *CI, IRBuilder<> &B) {

// pow(exp(x), y) -> exp(x * y)		// pow(exp(x), y) -> exp(x * y)
// pow(exp2(x), y) -> exp2(x * y)		// pow(exp2(x), y) -> exp2(x * y)
// We enable these only with fast-math. Besides rounding differences, the		// We enable these only with fast-math. Besides rounding differences, the
// transformation changes overflow and underflow behavior quite dramatically.		// transformation changes overflow and underflow behavior quite dramatically.
// Example: x = 1000, y = 0.001.		// Example: x = 1000, y = 0.001.
// pow(exp(x), y) = pow(inf, 0.001) = inf, whereas exp(x*y) = exp(1).		// pow(exp(x), y) = pow(inf, 0.001) = inf, whereas exp(x*y) = exp(1).
auto *OpC = dyn_cast<CallInst>(Op1);		auto *OpC = dyn_cast<CallInst>(Op1);
if (OpC && OpC->hasUnsafeAlgebra() && CI->hasUnsafeAlgebra()) {		if (OpC && OpC->isFast() && CI->isFast()) {
LibFunc Func;		LibFunc Func;
Function *OpCCallee = OpC->getCalledFunction();		Function *OpCCallee = OpC->getCalledFunction();
if (OpCCallee && TLI->getLibFunc(OpCCallee->getName(), Func) &&		if (OpCCallee && TLI->getLibFunc(OpCCallee->getName(), Func) &&
TLI->has(Func) && (Func == LibFunc_exp \|\| Func == LibFunc_exp2)) {		TLI->has(Func) && (Func == LibFunc_exp \|\| Func == LibFunc_exp2)) {
IRBuilder<>::FastMathFlagGuard Guard(B);		IRBuilder<>::FastMathFlagGuard Guard(B);
B.setFastMathFlags(CI->getFastMathFlags());		B.setFastMathFlags(CI->getFastMathFlags());
Value *FMul = B.CreateFMul(OpC->getArgOperand(0), Op2, "mul");		Value *FMul = B.CreateFMul(OpC->getArgOperand(0), Op2, "mul");
return emitUnaryFloatFnCall(FMul, OpCCallee->getName(), B,		return emitUnaryFloatFnCall(FMul, OpCCallee->getName(), B,
OpCCallee->getAttributes());		OpCCallee->getAttributes());
}		}
}		}

ConstantFP *Op2C = dyn_cast<ConstantFP>(Op2);		ConstantFP *Op2C = dyn_cast<ConstantFP>(Op2);
if (!Op2C)		if (!Op2C)
return Ret;		return Ret;

if (Op2C->getValueAPF().isZero()) // pow(x, 0.0) -> 1.0		if (Op2C->getValueAPF().isZero()) // pow(x, 0.0) -> 1.0
return ConstantFP::get(CI->getType(), 1.0);		return ConstantFP::get(CI->getType(), 1.0);

if (Op2C->isExactlyValue(-0.5) &&		if (Op2C->isExactlyValue(-0.5) &&
hasUnaryFloatFn(TLI, Op2->getType(), LibFunc_sqrt, LibFunc_sqrtf,		hasUnaryFloatFn(TLI, Op2->getType(), LibFunc_sqrt, LibFunc_sqrtf,
LibFunc_sqrtl)) {		LibFunc_sqrtl)) {
// If -ffast-math:		// If -ffast-math:
// pow(x, -0.5) -> 1.0 / sqrt(x)		// pow(x, -0.5) -> 1.0 / sqrt(x)
if (CI->hasUnsafeAlgebra()) {		if (CI->isFast()) {
IRBuilder<>::FastMathFlagGuard Guard(B);		IRBuilder<>::FastMathFlagGuard Guard(B);
B.setFastMathFlags(CI->getFastMathFlags());		B.setFastMathFlags(CI->getFastMathFlags());

// TODO: If the pow call is an intrinsic, we should lower to the sqrt		// TODO: If the pow call is an intrinsic, we should lower to the sqrt
// intrinsic, so we match errno semantics. We also should check that the		// intrinsic, so we match errno semantics. We also should check that the
// target can in fact lower the sqrt intrinsic -- we currently have no way		// target can in fact lower the sqrt intrinsic -- we currently have no way
// to ask this question other than asking whether the target has a sqrt		// to ask this question other than asking whether the target has a sqrt
// libcall, which is a sufficient but not necessary condition.		// libcall, which is a sufficient but not necessary condition.
Value *Sqrt = emitUnaryFloatFnCall(Op1, TLI->getName(LibFunc_sqrt), B,		Value *Sqrt = emitUnaryFloatFnCall(Op1, TLI->getName(LibFunc_sqrt), B,
Callee->getAttributes());		Callee->getAttributes());

return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Sqrt, "sqrtrecip");		return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Sqrt, "sqrtrecip");
}		}
}		}

if (Op2C->isExactlyValue(0.5) &&		if (Op2C->isExactlyValue(0.5) &&
hasUnaryFloatFn(TLI, Op2->getType(), LibFunc_sqrt, LibFunc_sqrtf,		hasUnaryFloatFn(TLI, Op2->getType(), LibFunc_sqrt, LibFunc_sqrtf,
LibFunc_sqrtl)) {		LibFunc_sqrtl)) {

// In -ffast-math, pow(x, 0.5) -> sqrt(x).		// In -ffast-math, pow(x, 0.5) -> sqrt(x).
if (CI->hasUnsafeAlgebra()) {		if (CI->isFast()) {
IRBuilder<>::FastMathFlagGuard Guard(B);		IRBuilder<>::FastMathFlagGuard Guard(B);
B.setFastMathFlags(CI->getFastMathFlags());		B.setFastMathFlags(CI->getFastMathFlags());

// TODO: As above, we should lower to the sqrt intrinsic if the pow is an		// TODO: As above, we should lower to the sqrt intrinsic if the pow is an
// intrinsic, to match errno semantics.		// intrinsic, to match errno semantics.
return emitUnaryFloatFnCall(Op1, TLI->getName(LibFunc_sqrt), B,		return emitUnaryFloatFnCall(Op1, TLI->getName(LibFunc_sqrt), B,
Callee->getAttributes());		Callee->getAttributes());
}		}
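The overflow example in the `optimizePow` comment above (x = 1000, y = 0.001) can be reproduced outside the compiler. This is a minimal sketch assuming IEEE-754 doubles; the helper names `powOfExp` and `expOfProduct` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cmath>

// Original form: pow(exp(x), y). exp(x) can overflow before pow is applied.
double powOfExp(double x, double y) { return std::pow(std::exp(x), y); }

// Folded form: exp(x * y). The product is formed first, so no overflow occurs
// when x * y is small.
double expOfProduct(double x, double y) { return std::exp(x * y); }
```

For x = 1000 and y = 0.001, `powOfExp` yields infinity because `exp(1000)` overflows, while `expOfProduct` yields approximately `exp(1)`. That change in behavior is why the fold is gated on fast-math flags for both calls.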
Show All 22 Lines	Value *LibCallSimplifier::optimizePow(CallInst *CI, IRBuilder<> &B) {
if (Op2C->isExactlyValue(1.0)) // pow(x, 1.0) -> x		if (Op2C->isExactlyValue(1.0)) // pow(x, 1.0) -> x
return Op1;		return Op1;
if (Op2C->isExactlyValue(2.0)) // pow(x, 2.0) -> x*x		if (Op2C->isExactlyValue(2.0)) // pow(x, 2.0) -> x*x
return B.CreateFMul(Op1, Op1, "pow2");		return B.CreateFMul(Op1, Op1, "pow2");
if (Op2C->isExactlyValue(-1.0)) // pow(x, -1.0) -> 1.0/x		if (Op2C->isExactlyValue(-1.0)) // pow(x, -1.0) -> 1.0/x
return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Op1, "powrecip");		return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Op1, "powrecip");

// In -ffast-math, generate repeated fmul instead of generating pow(x, n).		// In -ffast-math, generate repeated fmul instead of generating pow(x, n).
if (CI->hasUnsafeAlgebra()) {		if (CI->isFast()) {
APFloat V = abs(Op2C->getValueAPF());		APFloat V = abs(Op2C->getValueAPF());
// We limit to a max of 7 fmul(s). Thus max exponent is 32.		// We limit to a max of 7 fmul(s). Thus max exponent is 32.
// This transformation applies to integer exponents only.		// This transformation applies to integer exponents only.
if (V.compare(APFloat(V.getSemantics(), 32.0)) == APFloat::cmpGreaterThan \|\|		if (V.compare(APFloat(V.getSemantics(), 32.0)) == APFloat::cmpGreaterThan \|\|
!V.isInteger())		!V.isInteger())
return nullptr;		return nullptr;

// Propagate fast math flags.		// Propagate fast math flags.
▲ Show 20 Lines • Show All 71 Lines • ▼ Show 20 Lines	Value *LibCallSimplifier::optimizeFMinFMax(CallInst *CI, IRBuilder<> &B) {
// function, do that first.		// function, do that first.
StringRef Name = Callee->getName();		StringRef Name = Callee->getName();
if ((Name == "fmin" \|\| Name == "fmax") && hasFloatVersion(Name))		if ((Name == "fmin" \|\| Name == "fmax") && hasFloatVersion(Name))
if (Value *Ret = optimizeBinaryDoubleFP(CI, B))		if (Value *Ret = optimizeBinaryDoubleFP(CI, B))
return Ret;		return Ret;

IRBuilder<>::FastMathFlagGuard Guard(B);		IRBuilder<>::FastMathFlagGuard Guard(B);
FastMathFlags FMF;		FastMathFlags FMF;
if (CI->hasUnsafeAlgebra()) {		if (CI->isFast()) {
// Unsafe algebra sets all fast-math-flags to true.		// Unsafe algebra sets all fast-math-flags to true.
FMF.setUnsafeAlgebra();		FMF.setFast();
} else {		} else {
// At a minimum, no-nans-fp-math must be true.		// At a minimum, no-nans-fp-math must be true.
if (!CI->hasNoNaNs())		if (!CI->hasNoNaNs())
return nullptr;		return nullptr;
// No-signed-zeros is implied by the definitions of fmax/fmin themselves:		// No-signed-zeros is implied by the definitions of fmax/fmin themselves:
// "Ideally, fmax would be sensitive to the sign of zero, for example		// "Ideally, fmax would be sensitive to the sign of zero, for example
// fmax(-0. 0, +0. 0) would return +0; however, implementation in software		// fmax(-0. 0, +0. 0) would return +0; however, implementation in software
// might be impractical."		// might be impractical."
Show All 14 Lines

Value *LibCallSimplifier::optimizeLog(CallInst *CI, IRBuilder<> &B) {		Value *LibCallSimplifier::optimizeLog(CallInst *CI, IRBuilder<> &B) {
Function *Callee = CI->getCalledFunction();		Function *Callee = CI->getCalledFunction();
Value *Ret = nullptr;		Value *Ret = nullptr;
StringRef Name = Callee->getName();		StringRef Name = Callee->getName();
if (UnsafeFPShrink && hasFloatVersion(Name))		if (UnsafeFPShrink && hasFloatVersion(Name))
Ret = optimizeUnaryDoubleFP(CI, B, true);		Ret = optimizeUnaryDoubleFP(CI, B, true);

if (!CI->hasUnsafeAlgebra())		if (!CI->isFast())
return Ret;		return Ret;
Value *Op1 = CI->getArgOperand(0);		Value *Op1 = CI->getArgOperand(0);
auto *OpC = dyn_cast<CallInst>(Op1);		auto *OpC = dyn_cast<CallInst>(Op1);

// The earlier call must also be unsafe in order to do these transforms.		// The earlier call must also be unsafe in order to do these transforms.
if (!OpC \|\| !OpC->hasUnsafeAlgebra())		if (!OpC \|\| !OpC->isFast())
return Ret;		return Ret;

// log(pow(x,y)) -> y*log(x)		// log(pow(x,y)) -> y*log(x)
// This is only applicable to log, log2, log10.		// This is only applicable to log, log2, log10.
if (Name != "log" && Name != "log2" && Name != "log10")		if (Name != "log" && Name != "log2" && Name != "log10")
return Ret;		return Ret;

IRBuilder<>::FastMathFlagGuard Guard(B);		IRBuilder<>::FastMathFlagGuard Guard(B);
FastMathFlags FMF;		FastMathFlags FMF;
FMF.setUnsafeAlgebra();		FMF.setFast();
B.setFastMathFlags(FMF);		B.setFastMathFlags(FMF);

LibFunc Func;		LibFunc Func;
Function *F = OpC->getCalledFunction();		Function *F = OpC->getCalledFunction();
if (F && ((TLI->getLibFunc(F->getName(), Func) && TLI->has(Func) &&		if (F && ((TLI->getLibFunc(F->getName(), Func) && TLI->has(Func) &&
Func == LibFunc_pow) || F->getIntrinsicID() == Intrinsic::pow))		Func == LibFunc_pow) || F->getIntrinsicID() == Intrinsic::pow))
return B.CreateFMul(OpC->getArgOperand(1),		return B.CreateFMul(OpC->getArgOperand(1),
emitUnaryFloatFnCall(OpC->getOperand(0), Callee->getName(), B,		emitUnaryFloatFnCall(OpC->getOperand(0), Callee->getName(), B,
Show All 15 Lines	Value LibCallSimplifier::optimizeSqrt(CallInst CI, IRBuilder<> &B) {
Value *Ret = nullptr;		Value *Ret = nullptr;
// TODO: Once we have a way (other than checking for the existence of the		// TODO: Once we have a way (other than checking for the existence of the
// libcall) to tell whether our target can lower @llvm.sqrt, relax the		// libcall) to tell whether our target can lower @llvm.sqrt, relax the
// condition below.		// condition below.
if (TLI->has(LibFunc_sqrtf) && (Callee->getName() == "sqrt" ||		if (TLI->has(LibFunc_sqrtf) && (Callee->getName() == "sqrt" ||
Callee->getIntrinsicID() == Intrinsic::sqrt))		Callee->getIntrinsicID() == Intrinsic::sqrt))
Ret = optimizeUnaryDoubleFP(CI, B, true);		Ret = optimizeUnaryDoubleFP(CI, B, true);

if (!CI->hasUnsafeAlgebra())		if (!CI->isFast())
return Ret;		return Ret;

Instruction *I = dyn_cast<Instruction>(CI->getArgOperand(0));		Instruction *I = dyn_cast<Instruction>(CI->getArgOperand(0));
if (!I || I->getOpcode() != Instruction::FMul || !I->hasUnsafeAlgebra())		if (!I || I->getOpcode() != Instruction::FMul || !I->isFast())
return Ret;		return Ret;

// We're looking for a repeated factor in a multiplication tree,		// We're looking for a repeated factor in a multiplication tree,
// so we can do this fold: sqrt(x * x) -> fabs(x);		// so we can do this fold: sqrt(x * x) -> fabs(x);
// or this fold: sqrt((x * x) * y) -> fabs(x) * sqrt(y).		// or this fold: sqrt((x * x) * y) -> fabs(x) * sqrt(y).
Value *Op0 = I->getOperand(0);		Value *Op0 = I->getOperand(0);
Value *Op1 = I->getOperand(1);		Value *Op1 = I->getOperand(1);
Value *RepeatOp = nullptr;		Value *RepeatOp = nullptr;
Value *OtherOp = nullptr;		Value *OtherOp = nullptr;
if (Op0 == Op1) {		if (Op0 == Op1) {
// Simple match: the operands of the multiply are identical.		// Simple match: the operands of the multiply are identical.
RepeatOp = Op0;		RepeatOp = Op0;
} else {		} else {
// Look for a more complicated pattern: one of the operands is itself		// Look for a more complicated pattern: one of the operands is itself
// a multiply, so search for a common factor in that multiply.		// a multiply, so search for a common factor in that multiply.
// Note: We don't bother looking any deeper than this first level or for		// Note: We don't bother looking any deeper than this first level or for
// variations of this pattern because instcombine's visitFMUL and/or the		// variations of this pattern because instcombine's visitFMUL and/or the
// reassociation pass should give us this form.		// reassociation pass should give us this form.
Value *OtherMul0, *OtherMul1;		Value *OtherMul0, *OtherMul1;
if (match(Op0, m_FMul(m_Value(OtherMul0), m_Value(OtherMul1)))) {		if (match(Op0, m_FMul(m_Value(OtherMul0), m_Value(OtherMul1)))) {
// Pattern: sqrt((x * y) * z)		// Pattern: sqrt((x * y) * z)
if (OtherMul0 == OtherMul1 &&		if (OtherMul0 == OtherMul1 && cast<Instruction>(Op0)->isFast()) {
cast<Instruction>(Op0)->hasUnsafeAlgebra()) {
// Matched: sqrt((x * x) * z)		// Matched: sqrt((x * x) * z)
RepeatOp = OtherMul0;		RepeatOp = OtherMul0;
OtherOp = Op1;		OtherOp = Op1;
}		}
}		}
}		}
if (!RepeatOp)		if (!RepeatOp)
return Ret;		return Ret;
Show All 29 Lines	if (UnsafeFPShrink && Name == "tan" && hasFloatVersion(Name))
Ret = optimizeUnaryDoubleFP(CI, B, true);		Ret = optimizeUnaryDoubleFP(CI, B, true);

Value *Op1 = CI->getArgOperand(0);		Value *Op1 = CI->getArgOperand(0);
auto *OpC = dyn_cast<CallInst>(Op1);		auto *OpC = dyn_cast<CallInst>(Op1);
if (!OpC)		if (!OpC)
return Ret;		return Ret;

// Both calls must allow unsafe optimizations in order to remove them.		// Both calls must allow unsafe optimizations in order to remove them.
if (!CI->hasUnsafeAlgebra() || !OpC->hasUnsafeAlgebra())		if (!CI->isFast() || !OpC->isFast())
return Ret;		return Ret;

// tan(atan(x)) -> x		// tan(atan(x)) -> x
// tanf(atanf(x)) -> x		// tanf(atanf(x)) -> x
// tanl(atanl(x)) -> x		// tanl(atanl(x)) -> x
LibFunc Func;		LibFunc Func;
Function *F = OpC->getCalledFunction();		Function *F = OpC->getCalledFunction();
if (F && TLI->getLibFunc(F->getName(), Func) && TLI->has(Func) &&		if (F && TLI->getLibFunc(F->getName(), Func) && TLI->has(Func) &&
▲ Show 20 Lines • Show All 712 Lines • ▼ Show 20 Lines	Value LibCallSimplifier::optimizeCall(CallInst CI) {

SmallVector<OperandBundleDef, 2> OpBundles;		SmallVector<OperandBundleDef, 2> OpBundles;
CI->getOperandBundlesAsDefs(OpBundles);		CI->getOperandBundlesAsDefs(OpBundles);
IRBuilder<> Builder(CI, /*FPMathTag=*/nullptr, OpBundles);		IRBuilder<> Builder(CI, /*FPMathTag=*/nullptr, OpBundles);
bool isCallingConvC = isCallingConvCCompatible(CI);		bool isCallingConvC = isCallingConvCCompatible(CI);

// Command-line parameter overrides instruction attribute.		// Command-line parameter overrides instruction attribute.
// This can't be moved to optimizeFloatingPointLibCall() because it may be		// This can't be moved to optimizeFloatingPointLibCall() because it may be
// used by the intrinsic optimizations.		// used by the intrinsic optimizations.
if (EnableUnsafeFPShrink.getNumOccurrences() > 0)		if (EnableUnsafeFPShrink.getNumOccurrences() > 0)
UnsafeFPShrink = EnableUnsafeFPShrink;		UnsafeFPShrink = EnableUnsafeFPShrink;
else if (isa<FPMathOperator>(CI) && CI->hasUnsafeAlgebra())		else if (isa<FPMathOperator>(CI) && CI->isFast())
UnsafeFPShrink = true;		UnsafeFPShrink = true;

// First, check for intrinsics.		// First, check for intrinsics.
if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(CI)) {		if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(CI)) {
if (!isCallingConvC)		if (!isCallingConvC)
return nullptr;		return nullptr;
// The FP intrinsics have corresponding constrained versions so we don't		// The FP intrinsics have corresponding constrained versions so we don't
// need to check for the StrictFP attribute here.		// need to check for the StrictFP attribute here.
▲ Show 20 Lines • Show All 300 Lines • Show Last 20 Lines

lib/Transforms/Vectorize/LoopVectorize.cpp


Show First 20 Lines • Show All 379 Lines • ▼ Show 20 Lines
/// TODO: We should use actual block probability here, if available. Currently,		/// TODO: We should use actual block probability here, if available. Currently,
/// we always assume predicated blocks have a 50% chance of executing.		/// we always assume predicated blocks have a 50% chance of executing.
static unsigned getReciprocalPredBlockProb() { return 2; }		static unsigned getReciprocalPredBlockProb() { return 2; }

/// A helper function that adds a 'fast' flag to floating-point operations.		/// A helper function that adds a 'fast' flag to floating-point operations.
static Value *addFastMathFlag(Value *V) {		static Value *addFastMathFlag(Value *V) {
if (isa<FPMathOperator>(V)) {		if (isa<FPMathOperator>(V)) {
FastMathFlags Flags;		FastMathFlags Flags;
Flags.setUnsafeAlgebra();		Flags.setFast();
cast<Instruction>(V)->setFastMathFlags(Flags);		cast<Instruction>(V)->setFastMathFlags(Flags);
}		}
return V;		return V;
}		}

/// A helper function that returns an integer or floating-point constant with		/// A helper function that returns an integer or floating-point constant with
/// value C.		/// value C.
static Constant *getSignedIntOrFpConstant(Type *Ty, int64_t C) {		static Constant *getSignedIntOrFpConstant(Type *Ty, int64_t C) {
▲ Show 20 Lines • Show All 2,326 Lines • ▼ Show 20 Lines	Value InnerLoopVectorizer::getStepVector(Value Val, int StartIdx, Value *Step,

// Add the consecutive indices to the vector value.		// Add the consecutive indices to the vector value.
Constant *Cv = ConstantVector::get(Indices);		Constant *Cv = ConstantVector::get(Indices);

Step = Builder.CreateVectorSplat(VLen, Step);		Step = Builder.CreateVectorSplat(VLen, Step);

// Floating point operations had to be 'fast' to enable the induction.		// Floating point operations had to be 'fast' to enable the induction.
FastMathFlags Flags;		FastMathFlags Flags;
Flags.setUnsafeAlgebra();		Flags.setFast();

Value *MulOp = Builder.CreateFMul(Cv, Step);		Value *MulOp = Builder.CreateFMul(Cv, Step);
if (isa<Instruction>(MulOp))		if (isa<Instruction>(MulOp))
// Have to check, MulOp may be a constant		// Have to check, MulOp may be a constant
cast<Instruction>(MulOp)->setFastMathFlags(Flags);		cast<Instruction>(MulOp)->setFastMathFlags(Flags);

Value *BOp = Builder.CreateBinOp(BinOp, Val, MulOp, "induction");		Value *BOp = Builder.CreateBinOp(BinOp, Val, MulOp, "induction");
if (isa<Instruction>(BOp))		if (isa<Instruction>(BOp))
▲ Show 20 Lines • Show All 2,659 Lines • ▼ Show 20 Lines	for (Instruction &I : *BB) {
}		}

// FP instructions can allow unsafe algebra, thus vectorizable by		// FP instructions can allow unsafe algebra, thus vectorizable by
// non-IEEE-754 compliant SIMD units.		// non-IEEE-754 compliant SIMD units.
// This applies to floating-point math operations and calls, not memory		// This applies to floating-point math operations and calls, not memory
// operations, shuffles, or casts, as they don't change precision or		// operations, shuffles, or casts, as they don't change precision or
// semantics.		// semantics.
} else if (I.getType()->isFloatingPointTy() && (CI || I.isBinaryOp()) &&		} else if (I.getType()->isFloatingPointTy() && (CI || I.isBinaryOp()) &&
!I.hasUnsafeAlgebra()) {		!I.isFast()) {
DEBUG(dbgs() << "LV: Found FP op with unsafe algebra.\n");		DEBUG(dbgs() << "LV: Found FP op with unsafe algebra.\n");
Hints->setPotentiallyUnsafe();		Hints->setPotentiallyUnsafe();
}		}

// Reduction instructions are allowed to have exit users.		// Reduction instructions are allowed to have exit users.
// All other instructions must not have external users.		// All other instructions must not have external users.
if (hasOutsideLoopUser(TheLoop, &I, AllowedExit)) {		if (hasOutsideLoopUser(TheLoop, &I, AllowedExit)) {
ORE->emit(createMissedAnalysis("ValueUsedOutsideLoop", &I)		ORE->emit(createMissedAnalysis("ValueUsedOutsideLoop", &I)
▲ Show 20 Lines • Show All 3,418 Lines • Show Last 20 Lines

lib/Transforms/Vectorize/SLPVectorizer.cpp

Show First 20 Lines • Show All 4,874 Lines • ▼ Show 20 Lines	bool isAssociative(Instruction *I) const {
assert(Kind != RK_None && *this && LHS && RHS &&		assert(Kind != RK_None && *this && LHS && RHS &&
"Expected reduction operation.");		"Expected reduction operation.");
switch (Kind) {		switch (Kind) {
case RK_Arithmetic:		case RK_Arithmetic:
return I->isAssociative();		return I->isAssociative();
case RK_Min:		case RK_Min:
case RK_Max:		case RK_Max:
return Opcode == Instruction::ICmp ||		return Opcode == Instruction::ICmp ||
cast<Instruction>(I->getOperand(0))->hasUnsafeAlgebra();		cast<Instruction>(I->getOperand(0))->isFast();
case RK_UMin:		case RK_UMin:
case RK_UMax:		case RK_UMax:
assert(Opcode == Instruction::ICmp &&		assert(Opcode == Instruction::ICmp &&
"Only integer compare operation is expected.");		"Only integer compare operation is expected.");
return true;		return true;
case RK_None:		case RK_None:
break;		break;
}		}
▲ Show 20 Lines • Show All 335 Lines • ▼ Show 20 Lines	bool tryToReduce(BoUpSLP &V, TargetTransformInfo *TTI) {
if (NumReducedVals < 4)		if (NumReducedVals < 4)
return false;		return false;

unsigned ReduxWidth = PowerOf2Floor(NumReducedVals);		unsigned ReduxWidth = PowerOf2Floor(NumReducedVals);

Value *VectorizedTree = nullptr;		Value *VectorizedTree = nullptr;
IRBuilder<> Builder(ReductionRoot);		IRBuilder<> Builder(ReductionRoot);
FastMathFlags Unsafe;		FastMathFlags Unsafe;
Unsafe.setUnsafeAlgebra();		Unsafe.setFast();
Builder.setFastMathFlags(Unsafe);		Builder.setFastMathFlags(Unsafe);
unsigned i = 0;		unsigned i = 0;

BoUpSLP::ExtraValueToDebugLocsMap ExternallyUsedValues;		BoUpSLP::ExtraValueToDebugLocsMap ExternallyUsedValues;
// The same extra argument may be used several time, so log each attempt		// The same extra argument may be used several time, so log each attempt
// to use it.		// to use it.
for (auto &Pair : ExtraArgs)		for (auto &Pair : ExtraArgs)
ExternallyUsedValues[Pair.second].push_back(Pair.first);		ExternallyUsedValues[Pair.second].push_back(Pair.first);
▲ Show 20 Lines • Show All 711 Lines • Show Last 20 Lines

test/Assembler/fast-math-flags.ll

; RUN: llvm-as < %s | llvm-dis | FileCheck %s		; RUN: llvm-as < %s | llvm-dis | FileCheck %s
; RUN: opt -S < %s | FileCheck %s		; RUN: opt -S < %s | FileCheck %s
; RUN: verify-uselistorder %s		; RUN: verify-uselistorder %s

@addr = external global i64		@addr = external global i64
@select = external global i1		@select = external global i1
@vec = external global <3 x float>		@vec = external global <3 x float>
@arr = external global [3 x float]		@arr = external global [3 x float]

		declare float @foo(float)

define float @none(float %x, float %y) {		define float @none(float %x, float %y) {
entry:		entry:
; CHECK: %vec = load <3 x float>, <3 x float>* @vec		; CHECK: %vec = load <3 x float>, <3 x float>* @vec
%vec = load <3 x float>, <3 x float>* @vec		%vec = load <3 x float>, <3 x float>* @vec
; CHECK: %select = load i1, i1* @select		; CHECK: %select = load i1, i1* @select
%select = load i1, i1* @select		%select = load i1, i1* @select
; CHECK: %arr = load [3 x float], [3 x float]* @arr		; CHECK: %arr = load [3 x float], [3 x float]* @arr
%arr = load [3 x float], [3 x float]* @arr		%arr = load [3 x float], [3 x float]* @arr
▲ Show 20 Lines • Show All 63 Lines • ▼ Show 20 Lines	; CHECK: %a = fsub contract float %x, %y
%a = fsub contract float %x, %y		%a = fsub contract float %x, %y
; CHECK: %b = fadd contract float %x, %y		; CHECK: %b = fadd contract float %x, %y
%b = fadd contract float %x, %y		%b = fadd contract float %x, %y
; CHECK: %c = fmul contract float %a, %b		; CHECK: %c = fmul contract float %a, %b
%c = fmul contract float %a, %b		%c = fmul contract float %a, %b
ret float %c		ret float %c
}		}

		; CHECK: @reassoc(
		define float @reassoc(float %x, float %y) {
		; CHECK: %a = fsub reassoc float %x, %y
		%a = fsub reassoc float %x, %y
		; CHECK: %b = fmul reassoc float %x, %y
		%b = fmul reassoc float %x, %y
		; CHECK: %c = call reassoc float @foo(float %b)
		wristow (inline comment): Is testing the 'reassoc' flag on the 'call' instruction really what you intended here? I would have thought add/sub/mul/div, but 'call' surprised me.
		spatel (author, inline comment): It is surprising but intentional. I want to highlight the fact that we allow any FMF on any FPMathOperator. So things like this or 'fadd arcp ...' are legal, but I'm not sure how that would be used for optimization. We may want to refine that someday?
		wristow (inline comment): OK, thanks for explaining.
		%c = call reassoc float @foo(float %b)
		ret float %c
		}

		; CHECK: @afn(
		define float @afn(float %x, float %y) {
		; CHECK: %a = fdiv afn float %x, %y
		%a = fdiv afn float %x, %y
		; CHECK: %b = frem afn float %x, %y
		%b = frem afn float %x, %y
		; CHECK: %c = call afn float @foo(float %b)
		%c = call afn float @foo(float %b)
		ret float %c
		}

; CHECK: no_nan_inf		; CHECK: no_nan_inf
define float @no_nan_inf(float %x, float %y) {		define float @no_nan_inf(float %x, float %y) {
entry:		entry:
; CHECK: %vec = load <3 x float>, <3 x float>* @vec		; CHECK: %vec = load <3 x float>, <3 x float>* @vec
%vec = load <3 x float>, <3 x float>* @vec		%vec = load <3 x float>, <3 x float>* @vec
; CHECK: %select = load i1, i1* @select		; CHECK: %select = load i1, i1* @select
%select = load i1, i1* @select		%select = load i1, i1* @select
; CHECK: %arr = load [3 x float], [3 x float]* @arr		; CHECK: %arr = load [3 x float], [3 x float]* @arr
Show All 28 Lines
entry:		entry:
; CHECK: %vec = load <3 x float>, <3 x float>* @vec		; CHECK: %vec = load <3 x float>, <3 x float>* @vec
%vec = load <3 x float>, <3 x float>* @vec		%vec = load <3 x float>, <3 x float>* @vec
; CHECK: %select = load i1, i1* @select		; CHECK: %select = load i1, i1* @select
%select = load i1, i1* @select		%select = load i1, i1* @select
; CHECK: %arr = load [3 x float], [3 x float]* @arr		; CHECK: %arr = load [3 x float], [3 x float]* @arr
%arr = load [3 x float], [3 x float]* @arr		%arr = load [3 x float], [3 x float]* @arr

; CHECK: %a = fadd nnan ninf float %x, %y		; CHECK: %a = fadd nnan ninf afn float %x, %y
%a = fadd ninf nnan float %x, %y		%a = fadd ninf nnan afn float %x, %y
; CHECK: %a_vec = fadd nnan <3 x float> %vec, %vec		; CHECK: %a_vec = fadd reassoc nnan <3 x float> %vec, %vec
%a_vec = fadd nnan <3 x float> %vec, %vec		%a_vec = fadd reassoc nnan <3 x float> %vec, %vec
; CHECK: %b = fsub fast float %x, %y		; CHECK: %b = fsub fast float %x, %y
%b = fsub nnan nsz fast float %x, %y		%b = fsub nnan nsz fast float %x, %y
; CHECK: %b_vec = fsub nnan <3 x float> %vec, %vec		; CHECK: %b_vec = fsub nnan <3 x float> %vec, %vec
%b_vec = fsub nnan <3 x float> %vec, %vec		%b_vec = fsub nnan <3 x float> %vec, %vec
; CHECK: %c = fmul fast float %x, %y		; CHECK: %c = fmul fast float %x, %y
%c = fmul nsz fast arcp float %x, %y		%c = fmul nsz fast arcp float %x, %y
; CHECK: %c_vec = fmul nsz <3 x float> %vec, %vec		; CHECK: %c_vec = fmul nsz <3 x float> %vec, %vec
%c_vec = fmul nsz <3 x float> %vec, %vec		%c_vec = fmul nsz <3 x float> %vec, %vec
Show All 11 Lines
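
The CHECK lines in this test suggest two printing rules: individual flags come out in a fixed order regardless of how they were written (`ninf nnan afn` prints as `nnan ninf afn`), and a flag list containing `fast` prints as just `fast`. The following is a minimal standalone sketch of that behavior, not the AsmWriter code. The bit values and the names `FlagTable`, `parseFlags`, and `printFlags` are assumptions made for illustration; only the textual flag names and their order are taken from the tests above.

```cpp
#include <sstream>
#include <string>

// Hypothetical flag table for this sketch. The bit values are illustrative;
// the names and their order follow the CHECK lines in the test.
struct FlagName {
  unsigned Bit;
  const char *Name;
};

static const FlagName FlagTable[] = {
    {1u << 0, "reassoc"}, {1u << 1, "nnan"}, {1u << 2, "ninf"},
    {1u << 3, "nsz"},     {1u << 4, "arcp"}, {1u << 5, "contract"},
    {1u << 6, "afn"},
};
static const unsigned AllFlags = 0x7F;

// Build a mask from a whitespace-separated flag list. 'fast' sets every bit.
// Unknown tokens are ignored in this sketch.
unsigned parseFlags(const std::string &Text) {
  std::istringstream In(Text);
  std::string Tok;
  unsigned Mask = 0;
  while (In >> Tok) {
    if (Tok == "fast") {
      Mask |= AllFlags;
      continue;
    }
    for (const FlagName &F : FlagTable)
      if (Tok == F.Name)
        Mask |= F.Bit;
  }
  return Mask;
}

// Print 'fast' when every bit is set, otherwise the set flags in table order.
std::string printFlags(unsigned Mask) {
  if ((Mask & AllFlags) == AllFlags)
    return "fast";
  std::string Out;
  for (const FlagName &F : FlagTable) {
    if (!(Mask & F.Bit))
      continue;
    if (!Out.empty())
      Out += ' ';
    Out += F.Name;
  }
  return Out;
}
```

Under these assumptions, `printFlags(parseFlags("nnan nsz fast"))` yields `fast`, which mirrors the `%b = fsub nnan nsz fast` input and its `fsub fast` CHECK line.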

test/Bitcode/compatibility-3.6.ll

Show First 20 Lines • Show All 606 Lines • ▼ Show 20 Lines	define void @fastmathflags(float %op1, float %op2) {
; CHECK: %f.nnan = fadd nnan float %op1, %op2		; CHECK: %f.nnan = fadd nnan float %op1, %op2
%f.ninf = fadd ninf float %op1, %op2		%f.ninf = fadd ninf float %op1, %op2
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
ret void		ret void
}		}

;; Type System		;; Type System
%opaquety = type opaque		%opaquety = type opaque
define void @typesystem() {		define void @typesystem() {
%p0 = bitcast i8* null to i32 (i32)*		%p0 = bitcast i8* null to i32 (i32)*
; CHECK: %p0 = bitcast i8* null to i32 (i32)*		; CHECK: %p0 = bitcast i8* null to i32 (i32)*
▲ Show 20 Lines • Show All 585 Lines • Show Last 20 Lines
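
The updated CHECK lines follow from how an old 'fast' is reinterpreted: the former umbrella bit is now read as 'reassoc' only, and older binary test files never set the bits for flags that did not exist when they were written. Here is a rough sketch of that reasoning, assuming a hypothetical bit layout (the constants and the helper names `legacyFastMask`, `isFast`, and `missingForFast` are illustrative, not the bitcode reader's API).

```cpp
// Hypothetical bit layout for this sketch; the values are assumptions.
constexpr unsigned UmbrellaOrReassoc = 1u << 0; // 'UnsafeAlgebra' before, 'reassoc' after
constexpr unsigned NoNaNs = 1u << 1;
constexpr unsigned NoInfs = 1u << 2;
constexpr unsigned NoSignedZeros = 1u << 3;
constexpr unsigned AllowReciprocal = 1u << 4;
constexpr unsigned AllowContract = 1u << 5; // absent from the older test binaries
constexpr unsigned ApproxFunc = 1u << 6;    // 'afn', introduced by this change
constexpr unsigned AllFlags = 0x7F;

// Bits an old writer would have emitted for 'fast': the umbrella bit plus
// every flag it knew about at the time.
constexpr unsigned legacyFastMask(bool WriterKnewContract) {
  return UmbrellaOrReassoc | NoNaNs | NoInfs | NoSignedZeros |
         AllowReciprocal | (WriterKnewContract ? AllowContract : 0u);
}

// Under the new definition, 'fast' means that every flag is set.
constexpr bool isFast(unsigned Mask) { return (Mask & AllFlags) == AllFlags; }

// Flags that would still be needed before the mask counts as 'fast'.
constexpr unsigned missingForFast(unsigned Mask) { return AllFlags & ~Mask; }

static_assert(!isFast(legacyFastMask(false)), "pre-contract 'fast' is partial");
static_assert(!isFast(legacyFastMask(true)), "pre-afn 'fast' is partial");
```

With this model, a mask written before 'contract' existed is missing both 'contract' and 'afn', which matches the `reassoc nnan ninf nsz arcp` output expected for the 3.6 through 4.0 files. A mask from the 5.0 file is missing only 'afn'.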

test/Bitcode/compatibility-3.7.ll

Show First 20 Lines • Show All 650 Lines • ▼ Show 20 Lines	define void @fastmathflags(float %op1, float %op2) {
; CHECK: %f.nnan = fadd nnan float %op1, %op2		; CHECK: %f.nnan = fadd nnan float %op1, %op2
%f.ninf = fadd ninf float %op1, %op2		%f.ninf = fadd ninf float %op1, %op2
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
ret void		ret void
}		}

;; Type System		;; Type System
%opaquety = type opaque		%opaquety = type opaque
define void @typesystem() {		define void @typesystem() {
%p0 = bitcast i8* null to i32 (i32)*		%p0 = bitcast i8* null to i32 (i32)*
; CHECK: %p0 = bitcast i8* null to i32 (i32)*		; CHECK: %p0 = bitcast i8* null to i32 (i32)*
▲ Show 20 Lines • Show All 614 Lines • Show Last 20 Lines

test/Bitcode/compatibility-3.8.ll

Show First 20 Lines • Show All 681 Lines • ▼ Show 20 Lines	define void @fastmathflags(float %op1, float %op2) {
; CHECK: %f.nnan = fadd nnan float %op1, %op2		; CHECK: %f.nnan = fadd nnan float %op1, %op2
%f.ninf = fadd ninf float %op1, %op2		%f.ninf = fadd ninf float %op1, %op2
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
ret void		ret void
}		}

; Check various fast math flags and floating-point types on calls.		; Check various fast math flags and floating-point types on calls.

declare float @fmf1()		declare float @fmf1()
declare double @fmf2()		declare double @fmf2()
declare <4 x double> @fmf3()		declare <4 x double> @fmf3()

; CHECK-LABEL: fastMathFlagsForCalls(		; CHECK-LABEL: fastMathFlagsForCalls(
define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {		define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
%call.fast = call fast float @fmf1()		%call.fast = call fast float @fmf1()
; CHECK: %call.fast = call fast float @fmf1()		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1()

; Throw in some other attributes to make sure those stay in the right places.		; Throw in some other attributes to make sure those stay in the right places.

%call.nsz.arcp = notail call nsz arcp double @fmf2()		%call.nsz.arcp = notail call nsz arcp double @fmf2()
; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()		; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()

%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
▲ Show 20 Lines • Show All 880 Lines • Show Last 20 Lines

test/Bitcode/compatibility-3.9.ll

Show First 20 Lines • Show All 752 Lines • ▼ Show 20 Lines	define void @fastmathflags(float %op1, float %op2) {
; CHECK: %f.nnan = fadd nnan float %op1, %op2		; CHECK: %f.nnan = fadd nnan float %op1, %op2
%f.ninf = fadd ninf float %op1, %op2		%f.ninf = fadd ninf float %op1, %op2
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
ret void		ret void
}		}

; Check various fast math flags and floating-point types on calls.		; Check various fast math flags and floating-point types on calls.

declare float @fmf1()		declare float @fmf1()
declare double @fmf2()		declare double @fmf2()
declare <4 x double> @fmf3()		declare <4 x double> @fmf3()

; CHECK-LABEL: fastMathFlagsForCalls(		; CHECK-LABEL: fastMathFlagsForCalls(
define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {		define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
%call.fast = call fast float @fmf1()		%call.fast = call fast float @fmf1()
; CHECK: %call.fast = call fast float @fmf1()		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1()

; Throw in some other attributes to make sure those stay in the right places.		; Throw in some other attributes to make sure those stay in the right places.

%call.nsz.arcp = notail call nsz arcp double @fmf2()		%call.nsz.arcp = notail call nsz arcp double @fmf2()
; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()		; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()

%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
▲ Show 20 Lines • Show All 883 Lines • Show Last 20 Lines

test/Bitcode/compatibility-4.0.ll

Show First 20 Lines • Show All 751 Lines • ▼ Show 20 Lines	define void @fastmathflags(float %op1, float %op2) {
%f.nnan = fadd nnan float %op1, %op2		%f.nnan = fadd nnan float %op1, %op2
; CHECK: %f.nnan = fadd nnan float %op1, %op2		; CHECK: %f.nnan = fadd nnan float %op1, %op2
%f.ninf = fadd ninf float %op1, %op2		%f.ninf = fadd ninf float %op1, %op2
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2
ret void		ret void
}		}

; Check various fast math flags and floating-point types on calls.		; Check various fast math flags and floating-point types on calls.

declare float @fmf1()		declare float @fmf1()
declare double @fmf2()		declare double @fmf2()
declare <4 x double> @fmf3()		declare <4 x double> @fmf3()

; CHECK-LABEL: fastMathFlagsForCalls(		; CHECK-LABEL: fastMathFlagsForCalls(
define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {		define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
%call.fast = call fast float @fmf1()		%call.fast = call fast float @fmf1()
; CHECK: %call.fast = call fast float @fmf1()		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'.
		; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1()

; Throw in some other attributes to make sure those stay in the right places.		; Throw in some other attributes to make sure those stay in the right places.

%call.nsz.arcp = notail call nsz arcp double @fmf2()		%call.nsz.arcp = notail call nsz arcp double @fmf2()
; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()		; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()

%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
▲ Show 20 Lines • Show All 908 Lines • Show Last 20 Lines

test/Bitcode/compatibility-5.0.ll

Show First 20 Lines • Show All 759 Lines • ▼ Show 20 Lines	define void @fastmathflags(float %op1, float %op2) {
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
%f.contract = fadd contract float %op1, %op2		%f.contract = fadd contract float %op1, %op2
; CHECK: %f.contract = fadd contract float %op1, %op2		; CHECK: %f.contract = fadd contract float %op1, %op2
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'afn' bit set, so this is not fully 'fast'.
		; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp contract float %op1, %op2
ret void		ret void
}		}

; Check various fast math flags and floating-point types on calls.		; Check various fast math flags and floating-point types on calls.

declare float @fmf1()		declare float @fmf1()
declare double @fmf2()		declare double @fmf2()
declare <4 x double> @fmf3()		declare <4 x double> @fmf3()

; CHECK-LABEL: fastMathFlagsForCalls(		; CHECK-LABEL: fastMathFlagsForCalls(
define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {		define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) {
%call.fast = call fast float @fmf1()		%call.fast = call fast float @fmf1()
; CHECK: %call.fast = call fast float @fmf1()		; 'fast' used to be its own bit, but this changed in Oct 2017.
		; The binary test file does not have the newer 'afn' bit set, so this is not fully 'fast'.
		; CHECK: %call.fast = call reassoc nnan ninf nsz arcp contract float @fmf1()

; Throw in some other attributes to make sure those stay in the right places.		; Throw in some other attributes to make sure those stay in the right places.

%call.nsz.arcp = notail call nsz arcp double @fmf2()		%call.nsz.arcp = notail call nsz arcp double @fmf2()
; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()		; CHECK: %call.nsz.arcp = notail call nsz arcp double @fmf2()

%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		%call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()
; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()		; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc <4 x double> @fmf3()

test/Bitcode/compatibility.ll

(769 lines not shown)	define void @fastmathflags(float %op1, float %op2) {
%f.ninf = fadd ninf float %op1, %op2		%f.ninf = fadd ninf float %op1, %op2
; CHECK: %f.ninf = fadd ninf float %op1, %op2		; CHECK: %f.ninf = fadd ninf float %op1, %op2
%f.nsz = fadd nsz float %op1, %op2		%f.nsz = fadd nsz float %op1, %op2
; CHECK: %f.nsz = fadd nsz float %op1, %op2		; CHECK: %f.nsz = fadd nsz float %op1, %op2
%f.arcp = fadd arcp float %op1, %op2		%f.arcp = fadd arcp float %op1, %op2
; CHECK: %f.arcp = fadd arcp float %op1, %op2		; CHECK: %f.arcp = fadd arcp float %op1, %op2
%f.contract = fadd contract float %op1, %op2		%f.contract = fadd contract float %op1, %op2
; CHECK: %f.contract = fadd contract float %op1, %op2		; CHECK: %f.contract = fadd contract float %op1, %op2
		%f.afn = fadd afn float %op1, %op2
		; CHECK: %f.afn = fadd afn float %op1, %op2
		%f.reassoc = fadd reassoc float %op1, %op2
		; CHECK: %f.reassoc = fadd reassoc float %op1, %op2
%f.fast = fadd fast float %op1, %op2		%f.fast = fadd fast float %op1, %op2
; CHECK: %f.fast = fadd fast float %op1, %op2		; CHECK: %f.fast = fadd fast float %op1, %op2
ret void		ret void
}		}

; Check various fast math flags and floating-point types on calls.		; Check various fast math flags and floating-point types on calls.

declare float @fmf1()		declare float @fmf1()

unittests/IR/IRBuilderTest.cpp

(138 lines not shown)	TEST_F(IRBuilderTest, FastMathFlags) {
EXPECT_FALSE(Builder.getFastMathFlags().any());		EXPECT_FALSE(Builder.getFastMathFlags().any());
ASSERT_TRUE(isa<Instruction>(F));		ASSERT_TRUE(isa<Instruction>(F));
FAdd = cast<Instruction>(F);		FAdd = cast<Instruction>(F);
EXPECT_FALSE(FAdd->hasNoNaNs());		EXPECT_FALSE(FAdd->hasNoNaNs());

FastMathFlags FMF;		FastMathFlags FMF;
Builder.setFastMathFlags(FMF);		Builder.setFastMathFlags(FMF);

		// By default, no flags are set.
F = Builder.CreateFAdd(F, F);		F = Builder.CreateFAdd(F, F);
EXPECT_FALSE(Builder.getFastMathFlags().any());		EXPECT_FALSE(Builder.getFastMathFlags().any());
		ASSERT_TRUE(isa<Instruction>(F));
		FAdd = cast<Instruction>(F);
		EXPECT_FALSE(FAdd->hasNoNaNs());
		EXPECT_FALSE(FAdd->hasNoInfs());
		EXPECT_FALSE(FAdd->hasNoSignedZeros());
		EXPECT_FALSE(FAdd->hasAllowReciprocal());
		EXPECT_FALSE(FAdd->hasAllowContract());
		EXPECT_FALSE(FAdd->hasAllowReassoc());
		EXPECT_FALSE(FAdd->hasApproxFunc());

FMF.setUnsafeAlgebra();		// Set all flags in the instruction.
		FAdd->setFast(true);
		EXPECT_TRUE(FAdd->hasNoNaNs());
		EXPECT_TRUE(FAdd->hasNoInfs());
		EXPECT_TRUE(FAdd->hasNoSignedZeros());
		EXPECT_TRUE(FAdd->hasAllowReciprocal());
		EXPECT_TRUE(FAdd->hasAllowContract());
		EXPECT_TRUE(FAdd->hasAllowReassoc());
		EXPECT_TRUE(FAdd->hasApproxFunc());

		// All flags are set in the builder.
		FMF.setFast();
Builder.setFastMathFlags(FMF);		Builder.setFastMathFlags(FMF);

F = Builder.CreateFAdd(F, F);		F = Builder.CreateFAdd(F, F);
EXPECT_TRUE(Builder.getFastMathFlags().any());		EXPECT_TRUE(Builder.getFastMathFlags().any());
		EXPECT_TRUE(Builder.getFastMathFlags().all());
ASSERT_TRUE(isa<Instruction>(F));		ASSERT_TRUE(isa<Instruction>(F));
FAdd = cast<Instruction>(F);		FAdd = cast<Instruction>(F);
EXPECT_TRUE(FAdd->hasNoNaNs());		EXPECT_TRUE(FAdd->hasNoNaNs());
		EXPECT_TRUE(FAdd->isFast());

// Now, try it with CreateBinOp		// Now, try it with CreateBinOp
F = Builder.CreateBinOp(Instruction::FAdd, F, F);		F = Builder.CreateBinOp(Instruction::FAdd, F, F);
EXPECT_TRUE(Builder.getFastMathFlags().any());		EXPECT_TRUE(Builder.getFastMathFlags().any());
ASSERT_TRUE(isa<Instruction>(F));		ASSERT_TRUE(isa<Instruction>(F));
FAdd = cast<Instruction>(F);		FAdd = cast<Instruction>(F);
EXPECT_TRUE(FAdd->hasNoNaNs());		EXPECT_TRUE(FAdd->hasNoNaNs());
		EXPECT_TRUE(FAdd->isFast());

F = Builder.CreateFDiv(F, F);		F = Builder.CreateFDiv(F, F);
EXPECT_TRUE(Builder.getFastMathFlags().any());		EXPECT_TRUE(Builder.getFastMathFlags().all());
EXPECT_TRUE(Builder.getFastMathFlags().UnsafeAlgebra);
ASSERT_TRUE(isa<Instruction>(F));		ASSERT_TRUE(isa<Instruction>(F));
FDiv = cast<Instruction>(F);		FDiv = cast<Instruction>(F);
EXPECT_TRUE(FDiv->hasAllowReciprocal());		EXPECT_TRUE(FDiv->hasAllowReciprocal());

		// Clear all FMF in the builder.
Builder.clearFastMathFlags();		Builder.clearFastMathFlags();

F = Builder.CreateFDiv(F, F);		F = Builder.CreateFDiv(F, F);
ASSERT_TRUE(isa<Instruction>(F));		ASSERT_TRUE(isa<Instruction>(F));
FDiv = cast<Instruction>(F);		FDiv = cast<Instruction>(F);
EXPECT_FALSE(FDiv->hasAllowReciprocal());		EXPECT_FALSE(FDiv->hasAllowReciprocal());

		// Try individual flags.
FMF.clear();		FMF.clear();
FMF.setAllowReciprocal();		FMF.setAllowReciprocal();
Builder.setFastMathFlags(FMF);		Builder.setFastMathFlags(FMF);

F = Builder.CreateFDiv(F, F);		F = Builder.CreateFDiv(F, F);
EXPECT_TRUE(Builder.getFastMathFlags().any());		EXPECT_TRUE(Builder.getFastMathFlags().any());
EXPECT_TRUE(Builder.getFastMathFlags().AllowReciprocal);		EXPECT_TRUE(Builder.getFastMathFlags().AllowReciprocal);
ASSERT_TRUE(isa<Instruction>(F));		ASSERT_TRUE(isa<Instruction>(F));
(32 lines not shown)	TEST_F(IRBuilderTest, FastMathFlags) {

FC = Builder.CreateFAdd(F, F);		FC = Builder.CreateFAdd(F, F);
EXPECT_TRUE(Builder.getFastMathFlags().any());		EXPECT_TRUE(Builder.getFastMathFlags().any());
EXPECT_TRUE(Builder.getFastMathFlags().AllowContract);		EXPECT_TRUE(Builder.getFastMathFlags().AllowContract);
ASSERT_TRUE(isa<Instruction>(FC));		ASSERT_TRUE(isa<Instruction>(FC));
FAdd = cast<Instruction>(FC);		FAdd = cast<Instruction>(FC);
EXPECT_TRUE(FAdd->hasAllowContract());		EXPECT_TRUE(FAdd->hasAllowContract());

		FMF.setApproxFunc();
Builder.clearFastMathFlags();		Builder.clearFastMathFlags();
		Builder.setFastMathFlags(FMF);
		// Now 'afn' and 'contract' are set.
		F = Builder.CreateFMul(F, F);
		FAdd = cast<Instruction>(F);
		EXPECT_TRUE(FAdd->hasApproxFunc());
		EXPECT_TRUE(FAdd->hasAllowContract());
		EXPECT_FALSE(FAdd->hasAllowReassoc());

		FMF.setAllowReassoc();
		Builder.clearFastMathFlags();
		Builder.setFastMathFlags(FMF);
		// Now 'afn' and 'contract' and 'reassoc' are set.
		F = Builder.CreateFMul(F, F);
		FAdd = cast<Instruction>(F);
		EXPECT_TRUE(FAdd->hasApproxFunc());
		EXPECT_TRUE(FAdd->hasAllowContract());
		EXPECT_TRUE(FAdd->hasAllowReassoc());

// Test a call with FMF.		// Test a call with FMF.
auto CalleeTy = FunctionType::get(Type::getFloatTy(Ctx),		auto CalleeTy = FunctionType::get(Type::getFloatTy(Ctx),
/*isVarArg=*/false);		/*isVarArg=*/false);
auto Callee =		auto Callee =
Function::Create(CalleeTy, Function::ExternalLinkage, "", M.get());		Function::Create(CalleeTy, Function::ExternalLinkage, "", M.get());

FCall = Builder.CreateCall(Callee, None);		FCall = Builder.CreateCall(Callee, None);
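
The test above sets flags in two separate places: directly on an existing instruction (`FAdd->setFast(true)`) and on the builder (`Builder.setFastMathFlags(FMF)`), where they apply to instructions created afterwards. A toy model of that snapshot behaviour follows. `ToyFlags`, `ToyInst`, `ToyBuilder` and the bit constants are hypothetical stand-ins, not the LLVM classes.

```cpp
#include <cstdint>

// Hypothetical bit positions, assumed for illustration only.
constexpr uint8_t kReassoc = 1 << 0, kContract = 1 << 5, kApproxFunc = 1 << 6;
constexpr uint8_t kAll = 0x7F;

// Stand-in for a fast-math flag set.
struct ToyFlags {
  uint8_t Bits = 0;
  void set(uint8_t Bit) { Bits |= Bit; }
  void setFast() { Bits = kAll; }  // 'fast' means every flag is set
  void clear() { Bits = 0; }
  bool has(uint8_t Bit) const { return (Bits & Bit) != 0; }
  bool any() const { return Bits != 0; }
  bool all() const { return Bits == kAll; }
};

// Stand-in for an instruction that carries its own copy of the flags.
struct ToyInst {
  ToyFlags Flags;
};

// Stand-in for the builder.
struct ToyBuilder {
  ToyFlags Flags;
  // Each new instruction gets a snapshot of the builder's current flags.
  ToyInst create() const { return ToyInst{Flags}; }
};
```

In this model, changing an instruction's flags leaves the builder untouched, and clearing the builder's flags does not alter instructions that were already created. The real test checks the same behaviour with `any()`, `all()` and `isFast()`.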

Revision Contents

Diff 120909

docs/LangRef.rst

include/llvm/IR/Instruction.h

include/llvm/IR/Operator.h

include/llvm/Transforms/Utils/LoopUtils.h

lib/AsmParser/LLLexer.cpp

lib/AsmParser/LLParser.h

lib/AsmParser/LLToken.h

lib/Bitcode/Reader/BitcodeReader.cpp

lib/Bitcode/Writer/BitcodeWriter.cpp

lib/CodeGen/ExpandReductions.cpp

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

lib/IR/AsmWriter.cpp

lib/IR/Instruction.cpp

lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp

lib/Target/AMDGPU/AMDGPULibCalls.cpp

lib/Transforms/InstCombine/InstCombineAddSub.cpp

lib/Transforms/InstCombine/InstCombineCalls.cpp

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

lib/Transforms/Scalar/Reassociate.cpp

lib/Transforms/Utils/LoopUtils.cpp

lib/Transforms/Utils/SimplifyLibCalls.cpp

lib/Transforms/Vectorize/LoopVectorize.cpp

lib/Transforms/Vectorize/SLPVectorizer.cpp

test/Assembler/fast-math-flags.ll

test/Bitcode/compatibility-3.6.ll

test/Bitcode/compatibility-3.7.ll

test/Bitcode/compatibility-3.8.ll

test/Bitcode/compatibility-3.9.ll

test/Bitcode/compatibility-4.0.ll

test/Bitcode/compatibility-5.0.ll

test/Bitcode/compatibility.ll

unittests/IR/IRBuilderTest.cpp
