This is an archive of the discontinued LLVM Phabricator instance.

Add intrinsics for constrained floating point operations
Closed, Public

Authored by andrew.w.kaylor on Nov 22 2016, 5:32 PM.

Details

Reviewers

aemerson
DavidKreitzer
mehdi_amini
hfinkel

Commits

rGa0a1164ce41b: Add intrinsics for constrained floating point operations
rL293226: Add intrinsics for constrained floating point operations

Summary

This adds intrinsics that can be used to constrain optimizations that assume the default rounding mode and ignore FP exceptions.

I am starting with a simple lowering that translates the intrinsics directly to the corresponding target-independent FP operations when the selection DAG is built. I think this is necessary because the existing target-specific pattern matching for FP operations is fairly complex and I think attempting to duplicate all of that infrastructure is likely to result in errors and will certainly result in more work each time new wrinkles are introduced in the future.

I intend to model the implicit uses and defs of FP control and status registers in a future revision, which should be sufficient to prevent unwanted optimizations at the MachineInstr level. There is also a potential for incorrect code motion during instruction selection. My tentative plan to handle that is to introduce pseudo-instruction nodes that wrap the inputs and outputs of the FP operations created based on the new intrinsics and effectively model the implicit FP control and status register behavior and use a chain. These nodes would then be eliminated during instruction selection.

I believe that this patch is sufficient as presented here and can be committed without any solution to the potential problems with code motion at the ISel and MachineInstr levels.

Diff Detail

Repository: rL LLVM

Event Timeline

andrew.w.kaylor updated this revision to Diff 79005. Nov 22 2016, 5:32 PM

andrew.w.kaylor retitled this revision to Add intrinsics for constrained floating point operations.

andrew.w.kaylor updated this object.

andrew.w.kaylor added reviewers: mehdi_amini, aemerson, DavidKreitzer, hfinkel.

andrew.w.kaylor set the repository for this revision to rL LLVM.

andrew.w.kaylor added a subscriber: llvm-commits.

RKSimon added a subscriber: RKSimon. Nov 23 2016, 3:26 AM

I intend to model the implicit uses and defs of FP control and status registers in a future revision, which should be sufficient to prevent unwanted optimizations at the MachineInstr level. There is also a potential for incorrect code motion during instruction selection. My tentative plan to handle that is to introduce pseudo-instruction nodes that wrap the inputs and outputs of the FP operations created based on the new intrinsics and effectively model the implicit FP control and status register behavior and use a chain. These nodes would then be eliminated during instruction selection.

Based on our conversation at the dev meeting, here's how I thought this would work:

Introduce target-independent chain-carrying nodes to represent these operations. For argument's sake, STRICT_FADD, etc.
Since nothing in the SDAG knows about what these nodes do, there's no problem with optimizations doing bad things.
You'd do something minimal in SelectionDAGISel::DoInstructionSelection() around the call to:

Select(Node);

so that it would become:

bool IsStrictFPOp = isStrictFPOp(Node);
if (IsStrictFPOp)
  mutateStrictFPToFP(Node); // STRICT_FADD -> FADD, etc.

Select(Node);

if (IsStrictFPOp && !TLI->addStrictFPRegDeps(Node))
  report_fatal_error("Could not add strict FP reg deps");

and then you'd be done. Obviously this is somewhat hand wavy, but if there are complexities I'm overlooking, I'd like to understand them.

In D27028#604172, @hfinkel wrote:
Based on our conversation at the dev meeting, here's how I thought this would work:

Introduce target-independent chain-carrying nodes to represent these operations. For argument's sake, STRICT_FADD, etc.

Since nothing in the SDAG knows about what these nodes do, there's no problem with optimizations doing bad things.

You'd do something minimal in SelectionDAGISel::DoInstructionSelection() around the call to:

Select(Node);

so that it would become:
bool IsStrictFPOp = isStrictFPOp(Node);
if (IsStrictFPOp)
  mutateStrictFPToFP(Node); // STRICT_FADD -> FADD, etc.

Select(Node);

if (IsStrictFPOp && !TLI->addStrictFPRegDeps(Node))
  report_fatal_error("Could not add strict FP reg deps");
and then you'd be done. Obviously this is somewhat hand wavy, but if there are complexities I'm overlooking, I'd like to understand them.

Yes, that's exactly what I wanted to do, but I couldn't figure out how to do it. This is my first time exploring in ISel and it's been a bit intimidating. The way you describe it sounds very simple. I guess I was just reluctant to do that because there are currently no target-independent transformations happening there. I tried putting something in SelectionDAGISel::SelectCommonCode() but it seemed like that was too late for the kind of double transformation I needed (STRICT_FADD->FADD->target-specific instruction). I'm always hesitant to believe that my new feature is special and really needs something that has never been needed before. Usually that's a signal that I'm misunderstanding the design. So I tried to do something that seemed a more natural fit with the existing code.

I'll go back and see what I can do with it following the model you describe here and see what it looks like.

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

In D27028#604594, @arsenm wrote:

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

In D27028#604603, @hfinkel wrote:

In D27028#604594, @arsenm wrote:

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

I was going to ask that same question.

For the purposes of these intrinsics, we only need enough information to control how optimization passes will handle these instructions. I suppose a transformation like constant folding could theoretically have something to flush to zero. Does it make sense to treat that as a set of new rounding modes, like LLVM_ROUND_TONEAREST_FTZ, etc.?

In D27028#604609, @andrew.w.kaylor wrote:

In D27028#604603, @hfinkel wrote:

In D27028#604594, @arsenm wrote:

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

I was going to ask that same question.

For the purposes of these intrinsics, we only need enough information to control how optimization passes will handle these instructions. I suppose a transformation like constant folding could theoretically have something to flush to zero. Does it make sense to treat that as a set of new rounding modes, like LLVM_ROUND_TONEAREST_FTZ, etc.?

Yes, I think it makes sense to treat it the same as a rounding mode. For lowering some of these operations, we would need to change the denormal mode for some subsequence of the lowering. Some of the OpenCL library builtin functions also need to do this, and need to be able to control this via the intrinsic.

Is flush-to-zero currently handled with function attributes?

Also, how would you feel about deferring flush-to-zero support to a later patch?

In D27028#604694, @andrew.w.kaylor wrote:

Is flush-to-zero currently handled with function attributes?

Also, how would you feel about deferring flush-to-zero support to a later patch?

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

In D27028#604699, @arsenm wrote:

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

It looks like several sub-targets have some kind of denormal support, but I didn't look closely enough to see what each of them is doing. That's kind of why I would prefer to defer the issue...because I haven't thought about it enough to know that I'd be implementing it correctly.

So to get back to Hal's question, is this needed at a per-instruction level? I understand that you want to select instructions differently based on this setting, but is attaching it to the instruction just a convenient way to get the information to the ISel or can this change across different scopes?

In D27028#604714, @andrew.w.kaylor wrote:

In D27028#604699, @arsenm wrote:

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

It looks like several sub-targets have some kind of denormal support, but I didn't look closely enough to see what each of them is doing. That's kind of why I would prefer to defer the issue...because I haven't thought about it enough to know that I'd be implementing it correctly.

So to get back to Hal's question, is this needed at a per-instruction level? I understand that you want to select instructions differently based on this setting, but is attaching it to the instruction just a convenient way to get the information to the ISel or can this change across different scopes?

It's the same as the rounding mode and would be per-instruction.

I re-wrote the ISel code to introduce a pseudo-instruction for the strict variants of FP operations, which is mutated directly to a normal FP node just before instruction selection. I removed the SDNodeFlag extensions I had added in my original patch because they aren't needed in this implementation. Some variant of that code will likely need to be re-introduced at some point, particularly to handle FTZ rounding modes.

I did not introduce the code to add a target-specific implicit use of the FP-environment register. I do intend to add this in a later patch, but this code works well enough without it that I believe it can be deferred.

At some point it will be necessary to teach the LegalizeDAG code how to handle the strict instructions in order to legalize strict FP nodes with vector operands. I intend to do that in a later patch also.

b-sumner added a subscriber: b-sumner. Nov 30 2016, 9:41 AM

I did not introduce the code to add a target-specific implicit use of the FP-environment register. I do intend to add this in a later patch, but this code works well enough without it that I believe it can be deferred.

I'm fine with doing this, but there should be FIXME comments in the code about this and/or bug reports filed (preferably both) so that we're not tempted to remove 'experimental' from the name until we do this part of it.

Regarding denormals, LLVM has support for the following modes (include/llvm/Target/TargetOptions.h):

namespace FPDenormal {
  enum DenormalMode {
    IEEE,           // IEEE 754 denormal numbers
    PreserveSign,   // the sign of a flushed-to-zero number is preserved in
                    // the sign of 0
    PositiveZero    // denormals are flushed to positive zero
  };
}

and so I recommend just following the same pattern here as with the rounding mode with strings like "denormals.dynamic", "denormals.ieee", "denormals.preserve_sign", and "denormals.positive_zero".

docs/LangRef.rst
12046	To engage in some bikeshedding, I really don't like this naming convention which reminds me of macro names. Not that we've been extremely consistent about the standardized metadata strings we already use, but this would be yet another choice; we should avoid that. For controlling LLVM optimization passes, we use strings like this !"llvm.loop.vectorize.enable". We also use strings like !"function_entry_count" and !"branch_weights" for profiling info. We also have strings like !"ProfileFormat" and !"TotalCount" in other places. Of these choices, I prefer the one used by our basic profiling information (i.e. !"branch_weights"), and so I'd prefer these be named: "round_dynamic" "round_downward" ... and similar for the fpexcept strings. We might also borrow the dot-based hierarchy scheme from our loop optimization metadata and use: "round.dynamic" "round.downward" ... and similar for the fpexcept strings. I think that I like this slightly better than just using the underscore separators. I specifically dislike having "llvm" in the name here, as it makes it seem as though the meaning of these things is somehow LLVM-specific. It is not (they come from IEEE if nothing else). Having "llvm" in the optimization metadata makes a bit more sense because those refer to specific LLVM optimization passes.
12083	"not be unmasked" is a double negative. How about just saying will be masked? Or will not be enabled?

Few stylistic comments.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5773	No need for a SmallVector. Just create a 3 entry array and assign each element directly.
lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
955 ↗	(On Diff #79447)	This doesn't need to be a SmallVector. You can just use SDValue Ops[2] = { Node->getOperand(1), Node->getOperand(2) };
lib/IR/IntrinsicInst.cpp
106	Don't use 'else' after an 'if' containing a 'return' per coding standards.

andrew.w.kaylor added inline comments. Dec 5 2016, 11:08 AM

lib/IR/IntrinsicInst.cpp
106	I can make that change. What I wanted to convey here is that this is effectively a switch statement with no uncovered cases. I suppose since there is no way to get the compiler to warn if that ever changes there is no particular benefit in doing it this way.

craig.topper added inline comments. Dec 5 2016, 11:43 AM

lib/IR/IntrinsicInst.cpp
107	You could maybe use a StringSwitch.

andrew.w.kaylor added inline comments. Dec 5 2016, 12:03 PM

lib/IR/IntrinsicInst.cpp
107	That would be perfect, but can I put an llvm_unreachable() in a StringSwitch?

craig.topper added inline comments. Dec 5 2016, 12:12 PM

lib/IR/IntrinsicInst.cpp
107	I think StringSwitch will assert if it doesn't find a match and there is no default specified.

andrew.w.kaylor added inline comments. Dec 5 2016, 2:41 PM

lib/IR/IntrinsicInst.cpp
107	Yes, I saw that in the code. I wanted something that was informative in non-assert mode too. I guess the obvious solution to that is to add something like an UnreachableDefault() method to StringSwitch.

I don't think llvm_unreachable is guaranteed to do anything in non-asserts build. From the description in the header:

/// Marks that the current location is not supposed to be reachable.
/// In !NDEBUG builds, prints the message and location info to stderr.
/// In NDEBUG builds, becomes an optimizer hint that the current location
/// is not supposed to be reachable. On compilers that don't support
/// such hints, prints a reduced message instead.
///
/// Use this instead of assert(0). It conveys intent more clearly and
/// allows compilers to omit some unnecessary code.
#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif defined(LLVM_BUILTIN_UNREACHABLE)
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) ::llvm::llvm_unreachable_internal()
#endif

mehdi_amini added inline comments. Dec 5 2016, 2:49 PM

lib/IR/IntrinsicInst.cpp
107	"Informative" as in "triggers a runtime failure"? Unreachable isn't the right solution, it is UB in release mode AFAIK.

I see. I still think it's worth adding UnreachableDefault() instead of just a comment explaining that other values will assert/crash. I'm checking the legal values in the verifier, so this shouldn't be an issue.

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.
Otherwise a possible more generic extension to StringSwitch would be to accept lambdas:

StringSwitch(...)
   .Case("Value1", [] { return computeValue(); })
   .Default([]() -> T { report_fatal_error("Unexpected Value XXX in YYYY"); });
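For readers unfamiliar with the idea, here is a minimal, self-contained sketch of a string switch whose cases and default take callables. `LazyStringSwitch` and `parseRoundingModeOrThrow` are hypothetical names invented for this illustration; this is not `llvm::StringSwitch`, and `std::invalid_argument` stands in for `report_fatal_error` so the failure path can be observed. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a StringSwitch that accepts lambdas.
template <typename T> class LazyStringSwitch {
  std::string_view Str;
  std::optional<T> Result;

public:
  explicit LazyStringSwitch(std::string_view S) : Str(S) {}

  // Compute is invoked only if this is the first case that matches.
  template <typename Fn>
  LazyStringSwitch &Case(std::string_view Label, Fn &&Compute) {
    assert(!Label.empty() && "case label must be non-empty");
    if (!Result && Str == Label)
      Result = std::forward<Fn>(Compute)();
    return *this;
  }

  // Fallback is invoked only if no case matched, so it may report an error
  // instead of producing a value.
  template <typename Fn> T Default(Fn &&Fallback) {
    if (Result)
      return *Result;
    return std::forward<Fn>(Fallback)();
  }
};

enum class RoundingMode { Dynamic, ToNearest, Downward, Upward, TowardZero };

RoundingMode parseRoundingModeOrThrow(std::string_view Arg) {
  return LazyStringSwitch<RoundingMode>(Arg)
      .Case("round.dynamic", [] { return RoundingMode::Dynamic; })
      .Case("round.tonearest", [] { return RoundingMode::ToNearest; })
      .Case("round.downward", [] { return RoundingMode::Downward; })
      .Case("round.upward", [] { return RoundingMode::Upward; })
      .Case("round.towardzero", [] { return RoundingMode::TowardZero; })
      .Default([&]() -> RoundingMode {
        // Unlike llvm_unreachable, this fails the same way in every build mode.
        throw std::invalid_argument("Unexpected rounding mode argument: " +
                                    std::string(Arg));
      });
}
```

Because the fallback is an ordinary callable, the failure is diagnosed identically in assert and non-assert builds, which is the property being asked for in this part of the thread.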

In D27028#613867, @mehdi_amini wrote:

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");

the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment. So what I'd like to do is this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero)
  .UnreachableDefault("Unexpected rounding mode argument in FP intrinsic!");

which at least in !NDEBUG builds will produce a useful message. This would be fairly trivial to implement.

In D27028#613926, @andrew.w.kaylor wrote:
In D27028#613867, @mehdi_amini wrote:

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:
return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");
the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment.

Not really: in an optimized build it means that the default case is unreachable. The assertion does not exist there, and the optimizer can drop the nullptr dereference.

So what I'd like to do is this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero)
  .UnreachableDefault("Unexpected rounding mode argument in FP intrinsic!");

which at least in !NDEBUG builds will produce a useful message. This would be fairly trivial to implement.

Right the only difference is that you get a nicer runtime failure in assert mode.

arsenm added inline comments. Dec 5 2016, 5:01 PM

docs/LangRef.rst
12126	Typo: lllvm (seems repeated a number of times)

arsenm added inline comments. Dec 5 2016, 5:04 PM

include/llvm/IR/Intrinsics.td
475	Also fma, minnum, maxnum, sqrt

andrew.w.kaylor added inline comments. Dec 5 2016, 5:19 PM

include/llvm/IR/Intrinsics.td
475	That's a bit of a can of worms, isn't it? I mean there are a whole bunch of FP intrinsics, and I can imagine it making sense to have constrained versions of any of them.

In D27028#613949, @mehdi_amini wrote:
In D27028#613926, @andrew.w.kaylor wrote:
In D27028#613867, @mehdi_amini wrote:

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:
return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");
the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment.
Not really: in an optimized build it means that the default case is unreachable. The assertion does not exist there, and the optimizer can drop the nullptr dereference.

But the 'return' will always return, won't it? I don't see how the optimizer could associate the llvm_unreachable statement with the 'return StringSwitch(...)' to be able to do anything with it.

I'm in the process of preparing an updated revision that implements UnreachableDefault as I suggested, but I'm not overly attached to that approach. If you feel strongly that something else would be better I can change it.

Fixed typos and style issues.
Added FIXME comments as requested.

I have opened two Bugzilla reports.

One to add FP environment register modeling:
https://llvm.org/bugs/show_bug.cgi?id=31284

One to address denormal handling:
https://llvm.org/bugs/show_bug.cgi?id=31285

I'd like to talk more about the denormal handling apart from this review. I'm not certain that rolling it into these intrinsics is the correct solution. It might be, and I'd be happy to do that if we agree that's the best approach. It seems to me as though it has certain similarities to the fast math flags as well, so I'd like to talk about that relationship a bit.

In D27028#614081, @andrew.w.kaylor wrote:

Fixed typos and style issues.
Added FIXME comments as requested.

I have opened two Bugzilla reports.

One to add FP environment register modeling:
https://llvm.org/bugs/show_bug.cgi?id=31284

One to address denormal handling:
https://llvm.org/bugs/show_bug.cgi?id=31285

I'd like to talk more about the denormal handling apart from this review. I'm not certain that rolling it into these intrinsics is the correct solution. It might be, and I'd be happy to do that if we agree that's the best approach. It seems to me as though it has certain similarities to the fast math flags as well, so I'd like to talk about that relationship a bit.

Correct handling of ftz is required for correctness, so it isn't appropriate to be an optimization hint placed with the fast math flags

In D27028#614098, @arsenm wrote:

Correct handling of ftz is required for correctness, so it isn't appropriate to be an optimization hint placed with the fast math flags

But what's correct? The reason I think it's similar to the fast math flags is that if nothing is specified then optimizations can make no transformations based on FTZ, and if you want to be able to make a transformation that assumes a certain mode then you have to find the attribute set.

Basically, the way I've been viewing the difference between the intrinsics I'm introducing and the fast math flags is that the fast math flags are permissive (behavior is assumed to be restricted unless the attributes are present) whereas the intrinsics are restrictive (behavior is assumed to be permitted unless the constraining intrinsic is used). I'm not sure the denormal handling falls cleanly into either of these categories, but it seems to me more like the permissive approach is appropriate.

andrew.w.kaylor added a subscriber: Farhana. Dec 8 2016, 11:56 AM

Ping

Ping.

FWIW, rounding controls are needed for llvm.fma.*, llvm.fmuladd.*, and llvm.sqrt.*

Also, I don't see any reason to cover frem, but I also don't understand how frem ever got to be an instruction instead of a standard C library intrinsic.

In D27028#634214, @b-sumner wrote:

FWIW, rounding controls are needed for llvm.fma.*, llvm.fmuladd.*, and llvm.sqrt.*

I agree these are needed. I've added a comment in the .td file and intend to add variations of those intrinsics in a later revision.

In D27028#634214, @b-sumner wrote:

Also, I don't see any reason to cover frem, but I also don't understand how frem ever got to be an instruction instead of a standard C library intrinsic.

I'm not sure it actually happens, but I would think that frem is potentially subject to the same sorts of constant folding that can be done with fdiv.

This all looks pretty good to me, Andy. Regarding

In D27028#634214, @b-sumner wrote:

Also, I don't see any reason to cover frem, but I also don't understand how frem ever got to be an instruction instead of a standard C library intrinsic.

I'm not sure it actually happens, but I would think that frem is potentially subject to the same sorts of constant folding that can be done with fdiv.

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".
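The point about frem can be checked on a host machine. The sketch below assumes that `std::fmod` has the same semantics as `frem` and that the host supports the C floating-point environment; `remWithInvalidFlag` and `RemResult` are hypothetical names for this illustration. It shows an ordinary remainder that is exact and the two exceptional cases that raise the invalid flag.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

struct RemResult {
  double Value;
  bool RaisedInvalid;
};

// Evaluates fmod(X, Y) at run time and reports whether FE_INVALID was raised.
RemResult remWithInvalidFlag(double X, double Y) {
  std::fexcept_t Saved;
  int RC = std::fegetexceptflag(&Saved, FE_ALL_EXCEPT);
  assert(RC == 0 && "host must support the FP exception flags");
  (void)RC;

  // volatile keeps the compiler from folding the call at compile time.
  volatile double A = X, B = Y;
  std::feclearexcept(FE_INVALID);
  volatile double R = std::fmod(A, B);
  bool Invalid = std::fetestexcept(FE_INVALID) != 0;

  // Put the caller's exception flags back the way they were.
  std::fesetexceptflag(&Saved, FE_ALL_EXCEPT);
  return {R, Invalid};
}
```

For example, `remWithInvalidFlag(5.5, 2.0)` yields the exact value 1.5 with no invalid flag, while an infinite dividend or a zero divisor yields NaN and raises the flag. That exceptional behavior is what a constrained frem still has to preserve, even though rounding never affects the result.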

Regarding the denormal-handling issue, I agree that it is reasonable to defer support to a subsequent patch. I suspect you are ultimately going to want a separate argument rather than folding denormal-handling into the rounding mode. But we can discuss that in the Bugzilla report that you opened.

docs/LangRef.rst
12147	This question might be beyond the scope of this initial change set, but here goes. FP negation is currently handled using fsub -0, X. Is that sufficient in a constrained context? If we allow X to be a signaling NaN, -0-X should raise Invalid while -X should not, at least according to IEEE-754.
lib/IR/Verifier.cpp
4265	Rather than duplicating the legal rounding mode & exception behavior strings here (and in the getRoundingMode & getExceptionBehavior methods), would it be better to have a string-->enum function that is called in both places?

scanon added a subscriber: scanon. Jan 4 2017, 9:32 AM

scanon added inline comments.

docs/LangRef.rst
12147	Probably more importantly, -X has a defined signbit (the negation of whatever the signbit of X was), but -0-X does not [assuming the obvious binding of -X to the IEEE 754 negate operation and -0-X to the subtract operation].

In D27028#635372, @DavidKreitzer wrote:

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".

The implementation is slightly simpler if I give the frem intrinsic a rounding argument, even though it isn't needed. Otherwise, that intrinsic couldn't be handled by the same subclass as the others. I realize that's a fairly weak argument, but I just feel like making this one intrinsic different from the others will make the code ugly.

lib/IR/Verifier.cpp
4265	I do like the idea of having a single location for these strings. On the other hand, I think I'd need to introduce a new enum value (invalid?) that is only used here. I'll think about it a bit and try to consolidate the strings one way or another.
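One way to keep the strings in a single place without a dedicated "invalid" enumerator is to have the shared helper return an optional value. The following is only a sketch of that design, assuming C++17 `std::optional`; the function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

enum class RoundingMode { Dynamic, ToNearest, Downward, Upward, TowardZero };

// Single home for the metadata strings. An empty optional means "not a legal
// rounding mode string", so no extra enumerator is required.
std::optional<RoundingMode> strToRoundingMode(std::string_view Arg) {
  if (Arg == "round.dynamic")
    return RoundingMode::Dynamic;
  if (Arg == "round.tonearest")
    return RoundingMode::ToNearest;
  if (Arg == "round.downward")
    return RoundingMode::Downward;
  if (Arg == "round.upward")
    return RoundingMode::Upward;
  if (Arg == "round.towardzero")
    return RoundingMode::TowardZero;
  return std::nullopt;
}

// Verifier-style use: only asks whether the string is legal.
bool isValidRoundingModeArg(std::string_view Arg) {
  return strToRoundingMode(Arg).has_value();
}

// Accessor-style use: relies on the verifier having rejected bad strings.
RoundingMode getRoundingMode(std::string_view Arg) {
  std::optional<RoundingMode> RM = strToRoundingMode(Arg);
  assert(RM && "verifier should have rejected this rounding mode string");
  // Assumption for this sketch: fall back to the most conservative mode
  // if the check is compiled out.
  return RM ? *RM : RoundingMode::Dynamic;
}
```

Both callers go through `strToRoundingMode`, so adding or renaming a mode string touches one function only.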

In D27028#636346, @andrew.w.kaylor wrote:

In D27028#635372, @DavidKreitzer wrote:

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".

The implementation is slightly simpler if I give the frem intrinsic a rounding argument, even though it isn't needed. Otherwise, that intrinsic couldn't be handled by the same subclass as the others. I realize that's a fairly weak argument, but I just feel like making this one intrinsic different from the others will make the code ugly.

Uniformity of implementation is a reasonable argument in favor of the extraneous rounding mode parameter for the frem intrinsic. But could you please make it clear that that is your intent in the language ref description of @llvm.experimental.constrained.frem?

docs/LangRef.rst
12147	Steve, are you referring to the fact that the sign of -0-X is unspecified when X is NaN or something else? I am trying to understand the implications of your comment - whether they are specific to the new constrained FP intrinsics or whether something needs to be done for normal FP LLVM IR. Another problem worth mentioning with implementing -X as -0-X in a constrained context is that -0-X will produce -0 when X is -0 and the rounding mode is toward -inf.
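The signed-zero case mentioned above can be reproduced with the C floating-point environment. This is a minimal sketch, assuming a host that supports `FE_DOWNWARD` and a compiler that is not invoked with fast-math options; the function names are hypothetical.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Computes (-0.0) - X at run time with the rounding mode set toward -inf.
double subFromNegZeroRoundDown(double X) {
  const int SavedMode = std::fegetround();
  int RC = std::fesetround(FE_DOWNWARD);
  assert(RC == 0 && "host must support FE_DOWNWARD");
  (void)RC;

  // volatile keeps the compiler from folding the subtraction under the
  // default rounding mode.
  volatile double NegZero = -0.0;
  volatile double Operand = X;
  volatile double Result = NegZero - Operand;

  std::fesetround(SavedMode); // restore the caller's rounding mode
  return Result;
}

// Plain negation only flips the sign bit and ignores the rounding mode.
double negate(double X) { return -X; }
```

With X = -0.0, `subFromNegZeroRoundDown` returns -0.0 (an exact zero sum of opposite-signed zeros is negative when rounding toward -inf), whereas `negate` returns +0.0. For nonzero finite X the two agree.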

-Combined uses of string literals for FP rounding mode and exception behavior.
-Removed extension to StringSwitch since it is no longer needed.
-Added documentation explaining that the rounding mode argument isn't used from the frem intrinsic.

arsenm added inline comments. Jan 10 2017, 5:47 PM

lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
928–935 ↗	(On Diff #83904)	You can replace this with a switch strict->regular op mapping, and then you avoid having to switch over these again in MutateStrictFPToFP
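The suggestion amounts to letting one mapping function answer both "is this a strict FP node?" and "what does it become?". Below is a toy sketch of that shape. The `Opcode` enum and `Node` struct are simplified stand-ins invented for this illustration and are not the real ISD opcodes or SDNode, and `std::optional` assumes C++17.

```cpp
#include <cassert>
#include <optional>

// Toy opcode set for illustration only.
enum class Opcode {
  FADD, FSUB, FMUL, FDIV, FREM,
  STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,
  LOAD
};

// One switch serves as both the "is strict?" test and the mapping.
std::optional<Opcode> getNonStrictOpcode(Opcode Op) {
  switch (Op) {
  case Opcode::STRICT_FADD: return Opcode::FADD;
  case Opcode::STRICT_FSUB: return Opcode::FSUB;
  case Opcode::STRICT_FMUL: return Opcode::FMUL;
  case Opcode::STRICT_FDIV: return Opcode::FDIV;
  case Opcode::STRICT_FREM: return Opcode::FREM;
  default:                  return std::nullopt;
  }
}

struct Node {
  Opcode Op;
};

// Rewrites a strict node in place; returns false for every other node.
bool mutateStrictFPToFP(Node &N) {
  std::optional<Opcode> NewOp = getNonStrictOpcode(N.Op);
  if (!NewOp)
    return false;
  N.Op = *NewOp;
  assert(!getNonStrictOpcode(N.Op) && "mapping must yield a non-strict opcode");
  return true;
}
```

The caller no longer needs a separate predicate followed by a second switch inside the mutation routine; the return value of `mutateStrictFPToFP` carries the same information.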

-Consolidated examination of StrictFP node opcodes.

Is there anything left blocking this patch?

I'm happy with this now. We'll need to address adding implicit instruction dependencies (and generating these intrinsics from Clang in appropriate modes) in follow-up. Thanks for working on this.

This revision is now accepted and ready to land. Jan 25 2017, 5:05 PM

Closed by commit rL293226: Add intrinsics for constrained floating point operations (authored by akaylor). Jan 26 2017, 3:39 PM

This revision was automatically updated to reflect the committed changes.

andrew.w.kaylor mentioned this in D32319: Add constrained intrinsics for some libm-equivalent operations. Apr 20 2017, 3:54 PM

kpn mentioned this in D43515: More math intrinsics for conservative math handling. Feb 20 2018, 9:44 AM

@andrew.w.kaylor I went through the mailing list thread regarding this change and saw "Eventually, we’ll want to go back and teach specific optimizations to understand the intrinsics so that where possible optimizations can be performed in a manner consistent with dynamic rounding modes and strict exception handling.".
Do you have any references or plans on how to teach specific optimizations about this?

Herald added a subscriber: jdoerfert. Apr 22 2020, 8:50 AM

In D27028#1997092, @GGanesh wrote:

@andrew.w.kaylor I went through the mailing list thread regarding this change and saw "Eventually, we’ll want to go back and teach specific optimizations to understand the intrinsics so that where possible optimizations can be performed in a manner consistent with dynamic rounding modes and strict exception handling.".
Do you have any references\plans on how to teach specific optimizations on this?

It depends on the optimization. Each optimization needs to be evaluated in some way to determine if it is safe to perform the optimization with the floating point constraints. Simon Moll has an idea that he can update the pattern matching to also match closely related intrinsics. Simon intends to work on this for the vector predicated intrinsics, but the idea should extend to constrained floating point intrinsics. However, we will still need to update the optimizations to evaluate whether its intended transformation meets the constraints. For example, pattern matching may find opportunities for constant folding, but we will need to examine the actual constants to determine if the operation being folded would be inexact (and therefore subject to rounding) or otherwise raise an FP exception.
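As a rough illustration of the last point, the sketch below evaluates a candidate fold on the host and refuses it if any FP exception flag, including inexact, would be raised. This is an assumption-laden model and not how LLVM does it: LLVM's own folding operates on APFloat, whose arithmetic methods report a status, whereas this sketch uses the host `<cfenv>` facilities to stay self-contained. All names are hypothetical.

```cpp
#include <cassert>
#include <cfenv>

enum class FPBinOp { Add, Sub, Mul, Div };

// Returns the set of exception flags raised by evaluating LHS <Op> RHS.
int exceptionsRaisedBy(FPBinOp Op, double LHS, double RHS) {
  std::fexcept_t Saved;
  int RC = std::fegetexceptflag(&Saved, FE_ALL_EXCEPT);
  assert(RC == 0 && "host must support the FP exception flags");
  (void)RC;

  // volatile forces the operation to happen at run time, between the
  // clear and the test below.
  volatile double A = LHS, B = RHS, R = 0.0;
  std::feclearexcept(FE_ALL_EXCEPT);
  switch (Op) {
  case FPBinOp::Add: R = A + B; break;
  case FPBinOp::Sub: R = A - B; break;
  case FPBinOp::Mul: R = A * B; break;
  case FPBinOp::Div: R = A / B; break;
  }
  int Raised = std::fetestexcept(FE_ALL_EXCEPT);

  // Restore the caller's flags so the probe itself leaves no trace.
  std::fesetexceptflag(&Saved, FE_ALL_EXCEPT);
  return Raised;
}

// A fold is treated as safe only if the operation is exact and raises nothing:
// an inexact result depends on the rounding mode, and any other flag is an
// observable side effect under strict exception semantics.
bool isFoldSafeUnderStrictFP(FPBinOp Op, double LHS, double RHS) {
  return exceptionsRaisedBy(Op, LHS, RHS) == 0;
}
```

Under this model 1.0 + 2.0 may be folded, while 1.0 / 3.0 (inexact) and 1.0 / 0.0 (divide-by-zero) may not.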

Revision Contents

Path                                              Size
docs/LangRef.rst                                  269 lines
include/llvm/CodeGen/SelectionDAGNodes.h          47 lines
include/llvm/IR/IntrinsicInst.h                   37 lines
include/llvm/IR/Intrinsics.td                     33 lines
lib/CodeGen/SelectionDAG/SelectionDAG.cpp         69 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h    1 line
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp  58 lines
lib/IR/IntrinsicInst.cpp                          35 lines
lib/IR/Verifier.cpp                               36 lines
test/CodeGen/X86/fp-intrinsics.ll                 112 lines
test/Feature/fp-intrinsics.ll                     102 lines
test/Verifier/fp-intrinsics.ll                    43 lines

Diff 79005

docs/LangRef.rst


	the pointer to the memory for which the ``invariant.group`` no longer holds.			the pointer to the memory for which the ``invariant.group`` no longer holds.

	Semantics:			Semantics:
	""""""""""			""""""""""

	Returns another pointer that aliases its argument but which is considered different			Returns another pointer that aliases its argument but which is considered different
	for the purposes of ``load``/``store`` ``invariant.group`` metadata.			for the purposes of ``load``/``store`` ``invariant.group`` metadata.

				Constrained Floating Point Intrinsics
				-------------------------------------

				These intrinsics are used to provide special handling of floating point
				operations when specific rounding mode or floating point exception behavior is
				required. By default, LLVM optimization passes assume that the rounding mode is
				round-to-nearest and that floating point exceptions will not be monitored.
				Constrained FP intrinsics are used to support non-default rounding modes and
				accurately preserve exception behavior without compromising LLVM's ability to
				optimize FP code when the default behavior is used.

				Each of these intrinsics corresponds to a normal floating point operation. The
				first two arguments and the return value are the same as the corresponding FP
				operation.

				The third argument is a metadata argument specifying the rounding mode to be
				assumed. This argument must be one of the following strings:

				::
				"LLVM_ROUND_DYNAMIC"
				"LLVM_ROUND_TONEAREST"
				"LLVM_ROUND_DOWNWARD"
				"LLVM_ROUND_UPWARD"
				"LLVM_ROUND_TOWARDZERO"
				hfinkel (inline comment): To engage in some bikeshedding, I really don't like this naming convention, which reminds me of macro names. Not that we've been extremely consistent about the standardized metadata strings we already use, but this would be yet another choice; we should avoid that. For controlling LLVM optimization passes, we use strings like !"llvm.loop.vectorize.enable". We also use strings like !"function_entry_count" and !"branch_weights" for profiling info, and strings like !"ProfileFormat" and !"TotalCount" in other places. Of these choices, I prefer the one used by our basic profiling information (i.e. !"branch_weights"), and so I'd prefer these be named "round_dynamic", "round_downward", ... and similar for the fpexcept strings. We might also borrow the dot-based hierarchy scheme from our loop optimization metadata and use "round.dynamic", "round.downward", ... and similar for the fpexcept strings. I think that I like this slightly better than just using the underscore separators. I specifically dislike having "llvm" in the name here, as it makes it seem as though the meaning of these things is somehow LLVM-specific. It is not (they come from IEEE if nothing else). Having "llvm" in the optimization metadata makes a bit more sense because those refer to specific LLVM optimization passes.

				If this argument is "LLVM_ROUND_DYNAMIC" optimization passes must assume that
				the rounding mode is unknown and may change at runtime. No transformations that
				depend on rounding mode may be performed in this case.

				The other possible values for the rounding mode argument correspond to the
				similarly named IEEE rounding modes. If the argument is any of these values
				optimization passes may perform transformations as long as they are consistent
				with the specified rounding mode.

				For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
				"LLVM_ROUND_DOWNWARD" or "LLVM_ROUND_DYNAMIC" because if the value of 'x' is +0
				then 'x-0' should evaluate to '-0' when rounding downward. However, this
				transformation is legal for all other rounding modes.

				For values other than "LLVM_ROUND_DYNAMIC" optimization passes may assume that
				the actual runtime rounding mode (as defined in a target-specific manner)
				matches the specified rounding mode, but this is not guaranteed. Using a
				specific non-dynamic rounding mode which does not match the actual rounding
				mode at runtime results in undefined behavior.

				The fourth argument to the constrained floating point intrinsics specifies the
				required exception behavior. This argument must be one of the following
				strings:

				::
				"LLVM_FPEXCEPT_IGNORE"
				"LLVM_FPEXCEPT_MAYTRAP"
				"LLVM_FPEXCEPT_STRICT"

				If this argument is "LLVM_FPEXCEPT_IGNORE" optimization passes may assume that
				the exception status flags will not be read and that floating point exceptions
				will not be unmasked. This allows transformations to be performed that may
				change the exception semantics of the original code. For example, FP operations
				may be speculatively executed in this case whereas they must not be for either
				of the other possible values of this argument.
				hfinkel (inline comment): "not be unmasked" is a double negative. How about just saying will be masked? Or will not be enabled?

				If the exception behavior argument is "LLVM_FPEXCEPT_MAYTRAP" optimization
				passes must avoid transformations that may raise exceptions that would not
				have been raised by the original code (such as speculatively executing FP
				operations), but passes are not required to preserve all exceptions that are
				implied by the original code. For example, exceptions may be potentially hidden
				by constant folding.

				If the exception behavior argument is "LLVM_FPEXCEPT_STRICT" all transformations
				must strictly preserve the floating point exception semantics of the original
				code. Any FP exception that would have been raised by the original code must be
				raised by the transformed code, and the transformed code must not raise any FP
				exceptions that would not have been raised by the original code. This is the
				exception behavior argument that will be used if the code being compiled reads
				the FP exception status flags, but this mode can also be used with code that
				unmasks FP exceptions.

				The number and order of floating point exceptions is NOT guaranteed. For
				example, a series of FP operations that each may raise exceptions may be
				vectorized into a single instruction that raises each unique exception a single
				time.


				'``llvm.experimental.constrained.fadd``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``lllvm.experimental.constrained.fadd``' intrinsic returns the sum of its
				two operands.
				arsenm (inline comment): Typo: lllvm (seems repeated a number of times)


				Arguments:
				""""""""""

				The first two arguments to the '``lllvm.experimental.constrained.fadd``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point sum of the two value operands and has
				the same type as the operands.


				'``llvm.experimental.constrained.fsub``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				DavidKreitzer (inline comment): This question might be beyond the scope of this initial change set, but here goes. FP negation is currently handled using fsub -0, X. Is that sufficient in a constrained context? If we allow X to be a signaling NaN, -0-X should raise Invalid while -X should not, at least according to IEEE-754.
				scanon (inline comment): Probably more importantly, -X has a defined signbit (the negation of whatever the signbit of X was), but -0-X does not [assuming the obvious binding of -X to the IEEE 754 negate operation and -0-X to the subtract operation].
				DavidKreitzer (inline comment): Steve, are you referring to the fact that the sign of -0-X is unspecified when X is NaN or something else? I am trying to understand the implications of your comment - whether they are specific to the new constrained FP intrinsics or whether something needs to be done for normal FP LLVM IR. Another problem worth mentioning with implementing -X as -0-X in a constrained context is that -0-X will produce -0 when X is -0 and the rounding mode is toward -inf.

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``lllvm.experimental.constrained.fsub``' intrinsic returns the difference
				of its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``lllvm.experimental.constrained.fsub``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point difference of the two value operands
				and has the same type as the operands.


				'``llvm.experimental.constrained.fmul``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``lllvm.experimental.constrained.fmul``' intrinsic returns the product of
				its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``lllvm.experimental.constrained.fmul``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point product of the two value operands and
				has the same type as the operands.


				'``llvm.experimental.constrained.fdiv``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``lllvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
				its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``lllvm.experimental.constrained.fdiv``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point quotient of the two value operands and
				has the same type as the operands.


				'``llvm.experimental.constrained.frem``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``lllvm.experimental.constrained.frem``' intrinsic returns the remainder
				from the division of its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``lllvm.experimental.constrained.frem``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point remainder from the division of the two
				value operands and has the same type as the operands. The remainder has the
				same sign as the dividend.


	General Intrinsics			General Intrinsics
	------------------			------------------

	This class of intrinsics is designed to be generic and has no specific			This class of intrinsics is designed to be generic and has no specific
	purpose.			purpose.

	'``llvm.var.annotation``' Intrinsic			'``llvm.var.annotation``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
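The signed-zero cases mentioned in the LangRef text and in the inline comments above can be reproduced with the standard `<cfenv>` rounding controls. This is only a sketch: the helpers `subWithRounding` and `foldChangesSign` are hypothetical names introduced for illustration, and the example assumes the host supports `FE_DOWNWARD` and honors the dynamic rounding mode for ordinary arithmetic.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: evaluates a - b with the dynamic rounding mode
// temporarily set to `mode`, then restores the previous mode.
double subWithRounding(double a, double b, int mode) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double va = a, vb = b;   // volatile prevents compile-time folding
  volatile double result = va - vb; // performed under the requested mode
  std::fesetround(saved);
  return result;
}

// True if 'x-0' and 'x' differ in sign under the given rounding mode,
// i.e. if folding 'x-0' -> 'x' would be observable.
bool foldChangesSign(double x, int mode) {
  return std::signbit(subWithRounding(x, 0.0, mode)) != std::signbit(x);
}
```

With these helpers, `foldChangesSign(+0.0, FE_DOWNWARD)` is expected to be true while `foldChangesSign(+0.0, FE_TONEAREST)` is false. Likewise, `subWithRounding(-0.0, -0.0, FE_DOWNWARD)` yields -0, whereas a true negation of -0 would yield +0, which is the discrepancy raised in the fsub discussion.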

include/llvm/CodeGen/SelectionDAGNodes.h

static SimpleType getSimplifiedValue(SDUse &Val) {
return Val.getNode();		return Val.getNode();
}		}
};		};

/// These are IR-level optimization flags that may be propagated to SDNodes.		/// These are IR-level optimization flags that may be propagated to SDNodes.
/// TODO: This data structure should be shared by the IR optimizer and the		/// TODO: This data structure should be shared by the IR optimizer and the
/// the backend.		/// the backend.
struct SDNodeFlags {		struct SDNodeFlags {
		public:
		enum RoundingModeTy {
		rmDefault = 0,
		rmDynamic = 1,
		rmToNearest = 2,
		rmDownward = 4,
		rmUpward = 8,
		rmTowardZero = 16
		};

		enum ExceptionBehaviorTy {
		ebIgnore = 0,
		ebMayTrap = 1,
		ebStrict = 2
		};

private:		private:
bool NoUnsignedWrap : 1;		bool NoUnsignedWrap : 1;
bool NoSignedWrap : 1;		bool NoSignedWrap : 1;
bool Exact : 1;		bool Exact : 1;
bool UnsafeAlgebra : 1;		bool UnsafeAlgebra : 1;
bool NoNaNs : 1;		bool NoNaNs : 1;
bool NoInfs : 1;		bool NoInfs : 1;
bool NoSignedZeros : 1;		bool NoSignedZeros : 1;
bool AllowReciprocal : 1;		bool AllowReciprocal : 1;
bool VectorReduction : 1;		bool VectorReduction : 1;
		RoundingModeTy RoundingMode : 5;
		ExceptionBehaviorTy ExceptionBehavior : 2;

public:		public:
/// Default constructor turns off all optimization flags.		/// Default constructor turns off all optimization flags.
SDNodeFlags() {		SDNodeFlags() {
NoUnsignedWrap = false;		NoUnsignedWrap = false;
NoSignedWrap = false;		NoSignedWrap = false;
Exact = false;		Exact = false;
UnsafeAlgebra = false;		UnsafeAlgebra = false;
NoNaNs = false;		NoNaNs = false;
NoInfs = false;		NoInfs = false;
NoSignedZeros = false;		NoSignedZeros = false;
AllowReciprocal = false;		AllowReciprocal = false;
VectorReduction = false;		VectorReduction = false;
		RoundingMode = rmDefault;
		ExceptionBehavior = ebIgnore;
}		}

// These are mutators for each flag.		// These are mutators for each flag.
void setNoUnsignedWrap(bool b) { NoUnsignedWrap = b; }		void setNoUnsignedWrap(bool b) { NoUnsignedWrap = b; }
void setNoSignedWrap(bool b) { NoSignedWrap = b; }		void setNoSignedWrap(bool b) { NoSignedWrap = b; }
void setExact(bool b) { Exact = b; }		void setExact(bool b) { Exact = b; }
void setUnsafeAlgebra(bool b) { UnsafeAlgebra = b; }		void setUnsafeAlgebra(bool b) { UnsafeAlgebra = b; }
void setNoNaNs(bool b) { NoNaNs = b; }		void setNoNaNs(bool b) { NoNaNs = b; }
void setNoInfs(bool b) { NoInfs = b; }		void setNoInfs(bool b) { NoInfs = b; }
void setNoSignedZeros(bool b) { NoSignedZeros = b; }		void setNoSignedZeros(bool b) { NoSignedZeros = b; }
void setAllowReciprocal(bool b) { AllowReciprocal = b; }		void setAllowReciprocal(bool b) { AllowReciprocal = b; }
void setVectorReduction(bool b) { VectorReduction = b; }		void setVectorReduction(bool b) { VectorReduction = b; }
		void setRoundingMode(RoundingModeTy rm) { RoundingMode = rm; }
		void setExceptionBehavior(ExceptionBehaviorTy eb) { ExceptionBehavior = eb; }

// These are accessors for each flag.		// These are accessors for each flag.
bool hasNoUnsignedWrap() const { return NoUnsignedWrap; }		bool hasNoUnsignedWrap() const { return NoUnsignedWrap; }
bool hasNoSignedWrap() const { return NoSignedWrap; }		bool hasNoSignedWrap() const { return NoSignedWrap; }
bool hasExact() const { return Exact; }		bool hasExact() const { return Exact; }
bool hasUnsafeAlgebra() const { return UnsafeAlgebra; }		bool hasUnsafeAlgebra() const { return UnsafeAlgebra; }
bool hasNoNaNs() const { return NoNaNs; }		bool hasNoNaNs() const { return NoNaNs; }
bool hasNoInfs() const { return NoInfs; }		bool hasNoInfs() const { return NoInfs; }
bool hasNoSignedZeros() const { return NoSignedZeros; }		bool hasNoSignedZeros() const { return NoSignedZeros; }
bool hasAllowReciprocal() const { return AllowReciprocal; }		bool hasAllowReciprocal() const { return AllowReciprocal; }
bool hasVectorReduction() const { return VectorReduction; }		bool hasVectorReduction() const { return VectorReduction; }
		RoundingModeTy getRoundingMode() const {
		// For flag merging purposes, we need to recognize when no explicit rounding
		// mode has been set (rmDefault), but the default is rmToNearest.
		if (RoundingMode == rmDefault)
		return rmToNearest;
		return RoundingMode;
		}
		ExceptionBehaviorTy getExceptionBehavior() const { return ExceptionBehavior; }

/// Clear any flags in this flag set that aren't also set in Flags.		/// Clear any flags in this flag set that aren't also set in Flags.
void intersectWith(const SDNodeFlags *Flags) {		void intersectWith(const SDNodeFlags *Flags) {
NoUnsignedWrap &= Flags->NoUnsignedWrap;		NoUnsignedWrap &= Flags->NoUnsignedWrap;
NoSignedWrap &= Flags->NoSignedWrap;		NoSignedWrap &= Flags->NoSignedWrap;
Exact &= Flags->Exact;		Exact &= Flags->Exact;
UnsafeAlgebra &= Flags->UnsafeAlgebra;		UnsafeAlgebra &= Flags->UnsafeAlgebra;
NoNaNs &= Flags->NoNaNs;		NoNaNs &= Flags->NoNaNs;
NoInfs &= Flags->NoInfs;		NoInfs &= Flags->NoInfs;
NoSignedZeros &= Flags->NoSignedZeros;		NoSignedZeros &= Flags->NoSignedZeros;
AllowReciprocal &= Flags->AllowReciprocal;		AllowReciprocal &= Flags->AllowReciprocal;
		// If either RoundingMode is rmDefault, we can use the other RoundingMode.
		// If neither is rmDefault and they are different, we must assume rmDynamic.
		if (RoundingMode == rmDefault)
		RoundingMode = Flags->RoundingMode;
		else if (RoundingMode != Flags->RoundingMode &&
		Flags->RoundingMode != rmDefault)
		RoundingMode = rmDynamic;
		// ExceptionBehavior is progressive. If the current flags specify ebIgnore
		// we should use whatever the merged flags specify. If the current flags
		// specify ebMayTrap, we can update to the more restrictive ebStrict but not
		// to the less restrictive ebIgnore. If the current flags specify ebStrict
		// we must keep that setting.
		if (ExceptionBehavior == ebIgnore)
		ExceptionBehavior = Flags->ExceptionBehavior;
		else if (ExceptionBehavior == ebMayTrap &&
		Flags->ExceptionBehavior != ebIgnore)
		ExceptionBehavior = Flags->ExceptionBehavior;
}		}
};		};

/// Represents one node in the SelectionDAG.		/// Represents one node in the SelectionDAG.
///		///
class SDNode : public FoldingSetNode, public ilist_node<SDNode> {		class SDNode : public FoldingSetNode, public ilist_node<SDNode> {
private:		private:
/// The operation that this node performs.		/// The operation that this node performs.
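The merging rules that `intersectWith` applies to the two new fields can be restated as a pair of free functions. This is a standalone sketch rather than code from the patch: the enums are copies of the values shown in the diff, and `mergeRounding` and `mergeExcept` are hypothetical names. Expressing the exception-behavior rule with `std::max` is a restatement that relies on the enum values being ordered from least to most restrictive; it gives the same results as the if/else chain in the diff.

```cpp
#include <algorithm>

// Stand-in copies of the enum values added to SDNodeFlags in this diff.
enum RoundingModeTy {
  rmDefault = 0, rmDynamic = 1, rmToNearest = 2,
  rmDownward = 4, rmUpward = 8, rmTowardZero = 16
};
enum ExceptionBehaviorTy { ebIgnore = 0, ebMayTrap = 1, ebStrict = 2 };

// If either side is rmDefault, use the other side. If both are explicit
// and they differ, nothing can be assumed, so fall back to rmDynamic.
RoundingModeTy mergeRounding(RoundingModeTy cur, RoundingModeTy other) {
  if (cur == rmDefault)
    return other;
  if (other != rmDefault && cur != other)
    return rmDynamic;
  return cur;
}

// Exception behavior only ever becomes more restrictive when merged.
ExceptionBehaviorTy mergeExcept(ExceptionBehaviorTy cur,
                                ExceptionBehaviorTy other) {
  return std::max(cur, other);
}
```

For example, merging `rmUpward` with `rmDownward` gives `rmDynamic`, and merging `ebMayTrap` with `ebIgnore` stays at `ebMayTrap`.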

include/llvm/IR/IntrinsicInst.h

public:
static inline bool classof(const IntrinsicInst *I) {		static inline bool classof(const IntrinsicInst *I) {
return I->getIntrinsicID() == Intrinsic::dbg_value;		return I->getIntrinsicID() == Intrinsic::dbg_value;
}		}
static inline bool classof(const Value *V) {		static inline bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
}		}
};		};

		/// This is the common base class for constrained floating point intrinsics.
		class ConstrainedFPIntrinsic : public IntrinsicInst {
		public:
		enum RoundingMode {
		rmDynamic,
		rmToNearest,
		rmDownward,
		rmUpward,
		rmTowardZero
		};

		enum ExceptionBehavior {
		ebIgnore,
		ebMayTrap,
		ebStrict
		};

		RoundingMode getRoundingMode() const;
		ExceptionBehavior getExceptionBehavior() const;

		// Methods for support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const IntrinsicInst *I) {
		switch (I->getIntrinsicID()) {
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_frem:
		return true;
		default: return false;
		}
		}
		static inline bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}
		};

/// This is the common base class for memset/memcpy/memmove.		/// This is the common base class for memset/memcpy/memmove.
class MemIntrinsic : public IntrinsicInst {		class MemIntrinsic : public IntrinsicInst {
public:		public:
Value *getRawDest() const { return const_cast<Value *>(getArgOperand(0)); }		Value *getRawDest() const { return const_cast<Value *>(getArgOperand(0)); }
const Use &getRawDestUse() const { return getArgOperandUse(0); }		const Use &getRawDestUse() const { return getArgOperandUse(0); }
Use &getRawDestUse() { return getArgOperandUse(0); }		Use &getRawDestUse() { return getArgOperandUse(0); }

Value *getLength() const { return const_cast<Value *>(getArgOperand(2)); }		Value *getLength() const { return const_cast<Value *>(getArgOperand(2)); }
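The header above only declares `getRoundingMode()`. One plausible way to map the metadata strings listed in the LangRef text onto the `RoundingMode` enum is a small lookup table, sketched below. The function `parseRoundingMode` is hypothetical and is not taken from the patch, and the string spellings are the ones in this revision of the diff, which the review comments propose changing.

```cpp
#include <string>

// Copy of the enum declared in ConstrainedFPIntrinsic in this diff.
enum RoundingMode { rmDynamic, rmToNearest, rmDownward, rmUpward, rmTowardZero };

// Hypothetical helper: looks up a rounding mode metadata string.
// Returns false (leaving Out untouched) if the string is not recognized.
bool parseRoundingMode(const std::string &Text, RoundingMode &Out) {
  static const struct {
    const char *Name;
    RoundingMode Mode;
  } Table[] = {
      {"LLVM_ROUND_DYNAMIC", rmDynamic},
      {"LLVM_ROUND_TONEAREST", rmToNearest},
      {"LLVM_ROUND_DOWNWARD", rmDownward},
      {"LLVM_ROUND_UPWARD", rmUpward},
      {"LLVM_ROUND_TOWARDZERO", rmTowardZero},
  };
  for (const auto &Entry : Table) {
    if (Text == Entry.Name) {
      Out = Entry.Mode;
      return true;
    }
  }
  return false;
}
```

Reporting unrecognized strings explicitly, rather than silently picking a default, leaves it to the caller (for instance a verifier) to reject malformed metadata.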

include/llvm/IR/Intrinsics.td

	def int_sigsetjmp : Intrinsic<[llvm_i32_ty] , [llvm_ptr_ty, llvm_i32_ty]>;			def int_sigsetjmp : Intrinsic<[llvm_i32_ty] , [llvm_ptr_ty, llvm_i32_ty]>;
	def int_siglongjmp : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], [IntrNoReturn]>;			def int_siglongjmp : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], [IntrNoReturn]>;

	// Internal interface for object size checking			// Internal interface for object size checking
	def int_objectsize : Intrinsic<[llvm_anyint_ty], [llvm_anyptr_ty, llvm_i1_ty],			def int_objectsize : Intrinsic<[llvm_anyint_ty], [llvm_anyptr_ty, llvm_i1_ty],
	[IntrNoMem]>,			[IntrNoMem]>,
	GCCBuiltin<"__builtin_object_size">;			GCCBuiltin<"__builtin_object_size">;

				//===--------------- Constrained Floating Point Intrinsics ----------------===//
				//

				let IntrProperties = [IntrInaccessibleMemOnly] in {
				def int_experimental_constrained_fadd : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty ]>;
				def int_experimental_constrained_fsub : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty ]>;
				def int_experimental_constrained_fmul : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty ]>;
				def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty ]>;
				def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],
				[ LLVMMatchType<0>,
				LLVMMatchType<0>,
				llvm_metadata_ty,
				llvm_metadata_ty ]>;
				}
				// FIXME: Add intrinsic for fcmp, fptrunc, fpext, fptoui and fptosi.
				arsenm (inline comment): Also fma, minnum, maxnum, sqrt
				andrew.w.kaylor (inline comment, author): That's a bit of a can of worms, isn't it? I mean there are a whole bunch of FP intrinsics, and I can imagine it making sense to have constrained versions of any of them.


	//===------------------------- Expect Intrinsics --------------------------===//			//===------------------------- Expect Intrinsics --------------------------===//
	//			//
	def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,			def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,
	LLVMMatchType<0>], [IntrNoMem]>;			LLVMMatchType<0>], [IntrNoMem]>;

	//===-------------------- Bit Manipulation Intrinsics ---------------------===//			//===-------------------- Bit Manipulation Intrinsics ---------------------===//
	//			//


lib/CodeGen/SelectionDAG/SelectionDAG.cpp


SDValue SelectionDAG::getNode(unsigned Opcode, const SDLoc &DL, EVT VT,

// Perform trivial constant folding.		// Perform trivial constant folding.
if (SDValue SV =		if (SDValue SV =
FoldConstantArithmetic(Opcode, DL, VT, N1.getNode(), N2.getNode()))		FoldConstantArithmetic(Opcode, DL, VT, N1.getNode(), N2.getNode()))
return SV;		return SV;

// Constant fold FP operations.		// Constant fold FP operations.
bool HasFPExceptions = TLI->hasFloatingPointExceptions();		bool HasFPExceptions = TLI->hasFloatingPointExceptions();
		bool MustBeExact = false;
		bool MustPreserveFPExceptions = false;
		APFloat::roundingMode RoundingMode = APFloat::rmNearestTiesToEven;
		if (Flags) {
		switch (Flags->getRoundingMode()) {
		case SDNodeFlags::rmDefault:
		case SDNodeFlags::rmToNearest:
		break;
		case SDNodeFlags::rmDynamic:
		MustBeExact = true;
		break;
		case SDNodeFlags::rmDownward:
		RoundingMode = APFloat::rmTowardNegative;
		break;
		case SDNodeFlags::rmUpward:
		RoundingMode = APFloat::rmTowardPositive;
		break;
		case SDNodeFlags::rmTowardZero:
		RoundingMode = APFloat::rmTowardZero;
		break;
		}
		// If the exception behavior is ebIgnore or ebMayTrap we may perform
		// constant folding that hides FP exceptions that would otherwise have
		// been raised, but if it is ebStrict we may not.
		if (Flags->getExceptionBehavior() == SDNodeFlags::ebStrict)
		MustPreserveFPExceptions = true;
		}

if (N1CFP) {		if (N1CFP) {
if (N2CFP) {		if (N2CFP) {
APFloat V1 = N1CFP->getValueAPF(), V2 = N2CFP->getValueAPF();		APFloat V1 = N1CFP->getValueAPF(), V2 = N2CFP->getValueAPF();
APFloat::opStatus s;		APFloat::opStatus s;
switch (Opcode) {		switch (Opcode) {
case ISD::FADD:		case ISD::FADD:
s = V1.add(V2, APFloat::rmNearestTiesToEven);		s = V1.add(V2, RoundingMode);
if (!HasFPExceptions || s != APFloat::opInvalidOp)		if (s == APFloat::opOK ||
		(!MustPreserveFPExceptions &&
		(!MustBeExact || s != APFloat::opInexact) &&
		(!HasFPExceptions || s != APFloat::opInvalidOp)))
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
break;		break;
case ISD::FSUB:		case ISD::FSUB:
s = V1.subtract(V2, APFloat::rmNearestTiesToEven);		s = V1.subtract(V2, RoundingMode);
if (!HasFPExceptions || s!=APFloat::opInvalidOp)		if (s == APFloat::opOK ||
		(!MustPreserveFPExceptions &&
		(!MustBeExact || s != APFloat::opInexact) &&
		(!HasFPExceptions || s != APFloat::opInvalidOp)))
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
break;		break;
case ISD::FMUL:		case ISD::FMUL:
s = V1.multiply(V2, APFloat::rmNearestTiesToEven);		s = V1.multiply(V2, RoundingMode);
if (!HasFPExceptions || s!=APFloat::opInvalidOp)		if (s == APFloat::opOK ||
		(!MustPreserveFPExceptions &&
		(!MustBeExact || s != APFloat::opInexact) &&
		(!HasFPExceptions || s != APFloat::opInvalidOp)))
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
break;		break;
case ISD::FDIV:		case ISD::FDIV:
s = V1.divide(V2, APFloat::rmNearestTiesToEven);		s = V1.divide(V2, RoundingMode);
if (!HasFPExceptions || (s!=APFloat::opInvalidOp &&		if (s == APFloat::opOK ||
s!=APFloat::opDivByZero)) {		(!MustPreserveFPExceptions &&
		(!MustBeExact || s != APFloat::opInexact) &&
		(!HasFPExceptions ||
		(s != APFloat::opInvalidOp && s != APFloat::opDivByZero))))
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
}
break;		break;
case ISD::FREM :		case ISD::FREM :
s = V1.mod(V2);		s = V1.mod(V2);
if (!HasFPExceptions || (s!=APFloat::opInvalidOp &&		if (s == APFloat::opOK ||
s!=APFloat::opDivByZero)) {		(!MustPreserveFPExceptions &&
		(!MustBeExact || s != APFloat::opInexact) &&
		(!HasFPExceptions ||
		(s != APFloat::opInvalidOp && s != APFloat::opDivByZero))))
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
}
break;		break;
case ISD::FCOPYSIGN:		case ISD::FCOPYSIGN:
V1.copySign(V2);		V1.copySign(V2);
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
default: break;		default: break;
}		}
}		}

if (Opcode == ISD::FP_ROUND) {		if (Opcode == ISD::FP_ROUND) {
APFloat V = N1CFP->getValueAPF(); // make copy		APFloat V = N1CFP->getValueAPF(); // make copy
bool ignored;		bool ignored;
// This can return overflow, underflow, or inexact; we don't care.		// This can return overflow, underflow, or inexact; we don't care.
// FIXME need to be more flexible about rounding mode.		// FIXME need to be more flexible about rounding mode.
(void)V.convert(EVTToAPFloatSemantics(VT),		(void)V.convert(EVTToAPFloatSemantics(VT),
APFloat::rmNearestTiesToEven, &ignored);		RoundingMode, &ignored);
return getConstantFP(V, DL, VT);		return getConstantFP(V, DL, VT);
}		}
}		}

// Canonicalize an UNDEF to the RHS, even over a constant.		// Canonicalize an UNDEF to the RHS, even over a constant.
if (N1.isUndef()) {		if (N1.isUndef()) {
if (isCommutativeBinOp(Opcode)) {		if (isCommutativeBinOp(Opcode)) {
std::swap(N1, N2);		std::swap(N1, N2);
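The folding condition repeated for FADD, FSUB and FMUL in the hunk above can be read as a single predicate. The sketch below restates it outside of LLVM so the cases are easier to see. `OpStatus` is a simplified stand-in for the `APFloat::opStatus` values the patch compares against, and `FoldConstraints` and `mayFoldAddSubMul` are hypothetical names, not part of the patch.

```cpp
// Simplified stand-in for the APFloat::opStatus values used in the patch.
enum OpStatus { opOK, opInvalidOp, opDivByZero, opInexact };

// Hypothetical bundle of the three booleans computed before the switch.
struct FoldConstraints {
  bool MustPreserveFPExceptions; // exception behavior is ebStrict
  bool MustBeExact;              // rounding mode is rmDynamic
  bool HasFPExceptions;          // target reports FP exceptions
};

// Mirrors the condition guarding getConstantFP for FADD, FSUB and FMUL.
bool mayFoldAddSubMul(OpStatus S, const FoldConstraints &C) {
  if (S == opOK)
    return true; // an exact, exception-free result can always be folded
  return !C.MustPreserveFPExceptions &&
         (!C.MustBeExact || S != opInexact) &&
         (!C.HasFPExceptions || S != opInvalidOp);
}
```

Read this way, a strict exception mode blocks every fold that is not `opOK`, a dynamic rounding mode additionally blocks inexact folds, and the pre-existing target check for invalid operations is kept as the last term.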

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h

private:
void visitAtomicLoad(const LoadInst &I);		void visitAtomicLoad(const LoadInst &I);
void visitAtomicStore(const StoreInst &I);		void visitAtomicStore(const StoreInst &I);
void visitLoadFromSwiftError(const LoadInst &I);		void visitLoadFromSwiftError(const LoadInst &I);
void visitStoreToSwiftError(const StoreInst &I);		void visitStoreToSwiftError(const StoreInst &I);

void visitInlineAsm(ImmutableCallSite CS);		void visitInlineAsm(ImmutableCallSite CS);
const char *visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);		const char *visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);
void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);		void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);
		void visitConstrainedFPIntrinsic(const CallInst &I, unsigned Intrinsic);

void visitVAStart(const CallInst &I);		void visitVAStart(const CallInst &I);
void visitVAArg(const VAArgInst &I);		void visitVAArg(const VAArgInst &I);
void visitVAEnd(const CallInst &I);		void visitVAEnd(const CallInst &I);
void visitVACopy(const CallInst &I);		void visitVACopy(const CallInst &I);
void visitStackmap(const CallInst &I);		void visitStackmap(const CallInst &I);
void visitPatchpoint(ImmutableCallSite CS,		void visitPatchpoint(ImmutableCallSite CS,
const BasicBlock *EHPadBB = nullptr);		const BasicBlock *EHPadBB = nullptr);

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


[lines omitted; context: case Intrinsic::copysign:]
return nullptr;		return nullptr;
case Intrinsic::fma:		case Intrinsic::fma:
setValue(&I, DAG.getNode(ISD::FMA, sdl,		setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(0)).getValueType(),		getValue(I.getArgOperand(0)).getValueType(),
getValue(I.getArgOperand(0)),		getValue(I.getArgOperand(0)),
getValue(I.getArgOperand(1)),		getValue(I.getArgOperand(1)),
getValue(I.getArgOperand(2))));		getValue(I.getArgOperand(2))));
return nullptr;		return nullptr;
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_frem:
		visitConstrainedFPIntrinsic(I, Intrinsic);
		return nullptr;
case Intrinsic::fmuladd: {		case Intrinsic::fmuladd: {
EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());		EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&		if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&
TLI.isFMAFasterThanFMulAndFAdd(VT)) {		TLI.isFMAFasterThanFMulAndFAdd(VT)) {
setValue(&I, DAG.getNode(ISD::FMA, sdl,		setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(0)).getValueType(),		getValue(I.getArgOperand(0)).getValueType(),
getValue(I.getArgOperand(0)),		getValue(I.getArgOperand(0)),
getValue(I.getArgOperand(1)),		getValue(I.getArgOperand(1)),
[lines omitted; context: SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, unsigned Intrinsic) {]
}		}

case Intrinsic::experimental_deoptimize:		case Intrinsic::experimental_deoptimize:
LowerDeoptimizeCall(&I);		LowerDeoptimizeCall(&I);
return nullptr;		return nullptr;
}		}
}		}

		void SelectionDAGBuilder::visitConstrainedFPIntrinsic(const CallInst &I,
		unsigned Intrinsic) {
		// FIXME: Do something to prevent unwanted code motion.
		SDLoc sdl = getCurSDLoc();
		unsigned Opcode;
		switch (Intrinsic) {
		default: llvm_unreachable("Impossible intrinsic"); // Can't reach here.
		case Intrinsic::experimental_constrained_fadd: Opcode = ISD::FADD; break;
		case Intrinsic::experimental_constrained_fsub: Opcode = ISD::FSUB; break;
		case Intrinsic::experimental_constrained_fmul: Opcode = ISD::FMUL; break;
		case Intrinsic::experimental_constrained_fdiv: Opcode = ISD::FDIV; break;
		case Intrinsic::experimental_constrained_frem: Opcode = ISD::FREM; break;
		}

		const ConstrainedFPIntrinsic *FPI = cast<ConstrainedFPIntrinsic>(&I);
		SDNodeFlags Flags;
		switch (FPI->getRoundingMode()) {
		case ConstrainedFPIntrinsic::rmDynamic:
		Flags.setRoundingMode(SDNodeFlags::rmDynamic);
		break;
		case ConstrainedFPIntrinsic::rmToNearest:
		Flags.setRoundingMode(SDNodeFlags::rmToNearest);
		break;
		case ConstrainedFPIntrinsic::rmDownward:
		Flags.setRoundingMode(SDNodeFlags::rmDownward);
		break;
		[Inline comment] craig.topper: No need for a SmallVector. Just create a 3 entry array and assign each element directly.
		case ConstrainedFPIntrinsic::rmUpward:
		Flags.setRoundingMode(SDNodeFlags::rmUpward);
		break;
		case ConstrainedFPIntrinsic::rmTowardZero:
		Flags.setRoundingMode(SDNodeFlags::rmTowardZero);
		break;
		}
		switch (FPI->getExceptionBehavior()) {
		case ConstrainedFPIntrinsic::ebIgnore:
		Flags.setExceptionBehavior(SDNodeFlags::ebIgnore);
		break;
		case ConstrainedFPIntrinsic::ebMayTrap:
		Flags.setExceptionBehavior(SDNodeFlags::ebMayTrap);
		break;
		case ConstrainedFPIntrinsic::ebStrict:
		Flags.setExceptionBehavior(SDNodeFlags::ebStrict);
		break;
		}
		SDValue FPNode = DAG.getNode(Opcode, sdl,
		getValue(I.getArgOperand(0)).getValueType(),
		getValue(I.getArgOperand(0)),
		getValue(I.getArgOperand(1)), &Flags);
		setValue(&I, FPNode);
		}

std::pair<SDValue, SDValue>		std::pair<SDValue, SDValue>
SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,		SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,
const BasicBlock *EHPadBB) {		const BasicBlock *EHPadBB) {
MachineModuleInfo &MMI = DAG.getMachineFunction().getMMI();		MachineModuleInfo &MMI = DAG.getMachineFunction().getMMI();
MCSymbol *BeginLabel = nullptr;		MCSymbol *BeginLabel = nullptr;

if (EHPadBB) {		if (EHPadBB) {
// Insert a label before the invoke call to mark the try range. This can be		// Insert a label before the invoke call to mark the try range. This can be

lib/IR/IntrinsicInst.cpp

	Value *InstrProfIncrementInst::getStep() const {			Value *InstrProfIncrementInst::getStep() const {
	if (InstrProfIncrementInstStep::classof(this)) {			if (InstrProfIncrementInstStep::classof(this)) {
	return const_cast<Value *>(getArgOperand(4));			return const_cast<Value *>(getArgOperand(4));
	}			}
	const Module *M = getModule();			const Module *M = getModule();
	LLVMContext &Context = M->getContext();			LLVMContext &Context = M->getContext();
	return ConstantInt::get(Type::getInt64Ty(Context), 1);			return ConstantInt::get(Type::getInt64Ty(Context), 1);
	}			}

				ConstrainedFPIntrinsic::RoundingMode
				ConstrainedFPIntrinsic::getRoundingMode() const {
				Metadata *RoundingMD = cast<MetadataAsValue>(getOperand(2))->getMetadata();
				StringRef RoundingArg = cast<MDString>(RoundingMD)->getString();

				// For dynamic rounding mode, we use round to nearest but we will set the
				// 'exact' SDNodeFlag so that the value will not be rounded.
				if (RoundingArg.equals("LLVM_ROUND_DYNAMIC"))
				return rmDynamic;
				else if (RoundingArg.equals("LLVM_ROUND_TONEAREST"))
				[Inline comment] craig.topper: Don't use 'else' after an 'if' containing a 'return' per coding standards.
				[Inline comment] andrew.w.kaylor (author): I can make that change. What I wanted to convey here is that this is effectively a switch statement with no uncovered cases. I suppose since there is no way to get the compiler to warn if that ever changes there is no particular benefit in doing it this way.
				return rmToNearest;
				[Inline comment] craig.topper: You could maybe use a StringSwitch.
				[Inline comment] andrew.w.kaylor (author): That would be perfect, but can I put an llvm_unreachable() in a StringSwitch?
				[Inline comment] craig.topper: I think StringSwitch will assert if it doesn't find a match and there is no default specified.
				[Inline comment] andrew.w.kaylor (author): Yes, I saw that in the code. I wanted something that was informative in non-assert mode too. I guess the obvious solution to that is to add something like an UnreachableDefault() method to StringSwitch.
				[Inline comment] mehdi_amini: "Informative" as in "triggers a runtime failure"? Unreachable isn't the right solution, it is UB in release mode AFAIK.
				else if (RoundingArg.equals("LLVM_ROUND_DOWNWARD"))
				return rmDownward;
				else if (RoundingArg.equals("LLVM_ROUND_UPWARD"))
				return rmUpward;
				else if (RoundingArg.equals("LLVM_ROUND_TOWARDZERO"))
				return rmTowardZero;

				llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");
				}

				ConstrainedFPIntrinsic::ExceptionBehavior
				ConstrainedFPIntrinsic::getExceptionBehavior() const {
				Metadata *ExceptionMD = cast<MetadataAsValue>(getOperand(3))->getMetadata();
				StringRef ExceptionArg = cast<MDString>(ExceptionMD)->getString();
				if (ExceptionArg.equals("LLVM_FPEXCEPT_IGNORE"))
				return ebIgnore;
				else if (ExceptionArg.equals("LLVM_FPEXCEPT_MAYTRAP"))
				return ebMayTrap;
				else if (ExceptionArg.equals("LLVM_FPEXCEPT_STRICT"))
				return ebStrict;

				llvm_unreachable("Unexpected exception behavior argument in FP intrinsic!");
				}
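
The inline comments on these two getters ask for a single string-to-enum mapping shared with the verifier, and point out that llvm_unreachable is undefined behavior in release builds. A minimal standalone sketch of that idea follows. It is not the code that was committed: the enum classes and the function names parseRoundingMode and parseExceptionBehavior are hypothetical, and std::optional is only one assumed way to signal an unrecognized string.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-ins for the enums nested in ConstrainedFPIntrinsic.
enum class RoundingMode { Dynamic, ToNearest, Downward, Upward, TowardZero };
enum class ExceptionBehavior { Ignore, MayTrap, Strict };

// Maps a rounding mode metadata string to its enum value.
// Returns std::nullopt for any string the patch does not accept.
inline std::optional<RoundingMode> parseRoundingMode(std::string_view Arg) {
  // No 'else' after 'return', as requested in the review.
  if (Arg == "LLVM_ROUND_DYNAMIC")
    return RoundingMode::Dynamic;
  if (Arg == "LLVM_ROUND_TONEAREST")
    return RoundingMode::ToNearest;
  if (Arg == "LLVM_ROUND_DOWNWARD")
    return RoundingMode::Downward;
  if (Arg == "LLVM_ROUND_UPWARD")
    return RoundingMode::Upward;
  if (Arg == "LLVM_ROUND_TOWARDZERO")
    return RoundingMode::TowardZero;
  return std::nullopt;
}

// Maps an exception behavior metadata string to its enum value.
// Returns std::nullopt for any string the patch does not accept.
inline std::optional<ExceptionBehavior>
parseExceptionBehavior(std::string_view Arg) {
  if (Arg == "LLVM_FPEXCEPT_IGNORE")
    return ExceptionBehavior::Ignore;
  if (Arg == "LLVM_FPEXCEPT_MAYTRAP")
    return ExceptionBehavior::MayTrap;
  if (Arg == "LLVM_FPEXCEPT_STRICT")
    return ExceptionBehavior::Strict;
  return std::nullopt;
}
```

With helpers of this shape, a verifier could reject a call whenever the result is empty, and the getters would not need a separate copy of the string list.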

lib/IR/Verifier.cpp

[lines omitted; context: #include "llvm/IR/Metadata.def"]
void visitBranchInst(BranchInst &BI);		void visitBranchInst(BranchInst &BI);
void visitReturnInst(ReturnInst &RI);		void visitReturnInst(ReturnInst &RI);
void visitSwitchInst(SwitchInst &SI);		void visitSwitchInst(SwitchInst &SI);
void visitIndirectBrInst(IndirectBrInst &BI);		void visitIndirectBrInst(IndirectBrInst &BI);
void visitSelectInst(SelectInst &SI);		void visitSelectInst(SelectInst &SI);
void visitUserOp1(Instruction &I);		void visitUserOp1(Instruction &I);
void visitUserOp2(Instruction &I) { visitUserOp1(I); }		void visitUserOp2(Instruction &I) { visitUserOp1(I); }
void visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS);		void visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS);
		void visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI);
template <class DbgIntrinsicTy>		template <class DbgIntrinsicTy>
void visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII);		void visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII);
void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);		void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);
void visitAtomicRMWInst(AtomicRMWInst &RMWI);		void visitAtomicRMWInst(AtomicRMWInst &RMWI);
void visitFenceInst(FenceInst &FI);		void visitFenceInst(FenceInst &FI);
void visitAllocaInst(AllocaInst &AI);		void visitAllocaInst(AllocaInst &AI);
void visitExtractValueInst(ExtractValueInst &EVI);		void visitExtractValueInst(ExtractValueInst &EVI);
void visitInsertValueInst(InsertValueInst &IVI);		void visitInsertValueInst(InsertValueInst &IVI);
[lines omitted; context: void Verifier::visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS) {]
}		}
case Intrinsic::ctlz: // llvm.ctlz		case Intrinsic::ctlz: // llvm.ctlz
case Intrinsic::cttz: // llvm.cttz		case Intrinsic::cttz: // llvm.cttz
Assert(isa<ConstantInt>(CS.getArgOperand(1)),		Assert(isa<ConstantInt>(CS.getArgOperand(1)),
"is_zero_undef argument of bit counting intrinsics must be a "		"is_zero_undef argument of bit counting intrinsics must be a "
"constant int",		"constant int",
CS);		CS);
break;		break;
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_frem:
		visitConstrainedFPIntrinsic(
		cast<ConstrainedFPIntrinsic>(*CS.getInstruction()));
		break;
case Intrinsic::dbg_declare: // llvm.dbg.declare		case Intrinsic::dbg_declare: // llvm.dbg.declare
Assert(isa<MetadataAsValue>(CS.getArgOperand(0)),		Assert(isa<MetadataAsValue>(CS.getArgOperand(0)),
"invalid llvm.dbg.declare intrinsic call 1", CS);		"invalid llvm.dbg.declare intrinsic call 1", CS);
visitDbgIntrinsic("declare", cast<DbgDeclareInst>(*CS.getInstruction()));		visitDbgIntrinsic("declare", cast<DbgDeclareInst>(*CS.getInstruction()));
break;		break;
case Intrinsic::dbg_value: // llvm.dbg.value		case Intrinsic::dbg_value: // llvm.dbg.value
visitDbgIntrinsic("value", cast<DbgValueInst>(*CS.getInstruction()));		visitDbgIntrinsic("value", cast<DbgValueInst>(*CS.getInstruction()));
break;		break;
[lines omitted; context: static DISubprogram getSubprogram(Metadata LocalScope) {]
if (auto *LB = dyn_cast<DILexicalBlockBase>(LocalScope))		if (auto *LB = dyn_cast<DILexicalBlockBase>(LocalScope))
return getSubprogram(LB->getRawScope());		return getSubprogram(LB->getRawScope());

// Just return null; broken scope chains are checked elsewhere.		// Just return null; broken scope chains are checked elsewhere.
assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");		assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");
return nullptr;		return nullptr;
}		}

		void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
		Assert(isa<MetadataAsValue>(FPI.getOperand(2)),
		"invalid rounding mode argument", &FPI);
		Metadata *RoundingMD =
		cast<MetadataAsValue>(FPI.getOperand(2))->getMetadata();
		Assert(isa<MDString>(RoundingMD), "invalid rounding mode argument", &FPI);
		StringRef RoundingArg = cast<MDString>(RoundingMD)->getString();
		Assert(RoundingArg.equals("LLVM_ROUND_DYNAMIC") ||
		[Inline comment] DavidKreitzer: Rather than duplicating the legal rounding mode & exception behavior strings here (and in the getRoundingMode & getExceptionBehavior methods), would it be better to have a string-->enum function that is called in both places?
		[Inline comment] andrew.w.kaylor (author): I do like the idea of having a single location for these strings. On the other hand, I think I'd need to introduce a new enum value (invalid?) that is only used here. I'll think about it a bit and try to consolidate the strings one way or another.
		RoundingArg.equals("LLVM_ROUND_TONEAREST") ||
		RoundingArg.equals("LLVM_ROUND_DOWNWARD") ||
		RoundingArg.equals("LLVM_ROUND_UPWARD") ||
		RoundingArg.equals("LLVM_ROUND_TOWARDZERO"),
		"invalid rounding mode argument", &FPI);

		Assert(isa<MetadataAsValue>(FPI.getOperand(3)),
		"invalid exception behavior argument", &FPI);
		Metadata *ExceptionMD =
		cast<MetadataAsValue>(FPI.getOperand(3))->getMetadata();
		Assert(isa<MDString>(ExceptionMD), "invalid exception behavior argument",
		&FPI);
		StringRef ExceptionArg = cast<MDString>(ExceptionMD)->getString();
		Assert(ExceptionArg.equals("LLVM_FPEXCEPT_IGNORE") ||
		ExceptionArg.equals("LLVM_FPEXCEPT_MAYTRAP") ||
		ExceptionArg.equals("LLVM_FPEXCEPT_STRICT"),
		"invalid exception behavior argument", &FPI);
		}

template <class DbgIntrinsicTy>		template <class DbgIntrinsicTy>
void Verifier::visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII) {		void Verifier::visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII) {
auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();		auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();
AssertDI(isa<ValueAsMetadata>(MD) ||		AssertDI(isa<ValueAsMetadata>(MD) ||
(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),		(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),
"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);		"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);
AssertDI(isa<DILocalVariable>(DII.getRawVariable()),		AssertDI(isa<DILocalVariable>(DII.getRawVariable()),
"invalid llvm.dbg." + Kind + " intrinsic variable", &DII,		"invalid llvm.dbg." + Kind + " intrinsic variable", &DII,

test/CodeGen/X86/fp-intrinsics.ll

				; RUN: llc -O3 -mtriple=x86_64-pc-linux < %s | FileCheck %s

				; Verify that constants aren't folded to inexact results when the rounding mode
				; is unknown.
				;
				; double f1() {
				; // Because 0.1 cannot be represented exactly, this shouldn't be folded.
				; return 1.0/10.0;
				; }
				;
				; CHECK-LABEL: f1
				; CHECK: divsd
				define double @f1() {
				entry:
				%div = call double @llvm.experimental.constrained.fdiv.f64(
				double 1.000000e+00,
				double 1.000000e+01,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %div
				}

				; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown.
				; However, transforming this to 'a + (-0)' is OK.
				;
				; double f2(double a) {
				; // Because the result of '0 - 0' is negative zero if rounding mode is
				; // downward, this shouldn't be simplified.
				; return a - 0;
				; }
				;
				; CHECK-LABEL: f2
				; CHECK: addsd
				define double @f2(double %a) {
				entry:
				%div = call double @llvm.experimental.constrained.fsub.f64(
				double %a,
				double 0.000000e+00,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %div
				}

				; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is
				; unknown.
				;
				; double f3(double a, double b) {
				; // Because the intermediate value involved in this calculation may require
				; // rounding, this shouldn't be simplified.
				; return -((-a)*b);
				; }
				;
				; CHECK-LABEL: f3:
				; CHECK: subsd
				; CHECK: mulsd
				; CHECK: subsd
				define double @f3(double %a, double %b) {
				entry:
				%sub = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00, double %a,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				%mul = call double @llvm.experimental.constrained.fmul.f64(
				double %sub, double %b,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				%ret = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00,
				double %mul,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %ret
				}

				; Verify that FP operations are not performed speculatively when FP exceptions
				; are not being ignored.
				;
				; double f4(int n, double a) {
				; // Because a + 1 may overflow, this should not be simplified.
				; if (n > 0)
				; return a + 1.0;
				; return a;
				; }
				;
				;
				; CHECK-LABEL: f4:
				; CHECK: testl
				; CHECK: jle
				; CHECK: addsd
				define double @f4(i32 %n, double %a) {
				entry:
				%cmp = icmp sgt i32 %n, 0
				br i1 %cmp, label %if.then, label %if.end

				if.then:
				%add = call double @llvm.experimental.constrained.fadd.f64(
				double 1.000000e+00, double %a,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				br label %if.end

				if.end:
				%a.0 = phi double [%add, %if.then], [ %a, %entry ]
				ret double %a.0
				}


				@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
				declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
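
The comments in f1 and f2 rest on two IEEE 754 facts: 1.0/10.0 is inexact, so its value depends on the rounding mode, and 0 - 0 yields negative zero when rounding downward. The following standalone C++ sketch illustrates both with <cfenv>. The helper names are hypothetical and are not part of the patch. The volatile temporaries are an assumption made to keep the compiler from folding the operations under the default rounding mode, and the behavior assumes a target whose hardware honors the dynamic rounding mode.

```cpp
#include <cfenv>
#include <cmath>

// Computes Num / Den under the given <cfenv> rounding mode, then restores
// the rounding mode that was active on entry.
inline double divideWithRounding(double Num, double Den, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  // volatile forces the division to happen at run time, after fesetround.
  volatile double N = Num;
  volatile double D = Den;
  volatile double Result = N / D;
  std::fesetround(Saved);
  return Result;
}

// Computes A - B under the given <cfenv> rounding mode, then restores
// the rounding mode that was active on entry.
inline double subtractWithRounding(double A, double B, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile double X = A;
  volatile double Y = B;
  volatile double Result = X - Y;
  std::fesetround(Saved);
  return Result;
}
```

Under these assumptions, dividing 1.0 by 10.0 with FE_DOWNWARD gives a strictly smaller double than with FE_UPWARD, and std::signbit reports a negative zero for 0.0 - 0.0 under FE_DOWNWARD but not under FE_TONEAREST. Folding either operation at compile time with round-to-nearest would hide that difference.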

test/Feature/fp-intrinsics.ll

				; RUN: opt -O3 -S < %s | FileCheck %s

				; Test to verify that constants aren't folded when the rounding mode is unknown.
				; CHECK-LABEL: @f1
				; CHECK: call double @llvm.experimental.constrained.fdiv.f64
				define double @f1() {
				entry:
				%div = call double @llvm.experimental.constrained.fdiv.f64(
				double 1.000000e+00,
				double 1.000000e+01,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %div
				}

				; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown.
				;
				; double f2(double a) {
				; // Because the result of '0 - 0' is negative zero if rounding mode is
				; // downward, this shouldn't be simplified.
				; return a - 0.0;
				; }
				;
				; CHECK-LABEL: @f2
				; CHECK: call double @llvm.experimental.constrained.fsub.f64
				define double @f2(double %a) {
				entry:
				%div = call double @llvm.experimental.constrained.fsub.f64(
				double %a, double 0.000000e+00,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %div
				}

				; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is
				; unknown.
				;
				; double f3(double a, double b) {
				; // Because the intermediate value involved in this calculation may require
				; // rounding, this shouldn't be simplified.
				; return -((-a)*b);
				; }
				;
				; CHECK-LABEL: @f3
				; CHECK: call double @llvm.experimental.constrained.fsub.f64
				; CHECK: call double @llvm.experimental.constrained.fmul.f64
				; CHECK: call double @llvm.experimental.constrained.fsub.f64
				define double @f3(double %a, double %b) {
				entry:
				%sub = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00, double %a,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				%mul = call double @llvm.experimental.constrained.fmul.f64(
				double %sub, double %b,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				%ret = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00,
				double %mul,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %ret
				}

				; Verify that FP operations are not performed speculatively when FP exceptions
				; are not being ignored.
				;
				; double f4(int n, double a) {
				; // Because a + 1 may overflow, this should not be simplified.
				; if (n > 0)
				; return a + 1.0;
				; return a;
				; }
				;
				;
				; CHECK-LABEL: @f4
				; CHECK-NOT: select
				; CHECK: br i1 %cmp
				define double @f4(i32 %n, double %a) {
				entry:
				%cmp = icmp sgt i32 %n, 0
				br i1 %cmp, label %if.then, label %if.end

				if.then:
				%add = call double @llvm.experimental.constrained.fadd.f64(
				double 1.000000e+00, double %a,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				br label %if.end

				if.end:
				%a.0 = phi double [%add, %if.then], [ %a, %entry ]
				ret double %a.0
				}


				@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
				declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
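
The f4 test guards against speculative execution because an FP operation has a side effect on the status flags even when its result is discarded. A small standalone C++ sketch of that effect is given here, using the inexact flag rather than overflow since it is easier to trigger. The function name addRaisesInexact is hypothetical, and the volatile temporaries are an assumption used to keep the addition inside the branch at run time.

```cpp
#include <cfenv>

// Mirrors the shape of f4: adds 1.0 to A only when N > 0.
// Returns true if the FE_INEXACT status flag was raised along the way.
inline bool addRaisesInexact(int N, double A) {
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double In = A;
  volatile double Out = In;
  if (N > 0) {
    // The volatile read and write keep this addition from being hoisted
    // out of the branch, which is the behavior f4 expects of the optimizer.
    Out = In + 1.0;
  }
  return std::fetestexcept(FE_INEXACT) != 0;
}
```

If the addition were hoisted above the branch and turned into a select, a call with N <= 0 and an input such as 0.1 would report the flag even though the source program never performed the addition.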

test/Verifier/fp-intrinsics.ll

				; RUN: opt -verify -S < %s 2>&1 | FileCheck --check-prefix=CHECK1 %s
				; RUN: sed -e s/.T2:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK2 %s
				; RUN: sed -e s/.T3:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK3 %s

				; Common declaration used for all runs.
				declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

				; Test that the verifier accepts legal code, and that the correct attributes are
				; attached to the FP intrinsic.
				; CHECK1: declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) #[[ATTR:[0-9]+]]
				; CHECK1: attributes #[[ATTR]] = { inaccessiblememonly nounwind }
				; Note: FP exceptions aren't usually caught through normal unwind mechanisms,
				; but we may want to revisit this for asynchronous exception handling.
				define double @f1(double %a, double %b) {
				entry:
				%fadd = call double @llvm.experimental.constrained.fadd.f64(
				double %a, double %b,
				metadata !"LLVM_ROUND_DYNAMIC",
				metadata !"LLVM_FPEXCEPT_STRICT")
				ret double %fadd
				}

				; Test an illegal value for the rounding mode argument.
				; CHECK2: invalid rounding mode argument
				;T2: define double @f2(double %a, double %b) {
				;T2: entry:
				;T2: %fadd = call double @llvm.experimental.constrained.fadd.f64(
				;T2: double %a, double %b,
				;T2: metadata !"LLVM_ROUND_DYNOMITE",
				;T2: metadata !"LLVM_FPEXCEPT_STRICT")
				;T2: ret double %fadd
				;T2: }

				; Test an illegal value for the exception behavior argument.
				; CHECK3: invalid exception behavior argument
				;T3: define double @f2(double %a, double %b) {
				;T3: entry:
				;T3: %fadd = call double @llvm.experimental.constrained.fadd.f64(
				;T3: double %a, double %b,
				;T3: metadata !"LLVM_ROUND_DYNAMIC",
				;T3: metadata !"LLVM_FPEXCEPT_RESTRICT")
				;T3: ret double %fadd
				;T3: }
