This is an archive of the discontinued LLVM Phabricator instance.

Add intrinsics for constrained floating point operations
ClosedPublic

Authored by andrew.w.kaylor on Nov 22 2016, 5:32 PM.

Details

Reviewers

aemerson
DavidKreitzer
mehdi_amini
hfinkel

Commits

rGa0a1164ce41b: Add intrinsics for constrained floating point operations
rL293226: Add intrinsics for constrained floating point operations

Summary

This adds intrinsics that can be used to constrain optimizations that assume the default rounding mode and ignore FP exceptions.

I am starting with a simple lowering that translates the intrinsics directly to the corresponding target-independent FP operations when the selection DAG is built. I think this is necessary because the existing target-specific pattern matching for FP operations is fairly complex and I think attempting to duplicate all of that infrastructure is likely to result in errors and will certainly result in more work each time new wrinkles are introduced in the future.

I intend to model the implicit uses and defs of FP control and status registers in a future revision, which should be sufficient to prevent unwanted optimizations at the MachineInstr level. There is also a potential for incorrect code motion during instruction selection. My tentative plan to handle that is to introduce pseudo-instruction nodes that wrap the inputs and outputs of the FP operations created from the new intrinsics, model the implicit FP control and status register behavior, and use a chain. These nodes would then be eliminated during instruction selection.

I believe that this patch is sufficient as presented here and can be committed without any solution to the potential problems with code motion at the ISel and MachineInstr levels.

Diff Detail

Repository: rL LLVM

Event Timeline

andrew.w.kaylor updated this revision to Diff 79005.Nov 22 2016, 5:32 PM

andrew.w.kaylor retitled this revision from to Add intrinsics for constrained floating point operations.

andrew.w.kaylor updated this object.

andrew.w.kaylor added reviewers: mehdi_amini, aemerson, DavidKreitzer, hfinkel.

andrew.w.kaylor set the repository for this revision to rL LLVM.

andrew.w.kaylor added a subscriber: llvm-commits.

RKSimon added a subscriber: RKSimon.Nov 23 2016, 3:26 AM

I intend to model the implicit uses and defs of FP control and status registers in a future revision, which should be sufficient to prevent unwanted optimizations at the MachineInstr level. There is also a potential for incorrect code motion during instruction selection. My tentative plan to handle that is to introduce pseudo-instruction nodes that wrap the inputs and outputs of the FP operations created from the new intrinsics, model the implicit FP control and status register behavior, and use a chain. These nodes would then be eliminated during instruction selection.

Based on our conversation at the dev meeting, here's how I thought this would work:

Introduce target-independent chain-carrying nodes to represent these operations. For argument's sake, STRICT_FADD, etc.
Since nothing in the SDAG knows about what these nodes do, there's no problem with optimizations doing bad things.
You'd do something minimal in SelectionDAGISel::DoInstructionSelection() around the call to:

Select(Node);

so that it would become:

bool IsStrictFPOp = isStrictFPOp(Node);
if (IsStrictFPOp)
  mutateStrictFPToFP(Node); // STRICT_FADD -> FADD, etc.

Select(Node);

if (IsStrictFPOp && !TLI->addStrictFPRegDeps(Node))
  report_fatal_error("Could not add strict FP reg deps");

and then you'd be done. Obviously this is somewhat hand wavy, but if there are complexities I'm overlooking, I'd like to understand them.

In D27028#604172, @hfinkel wrote:
Based on our conversation at the dev meeting, here's how I thought this would work:

Introduce target-independent chain-carrying nodes to represent these operations. For argument's sake, STRICT_FADD, etc.

Since nothing in the SDAG knows about what these nodes do, there's no problem with optimizations doing bad things.

You'd do something minimal in SelectionDAGISel::DoInstructionSelection() around the call to:

Select(Node);

so that it would become:
bool IsStrictFPOp = isStrictFPOp(Node);
if (IsStrictFPOp)
  mutateStrictFPToFP(Node); // STRICT_FADD -> FADD, etc.

Select(Node);

if (IsStrictFPOp && !TLI->addStrictFPRegDeps(Node))
  report_fatal_error("Could not add strict FP reg deps");
and then you'd be done. Obviously this is somewhat hand wavy, but if there are complexities I'm overlooking, I'd like to understand them.

Yes, that's exactly what I wanted to do, but I couldn't figure out how to do it. This is my first time exploring in ISel and it's been a bit intimidating. The way you describe it sounds very simple. I guess I was just reluctant to do that because there are currently no target-independent transformations happening there. I tried putting something in SelectionDAGISel::SelectCommonCode() but it seemed like that was too late for the kind of double transformation I needed (STRICT_FADD->FADD->target-specific instruction). I'm always hesitant to believe that my new feature is special and really needs something that has never been needed before. Usually that's a signal that I'm misunderstanding the design. So I tried to do something that seemed a more natural fit with the existing code.

I'll go back and see what I can do with it following the model you describe here and see what it looks like.

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

In D27028#604594, @arsenm wrote:

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

In D27028#604603, @hfinkel wrote:

In D27028#604594, @arsenm wrote:

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

I was going to ask that same question.

For the purposes of these intrinsics, we only need enough information to control how optimization passes will handle these instructions. I suppose a transformation like constant folding could theoretically have something to flush to zero. Does it make sense to treat that as a set of new rounding modes, like LLVM_ROUND_TONEAREST_FTZ, etc.?

In D27028#604609, @andrew.w.kaylor wrote:

In D27028#604603, @hfinkel wrote:

In D27028#604594, @arsenm wrote:

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

I was going to ask that same question.

For the purposes of these intrinsics, we only need enough information to control how optimization passes will handle these instructions. I suppose a transformation like constant folding could theoretically have something to flush to zero. Does it make sense to treat that as a set of new rounding modes, like LLVM_ROUND_TONEAREST_FTZ, etc.?

Yes, I think it makes sense to treat it the same as a rounding mode. For lowering some of these operations, we would need to change the denormal mode for some subsequence of the lowering. Some of the OpenCL library builtin functions also need to do this, and need to be able to control this via the intrinsic.

Is flush-to-zero currently handled with function attributes?

Also, how would you feel about deferring flush-to-zero support to a later patch?

In D27028#604694, @andrew.w.kaylor wrote:

Is flush-to-zero currently handled with function attributes?

Also, how would you feel about deferring flush-to-zero support to a later patch?

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

In D27028#604699, @arsenm wrote:

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

It looks like several sub-targets have some kind of denormal support, but I didn't look closely enough to see what each of them is doing. That's kind of why I would prefer to defer the issue...because I haven't thought about it enough to know that I'd be implementing it correctly.

So to get back to Hal's question, is this needed at a per-instruction level? I understand that you want to select instructions differently based on this setting, but is attaching it to the instruction just a convenient way to get the information to the ISel or can this change across different scopes?

In D27028#604714, @andrew.w.kaylor wrote:

In D27028#604699, @arsenm wrote:

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

It looks like several sub-targets have some kind of denormal support, but I didn't look closely enough to see what each of them is doing. That's kind of why I would prefer to defer the issue...because I haven't thought about it enough to know that I'd be implementing it correctly.

So to get back to Hal's question, is this needed at a per-instruction level? I understand that you want to select instructions differently based on this setting, but is attaching it to the instruction just a convenient way to get the information to the ISel or can this change across different scopes?

It's the same as the rounding mode and would be per-instruction.

I re-wrote the ISel code to introduce a pseudo-instruction for the strict variants of FP operations, which is mutated directly to a normal FP node just before instruction selection. I removed the SDNodeFlag extensions I had added in my original patch because they aren't needed in this implementation. Some variant of that code will likely need to be re-introduced at some point, particularly to handle FTZ rounding modes.

I did not introduce the code to add a target-specific implicit use of the FP-environment register. I do intend to add this in a later patch, but this code works well enough without it that I believe it can be deferred.

At some point it will be necessary to teach the LegalizeDAG code how to handle the strict instructions in order to legalize strict FP nodes with vector operands. I intend to do that in a later patch also.

b-sumner added a subscriber: b-sumner.Nov 30 2016, 9:41 AM

I did not introduce the code to add a target-specific implicit use of the FP-environment register. I do intend to add this in a later patch, but this code works well enough without it that I believe it can be deferred.

I'm fine with doing this, but there should be FIXME comments in the code about this and/or bug reports filed (preferably both) so that we're not tempted to remove 'experimental' from the name until we do this part of it.

Regarding denormals, LLVM has support for the following modes (include/llvm/Target/TargetOptions.h):

namespace FPDenormal {
  enum DenormalMode {
    IEEE,           // IEEE 754 denormal numbers
    PreserveSign,   // the sign of a flushed-to-zero number is preserved in
                    // the sign of 0
    PositiveZero    // denormals are flushed to positive zero
  };
}

and so I recommend just following the same pattern here as with the rounding mode with strings like "denormals.dynamic", "denormals.ieee", "denormals.preserve_sign", and "denormals.positive_zero".

docs/LangRef.rst
12046 ↗	(On Diff #79447)	To engage in some bikeshedding, I really don't like this naming convention which reminds me of macro names. Not that we've been extremely consistent about the standardized metadata strings we already use, but this would be yet another choice; we should avoid that. For controlling LLVM optimization passes, we use strings like this !"llvm.loop.vectorize.enable". We also use strings like !"function_entry_count" and !"branch_weights" for profiling info. We also have strings like !"ProfileFormat" and !"TotalCount" in other places. Of these choices, I prefer the one used by our basic profiling information (i.e. !"branch_weights"), and so I'd prefer these be named: "round_dynamic" "round_downward" ... and similar for the fpexcept strings. We might also borrow the dot-based hierarchy scheme from our loop optimization metadata and use: "round.dynamic" "round.downward" ... and similar for the fpexcept strings. I think that I like this slightly better than just using the underscore separators. I specifically dislike having "llvm" in the name here, as it makes it seem as though the meaning of these things is somehow LLVM-specific. It is not (they come from IEEE if nothing else). Having "llvm" in the optimization metadata makes a bit more sense because those refer to specific LLVM optimization passes.
12083 ↗	(On Diff #79447)	"not be unmasked" is a double negative. How about just saying will be masked? Or will not be enabled?

Few stylistic comments.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5773 ↗	(On Diff #79447)	No need for a SmallVector. Just create a 3 entry array and assign each element directly.
lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
955 ↗	(On Diff #79447)	This doesn't need to be a SmallVector. You can just use SDValue Ops[2] = { Node->getOperand(1), Node->getOperand(2) };
lib/IR/IntrinsicInst.cpp
106 ↗	(On Diff #79447)	Don't use 'else' after an 'if' containing a 'return' per coding standards.

andrew.w.kaylor added inline comments.Dec 5 2016, 11:08 AM

lib/IR/IntrinsicInst.cpp
106 ↗	(On Diff #79447)	I can make that change. What I wanted to convey here is that this is effectively a switch statement with no uncovered cases. I suppose since there is no way to get the compiler to warn if that ever changes there is no particular benefit in doing it this way.

craig.topper added inline comments.Dec 5 2016, 11:43 AM

lib/IR/IntrinsicInst.cpp
107 ↗	(On Diff #79447)	You could maybe use a StringSwitch.

andrew.w.kaylor added inline comments.Dec 5 2016, 12:03 PM

lib/IR/IntrinsicInst.cpp
107 ↗	(On Diff #79447)	That would be perfect, but can I put an llvm_unreachable() in a StringSwitch?

craig.topper added inline comments.Dec 5 2016, 12:12 PM

lib/IR/IntrinsicInst.cpp
107 ↗	(On Diff #79447)	I think StringSwitch will assert if it doesn't find a match and there is no default specified.

andrew.w.kaylor added inline comments.Dec 5 2016, 2:41 PM

lib/IR/IntrinsicInst.cpp
107 ↗	(On Diff #79447)	Yes, I saw that in the code. I wanted something that was informative in non-assert mode too. I guess the obvious solution to that is to add something like an UnreachableDefault() method to StringSwitch.

I don't think llvm_unreachable is guaranteed to do anything in non-asserts build. From the description in the header:

/// Marks that the current location is not supposed to be reachable.
/// In !NDEBUG builds, prints the message and location info to stderr.
/// In NDEBUG builds, becomes an optimizer hint that the current location
/// is not supposed to be reachable. On compilers that don't support
/// such hints, prints a reduced message instead.
///
/// Use this instead of assert(0). It conveys intent more clearly and
/// allows compilers to omit some unnecessary code.
#ifndef NDEBUG
#define llvm_unreachable(msg) \
  ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
#elif defined(LLVM_BUILTIN_UNREACHABLE)
#define llvm_unreachable(msg) LLVM_BUILTIN_UNREACHABLE
#else
#define llvm_unreachable(msg) ::llvm::llvm_unreachable_internal()
#endif

mehdi_amini added inline comments.Dec 5 2016, 2:49 PM

lib/IR/IntrinsicInst.cpp
107 ↗	(On Diff #79447)	"Informative" as in "triggers a runtime failure"? Unreachable isn't the right solution, it is UB in release mode AFAIK.

I see. I still think it's worth adding UnreachableDefault() instead of just a comment explaining that other values will assert/crash. I'm checking the legal values in the verifier, so this shouldn't be an issue.

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.
Otherwise a possible more generic extension to StringSwitch would be to accept lambdas:

StringSwitch(...)
   .Case("Value1", [] { return computeValue(); })
   .Default([]() -> T { report_fatal_error("Unexpected Value XXX in YYYY"); });

In D27028#613867, @mehdi_amini wrote:

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");

the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment. So what I'd like to do is this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero)
  .UnreachableDefault("Unexpected rounding mode argument in FP intrinsic!");

which at least in !NDEBUG builds will produce a useful message. This would be fairly trivial to implement.
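The proposal above can be modelled outside of LLVM. The following is a minimal sketch, assuming a toy class (ToyStringSwitch, a hypothetical name) that stands in for llvm::StringSwitch and is not the real class. Its UnreachableDefault() is also hypothetical: unlike the proposal, which only promises a message in !NDEBUG builds, this sketch prints the message and aborts in every build mode, so that an unmatched string never becomes undefined behavior.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <optional>
#include <string_view>

// Toy stand-in for llvm::StringSwitch (hypothetical; not the real class).
template <typename T> class ToyStringSwitch {
  std::string_view Str;
  std::optional<T> Result;

public:
  explicit ToyStringSwitch(std::string_view S) : Str(S) {}

  // Records Value if Key matches and no earlier case matched.
  ToyStringSwitch &Case(std::string_view Key, T Value) {
    if (!Result && Str == Key)
      Result = Value;
    return *this;
  }

  // Hypothetical terminator: returns the match, or prints Msg and aborts.
  // Assumption: aborting in all build modes, not only in !NDEBUG builds.
  T UnreachableDefault(const char *Msg) const {
    if (Result)
      return *Result;
    std::fprintf(stderr, "%s\n", Msg);
    std::abort();
  }
};

// Enumerator names and strings are taken from the snippet in the thread.
enum RoundingMode { rmDynamic, rmToNearest, rmDownward, rmUpward, rmTowardZero };

RoundingMode parseRoundingMode(std::string_view RoundingArg) {
  return ToyStringSwitch<RoundingMode>(RoundingArg)
      .Case("round.dynamic", rmDynamic)
      .Case("round.tonearest", rmToNearest)
      .Case("round.downward", rmDownward)
      .Case("round.upward", rmUpward)
      .Case("round.towardzero", rmTowardZero)
      .UnreachableDefault("Unexpected rounding mode argument in FP intrinsic!");
}
```

With this shape, parseRoundingMode("round.upward") yields rmUpward, and any string outside the five cases terminates the program with the diagnostic instead of falling through silently.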

In D27028#613926, @andrew.w.kaylor wrote:
In D27028#613867, @mehdi_amini wrote:

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:
return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");
the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment.

Not really: in an optimized build it means that the default case is unreachable. The assertion does not exist there, and the optimizer can drop the nullptr dereference.

So what I'd like to do is this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero)
  .UnreachableDefault("Unexpected rounding mode argument in FP intrinsic!");

which at least in !NDEBUG builds will produce a useful message. This would be fairly trivial to implement.

Right the only difference is that you get a nicer runtime failure in assert mode.

arsenm added inline comments.Dec 5 2016, 5:01 PM

docs/LangRef.rst
12126 ↗	(On Diff #79447)	Typo: lllvm (seems repeated a number of times)

arsenm added inline comments.Dec 5 2016, 5:04 PM

include/llvm/IR/Intrinsics.td
475 ↗	(On Diff #79447)	Also fma, minnum, maxnum, sqrt

andrew.w.kaylor added inline comments.Dec 5 2016, 5:19 PM

include/llvm/IR/Intrinsics.td
475 ↗	(On Diff #79447)	That's a bit of a can of worms, isn't it? I mean there are a whole bunch of FP intrinsics, and I can imagine it making sense to have constrained versions of any of them.

In D27028#613949, @mehdi_amini wrote:
In D27028#613926, @andrew.w.kaylor wrote:
In D27028#613867, @mehdi_amini wrote:

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:
return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");
the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment.
Not really: in an optimized build it means that the default case is unreachable. The assertion does not exist there, and the optimizer can drop the nullptr dereference.

But the 'return' will always return, won't it? I don't see how the optimizer could associate the llvm_unreachable statement with the 'return StringSwitch(...)' to be able to do anything with it.

I'm in the process of preparing an updated revision that implements UnreachableDefault as I suggested, but I'm not overly attached to that approach. If you feel strongly that something else would be better I can change it.

Fixed typos and style issues.
Added FIXME comments as requested.

I have opened two Bugzilla reports.

One to add FP environment register modeling:
https://llvm.org/bugs/show_bug.cgi?id=31284

One to address denormal handling:
https://llvm.org/bugs/show_bug.cgi?id=31285

I'd like to talk more about the denormal handling apart from this review. I'm not certain that rolling it into these intrinsics is the correct solution. It might be, and I'd be happy to do that if we agree that's the best approach. It seems to me as though it has certain similarities to the fast math flags as well, so I'd like to talk about that relationship a bit.

In D27028#614081, @andrew.w.kaylor wrote:

Fixed typos and style issues.
Added FIXME comments as requested.

I have opened two Bugzilla reports.

One to add FP environment register modeling:
https://llvm.org/bugs/show_bug.cgi?id=31284

One to address denormal handling:
https://llvm.org/bugs/show_bug.cgi?id=31285

I'd like to talk more about the denormal handling apart from this review. I'm not certain that rolling it into these intrinsics is the correct solution. It might be, and I'd be happy to do that if we agree that's the best approach. It seems to me as though it has certain similarities to the fast math flags as well, so I'd like to talk about that relationship a bit.

Correct handling of ftz is required for correctness, so it isn't appropriate to be an optimization hint placed with the fast math flags

In D27028#614098, @arsenm wrote:

Correct handling of ftz is required for correctness, so it isn't appropriate to be an optimization hint placed with the fast math flags

But what's correct? The reason I think it's similar to the fast math flags is that if nothing is specified then optimizations can make no transformations based on FTZ, and if you want to be able to make a transformation that assumes a certain mode then you have to find the attribute set.

Basically, the way I've been viewing the difference between the intrinsics I'm introducing and the fast math flags is that the fast math flags are permissive (behavior is assumed to be restricted unless the attributes are present) whereas the intrinsics are restrictive (behavior is assumed to be permitted unless the constraining intrinsic is used). I'm not sure the denormal handling falls cleanly into either of these categories, but it seems to me more like the permissive approach is appropriate.

andrew.w.kaylor added a subscriber: Farhana.Dec 8 2016, 11:56 AM

Ping

Ping.

FWIW, rounding controls are needed for llvm.fma.*, llvm.fmuladd.*, and llvm.sqrt.*

Also, I don't see any reason to cover frem, but I also don't understand how frem ever got to be an instruction instead of a standard C library intrinsic.

In D27028#634214, @b-sumner wrote:

FWIW, rounding controls are needed for llvm.fma.*, llvm.fmuladd.*, and llvm.sqrt.*

I agree these are needed. I've added a comment in the .td file and intend to add variations of those intrinsics in a later revision.

In D27028#634214, @b-sumner wrote:

Also, I don't see any reason to cover frem, but I also don't understand how frem ever got to be an instruction instead of a standard C library intrinsic.

I'm not sure it actually happens, but I would think that frem is potentially subject to the same sorts of constant folding that can be done with fdiv.

This all looks pretty good to me, Andy. Regarding

In D27028#634214, @b-sumner wrote:

Also, I don't see any reason to cover frem, but I also don't understand how frem ever got to be an instruction instead of a standard C library intrinsic.

I'm not sure it actually happens, but I would think that frem is potentially subject to the same sorts of constant folding that can be done with fdiv.

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".
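The exceptional cases mentioned here can be observed on a host machine. The following is a minimal sketch, assuming std::fmod as a stand-in for frem (the two are understood to compute the same result) and assuming a C library that sets the IEEE exception flags for fmod. The helper name flagsRaisedByFmod is hypothetical, and the volatile locals are only there to keep the compiler from folding the call at compile time.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical helper: returns the FP exception flags raised by fmod(X, Y).
// Note that it clears the caller's pending exception flags as a side effect.
int flagsRaisedByFmod(double X, double Y) {
  volatile double VX = X; // volatile: force a run-time evaluation
  volatile double VY = Y;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double R = std::fmod(VX, VY);
  (void)R; // only the flags are of interest here
  return std::fetestexcept(FE_ALL_EXCEPT);
}
```

On such a platform, flagsRaisedByFmod(infinity, 1.0) and flagsRaisedByFmod(1.0, 0.0) both report FE_INVALID, while an ordinary case such as fmod(5.5, 2.0) produces exactly 1.5 whatever the rounding mode. That is the distinction drawn above: the rounding argument is redundant for frem, but the exception behavior is not.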

Regarding the denormal-handling issue, I agree that it is reasonable to defer support to a subsequent patch. I suspect you are ultimately going to want a separate argument rather than folding denormal-handling into the rounding mode. But we can discuss that in the Bugzilla report that you opened.

docs/LangRef.rst
12147 ↗	(On Diff #80354)	This question might be beyond the scope of this initial change set, but here goes. FP negation is currently handled using fsub -0, X. Is that sufficient in a constrained context? If we allow X to be a signaling NaN, -0-X should raise Invalid while -X should not, at least according to IEEE-754.
lib/IR/Verifier.cpp
4265 ↗	(On Diff #80354)	Rather than duplicating the legal rounding mode & exception behavior strings here (and in the getRoundingMode & getExceptionBehavior methods), would it be better to have a string-->enum function that is called in both places?
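One possible shape for the shared string-->enum function is sketched below. This is an illustration under stated assumptions, not the code that was committed: the names strToRoundingMode, isValidRoundingArg, and getRoundingMode are hypothetical, std::optional is used in place of an extra "invalid" enumerator, and the fallback in the accessor is an arbitrary defensive choice. The strings are the ones quoted earlier in this thread.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

enum class RoundingMode { Dynamic, ToNearest, Downward, Upward, TowardZero };

// Single home for the metadata strings (hypothetical helper).
// Returns std::nullopt for any string that is not a known rounding mode.
std::optional<RoundingMode> strToRoundingMode(std::string_view Arg) {
  if (Arg == "round.dynamic")
    return RoundingMode::Dynamic;
  if (Arg == "round.tonearest")
    return RoundingMode::ToNearest;
  if (Arg == "round.downward")
    return RoundingMode::Downward;
  if (Arg == "round.upward")
    return RoundingMode::Upward;
  if (Arg == "round.towardzero")
    return RoundingMode::TowardZero;
  return std::nullopt;
}

// Verifier-style use: only asks whether the string is legal.
bool isValidRoundingArg(std::string_view Arg) {
  return strToRoundingMode(Arg).has_value();
}

// Accessor-style use: assumed to run only on verified IR, so the fallback
// to Dynamic (the most conservative mode) is purely defensive.
RoundingMode getRoundingMode(std::string_view Arg) {
  return strToRoundingMode(Arg).value_or(RoundingMode::Dynamic);
}
```

Both callers go through the same table, so adding or renaming a mode string touches one function only.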

scanon added a subscriber: scanon.Jan 4 2017, 9:32 AM

scanon added inline comments.

docs/LangRef.rst
12147 ↗	(On Diff #80354)	Probably more importantly, -X has a defined signbit (the negation of whatever the signbit of X was), but -0-X does not [assuming the obvious binding of -X to the IEEE 754 negate operation and -0-X to the subtract operation].

In D27028#635372, @DavidKreitzer wrote:

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".

The implementation is slightly simpler if I give the frem intrinsic a rounding argument, even though it isn't needed. Otherwise, that intrinsic couldn't be handled by the same subclass as the others. I realize that's a fairly weak argument, but I just feel like making this one intrinsic different from the others will make the code ugly.

lib/IR/Verifier.cpp
4265 ↗	(On Diff #80354)	I do like the idea of having a single location for these strings. On the other hand, I think I'd need to introduce a new enum value (invalid?) that is only used here. I'll think about it a bit and try to consolidate the strings one way or another.

In D27028#636346, @andrew.w.kaylor wrote:

In D27028#635372, @DavidKreitzer wrote:

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".

The implementation is slightly simpler if I give the frem intrinsic a rounding argument, even though it isn't needed. Otherwise, that intrinsic couldn't be handled by the same subclass as the others. I realize that's a fairly weak argument, but I just feel like making this one intrinsic different from the others will make the code ugly.

Uniformity of implementation is a reasonable argument in favor of the extraneous rounding mode parameter for the frem intrinsic. But could you please make it clear that that is your intent in the language ref description of @llvm.experimental.constrained.frem?

docs/LangRef.rst
12147 ↗	(On Diff #80354)	Steve, are you referring to the fact that the sign of -0-X is unspecified when X is NaN or something else? I am trying to understand the implications of your comment - whether they are specific to the new constrained FP intrinsics or whether something needs to be done for normal FP LLVM IR. Another problem worth mentioning with implementing -X as -0-X in a constrained context is that -0-X will produce -0 when X is -0 and the rounding mode is toward -inf.
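The signed-zero case can be checked on a host machine. The following is a minimal sketch, assuming a platform that defines FE_DOWNWARD and honors the dynamic rounding mode for run-time arithmetic; the helper name negateViaFSub is hypothetical, and the volatile locals are there to keep the compiler from folding the subtraction under the default rounding mode.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper: computes -0.0 - X at run time under the given
// rounding mode, then restores the rounding mode that was active before.
double negateViaFSub(double X, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile double VX = X;      // volatile: force a run-time subtraction
  volatile double NegZero = -0.0;
  volatile double R = NegZero - VX;
  std::fesetround(Saved);
  return R;
}
```

For X = -0.0, a true negation yields +0.0. negateViaFSub(-0.0, FE_TONEAREST) also yields +0.0, but negateViaFSub(-0.0, FE_DOWNWARD) yields -0.0, because IEEE 754 gives an exact zero sum of opposite-signed zeros a negative sign only when rounding toward negative infinity.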

-Combined uses of string literals for FP rounding mode and exception behavior.
-Removed extension to StringSwitch since it is no longer needed.
-Added documentation explaining that the rounding mode argument isn't used by the frem intrinsic.

arsenm added inline comments.Jan 10 2017, 5:47 PM

lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
928–935 ↗	(On Diff #83904)	You can replace this with a switch strict->regular op mapping, and then you avoid having to switch over these again in MutateStrictFPToFP
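The single-switch idea can be sketched in isolation. The following is a minimal illustration, assuming a toy Opcode enum and a toy Node struct rather than the real ISD opcodes and SDNode; only STRICT_FADD and FADD are named in this thread, and the remaining enumerators are assumptions made for the sake of the example.

```cpp
#include <cassert>
#include <optional>

// Toy opcode set (hypothetical; modelled on the names used in the review).
enum class Opcode {
  FADD, FSUB, FMUL, FDIV, FREM,
  STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM
};

// One switch answers both questions: "is this a strict FP op?" (the result
// has a value) and "which regular op does it become?" (the value itself).
std::optional<Opcode> strictToRegular(Opcode Op) {
  switch (Op) {
  case Opcode::STRICT_FADD: return Opcode::FADD;
  case Opcode::STRICT_FSUB: return Opcode::FSUB;
  case Opcode::STRICT_FMUL: return Opcode::FMUL;
  case Opcode::STRICT_FDIV: return Opcode::FDIV;
  case Opcode::STRICT_FREM: return Opcode::FREM;
  default:                  return std::nullopt;
  }
}

// Toy node; a real SDNode also carries operands and a chain.
struct Node {
  Opcode Op;
};

// Rewrites a strict node in place. Returns true if a mutation happened.
bool mutateStrictFPToFP(Node &N) {
  if (std::optional<Opcode> Regular = strictToRegular(N.Op)) {
    N.Op = *Regular;
    return true;
  }
  return false;
}
```

Because the mapping doubles as the predicate, the caller does not need a separate isStrictFPOp-style check followed by a second switch inside the mutation routine.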

-Consolidated examination of StrictFP node opcodes.

Is there anything left blocking this patch?

I'm happy with this now. We'll need to address adding implicit instruction dependencies (and generating these intrinsics from Clang in appropriate modes) in follow-up. Thanks for working on this.

This revision is now accepted and ready to land.Jan 25 2017, 5:05 PM

Closed by commit rL293226: Add intrinsics for constrained floating point operations (authored by akaylor). Jan 26 2017, 3:39 PM

This revision was automatically updated to reflect the committed changes.

andrew.w.kaylor mentioned this in D32319: Add constrained intrinsics for some libm-equivalent operations.Apr 20 2017, 3:54 PM

kpn mentioned this in D43515: More math intrinsics for conservative math handling.Feb 20 2018, 9:44 AM

@andrew.w.kaylor I went through the mailing list thread regarding this change and saw "Eventually, we’ll want to go back and teach specific optimizations to understand the intrinsics so that where possible optimizations can be performed in a manner consistent with dynamic rounding modes and strict exception handling.".
Do you have any references or plans for teaching specific optimizations about this?

Herald added a subscriber: jdoerfert. Apr 22 2020, 8:50 AM

In D27028#1997092, @GGanesh wrote:

@andrew.w.kaylor I went through the mailing list thread regarding this change and saw "Eventually, we’ll want to go back and teach specific optimizations to understand the intrinsics so that where possible optimizations can be performed in a manner consistent with dynamic rounding modes and strict exception handling.".
Do you have any references or plans for teaching specific optimizations about this?

It depends on the optimization. Each optimization needs to be evaluated in some way to determine if it is safe to perform the optimization with the floating point constraints. Simon Moll has an idea that he can update the pattern matching to also match closely related intrinsics. Simon intends to work on this for the vector predicated intrinsics, but the idea should extend to constrained floating point intrinsics. However, we will still need to update the optimizations to evaluate whether their intended transformations meet the constraints. For example, pattern matching may find opportunities for constant folding, but we will need to examine the actual constants to determine if the operation being folded would be inexact (and therefore subject to rounding) or otherwise raise an FP exception.
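The constant-folding check described here can be illustrated on a host machine. The following is a minimal sketch, not LLVM's folding code: the helper name tryFoldFAdd is hypothetical, it evaluates the addition with host arithmetic in the current rounding mode, and it relies on volatile locals to force a run-time evaluation so the exception flags are meaningful.

```cpp
#include <cassert>
#include <cfenv>
#include <optional>

// Hypothetical helper: folds A + B only if the addition raises no FP
// exception at all (no inexact, overflow, underflow, or invalid).
// Returns std::nullopt when folding could change rounding or flag behavior.
std::optional<double> tryFoldFAdd(double A, double B) {
  std::fenv_t Env;
  if (std::feholdexcept(&Env) != 0) // save environment and clear the flags
    return std::nullopt;            // cannot inspect flags: refuse to fold

  volatile double VA = A; // volatile: keep the add out of compile time
  volatile double VB = B;
  volatile double Sum = VA + VB;
  const int Raised = std::fetestexcept(FE_ALL_EXCEPT);

  std::fesetenv(&Env); // restore the caller's environment and flags
  if (Raised != 0)
    return std::nullopt;
  const double Result = Sum;
  return Result;
}
```

Under these assumptions, tryFoldFAdd(1.0, 2.0) folds to 3.0 because the sum is exact, while tryFoldFAdd(0.1, 0.2) declines because the result is inexact and therefore depends on the rounding mode.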

Revision Contents

Path	Size
llvm/trunk/docs/LangRef.rst	271 lines
llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h	6 lines
llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h	2 lines
llvm/trunk/include/llvm/IR/IntrinsicInst.h	39 lines
llvm/trunk/include/llvm/IR/Intrinsics.td	36 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h	1 line
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	47 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp	60 lines
llvm/trunk/lib/IR/IntrinsicInst.cpp	32 lines
llvm/trunk/lib/IR/Verifier.cpp	18 lines
llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll	111 lines
llvm/trunk/test/Feature/fp-intrinsics.ll	102 lines
llvm/trunk/test/Verifier/fp-intrinsics.ll	43 lines

Diff 85977

llvm/trunk/docs/LangRef.rst


	Show First 20 Lines • Show All 12,058 Lines • ▼ Show 20 Lines
	the pointer to the memory for which the ``invariant.group`` no longer holds.			the pointer to the memory for which the ``invariant.group`` no longer holds.

	Semantics:			Semantics:
	""""""""""			""""""""""

	Returns another pointer that aliases its argument but which is considered different			Returns another pointer that aliases its argument but which is considered different
	for the purposes of ``load``/``store`` ``invariant.group`` metadata.			for the purposes of ``load``/``store`` ``invariant.group`` metadata.

				Constrained Floating Point Intrinsics
				-------------------------------------

				These intrinsics are used to provide special handling of floating point
				operations when specific rounding mode or floating point exception behavior is
				required. By default, LLVM optimization passes assume that the rounding mode is
				round-to-nearest and that floating point exceptions will not be monitored.
				Constrained FP intrinsics are used to support non-default rounding modes and
				accurately preserve exception behavior without compromising LLVM's ability to
				optimize FP code when the default behavior is used.

				Each of these intrinsics corresponds to a normal floating point operation. The
				first two arguments and the return value are the same as the corresponding FP
				operation.

				The third argument is a metadata argument specifying the rounding mode to be
				assumed. This argument must be one of the following strings:

				::

				      "round.dynamic"
				      "round.tonearest"
				      "round.downward"
				      "round.upward"
				      "round.towardzero"

				If this argument is "round.dynamic" optimization passes must assume that the
				rounding mode is unknown and may change at runtime. No transformations that
				depend on rounding mode may be performed in this case.

				The other possible values for the rounding mode argument correspond to the
				similarly named IEEE rounding modes. If the argument is any of these values
				optimization passes may perform transformations as long as they are consistent
				with the specified rounding mode.

				For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
				"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
				'x-0' should evaluate to '-0' when rounding downward. However, this
				transformation is legal for all other rounding modes.

				For values other than "round.dynamic" optimization passes may assume that the
				actual runtime rounding mode (as defined in a target-specific manner) matches
				the specified rounding mode, but this is not guaranteed. Using a specific
				non-dynamic rounding mode which does not match the actual rounding mode at
				runtime results in undefined behavior.

				The fourth argument to the constrained floating point intrinsics specifies the
				required exception behavior. This argument must be one of the following
				strings:

				::

				      "fpexcept.ignore"
				      "fpexcept.maytrap"
				      "fpexcept.strict"

				If this argument is "fpexcept.ignore" optimization passes may assume that the
				exception status flags will not be read and that floating point exceptions will
				be masked. This allows transformations to be performed that may change the
				exception semantics of the original code. For example, FP operations may be
				speculatively executed in this case whereas they must not be for either of the
				other possible values of this argument.

				If the exception behavior argument is "fpexcept.maytrap" optimization passes
				must avoid transformations that may raise exceptions that would not have been
				raised by the original code (such as speculatively executing FP operations), but
				passes are not required to preserve all exceptions that are implied by the
				original code. For example, exceptions may be potentially hidden by constant
				folding.

				If the exception behavior argument is "fpexcept.strict" all transformations must
				strictly preserve the floating point exception semantics of the original code.
				Any FP exception that would have been raised by the original code must be raised
				by the transformed code, and the transformed code must not raise any FP
				exceptions that would not have been raised by the original code. This is the
				exception behavior argument that will be used if the code being compiled reads
				the FP exception status flags, but this mode can also be used with code that
				unmasks FP exceptions.

				The number and order of floating point exceptions is NOT guaranteed. For
				example, a series of FP operations that each may raise exceptions may be
				vectorized into a single instruction that raises each unique exception a single
				time.


				'``llvm.experimental.constrained.fadd``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				      declare <type>
				      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
				                                          metadata <rounding mode>,
				                                          metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
				two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``llvm.experimental.constrained.fadd``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point sum of the two value operands and has
				the same type as the operands.


				'``llvm.experimental.constrained.fsub``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				      declare <type>
				      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
				                                          metadata <rounding mode>,
				                                          metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
				of its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``llvm.experimental.constrained.fsub``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point difference of the two value operands
				and has the same type as the operands.


				'``llvm.experimental.constrained.fmul``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				      declare <type>
				      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
				                                          metadata <rounding mode>,
				                                          metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
				its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``llvm.experimental.constrained.fmul``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point product of the two value operands and
				has the same type as the operands.


				'``llvm.experimental.constrained.fdiv``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				      declare <type>
				      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
				                                          metadata <rounding mode>,
				                                          metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
				its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``llvm.experimental.constrained.fdiv``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				The value produced is the floating point quotient of the two value operands and
				has the same type as the operands.


				'``llvm.experimental.constrained.frem``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				      declare <type>
				      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
				                                          metadata <rounding mode>,
				                                          metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
				from the division of its two operands.


				Arguments:
				""""""""""

				The first two arguments to the '``llvm.experimental.constrained.frem``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
				of floating point values. Both arguments must have identical types.

				The third and fourth arguments specify the rounding mode and exception
				behavior as described above. The rounding mode argument has no effect, since
				the result of frem is never rounded, but the argument is included for
				consistency with the other constrained floating point intrinsics.

				Semantics:
				""""""""""

				The value produced is the floating point remainder from the division of the two
				value operands and has the same type as the operands. The remainder has the
				same sign as the dividend.


	General Intrinsics			General Intrinsics
	------------------			------------------

	This class of intrinsics is designed to be generic and has no specific			This class of intrinsics is designed to be generic and has no specific
	purpose.			purpose.

	'``llvm.var.annotation``' Intrinsic			'``llvm.var.annotation``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	▲ Show 20 Lines • Show All 683 Lines • Show Last 20 Lines
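The LangRef text in this hunk states that 'x-0'->'x' is invalid under "round.downward" because (+0) - 0 evaluates to -0 in that mode. The following is a small sketch that checks this on the host with `<cfenv>`. It assumes the target defines `FE_DOWNWARD` and the other rounding macros; the helper names `subtractZero` and `foldChangesSignOfZero` are hypothetical and do not appear in the patch. The `volatile` qualifiers keep the subtraction at run time, so the compiler cannot apply the very fold being discussed.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: computes x - 0.0 at run time under the given <cfenv>
// rounding mode, then restores the rounding mode that was active before.
double subtractZero(double x, int roundingMode) {
  const int saved = std::fegetround();
  std::fesetround(roundingMode);
  volatile double lhs = x;
  volatile double zero = 0.0;
  volatile double result = lhs - zero;  // stored before the mode is restored
  std::fesetround(saved);
  return result;
}

// True when rewriting x - 0.0 as x would change the sign of the result
// for x == +0.0 under the given rounding mode.
bool foldChangesSignOfZero(int roundingMode) {
  const double positiveZero = 0.0;
  return std::signbit(subtractZero(positiveZero, roundingMode)) !=
         std::signbit(positiveZero);
}
```

With IEEE 754 arithmetic, `foldChangesSignOfZero(FE_DOWNWARD)` is expected to be true and the other three modes false, which matches the rule quoted above. "round.dynamic" has to be treated like the downward case because the mode is not known at compile time.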

llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h

Show First 20 Lines • Show All 239 Lines • ▼ Show 20 Lines	enum NodeType {
SSUBO, USUBO,		SSUBO, USUBO,

/// Same for multiplication.		/// Same for multiplication.
SMULO, UMULO,		SMULO, UMULO,

/// Simple binary floating point operators.		/// Simple binary floating point operators.
FADD, FSUB, FMUL, FDIV, FREM,		FADD, FSUB, FMUL, FDIV, FREM,

		/// Constrained versions of the binary floating point operators.
		/// These will be lowered to the simple operators before final selection.
		/// They are used to limit optimizations while the DAG is being
		/// optimized.
		STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,

/// FMA - Perform a * b + c with no intermediate rounding step.		/// FMA - Perform a * b + c with no intermediate rounding step.
FMA,		FMA,

/// FMAD - Perform a * b + c, while getting the same result as the		/// FMAD - Perform a * b + c, while getting the same result as the
/// separately rounded operations.		/// separately rounded operations.
FMAD,		FMAD,

/// FCOPYSIGN(X, Y) - Return the value of X with the sign of Y. NOTE: This		/// FCOPYSIGN(X, Y) - Return the value of X with the sign of Y. NOTE: This
▲ Show 20 Lines • Show All 667 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h

Show First 20 Lines • Show All 264 Lines • ▼ Show 20 Lines	private:
void Select_UNDEF(SDNode *N);		void Select_UNDEF(SDNode *N);
void CannotYetSelect(SDNode *N);		void CannotYetSelect(SDNode *N);

private:		private:
void DoInstructionSelection();		void DoInstructionSelection();
SDNode *MorphNode(SDNode *Node, unsigned TargetOpc, SDVTList VTs,		SDNode *MorphNode(SDNode *Node, unsigned TargetOpc, SDVTList VTs,
ArrayRef<SDValue> Ops, unsigned EmitNodeInfo);		ArrayRef<SDValue> Ops, unsigned EmitNodeInfo);

		SDNode *MutateStrictFPToFP(SDNode *Node, unsigned NewOpc);

/// Prepares the landing pad to take incoming values or do other EH		/// Prepares the landing pad to take incoming values or do other EH
/// personality specific tasks. Returns true if the block should be		/// personality specific tasks. Returns true if the block should be
/// instruction selected, false if no code should be emitted for it.		/// instruction selected, false if no code should be emitted for it.
bool PrepareEHLandingPad();		bool PrepareEHLandingPad();

/// \brief Perform instruction selection on all basic blocks in the function.		/// \brief Perform instruction selection on all basic blocks in the function.
void SelectAllBasicBlocks(const Function &Fn);		void SelectAllBasicBlocks(const Function &Fn);

Show All 34 Lines

llvm/trunk/include/llvm/IR/IntrinsicInst.h

Show First 20 Lines • Show All 146 Lines • ▼ Show 20 Lines	public:
static inline bool classof(const IntrinsicInst *I) {		static inline bool classof(const IntrinsicInst *I) {
return I->getIntrinsicID() == Intrinsic::dbg_value;		return I->getIntrinsicID() == Intrinsic::dbg_value;
}		}
static inline bool classof(const Value *V) {		static inline bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
}		}
};		};

		/// This is the common base class for constrained floating point intrinsics.
		class ConstrainedFPIntrinsic : public IntrinsicInst {
		public:
		enum RoundingMode {
		rmInvalid,
		rmDynamic,
		rmToNearest,
		rmDownward,
		rmUpward,
		rmTowardZero
		};

		enum ExceptionBehavior {
		ebInvalid,
		ebIgnore,
		ebMayTrap,
		ebStrict
		};

		RoundingMode getRoundingMode() const;
		ExceptionBehavior getExceptionBehavior() const;

		// Methods for support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const IntrinsicInst *I) {
		switch (I->getIntrinsicID()) {
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_frem:
		return true;
		default: return false;
		}
		}
		static inline bool classof(const Value *V) {
		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
		}
		};

/// This is the common base class for memset/memcpy/memmove.		/// This is the common base class for memset/memcpy/memmove.
class MemIntrinsic : public IntrinsicInst {		class MemIntrinsic : public IntrinsicInst {
public:		public:
Value *getRawDest() const { return const_cast<Value *>(getArgOperand(0)); }		Value *getRawDest() const { return const_cast<Value *>(getArgOperand(0)); }
const Use &getRawDestUse() const { return getArgOperandUse(0); }		const Use &getRawDestUse() const { return getArgOperandUse(0); }
Use &getRawDestUse() { return getArgOperandUse(0); }		Use &getRawDestUse() { return getArgOperandUse(0); }

Value *getLength() const { return const_cast<Value *>(getArgOperand(2)); }		Value *getLength() const { return const_cast<Value *>(getArgOperand(2)); }
▲ Show 20 Lines • Show All 264 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/IR/Intrinsics.td

Show First 20 Lines • Show All 383 Lines • ▼ Show 20 Lines	def int_memmove : Intrinsic<[],
llvm_i32_ty, llvm_i1_ty],		llvm_i32_ty, llvm_i1_ty],
[IntrArgMemOnly, NoCapture<0>, NoCapture<1>,		[IntrArgMemOnly, NoCapture<0>, NoCapture<1>,
ReadOnly<1>]>;		ReadOnly<1>]>;
def int_memset : Intrinsic<[],		def int_memset : Intrinsic<[],
[llvm_anyptr_ty, llvm_i8_ty, llvm_anyint_ty,		[llvm_anyptr_ty, llvm_i8_ty, llvm_anyint_ty,
llvm_i32_ty, llvm_i1_ty],		llvm_i32_ty, llvm_i1_ty],
[IntrArgMemOnly, NoCapture<0>, WriteOnly<0>]>;		[IntrArgMemOnly, NoCapture<0>, WriteOnly<0>]>;

		// FIXME: Add version of these floating point intrinsics which allow non-default
		// rounding modes and FP exception handling.

let IntrProperties = [IntrNoMem] in {		let IntrProperties = [IntrNoMem] in {
def int_fma : Intrinsic<[llvm_anyfloat_ty],		def int_fma : Intrinsic<[llvm_anyfloat_ty],
[LLVMMatchType<0>, LLVMMatchType<0>,		[LLVMMatchType<0>, LLVMMatchType<0>,
LLVMMatchType<0>]>;		LLVMMatchType<0>]>;
def int_fmuladd : Intrinsic<[llvm_anyfloat_ty],		def int_fmuladd : Intrinsic<[llvm_anyfloat_ty],
[LLVMMatchType<0>, LLVMMatchType<0>,		[LLVMMatchType<0>, LLVMMatchType<0>,
LLVMMatchType<0>]>;		LLVMMatchType<0>]>;

Show All 37 Lines
def int_sigsetjmp : Intrinsic<[llvm_i32_ty] , [llvm_ptr_ty, llvm_i32_ty]>;		def int_sigsetjmp : Intrinsic<[llvm_i32_ty] , [llvm_ptr_ty, llvm_i32_ty]>;
def int_siglongjmp : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], [IntrNoReturn]>;		def int_siglongjmp : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], [IntrNoReturn]>;

// Internal interface for object size checking		// Internal interface for object size checking
def int_objectsize : Intrinsic<[llvm_anyint_ty], [llvm_anyptr_ty, llvm_i1_ty],		def int_objectsize : Intrinsic<[llvm_anyint_ty], [llvm_anyptr_ty, llvm_i1_ty],
[IntrNoMem]>,		[IntrNoMem]>,
GCCBuiltin<"__builtin_object_size">;		GCCBuiltin<"__builtin_object_size">;

		//===--------------- Constrained Floating Point Intrinsics ----------------===//
		//

		let IntrProperties = [IntrInaccessibleMemOnly] in {
		def int_experimental_constrained_fadd : Intrinsic<[ llvm_anyfloat_ty ],
		[ LLVMMatchType<0>,
		LLVMMatchType<0>,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
		def int_experimental_constrained_fsub : Intrinsic<[ llvm_anyfloat_ty ],
		[ LLVMMatchType<0>,
		LLVMMatchType<0>,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
		def int_experimental_constrained_fmul : Intrinsic<[ llvm_anyfloat_ty ],
		[ LLVMMatchType<0>,
		LLVMMatchType<0>,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
		def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],
		[ LLVMMatchType<0>,
		LLVMMatchType<0>,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
		def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],
		[ LLVMMatchType<0>,
		LLVMMatchType<0>,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
		}
		// FIXME: Add intrinsic for fcmp, fptrunc, fpext, fptoui and fptosi.


//===------------------------- Expect Intrinsics --------------------------===//		//===------------------------- Expect Intrinsics --------------------------===//
//		//
def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,		def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,
LLVMMatchType<0>], [IntrNoMem]>;		LLVMMatchType<0>], [IntrNoMem]>;

//===-------------------- Bit Manipulation Intrinsics ---------------------===//		//===-------------------- Bit Manipulation Intrinsics ---------------------===//
//		//

▲ Show 20 Lines • Show All 311 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h

Show First 20 Lines • Show All 895 Lines • ▼ Show 20 Lines	private:
void visitAtomicLoad(const LoadInst &I);		void visitAtomicLoad(const LoadInst &I);
void visitAtomicStore(const StoreInst &I);		void visitAtomicStore(const StoreInst &I);
void visitLoadFromSwiftError(const LoadInst &I);		void visitLoadFromSwiftError(const LoadInst &I);
void visitStoreToSwiftError(const StoreInst &I);		void visitStoreToSwiftError(const StoreInst &I);

void visitInlineAsm(ImmutableCallSite CS);		void visitInlineAsm(ImmutableCallSite CS);
const char *visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);		const char *visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);
void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);		void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);
		void visitConstrainedFPIntrinsic(const CallInst &I, unsigned Intrinsic);

void visitVAStart(const CallInst &I);		void visitVAStart(const CallInst &I);
void visitVAArg(const VAArgInst &I);		void visitVAArg(const VAArgInst &I);
void visitVAEnd(const CallInst &I);		void visitVAEnd(const CallInst &I);
void visitVACopy(const CallInst &I);		void visitVACopy(const CallInst &I);
void visitStackmap(const CallInst &I);		void visitStackmap(const CallInst &I);
void visitPatchpoint(ImmutableCallSite CS,		void visitPatchpoint(ImmutableCallSite CS,
const BasicBlock *EHPadBB = nullptr);		const BasicBlock *EHPadBB = nullptr);
▲ Show 20 Lines • Show All 114 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


Show First 20 Lines • Show All 5,330 Lines • ▼ Show 20 Lines	case Intrinsic::copysign:
return nullptr;		return nullptr;
case Intrinsic::fma:		case Intrinsic::fma:
setValue(&I, DAG.getNode(ISD::FMA, sdl,		setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(0)).getValueType(),		getValue(I.getArgOperand(0)).getValueType(),
getValue(I.getArgOperand(0)),		getValue(I.getArgOperand(0)),
getValue(I.getArgOperand(1)),		getValue(I.getArgOperand(1)),
getValue(I.getArgOperand(2))));		getValue(I.getArgOperand(2))));
return nullptr;		return nullptr;
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_frem:
		visitConstrainedFPIntrinsic(I, Intrinsic);
		return nullptr;
case Intrinsic::fmuladd: {		case Intrinsic::fmuladd: {
EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());		EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&		if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&
TLI.isFMAFasterThanFMulAndFAdd(VT)) {		TLI.isFMAFasterThanFMulAndFAdd(VT)) {
setValue(&I, DAG.getNode(ISD::FMA, sdl,		setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(0)).getValueType(),		getValue(I.getArgOperand(0)).getValueType(),
getValue(I.getArgOperand(0)),		getValue(I.getArgOperand(0)),
getValue(I.getArgOperand(1)),		getValue(I.getArgOperand(1)),
▲ Show 20 Lines • Show All 432 Lines • ▼ Show 20 Lines	SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, unsigned Intrinsic) {
}		}

case Intrinsic::experimental_deoptimize:		case Intrinsic::experimental_deoptimize:
LowerDeoptimizeCall(&I);		LowerDeoptimizeCall(&I);
return nullptr;		return nullptr;
}		}
}		}

		void SelectionDAGBuilder::visitConstrainedFPIntrinsic(const CallInst &I,
		unsigned Intrinsic) {
		SDLoc sdl = getCurSDLoc();
		unsigned Opcode;
		switch (Intrinsic) {
		default: llvm_unreachable("Impossible intrinsic"); // Can't reach here.
		case Intrinsic::experimental_constrained_fadd:
		Opcode = ISD::STRICT_FADD;
		break;
		case Intrinsic::experimental_constrained_fsub:
		Opcode = ISD::STRICT_FSUB;
		break;
		case Intrinsic::experimental_constrained_fmul:
		Opcode = ISD::STRICT_FMUL;
		break;
		case Intrinsic::experimental_constrained_fdiv:
		Opcode = ISD::STRICT_FDIV;
		break;
		case Intrinsic::experimental_constrained_frem:
		Opcode = ISD::STRICT_FREM;
		break;
		}
		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
		SDValue Chain = getRoot();
		SDValue Ops[3] = { Chain, getValue(I.getArgOperand(0)),
		getValue(I.getArgOperand(1)) };
		SmallVector<EVT, 4> ValueVTs;
		ComputeValueVTs(TLI, DAG.getDataLayout(), I.getType(), ValueVTs);
		ValueVTs.push_back(MVT::Other); // Out chain

		SDVTList VTs = DAG.getVTList(ValueVTs);
		SDValue Result = DAG.getNode(Opcode, sdl, VTs, Ops);

		assert(Result.getNode()->getNumValues() == 2);
		SDValue OutChain = Result.getValue(1);
		DAG.setRoot(OutChain);
		SDValue FPResult = Result.getValue(0);
		setValue(&I, FPResult);
		}

std::pair<SDValue, SDValue>		std::pair<SDValue, SDValue>
SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,		SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,
const BasicBlock *EHPadBB) {		const BasicBlock *EHPadBB) {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
MCSymbol *BeginLabel = nullptr;		MCSymbol *BeginLabel = nullptr;

if (EHPadBB) {		if (EHPadBB) {
▲ Show 20 Lines • Show All 3,588 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp

Show First 20 Lines • Show All 919 Lines • ▼ Show 20 Lines	public:
///		///
void NodeDeleted(SDNode *N, SDNode *E) override {		void NodeDeleted(SDNode *N, SDNode *E) override {
if (ISelPosition == SelectionDAG::allnodes_iterator(N))		if (ISelPosition == SelectionDAG::allnodes_iterator(N))
++ISelPosition;		++ISelPosition;
}		}
};		};
} // end anonymous namespace		} // end anonymous namespace

		static bool isStrictFPOp(SDNode *Node, unsigned &NewOpc) {
		unsigned OrigOpc = Node->getOpcode();
		switch (OrigOpc) {
		case ISD::STRICT_FADD: NewOpc = ISD::FADD; return true;
		case ISD::STRICT_FSUB: NewOpc = ISD::FSUB; return true;
		case ISD::STRICT_FMUL: NewOpc = ISD::FMUL; return true;
		case ISD::STRICT_FDIV: NewOpc = ISD::FDIV; return true;
		case ISD::STRICT_FREM: NewOpc = ISD::FREM; return true;
		default: return false;
		}
		}

		SDNode* SelectionDAGISel::MutateStrictFPToFP(SDNode *Node, unsigned NewOpc) {
		assert(((Node->getOpcode() == ISD::STRICT_FADD && NewOpc == ISD::FADD) ||
		(Node->getOpcode() == ISD::STRICT_FSUB && NewOpc == ISD::FSUB) ||
		(Node->getOpcode() == ISD::STRICT_FMUL && NewOpc == ISD::FMUL) ||
		(Node->getOpcode() == ISD::STRICT_FDIV && NewOpc == ISD::FDIV) ||
		(Node->getOpcode() == ISD::STRICT_FREM && NewOpc == ISD::FREM)) &&
		"Unexpected StrictFP opcode!");

		// We're taking this node out of the chain, so we need to re-link things.
		SDValue InputChain = Node->getOperand(0);
		SDValue OutputChain = SDValue(Node, 1);
		CurDAG->ReplaceAllUsesOfValueWith(OutputChain, InputChain);

		SDVTList VTs = CurDAG->getVTList(Node->getOperand(1).getValueType());
		SDValue Ops[2] = { Node->getOperand(1), Node->getOperand(2) };
		SDNode *Res = CurDAG->MorphNodeTo(Node, NewOpc, VTs, Ops);

		// MorphNodeTo can operate in two ways: if an existing node with the
		// specified operands exists, it can just return it. Otherwise, it
		// updates the node in place to have the requested operands.
		if (Res == Node) {
		// If we updated the node in place, reset the node ID. To the isel,
		// this should be just like a newly allocated machine node.
		Res->setNodeId(-1);
		} else {
		CurDAG->ReplaceAllUsesWith(Node, Res);
		CurDAG->RemoveDeadNode(Node);
		}

		return Res;
		}

void SelectionDAGISel::DoInstructionSelection() {		void SelectionDAGISel::DoInstructionSelection() {
DEBUG(dbgs() << "===== Instruction selection begins: BB#"		DEBUG(dbgs() << "===== Instruction selection begins: BB#"
<< FuncInfo->MBB->getNumber()		<< FuncInfo->MBB->getNumber()
<< " '" << FuncInfo->MBB->getName() << "'\n");		<< " '" << FuncInfo->MBB->getName() << "'\n");

PreprocessISelDAG();		PreprocessISelDAG();

// Select target instructions for the DAG.		// Select target instructions for the DAG.
Show All 19 Lines	// Select target instructions for the DAG.
while (ISelPosition != CurDAG->allnodes_begin()) {		while (ISelPosition != CurDAG->allnodes_begin()) {
SDNode *Node = &*--ISelPosition;		SDNode *Node = &*--ISelPosition;
// Skip dead nodes. DAGCombiner is expected to eliminate all dead nodes,		// Skip dead nodes. DAGCombiner is expected to eliminate all dead nodes,
// but there are currently some corner cases that it misses. Also, this		// but there are currently some corner cases that it misses. Also, this
// makes it theoretically possible to disable the DAGCombiner.		// makes it theoretically possible to disable the DAGCombiner.
if (Node->use_empty())		if (Node->use_empty())
continue;		continue;

		// When we are using non-default rounding modes or FP exception behavior
		// FP operations are represented by StrictFP pseudo-operations. They
		// need to be simplified here so that the target-specific instruction
		// selectors know how to handle them.
		//
		// If the current node is a strict FP pseudo-op, the isStrictFPOp()
		// function will provide the corresponding normal FP opcode to which the
		// node should be mutated.
		unsigned NormalFPOpc = ISD::UNDEF;
		bool IsStrictFPOp = isStrictFPOp(Node, NormalFPOpc);
		if (IsStrictFPOp)
		Node = MutateStrictFPToFP(Node, NormalFPOpc);

Select(Node);		Select(Node);

		// FIXME: Add code here to attach an implicit def and use of
		// target-specific FP environment registers.
}		}

CurDAG->setRoot(Dummy.getValue());		CurDAG->setRoot(Dummy.getValue());
}		}

DEBUG(dbgs() << "===== Instruction selection ends:\n");		DEBUG(dbgs() << "===== Instruction selection ends:\n");

PostprocessISelDAG();		PostprocessISelDAG();
▲ Show 20 Lines • Show All 2,698 Lines • Show Last 20 Lines
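The re-linking step in MutateStrictFPToFP can be pictured with a toy model. The sketch below is not LLVM code: `ToyNode` and `takeOutOfChain` are hypothetical names, and the model tracks only a single chain input per node. It shows the idea behind replacing all uses of the node's output chain with its input chain before the node is morphed into an unchained operation.

```cpp
#include <vector>

// Toy stand-in for a DAG node; this is not LLVM's SDNode.
struct ToyNode {
  bool isStrict = false;       // true for a chained "strict" FP pseudo-op
  ToyNode *chainIn = nullptr;  // node that supplies this node's input chain
};

// Takes `node` out of the chain: every node that received its chain from
// `node` now receives it from `node`'s own chain input, and `node` itself
// becomes an ordinary unchained operation.
void takeOutOfChain(ToyNode &node, const std::vector<ToyNode *> &allNodes) {
  for (ToyNode *user : allNodes)
    if (user != &node && user->chainIn == &node)
      user->chainIn = node.chainIn;
  node.chainIn = nullptr;
  node.isStrict = false;
}
```

For a chain entry -> strict add -> store, calling `takeOutOfChain` on the strict add leaves the store chained directly to the entry node, so later chained nodes keep a valid ordering while the FP operation becomes free of the chain. That loss of ordering is the code-motion risk the revision summary says remains to be addressed.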

llvm/trunk/lib/IR/IntrinsicInst.cpp

	Show All 15 Lines
	// hack working.			// hack working.
	//			//
	// In some cases, arguments to intrinsics need to be generic and are defined as			// In some cases, arguments to intrinsics need to be generic and are defined as
	// type pointer to empty struct { }*. To access the real item of interest the			// type pointer to empty struct { }*. To access the real item of interest the
	// cast instruction needs to be stripped away.			// cast instruction needs to be stripped away.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

				#include "llvm/ADT/StringSwitch.h"
	#include "llvm/IR/IntrinsicInst.h"			#include "llvm/IR/IntrinsicInst.h"
	#include "llvm/IR/Constants.h"			#include "llvm/IR/Constants.h"
	#include "llvm/IR/GlobalVariable.h"			#include "llvm/IR/GlobalVariable.h"
	#include "llvm/IR/Metadata.h"			#include "llvm/IR/Metadata.h"
	#include "llvm/IR/Module.h"			#include "llvm/IR/Module.h"
	#include "llvm/Support/raw_ostream.h"			#include "llvm/Support/raw_ostream.h"
	using namespace llvm;			using namespace llvm;

	▲ Show 20 Lines • Show All 56 Lines • ▼ Show 20 Lines
	Value *InstrProfIncrementInst::getStep() const {			Value *InstrProfIncrementInst::getStep() const {
	if (InstrProfIncrementInstStep::classof(this)) {			if (InstrProfIncrementInstStep::classof(this)) {
	return const_cast<Value *>(getArgOperand(4));			return const_cast<Value *>(getArgOperand(4));
	}			}
	const Module *M = getModule();			const Module *M = getModule();
	LLVMContext &Context = M->getContext();			LLVMContext &Context = M->getContext();
	return ConstantInt::get(Type::getInt64Ty(Context), 1);			return ConstantInt::get(Type::getInt64Ty(Context), 1);
	}			}

				ConstrainedFPIntrinsic::RoundingMode
				ConstrainedFPIntrinsic::getRoundingMode() const {
				Metadata *MD = dyn_cast<MetadataAsValue>(getOperand(2))->getMetadata();
				if (!MD || !isa<MDString>(MD))
				return rmInvalid;
				StringRef RoundingArg = cast<MDString>(MD)->getString();

				// For dynamic rounding mode, we use round to nearest but we will set the
				// 'exact' SDNodeFlag so that the value will not be rounded.
				return StringSwitch<RoundingMode>(RoundingArg)
				.Case("round.dynamic", rmDynamic)
				.Case("round.tonearest", rmToNearest)
				.Case("round.downward", rmDownward)
				.Case("round.upward", rmUpward)
				.Case("round.towardzero", rmTowardZero)
				.Default(rmInvalid);
				}

				ConstrainedFPIntrinsic::ExceptionBehavior
				ConstrainedFPIntrinsic::getExceptionBehavior() const {
				  Metadata *MD = dyn_cast<MetadataAsValue>(getOperand(3))->getMetadata();
				  if (!MD || !isa<MDString>(MD))
				    return ebInvalid;
				  StringRef ExceptionArg = cast<MDString>(MD)->getString();
				  return StringSwitch<ExceptionBehavior>(ExceptionArg)
				      .Case("fpexcept.ignore", ebIgnore)
				      .Case("fpexcept.maytrap", ebMayTrap)
				      .Case("fpexcept.strict", ebStrict)
				      .Default(ebInvalid);
				}
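
Both accessors above follow the same pattern: read the metadata string, map it to an enumerator, and let anything unrecognized fall through to the invalid value. The block below is a minimal standalone sketch of that mapping, not the patch's code. It compares `std::string_view` values instead of using `llvm::StringSwitch`, the free functions `parseRoundingMode` and `parseExceptionBehavior` are hypothetical names, and the enumerator order is an assumption (the enum definitions live in IntrinsicInst.h, which is not shown here).

```cpp
#include <string_view>

// Enumerator names follow the patch; their order is assumed.
enum RoundingMode {
  rmInvalid, rmDynamic, rmToNearest, rmDownward, rmUpward, rmTowardZero
};
enum ExceptionBehavior { ebInvalid, ebIgnore, ebMayTrap, ebStrict };

// Hypothetical stand-in for getRoundingMode(): maps the metadata string to an
// enumerator and returns rmInvalid for anything it does not recognize.
constexpr RoundingMode parseRoundingMode(std::string_view Arg) {
  if (Arg == "round.dynamic")    return rmDynamic;
  if (Arg == "round.tonearest")  return rmToNearest;
  if (Arg == "round.downward")   return rmDownward;
  if (Arg == "round.upward")     return rmUpward;
  if (Arg == "round.towardzero") return rmTowardZero;
  return rmInvalid;
}

// Hypothetical stand-in for getExceptionBehavior(), same fall-through rule.
constexpr ExceptionBehavior parseExceptionBehavior(std::string_view Arg) {
  if (Arg == "fpexcept.ignore")  return ebIgnore;
  if (Arg == "fpexcept.maytrap") return ebMayTrap;
  if (Arg == "fpexcept.strict")  return ebStrict;
  return ebInvalid;
}
```

Returning a dedicated invalid enumerator, rather than asserting, lets the caller decide how to report a malformed argument. The verifier checks in this patch compare against `rmInvalid` and `ebInvalid` in that way.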

llvm/trunk/lib/IR/Verifier.cpp

[451 lines not shown]	#include "llvm/IR/Metadata.def"
void visitBranchInst(BranchInst &BI);		void visitBranchInst(BranchInst &BI);
void visitReturnInst(ReturnInst &RI);		void visitReturnInst(ReturnInst &RI);
void visitSwitchInst(SwitchInst &SI);		void visitSwitchInst(SwitchInst &SI);
void visitIndirectBrInst(IndirectBrInst &BI);		void visitIndirectBrInst(IndirectBrInst &BI);
void visitSelectInst(SelectInst &SI);		void visitSelectInst(SelectInst &SI);
void visitUserOp1(Instruction &I);		void visitUserOp1(Instruction &I);
void visitUserOp2(Instruction &I) { visitUserOp1(I); }		void visitUserOp2(Instruction &I) { visitUserOp1(I); }
void visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS);		void visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS);
		void visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI);
template <class DbgIntrinsicTy>		template <class DbgIntrinsicTy>
void visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII);		void visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII);
void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);		void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);
void visitAtomicRMWInst(AtomicRMWInst &RMWI);		void visitAtomicRMWInst(AtomicRMWInst &RMWI);
void visitFenceInst(FenceInst &FI);		void visitFenceInst(FenceInst &FI);
void visitAllocaInst(AllocaInst &AI);		void visitAllocaInst(AllocaInst &AI);
void visitExtractValueInst(ExtractValueInst &EVI);		void visitExtractValueInst(ExtractValueInst &EVI);
void visitInsertValueInst(InsertValueInst &IVI);		void visitInsertValueInst(InsertValueInst &IVI);
[3,456 lines not shown]	void Verifier::visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS) {
}		}
case Intrinsic::ctlz: // llvm.ctlz		case Intrinsic::ctlz: // llvm.ctlz
case Intrinsic::cttz: // llvm.cttz		case Intrinsic::cttz: // llvm.cttz
Assert(isa<ConstantInt>(CS.getArgOperand(1)),		Assert(isa<ConstantInt>(CS.getArgOperand(1)),
"is_zero_undef argument of bit counting intrinsics must be a "		"is_zero_undef argument of bit counting intrinsics must be a "
"constant int",		"constant int",
CS);		CS);
break;		break;
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_frem:
		visitConstrainedFPIntrinsic(
		cast<ConstrainedFPIntrinsic>(*CS.getInstruction()));
		break;
case Intrinsic::dbg_declare: // llvm.dbg.declare		case Intrinsic::dbg_declare: // llvm.dbg.declare
Assert(isa<MetadataAsValue>(CS.getArgOperand(0)),		Assert(isa<MetadataAsValue>(CS.getArgOperand(0)),
"invalid llvm.dbg.declare intrinsic call 1", CS);		"invalid llvm.dbg.declare intrinsic call 1", CS);
visitDbgIntrinsic("declare", cast<DbgDeclareInst>(*CS.getInstruction()));		visitDbgIntrinsic("declare", cast<DbgDeclareInst>(*CS.getInstruction()));
break;		break;
case Intrinsic::dbg_value: // llvm.dbg.value		case Intrinsic::dbg_value: // llvm.dbg.value
visitDbgIntrinsic("value", cast<DbgValueInst>(*CS.getInstruction()));		visitDbgIntrinsic("value", cast<DbgValueInst>(*CS.getInstruction()));
break;		break;
[349 lines not shown]	static DISubprogram *getSubprogram(Metadata *LocalScope) {
if (auto *LB = dyn_cast<DILexicalBlockBase>(LocalScope))		if (auto *LB = dyn_cast<DILexicalBlockBase>(LocalScope))
return getSubprogram(LB->getRawScope());		return getSubprogram(LB->getRawScope());

// Just return null; broken scope chains are checked elsewhere.		// Just return null; broken scope chains are checked elsewhere.
assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");		assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");
return nullptr;		return nullptr;
}		}

		void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
		  Assert(isa<MetadataAsValue>(FPI.getOperand(2)),
		         "invalid rounding mode argument", &FPI);
		  Assert(FPI.getRoundingMode() != ConstrainedFPIntrinsic::rmInvalid,
		         "invalid rounding mode argument", &FPI);
		  Assert(FPI.getExceptionBehavior() != ConstrainedFPIntrinsic::ebInvalid,
		         "invalid exception behavior argument", &FPI);
		}

template <class DbgIntrinsicTy>		template <class DbgIntrinsicTy>
void Verifier::visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII) {		void Verifier::visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII) {
auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();		auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();
AssertDI(isa<ValueAsMetadata>(MD) ||		AssertDI(isa<ValueAsMetadata>(MD) ||
(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),		(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),
"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);		"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);
AssertDI(isa<DILocalVariable>(DII.getRawVariable()),		AssertDI(isa<DILocalVariable>(DII.getRawVariable()),
"invalid llvm.dbg." + Kind + " intrinsic variable", &DII,		"invalid llvm.dbg." + Kind + " intrinsic variable", &DII,
[528 lines not shown]

llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll

				; RUN: llc -O3 -mtriple=x86_64-pc-linux < %s | FileCheck %s

				; Verify that constants aren't folded to inexact results when the rounding mode
				; is unknown.
				;
				; double f1() {
				; // Because 0.1 cannot be represented exactly, this shouldn't be folded.
				; return 1.0/10.0;
				; }
				;
				; CHECK-LABEL: f1
				; CHECK: divsd
				define double @f1() {
				entry:
				%div = call double @llvm.experimental.constrained.fdiv.f64(
				double 1.000000e+00,
				double 1.000000e+01,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %div
				}

				; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown.
				;
				; double f2(double a) {
				; // Because the result of '0 - 0' is negative zero if rounding mode is
				; // downward, this shouldn't be simplified.
				; return a - 0;
				; }
				;
				; CHECK-LABEL: f2
				; CHECK: subsd
				define double @f2(double %a) {
				entry:
				%div = call double @llvm.experimental.constrained.fsub.f64(
				double %a,
				double 0.000000e+00,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %div
				}

				; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is
				; unknown.
				;
				; double f3(double a, double b) {
				; // Because the intermediate value involved in this calculation may require
				; // rounding, this shouldn't be simplified.
				; return -((-a)*b);
				; }
				;
				; CHECK-LABEL: f3:
				; CHECK: subsd
				; CHECK: mulsd
				; CHECK: subsd
				define double @f3(double %a, double %b) {
				entry:
				%sub = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00, double %a,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				%mul = call double @llvm.experimental.constrained.fmul.f64(
				double %sub, double %b,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				%ret = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00,
				double %mul,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %ret
				}

				; Verify that FP operations are not performed speculatively when FP exceptions
				; are not being ignored.
				;
				; double f4(int n, double a) {
				; // Because a + 1 may overflow, this should not be simplified.
				; if (n > 0)
				; return a + 1.0;
				; return a;
				; }
				;
				;
				; CHECK-LABEL: f4:
				; CHECK: testl
				; CHECK: jle
				; CHECK: addsd
				define double @f4(i32 %n, double %a) {
				entry:
				%cmp = icmp sgt i32 %n, 0
				br i1 %cmp, label %if.then, label %if.end

				if.then:
				%add = call double @llvm.experimental.constrained.fadd.f64(
				double 1.000000e+00, double %a,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				br label %if.end

				if.end:
				%a.0 = phi double [%add, %if.then], [ %a, %entry ]
				ret double %a.0
				}


				@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
				declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
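
The C comments on f1 through f3 each rest on a rounding-mode fact that can be checked on the host. The block below is a sketch only. It assumes a target where `FE_UPWARD` and `FE_DOWNWARD` are defined, and it uses `volatile` to keep the compiler from folding the arithmetic, since support for `#pragma STDC FENV_ACCESS` varies between compilers. `RoundingScope`, `divide`, `subtract`, `multiply`, `negMulNeg`, and `isNegativeZero` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Installs a rounding mode for the lifetime of the object, then restores the
// mode that was active before.
class RoundingScope {
public:
  explicit RoundingScope(int Mode) : Saved(std::fegetround()) {
    const int Rc = std::fesetround(Mode);
    assert(Rc == 0 && "rounding mode not supported on this target");
    (void)Rc;
  }
  ~RoundingScope() { std::fesetround(Saved); }
  RoundingScope(const RoundingScope &) = delete;
  RoundingScope &operator=(const RoundingScope &) = delete;

private:
  int Saved;
};

// The volatile operands and results force each operation to run at execution
// time, under whichever rounding mode is currently installed.

// f1: X / Y; with 1.0 / 10.0 the result is inexact and depends on the mode.
double divide(int Mode, double X, double Y) {
  RoundingScope Scope(Mode);
  volatile double A = X, B = Y;
  volatile double R = A / B;
  return R;
}

// f2: X - Y; 0.0 - 0.0 yields -0.0 only when rounding downward.
double subtract(int Mode, double X, double Y) {
  RoundingScope Scope(Mode);
  volatile double A = X, B = Y;
  volatile double R = A - B;
  return R;
}

// Plain X * Y, the expression that f3 must not be simplified to.
double multiply(int Mode, double X, double Y) {
  RoundingScope Scope(Mode);
  volatile double A = X, B = Y;
  volatile double R = A * B;
  return R;
}

// f3: -((-X) * Y); the intermediate product is rounded while negative, so a
// directed rounding mode pushes it the opposite way from X * Y.
double negMulNeg(int Mode, double X, double Y) {
  RoundingScope Scope(Mode);
  volatile double A = X, B = Y;
  volatile double NegA = -A;
  volatile double Prod = NegA * B;
  volatile double R = -Prod;
  return R;
}

// True only for negative zero.
bool isNegativeZero(double V) { return V == 0.0 && std::signbit(V); }
```

With these helpers, `divide(FE_UPWARD, 1.0, 10.0)` compares greater than `divide(FE_DOWNWARD, 1.0, 10.0)`, and `isNegativeZero(subtract(FE_DOWNWARD, 0.0, 0.0))` holds. For an operand such as 1.0000000000000002 (one ulp above 1.0), `negMulNeg` and `multiply` disagree under `FE_UPWARD` but agree under `FE_TONEAREST`.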

llvm/trunk/test/Feature/fp-intrinsics.ll

				; RUN: opt -O3 -S < %s | FileCheck %s

				; Test to verify that constants aren't folded when the rounding mode is unknown.
				; CHECK-LABEL: @f1
				; CHECK: call double @llvm.experimental.constrained.fdiv.f64
				define double @f1() {
				entry:
				%div = call double @llvm.experimental.constrained.fdiv.f64(
				double 1.000000e+00,
				double 1.000000e+01,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %div
				}

				; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown.
				;
				; double f2(double a) {
				; // Because the result of '0 - 0' is negative zero if rounding mode is
				; // downward, this shouldn't be simplified.
				; return a - 0.0;
				; }
				;
				; CHECK-LABEL: @f2
				; CHECK: call double @llvm.experimental.constrained.fsub.f64
				define double @f2(double %a) {
				entry:
				%div = call double @llvm.experimental.constrained.fsub.f64(
				double %a, double 0.000000e+00,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %div
				}

				; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is
				; unknown.
				;
				; double f3(double a, double b) {
				; // Because the intermediate value involved in this calculation may require
				; // rounding, this shouldn't be simplified.
				; return -((-a)*b);
				; }
				;
				; CHECK-LABEL: @f3
				; CHECK: call double @llvm.experimental.constrained.fsub.f64
				; CHECK: call double @llvm.experimental.constrained.fmul.f64
				; CHECK: call double @llvm.experimental.constrained.fsub.f64
				define double @f3(double %a, double %b) {
				entry:
				%sub = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00, double %a,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				%mul = call double @llvm.experimental.constrained.fmul.f64(
				double %sub, double %b,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				%ret = call double @llvm.experimental.constrained.fsub.f64(
				double -0.000000e+00,
				double %mul,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %ret
				}

				; Verify that FP operations are not performed speculatively when FP exceptions
				; are not being ignored.
				;
				; double f4(int n, double a) {
				; // Because a + 1 may overflow, this should not be simplified.
				; if (n > 0)
				; return a + 1.0;
				; return a;
				; }
				;
				;
				; CHECK-LABEL: @f4
				; CHECK-NOT: select
				; CHECK: br i1 %cmp
				define double @f4(i32 %n, double %a) {
				entry:
				%cmp = icmp sgt i32 %n, 0
				br i1 %cmp, label %if.then, label %if.end

				if.then:
				%add = call double @llvm.experimental.constrained.fadd.f64(
				double 1.000000e+00, double %a,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				br label %if.end

				if.end:
				%a.0 = phi double [%add, %if.then], [ %a, %entry ]
				ret double %a.0
				}


				@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
				declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
				declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
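
The f4 check (`CHECK-NOT: select`) guards against the add being speculated. To make the hazard concrete, here is a hedged host-side sketch. `guarded` mirrors the branchy C source, `speculated` is a hand-written stand-in for what if-conversion could produce, and `flagsRaised` reports the exception flags each one leaves behind. All three names are hypothetical. The sketch assumes the default round-to-nearest mode and uses the inexact flag rather than overflow, because it is the easiest flag to trigger.

```cpp
#include <cassert>
#include <cfenv>

// Branchy form, as in f4: the add only executes when N > 0.
double guarded(int N, double X) {
  volatile double A = X;
  if (N > 0) {
    volatile double Sum = A + 1.0;
    return Sum;
  }
  return A;
}

// Speculated form: the add runs unconditionally and a select picks the
// result. The returned value is the same, but the side effects are not.
double speculated(int N, double X) {
  volatile double A = X;
  volatile double Sum = A + 1.0; // executes even when N <= 0
  return N > 0 ? Sum : A;
}

// Clears the FP exception flags, calls Fn, and returns the flags it raised.
// Note that this discards any flags that were set before the call.
int flagsRaised(double (*Fn)(int, double), int N, double X) {
  const int Rc = std::feclearexcept(FE_ALL_EXCEPT);
  assert(Rc == 0 && "could not clear FP exception flags");
  (void)Rc;
  volatile double Sink = Fn(N, X);
  (void)Sink;
  return std::fetestexcept(FE_ALL_EXCEPT);
}
```

For `N = 0` and `X = 1e300`, both functions return the same value, yet only `speculated` leaves `FE_INEXACT` set. That difference is observable to a program that inspects or traps on FP exceptions, which is why the strict exception behavior has to block the transformation.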

llvm/trunk/test/Verifier/fp-intrinsics.ll

				; RUN: opt -verify -S < %s 2>&1 | FileCheck --check-prefix=CHECK1 %s
				; RUN: sed -e s/.T2:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK2 %s
				; RUN: sed -e s/.T3:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK3 %s

				; Common declaration used for all runs.
				declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

				; Test that the verifier accepts legal code, and that the correct attributes are
				; attached to the FP intrinsic.
				; CHECK1: declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) #[[ATTR:[0-9]+]]
				; CHECK1: attributes #[[ATTR]] = { inaccessiblememonly nounwind }
				; Note: FP exceptions aren't usually caught through normal unwind mechanisms,
				; but we may want to revisit this for asynchronous exception handling.
				define double @f1(double %a, double %b) {
				entry:
				%fadd = call double @llvm.experimental.constrained.fadd.f64(
				double %a, double %b,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret double %fadd
				}

				; Test an illegal value for the rounding mode argument.
				; CHECK2: invalid rounding mode argument
				;T2: define double @f2(double %a, double %b) {
				;T2: entry:
				;T2: %fadd = call double @llvm.experimental.constrained.fadd.f64(
				;T2: double %a, double %b,
				;T2: metadata !"round.dynomite",
				;T2: metadata !"fpexcept.strict")
				;T2: ret double %fadd
				;T2: }

				; Test an illegal value for the exception behavior argument.
				; CHECK3: invalid exception behavior argument
				;T3: define double @f2(double %a, double %b) {
				;T3: entry:
				;T3: %fadd = call double @llvm.experimental.constrained.fadd.f64(
				;T3: double %a, double %b,
				;T3: metadata !"round.dynamic",
				;T3: metadata !"fpexcept.restrict")
				;T3: ret double %fadd
				;T3: }
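
The RUN lines in this test use `sed` to strip the `;T2:` or `;T3:` prefix, so each negative case is only uncommented for the run that expects its diagnostic. The first run sees both cases as comments. A rough C++ emulation of that single substitution follows, for illustration only. `uncomment` is a hypothetical helper, and it assumes the tag contains no regular-expression metacharacters.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Emulates `sed -e s/.TAG://` on a single line: deletes the first occurrence
// of "any one character followed by TAG and a colon". Lines without the
// marker come back unchanged.
std::string uncomment(const std::string &Line, const std::string &Tag) {
  assert(!Tag.empty() && "tag must name a test case, e.g. T2");
  const std::regex Pattern("." + Tag + ":");
  return std::regex_replace(Line, Pattern, "",
                            std::regex_constants::format_first_only);
}
```

Applied with tag `T2`, the line `;T2: ret double %fadd` becomes ` ret double %fadd`, while `;T3:` lines stay commented out, so only the rounding-mode failure reaches the verifier in that run.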
