This is an archive of the discontinued LLVM Phabricator instance.

Differential D43515

More math intrinsics for conservative math handling
Abandoned, Public

Authored by kpn on Feb 20 2018, 9:44 AM.

Details

Reviewers

andrew.w.kaylor
craig.topper
hfinkel
mehdi_amini
aemerson
javed.absar
kbarton

Summary

This builds on D27028 and D32319's work on constrained math intrinsics.

Quoting from D32319: "The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed."

There are more patches coming, but I wanted to start with just a handful here.

Event Timeline

kpn created this revision. Feb 20 2018, 9:44 AM

Herald added a subscriber: llvm-commits. Feb 20 2018, 9:44 AM

At the request of my employer's legal department:
Copyright © 2018 SAS Institute Inc., Cary, NC, USA. All Rights Reserved.

andrew.w.kaylor added inline comments. Feb 20 2018, 11:52 AM

docs/LangRef.rst
13933	I think you need to say more about the semantics. The LLVM IR fptoui and fptosi are specifically documented as rounding to zero. I don't think we want that with the constrained intrinsics so we need to very specifically document how they will be different from the standard instructions.
13980	This intrinsic is a bit troublesome. The llvm.round intrinsic says that it "returns the same values as the libm round functions would, and handles error conditions in the same way." The libm round function, in turn, is documented as rounding to the nearest integer (and away from zero in halfway cases) regardless of the current rounding mode. So what do we want the constrained form of the intrinsic to do? I think it needs to ignore the rounding mode. I'm not sure about exception behavior. If it doesn't respect exception behavior then we probably don't want to have the constrained form of this intrinsic at all.
14016	This seems to be leaking SelectionDAG implementation details into the IR space. How is this used?
14057	This is a replacement for fpext, right? I think you should say that somewhere.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
949	STRICT_FP_TO_SINT needs to do something more than this. The default lowering of FP_TO_SINT is known to raise spurious FE_INEXACT exceptions because it involves speculative execution.
1117–1118	Since this gets unique handling why isn't it just a separate case from the others?
1122	The style/formatting is wrong here. I think you need curly braces around your else-clause and the "else" itself needs to be on the same line as the curly brace above it.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6350–6351	Why are you not attaching these nodes to the chain?

kpn added inline comments. Feb 21 2018, 12:29 PM

docs/LangRef.rst
13933	What do we want these intrinsics to do that is different from the normal IR fptoui and fptosi? I don't think any of the constrained intrinsics _today_ are doing anything with the rounding and exception metadata. That makes it hard for me to say much about it in documentation, today. Well, unless I misunderstood the current code. That's totally possible.
13980	We'll still need the constrained intrinsic to avoid getting reordered. How about if I rename the intrinsic to be fptrunc instead of round? Then the rounding would be explicit.
14016	It seems to exist for the MVT::ppcf128 type. A quick grep doesn't show any other users. Test coverage is lacking. I added a test using the intrinsic, but I had to mark it expected fail since I couldn't get it to work. To avoid the risk of bugs getting introduced later I went ahead and implemented the intrinsic. Would it be better to not have the intrinsic and to instead have a pass that replaces the non-STRICT SDNode with a STRICT version? That would avoid said leaking into the IR space. It would, however, mean that llvm would have an opinion on when STRICT nodes should be used. I'm not sure that's a good thing.
14057	I think I picked names for the intrinsics that matched the SelectionDAG node enum. Perhaps it would be better to match the bitcode language names? In which case this intrinsic would be "fpext" instead of "extend". Either way that's a good idea for the documentation to at least mention fpext.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
949	I have seen FP_TO_SINT cause traps when it shouldn't. Would having that default lowering use the chain in the STRICT_ case solve that issue?
1117–1118	Good point. And making it a separate case also takes care of the formatting issue in the else block.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6350–6351	Because they need to match what the default lowering is expecting. Otherwise a variety of failures happen.

andrew.w.kaylor added inline comments. Feb 21 2018, 6:17 PM

docs/LangRef.rst
13933	You're right, none of the other intrinsics are doing anything specific with the rounding mode. The purpose of the intrinsics is to prevent the optimizer from doing things that would introduce compile-time rounding. However, the general assumption of the intrinsics is that whatever you set the rounding mode to at runtime is the rounding behavior you will get. I'd want to consult a front end expert as to exactly what should be happening. A quick glance at the C99 standard tells me that when a floating point number is converted to an integer it is truncated toward zero. I think that means the same thing as the LLVM language reference claim that fptoui and fptosi round the number toward zero. So I think that we don't want the runtime rounding mode to change the behavior of these intrinsics, and so we should document it as such.
13980	No, fptrunc does something else, right? My concern is that if this is preserving the behavior of the round library function (and I think we need to) then the rounding mode argument isn't relevant, so maybe it should be omitted in this case. The key thing is to be explicit about its behavior in the documentation and then make sure the implementation actually does what we say it will.
14016	I think it's better not to have the intrinsic for now. I don't understand what the SD node is doing well enough to say much more, but it looks to me like the SD node shouldn't be there either. It's a target-specific hack that leaked into the target-independent code if my understanding is correct.
14057	These intrinsics should be driven by what the front end needs. If no front end is generating an equivalent now then we don't want an intrinsic. So, yes, please match the bitcode language names.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
949	No, I don't think the chain will fix this. We need to implement strict lowering that does something different.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6350–6351	There is code that removes the chain when the strict node is mutated to a non-strict node. That should be preventing lowering problems. I believe the chain was necessary to prevent re-ordering prior to final instruction selection.
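As an illustration of the semantics discussed above for docs/LangRef.rst, here is a minimal C++ sketch. It is not part of the patch, and all names in it are hypothetical. It contrasts three behaviors: a C-style cast, which truncates toward zero whatever the runtime rounding mode is; lrint, which honors the current rounding mode; and the libm round function, which rounds halfway cases away from zero regardless of the mode. The volatile temporaries are a workaround (an assumption about typical compilers, not a guarantee) to keep the operations from being folded at compile time or moved across the fesetround calls.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical RAII helper: switch the runtime rounding mode, restore on exit.
class ScopedRoundingMode {
public:
  explicit ScopedRoundingMode(int mode) : saved_(std::fegetround()) {
    const int rc = std::fesetround(mode);
    assert(rc == 0 && "rounding mode not supported on this target");
    (void)rc;
  }
  ~ScopedRoundingMode() { std::fesetround(saved_); }
  ScopedRoundingMode(const ScopedRoundingMode &) = delete;
  ScopedRoundingMode &operator=(const ScopedRoundingMode &) = delete;

private:
  int saved_;
};

// C cast semantics (the model for fptosi): always truncate toward zero.
long castTruncates(double value, int mode) {
  ScopedRoundingMode guard(mode);
  volatile double in = value;
  volatile long out = static_cast<long>(in);
  return out;
}

// lrint semantics: round according to the current runtime rounding mode.
long lrintHonorsMode(double value, int mode) {
  ScopedRoundingMode guard(mode);
  volatile double in = value;
  volatile long out = std::lrint(in);
  return out;
}

// libm round semantics: nearest integer, halfway cases away from zero.
double roundIgnoresMode(double value, int mode) {
  ScopedRoundingMode guard(mode);
  volatile double in = value;
  volatile double out = std::round(in);
  return out;
}
```

Calling castTruncates(2.7, FE_UPWARD) is expected to give 2, while lrintHonorsMode(2.2, FE_UPWARD) is expected to give 3. That difference is why the reviewers want the language reference to state explicitly which behavior the constrained conversions follow.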

kpn added inline comments. Feb 26 2018, 12:24 PM

docs/LangRef.rst
13933	Our front end guy here confirms truncation. I've updated my working copy to state that fptoui and fptosi truncate.
13980	Well, if I'm reading SelectionDAGBuilder::visitFPTrunc() correctly then fptrunc gets turned into ISD::FP_ROUND. So, no, renaming this intrinsic to be fptrunc would not be wrong. Contrast this with llvm.round getting turned into ISD::FROUND. I think we need constrained intrinsics for both, so this one in this patch should be renamed after fptrunc. I haven't touched the ISD::FROUND node yet, and that's what llvm.round gets turned into from the looks of SelectionDAGBuilder::visitIntrinsicCall(). That could be a later patch. I agree that the rounding mode should be ignored and shouldn't be in the intrinsic's metadata. WDYT?
14016	Done. It will be removed in the next diff.
14057	Will do.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
949	Is there something we can do at this level to fix this? If there is then I'm all for it, but if there isn't then we should probably still put the intrinsic in. We'll need it eventually, and currently none of the constrained intrinsics solve the complete optimization problem. So I wonder if this intrinsic is really all that different from the other experimental constrained intrinsics. If a backend models FP side effects then wouldn't the existing default lowering work correctly?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6350–6351	Mutation happens too late in some cases. If it happened early enough then there wouldn't be any need for the strict intrinsics to be mentioned in SelectionDAGLegalize::ExpandNode(). Since it doesn't we can't use the chain. Should that mutation happen earlier?

andrew.w.kaylor added inline comments. Feb 28 2018, 11:25 AM

docs/LangRef.rst
13980	You shouldn't be thinking in terms of the SelectionDAG at this level. ISD::FP_ROUND is very poorly named and does not do the same thing that llvm.round does. ISD::FP_ROUND converts a floating point number to a smaller type, but not necessarily an integer. That's fptrunc. I suppose you are right that we do need a constrained version of fptrunc. I'm a bit concerned by the things that the LLVM language definition says are undefined. Those are the cases that will be of most interest for the constrained case and we should document the expected behavior, but I think we need to consider why the current spec says the result is undefined. In any event llvm.round, which is what I thought you were replacing here, is something completely different.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
949	To be honest I'm not entirely sure how to fix this in the current selection DAG model. The issue is that we need to introduce a branch to fix the problem, but by the time we're selecting instructions it's too late to do that. I think it needs to be addressed when we're building the DAG.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6350–6351	We need the mutation to be put off as long as possible. I think it should be possible to unlink the chain in ExpandNode if necessary. Depending on what's happening there, we might even want the expanded node to make use of the chain.

kpn added inline comments. Mar 6 2018, 6:13 AM

docs/LangRef.rst
14057	Say, now that I think about this, what does rounding have to do with extending a FP value? Shouldn't this intrinsic just plain not have rounding metadata?

andrew.w.kaylor added inline comments. Mar 6 2018, 9:14 AM

docs/LangRef.rst
14057	That's a reasonable point. The same issue came up with frem. The rounding mode doesn't apply to frem, but I kept the argument just so that it could be handled internally the same way as the other constrained intrinsics and then documented in the language reference that the rounding mode has no effect. I'm not completely convinced that was a good decision on my part. If you want to look at what would have to change to support constrained intrinsics with no rounding mode argument, I would likely support that. I don't think it would be much extra handling. There are just some things where the class that is used to represent the intrinsics (ConstrainedFPIntrinsic) would need to be aware of the possibility that this argument is omitted.

This new diff changes the default lowering of STRICT_FP_TO_UINT so that it no longer allows speculative execution. I've also eliminated the rounding metadata from instructions when it wasn't needed.
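To make the speculation concern concrete, here is a minimal C++ sketch of a branch-based unsigned conversion built only from a signed conversion. It assumes the generic expansion has roughly this shape (convert directly below 2^63, otherwise subtract 2^63, convert, and restore the top bit). The function name and constant are hypothetical, and this is not the code from the diff. A select-based version would evaluate both signed conversions unconditionally, so the one that is out of range could raise a spurious exception; the branch ensures only the in-range conversion runs.

```cpp
#include <cassert>
#include <cstdint>

// 2^63 as a double; every non-negative double below it fits in int64_t.
constexpr double kTwoTo63 = 9223372036854775808.0;

// Hypothetical helper: double -> uint64_t using only signed conversions.
// Precondition: 0 <= x < 2^64 (which also excludes NaN).
std::uint64_t fpToUIntBranching(double x) {
  assert(x >= 0.0 && x < 18446744073709551616.0 && "input out of range");
  if (x < kTwoTo63) {
    // Small values: the signed conversion cannot overflow.
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(x));
  }
  // Large values: shift into the signed range, convert, then set bit 63.
  const std::int64_t low = static_cast<std::int64_t>(x - kTwoTo63);
  return static_cast<std::uint64_t>(low) ^ (std::uint64_t{1} << 63);
}
```

Only one of the two signed conversions executes for any given input, which is the property the strict lowering needs. The price is control flow, which is why the reviewers note that it is hard to introduce once instruction selection is under way.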

Herald added a reviewer: javed.absar. May 1 2018, 7:32 AM

Herald added a subscriber: mgorny.

Missed one place in the documentation that needed updating.

At some point we should create a document that describes the entire flow of FP instructions through the instruction selection process. To be honest I don't remember how it all works, and that makes it difficult to review changes like this. It would also be nice to verify that we all have the same understanding of how it works. I don't mean to volunteer you to produce the entire document, but would you mind giving me a rough outline? I'm still concerned about the case that is not chained.

docs/LangRef.rst
13975	You need to do something more here to document the difference between the return type and the argument type. Also in fpext below.
lib/CodeGen/StrictFP.cpp
12	Can you say more here about what these transformations are? It's clear that you intend this as a generic pass that currently does one thing but might have others added later. That's good, but I'd like to see the possibilities described here as they are implemented.
75	Per the LLVM Programmer's Manual (http://llvm.org/docs/ProgrammersManual.html#iterating-over-the-instruction-in-a-function) you should be using an inst_iterator here.
137	Could you add an example here of what the resulting IR will look like? It would make the code a lot easier to follow.
145	I think you can use "IntDst->getBitWidth()" here. Also, APInt::getSignedMinValue() does this same thing and is a bit more self-documenting.
151	I believe conversion of a NaN to an integer should raise "INVALID" (which the fcmp will) and then the result is undefined, but the 'true' case does less so I think ULT is preferable.
152	We are going to need a constrained version of fcmp, and when we have it you should use it here. When the IRBuilder supports constrained floating point modes, it would be nice to use that here but I guess you can't do that yet, so maybe just a comment saying we should later?
154	This is an odd name. How about "within.sint.range"? In any case, I think '.' is more common than '_' as a name space holder, probably because the name will automatically get '.<n>' appended if it's a duplicate.
202	This description is wrong.
lib/CodeGen/TargetPassConfig.cpp
573	This doesn't seem like the right place to do this. Should it be happening much later, like around CodeGenPrepare?
lib/IR/Verifier.cpp
4508	Since you've broken this out into a switch statement, can you separate the unary and ternary ops and give them each the appropriate assert? I think that would be much more readable than this compound check (which I realize was my creation).
4539	The default should probably always be an error.
4549	How about this? int RoundingIdx = (HasExceptionMD ? NumOperands - 2 : NumOperands - 1); On the other hand, are we ever going to have intrinsics that have a rounding mode but not exception behavior?
test/CodeGen/X86/fp-intrinsics.ll
281	Can you check the entire expanded IR pattern here? It might be worth having a separate test that verifies the StrictFP pass in isolation.
333	I believe your decorated intrinsic names are incorrect in all of these cases. The name needs to specify both return type and argument type. Take a look at what opt produces if you give it these names as inputs.

cameron.mcinally added a subscriber: cameron.mcinally. May 24 2018, 7:48 AM

kpn updated this revision to Diff 152072. Jun 20 2018, 6:05 AM

kpn marked 12 inline comments as done.

kpn added inline comments.

lib/CodeGen/StrictFP.cpp
145	error: no member named 'getBitWidth' in 'llvm::Value'.

kpn added inline comments. Jun 20 2018, 6:06 AM

docs/LangRef.rst
13975	I hope this is what you mean. I've changed it to show that the result is a different type. I've followed the naming scheme used elsewhere in this document.
lib/IR/Verifier.cpp
4549	Done. I don't know if we'll have a rounding mode but not exceptions. I doubt it, but I can't say for certain.

craig.topper added inline comments. Jun 20 2018, 11:52 PM

lib/CodeGen/StrictFP.cpp
77	I believe you should be getting TM by doing this. auto *TPC = getAnalysisIfAvailable<TargetPassConfig>(); if (!TPC) return false; auto &TM = TPC->getTM<TargetMachine>(); There is only one other pass that takes TargetMachine in its constructor. The others use what I've put above. So I believe that is the preferred way.
85	There is little reason to pass Context around. Value has a getContext as does Type. So you can get the context easily whenever you need it.
100	This cast is unnecessary. IntrinsicInst is a subclass of Value
102	This should use TLI->getValueType.
167	Why do you need a SmallVector here? Why can't you just call getArgOperand?
168	This cast is unnecessary.
172	What if the intrinsic uses a vector type?

uweigand added a subscriber: uweigand. Jun 25 2018, 6:30 AM

kpn marked 6 inline comments as done. Jul 26 2018, 9:15 AM

kpn added inline comments.

lib/CodeGen/StrictFP.cpp
172	It would have been caught by the IR verifier. A vector would have been rejected there.

kpn updated this revision to Diff 157504. Jul 26 2018, 9:16 AM

craig.topper added inline comments. Jul 26 2018, 9:56 AM

lib/CodeGen/StrictFP.cpp
172	I don't see where the IR verifier rejects vectors. I just see that it checks that element counts are equal. And why should it reject vectors? We need to support fptosi/fptoui for vectors.

kpn added inline comments. Jul 26 2018, 9:59 AM

lib/CodeGen/StrictFP.cpp
172	It doesn't reject them, now. This latest patch added support for vectors. My inexperience with Phabricator lost the comment that said that.

Delete a line accidentally left in.

Rebase. Ping.

In D43515#1013383, @kpn wrote:

At the request of my employer's legal department:
Copyright © 2018 SAS Institute Inc., Cary, NC, USA. All Rights Reserved.

If it's not just a remark, but is supposed to have some legal/whatever meaning,
I'm not sure some comment in some review is the correct direction.

docs/LangRef.rst
13902	This probably has insufficient amount of `^`. Might want to actually test-build the docs.

Adding new constrained intrinsics and adding the pass should be separate patches I think. Changing the syntax of frem should be another patch.

docs/LangRef.rst
13840	This change should be in a separate patch. There's too much going on in this patch and this is easy to overlook.

In D43515#1229127, @craig.topper wrote:

Adding new constrained intrinsics and adding the pass should be separate patches I think. Changing the syntax of frem should be another patch.

Will do.

Split out changes as requested. This diff is just the four new intrinsics. The fptoui pass and the change to frem will be later.

I've also corrected some documentation issues in this iteration.

Ping

Herald added a subscriber: arphaman. Oct 3 2018, 11:44 AM

craig.topper added inline comments. Oct 3 2018, 5:22 PM

docs/LangRef.rst
13983	This reads funny. I think it should maybe be "result of truncating a floating point"
14017	This also reads funny
include/llvm/CodeGen/ISDOpcodes.h
528	These need comments. Does STRICT_FP_ROUND have the TRUNC argument that FP_ROUND has?

kpn added inline comments. Oct 4 2018, 6:42 AM

docs/LangRef.rst
13983	How about if I just copy the text used by the normal fptrunc instruction?
14017	Same as fptrunc. I could just copy the text from the fpext instruction?
include/llvm/CodeGen/ISDOpcodes.h
528	I could rearrange and lump them in with the non-strict versions of each. Then I'd just need one extra line restating that the STRICT_ versions prevent optimizations. Yes, STRICT_FP_ROUND does have the TRUNC argument that FP_ROUND has, but it is currently always zero. Fixing this requires rerouting this one strict node to go through the same codepath as the non-strict node. That would make it different from all the other constrained nodes. Should I go ahead and make that change? I _think_ it is safe if the TRUNC argument really does work like it is documented. I also just noticed that I need to put back STRICT_FP_TO_UINT in at least one place. It should be everywhere STRICT_FP_TO_SINT is handled _except_ in the default lowering.

craig.topper added inline comments. Oct 4 2018, 10:51 AM

docs/LangRef.rst
13983	Sure
14017	Sure
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7348	This doesn't copy the second argument to FP_ROUND over, does it?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6355–6360	Is this adding a second argument to STRICT_FP_EXTEND as well? I don't think the non-strict FP_EXTEND has two arguments.

kpn added inline comments. Oct 4 2018, 10:53 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7348	No. It should.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6355–6360	Agreed. I'll fix it.

cameron.mcinally added inline comments. Oct 4 2018, 11:43 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7323	There are a lot of comments on this, so I may have missed something. Take with a grain of salt... I don't think these are correct. These can trap so can't be speculatively executed. They would need a chain.

kpn added inline comments. Oct 4 2018, 11:58 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7323	I couldn't figure out how to have these be chained but have the non-strict continue to not be chained. Too many things fell over if they didn't match.

cameron.mcinally added inline comments. Oct 5 2018, 8:24 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7323	I'm not an expert with this code, so cc @andrew.w.kaylor. This doesn't seem like the right direction though. Maybe these unchained operations should be left to a different patch until a proper solution is found.

kpn mentioned this in D53216: [FPEnv] Add constrained intrinsics for MAXNUM and MINNUM. Oct 18 2018, 11:24 AM

kpn mentioned this in D53411: [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics. Oct 19 2018, 5:56 AM

kpn marked 5 inline comments as done. Nov 1 2018, 9:08 AM

Address review comments.

Add use of the chain to these four new SDNode types.

cameron.mcinally added inline comments. Nov 5 2018, 11:51 AM

test/CodeGen/X86/fp-intrinsics.ll
281	Same here as the vector version below. Do we want the truncating convert? That was surprising to me.
333	Do we want unsigned convert tests too? fptoui? I see that there are SystemZ tests to cover them, so maybe that's sufficient? Just pointing this out so others can see.
test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
3579	Same question as the scalar versions. Do we want <4 x float> to <4 x i32>/etc casts?
3607	This surprised me. Should this be the truncating convert? Or should it be vcvtsd2si?
test/Feature/fp-intrinsics.ll
309	I haven't been following along closely, so please forgive if this was already discussed... Should we have float->i32 casts too? Also double->i64?
311	Should 'fpext.f64' have a float argument instead of a double?

kpn added inline comments. Nov 5 2018, 12:36 PM

test/CodeGen/X86/fp-intrinsics.ll
333	The SystemZ tests target hardware new enough to lower to a single instruction. The tests for fptoui on x86 use the default lowering, but the default lowering is disallowed since it does speculative execution and traps. I had support for fixing that in this patch but was asked to split it out into another patch. So there's no constrained fptoui test using the default lowering in this patch.
test/Feature/fp-intrinsics.ll
311	Probably, yes. Will fix.

kpn added inline comments. Nov 6 2018, 10:29 AM

test/CodeGen/X86/fp-intrinsics.ll
281	The strict intrinsic results in the same instruction as the regular fptosi instruction. Having them be the same means the mutation from strict to non-strict is working correctly. And, yes, rounding towards zero is correct. That's why there's no rounding metadata.
test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
3579	On the odd chance that it may tickle the vector legalizer I'll go ahead and add tests with float. Same answer as the scalar tests: Rounding towards zero is correct.
3607	Yes, it should be a truncating conversion.
test/Feature/fp-intrinsics.ll
309	This is an opt test. I'm not sure we'd benefit from placing those extra tests here.

Rebase.

Minor test changes.

cameron.mcinally added inline comments. Nov 28 2018, 10:59 AM

include/llvm/CodeGen/ISDOpcodes.h
528	This ordering does not mesh with what is already in place. The STRICT_XXX opcodes are currently clustered together, where these new opcodes are grouped with their non-strict counterparts. I'm not opposed to grouping the corresponding non-strict and strict opcodes, but that would probably be better left to a separate patch. I think it makes sense to keep everything uniform until a final decision is made, for clarity's sake.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2999	This doesn't seem correct. Shouldn't Node->getOperand(0) be the argument for FP_ROUND and the Chain for STRICT_FP_ROUND? Same for FP_EXTEND too. Also, does using a truncating store provide us with the same trapping behavior as an explicit trunc instruction? I don't know off the top of my head, but it may be different. To be fair, a user probably won't care too much about optimizing away a trap, but the purists might.
3054	This seems incorrect, unless I missed something. The first operand to FP_TO_SINT should be the argument, but the first operand to STRICT_FP_TO_SINT should be the Chain. It does not look like expandFP_TO_SINT accounts for that.
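The operand-order concern raised in these two comments can be shown with a toy model. The sketch below is hypothetical and deliberately does not use the real llvm::SDNode API; it only assumes what the comments state, namely that a strict node carries the chain as operand 0, so its value operand sits one slot later than in the non-strict node.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a DAG node; NOT the real llvm::SDNode.
struct ToyNode {
  bool isStrict;                     // true for STRICT_* opcodes
  std::vector<std::string> operands; // strict nodes list the chain first
};

// Index of the first value operand: strict nodes skip the leading chain.
std::size_t firstValueOperandIndex(const ToyNode &node) {
  return node.isStrict ? 1 : 0;
}

// Fetch the value being converted, whichever flavor of node this is.
const std::string &sourceOperand(const ToyNode &node) {
  const std::size_t index = firstValueOperandIndex(node);
  assert(index < node.operands.size() && "node is missing its value operand");
  return node.operands[index];
}
```

Expansion code shared between FP_TO_SINT and STRICT_FP_TO_SINT would need an adjustment of this kind. Reading operand 0 unconditionally would hand the chain to code that expects the floating-point argument.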

kpn mentioned this in D55897: Add constrained fptrunc and fpext intrinsics. Dec 19 2018, 12:20 PM

kpn mentioned this in D59830: [FPEnv] Make constrained FP IR verification more flexible. Mar 26 2019, 11:33 AM

kpn mentioned this in D59833: [FPEnv] New document for adding new constrained FP intrinsics. Mar 26 2019, 11:58 AM

kpn mentioned this in rL357065: The IR verifier currently supports the constrained floating point intrinsics,. Mar 27 2019, 6:30 AM

kpn mentioned this in rG4f3cdc6555ca: The IR verifier currently supports the constrained floating point intrinsics….

pengfei added a subscriber: pengfei. May 4 2019, 11:05 PM

kpn added a subscriber: kbarton. May 10 2019, 12:32 PM

A few minor comments about commoning asserts and using dyn_cast instead of cast<>.
Aside from that, I think this looks good. That said, I'm by no means an expert in this area so don't feel I'm qualified to give a final approval to commit.

lib/IR/Verifier.cpp
4536	Could you use the isFPOrFPVectorTy here to check both conditions, and then remove the assert in the else below?
4537	Can you use a dyn_cast here instead? if (auto *OperandT = dyn_cast<VectorType>(Operand->getType())) { do vector stuff }
4552	Similarly, could you use isIntOrIntVectorTy here and remove the assert in the if/else below?
4576	same comment about commoning the asserts here.
4620	It looks like getArgOperand expects an unsigned. Is there a specific reason you are using an int here? If it's possible that RoundingIdx is negative, then you should probably add an assert here before passing it to getArgOperand.
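The rounding-operand index discussed for lib/IR/Verifier.cpp (the RoundingIdx expression suggested earlier, plus the signedness question here) could be sketched as follows. This is a hypothetical standalone helper, not the verifier's actual code. It assumes the metadata operands trail the value operands, with the exception-behavior operand last when present, and that the count passed in covers only the call's arguments.

```cpp
#include <cassert>

// Hypothetical helper: position of the rounding-mode operand.
// With exception metadata it is second from the end, otherwise last.
unsigned roundingOperandIndex(unsigned numOperands, bool hasExceptionMD) {
  const unsigned trailing = hasExceptionMD ? 2u : 1u;
  // Guard against wraparound: an unsigned subtraction below zero would
  // silently yield a huge index instead of a negative one.
  assert(numOperands >= trailing && "call has too few operands");
  return numOperands - trailing;
}
```

Keeping the index unsigned matches what getArgOperand expects, and the assert covers the malformed case that a signed index would otherwise represent as a negative value.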

The only thing left in this ticket to commit is the fptosi and fptoui changes. I'm working on incorporating review comments from D55897 into this ticket and fixing other bugs that I'm coming across. I'll hopefully update this ticket next week.

lib/IR/Verifier.cpp
4537	Yes, that's much more concise.
4576	The fptrunc and fpext changes were split out into another ticket and updated there. That committed code is more concise, I suspect in the way you intend.

kpn added a subscriber: ajwock. May 21 2019, 6:26 AM

Address review comments from here and from D55897.

This patch is now down to just handling fptosi and fptoui.

kpn added a reviewer: kbarton. May 30 2019, 11:47 AM

kpn mentioned this in D53157: Teach the IRBuilder about constrained fadd and friends. May 31 2019, 9:52 AM

Ping.

Ping

How would you feel about rebooting this as a new patch? There's a lot of irrelevant history here, and I feel like I'm missing some context as I review it.

In general, I see that you're down to just implementing the fptosi and fptoui cases. I'm concerned about what happens in the fptoui case. It's mentioned in a few of the earlier comments that the default expansion of this opcode introduces speculative exceptions, and if that's being handled in the latest implementation I haven't read it closely enough to see what's going on. If it is being handled, I'd expect to see a comment block somewhere explaining what's being done.

In D43515#1547200, @andrew.w.kaylor wrote:

How would you feel about rebooting this as a new patch? There's a lot of irrelevant history here, and I feel like I'm missing some context as I review it.

I can do that.

In general, I see that you're down to just implementing the fptosi and fptoui cases. I'm concerned about what happens in the fptoui case. It's mentioned in a few of the earlier comments that the default expansion of this opcode introduces speculative exceptions, and if that's being handled in the latest implementation I haven't read it closely enough to see what's going on. If it is being handled, I'd expect to see a comment block somewhere explaining what's being done.

I believe the speculative exceptions should be largely fixed by r348251, committed by rksimon. I changed the code to continue to use strict nodes when expanding a strict node. Since the strict nodes are chained, does that not solve the part of the problem not solved by r348251? Is the existing comment not enough, or do I need to copy some of the commit message into code comments?

In D43515#1548845, @kpn wrote:

I believe the speculative exceptions should be largely fixed by r348251, committed by rksimon. I changed the code to continue to use strict nodes when expanding a strict node. Since the strict nodes are chained, does that not solve the part of the problem not solved by r348251? Is the existing comment not enough, or do I need to copy some of the commit message into code comments?

Sorry, I wasn't aware of Simon's change. That definitely simplifies what needs to be done here, and, yes, the existing comment is sufficient.

Replaced by D63782.

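Revision Contents

Path                                                  Size
docs/AddingConstrainedIntrinsics.rst                  96 lines
docs/LangRef.rst                                      141 lines
include/llvm/CodeGen/ISDOpcodes.h                     5 lines
include/llvm/CodeGen/Passes.h                         2 lines
include/llvm/CodeGen/SelectionDAGNodes.h              4 lines
include/llvm/CodeGen/TargetLowering.h                 4 lines
include/llvm/IR/IntrinsicInst.h                       4 lines
include/llvm/IR/Intrinsics.td                         19 lines
include/llvm/InitializePasses.h                       1 line
lib/CodeGen/CMakeLists.txt                            1 line
lib/CodeGen/CodeGen.cpp                               1 line
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp              12 lines
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp     10 lines
lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp        3 lines
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp      13 lines
lib/CodeGen/SelectionDAG/SelectionDAG.cpp             43 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp      38 lines
lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp       4 lines
lib/CodeGen/StrictFP.cpp                              295 lines
lib/CodeGen/TargetPassConfig.cpp                      8 lines
lib/IR/IntrinsicInst.cpp                              4 lines
lib/IR/Verifier.cpp                                   155 lines
test/CodeGen/AArch64/O0-pipeline.ll                   1 line
test/CodeGen/AArch64/O3-pipeline.ll                   1 line
test/CodeGen/X86/O0-pipeline.ll                       1 line
test/CodeGen/X86/O3-pipeline.ll                       1 line
test/CodeGen/X86/fp-con-fptoui.ll                     142 lines
test/CodeGen/X86/fp-intrinsics.ll                     41 lines
test/CodeGen/X86/vector-constrained-fp-intrinsics.ll  73 lines
test/Feature/fp-intrinsics.ll                         49 lines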

Diff 161048

docs/AddingConstrainedIntrinsics.rst

				==================================================
				How to add a constrained floating-point intrinsic
				==================================================

				.. contents::
				:local:

				.. warning::
				This is a work in progress.

				Add the intrinsic
				=================

				Multiple files need to be updated when adding a new constrained intrinsic.

				Add the new intrinsic to the table of intrinsics.::

				include/llvm/IR/Intrinsics.td

				Update class ConstrainedFPIntrinsic to know about the intrinsics.::

				include/llvm/IR/IntrinsicInst.h

				Functions like ConstrainedFPIntrinsic::isUnaryOp() or
				ConstrainedFPIntrinsic::isTernaryOp() may need to know about the new
				intrinsic.::

				lib/IR/IntrinsicInst.cpp

				Update the IR verifier::

				lib/IR/Verifier.cpp

				Add SelectionDAG node types
				===========================

				Add the new STRICT version of the node type to the ISD::NodeType enum.::

				include/llvm/CodeGen/ISDOpcodes.h

				In class SDNode update isStrictFPOpcode()::

				include/llvm/CodeGen/SelectionDAGNodes.h

				A mapping from the STRICT SDnode type to the non-STRICT is done in
				TargetLoweringBase::getStrictFPOperationAction(). This allows STRICT
				nodes to be legalized similarly to the non-STRICT node type.::

				include/llvm/CodeGen/TargetLowering.h

				Building the SelectionDAG
				-------------------------

				The switch statement in SelectionDAGBuilder::visitIntrinsicCall() needs
				to be updated to call SelectionDAGBuilder::visitConstrainedFPIntrinsic().
				That function, in turn, needs to be updated to know how to create the
				SDNode for the intrinsic. The new STRICT node will eventually be converted
				to the matching non-STRICT node. For this reason it _must_ have the same
				operands and values as the non-STRICT version in case the non-STRICT
				version's default lowering is used. This means that if the non-STRICT
				version of the node does not use the chain then the STRICT node cannot
				either.::

				lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

				Most of the STRICT nodes get legalized the same as their matching non-STRICT
				counterparts. A new STRICT node with this property must get added to the
				switch in SelectionDAGLegalize::LegalizeOp().::

				lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

				The code to do the conversion or mutation of the STRICT node to a non-STRICT
				version of the node happens in SelectionDAG::mutateStrictFPToFP(). Be
				careful updating this function since some nodes are always chained and
				some are not. Some nodes have the same return type as their input operand,
				but some are different. Both of these points must be properly handled.::

				lib/CodeGen/SelectionDAG/SelectionDAG.cpp

				To make debug logs readable it is helpful to update the SelectionDAG's
				debug logger:::

				lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

				Add any required transforms to lib/CodeGen/StrictFP.cpp
				=======================================================

				If there are any transforms that cannot or should not be done in the
				SelectionDAG then the StrictFP.cpp pass is the place to put them.

				Add documentation and tests
				===========================

				::

				docs/LangRef.rst

docs/LangRef.rst

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	declare <type>			declare <type>
	@llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,			@llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
	metadata <rounding mode>,
	craig.topper [Not Done]: This change should be in a separate patch. There's too much going on in this patch and this is easy to overlook.
	metadata <exception behavior>)			metadata <exception behavior>)

	Overview:			Overview:
	"""""""""			"""""""""

	The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder			The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
	from the division of its two operands.			from the division of its two operands.


	Arguments:			Arguments:
	""""""""""			""""""""""

	The first two arguments to the '``llvm.experimental.constrained.frem``'			The first two arguments to the '``llvm.experimental.constrained.frem``'
	intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`			intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
	of floating-point values. Both arguments must have identical types.			of floating-point values. Both arguments must have identical types.

	The third and fourth arguments specify the rounding mode and exception			The third argument specifies the exception behavior as described above.
	behavior as described above. The rounding mode argument has no effect, since
	the result of frem is never rounded, but the argument is included for
	consistency with the other constrained floating-point intrinsics.

	Semantics:			Semantics:
	""""""""""			""""""""""

	The value produced is the floating-point remainder from the division of the two			The value produced is the floating-point remainder from the division of the two
	value operands and has the same type as the operands. The remainder has the			value operands and has the same type as the operands. The remainder has the
	same sign as the dividend.			same sign as the dividend.

	Show All 28 Lines

	Semantics:			Semantics:
	""""""""""			""""""""""

	The result produced is the product of the first two operands added to the third			The result produced is the product of the first two operands added to the third
	operand computed with infinite precision, and then rounded to the target			operand computed with infinite precision, and then rounded to the target
	precision.			precision.

				'``llvm.experimental.constrained.fptoui``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				lebedev.ri [Not Done]: This probably has insufficient amount of `^`. Might want to actually test-build the docs.

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fptoui(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fptoui``' intrinsic returns the result of a
				conversion of a floating point operand to an unsigned integer.

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptoui``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is an unsigned integer converted from the floating
				point operand. The value is truncated, so it is rounded towards zero.

				andrew.w.kaylor [Not Done]: I think you need to say more about the semantics. The LLVM IR fptoui and fptosi are specifically documented as rounding to zero. I don't think we want that with the constrained intrinsics so we need to very specifically document how they will be different from the standard instructions.
				kpn (author) [Not Done]: What do we want these intrinsics to do that is different from the normal IR fptoui and fptosi? I don't think any of the constrained intrinsics _today_ are doing anything with the rounding and exception metadata. That makes it hard for me to say much about it in documentation, today. Well, unless I misunderstood the current code. That's totally possible.
				andrew.w.kaylor [Not Done]: You're right, none of the other intrinsics are doing anything specific with the rounding mode. The purpose of the intrinsics is to prevent the optimizer from doing things that would introduce compile-time rounding. However, the general assumption of the intrinsics is that whatever you set the rounding mode to at runtime is the rounding behavior you will get. I'd want to consult a front end expert as to exactly what should be happening. A quick glance at the C99 standard tells me that when a floating point number is converted to an integer it is truncated toward zero. I think that means the same thing as the LLVM language reference claim that fptoui and fptosi round the number toward zero. So I think that we don't want the runtime rounding mode to change the behavior of these intrinsics, and so we should document it as such.
				kpn (author) [Not Done]: Our front end guy here confirms truncation. I've updated my working copy to state that fptoui and fptosi truncate.
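
The truncation behavior agreed on in this thread can be illustrated at the C++ level. The following is a minimal sketch, not LLVM code: the helper names are hypothetical, and it assumes a platform whose `<cfenv>` defines `FE_UPWARD` and `FE_DOWNWARD`. It contrasts a floating-point to integer conversion, which truncates regardless of the dynamic rounding mode, with `std::lrint`, which honors that mode.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Convert by truncation, the C/C++ conversion rule discussed in the thread.
// `volatile` keeps the compiler from folding the conversion at compile time.
long truncate_to_long(double value) {
    volatile double v = value;
    return static_cast<long>(v);
}

// Convert using the current dynamic rounding mode (shown for contrast).
long round_with_current_mode(double value) {
    volatile double v = value;
    return std::lrint(v);
}

// Run `convert` with the given rounding mode installed, then restore the
// previous mode. The volatile result forces the conversion to complete
// before the mode is restored.
long convert_under_mode(int mode, long (*convert)(double), double value) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    volatile long result = convert(value);
    std::fesetround(saved);
    return result;
}
```

Under `FE_UPWARD`, `truncate_to_long` still maps 2.2 to 2 while `round_with_current_mode` yields 3, which is the distinction the documentation needs to state for the constrained conversions.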
				'``llvm.experimental.constrained.fptosi``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fptosi(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fptosi``' intrinsic returns the result of a
				conversion of a floating point operand to a signed integer.

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptosi``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a signed integer converted from the floating
				point operand. The value is truncated, so it is rounded towards zero.

				'``llvm.experimental.constrained.fptrunc``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <ty2>
				andrew.w.kaylor [Not Done]: You need to do something more here to document the difference between the return type and the argument type. Also in fpext below.
				kpn (author) [Not Done]: I hope this is what you mean. I've changed it to show that the result is a different type. I've followed the naming scheme used elsewhere in this document.
				@llvm.experimental.constrained.fptrunc(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""
				andrew.w.kaylor [Not Done]: This intrinsic is a bit troublesome. The llvm.round intrinsic says that it "returns the same values as the libm round functions would, and handles error conditions in the same way." The libm round function, in turn, is documented as rounding to the nearest integer (and away from zero in halfway cases) regardless of the current rounding mode. So what do we want the constrained form of the intrinsic to do? I think it needs to ignore the rounding mode. I'm not sure about exception behavior. If it doesn't respect exception behavior then we probably don't want to have the constrained form of this intrinsic at all.
				kpn (author) [Not Done]: We'll still need the constrained intrinsic to avoid getting reordered. How about if I rename the intrinsic to be fptrunc instead of round? Then the rounding would be explicit.
				andrew.w.kaylor [Not Done]: No, fptrunc does something else, right? My concern is that if this is preserving the behavior of the round library function (and I think we need to) then the rounding mode argument isn't relevant, so maybe it should be omitted in this case. The key thing is to be explicit about its behavior in the documentation and then make sure the implementation actually does what we say it will.
				kpn (author) [Not Done]: Well, if I'm reading SelectionDAGBuilder::visitFPTrunc() correctly then fptrunc gets turned into ISD::FP_ROUND. So, no, renaming this intrinsic to be fptrunc would not be wrong. Contrast this with llvm.round getting turned into ISD::FROUND. I think we need constrained intrinsics for both, so this one in this patch should be renamed after fptrunc. I haven't touched the ISD::FROUND node yet, and that's what llvm.round gets turned into from the looks of SelectionDAGBuilder::visitIntrinsicCall(). That could be a later patch. I agree that the rounding mode should be ignored and shouldn't be in the intrinsic's metadata. WDYT?
				andrew.w.kaylor [Not Done]: You shouldn't be thinking in terms of the SelectionDAG at this level. ISD::FP_ROUND is very poorly named and does not do the same thing that llvm.round does. ISD::FP_ROUND converts a floating point number to a smaller type, but not necessarily an integer. That's fptrunc. I suppose you are right that we do need a constrained version of fptrunc. I'm a bit concerned by the things that the LLVM language definition says are undefined. Those are the cases that will be of most interest for the constrained case and we should document the expected behavior, but I think we need to consider why the current spec says the result is undefined. In any event llvm.round, which is what I thought you were replacing here, is something completely different.

				The '``llvm.experimental.constrained.fptrunc``' intrinsic returns the result of
				a truncating of a floating point operand into a smaller floating point result.
				craig.topper [Not Done]: This reads funny. I think it should maybe be "result of truncating a floating point"
				kpn (author) [Not Done]: How about if I just copy the text used by the normal fptrunc instruction?
				craig.topper [Done]: Sure

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptrunc``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values. This argument must be larger in size
				than the result.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a floating point value truncated to be smaller in size
				than the operand.
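
The distinction drawn in the review thread, between narrowing a floating-point value to a smaller type (fptrunc, ISD::FP_ROUND) and rounding it to an integral value (llvm.round, libm `round`), can be sketched in plain C++. This is only an analogy using standard library calls, with hypothetical helper names, and it assumes the default round-to-nearest mode is in effect.

```cpp
#include <cassert>
#include <cmath>

// Narrowing conversion: double -> float. Analogous to IR fptrunc. The value
// is not forced to an integer; it only loses precision.
float narrow_to_float(double value) {
    return static_cast<float>(value);
}

// Round to the nearest integral value, with halfway cases away from zero.
// Analogous to the libm round() behavior that llvm.round follows.
double round_to_integral(double value) {
    return std::round(value);
}
```

For example, `narrow_to_float(2.5)` is still 2.5, whereas `round_to_integral(2.5)` is 3.0. Narrowing 16777217.0 (2^24 + 1) gives 16777216.0f because a float cannot represent the odd value, and that result is the kind that depends on the rounding mode.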

				'``llvm.experimental.constrained.fpext``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <ty2>
				@llvm.experimental.constrained.fpext(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fpext``' intrinsic returns the result of
				andrew.w.kaylor [Not Done]: This seems to be leaking SelectionDAG implementation details into the IR space. How is this used?
				kpn (author) [Not Done]: It seems to exist for the MVT::ppcf128 type. A quick grep doesn't show any other users. Test coverage is lacking. I added a test using the intrinsic, but I had to mark it expected fail since I couldn't get it to work. To avoid the risk of bugs getting introduced later I went ahead and implemented the intrinsic. Would it be better to not have the intrinsic and to instead have a pass that replaces the non-STRICT SDNode with a STRICT version? That would avoid said leaking into the IR space. It would, however, mean that llvm would have an opinion on when STRICT nodes should be used. I'm not sure that's a good thing.
				andrew.w.kaylor [Not Done]: I think it's better not to have the intrinsic for now. I don't understand what the SD node is doing well enough to say much more, but it looks to me like the SD node shouldn't be there either. It's a target-specific hack that leaked into the target-independent code if my understanding is correct.
				kpn (author) [Not Done]: Done. It will be removed in the next diff.
				an enlarging of a floating point operand.
				craig.topper [Not Done]: This also reads funny
				kpn (author) [Not Done]: Same as fptrunc. I could just copy the text from the fpext instruction?
				craig.topper [Done]: Sure

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fpext``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values. This argument must be smaller in size
				than the result.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a floating point value extended to be larger in size
				than the operand. All restrictions that apply to the fpext instruction also
				apply to this intrinsic.

	Constrained libm-equivalent Intrinsics			Constrained libm-equivalent Intrinsics
	--------------------------------------			--------------------------------------

	In addition to the basic floating-point operations for which constrained			In addition to the basic floating-point operations for which constrained
	intrinsics are described above, there are constrained versions of various			intrinsics are described above, there are constrained versions of various
	operations which provide equivalent behavior to a corresponding libm function.			operations which provide equivalent behavior to a corresponding libm function.
	These intrinsics allow the precise behavior of these operations with respect to			These intrinsics allow the precise behavior of these operations with respect to
	rounding mode and exception behavior to be controlled.			rounding mode and exception behavior to be controlled.

	As with the basic constrained floating-point intrinsics, the rounding mode			As with the basic constrained floating-point intrinsics, the rounding mode
	and exception behavior arguments only control the behavior of the optimizer.			and exception behavior arguments only control the behavior of the optimizer.
	They do not change the runtime floating-point environment.			They do not change the runtime floating-point environment.


	'``llvm.experimental.constrained.sqrt``' Intrinsic			'``llvm.experimental.constrained.sqrt``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

				andrew.w.kaylor [Not Done]: This is a replacement for fpext, right? I think you should say that somewhere.
				kpn (author) [Not Done]: I think I picked names for the intrinsics that matched the SelectionDAG node enum. Perhaps it would be better to match the bitcode language names? In which case this intrinsic would be "fpext" instead of "extend". Either way that's a good idea for the documentation to at least mention fpext.
				andrew.w.kaylor [Not Done]: These intrinsics should be driven by what the front end needs. If no front end is generating an equivalent now then we don't want an intrinsic. So, yes, please match the bitcode language names.
				kpn (author) [Not Done]: Will do.
				kpn (author) [Not Done]: Say, now that I think about this, what does rounding have to do with extending a FP value? Shouldn't this intrinsic just plain not have rounding metadata?
				andrew.w.kaylor [Not Done]: That's a reasonable point. The same issue came up with frem. The rounding mode doesn't apply to frem, but I kept the argument just so that it could be handled internally the same way as the other constrained intrinsics and then documented in the language reference that the rounding mode has no effect. I'm not completely convinced that was a good decision on my part. If you want to look at what would have to change to support constrained intrinsics with no rounding mode argument, I would likely support that. I don't think it would be much extra handling. There are just some things where the class that is used to represent the intrinsics (ConstrainedFPIntrinsic) would need to be aware of the possibility that this argument is omitted.
	declare <type>			declare <type>
	@llvm.experimental.constrained.sqrt(<type> <op1>,			@llvm.experimental.constrained.sqrt(<type> <op1>,
	metadata <rounding mode>,			metadata <rounding mode>,
	metadata <exception behavior>)			metadata <exception behavior>)

	Overview:			Overview:
	"""""""""			"""""""""

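
The point raised in the LangRef.rst comments, that code handling constrained intrinsics would have to tolerate a missing rounding mode argument, can be sketched with a small standalone model. This is not the ConstrainedFPIntrinsic class: the struct and function names are hypothetical, operands are modeled as plain strings, and C++17 is assumed for `std::optional`. It relies only on the declarations shown in the diff, where the exception behavior is always the last metadata argument and the rounding mode, when present, precedes it.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a constrained intrinsic call. Only the trailing
// metadata arguments are modeled.
struct ConstrainedCallModel {
    bool has_rounding_operand;          // false for e.g. fptoui in this patch
    std::vector<std::string> metadata;  // trailing metadata operands, in order
};

// The exception behavior is the last metadata operand in every declaration
// shown in the diff.
std::string exception_behavior(const ConstrainedCallModel &call) {
    return call.metadata.back();
}

// The rounding mode, when present, sits just before the exception behavior.
std::optional<std::string> rounding_mode(const ConstrainedCallModel &call) {
    if (!call.has_rounding_operand || call.metadata.size() < 2)
        return std::nullopt;
    return call.metadata[call.metadata.size() - 2];
}
```

Returning an empty optional makes callers decide explicitly what to do when an intrinsic carries no rounding mode, rather than reading the wrong operand.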

include/llvm/CodeGen/ISDOpcodes.h

enum NodeType {
/// in a register of the same size. This operation effectively just		/// in a register of the same size. This operation effectively just
/// discards excess precision. The type to round down to is specified by		/// discards excess precision. The type to round down to is specified by
/// the VT operand, a VTSDNode.		/// the VT operand, a VTSDNode.
FP_ROUND_INREG,		FP_ROUND_INREG,

/// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type.		/// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type.
FP_EXTEND,		FP_EXTEND,

		STRICT_FP_TO_SINT,
		craig.topper [Not Done]: These need comments. Does STRICT_FP_ROUND have the TRUNC argument that FP_ROUND has?
		kpn (author) [Done]: I could rearrange and lump them in with the non-strict versions of each. Then I'd just need one extra line restating that the STRICT_ versions prevent optimizations. Yes, STRICT_FP_ROUND does have the TRUNC argument that FP_ROUND has, but it is currently always zero. Fixing this requires rerouting this one strict node to go through the same codepath as the non-strict node. That would make it different from all the other constrained nodes. Should I go ahead and make that change? I _think_ it is safe if the TRUNC argument really does work like it is documented. I also just noticed that I need to put back STRICT_FP_TO_UINT in at least one place. It should be everywhere STRICT_FP_TO_SINT is handled _except_ in the default lowering.
		cameron.mcinally [Not Done]: This ordering does not mesh with what is already in place. The STRICT_XXX opcodes are currently clustered together, where these new opcodes are grouped with their non-strict counterparts. I'm not opposed to grouping the corresponding non-strict and strict opcodes, but that would probably be better left to a separate patch. I think it makes sense to keep everything uniform until a final decision is made, for clarity's sake.
		STRICT_FP_TO_UINT,
		STRICT_FP_ROUND,
		STRICT_FP_EXTEND,

/// BITCAST - This operator converts between integer, vector and FP		/// BITCAST - This operator converts between integer, vector and FP
/// values, as if the value was stored to memory with one type and loaded		/// values, as if the value was stored to memory with one type and loaded
/// from the same address with the other type (or equivalently for vector		/// from the same address with the other type (or equivalently for vector
/// format conversions, etc). The source and result are required to have		/// format conversions, etc). The source and result are required to have
/// the same bit size (e.g. f32 <-> i32). This can also be used for		/// the same bit size (e.g. f32 <-> i32). This can also be used for
/// int-to-int or fp-to-fp conversions, but that is a noop, deleted by		/// int-to-int or fp-to-fp conversions, but that is a noop, deleted by
/// getNode().		/// getNode().
///		///

include/llvm/CodeGen/Passes.h

/// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.
FunctionPass *createBreakFalseDeps();		FunctionPass *createBreakFalseDeps();

// This pass expands indirectbr instructions.		// This pass expands indirectbr instructions.
FunctionPass *createIndirectBrExpandPass();		FunctionPass *createIndirectBrExpandPass();

/// Creates CFI Instruction Inserter pass. \see CFIInstrInserter.cpp		/// Creates CFI Instruction Inserter pass. \see CFIInstrInserter.cpp
FunctionPass *createCFIInstrInserter();		FunctionPass *createCFIInstrInserter();

		// Experimental pass with transforms needed for strict fp
		FunctionPass *createStrictFPPass();
} // End llvm namespace		} // End llvm namespace

#endif		#endif

include/llvm/CodeGen/SelectionDAGNodes.h

switch (NodeType) {
case ISD::STRICT_FCOS:		case ISD::STRICT_FCOS:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
return true;		return true;
}		}
}		}

/// Test if this node has a post-isel opcode, directly		/// Test if this node has a post-isel opcode, directly
/// corresponding to a MachineInstr opcode.		/// corresponding to a MachineInstr opcode.
bool isMachineOpcode() const { return NodeType < 0; }		bool isMachineOpcode() const { return NodeType < 0; }


include/llvm/CodeGen/TargetLowering.h

switch (Op) {
case ISD::STRICT_FCOS: EqOpc = ISD::FCOS; break;		case ISD::STRICT_FCOS: EqOpc = ISD::FCOS; break;
case ISD::STRICT_FEXP: EqOpc = ISD::FEXP; break;		case ISD::STRICT_FEXP: EqOpc = ISD::FEXP; break;
case ISD::STRICT_FEXP2: EqOpc = ISD::FEXP2; break;		case ISD::STRICT_FEXP2: EqOpc = ISD::FEXP2; break;
case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break;		case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break;
case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break;		case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break;
case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break;		case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break;
case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;		case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;
case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;		case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;
		case ISD::STRICT_FP_TO_SINT: EqOpc = ISD::FP_TO_SINT; break;
		case ISD::STRICT_FP_TO_UINT: EqOpc = ISD::FP_TO_UINT; break;
		case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;
		case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;
}		}

auto Action = getOperationAction(EqOpc, VT);		auto Action = getOperationAction(EqOpc, VT);

// We don't currently handle Custom or Promote for strict FP pseudo-ops.		// We don't currently handle Custom or Promote for strict FP pseudo-ops.
// For now, we just expand for those cases.		// For now, we just expand for those cases.
if (Action != Legal)		if (Action != Legal)
Action = Expand;		Action = Expand;
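
The fallback in the TargetLowering.h hunk above (look up the action of the equivalent non-strict opcode, and expand unless that action is Legal) can be modeled outside LLVM. The following is a toy sketch: the enums, the pretend target table, and the function names are all hypothetical and are not LLVM's ISD::NodeType or TargetLoweringBase API.

```cpp
#include <cassert>

// Toy opcodes and legalization actions, for illustration only.
enum class Opcode { FP_TO_SINT, FP_ROUND, STRICT_FP_TO_SINT, STRICT_FP_ROUND };
enum class Action { Legal, Promote, Expand, Custom };

// Map a strict opcode to its non-strict equivalent; other opcodes map to
// themselves.
Opcode equivalent_opcode(Opcode op) {
    switch (op) {
    case Opcode::STRICT_FP_TO_SINT: return Opcode::FP_TO_SINT;
    case Opcode::STRICT_FP_ROUND:   return Opcode::FP_ROUND;
    default:                        return op;
    }
}

// Hypothetical target table: pretend FP_TO_SINT is Legal and everything
// else is Custom.
Action target_action(Opcode op) {
    return op == Opcode::FP_TO_SINT ? Action::Legal : Action::Custom;
}

// Strict nodes reuse the non-strict action, but anything other than Legal
// falls back to Expand, as the comment in the hunk describes.
Action strict_action(Opcode strict_op) {
    const Action action = target_action(equivalent_opcode(strict_op));
    return action == Action::Legal ? action : Action::Expand;
}
```

With this table, `strict_action(Opcode::STRICT_FP_ROUND)` is `Action::Expand` even though the non-strict opcode is marked Custom, which mirrors the "we just expand for those cases" note in the patch.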

include/llvm/IR/IntrinsicInst.h

public:
static bool classof(const IntrinsicInst *I) {		static bool classof(const IntrinsicInst *I) {
switch (I->getIntrinsicID()) {		switch (I->getIntrinsicID()) {
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:

include/llvm/IR/Intrinsics.td

let IntrProperties = [IntrInaccessibleMemOnly] in {
def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

def int_experimental_constrained_fma : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fma : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptosi : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptoui : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptrunc : Intrinsic<[ llvm_anyfloat_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fpext : Intrinsic<[ llvm_anyfloat_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

// These intrinsics are sensitive to the rounding mode so we need constrained		// These intrinsics are sensitive to the rounding mode so we need constrained
// versions of each of them. When strict rounding and exception control are		// versions of each of them. When strict rounding and exception control are
// not required the non-constrained versions of these intrinsics should be		// not required the non-constrained versions of these intrinsics should be
// used.		// used.
def int_experimental_constrained_sqrt : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_sqrt : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
Show All 39 Lines	def int_experimental_constrained_rint : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_nearbyint : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_nearbyint : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
}		}
// FIXME: Add intrinsics for fcmp, fptrunc, fpext, fptoui and fptosi.		// FIXME: Add intrinsic for fcmp
// FIXME: Add intrinsics for fabs, copysign, floor, ceil, trunc and round?		// FIXME: Add intrinsics for fabs, copysign, floor, ceil, trunc and round?


//===------------------------- Expect Intrinsics --------------------------===//		//===------------------------- Expect Intrinsics --------------------------===//
//		//
def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,		def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,
LLVMMatchType<0>], [IntrNoMem]>;		LLVMMatchType<0>], [IntrNoMem]>;


include/llvm/InitializePasses.h

	Show First 20 Lines • Show All 365 Lines • ▼ Show 20 Lines
	void initializeSlotIndexesPass(PassRegistry&);			void initializeSlotIndexesPass(PassRegistry&);
	void initializeSpeculativeExecutionLegacyPassPass(PassRegistry&);			void initializeSpeculativeExecutionLegacyPassPass(PassRegistry&);
	void initializeSpillPlacementPass(PassRegistry&);			void initializeSpillPlacementPass(PassRegistry&);
	void initializeStackColoringPass(PassRegistry&);			void initializeStackColoringPass(PassRegistry&);
	void initializeStackMapLivenessPass(PassRegistry&);			void initializeStackMapLivenessPass(PassRegistry&);
	void initializeStackProtectorPass(PassRegistry&);			void initializeStackProtectorPass(PassRegistry&);
	void initializeStackSlotColoringPass(PassRegistry&);			void initializeStackSlotColoringPass(PassRegistry&);
	void initializeStraightLineStrengthReducePass(PassRegistry&);			void initializeStraightLineStrengthReducePass(PassRegistry&);
				void initializeStrictFPPassPass(PassRegistry&);
	void initializeStripDeadDebugInfoPass(PassRegistry&);			void initializeStripDeadDebugInfoPass(PassRegistry&);
	void initializeStripDeadPrototypesLegacyPassPass(PassRegistry&);			void initializeStripDeadPrototypesLegacyPassPass(PassRegistry&);
	void initializeStripDebugDeclarePass(PassRegistry&);			void initializeStripDebugDeclarePass(PassRegistry&);
	void initializeStripGCRelocatesPass(PassRegistry&);			void initializeStripGCRelocatesPass(PassRegistry&);
	void initializeStripNonDebugSymbolsPass(PassRegistry&);			void initializeStripNonDebugSymbolsPass(PassRegistry&);
	void initializeStripNonLineTableDebugInfoPass(PassRegistry&);			void initializeStripNonLineTableDebugInfoPass(PassRegistry&);
	void initializeStripSymbolsPass(PassRegistry&);			void initializeStripSymbolsPass(PassRegistry&);
	void initializeStructurizeCFGPass(PassRegistry&);			void initializeStructurizeCFGPass(PassRegistry&);

lib/CodeGen/CMakeLists.txt

Show First 20 Lines • Show All 137 Lines • ▼ Show 20 Lines	add_llvm_library(LLVMCodeGen
SlotIndexes.cpp		SlotIndexes.cpp
SpillPlacement.cpp		SpillPlacement.cpp
SplitKit.cpp		SplitKit.cpp
StackColoring.cpp		StackColoring.cpp
StackMapLivenessAnalysis.cpp		StackMapLivenessAnalysis.cpp
StackMaps.cpp		StackMaps.cpp
StackProtector.cpp		StackProtector.cpp
StackSlotColoring.cpp		StackSlotColoring.cpp
		StrictFP.cpp
TailDuplication.cpp		TailDuplication.cpp
TailDuplicator.cpp		TailDuplicator.cpp
TargetFrameLoweringImpl.cpp		TargetFrameLoweringImpl.cpp
TargetInstrInfo.cpp		TargetInstrInfo.cpp
TargetLoweringBase.cpp		TargetLoweringBase.cpp
TargetLoweringObjectFileImpl.cpp		TargetLoweringObjectFileImpl.cpp
TargetOptionsImpl.cpp		TargetOptionsImpl.cpp
TargetPassConfig.cpp		TargetPassConfig.cpp

lib/CodeGen/CodeGen.cpp

Show First 20 Lines • Show All 91 Lines • ▼ Show 20 Lines	void llvm::initializeCodeGen(PassRegistry &Registry) {
initializeSafeStackLegacyPassPass(Registry);		initializeSafeStackLegacyPassPass(Registry);
initializeScalarizeMaskedMemIntrinPass(Registry);		initializeScalarizeMaskedMemIntrinPass(Registry);
initializeShrinkWrapPass(Registry);		initializeShrinkWrapPass(Registry);
initializeSlotIndexesPass(Registry);		initializeSlotIndexesPass(Registry);
initializeStackColoringPass(Registry);		initializeStackColoringPass(Registry);
initializeStackMapLivenessPass(Registry);		initializeStackMapLivenessPass(Registry);
initializeStackProtectorPass(Registry);		initializeStackProtectorPass(Registry);
initializeStackSlotColoringPass(Registry);		initializeStackSlotColoringPass(Registry);
		initializeStrictFPPassPass(Registry);
initializeTailDuplicatePass(Registry);		initializeTailDuplicatePass(Registry);
initializeTargetPassConfigPass(Registry);		initializeTargetPassConfigPass(Registry);
initializeTwoAddressInstructionPassPass(Registry);		initializeTwoAddressInstructionPassPass(Registry);
initializeUnpackMachineBundlesPass(Registry);		initializeUnpackMachineBundlesPass(Registry);
initializeUnreachableBlockElimLegacyPassPass(Registry);		initializeUnreachableBlockElimLegacyPassPass(Registry);
initializeUnreachableMachineBlockElimPass(Registry);		initializeUnreachableMachineBlockElimPass(Registry);
initializeVirtRegMapPass(Registry);		initializeVirtRegMapPass(Registry);
initializeVirtRegRewriterPass(Registry);		initializeVirtRegRewriterPass(Registry);
initializeWasmEHPreparePass(Registry);		initializeWasmEHPreparePass(Registry);
initializeWinEHPreparePass(Registry);		initializeWinEHPreparePass(Registry);
initializeXRayInstrumentationPass(Registry);		initializeXRayInstrumentationPass(Registry);
}		}

void LLVMInitializeCodeGen(LLVMPassRegistryRef R) {		void LLVMInitializeCodeGen(LLVMPassRegistryRef R) {
initializeCodeGen(*unwrap(R));		initializeCodeGen(*unwrap(R));
}		}

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Show First 20 Lines • Show All 940 Lines • ▼ Show 20 Lines	if (Chain.getNode() != Node) {
if (UpdatedNodes) {		if (UpdatedNodes) {
UpdatedNodes->insert(Value.getNode());		UpdatedNodes->insert(Value.getNode());
UpdatedNodes->insert(Chain.getNode());		UpdatedNodes->insert(Chain.getNode());
}		}
ReplacedNode(Node);		ReplacedNode(Node);
}		}
}		}

/// Return a legal replacement for the given operation, with all legal operands.		/// Return a legal replacement for the given operation, with all legal operands.
andrew.w.kaylor: STRICT_FP_TO_SINT needs to do something more than this. The default lowering of FP_TO_SINT is known to raise spurious FE_INEXACT exceptions because it involves speculative execution.
kpn (author): I have seen FP_TO_SINT cause traps when it shouldn't. Would having that default lowering use the chain in the STRICT_ case solve that issue?
andrew.w.kaylor: No, I don't think the chain will fix this. We need to implement strict lowering that does something different.
kpn (author): Is there something we can do at this level to fix this? If there is then I'm all for it, but if there isn't then we should probably still put the intrinsic in. We'll need it eventually, and currently none of the constrained intrinsics solve the complete optimization problem. So I wonder if this intrinsic is really all that different from the other experimental constrained intrinsics. If a backend models FP side effects then wouldn't the existing default lowering work correctly?
andrew.w.kaylor: To be honest I'm not entirely sure how to fix this in the current selection DAG model. The issue is that we need to introduce a branch to fix the problem, but by the time we're selecting instructions it's too late to do that. I think it needs to be addressed when we're building the DAG,
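To make the speculative-execution concern in this thread concrete, here is a minimal standalone sketch. It is not LLVM code: every name in it (`toyFpToSint`, `toyFpToUintSelect`, `toyFpToUintBranch`, the `Toy*` flags) is hypothetical, and the exception flags are modelled as plain bits rather than read from the hardware. It mirrors the shape of the select-based FP_TO_UINT expansion, where both signed conversions are computed and one is chosen afterwards, and compares it with a branch that evaluates only the arm it needs. The rounding of the FSUB is not modelled.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Toy status bits standing in for the hardware FP exception flags.
enum ToyFlag : unsigned { ToyInvalid = 1u, ToyInexact = 2u };

static const double TwoTo63 = 9223372036854775808.0; // 2^63, exact in double
static const uint64_t SignMask = 1ull << 63;

struct ToyConv  { int64_t Value;  unsigned Flags; };
struct ToyUConv { uint64_t Value; unsigned Flags; };

// Hypothetical model of a signed double -> int64 conversion. It records the
// flags a real conversion would be expected to raise.
static ToyConv toyFpToSint(double X) {
  if (std::isnan(X) || X >= TwoTo63 || X < -TwoTo63)
    return {INT64_MIN, ToyInvalid};      // out of range: invalid operation
  double T = std::trunc(X);
  unsigned Flags = (T != X) ? ToyInexact : 0u;
  return {static_cast<int64_t>(T), Flags}; // in range, so the cast is defined
}

// Select-style expansion: both arms run, so both sets of flags are raised.
static ToyUConv toyFpToUintSelect(double X) {
  ToyConv T = toyFpToSint(X);
  ToyConv F = toyFpToSint(X - TwoTo63);
  uint64_t V = X < TwoTo63 ? static_cast<uint64_t>(T.Value)
                           : static_cast<uint64_t>(F.Value) ^ SignMask;
  return {V, T.Flags | F.Flags};
}

// Branch-style expansion: only the arm that is needed runs.
static ToyUConv toyFpToUintBranch(double X) {
  if (X < TwoTo63) {
    ToyConv T = toyFpToSint(X);
    return {static_cast<uint64_t>(T.Value), T.Flags};
  }
  ToyConv F = toyFpToSint(X - TwoTo63);
  return {static_cast<uint64_t>(F.Value) ^ SignMask, F.Flags};
}
```

For an input of exactly 2^63, which is a valid unsigned 64-bit value, both versions return the same result, but the select version also reports `ToyInvalid` from the arm whose result is discarded. That is the kind of spurious exception a strict lowering would have to avoid.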
void SelectionDAGLegalize::LegalizeOp(SDNode *Node) {		void SelectionDAGLegalize::LegalizeOp(SDNode *Node) {
LLVM_DEBUG(dbgs() << "\nLegalizing: "; Node->dump(&DAG));		LLVM_DEBUG(dbgs() << "\nLegalizing: "; Node->dump(&DAG));

// Allow illegal target nodes and illegal registers.		// Allow illegal target nodes and illegal registers.
if (Node->getOpcode() == ISD::TargetConstant \|\|		if (Node->getOpcode() == ISD::TargetConstant \|\|
Node->getOpcode() == ISD::Register)		Node->getOpcode() == ISD::Register)
return;		return;

▲ Show 20 Lines • Show All 144 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FCOS:		case ISD::STRICT_FCOS:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
// These pseudo-ops get legalized as if they were their non-strict		// These pseudo-ops get legalized as if they were their non-strict
// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT		// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
// is also legal, but if ISD::FSQRT requires expansion then so does		// is also legal, but if ISD::FSQRT requires expansion then so does
// ISD::STRICT_FSQRT.		// ISD::STRICT_FSQRT.
Action = TLI.getStrictFPOperationAction(Node->getOpcode(),		Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
andrew.w.kaylor: Since this gets unique handling why isn't it just a separate case from the others?
kpn (author): Good point. And making it a separate case also takes care of the formatting issue in the else block.
break;		break;
		case ISD::STRICT_FP_TO_UINT:
		llvm_unreachable("Expansion of STRICT_FP_TO_UINT missed in earlier pass!");
		break;
andrew.w.kaylor: The style/formatting is wrong here. I think you need curly braces around your else-clause and the "else" itself needs to be on the same line as the curly brace above it.
default:		default:
if (Node->getOpcode() >= ISD::BUILTIN_OP_END) {		if (Node->getOpcode() >= ISD::BUILTIN_OP_END) {
Action = TargetLowering::Legal;		Action = TargetLowering::Legal;
} else {		} else {
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
}		}
break;		break;
}		}
▲ Show 20 Lines • Show All 1,847 Lines • ▼ Show 20 Lines	if (VT.isInteger())
Results.push_back(DAG.getConstant(0, dl, VT));		Results.push_back(DAG.getConstant(0, dl, VT));
else {		else {
assert(VT.isFloatingPoint() && "Unknown value type!");		assert(VT.isFloatingPoint() && "Unknown value type!");
Results.push_back(DAG.getConstantFP(0, dl, VT));		Results.push_back(DAG.getConstantFP(0, dl, VT));
}		}
break;		break;
}		}
case ISD::FP_ROUND:		case ISD::FP_ROUND:
		case ISD::STRICT_FP_ROUND:
case ISD::BITCAST:		case ISD::BITCAST:
Tmp1 = EmitStackConvert(Node->getOperand(0), Node->getValueType(0),		Tmp1 = EmitStackConvert(Node->getOperand(0), Node->getValueType(0),
Node->getValueType(0), dl);		Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
		case ISD::STRICT_FP_EXTEND:
Tmp1 = EmitStackConvert(Node->getOperand(0),		Tmp1 = EmitStackConvert(Node->getOperand(0),
Node->getOperand(0).getValueType(),		Node->getOperand(0).getValueType(),
Node->getValueType(0), dl);		Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::SIGN_EXTEND_INREG: {		case ISD::SIGN_EXTEND_INREG: {
cameron.mcinally: This doesn't seem correct. Shouldn't Node->getOperand(0) be the argument for FP_ROUND and the Chain for STRICT_FP_ROUND? Same for FP_EXTEND too. Also, does using a truncating store provide us with the same trapping behavior as an explicit trunc instruction? I don't know off the top of my head, but it may be different. To be fair, a user probably won't care too much about optimizing away a trap, but the purists might.
EVT ExtraVT = cast<VTSDNode>(Node->getOperand(1))->getVT();		EVT ExtraVT = cast<VTSDNode>(Node->getOperand(1))->getVT();
EVT VT = Node->getValueType(0);		EVT VT = Node->getValueType(0);

// An in-register sign-extend of a boolean is a negation:		// An in-register sign-extend of a boolean is a negation:
// 'true' (1) sign-extended is -1.		// 'true' (1) sign-extended is -1.
// 'false' (0) sign-extended is 0.		// 'false' (0) sign-extended is 0.
// However, we must mask the high bits of the source operand because the		// However, we must mask the high bits of the source operand because the
// SIGN_EXTEND_INREG does not guarantee that the high bits are already zero.		// SIGN_EXTEND_INREG does not guarantee that the high bits are already zero.
Show All 35 Lines	bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
}		}
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
Tmp1 = ExpandLegalINT_TO_FP(Node->getOpcode() == ISD::SINT_TO_FP,		Tmp1 = ExpandLegalINT_TO_FP(Node->getOpcode() == ISD::SINT_TO_FP,
Node->getOperand(0), Node->getValueType(0), dl);		Node->getOperand(0), Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
		case ISD::STRICT_FP_TO_SINT:
if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))		if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
cameron.mcinally: This seems incorrect, unless I missed something. The first operand to FP_TO_SINT should be the argument, but the first operand to STRICT_FP_TO_SINT should be the Chain. It does not look like expandFP_TO_SINT accounts for that.
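The operand-index concern can be illustrated with a small standalone model. This is a sketch only, assuming the layout described in the comment above (a chained strict node carries the chain as operand 0 and the FP source as operand 1); `ToyNode` and `toyGetFPSource` are hypothetical names, not part of SelectionDAG.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an SDNode: a strictness bit plus operand names.
struct ToyNode {
  bool IsStrict;
  std::vector<std::string> Operands;
};

// Assumption: strict nodes carry the chain as operand 0, so the FP source
// value is operand 1. Non-strict nodes have the source at operand 0.
static std::string toyGetFPSource(const ToyNode &N) {
  return N.Operands[N.IsStrict ? 1 : 0];
}
```

A shared expansion helper that always reads operand 0 would pick up the chain instead of the value for the strict form, which is the mismatch the comment points at.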
case ISD::FP_TO_UINT: {		case ISD::FP_TO_UINT: {
SDValue True, False;		SDValue True, False;
EVT VT = Node->getOperand(0).getValueType();		EVT VT = Node->getOperand(0).getValueType();
EVT NVT = Node->getValueType(0);		EVT NVT = Node->getValueType(0);
APFloat apf(DAG.EVTToAPFloatSemantics(VT),		APFloat apf(DAG.EVTToAPFloatSemantics(VT),
APInt::getNullValue(VT.getSizeInBits()));		APInt::getNullValue(VT.getSizeInBits()));
APInt x = APInt::getSignMask(NVT.getSizeInBits());		APInt x = APInt::getSignMask(NVT.getSizeInBits());
(void)apf.convertFromAPInt(x, false, APFloat::rmNearestTiesToEven);		(void)apf.convertFromAPInt(x, false, APFloat::rmNearestTiesToEven);
Tmp1 = DAG.getConstantFP(apf, dl, VT);		Tmp1 = DAG.getConstantFP(apf, dl, VT);
Tmp2 = DAG.getSetCC(dl, getSetCCResultType(VT),		Tmp2 = DAG.getSetCC(dl, getSetCCResultType(VT),
Node->getOperand(0),		Node->getOperand(0),
Tmp1, ISD::SETLT);		Tmp1, ISD::SETLT);
True = DAG.getNode(ISD::FP_TO_SINT, dl, NVT, Node->getOperand(0));		True = DAG.getNode(ISD::FP_TO_SINT, dl, NVT, Node->getOperand(0));
// TODO: Should any fast-math-flags be set for the FSUB?		// TODO: Should any fast-math-flags be set for the FSUB?
False = DAG.getNode(ISD::FP_TO_SINT, dl, NVT,		False = DAG.getNode(ISD::FP_TO_SINT, dl, NVT,
DAG.getNode(ISD::FSUB, dl, VT,		DAG.getNode(ISD::FSUB, dl, VT,
Node->getOperand(0), Tmp1));		Node->getOperand(0), Tmp1));
False = DAG.getNode(ISD::XOR, dl, NVT, False,		False = DAG.getNode(ISD::XOR, dl, NVT, False,
DAG.getConstant(x, dl, NVT));		DAG.getConstant(x, dl, NVT));
Tmp1 = DAG.getSelect(dl, NVT, Tmp2, True, False);		Tmp1 = DAG.getSelect(dl, NVT, Tmp2, True, False);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
}		}
		case ISD::STRICT_FP_TO_UINT:
		llvm_unreachable("Expansion of STRICT_FP_TO_UINT missed in earlier pass!");
		break;
case ISD::VAARG:		case ISD::VAARG:
Results.push_back(DAG.expandVAArg(Node));		Results.push_back(DAG.expandVAArg(Node));
Results.push_back(Results[0].getValue(1));		Results.push_back(Results[0].getValue(1));
break;		break;
case ISD::VACOPY:		case ISD::VACOPY:
Results.push_back(DAG.expandVACopy(Node));		Results.push_back(DAG.expandVACopy(Node));
break;		break;
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

Show First 20 Lines • Show All 107 Lines • ▼ Show 20 Lines	#endif
case ISD::SIGN_EXTEND_VECTOR_INREG:		case ISD::SIGN_EXTEND_VECTOR_INREG:
case ISD::ZERO_EXTEND_VECTOR_INREG:		case ISD::ZERO_EXTEND_VECTOR_INREG:
Res = PromoteIntRes_EXTEND_VECTOR_INREG(N); break;		Res = PromoteIntRes_EXTEND_VECTOR_INREG(N); break;

case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;		case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;

		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;		case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;

case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;		case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;

case ISD::AND:		case ISD::AND:
case ISD::OR:		case ISD::OR:
case ISD::XOR:		case ISD::XOR:
▲ Show 20 Lines • Show All 288 Lines • ▼ Show 20 Lines	SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT(SDNode *N) {
// not Legal, check to see if we can use FP_TO_SINT instead. (If both UINT		// not Legal, check to see if we can use FP_TO_SINT instead. (If both UINT
// and SINT conversions are Custom, there is no way to tell which is		// and SINT conversions are Custom, there is no way to tell which is
// preferable. We choose SINT because that's the right thing on PPC.)		// preferable. We choose SINT because that's the right thing on PPC.)
if (N->getOpcode() == ISD::FP_TO_UINT &&		if (N->getOpcode() == ISD::FP_TO_UINT &&
!TLI.isOperationLegal(ISD::FP_TO_UINT, NVT) &&		!TLI.isOperationLegal(ISD::FP_TO_UINT, NVT) &&
TLI.isOperationLegalOrCustom(ISD::FP_TO_SINT, NVT))		TLI.isOperationLegalOrCustom(ISD::FP_TO_SINT, NVT))
NewOpc = ISD::FP_TO_SINT;		NewOpc = ISD::FP_TO_SINT;

		if (N->getOpcode() == ISD::STRICT_FP_TO_UINT &&
		!TLI.isOperationLegal(ISD::STRICT_FP_TO_UINT, NVT) &&
		TLI.isOperationLegalOrCustom(ISD::STRICT_FP_TO_SINT, NVT))
		NewOpc = ISD::STRICT_FP_TO_SINT;

SDValue Res = DAG.getNode(NewOpc, dl, NVT, N->getOperand(0));		SDValue Res = DAG.getNode(NewOpc, dl, NVT, N->getOperand(0));

// Assert that the converted value fits in the original type. If it doesn't		// Assert that the converted value fits in the original type. If it doesn't
// (eg: because the value being converted is too big), then the result of the		// (eg: because the value being converted is too big), then the result of the
// original operation was undefined anyway, so the assert is still correct.		// original operation was undefined anyway, so the assert is still correct.
//		//
// NOTE: fp-to-uint to fp-to-sint promotion guarantees zero extend. For example:		// NOTE: fp-to-uint to fp-to-sint promotion guarantees zero extend. For example:
// before legalization: fp-to-uint16, 65534. -> 0xfffe		// before legalization: fp-to-uint16, 65534. -> 0xfffe
// after legalization: fp-to-sint32, 65534. -> 0x0000fffe		// after legalization: fp-to-sint32, 65534. -> 0x0000fffe
return DAG.getNode(N->getOpcode() == ISD::FP_TO_UINT ?		return DAG.getNode((N->getOpcode() == ISD::FP_TO_UINT \|\|
		N->getOpcode() == ISD::STRICT_FP_TO_UINT) ?
ISD::AssertZext : ISD::AssertSext, dl, NVT, Res,		ISD::AssertZext : ISD::AssertSext, dl, NVT, Res,
DAG.getValueType(N->getValueType(0).getScalarType()));		DAG.getValueType(N->getValueType(0).getScalarType()));
}		}

SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {
EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDLoc dl(N);		SDLoc dl(N);


lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

Show First 20 Lines • Show All 305 Lines • ▼ Show 20 Lines	SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::STRICT_FCOS:		case ISD::STRICT_FCOS:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
// These pseudo-ops get legalized as if they were their non-strict		// These pseudo-ops get legalized as if they were their non-strict
// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT		// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
// is also legal, but if ISD::FSQRT requires expansion then so does		// is also legal, but if ISD::FSQRT requires expansion then so does
// ISD::STRICT_FSQRT.		// ISD::STRICT_FSQRT.
Action = TLI.getStrictFPOperationAction(Node->getOpcode(),		Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
break;		break;
case ISD::ADD:		case ISD::ADD:

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Show First 20 Lines • Show All 45 Lines • ▼ Show 20 Lines
#endif		#endif
report_fatal_error("Do not know how to scalarize the result of this "		report_fatal_error("Do not know how to scalarize the result of this "
"operator!\n");		"operator!\n");

case ISD::MERGE_VALUES: R = ScalarizeVecRes_MERGE_VALUES(N, ResNo);break;		case ISD::MERGE_VALUES: R = ScalarizeVecRes_MERGE_VALUES(N, ResNo);break;
case ISD::BITCAST: R = ScalarizeVecRes_BITCAST(N); break;		case ISD::BITCAST: R = ScalarizeVecRes_BITCAST(N); break;
case ISD::BUILD_VECTOR: R = ScalarizeVecRes_BUILD_VECTOR(N); break;		case ISD::BUILD_VECTOR: R = ScalarizeVecRes_BUILD_VECTOR(N); break;
case ISD::EXTRACT_SUBVECTOR: R = ScalarizeVecRes_EXTRACT_SUBVECTOR(N); break;		case ISD::EXTRACT_SUBVECTOR: R = ScalarizeVecRes_EXTRACT_SUBVECTOR(N); break;
		case ISD::STRICT_FP_ROUND:
case ISD::FP_ROUND: R = ScalarizeVecRes_FP_ROUND(N); break;		case ISD::FP_ROUND: R = ScalarizeVecRes_FP_ROUND(N); break;
case ISD::FP_ROUND_INREG: R = ScalarizeVecRes_InregOp(N); break;		case ISD::FP_ROUND_INREG: R = ScalarizeVecRes_InregOp(N); break;
case ISD::FPOWI: R = ScalarizeVecRes_FPOWI(N); break;		case ISD::FPOWI: R = ScalarizeVecRes_FPOWI(N); break;
case ISD::INSERT_VECTOR_ELT: R = ScalarizeVecRes_INSERT_VECTOR_ELT(N); break;		case ISD::INSERT_VECTOR_ELT: R = ScalarizeVecRes_INSERT_VECTOR_ELT(N); break;
case ISD::LOAD: R = ScalarizeVecRes_LOAD(cast<LoadSDNode>(N));break;		case ISD::LOAD: R = ScalarizeVecRes_LOAD(cast<LoadSDNode>(N));break;
case ISD::SCALAR_TO_VECTOR: R = ScalarizeVecRes_SCALAR_TO_VECTOR(N); break;		case ISD::SCALAR_TO_VECTOR: R = ScalarizeVecRes_SCALAR_TO_VECTOR(N); break;
case ISD::SIGN_EXTEND_INREG: R = ScalarizeVecRes_InregOp(N); break;		case ISD::SIGN_EXTEND_INREG: R = ScalarizeVecRes_InregOp(N); break;
case ISD::VSELECT: R = ScalarizeVecRes_VSELECT(N); break;		case ISD::VSELECT: R = ScalarizeVecRes_VSELECT(N); break;
Show All 21 Lines	#endif
case ISD::FEXP:		case ISD::FEXP:
case ISD::FEXP2:		case ISD::FEXP2:
case ISD::FFLOOR:		case ISD::FFLOOR:
case ISD::FLOG:		case ISD::FLOG:
case ISD::FLOG10:		case ISD::FLOG10:
case ISD::FLOG2:		case ISD::FLOG2:
case ISD::FNEARBYINT:		case ISD::FNEARBYINT:
case ISD::FNEG:		case ISD::FNEG:
		case ISD::STRICT_FP_EXTEND:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::FRINT:		case ISD::FRINT:
case ISD::FROUND:		case ISD::FROUND:
case ISD::FSIN:		case ISD::FSIN:
case ISD::FSQRT:		case ISD::FSQRT:
case ISD::FTRUNC:		case ISD::FTRUNC:
▲ Show 20 Lines • Show All 429 Lines • ▼ Show 20 Lines	case ISD::VSELECT:
Res = ScalarizeVecOp_VSELECT(N);		Res = ScalarizeVecOp_VSELECT(N);
break;		break;
case ISD::SETCC:		case ISD::SETCC:
Res = ScalarizeVecOp_VSETCC(N);		Res = ScalarizeVecOp_VSETCC(N);
break;		break;
case ISD::STORE:		case ISD::STORE:
Res = ScalarizeVecOp_STORE(cast<StoreSDNode>(N), OpNo);		Res = ScalarizeVecOp_STORE(cast<StoreSDNode>(N), OpNo);
break;		break;
		case ISD::STRICT_FP_ROUND:
case ISD::FP_ROUND:		case ISD::FP_ROUND:
Res = ScalarizeVecOp_FP_ROUND(N, OpNo);		Res = ScalarizeVecOp_FP_ROUND(N, OpNo);
break;		break;
}		}
}		}

// If the result is null, the sub-method took care of registering results etc.		// If the result is null, the sub-method took care of registering results etc.
if (!Res.getNode()) return false;		if (!Res.getNode()) return false;
▲ Show 20 Lines • Show All 1,103 Lines • ▼ Show 20 Lines	#endif
case ISD::SETCC: Res = SplitVecOp_VSETCC(N); break;		case ISD::SETCC: Res = SplitVecOp_VSETCC(N); break;
case ISD::BITCAST: Res = SplitVecOp_BITCAST(N); break;		case ISD::BITCAST: Res = SplitVecOp_BITCAST(N); break;
case ISD::EXTRACT_SUBVECTOR: Res = SplitVecOp_EXTRACT_SUBVECTOR(N); break;		case ISD::EXTRACT_SUBVECTOR: Res = SplitVecOp_EXTRACT_SUBVECTOR(N); break;
case ISD::EXTRACT_VECTOR_ELT:Res = SplitVecOp_EXTRACT_VECTOR_ELT(N); break;		case ISD::EXTRACT_VECTOR_ELT:Res = SplitVecOp_EXTRACT_VECTOR_ELT(N); break;
case ISD::CONCAT_VECTORS: Res = SplitVecOp_CONCAT_VECTORS(N); break;		case ISD::CONCAT_VECTORS: Res = SplitVecOp_CONCAT_VECTORS(N); break;
case ISD::TRUNCATE:		case ISD::TRUNCATE:
Res = SplitVecOp_TruncateHelper(N);		Res = SplitVecOp_TruncateHelper(N);
break;		break;
		case ISD::STRICT_FP_ROUND:
case ISD::FP_ROUND: Res = SplitVecOp_FP_ROUND(N); break;		case ISD::FP_ROUND: Res = SplitVecOp_FP_ROUND(N); break;
case ISD::FCOPYSIGN: Res = SplitVecOp_FCOPYSIGN(N); break;		case ISD::FCOPYSIGN: Res = SplitVecOp_FCOPYSIGN(N); break;
case ISD::STORE:		case ISD::STORE:
Res = SplitVecOp_STORE(cast<StoreSDNode>(N), OpNo);		Res = SplitVecOp_STORE(cast<StoreSDNode>(N), OpNo);
break;		break;
case ISD::MSTORE:		case ISD::MSTORE:
Res = SplitVecOp_MSTORE(cast<MaskedStoreSDNode>(N), OpNo);		Res = SplitVecOp_MSTORE(cast<MaskedStoreSDNode>(N), OpNo);
break;		break;
Show All 18 Lines	case ISD::UINT_TO_FP:
if (N->getValueType(0).bitsLT(N->getOperand(0).getValueType()))		if (N->getValueType(0).bitsLT(N->getOperand(0).getValueType()))
Res = SplitVecOp_TruncateHelper(N);		Res = SplitVecOp_TruncateHelper(N);
else		else
Res = SplitVecOp_UnaryOp(N);		Res = SplitVecOp_UnaryOp(N);
break;		break;
case ISD::CTTZ:		case ISD::CTTZ:
case ISD::CTLZ:		case ISD::CTLZ:
case ISD::CTPOP:		case ISD::CTPOP:
		case ISD::STRICT_FP_EXTEND:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND:		case ISD::ANY_EXTEND:
case ISD::FTRUNC:		case ISD::FTRUNC:
case ISD::FCANONICALIZE:		case ISD::FCANONICALIZE:
Res = SplitVecOp_UnaryOp(N);		Res = SplitVecOp_UnaryOp(N);
break;		break;
▲ Show 20 Lines • Show All 709 Lines • ▼ Show 20 Lines	#endif
case ISD::ANY_EXTEND_VECTOR_INREG:		case ISD::ANY_EXTEND_VECTOR_INREG:
case ISD::SIGN_EXTEND_VECTOR_INREG:		case ISD::SIGN_EXTEND_VECTOR_INREG:
case ISD::ZERO_EXTEND_VECTOR_INREG:		case ISD::ZERO_EXTEND_VECTOR_INREG:
Res = WidenVecRes_EXTEND_VECTOR_INREG(N);		Res = WidenVecRes_EXTEND_VECTOR_INREG(N);
break;		break;

case ISD::ANY_EXTEND:		case ISD::ANY_EXTEND:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
		case ISD::STRICT_FP_EXTEND:
case ISD::FP_ROUND:		case ISD::FP_ROUND:
		case ISD::STRICT_FP_ROUND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
		case ISD::STRICT_FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
Res = WidenVecRes_Convert(N);		Res = WidenVecRes_Convert(N);
break;		break;

		case ISD::STRICT_FP_TO_UINT:
		llvm_unreachable("Expansion of STRICT_FP_TO_UINT missed in earlier pass!");
		break;

case ISD::BITREVERSE:		case ISD::BITREVERSE:
case ISD::BSWAP:		case ISD::BSWAP:
case ISD::CTLZ:		case ISD::CTLZ:
case ISD::CTPOP:		case ISD::CTPOP:
case ISD::CTTZ:		case ISD::CTTZ:
case ISD::FABS:		case ISD::FABS:
case ISD::FCEIL:		case ISD::FCEIL:
case ISD::FCOS:		case ISD::FCOS:
▲ Show 20 Lines • Show All 1,183 Lines • ▼ Show 20 Lines	#endif
case ISD::FCOPYSIGN: Res = WidenVecOp_FCOPYSIGN(N); break;		case ISD::FCOPYSIGN: Res = WidenVecOp_FCOPYSIGN(N); break;

case ISD::ANY_EXTEND:		case ISD::ANY_EXTEND:
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
Res = WidenVecOp_EXTEND(N);		Res = WidenVecOp_EXTEND(N);
break;		break;

		case ISD::STRICT_FP_EXTEND:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
Res = WidenVecOp_Convert(N);		Res = WidenVecOp_Convert(N);
break;		break;
▲ Show 20 Lines • Show All 777 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show First 20 Lines • Show All 7,298 Lines • ▼ Show 20 Lines	SDNode* SelectionDAG::mutateStrictFPToFP(SDNode *Node) {
case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; IsUnary = true; break;		case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; IsUnary = true; break;
case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; IsUnary = true; break;		case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; IsUnary = true; break;
case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; IsUnary = true; break;		case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; IsUnary = true; break;
case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; IsUnary = true; break;		case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; IsUnary = true; break;
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
NewOpc = ISD::FNEARBYINT;		NewOpc = ISD::FNEARBYINT;
IsUnary = true;		IsUnary = true;
break;		break;
		case ISD::STRICT_FP_TO_SINT: NewOpc = ISD::FP_TO_SINT; break;
		case ISD::STRICT_FP_TO_UINT: NewOpc = ISD::FP_TO_UINT; break;
		case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; IsUnary = true; break;
		case ISD::STRICT_FP_EXTEND: NewOpc = ISD::FP_EXTEND; IsUnary = true; break;
		}

		bool IsChained = true;
		switch (OrigOpc) {
		default:
		break;
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
		IsChained = false;
		break;
}		}
		cameron.mcinally (Not Done): There are a lot of comments on this, so I may have missed something. Take with a grain of salt... I don't think these are correct. These can trap, so they can't be speculatively executed. They would need a chain.
		kpn (Author, Not Done): I couldn't figure out how to have these be chained but have the non-strict continue to not be chained. Too many things fell over if they didn't match.
		cameron.mcinally (Done): I'm not an expert with this code, so cc @andrew.w.kaylor. This doesn't seem like the right direction though. Maybe these unchained operations should be left to a different patch until a proper solution is found.

// We're taking this node out of the chain, so we need to re-link things.		// We're taking this node out of the chain, so we need to re-link things.
		if (IsChained) {
SDValue InputChain = Node->getOperand(0);		SDValue InputChain = Node->getOperand(0);
SDValue OutputChain = SDValue(Node, 1);		SDValue OutputChain = SDValue(Node, 1);
ReplaceAllUsesOfValueWith(OutputChain, InputChain);		ReplaceAllUsesOfValueWith(OutputChain, InputChain);
		}

SDVTList VTs = getVTList(Node->getOperand(1).getValueType());		SDVTList VTs;
SDNode *Res = nullptr;		SDNode *Res = nullptr;
if (IsUnary)
		switch (OrigOpc) {
		default:
		VTs = getVTList(Node->getOperand(1).getValueType());
		break;
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
		VTs = getVTList(Node->ValueList[0]);
		break;
		}

		if (!IsChained)
		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(0) });
		craig.topper (Not Done): This doesn't copy the second argument to FP_ROUND over, does it?
		kpn (Author, Done): No. It should.
		else if (IsUnary)
Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1) });		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1) });
else if (IsTernary)		else if (IsTernary)
Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),
Node->getOperand(2),		Node->getOperand(2),
Node->getOperand(3)});		Node->getOperand(3)});
else		else
Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),
Node->getOperand(2) });		Node->getOperand(2) });
▲ Show 20 Lines • Show All 1,583 Lines • Show Last 20 Lines
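
The re-linking step in mutateStrictFPToFP (ReplaceAllUsesOfValueWith(OutputChain, InputChain)) can be pictured with a toy model. The sketch below is not LLVM code: ToyNode and unlinkFromChain are hypothetical names, and a single ChainIn pointer stands in for the chain operand and chain result of a real SDNode. It only illustrates that users of the removed node's output chain end up depending on its input chain.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chained DAG node: ChainIn points at the node
// whose output chain this node consumes (nullptr for the entry token).
struct ToyNode {
  std::string Name;
  ToyNode *ChainIn;
  ToyNode(std::string N, ToyNode *In = nullptr)
      : Name(std::move(N)), ChainIn(In) {}
};

// Take N out of the chain: every node that was ordered after N is now
// ordered after N's own predecessor, and N itself no longer has a chain.
void unlinkFromChain(std::vector<ToyNode *> &Nodes, ToyNode *N) {
  assert(N && "node must not be null");
  for (ToyNode *User : Nodes)
    if (User != N && User->ChainIn == N)
      User->ChainIn = N->ChainIn;
  N->ChainIn = nullptr;
}
```

In this model an unchained strict node is simply one that never had a ChainIn, which is why nothing stops it from being reordered relative to other chained operations. That is the concern raised in the comments above.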

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 5,584 Lines • ▼ Show 20 Lines	setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(2))));		getValue(I.getArgOperand(2))));
return nullptr;		return nullptr;
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
▲ Show 20 Lines • Show All 663 Lines • ▼ Show 20 Lines
}		}
}		}
}		}

void SelectionDAGBuilder::visitConstrainedFPIntrinsic(		void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
const ConstrainedFPIntrinsic &FPI) {		const ConstrainedFPIntrinsic &FPI) {
SDLoc sdl = getCurSDLoc();		SDLoc sdl = getCurSDLoc();
unsigned Opcode;		unsigned Opcode;
		bool IsChained = true;
switch (FPI.getIntrinsicID()) {		switch (FPI.getIntrinsicID()) {
default: llvm_unreachable("Impossible intrinsic"); // Can't reach here.		default: llvm_unreachable("Impossible intrinsic"); // Can't reach here.
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
Opcode = ISD::STRICT_FADD;		Opcode = ISD::STRICT_FADD;
break;		break;
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
Opcode = ISD::STRICT_FSUB;		Opcode = ISD::STRICT_FSUB;
break;		break;
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
Opcode = ISD::STRICT_FMUL;		Opcode = ISD::STRICT_FMUL;
break;		break;
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
Opcode = ISD::STRICT_FDIV;		Opcode = ISD::STRICT_FDIV;
break;		break;
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
Opcode = ISD::STRICT_FREM;		Opcode = ISD::STRICT_FREM;
break;		break;
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
Opcode = ISD::STRICT_FMA;		Opcode = ISD::STRICT_FMA;
break;		break;
		case Intrinsic::experimental_constrained_fptosi:
		Opcode = ISD::STRICT_FP_TO_SINT;
		IsChained = false;
		break;
		case Intrinsic::experimental_constrained_fptrunc:
		Opcode = ISD::STRICT_FP_ROUND;
		IsChained = false;
		break;
		case Intrinsic::experimental_constrained_fpext:
		Opcode = ISD::STRICT_FP_EXTEND;
		IsChained = false;
		break;
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
Opcode = ISD::STRICT_FSQRT;		Opcode = ISD::STRICT_FSQRT;
break;		break;
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
Opcode = ISD::STRICT_FPOW;		Opcode = ISD::STRICT_FPOW;
break;		break;
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
Opcode = ISD::STRICT_FPOWI;		Opcode = ISD::STRICT_FPOWI;
Show All 25 Lines	void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
Opcode = ISD::STRICT_FNEARBYINT;		Opcode = ISD::STRICT_FNEARBYINT;
break;		break;
}		}
const TargetLowering &TLI = DAG.getTargetLoweringInfo();		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
SDValue Chain = getRoot();		SDValue Chain = getRoot();
SmallVector<EVT, 4> ValueVTs;		SmallVector<EVT, 4> ValueVTs;
ComputeValueVTs(TLI, DAG.getDataLayout(), FPI.getType(), ValueVTs);		ComputeValueVTs(TLI, DAG.getDataLayout(), FPI.getType(), ValueVTs);
		if (IsChained)
ValueVTs.push_back(MVT::Other); // Out chain		ValueVTs.push_back(MVT::Other); // Out chain
		andrew.w.kaylor (Not Done): Why are you not attaching these nodes to the chain?
		kpn (Author, Not Done): Because they need to match what the default lowering is expecting. Otherwise a variety of failures happen.
		andrew.w.kaylor (Not Done): There is code that removes the chain when the strict node is mutated to a non-strict node. That should be preventing lowering problems. I believe the chain was necessary to prevent re-ordering prior to final instruction selection.
		kpn (Author, Not Done): Mutation happens too late in some cases. If it happened early enough then there wouldn't be any need for the strict intrinsics to be mentioned in SelectionDAGLegalize::ExpandNode(). Since it doesn't, we can't use the chain. Should that mutation happen earlier?
		andrew.w.kaylor (Not Done): We need the mutation to be put off as long as possible. I think it should be possible to unlink the chain in ExpandNode if necessary. Depending on what's happening there, we might even want the expanded node to make use of the chain.

SDVTList VTs = DAG.getVTList(ValueVTs);		SDVTList VTs = DAG.getVTList(ValueVTs);
SDValue Result;		SDValue Result;
if (FPI.isUnaryOp())		if (Opcode == ISD::STRICT_FP_ROUND || Opcode == ISD::STRICT_FP_EXTEND)
		Result = DAG.getNode(Opcode, sdl, VTs,
		{ getValue(FPI.getArgOperand(0)),
		DAG.getTargetConstant(0, sdl,
		TLI.getPointerTy(DAG.getDataLayout())) });
		else if (Opcode == ISD::STRICT_FP_TO_SINT)
		craig.topper (Not Done): Is this adding a second argument to STRICT_FP_EXTEND as well? I don't think the non-strict FP_EXTEND has two arguments.
		kpn (Author, Not Done): Agreed. I'll fix it.
		Result = DAG.getNode(Opcode, sdl, VTs,
		{ getValue(FPI.getArgOperand(0)) });
		else if (FPI.isUnaryOp())
Result = DAG.getNode(Opcode, sdl, VTs,		Result = DAG.getNode(Opcode, sdl, VTs,
{ Chain, getValue(FPI.getArgOperand(0)) });		{ Chain, getValue(FPI.getArgOperand(0)) });
else if (FPI.isTernaryOp())		else if (FPI.isTernaryOp())
Result = DAG.getNode(Opcode, sdl, VTs,		Result = DAG.getNode(Opcode, sdl, VTs,
{ Chain, getValue(FPI.getArgOperand(0)),		{ Chain, getValue(FPI.getArgOperand(0)),
getValue(FPI.getArgOperand(1)),		getValue(FPI.getArgOperand(1)),
getValue(FPI.getArgOperand(2)) });		getValue(FPI.getArgOperand(2)) });
else		else
Result = DAG.getNode(Opcode, sdl, VTs,		Result = DAG.getNode(Opcode, sdl, VTs,
{ Chain, getValue(FPI.getArgOperand(0)),		{ Chain, getValue(FPI.getArgOperand(0)),
getValue(FPI.getArgOperand(1)) });		getValue(FPI.getArgOperand(1)) });

		if (IsChained) {
assert(Result.getNode()->getNumValues() == 2);		assert(Result.getNode()->getNumValues() == 2);
SDValue OutChain = Result.getValue(1);		SDValue OutChain = Result.getValue(1);
DAG.setRoot(OutChain);		DAG.setRoot(OutChain);
		}
SDValue FPResult = Result.getValue(0);		SDValue FPResult = Result.getValue(0);
setValue(&FPI, FPResult);		setValue(&FPI, FPResult);
}		}

std::pair<SDValue, SDValue>		std::pair<SDValue, SDValue>
SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,		SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,
const BasicBlock *EHPadBB) {		const BasicBlock *EHPadBB) {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
▲ Show 20 Lines • Show All 3,977 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 284 Lines • ▼ Show 20 Lines	#endif
case ISD::ZERO_EXTEND: return "zero_extend";		case ISD::ZERO_EXTEND: return "zero_extend";
case ISD::ANY_EXTEND: return "any_extend";		case ISD::ANY_EXTEND: return "any_extend";
case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";		case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";
case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";		case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";
case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";		case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";
case ISD::ZERO_EXTEND_VECTOR_INREG: return "zero_extend_vector_inreg";		case ISD::ZERO_EXTEND_VECTOR_INREG: return "zero_extend_vector_inreg";
case ISD::TRUNCATE: return "truncate";		case ISD::TRUNCATE: return "truncate";
case ISD::FP_ROUND: return "fp_round";		case ISD::FP_ROUND: return "fp_round";
		case ISD::STRICT_FP_ROUND: return "strict_fp_round";
case ISD::FLT_ROUNDS_: return "flt_rounds";		case ISD::FLT_ROUNDS_: return "flt_rounds";
case ISD::FP_ROUND_INREG: return "fp_round_inreg";		case ISD::FP_ROUND_INREG: return "fp_round_inreg";
case ISD::FP_EXTEND: return "fp_extend";		case ISD::FP_EXTEND: return "fp_extend";
		case ISD::STRICT_FP_EXTEND: return "strict_fp_extend";

case ISD::SINT_TO_FP: return "sint_to_fp";		case ISD::SINT_TO_FP: return "sint_to_fp";
case ISD::UINT_TO_FP: return "uint_to_fp";		case ISD::UINT_TO_FP: return "uint_to_fp";
case ISD::FP_TO_SINT: return "fp_to_sint";		case ISD::FP_TO_SINT: return "fp_to_sint";
		case ISD::STRICT_FP_TO_SINT: return "strict_fp_to_sint";
case ISD::FP_TO_UINT: return "fp_to_uint";		case ISD::FP_TO_UINT: return "fp_to_uint";
		case ISD::STRICT_FP_TO_UINT: return "strict_fp_to_uint";
case ISD::BITCAST: return "bitcast";		case ISD::BITCAST: return "bitcast";
case ISD::ADDRSPACECAST: return "addrspacecast";		case ISD::ADDRSPACECAST: return "addrspacecast";
case ISD::FP16_TO_FP: return "fp16_to_fp";		case ISD::FP16_TO_FP: return "fp16_to_fp";
case ISD::FP_TO_FP16: return "fp_to_fp16";		case ISD::FP_TO_FP16: return "fp_to_fp16";

// Control flow instructions		// Control flow instructions
case ISD::BR: return "br";		case ISD::BR: return "br";
case ISD::BRIND: return "brind";		case ISD::BRIND: return "brind";
▲ Show 20 Lines • Show All 536 Lines • Show Last 20 Lines

lib/CodeGen/StrictFP.cpp

				//===----- StrictFP.cpp - Required transforms for strict FP ---------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// This file contains transforms necessary for strict floating point
				/// operations. The transforms done vary depending on the backend.
				andrew.w.kaylor (Done): Can you say more here about what these transformations are? It's clear that you intend this as a generic pass that currently does one thing but might have others added later. That's good, but I'd like to see the possibilities described here as they are implemented.
				/// Currently the full set of transforms is:
				/// - Conversions of floating point types to unsigned integral types
				/// are transformed to avoid the speculative execution present in
				/// the default lowering.
				///
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/CodeGen/TargetLowering.h"
				#include "llvm/CodeGen/TargetPassConfig.h"
				#include "llvm/CodeGen/TargetSubtargetInfo.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/InstrTypes.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/Intrinsics.h"
				#include "llvm/IR/Module.h"
				#include "llvm/Pass.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/raw_ostream.h"
				#include "llvm/Transforms/IPO.h"
				#include "llvm/Transforms/Utils/BasicBlockUtils.h"
				using namespace llvm;

				#define DEBUG_TYPE "constrained-fp-transforms"

				STATISTIC(NumStrictFPOps, "Number of strict floating point ops transformed");

				namespace {

				class StrictFPPass : public FunctionPass {
				public:
				static char ID;

				const DataLayout *DL;
				const TargetLowering *TLI;

				StrictFPPass() : FunctionPass(ID) {
				initializeStrictFPPassPass(*PassRegistry::getPassRegistry());
				}

				bool runOnFunction(Function &) override;

				private:
				void inspectIntrinsicCall(IntrinsicInst *);

				bool processIntrinsicCall(IntrinsicInst *);

				bool processVectorIntrinsicCall(IntrinsicInst *);

				void replaceConstrainedFPToUI(IntrinsicInst *);

				void replaceVectorConstrainedFPToUI(IntrinsicInst *);

				std::vector<IntrinsicInst *> IntrinsicWorkList;
				std::vector<IntrinsicInst *> VectorWorkList;
				};

				bool StrictFPPass::runOnFunction(Function &F) {
				bool Changed = false;
				DL = &F.getParent()->getDataLayout();
				andrew.w.kaylor (Done): Per the LLVM Programmer's Manual (http://llvm.org/docs/ProgrammersManual.html#iterating-over-the-instruction-in-a-function) you should be using an inst_iterator here.

				auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
				craig.topper (Done): I believe you should be getting TM by doing this: `auto *TPC = getAnalysisIfAvailable<TargetPassConfig>(); if (!TPC) return false; auto &TM = TPC->getTM<TargetMachine>();` There is only one other pass that takes TargetMachine in its constructor. The others use what I've put above, so I believe that is the preferred way.
				if (!TPC)
				return false;

				auto &TM = TPC->getTM<TargetMachine>();

				TLI = TM.getSubtargetImpl(F)->getTargetLowering();

				for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; I++) {
				craig.topper (Done): There is little reason to pass Context around. Value has a getContext as does Type. So you can get the context easily whenever you need it.
				if (auto *Call = dyn_cast<IntrinsicInst>(&*I))
				inspectIntrinsicCall(Call);
				}

				for (auto *I : VectorWorkList) {
				Changed |= processVectorIntrinsicCall(I);
				}

				for (auto *I : IntrinsicWorkList) {
				Changed |= processIntrinsicCall(I);
				}

				VectorWorkList.clear();
				IntrinsicWorkList.clear();

				craig.topper (Done): This cast is unnecessary. IntrinsicInst is a subclass of Value.
				return Changed;
				}
				craig.topper (Done): This should use TLI->getValueType.

				void StrictFPPass::inspectIntrinsicCall(IntrinsicInst *I) {

				switch (Intrinsic::ID IID = I->getIntrinsicID()) {
				default:
				return;
				case Intrinsic::experimental_constrained_fptoui:
				Value *IntDst = I;
				Type *IntDstType = IntDst->getType();
				EVT VT = TLI->getValueType(*DL, IntDstType);

				auto Action = TLI->getOperationAction(ISD::FP_TO_UINT, VT);

				// We don't currently handle Custom or Promote for strict FP pseudo-ops.
				// For now, we just expand for those cases.
				if (Action != TargetLowering::Legal)
				Action = TargetLowering::Expand;

				if (Action == TargetLowering::Expand) {
				if (IntDstType->isVectorTy())
				VectorWorkList.push_back(I);
				else
				IntrinsicWorkList.push_back(I);
				}

				break;
				}
				return;
				}

				bool StrictFPPass::processIntrinsicCall(IntrinsicInst *Call) {
				switch (Intrinsic::ID IID = Call->getIntrinsicID()) {
				default:
				return false;
				case Intrinsic::experimental_constrained_fptoui:
				andrew.w.kaylor (Done): Could you add an example here of what the resulting IR will look like? It would make the code a lot easier to follow.
				replaceConstrainedFPToUI(Call);
				break;
				}
				return true;
				}

				bool StrictFPPass::processVectorIntrinsicCall(IntrinsicInst *Call) {
				switch (Intrinsic::ID IID = Call->getIntrinsicID()) {
				andrew.w.kaylor (Not Done): I think you can use "IntDst->getBitWidth()" here. Also, APInt::getSignedMinValue() does this same thing and is a bit more self-documenting.
				kpn (Author, Not Done): error: no member named 'getBitWidth' in 'llvm::Value'.
				default:
				return false;
				case Intrinsic::experimental_constrained_fptoui:
				replaceVectorConstrainedFPToUI(Call);
				break;
				}
				andrew.w.kaylor (Done): I believe conversion of a NaN to an integer should raise "INVALID" (which the fcmp will) and then the result is undefined, but the 'true' case does less so I think ULT is preferable.
				return true;
				andrew.w.kaylor (Done): We are going to need a constrained version of fcmp, and when we have it you should use it here. When the IRBuilder supports constrained floating point modes, it would be nice to use that here but I guess you can't do that yet, so maybe just a comment saying we should later?
				}

				andrew.w.kaylor (Done): This is an odd name. How about "within.sint.range"? In any case, I think '.' is more common than '_' as a name space holder, probably because the name will automatically get '.<n>' appended if it's a duplicate.
				void StrictFPPass::replaceConstrainedFPToUI(IntrinsicInst *I) {

				// Four blocks:
				// #1 Gets the compare instruction, is the original block
				// #2 Gets conversion instructions when in signed range
				// #3 Conversion instructions when out of signed range
				// #4 Gets the PHI plus the remainder of the original block
				//
				// The original call gets replaced with the PHI
				//
				// An example of a transform of a double into an unsigned i32:
				//
				// entry:
				craig.topper (Done): Why do you need a SmallVector here? Why can't you just call getArgOperand?
				// %within.sint.range = fcmp ult double 4.210000e+01, 0x41E0000000000000
				craig.topper (Done): This cast is unnecessary.
				// br i1 %within.sint.range, label %0, label %2

				// ; <label>:0: ; preds = %entry
				// %1 = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
				craig.topper (Not Done): What if the intrinsic uses a vector type?
				kpn (Author, Not Done): It would have been caught by the IR verifier. A vector would have been rejected there.
				craig.topper (Not Done): I don't see where the IR verifier rejects vectors. I just see that it checks that element counts are equal. And why should it reject vectors? We need to support fptosi/fptoui for vectors.
				kpn (Author, Not Done): It doesn't reject them, now. This latest patch added support for vectors. My inexperience with Phabricator lost the comment that said that.
				// double 4.210000e+01, metadata !"fpexcept.strict")
				// br label %6

				// ; <label>:2: ; preds = %entry
				// %3 = call double @llvm.experimental.constrained.fsub.f64(
				// double 4.210000e+01, double 0x41E0000000000000,
				// metadata !"round.dynamic",
				// metadata !"fpexcept.strict")
				// %4 = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
				// double %3, metadata !"fpexcept.strict")
				// %5 = xor i32 %4, -2147483648
				// br label %6

				// ; <label>:6: ; preds = %2, %0
				// %7 = phi i32 [ %1, %0 ], [ %5, %2 ]

				LLVMContext &Context = I->getContext();

				Value *IntDst = I;
				Value *FPSrc = I->getArgOperand(0);
				Value *ExBehavior = I->getArgOperand(1);

				auto *t = cast<IntegerType>(I->getType());
				APInt IntMaxAP(APInt::getSignedMinValue(DL->getTypeStoreSize(t) * 8));
				APFloat FPMaxAP((double)0);
				FPMaxAP.convertFromAPInt(IntMaxAP, false, APFloat::rmNearestTiesToEven);
				Constant *FPMaxSIntV = ConstantFP::get(FPSrc->getType(), FPMaxAP);
				Constant *IntMaxSIntV = ConstantInt::get(IntDst->getType(), IntMaxAP);

				/* TODO: use a new strict version of fcmp */
				andrew.w.kaylor (Done): This description is wrong.
				FCmpInst *FPCompare = new FCmpInst(I, FCmpInst::FCMP_ULT, FPSrc, FPMaxSIntV,
				"within.sint.range");

				TerminatorInst *ThenTerm, *ElseTerm;
				SplitBlockAndInsertIfThenElse(FPCompare, I, &ThenTerm, &ElseTerm);

				BasicBlock *StraightConvBB = ThenTerm->getParent();
				BasicBlock *NotInRangeBB = ElseTerm->getParent();
				BasicBlock *ExitBB = I->getParent();
				Function *F2SI = Intrinsic::getDeclaration(
				ExitBB->getModule(), Intrinsic::experimental_constrained_fptosi,
				{I->getType(), FPSrc->getType()});
				Function *FSUB = Intrinsic::getDeclaration(
				ExitBB->getModule(), Intrinsic::experimental_constrained_fsub,
				FPSrc->getType());

				MDString *SubRoundingMDS = MDString::get(Context, "round.dynamic");
				Value *SubRounding = MetadataAsValue::get(Context, SubRoundingMDS);

				CallInst *ThenSIntCall;
				CallInst *ElseSIntCall;
				Instruction *BiasedFPSrc;
				Instruction *ElseSIntResult;

				ThenSIntCall = CallInst::Create(F2SI, {FPSrc, ExBehavior}, "", ThenTerm);
				BranchInst::Create(ExitBB, StraightConvBB);
				ThenTerm->eraseFromParent();

				BiasedFPSrc = CallInst::Create(
				FSUB, {FPSrc, FPMaxSIntV, SubRounding, ExBehavior}, "", ElseTerm);
				ElseSIntCall =
				CallInst::Create(F2SI, {BiasedFPSrc, ExBehavior}, "", ElseTerm);
				ElseSIntResult = BinaryOperator::Create(BinaryOperator::Xor, ElseSIntCall,
				IntMaxSIntV, "", ElseTerm);
				BranchInst::Create(ExitBB, NotInRangeBB);
				ElseTerm->eraseFromParent();

				PHINode *PN = PHINode::Create(ElseSIntResult->getType(), 2, "", I);
				PN->addIncoming(ThenSIntCall, ThenSIntCall->getParent());
				PN->addIncoming(ElseSIntResult, ElseSIntResult->getParent());
				I->replaceAllUsesWith(PN);
				I->eraseFromParent();

				++NumStrictFPOps;
				}
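
The four-block expansion built by replaceConstrainedFPToUI can be sketched in plain C++ for the double-to-i32 case used in the comment above. This is only an illustration under stated assumptions: fpToUI32ViaSigned is a hypothetical name, not part of the patch, and the input is assumed to be finite and within [0, 2^32) so that every cast is well defined. The patch itself compares with FCMP_ULT, so a NaN would take the direct-conversion branch; that case is deliberately excluded here because converting NaN to an integer is undefined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: unsigned conversion built only from a signed one.
std::uint32_t fpToUI32ViaSigned(double X) {
  // Assumption of this sketch: X is finite and representable as uint32_t.
  assert(X >= 0.0 && X < 4294967296.0);

  const double SignedLimit = 2147483648.0; // 2^31, bit pattern 0x41E0000000000000

  if (X < SignedLimit) {
    // Block #2: value fits in the signed range, convert directly.
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(X));
  }

  // Block #3: bias the value down by 2^31 (the fsub), convert as signed,
  // then restore the top bit with xor, as the IR example does.
  std::int32_t Biased = static_cast<std::int32_t>(X - SignedLimit);
  return static_cast<std::uint32_t>(Biased) ^ 0x80000000u;
}
```

For example, 42.1 takes the direct branch, while 3000000000.0 is biased to 852516352 before the top bit is restored.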

				void StrictFPPass::replaceVectorConstrainedFPToUI(IntrinsicInst *I) {

				LLVMContext &C = I->getContext();

				Value *IntVDst = I;
				Value *FPVSrc = I->getArgOperand(0);
				Value *ExBehavior = I->getArgOperand(1);

				auto *FPVecT = cast<VectorType>(FPVSrc->getType());
				auto *FPElemT = FPVecT->getElementType();

				auto *IntVecT = cast<VectorType>(IntVDst->getType());
				auto *IntElemT = IntVecT->getElementType();

				Function *F2UI = Intrinsic::getDeclaration(
				I->getModule(), Intrinsic::experimental_constrained_fptoui,
				{IntElemT, FPElemT});

				IntVDst =
				UndefValue::get(VectorType::get(IntElemT, FPVecT->getNumElements()));

				for (uint64_t i = 0; i < FPVecT->getNumElements(); i++) {
				Value *IndexV = ConstantInt::getSigned(Type::getInt32Ty(C), i);

				Instruction *ScalarFP = ExtractElementInst::Create(FPVSrc, IndexV, "", I);

				IntrinsicInst *ScalarUInt = dyn_cast<IntrinsicInst>(
				CallInst::Create(F2UI, {ScalarFP, ExBehavior}, "", I));

				inspectIntrinsicCall(ScalarUInt);

				IntVDst = InsertElementInst::Create(IntVDst, ScalarUInt, IndexV, "", I);
				}

				I->replaceAllUsesWith(IntVDst);
				I->eraseFromParent();

				++NumStrictFPOps;
				}
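
The scalarization done by replaceVectorConstrainedFPToUI follows an extract, convert, insert pattern per lane. A minimal sketch of that pattern, assuming std::array as a stand-in for an IR vector and a hypothetical name scalarizeFPToUI, might look like this. The plain cast stands in for the scalar fptoui call that the pass then expands separately, and lanes are assumed to be within [0, 2^32).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: convert each lane separately and rebuild the vector.
template <std::size_t N>
std::array<std::uint32_t, N> scalarizeFPToUI(const std::array<double, N> &Src) {
  std::array<std::uint32_t, N> Dst{}; // stands in for the initial undef vector
  for (std::size_t I = 0; I < N; ++I) {
    double Lane = Src[I];                       // extractelement
    assert(Lane >= 0.0 && Lane < 4294967296.0); // keep the cast well defined
    Dst[I] = static_cast<std::uint32_t>(Lane);  // scalar fptoui, then insertelement
  }
  return Dst;
}
```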

				} // End anonymous namespace

				char StrictFPPass::ID = 0;
				INITIALIZE_PASS(StrictFPPass, DEBUG_TYPE, "Strict floating point transforms",
				false, false)

				FunctionPass *llvm::createStrictFPPass() { return new StrictFPPass(); }

lib/CodeGen/TargetPassConfig.cpp

Show First 20 Lines • Show All 163 Lines • ▼ Show 20 Lines	static cl::opt<CFLAAType> UseCFLAA(
cl::values(clEnumValN(CFLAAType::None, "none", "Disable CFL-AA"),		cl::values(clEnumValN(CFLAAType::None, "none", "Disable CFL-AA"),
clEnumValN(CFLAAType::Steensgaard, "steens",		clEnumValN(CFLAAType::Steensgaard, "steens",
"Enable unification-based CFL-AA"),		"Enable unification-based CFL-AA"),
clEnumValN(CFLAAType::Andersen, "anders",		clEnumValN(CFLAAType::Andersen, "anders",
"Enable inclusion-based CFL-AA"),		"Enable inclusion-based CFL-AA"),
clEnumValN(CFLAAType::Both, "both",		clEnumValN(CFLAAType::Both, "both",
"Enable both variants of CFL-AA")));		"Enable both variants of CFL-AA")));

		static cl::opt<bool>
		StrictFP("strict-fp-transforms", cl::init(true), cl::Hidden,
		cl::desc("Enable transformations needed for strict FP."));

/// Option names for limiting the codegen pipeline.		/// Option names for limiting the codegen pipeline.
/// Those are used in error reporting and we didn't want		/// Those are used in error reporting and we didn't want
/// to duplicate their names all over the place.		/// to duplicate their names all over the place.
const char *StartAfterOptName = "start-after";		const char *StartAfterOptName = "start-after";
const char *StartBeforeOptName = "start-before";		const char *StartBeforeOptName = "start-before";
const char *StopAfterOptName = "stop-after";		const char *StopAfterOptName = "stop-after";
const char *StopBeforeOptName = "stop-before";		const char *StopBeforeOptName = "stop-before";

[381 lines not shown]
	if (Verify)
PM->add(createMachineVerifierPass(Banner));		PM->add(createMachineVerifierPass(Banner));
}		}

/// Add common target configurable passes that perform LLVM IR to IR transforms		/// Add common target configurable passes that perform LLVM IR to IR transforms
/// following machine independent optimization.		/// following machine independent optimization.
void TargetPassConfig::addIRPasses() {		void TargetPassConfig::addIRPasses() {
switch (UseCFLAA) {		switch (UseCFLAA) {
case CFLAAType::Steensgaard:		case CFLAAType::Steensgaard:
addPass(createCFLSteensAAWrapperPass());		addPass(createCFLSteensAAWrapperPass());
		andrew.w.kaylor (Done): This doesn't seem like the right place to do this. Should it be happening much later, like around CodeGenPrepare?
break;		break;
case CFLAAType::Andersen:		case CFLAAType::Andersen:
addPass(createCFLAndersAAWrapperPass());		addPass(createCFLAndersAAWrapperPass());
break;		break;
case CFLAAType::Both:		case CFLAAType::Both:
addPass(createCFLAndersAAWrapperPass());		addPass(createCFLAndersAAWrapperPass());
addPass(createCFLSteensAAWrapperPass());		addPass(createCFLSteensAAWrapperPass());
break;		break;
[102 lines not shown]
}		}

/// Add pass to prepare the LLVM IR for code generation. This should be done		/// Add pass to prepare the LLVM IR for code generation. This should be done
/// before exception handling preparation passes.		/// before exception handling preparation passes.
void TargetPassConfig::addCodeGenPrepare() {		void TargetPassConfig::addCodeGenPrepare() {
if (getOptLevel() != CodeGenOpt::None && !DisableCGP)		if (getOptLevel() != CodeGenOpt::None && !DisableCGP)
addPass(createCodeGenPreparePass());		addPass(createCodeGenPreparePass());
addPass(createRewriteSymbolsPass());		addPass(createRewriteSymbolsPass());

		// Experimental pass with transforms needed for strict FP.
		if (StrictFP)
		addPass(createStrictFPPass());
}		}

/// Add common passes that perform LLVM IR to IR transforms in preparation for		/// Add common passes that perform LLVM IR to IR transforms in preparation for
/// instruction selection.		/// instruction selection.
void TargetPassConfig::addISelPrepare() {		void TargetPassConfig::addISelPrepare() {
addPreISel();		addPreISel();

// Force codegen to run according to the callgraph.		// Force codegen to run according to the callgraph.
[473 lines not shown]

lib/IR/IntrinsicInst.cpp

[136 lines not shown]
	return StringSwitch<ExceptionBehavior>(ExceptionArg)
.Case("fpexcept.strict", ebStrict)		.Case("fpexcept.strict", ebStrict)
.Default(ebInvalid);		.Default(ebInvalid);
}		}

bool ConstrainedFPIntrinsic::isUnaryOp() const {		bool ConstrainedFPIntrinsic::isUnaryOp() const {
switch (getIntrinsicID()) {		switch (getIntrinsicID()) {
default:		default:
return false;		return false;
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
[15 lines not shown]

lib/IR/Verifier.cpp

[4,068 lines not shown]
	Assert(isa<ConstantInt>(CS.getArgOperand(1)),
CS);		CS);
break;		break;
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
[390 lines not shown]
	static DISubprogram *getSubprogram(Metadata *LocalScope) {

// Just return null; broken scope chains are checked elsewhere.		// Just return null; broken scope chains are checked elsewhere.
assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");		assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");
return nullptr;		return nullptr;
}		}

void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {		void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
unsigned NumOperands = FPI.getNumArgOperands();		unsigned NumOperands = FPI.getNumArgOperands();
Assert(((NumOperands == 5 && FPI.isTernaryOp()) ||		bool HasExceptionMD = false;
(NumOperands == 3 && FPI.isUnaryOp()) || (NumOperands == 4)),		bool HasRoundingMD = false;
		switch (FPI.getIntrinsicID())
		{
		case Intrinsic::experimental_constrained_sqrt:
		case Intrinsic::experimental_constrained_sin:
		case Intrinsic::experimental_constrained_cos:
		case Intrinsic::experimental_constrained_exp:
		case Intrinsic::experimental_constrained_exp2:
		case Intrinsic::experimental_constrained_log:
		case Intrinsic::experimental_constrained_log10:
		case Intrinsic::experimental_constrained_log2:
		case Intrinsic::experimental_constrained_rint:
		case Intrinsic::experimental_constrained_nearbyint:
		Assert((NumOperands == 3),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		HasRoundingMD = true;
		break;

		case Intrinsic::experimental_constrained_fma:
		Assert((NumOperands == 5),
		andrew.w.kaylor (Done): Since you've broken this out into a switch statement, can you separate the unary and ternary ops and give them each the appropriate assert? I think that would be much more readable than this compound check (which I realize was my creation).
"invalid arguments for constrained FP intrinsic", &FPI);		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		HasRoundingMD = true;
		break;

		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_pow:
		case Intrinsic::experimental_constrained_powi:
		Assert((NumOperands == 4),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		HasRoundingMD = true;
		break;

		case Intrinsic::experimental_constrained_frem:
		Assert((NumOperands == 3),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		break;

		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui: {
		Assert((NumOperands == 2),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		kbarton (Not Done): Could you use isFPOrFPVectorTy here to check both conditions, and then remove the assert in the else below?

		kbarton (Not Done): Can you use a dyn_cast here instead? if (auto *OperandT = dyn_cast<VectorType>(Operand->getType())) { do vector stuff }
		kpn (Author, Done): Yes, that's much more concise.
		Value *Operand = FPI.getArgOperand(0);
		uint64_t NumSrcElem = 0;
		andrew.w.kaylor (Done): The default should probably always be an error.
		if (Operand->getType()->isVectorTy()) {
		auto *OperandT = cast<VectorType>(Operand->getType());
		NumSrcElem = OperandT->getNumElements();
		Assert(OperandT->getVectorElementType()->isFloatingPointTy(),
		"Intrinsic first argument vector must be floating point",
		&FPI);
		}
		else
		Assert(Operand->getType()->isFloatingPointTy(),
		"Intrinsic first argument must be floating point",
		andrew.w.kaylor (Not Done): How about this? int RoundingIdx = (HasExceptionMD ? NumOperands - 2 : NumOperands - 1); On the other hand, are we ever going to have intrinsics that have a rounding mode but not exception behavior?
		kpn (Author, Not Done): Done. I don't know if we'll have a rounding mode but not exceptions. I doubt it, but I can't say for certain.
		&FPI);

		Operand = &FPI;
		kbarton (Done): Similarly, could you use isIntOrIntVectorTy here and remove the assert in the if/else below?
		Assert((NumSrcElem > 0) == Operand->getType()->isVectorTy(),
		"Intrinsic first argument and result disagree on vector use",
		&FPI);
		if (Operand->getType()->isVectorTy()) {
		auto *OperandT = cast<VectorType>(Operand->getType());
		Assert(NumSrcElem == OperandT->getNumElements(),
		"Intrinsic first argument and result vector lengths must be equal",
		&FPI);
		Assert(OperandT->getVectorElementType()->isIntegerTy(),
		"Intrinsic result vector must be integer",
		&FPI);
		}
		else
		Assert(Operand->getType()->isIntegerTy(),
		"Intrinsic result must be an integer",
		&FPI);
		}
		break;

		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext: {
		Assert((NumOperands == 2),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		kbarton (Not Done): Same comment about commoning the asserts here.
		kpn (Author, Done): The fptrunc and fpext changes were split out into another ticket and updated there. The code committed there is more concise, I suspect in the way you intended.

		Value *Operand = FPI.getArgOperand(0);
		uint64_t NumSrcElem = 0;
		if (Operand->getType()->isVectorTy()) {
		auto *OperandT = cast<VectorType>(Operand->getType());
		NumSrcElem = OperandT->getNumElements();
		Assert(OperandT->getVectorElementType()->isFloatingPointTy(),
		"Intrinsic first argument vector must be floating point",
		&FPI);
		}
		else
		Assert(Operand->getType()->isFloatingPointTy(),
		"Intrinsic first argument must be floating point",
		&FPI);

		Operand = &FPI;
		Assert((NumSrcElem > 0) == Operand->getType()->isVectorTy(),
		"Intrinsic first argument and result disagree on vector use",
		&FPI);
		if (Operand->getType()->isVectorTy()) {
		auto *OperandT = cast<VectorType>(Operand->getType());
		Assert(NumSrcElem == OperandT->getNumElements(),
		"Intrinsic first argument and result vector lengths must be equal",
		&FPI);
		Assert(OperandT->getVectorElementType()->isFloatingPointTy(),
		"Intrinsic result vector must be floating point",
		&FPI);
		}
		else
		Assert(Operand->getType()->isFloatingPointTy(),
		"Intrinsic result must be an floating point",
		&FPI);
		}
		break;

		default:
		llvm_unreachable("Invalid constrained FP intrinsic!");
		}

		if (HasExceptionMD) {
Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-1)),		Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-1)),
"invalid exception behavior argument", &FPI);		"invalid exception behavior argument", &FPI);
Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-2)),		Assert(FPI.getExceptionBehavior() != ConstrainedFPIntrinsic::ebInvalid,
		"invalid exception behavior argument", &FPI);
		kbarton (Not Done): It looks like getArgOperand expects an unsigned. Is there a specific reason you are using an int here? If it's possible that RoundingIdx is negative, then you should probably add an assert here before passing it to getArgOperand.
		}
		if (HasRoundingMD) {
		int RoundingIdx = (HasExceptionMD ? NumOperands - 2 : NumOperands - 1);
		Assert(isa<MetadataAsValue>(FPI.getArgOperand(RoundingIdx)),
"invalid rounding mode argument", &FPI);		"invalid rounding mode argument", &FPI);
Assert(FPI.getRoundingMode() != ConstrainedFPIntrinsic::rmInvalid,		Assert(FPI.getRoundingMode() != ConstrainedFPIntrinsic::rmInvalid,
"invalid rounding mode argument", &FPI);		"invalid rounding mode argument", &FPI);
Assert(FPI.getExceptionBehavior() != ConstrainedFPIntrinsic::ebInvalid,		}
"invalid exception behavior argument", &FPI);
}		}
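
The review question about `int RoundingIdx` comes down to how the metadata operands are laid out: the exception-behavior metadata, when present, is the last operand, and the rounding mode sits immediately before it. As a hedged sketch, and not the code that was committed, the index could be computed in unsigned arithmetic with an explicit guard against underflow. `roundingOperandIndex` is a hypothetical helper name.

```cpp
// Hypothetical helper: computes the operand index of the rounding-mode
// metadata. Returns false (leaving Idx untouched) when the operand list is
// too short, so the subtraction below can never wrap around.
inline bool roundingOperandIndex(unsigned NumOperands, bool HasExceptionMD,
                                 unsigned &Idx) {
  // With exception metadata in the last slot, the rounding mode is the
  // second-to-last operand; otherwise it is the last one.
  const unsigned FromEnd = HasExceptionMD ? 2u : 1u;
  if (NumOperands < FromEnd)
    return false;
  Idx = NumOperands - FromEnd;
  return true;
}
```

For a unary intrinsic such as sqrt (three operands) this yields index 1, and for fma (five operands) index 3, which matches the operand counts asserted in the switch above.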

void Verifier::visitDbgIntrinsic(StringRef Kind, DbgVariableIntrinsic &DII) {		void Verifier::visitDbgIntrinsic(StringRef Kind, DbgVariableIntrinsic &DII) {
auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();		auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();
AssertDI(isa<ValueAsMetadata>(MD) ||		AssertDI(isa<ValueAsMetadata>(MD) ||
(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),		(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),
"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);		"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);
AssertDI(isa<DILocalVariable>(DII.getRawVariable()),		AssertDI(isa<DILocalVariable>(DII.getRawVariable()),
[624 lines not shown]
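
The reviewers' suggestion for the fptosi/fptoui case in lib/IR/Verifier.cpp is to check the element kind once for scalars and vectors alike (what `isFPOrFPVectorTy` and `isIntOrIntVectorTy` do on an LLVM `Type`), and then compare shapes. The toy model below only illustrates that logic; `ToyType` and `isValidFPToIntSignature` are hypothetical and do not use the LLVM API.

```cpp
// Toy model of a type. A scalar has NumElems == 0; a vector has NumElems > 0.
struct ToyType {
  bool ElemIsFP;     // scalar or element kind is floating point
  bool ElemIsInt;    // scalar or element kind is integer
  unsigned NumElems; // 0 for scalars

  bool isVector() const { return NumElems > 0; }
};

// Hypothetical check modeled on the review suggestion: one element-kind test
// per side, then the two shape checks from the original asserts.
inline bool isValidFPToIntSignature(const ToyType &Src, const ToyType &Dst) {
  if (!Src.ElemIsFP)
    return false; // first argument must be FP or a vector of FP
  if (!Dst.ElemIsInt)
    return false; // result must be integer or a vector of integers
  if (Src.isVector() != Dst.isVector())
    return false; // argument and result disagree on vector use
  return Src.NumElems == Dst.NumElems; // vector lengths must be equal
}
```

Collapsing the scalar and vector branches this way removes the duplicated if/else asserts while keeping each failure condition distinct.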

test/CodeGen/AArch64/O0-pipeline.ll

	[21 lines not shown]
	; CHECK-NEXT: Lower Garbage Collection Instructions			; CHECK-NEXT: Lower Garbage Collection Instructions
	; CHECK-NEXT: Shadow Stack GC Lowering			; CHECK-NEXT: Shadow Stack GC Lowering
	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Rewrite Symbols			; CHECK-NEXT: Rewrite Symbols
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Strict floating point transforms
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Exception handling preparation			; CHECK-NEXT: Exception handling preparation
	; CHECK-NEXT: Safe Stack instrumentation pass			; CHECK-NEXT: Safe Stack instrumentation pass
	; CHECK-NEXT: Insert stack protectors			; CHECK-NEXT: Insert stack protectors
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: IRTranslator			; CHECK-NEXT: IRTranslator
	; CHECK-NEXT: Legalizer			; CHECK-NEXT: Legalizer
	; CHECK-NEXT: RegBankSelect			; CHECK-NEXT: RegBankSelect
	[30 lines not shown]

test/CodeGen/AArch64/O3-pipeline.ll

	[46 lines not shown]
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Interleaved Access Pass			; CHECK-NEXT: Interleaved Access Pass
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: CodeGen Prepare			; CHECK-NEXT: CodeGen Prepare
	; CHECK-NEXT: Rewrite Symbols			; CHECK-NEXT: Rewrite Symbols
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Strict floating point transforms
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Exception handling preparation			; CHECK-NEXT: Exception handling preparation
	; CHECK-NEXT: AArch64 Promote Constant			; CHECK-NEXT: AArch64 Promote Constant
	; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()			; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Merge internal globals			; CHECK-NEXT: Merge internal globals
	; CHECK-NEXT: Safe Stack instrumentation pass			; CHECK-NEXT: Safe Stack instrumentation pass
	; CHECK-NEXT: Insert stack protectors			; CHECK-NEXT: Insert stack protectors
	[107 lines not shown]

test/CodeGen/X86/O0-pipeline.ll

	[22 lines not shown]
	; CHECK-NEXT: Shadow Stack GC Lowering			; CHECK-NEXT: Shadow Stack GC Lowering
	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	; CHECK-NEXT: Expand reduction intrinsics			; CHECK-NEXT: Expand reduction intrinsics
	; CHECK-NEXT: Expand indirectbr instructions			; CHECK-NEXT: Expand indirectbr instructions
	; CHECK-NEXT: Rewrite Symbols			; CHECK-NEXT: Rewrite Symbols
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Strict floating point transforms
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Exception handling preparation			; CHECK-NEXT: Exception handling preparation
	; CHECK-NEXT: Safe Stack instrumentation pass			; CHECK-NEXT: Safe Stack instrumentation pass
	; CHECK-NEXT: Insert stack protectors			; CHECK-NEXT: Insert stack protectors
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: X86 DAG->DAG Instruction Selection			; CHECK-NEXT: X86 DAG->DAG Instruction Selection
	; CHECK-NEXT: X86 PIC Global Base Reg Initialization			; CHECK-NEXT: X86 PIC Global Base Reg Initialization
	; CHECK-NEXT: Expand ISel Pseudo-instructions			; CHECK-NEXT: Expand ISel Pseudo-instructions
	[34 lines not shown]

test/CodeGen/X86/O3-pipeline.ll

	[42 lines not shown]
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Interleaved Access Pass			; CHECK-NEXT: Interleaved Access Pass
	; CHECK-NEXT: Expand indirectbr instructions			; CHECK-NEXT: Expand indirectbr instructions
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: CodeGen Prepare			; CHECK-NEXT: CodeGen Prepare
	; CHECK-NEXT: Rewrite Symbols			; CHECK-NEXT: Rewrite Symbols
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Strict floating point transforms
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Exception handling preparation			; CHECK-NEXT: Exception handling preparation
	; CHECK-NEXT: Safe Stack instrumentation pass			; CHECK-NEXT: Safe Stack instrumentation pass
	; CHECK-NEXT: Insert stack protectors			; CHECK-NEXT: Insert stack protectors
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Function Alias Analysis Results			; CHECK-NEXT: Function Alias Analysis Results
	[113 lines not shown]

test/CodeGen/X86/fp-con-fptoui.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -O3 -mtriple=x86_64-pc-linux < %s | FileCheck %s

				; Verify that fptoui(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f1:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xorl %eax, %eax
				; CHECK-NEXT: testb %al, %al
				; CHECK-NEXT: jne .LBB0_2
				; CHECK-NEXT: # %bb.1:
				; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
				; CHECK-NEXT: retq
				; CHECK-NEXT: .LBB0_2:
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: subsd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; CHECK-NEXT: retq
				define zeroext i32 @f1() {
				entry:
				%result = call zeroext i32 @llvm.experimental.constrained.fptoui.i32.f64(
				double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptoui(variable) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f2:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: ucomisd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: jae .LBB1_2
				; CHECK-NEXT: # %bb.1:
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: retq
				; CHECK-NEXT: .LBB1_2:
				; CHECK-NEXT: subsd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; CHECK-NEXT: retq
				define zeroext i32 @f2(double %D) {
				entry:
				%result = call zeroext i32 @llvm.experimental.constrained.fptoui.i32.f64(
				double %D,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptoui(<42.1, 37.0>) isn't simplified when the rounding
				; mode is unknown.
				; CHECK-LABEL: f3:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: xorl %eax, %eax
				; CHECK-NEXT: testb %al, %al
				; CHECK-NEXT: jne .LBB2_2
				; CHECK-NEXT: # %bb.1:
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: jmp .LBB2_3
				; CHECK-NEXT: .LBB2_2:
				; CHECK-NEXT: subsd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; CHECK-NEXT: .LBB2_3:
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
				; CHECK-NEXT: xorl %eax, %eax
				; CHECK-NEXT: testb %al, %al
				; CHECK-NEXT: jne .LBB2_5
				; CHECK-NEXT: # %bb.4:
				; CHECK-NEXT: cvttsd2si %xmm1, %eax
				; CHECK-NEXT: movq %rax, %xmm1
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; CHECK-NEXT: retq
				; CHECK-NEXT: .LBB2_5:
				; CHECK-NEXT: subsd {{.*}}(%rip), %xmm1
				; CHECK-NEXT: cvttsd2si %xmm1, %eax
				; CHECK-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; CHECK-NEXT: movq %rax, %xmm1
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; CHECK-NEXT: retq
				define <2 x i32> @f3() {
				entry:
				%result = call <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f64(
				<2 x double><double 42.1, double 37.0>,
				metadata !"fpexcept.strict")
				ret <2 x i32> %result
				}

				; Verify that fptoui(<variable, variable>) isn't simplified when the
				; rounding mode is unknown.
				; CHECK-LABEL: f4:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: ucomisd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: jae .LBB3_2
				; CHECK-NEXT: # %bb.1:
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: movd %eax, %xmm1
				; CHECK-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
				; CHECK-NEXT: ucomisd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: jb .LBB3_4
				; CHECK-NEXT: .LBB3_5:
				; CHECK-NEXT: movsd {{.*#+}} xmm2 = mem[0],zero
				; CHECK-NEXT: subsd %xmm2, %xmm0
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm0[0]
				; CHECK-NEXT: movdqa %xmm1, %xmm0
				; CHECK-NEXT: retq
				; CHECK-NEXT: .LBB3_2:
				; CHECK-NEXT: movsd {{.*#+}} xmm1 = mem[0],zero
				; CHECK-NEXT: movapd %xmm0, %xmm2
				; CHECK-NEXT: subsd %xmm1, %xmm2
				; CHECK-NEXT: cvttsd2si %xmm2, %eax
				; CHECK-NEXT: xorl $-2147483648, %eax # imm = 0x80000000
				; CHECK-NEXT: movd %eax, %xmm1
				; CHECK-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
				; CHECK-NEXT: ucomisd {{.*}}(%rip), %xmm0
				; CHECK-NEXT: jae .LBB3_5
				; CHECK-NEXT: .LBB3_4:
				; CHECK-NEXT: cvttsd2si %xmm0, %eax
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm0[0]
				; CHECK-NEXT: movdqa %xmm1, %xmm0
				; CHECK-NEXT: retq
				define <2 x i32> @f4(<2 x double> %D) {
				entry:
				%result = call <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f64(
				<2 x double> %D,
				metadata !"fpexcept.strict")
				ret <2 x i32> %result
				}



				@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
				declare zeroext i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)
				declare <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f64(<2 x double>, metadata)
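
The CHECK lines in fp-con-fptoui.ll show the branchy x86 sequence for an unsigned conversion built from the signed `cvttsd2si`: compare against a constant, convert directly when the value is small enough, otherwise subtract, convert, and flip the top bit with `xorl $-2147483648`. A minimal C++ sketch of that idea follows. It assumes the compared constant is 2^31, which the CHECK patterns do not show explicitly, and `fpToUI32ViaSigned` is a hypothetical name.

```cpp
#include <cstdint>

// Sketch of an unsigned 32-bit conversion built from a signed one.
// Assumes X lies in [0, 2^32); other inputs are outside the scope of
// this illustration.
inline std::uint32_t fpToUI32ViaSigned(double X) {
  const double TwoPow31 = 2147483648.0;
  if (X < TwoPow31) {
    // Fits in a signed 32-bit integer: truncate directly.
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(X));
  }
  // Shift the value into signed range, truncate, then restore the top bit.
  const std::int32_t Low = static_cast<std::int32_t>(X - TwoPow31);
  return static_cast<std::uint32_t>(Low) ^ 0x80000000u;
}
```

Because the sketch branches, only one of the two conversions executes for a given input, which is the property the branches in the CHECK lines preserve.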

test/CodeGen/X86/fp-intrinsics.ll

[268 lines not shown]
	%result = call double @llvm.experimental.constrained.fma.f64(
double 42.1,		double 42.1,
double 42.1,		double 42.1,
double 42.1,		double 42.1,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret double %result		ret double %result
}		}

		; Verify that fptosi(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f20
		; COMMON: cvttsd2si
		andrew.w.kaylor (Done): Can you check the entire expanded IR pattern here? It might be worth having a separate test that verifies the StrictFP pass in isolation.
		cameron.mcinally (Not Done): Same here as the vector version below. Do we want the truncating convert? That was surprising to me.
		kpn (Author, Not Done): The strict intrinsic results in the same instruction as the regular fptosi instruction. Having them be the same means the mutation from strict to non-strict is working correctly. And, yes, rounding towards zero is correct. That's why there's no rounding metadata.
		define i32 @f20() {
		entry:
		%result = call i32 @llvm.experimental.constrained.fptosi.i32.f64(double 42.1,
		metadata !"fpexcept.strict")
		ret i32 %result
		}

		; Verify that round(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f21
		; COMMON: cvtsd2ss
		define float @f21() {
		entry:
		%result = call float @llvm.experimental.constrained.fptrunc.f32.f64(
		double 42.1,
		metadata !"fpexcept.strict")
		ret float %result
		}

		; Verify that fpext(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f22
		; COMMON: cvtss2sd
		define double @f22(float %x) {
		entry:
		%result = call double @llvm.experimental.constrained.fpext.f64.f32(float %x,
		metadata !"fpexcept.strict")
		ret double %result
		}

@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"		@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)
declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)
declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
		declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
		andrew.w.kaylor (inline comment, Done): I believe your decorated intrinsic names are incorrect in all of these cases. The name needs to specify both return type and argument type. Take a look at what opt produces if you give it these names as inputs.
		cameron.mcinally (inline comment, Not Done): Do we want unsigned convert tests too? fptoui? I see that there are SystemZ tests to cover them, so maybe that's sufficient? Just pointing this out so others can see.
		kpn (author, inline comment, Not Done): The SystemZ tests target hardware new enough to lower to a single instruction. The tests for fptoui on x86 use the default lowering, but the default lowering is disallowed since it does speculative execution and traps. I had support for fixing that in this patch but was asked to split it out into another patch. So there is no constrained fptoui test using the default lowering in this patch.
		declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata)
		declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
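
The "speculative execution and traps" concern above can be illustrated with a small model. This is only a sketch under an assumption the review does not spell out: that the default fptoui expansion evaluates two signed conversions (one on the input, one on the input minus 2^63) and then selects a result. All names below (`Flags`, `fptosi64`, `fptoui64_select`, `fptoui64_branch`) are hypothetical and exist only for this illustration; Python is used as a stand-in for the real lowering.

```python
import math

SIGN_BIT = 1 << 63
MASK64 = (1 << 64) - 1
TWO_63 = float(SIGN_BIT)  # 2**63 is exactly representable as a double


class Flags:
    """Minimal stand-in for the FP status word: tracks only 'invalid'."""

    def __init__(self):
        self.invalid = False


def fptosi64(x, flags):
    """Model of a signed 64-bit truncating convert (e.g. cvttsd2si)."""
    if math.isnan(x) or not (-TWO_63 <= x < TWO_63):
        flags.invalid = True   # out of range: invalid-operation exception
        return -SIGN_BIT       # x86 "integer indefinite" result
    return math.trunc(x)


def fptoui64_select(x, flags):
    """Assumed select-based expansion: both conversions always execute."""
    small = fptosi64(x, flags)
    large = (fptosi64(x - TWO_63, flags) ^ SIGN_BIT) & MASK64
    return small if x < TWO_63 else large


def fptoui64_branch(x, flags):
    """Branching alternative: only the needed conversion executes."""
    if x < TWO_63:
        return fptosi64(x, flags)
    return (fptosi64(x - TWO_63, flags) ^ SIGN_BIT) & MASK64
```

For an input such as 2^63 + 2048, both functions return the same integer, but in this model the select-based version also sets the invalid flag because the unused signed conversion was executed on an out-of-range value. Under strict exception semantics that spurious exception is the problem.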

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

[3,406 lines not shown]
entry:
%nearby = call <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(
<4 x double> <double 42.1, double 42.2,
double 42.3, double 42.4>,
metadata !"round.dynamic",
metadata !"fpexcept.strict")
ret <4 x double> %nearby
}

		define <2 x i32> @constrained_vector_fptosi() {
		; NO-FMA-LABEL: constrained_vector_fptosi:
		; NO-FMA: # %bb.0: # %entry
		; NO-FMA-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; NO-FMA-NEXT: movq %rax, %xmm1
		; NO-FMA-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; NO-FMA-NEXT: movq %rax, %xmm0
		; NO-FMA-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; NO-FMA-NEXT: retq
		;
		; HAS-FMA-LABEL: constrained_vector_fptosi:
		; HAS-FMA: # %bb.0: # %entry
		; HAS-FMA-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; HAS-FMA-NEXT: vmovq %rax, %xmm0
		; HAS-FMA-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; HAS-FMA-NEXT: vmovq %rax, %xmm1
		; HAS-FMA-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; HAS-FMA-NEXT: retq
		entry:
		%result = call <2 x i32> @llvm.experimental.constrained.fptosi.v2i32.v2f64(
		<2 x double><double 42.1, double 42.0>,
		metadata !"fpexcept.strict")
		ret <2 x i32> %result
		}

		define <4 x float> @constrained_vector_fptrunc(<4 x double> %D) {
		; NO-FMA-LABEL: constrained_vector_fptrunc:
		; NO-FMA: # %bb.0: # %entry
		; NO-FMA-NEXT: cvtpd2ps %xmm1, %xmm1
		; NO-FMA-NEXT: cvtpd2ps %xmm0, %xmm0
		; NO-FMA-NEXT: unpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; NO-FMA-NEXT: retq
		;
		; HAS-FMA-LABEL: constrained_vector_fptrunc:
		; HAS-FMA: # %bb.0: # %entry
		; HAS-FMA-NEXT: vcvtpd2ps %ymm0, %xmm0
		; HAS-FMA-NEXT: vzeroupper
		; HAS-FMA-NEXT: retq
		entry:
		%result = call <4 x float> @llvm.experimental.constrained.fptrunc.v4f32.v4f64(
		<4 x double> %D,
		metadata !"fpexcept.strict")
		ret <4 x float> %result
		}

		define <2 x double> @constrained_vector_fpext(<2 x float> %D) {
		; NO-FMA-LABEL: constrained_vector_fpext:
		; NO-FMA: # %bb.0: # %entry
		; NO-FMA-NEXT: cvtss2sd %xmm0, %xmm1
		; NO-FMA-NEXT: shufps {{.*#+}} xmm0 = xmm0[1,1,2,3]
		; NO-FMA-NEXT: cvtss2sd %xmm0, %xmm0
		; NO-FMA-NEXT: movlhps {{.*#+}} xmm1 = xmm1[0],xmm0[0]
		; NO-FMA-NEXT: movaps %xmm1, %xmm0
		; NO-FMA-NEXT: retq
		;
		; HAS-FMA-LABEL: constrained_vector_fpext:
		; HAS-FMA: # %bb.0: # %entry
		; HAS-FMA-NEXT: vcvtps2pd %xmm0, %ymm0
		; HAS-FMA-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0
		; HAS-FMA-NEXT: vzeroupper
		; HAS-FMA-NEXT: retq
		entry:
		%result = call <2 x double> @llvm.experimental.constrained.fpext.v2f64.v2f32(
		<2 x float> %D,
		metadata !"fpexcept.strict")
		ret <2 x double> %result
		}


; Single width declarations
declare <2 x double> @llvm.experimental.constrained.fdiv.v2f64(<2 x double>, <2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.fmul.v2f64(<2 x double>, <2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.fadd.v2f64(<2 x double>, <2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.fsub.v2f64(<2 x double>, <2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.fma.v2f64(<2 x double>, <2 x double>, <2 x double>, metadata, metadata)
declare <4 x float> @llvm.experimental.constrained.fma.v4f32(<4 x float>, <4 x float>, <4 x float>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.sqrt.v2f64(<2 x double>, metadata, metadata)
[78 lines not shown]
declare <4 x double> @llvm.experimental.constrained.cos.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.exp.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.exp2.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.log.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.log10.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.log2.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.rint.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(<4 x double>, metadata, metadata)

		declare <2 x i32> @llvm.experimental.constrained.fptosi.v2i32.v2f64(<2 x double>, metadata)
		cameron.mcinally (inline comment, Not Done): Same question as the scalar versions. Do we want <4 x float> to <4 x i32>/etc casts?
		kpn (author, inline comment, Not Done): On the odd chance that it may tickle the vector legalizer I'll go ahead and add tests with float. Same answer as the scalar tests: Rounding towards zero is correct.
		declare <4 x float> @llvm.experimental.constrained.fptrunc.v4f32.v4f64(<4 x double>, metadata)
		declare <2 x double> @llvm.experimental.constrained.fpext.v2f64.v2f32(<2 x float>, metadata)
		cameron.mcinally (inline comment, Not Done): This surprised me. Should this be the truncating convert? Or should it be vcvtsd2si?
		kpn (author, inline comment, Not Done): Yes, it should be a truncating conversion.
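
To make the distinction in this exchange concrete: the truncating convert (cvttsd2si) always rounds toward zero, while cvtsd2si rounds according to the current MXCSR rounding mode, which is round-to-nearest-even by default. The following is a rough Python sketch of the two behaviours for in-range inputs only; the function names are illustrative and not taken from the patch.

```python
import math


def convert_truncating(x: float) -> int:
    # Models cvttsd2si / fptosi: discard the fractional part.
    return math.trunc(x)


def convert_nearest_even(x: float) -> int:
    # Models cvtsd2si under the default rounding mode
    # (round to nearest, ties to even).
    return round(x)
```

With these definitions, 42.9 converts to 42 with the truncating form and to 43 with the nearest-even form, which is why the choice of instruction is visible in the test output.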

test/Feature/fp-intrinsics.ll

	[236 lines not shown]
	define double @f17() {
	entry:
	%result = call double @llvm.experimental.constrained.fma.f64(double 42.1, double 42.1, double 42.1,
	metadata !"round.dynamic",
	metadata !"fpexcept.strict")
	ret double %result
	}

				; Verify that fptoui(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f18
				; CHECK: call zeroext i32 @llvm.experimental.constrained.fptoui
				define zeroext i32 @f18() {
				entry:
				%result = call zeroext i32 @llvm.experimental.constrained.fptoui.f64(
				double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptosi(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f19
				; CHECK: call i32 @llvm.experimental.constrained.fptosi
				define i32 @f19() {
				entry:
				%result = call i32 @llvm.experimental.constrained.fptosi.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptrunc(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f20
				; CHECK: call float @llvm.experimental.constrained.fptrunc
				define float @f20() {
				entry:
				%result = call float @llvm.experimental.constrained.fptrunc.f32(double 42.1,
				metadata !"fpexcept.strict")
				ret float %result
				}

				; Verify that fpext(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f21
				; CHECK: call double @llvm.experimental.constrained.fpext
				define double @f21() {
				entry:
				%result = call double @llvm.experimental.constrained.fpext.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret double %result
				}

	@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
	declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)
	declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
				declare zeroext i32 @llvm.experimental.constrained.fptoui.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.fptosi.f64(double, metadata)
				cameron.mcinally (inline comment, Not Done): I haven't been following along closely, so please forgive if this was already discussed... Should we have float->i32 casts too? Also double->i64?
				kpn (author, inline comment, Not Done): This is an opt test. I'm not sure we'd benefit from placing those extra tests here.
				declare float @llvm.experimental.constrained.fptrunc.f32(double, metadata)
				declare double @llvm.experimental.constrained.fpext.f64(double, metadata)
				cameron.mcinally (inline comment, Not Done): Should 'fpext.f64' have a float argument instead of a double?
				kpn (author, inline comment, Not Done): Probably, yes. Will fix.
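
As a rough illustration of the naming convention raised in the review (the name carries both the return type and the argument type, as in `fptosi.i32.f64` and `fpext.v2f64.v2f32` in the X86 tests), here is a small Python sketch. `mangle_type` and `constrained_name` are hypothetical helpers written for this note, not LLVM APIs, and they handle only the float, double, iN, and fixed-vector types that appear in these tests.

```python
import re

# Hypothetical helper, not part of LLVM: maps a textual IR type to the
# suffix used in overloaded intrinsic names.
_SCALARS = {"float": "f32", "double": "f64"}


def mangle_type(ty: str) -> str:
    ty = ty.strip()
    vec = re.fullmatch(r"<\s*(\d+)\s*x\s*(.+?)\s*>", ty)
    if vec:
        # Fixed vector: "v" + element count + mangled element type.
        return "v" + vec.group(1) + mangle_type(vec.group(2))
    if ty in _SCALARS:
        return _SCALARS[ty]
    if re.fullmatch(r"i\d+", ty):
        return ty
    raise ValueError(f"unsupported type: {ty}")


def constrained_name(op: str, ret_ty: str, arg_ty: str) -> str:
    # Return type suffix first, then argument type suffix.
    return (
        "llvm.experimental.constrained."
        f"{op}.{mangle_type(ret_ty)}.{mangle_type(arg_ty)}"
    )


print(constrained_name("fpext", "double", "float"))
```

Under these assumptions the fpext declaration questioned above would be named `llvm.experimental.constrained.fpext.f64.f32` and take a float argument, matching the form used in test/CodeGen/X86/fp-intrinsics.ll.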
