This is an archive of the discontinued LLVM Phabricator instance.

Differential D43515

More math intrinsics for conservative math handling
Abandoned · Public

Authored by kpn on Feb 20 2018, 9:44 AM.

Details

Reviewers

andrew.w.kaylor
craig.topper
hfinkel
mehdi_amini
aemerson
javed.absar
kbarton

Summary

This builds on D27028 and D32319's work on constrained math intrinsics.

Quoting from D32319: "The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed."

There are more patches coming, but I wanted to start with just a handful here.

Event Timeline

kpn created this revision. Feb 20 2018, 9:44 AM

Herald added a subscriber: llvm-commits. Feb 20 2018, 9:44 AM

At the request of my employer's legal department:
Copyright © 2018 SAS Institute Inc., Cary, NC, USA. All Rights Reserved.

andrew.w.kaylor added inline comments. Feb 20 2018, 11:52 AM

docs/LangRef.rst
13283	I think you need to say more about the semantics. The LLVM IR fptoui and fptosi are specifically documented as rounding to zero. I don't think we want that with the constrained intrinsics so we need to very specifically document how they will be different from the standard instructions.
13330	This intrinsic is a bit troublesome. The llvm.round intrinsic says that it "returns the same values as the libm round functions would, and handles error conditions in the same way." The libm round function, in turn, is documented as rounding to the nearest integer (and away from zero in halfway cases) regardless of the current rounding mode. So what do we want the constrained form of the intrinsic to do? I think it needs to ignore the rounding mode. I'm not sure about exception behavior. If it doesn't respect exception behavior then we probably don't want to have the constrained form of this intrinsic at all.
13366	This seems to be leaking SelectionDAG implementation details into the IR space. How is this used?
13407	This is a replacement for fpext, right? I think you should say that somewhere.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
968	STRICT_FP_TO_SINT needs to do something more than this. The default lowering of FP_TO_SINT is known to raise spurious FE_INEXACT exceptions because it involves speculative execution.
1151	Since this gets unique handling why isn't it just a separate case from the others?
1156	The style/formatting is wrong here. I think you need curly braces around your else-clause and the "else" itself needs to be on the same line as the curly brace above it.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6199–6200	Why are you not attaching these nodes to the chain?
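
To make the concern about llvm.round concrete, here is a small standalone C++ sketch (not LLVM code) of the libm behavior described in the comment on line 13330: `round` rounds halfway cases away from zero whatever the dynamic rounding mode is, while `nearbyint` follows the mode. The helper names `callUnderMode`, `roundUnderMode`, and `nearbyintUnderMode` are invented for this illustration, and the `volatile` temporaries are only there to discourage compile-time folding.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Run fn(v) while the dynamic rounding mode is `mode`, then restore the
// previous mode. The volatile temporaries keep the evaluation between the
// two fesetround() calls and prevent constant folding.
template <typename Fn>
double callUnderMode(Fn fn, double v, int mode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(mode);
  assert(rc == 0 && "rounding mode not supported on this target");
  (void)rc;
  volatile double in = v;
  volatile double out = fn(in);
  std::fesetround(saved);
  return out;
}

// libm round(): halfway cases go away from zero, independent of the mode.
double roundUnderMode(double v, int mode) {
  return callUnderMode([](double x) { return std::round(x); }, v, mode);
}

// libm nearbyint(): the result depends on the dynamic rounding mode.
double nearbyintUnderMode(double v, int mode) {
  return callUnderMode([](double x) { return std::nearbyint(x); }, v, mode);
}
```

With these helpers, `roundUnderMode(2.5, FE_TOWARDZERO)` still yields 3.0, whereas `nearbyintUnderMode(2.5, FE_TOWARDZERO)` yields 2.0, which is why a rounding-mode argument on a constrained form of llvm.round would have nothing to control.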

kpn added inline comments. Feb 21 2018, 12:29 PM

docs/LangRef.rst
13283	What do we want these intrinsics to do that is different from the normal IR fptoui and fptosi? I don't think any of the constrained intrinsics _today_ are doing anything with the rounding and exception metadata. That makes it hard for me to say much about it in documentation, today. Well, unless I misunderstood the current code. That's totally possible.
13330	We'll still need the constrained intrinsic to avoid getting reordered. How about if I rename the intrinsic to be fptrunc instead of round? Then the rounding would be explicit.
13366	It seems to exist for the MVT::ppcf128 type. A quick grep doesn't show any other users. Test coverage is lacking. I added a test using the intrinsic, but I had to mark it expected fail since I couldn't get it to work. To avoid the risk of bugs getting introduced later I went ahead and implemented the intrinsic. Would it be better to not have the intrinsic and to instead have a pass that replaces the non-STRICT SDNode with a STRICT version? That would avoid said leaking into the IR space. It would, however, mean that llvm would have an opinion on when STRICT nodes should be used. I'm not sure that's a good thing.
13407	I think I picked names for the intrinsics that matched the SelectionDAG node enum. Perhaps it would be better to match the bitcode language names? In which case this intrinsic would be "fpext" instead of "extend". Either way that's a good idea for the documentation to at least mention fpext.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
968	I have seen FP_TO_SINT cause traps when it shouldn't. Would having that default lowering use the chain in the STRICT_ case solve that issue?
1151	Good point. And making it a separate case also takes care of the formatting issue in the else block.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6199–6200	Because they need to match what the default lowering is expecting. Otherwise a variety of failures happen.

andrew.w.kaylor added inline comments. Feb 21 2018, 6:17 PM

docs/LangRef.rst
13283	You're right, none of the other intrinsics are doing anything specific with the rounding mode. The purpose of the intrinsics is to prevent the optimizer from doing things that would introduce compile-time rounding. However, the general assumption of the intrinsics is that whatever you set the rounding mode to at runtime is the rounding behavior you will get. I'd want to consult a front end expert as to exactly what should be happening. A quick glance at the C99 standard tells me that when a floating point number is converted to an integer it is truncated toward zero. I think that means the same thing as the LLVM language reference claim that fptoui and fptosi round the number toward zero. So I think that we don't want the runtime rounding mode to change the behavior of these intrinsics, and so we should document it as such.
13330	No, fptrunc does something else, right? My concern is that if this is preserving the behavior of the round library function (and I think we need to) then the rounding mode argument isn't relevant, so maybe it should be omitted in this case. The key thing is to be explicit about its behavior in the documentation and then make sure the implementation actually does what we say it will.
13366	I think it's better not to have the intrinsic for now. I don't understand what the SD node is doing well enough to say much more, but it looks to me like the SD node shouldn't be there either. It's a target-specific hack that leaked into the target-independent code if my understanding is correct.
13407	These intrinsics should be driven by what the front end needs. If no front end is generating an equivalent now then we don't want an intrinsic. So, yes, please match the bitcode language names.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
968	No, I don't think the chain will fix this. We need to implement strict lowering that does something different.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6199–6200	There is code that removes the chain when the strict node is mutated to a non-strict node. That should be preventing lowering problems. I believe the chain was necessary to prevent re-ordering prior to final instruction selection.
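
The C99 rule cited in the comment on line 13283 can be checked with a short standalone C++ sketch (not LLVM code): a floating-point to integer cast truncates toward zero even when the dynamic rounding mode says otherwise, while `lrint` honors the mode. The helper names `castUnderMode` and `lrintUnderMode` are made up for this example.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Convert v to int with a plain cast while rounding mode `mode` is active.
// volatile keeps the conversion at run time, between the fesetround() calls.
int castUnderMode(double v, int mode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(mode);
  assert(rc == 0 && "rounding mode not supported on this target");
  (void)rc;
  volatile double in = v;
  volatile int out = static_cast<int>(in);  // always truncates toward zero
  std::fesetround(saved);
  return out;
}

// Convert v to long with lrint(), which rounds according to `mode`.
long lrintUnderMode(double v, int mode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(mode);
  assert(rc == 0 && "rounding mode not supported on this target");
  (void)rc;
  volatile double in = v;
  volatile long out = std::lrint(in);
  std::fesetround(saved);
  return out;
}
```

This matches the position taken in the review: the constrained fptoui and fptosi forms would keep the truncating semantics of the plain instructions, so only the exception behavior needs to be constrained.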

kpn added inline comments. Feb 26 2018, 12:24 PM

docs/LangRef.rst
13283	Our front end guy here confirms truncation. I've updated my working copy to state that fptoui and fptosi truncate.
13330	Well, if I'm reading SelectionDAGBuilder::visitFPTrunc() correctly then fptrunc gets turned into ISD::FP_ROUND. So, no, renaming this intrinsic to be fptrunc would not be wrong. Contrast this with llvm.round getting turned into ISD::FROUND. I think we need constrained intrinsics for both, so this one in this patch should be renamed after fptrunc. I haven't touched the ISD::FROUND node yet, and that's what llvm.round gets turned into from the looks of SelectionDAGBuilder::visitIntrinsicCall(). That could be a later patch. I agree that the rounding mode should be ignored and shouldn't be in the intrinsic's metadata. WDYT?
13366	Done. It will be removed in the next diff.
13407	Will do.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
968	Is there something we can do at this level to fix this? If there is then I'm all for it, but if there isn't then we should probably still put the intrinsic in. We'll need it eventually, and currently none of the constrained intrinsics solve the complete optimization problem. So I wonder if this intrinsic is really all that different from the other experimental constrained intrinsics. If a backend models FP side effects then wouldn't the existing default lowering work correctly?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6199–6200	Mutation happens too late in some cases. If it happened early enough then there wouldn't be any need for the strict intrinsics to be mentioned in SelectionDAGLegalize::ExpandNode(). Since it doesn't we can't use the chain. Should that mutation happen earlier?

andrew.w.kaylor added inline comments. Feb 28 2018, 11:25 AM

docs/LangRef.rst
13330	You shouldn't be thinking in terms of the SelectionDAG at this level. ISD::FP_ROUND is very poorly named and does not do the same thing that llvm.round does. ISD::FP_ROUND converts a floating point number to a smaller type, but not necessarily an integer. That's fptrunc. I suppose you are right that we do need a constrained version of fptrunc. I'm a bit concerned by the things that the LLVM language definition says are undefined. Those are the cases that will be of most interest for the constrained case and we should document the expected behavior, but I think we need to consider why the current spec says the result is undefined. In any event llvm.round, which is what I thought you were replacing here, is something completely different.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
968	To be honest I'm not entirely sure how to fix this in the current selection DAG model. The issue is that we need to introduce a branch to fix the problem, but by the time we're selecting instructions it's too late to do that. I think it needs to be addressed when we're building the DAG.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6199–6200	We need the mutation to be put off as long as possible. I think it should be possible to unlink the chain in ExpandNode if necessary. Depending on what's happening there, we might even want the expanded node to make use of the chain.
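
The distinction drawn in the comment on line 13330, between narrowing a value to a smaller floating-point type (fptrunc) and rounding it to an integer value (llvm.round), has a direct analogue in plain C++. The sketch below is an illustration only, with an invented helper name `narrowUnderMode`: a `double` to `float` cast keeps fractional values and does depend on the dynamic rounding mode, unlike `std::round`.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Narrow a double to float while rounding mode `mode` is active.
// This is the C++ counterpart of fptrunc: the result is the nearest float
// in the direction selected by the mode, not an integer.
float narrowUnderMode(double v, int mode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(mode);
  assert(rc == 0 && "rounding mode not supported on this target");
  (void)rc;
  volatile double in = v;
  volatile float out = static_cast<float>(in);
  std::fesetround(saved);
  return out;
}
```

For example, 0.1 is not representable as a `float`, so `narrowUnderMode(0.1, FE_UPWARD)` and `narrowUnderMode(0.1, FE_DOWNWARD)` return two adjacent floats on either side of it, while 2.5 narrows to exactly 2.5f in every mode.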

kpn added inline comments. Mar 6 2018, 6:13 AM

docs/LangRef.rst
13407	Say, now that I think about this, what does rounding have to do with extending a FP value? Shouldn't this intrinsic just plain not have rounding metadata?

andrew.w.kaylor added inline comments. Mar 6 2018, 9:14 AM

docs/LangRef.rst
13407	That's a reasonable point. The same issue came up with frem. The rounding mode doesn't apply to frem, but I kept the argument just so that it could be handled internally the same way as the other constrained intrinsics and then documented in the language reference that the rounding mode has no effect. I'm not completely convinced that was a good decision on my part. If you want to look at what would have to change to support constrained intrinsics with no rounding mode argument, I would likely support that. I don't think it would be much extra handling. There are just some things where the class that is used to represent the intrinsics (ConstrainedFPIntrinsic) would need to be aware of the possibility that this argument is omitted.

This new diff changes the default lowering of STRICT_FP_TO_UINT so that it no longer allows speculative execution. I've also eliminated the rounding metadata from instructions when it wasn't needed.
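
The speculative-execution problem discussed for line 968 of LegalizeDAG.cpp can be illustrated outside LLVM. A branch-free expansion of an unsigned conversion in terms of signed conversions evaluates a signed conversion on both `x` and `x - 2^63` and then selects one result, so one of the two conversions may see an out-of-range input and raise a spurious floating-point exception. The following is a minimal standalone C++ sketch of the branching alternative, not the code from this patch; the function name `fpToUint64Branchy` is made up for the illustration, and it assumes `double` is IEEE-754 binary64.

```cpp
#include <cassert>
#include <cstdint>

// Convert x in [0, 2^64) to uint64_t using only signed conversions.
// The branch guarantees that the one signed conversion that executes
// always receives an in-range input.
std::uint64_t fpToUint64Branchy(double x) {
  assert(x >= 0.0 && x < 18446744073709551616.0);  // 2^64
  const double two63 = 9223372036854775808.0;      // 2^63
  if (x < two63)
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(x));
  // Here 2^63 <= x < 2^64, so x - 2^63 is computed exactly and fits in
  // int64_t; the top bit is restored afterwards.
  const std::int64_t low = static_cast<std::int64_t>(x - two63);
  return static_cast<std::uint64_t>(low) ^ (std::uint64_t{1} << 63);
}
```

Note that an ordinary C++ compiler that has not been told to honor the floating-point environment is free to turn this branch back into a select, which is the kind of transformation the constrained intrinsics are meant to rule out at the IR level.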

Herald added a reviewer: javed.absar. May 1 2018, 7:32 AM

Herald added a subscriber: mgorny.

Missed one place in the documentation that needed updating.

At some point we should create a document that describes the entire flow of FP instructions through the instruction selection process. To be honest I don't remember how it all works, and that makes it difficult to review changes like this. It would also be nice to verify that we all have the same understanding of how it works. I don't mean to volunteer you to produce the entire document, but would you mind giving me a rough outline? I'm still concerned about the case that is not chained.

docs/LangRef.rst
13325	You need to do something more here to document the difference between the return type and the argument type. Also in fpext below.
lib/CodeGen/StrictFP.cpp
12	Can you say more here about what these transformations are? It's clear that you intend this as a generic pass that currently does one thing but might have others added later. That's good, but I'd like to see the possibilities described here as they are implemented.
75	Per the LLVM Programmer's Manual (http://llvm.org/docs/ProgrammersManual.html#iterating-over-the-instruction-in-a-function) you should be using an inst_iterator here.
137	Could you add an example here of what the resulting IR will look like? It would make the code a lot easier to follow.
145	I think you can use "IntDst->getBitWidth()" here. Also, APInt::getSignedMinValue() does this same thing and is a bit more self-documenting.
151	I believe conversion of a NaN to an integer should raise "INVALID" (which the fcmp will) and then the result is undefined, but the 'true' case does less so I think ULT is preferable.
152	We are going to need a constrained version of fcmp, and when we have it you should use it here. When the IRBuilder supports constrained floating point modes, it would be nice to use that here but I guess you can't do that yet, so maybe just a comment saying we should later?
154	This is an odd name. How about "within.sint.range"? In any case, I think '.' is more common than '_' as a name space holder, probably because the name will automatically get '.<n>' appended if it's a duplicate.
202	This description is wrong.
lib/CodeGen/TargetPassConfig.cpp
566	This doesn't seem like the right place to do this. Should it be happening much later, like around CodeGenPrepare?
lib/IR/Verifier.cpp
4470	Since you've broken this out into a switch statement, can you separate the unary and ternary ops and give them each the appropriate assert? I think that would be much more readable than this compound check (which I realize was my creation).
4516	The default should probably always be an error.
4526	How about this? int RoundingIdx = (HasExceptionMD ? NumOperands - 2 : NumOperands - 1); On the other hand, are we ever going to have intrinsics that have a rounding mode but not exception behavior?
test/CodeGen/X86/fp-intrinsics.ll
281	Can you check the entire expanded IR pattern here? It might be worth having a separate test that verifies the StrictFP pass in isolation.
346	I believe your decorated intrinsic names are incorrect in all of these cases. The name needs to specify both return type and argument type. Take a look at what opt produces if you give it these names as inputs.
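
The preference for ULT over an ordered comparison in the comment on line 151 of StrictFP.cpp turns on how the two predicates treat NaN. As a hedged illustration in plain C++ (the helper names are invented and merely mirror the meaning of LLVM's `fcmp olt` and `fcmp ult` predicates), an ordered less-than is false when either operand is NaN, while an unordered-or-less-than is true:

```cpp
#include <cassert>
#include <limits>

// Analogue of `fcmp olt`: true only if neither operand is NaN and a < b.
bool orderedLessThan(double a, double b) { return a < b; }

// Analogue of `fcmp ult`: true if either operand is NaN, or a < b.
// Written as the negation of >=, which is false whenever a NaN is involved.
bool unorderedOrLessThan(double a, double b) { return !(a >= b); }
```

With the unordered form, a NaN input takes the same path as an in-range value, which is the cheaper 'true' case mentioned in the comment.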

cameron.mcinally added a subscriber: cameron.mcinally. May 24 2018, 7:48 AM

kpn updated this revision to Diff 152072. Jun 20 2018, 6:05 AM

kpn marked 12 inline comments as done.

kpn added inline comments.

lib/CodeGen/StrictFP.cpp
145	error: no member named 'getBitWidth' in 'llvm::Value'.

kpn added inline comments. Jun 20 2018, 6:06 AM

docs/LangRef.rst
13325	I hope this is what you mean. I've changed it to show that the result is a different type. I've followed the naming scheme used elsewhere in this document.
lib/IR/Verifier.cpp
4526	Done. I don't know if we'll have a rounding mode but not exceptions. I doubt it, but I can't say for certain.

craig.topper added inline comments. Jun 20 2018, 11:52 PM

lib/CodeGen/StrictFP.cpp
77	I believe you should be getting TM by doing this. auto *TPC = getAnalysisIfAvailable<TargetPassConfig>(); if (!TPC) return false; auto &TM = TPC->getTM<TargetMachine>(); There is only one other pass that takes TargetMachine in its constructor. The others use what I've put above. So I believe that is the preferred way.
85	There is little reason to pass Context around. Value has a getContext as does Type. So you can get the context easily whenever you need it.
100	This cast is unnecessary. IntrinsicInst is a subclass of Value
102	This should use TLI->getValueType.
167	Why do you need a SmallVector here? Why can't you just call getArgOperand?
168	This cast is unnecessary.
172	What if the intrinsic uses a vector type?

uweigand added a subscriber: uweigand. Jun 25 2018, 6:30 AM

kpn marked 6 inline comments as done. Jul 26 2018, 9:15 AM

kpn added inline comments.

lib/CodeGen/StrictFP.cpp
172	It would have been caught by the IR verifier. A vector would have been rejected there.

kpn updated this revision to Diff 157504. Jul 26 2018, 9:16 AM

craig.topper added inline comments. Jul 26 2018, 9:56 AM

lib/CodeGen/StrictFP.cpp
172	I don't see where the IR verifier rejects vectors. I just see that it checks that element counts are equal. And why should it reject vectors? We need to support fptosi/fptoui for vectors.

kpn added inline comments. Jul 26 2018, 9:59 AM

lib/CodeGen/StrictFP.cpp
172	It doesn't reject them, now. This latest patch added support for vectors. My inexperience with Phabricator lost the comment that said that.

Delete a line accidentally left in.

Rebase. Ping.

In D43515#1013383, @kpn wrote:

At the request of my employer's legal department:
Copyright © 2018 SAS Institute Inc., Cary, NC, USA. All Rights Reserved.

If it's not just a remark, but is supposed to have some legal/whatever meaning,
I'm not sure some comment in some review is the correct direction.

docs/LangRef.rst
13252	This probably has insufficient amount of `^`. Might want to actually test-build the docs.

Adding new constrained intrinsics and adding the pass should be separate patches I think. Changing the syntax of frem should be another patch.

docs/LangRef.rst
13190	This change should be in a separate patch. There's too much going on in this patch and this is easy to overlook.

In D43515#1229127, @craig.topper wrote:

Adding new constrained intrinsics and adding the pass should be separate patches I think. Changing the syntax of frem should be another patch.

Will do.

Split out changes as requested. This diff is just the four new intrinsics. The fptoui pass and the change to frem will be later.

I've also corrected some documentation issues in this iteration.

Ping

Herald added a subscriber: arphaman. Oct 3 2018, 11:44 AM

craig.topper added inline comments. Oct 3 2018, 5:22 PM

docs/LangRef.rst
13333	This reads funny. I think it should maybe be "result of truncating a floating point"
13367	This also reads funny
include/llvm/CodeGen/ISDOpcodes.h
534	These need comments. Does STRICT_FP_ROUND have the TRUNC argument that FP_ROUND has?

kpn added inline comments. Oct 4 2018, 6:42 AM

docs/LangRef.rst
13333	How about if I just copy the text used by the normal fptrunc instruction?
13367	Same as fptrunc. I could just copy the text from the fpext instruction?
include/llvm/CodeGen/ISDOpcodes.h
534	I could rearrange and lump them in with the non-strict versions of each. Then I'd just need one extra line restating that the STRICT_ versions prevent optimizations. Yes, STRICT_FP_ROUND does have the TRUNC argument that FP_ROUND has, but it is currently always zero. Fixing this requires rerouting this one strict node to go through the same codepath as the non-strict node. That would make it different from all the other constrained nodes. Should I go ahead and make that change? I _think_ it is safe if the TRUNC argument really does work like it is documented. I also just noticed that I need to put back STRICT_FP_TO_UINT in at least one place. It should be everywhere STRICT_FP_TO_SINT is handled _except_ in the default lowering.

craig.topper added inline comments. Oct 4 2018, 10:51 AM

docs/LangRef.rst
13333	Sure
13367	Sure
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7118	This doesn't copy the second argument to FP_ROUND over, does it?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6204	Is this adding a second argument to STRICT_FP_EXTEND as well? I don't think the non-strict FP_EXTEND has two arguments.

kpn added inline comments. Oct 4 2018, 10:53 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7118	No. It should.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6204	Agreed. I'll fix it.

cameron.mcinally added inline comments. Oct 4 2018, 11:43 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7105	There are a lot of comments on this, so I may have missed something. Take with a grain of salt... I don't think these are correct. These can trap so can't be speculatively executed. They would need a chain.

kpn added inline comments. Oct 4 2018, 11:58 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7105	I couldn't figure out how to have these be chained but have the non-strict continue to not be chained. Too many things fell over if they didn't match.

cameron.mcinally added inline comments. Oct 5 2018, 8:24 AM

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
7105	I'm not an expert with this code, so cc @andrew.w.kaylor. This doesn't seem like the right direction though. Maybe these unchained operations should be left to a different patch until a proper solution is found.

kpn mentioned this in D53216: [FPEnv] Add constrained intrinsics for MAXNUM and MINNUM. Oct 18 2018, 11:24 AM

kpn mentioned this in D53411: [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics. Oct 19 2018, 5:56 AM

kpn marked 5 inline comments as done. Nov 1 2018, 9:08 AM

Address review comments.

Add use of the chain to these four new SDNode types.

cameron.mcinally added inline comments. Nov 5 2018, 11:51 AM

test/CodeGen/X86/fp-intrinsics.ll
281	Same here as the vector version below. Do we want the truncating convert? That was surprising to me.
346	Do we want unsigned convert tests too? fptoui? I see that there are SystemZ tests to cover them, so maybe that's sufficient? Just pointing this out so others can see.
test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
4372 ↗	(On Diff #172160)	This surprised me. Should this be the truncating convert? Or should it be vcvtsd2si?
4701 ↗	(On Diff #172160)	Same question as the scalar versions. Do we want <4 x float> to <4 x i32>/etc casts?
test/Feature/fp-intrinsics.ll
309	I haven't been following along closely, so please forgive if this was already discussed... Should we have float->i32 casts too? Also double->i64?
311	Should 'fpext.f64' have a float argument instead of a double?

kpn added inline comments. Nov 5 2018, 12:36 PM

test/CodeGen/X86/fp-intrinsics.ll
346	The SystemZ tests target hardware new enough to lower to a single instruction. The tests for fptoui on x86 use the default lowering, but the default lowering is disallowed since it does speculative execution and traps. The support for fixing that I had in this patch but was asked to split it out into another patch. So there's no constrained fptoui test here using the default lowering in this patch.
test/Feature/fp-intrinsics.ll
311	Probably, yes. Will fix.

kpn added inline comments. Nov 6 2018, 10:29 AM

test/CodeGen/X86/fp-intrinsics.ll
281	The strict intrinsic results in the same instruction as the regular fptosi instruction. Having them be the same means the mutation from strict to non-strict is working correctly. And, yes, rounding towards zero is correct. That's why there's no rounding metadata.
test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
4372 ↗	(On Diff #172160)	Yes, it should be a truncating conversion.
4701 ↗	(On Diff #172160)	On the odd chance that it may tickle the vector legalizer I'll go ahead and add tests with float. Same answer as the scalar tests: Rounding towards zero is correct.
test/Feature/fp-intrinsics.ll
309	This is an opt test. I'm not sure we'd benefit from placing those extra tests here.

Rebase.

Minor test changes.

cameron.mcinally added inline comments. Nov 28 2018, 10:59 AM

include/llvm/CodeGen/ISDOpcodes.h
534	This ordering does not mesh with what is already in place. The STRICT_XXX opcodes are currently clustered together, where these new opcodes are grouped with their non-strict counterparts. I'm not opposed to grouping the corresponding non-strict and strict opcodes, but that would probably be better left to a separate patch. I think it makes sense to keep everything uniform until a final decision is made, for clarity's sake.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3033	This doesn't seem correct. Shouldn't Node->getOperand(0) be the argument for FP_ROUND and the Chain for STRICT_FP_ROUND? Same for FP_EXTEND too. Also, does using a truncating store provide us with the same trapping behavior as an explicit trunc instruction? I don't know off the top of my head, but it may be different. To be fair, a user probably won't care too much about optimizing away a trap, but the purists might.
3088	This seems incorrect, unless I missed something. The first operand to FP_TO_SINT should be the argument, but the first operand to STRICT_FP_TO_SINT should be the Chain. It does not look like expandFP_TO_SINT accounts for that.
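
The operand-layout concern in the two LegalizeDAG.cpp comments above can be pictured with a toy model. This is a hypothetical sketch in plain C++, not SelectionDAG code: `ToyNode`, `firstValueOperand`, and `sourceOperand` are invented names, and the only assumption carried over from the discussion is that a strict node carries its chain as operand 0 while the non-strict node starts directly with the value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a DAG node: a strictness flag plus named operands.
struct ToyNode {
  bool isStrict;
  std::vector<std::string> operands;
};

// Strict nodes put the chain first, so their value operands start at 1.
std::size_t firstValueOperand(const ToyNode &n) { return n.isStrict ? 1 : 0; }

// Fetch the value being converted, whichever flavor of node this is.
const std::string &sourceOperand(const ToyNode &n) {
  const std::size_t idx = firstValueOperand(n);
  assert(idx < n.operands.size() && "node has no value operand");
  return n.operands[idx];
}
```

Expansion code shared between the strict and non-strict opcodes would need this kind of index adjustment; reading operand 0 unconditionally would hand the chain to code that expects the floating-point argument.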

kpn mentioned this in D55897: Add constrained fptrunc and fpext intrinsics. Dec 19 2018, 12:20 PM

kpn mentioned this in D59830: [FPEnv] Make constrained FP IR verification more flexible. Mar 26 2019, 11:33 AM

kpn mentioned this in D59833: [FPEnv] New document for adding new constrained FP intrinsics. Mar 26 2019, 11:58 AM

kpn mentioned this in rL357065: The IR verifier currently supports the constrained floating point intrinsics,. Mar 27 2019, 6:30 AM

kpn mentioned this in rG4f3cdc6555ca: The IR verifier currently supports the constrained floating point intrinsics….

pengfei added a subscriber: pengfei. May 4 2019, 11:05 PM

kpn added a subscriber: kbarton. May 10 2019, 12:32 PM

A few minor comments about commoning asserts and using dyn_cast instead of cast<>.
Aside from that, I think this looks good. That said, I'm by no means an expert in this area so don't feel I'm qualified to give a final approval to commit.

lib/IR/Verifier.cpp
4513	Could you use the isFPOrFPVectorTy here to check both conditions, and then remove the assert in the else below?
4514	Can you use a dyn_cast here instead? if (auto *OperandT = dyn_cast<VectorType>(Operand->getType())) { do vector stuff }
4529	Similarly, could you use isIntOrIntVectorTy here and remove the assert in the if/else below?
4553	same comment about commoning the asserts here.
4597	It looks like getArgOperand expects an unsigned. Is there a specific reason you are using an int here? If it's possible that RoundingIdx is negative, then you should probably add an assert here before passing it to getArgOperand.
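
The index arithmetic suggested earlier for line 4526, together with the question above about passing a possibly negative `int` to `getArgOperand`, can be sketched as follows. This is a standalone illustration with a hypothetical helper name, `roundingOperandIndex`; it assumes the metadata operands trail the value operands, with the exception-behavior metadata last when present.

```cpp
#include <cassert>

// Index of the rounding-mode metadata operand among numOperands operands.
// Working in unsigned and asserting first avoids silently wrapping around
// when there are too few operands.
unsigned roundingOperandIndex(unsigned numOperands, bool hasExceptionMD) {
  const unsigned trailing = hasExceptionMD ? 2u : 1u;
  assert(numOperands >= trailing && "too few operands for rounding metadata");
  return numOperands - trailing;
}
```

For a binary operation with both kinds of metadata (four operands in total), this gives index 2, the slot just before the exception-behavior operand.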

The only thing left in this ticket to commit is the fptosi and fptoui changes. I'm working on incorporating review comments from D55897 into this ticket and fixing other bugs that I'm coming across. I'll hopefully update this ticket next week.

lib/IR/Verifier.cpp
4514	Yes, that's much more concise.
4553	The fptrunc and fpext changes were split out into another ticket and updated there. That committed code is more concise, I suspect in the way you intended.

kpn added a subscriber: ajwock. May 21 2019, 6:26 AM

Address review comments from here and from D55897.

This patch is now down to just handling fptosi and fptoui.

kpn added a reviewer: kbarton. May 30 2019, 11:47 AM

kpn mentioned this in D53157: Teach the IRBuilder about constrained fadd and friends. May 31 2019, 9:52 AM

Ping.

Ping

How would you feel about rebooting this as a new patch? There's a lot of irrelevant history here, and I feel like I'm missing some context as I review it.

In general, I see that you're down to just implementing the fptosi and fptoui cases. I'm concerned about what happens in the fptoui case. It's mentioned in a few of the earlier comments that the default expansion of this opcode introduces speculative exceptions, and if that's being handled in the latest implementation I haven't read it closely enough to see what's going on. If it is being handled, I'd expect to see a comment block somewhere explaining what's being done.

In D43515#1547200, @andrew.w.kaylor wrote:

How would you feel about rebooting this as a new patch? There's a lot of irrelevant history here, and I feel like I'm missing some context as I review it.

I can do that.

In general, I see that you're down to just implementing the fptosi and fptoui cases. I'm concerned about what happens in the fptoui case. It's mentioned in a few of the earlier comments that the default expansion of this opcode introduces speculative exceptions, and if that's being handled in the latest implementation I haven't read it closely enough to see what's going on. If it is being handled, I'd expect to see a comment block somewhere explaining what's being done.

I believe the speculative exceptions should be largely fixed by r348251, committed by rksimon. I changed the code to continue to use strict nodes when expanding a strict node. Since the strict nodes are chained, does that not solve the part of the problem not solved by r348251? Is the existing comment not enough, or do I need to copy some of the commit message into code comments?

In D43515#1548845, @kpn wrote:

I believe the speculative exceptions should be largely fixed by r348251, committed by rksimon. I changed the code to continue to use strict nodes when expanding a strict node. Since the strict nodes are chained, does that not solve the part of the problem not solved by r348251? Is the existing comment not enough, or do I need to copy some of the commit message into code comments?

Sorry, I wasn't aware of Simon's change. That definitely simplifies what needs to be done here, and, yes, the existing comment is sufficient.

Replaced by D63782.

Revision Contents

Path	Size
docs/LangRef.rst	141 lines
include/llvm/CodeGen/ISDOpcodes.h	5 lines
include/llvm/CodeGen/Passes.h	2 lines
include/llvm/CodeGen/SelectionDAGNodes.h	4 lines
include/llvm/IR/IntrinsicInst.h	4 lines
include/llvm/IR/Intrinsics.td	17 lines
include/llvm/InitializePasses.h	1 line
lib/CodeGen/CMakeLists.txt	1 line
lib/CodeGen/CodeGen.cpp	1 line
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	14 lines
lib/CodeGen/SelectionDAG/SelectionDAG.cpp	31 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	32 lines
lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	4 lines
lib/CodeGen/StrictFP.cpp	206 lines
lib/CodeGen/TargetPassConfig.cpp	8 lines
lib/IR/IntrinsicInst.cpp	4 lines
lib/IR/Verifier.cpp	96 lines
test/CodeGen/AArch64/O0-pipeline.ll	1 line
test/CodeGen/AArch64/O3-pipeline.ll	1 line
test/CodeGen/X86/O0-pipeline.ll	1 line
test/CodeGen/X86/O3-pipeline.ll	1 line
test/CodeGen/X86/fp-intrinsics.ll	55 lines
test/Feature/fp-intrinsics.ll	49 lines

Diff 144714

docs/LangRef.rst

This file is larger than 256 KB, so syntax highlighting is disabled by default.

	Show First 20 Lines • Show All 13,181 Lines • ▼ Show 20 Lines

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	declare <type>			declare <type>
	@llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,			@llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
	metadata <rounding mode>,
	craig.topperUnsubmitted Not Done Reply Inline Actions This change should be in a separate patch. There's too much going on in this patch and this is easy to overlook. craig.topper: This change should be in a separate patch. There's too much going on in this patch and this is…
	metadata <exception behavior>)			metadata <exception behavior>)

	Overview:			Overview:
	"""""""""			"""""""""

	The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder			The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
	from the division of its two operands.			from the division of its two operands.


	Arguments:			Arguments:
	""""""""""			""""""""""

	The first two arguments to the '``llvm.experimental.constrained.frem``'			The first two arguments to the '``llvm.experimental.constrained.frem``'
	intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`			intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
	of floating-point values. Both arguments must have identical types.			of floating-point values. Both arguments must have identical types.

	The third and fourth arguments specify the rounding mode and exception			The third argument specifies the exception behavior as described above.
	behavior as described above. The rounding mode argument has no effect, since
	the result of frem is never rounded, but the argument is included for
	consistency with the other constrained floating-point intrinsics.

	Semantics:			Semantics:
	""""""""""			""""""""""

	The value produced is the floating-point remainder from the division of the two			The value produced is the floating-point remainder from the division of the two
	value operands and has the same type as the operands. The remainder has the			value operands and has the same type as the operands. The remainder has the
	same sign as the dividend.			same sign as the dividend.
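
The sign rule can be checked with C++ `std::fmod`. This is a minimal sketch that assumes `frem` behaves like libm `fmod` for finite operands; the helper names `remainder_like_frem` and `check_frem_sign` are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: models the documented frem result with std::fmod.
double remainder_like_frem(double dividend, double divisor) {
  return std::fmod(dividend, divisor);
}

// The remainder takes the sign of the dividend, never of the divisor.
void check_frem_sign() {
  assert(remainder_like_frem(7.5, 2.0) == 1.5);
  assert(remainder_like_frem(-7.5, 2.0) == -1.5);
  assert(remainder_like_frem(7.5, -2.0) == 1.5);
}
```

All three results are exactly representable, which is consistent with the old text's statement that the result of frem is never rounded.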

	Show All 28 Lines

	Semantics:			Semantics:
	""""""""""			""""""""""

	The result produced is the product of the first two operands added to the third			The result produced is the product of the first two operands added to the third
	operand computed with infinite precision, and then rounded to the target			operand computed with infinite precision, and then rounded to the target
	precision.			precision.
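
The "infinite precision, then rounded" wording can be illustrated with C++ `std::fma`. This is a sketch under the assumption that the platform's `std::fma` is correctly rounded; the names `fused` and `check_single_rounding` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Fused multiply-add: a*b + c is formed exactly and rounded once.
double fused(double a, double b, double c) { return std::fma(a, b, c); }

void check_single_rounding() {
  const double eps = std::ldexp(1.0, -27);  // 2^-27
  const double a = 1.0 + eps;
  const double b = 1.0 - eps;
  // a*b is exactly 1 - 2^-54, which is not representable as a double.
  // Because rounding happens only after the addition, the tiny term
  // survives the cancellation with -1.0.
  assert(fused(a, b, -1.0) == -std::ldexp(1.0, -54));
}
```

An unfused `a * b + c` would typically round the product to 1.0 first and then yield 0.0, but compilers may contract that expression into a fused operation, so the sketch does not assert on it.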

				'``llvm.experimental.constrained.fptoui``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				lebedev.ri: This probably has insufficient amount of `^`. Might want to actually test-build the docs.

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fptoui(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fptoui``' intrinsic returns the result of a
				conversion of a floating point operand to an unsigned integer.

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptoui``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is an unsigned integer converted from the floating
				point operand. The value is truncated, so it is rounded towards zero.

				andrew.w.kaylor: I think you need to say more about the semantics. The LLVM IR fptoui and fptosi are specifically documented as rounding to zero. I don't think we want that with the constrained intrinsics so we need to very specifically document how they will be different from the standard instructions.
				kpn (author): What do we want these intrinsics to do that is different from the normal IR fptoui and fptosi? I don't think any of the constrained intrinsics _today_ are doing anything with the rounding and exception metadata. That makes it hard for me to say much about it in documentation, today. Well, unless I misunderstood the current code. That's totally possible.
				andrew.w.kaylor: You're right, none of the other intrinsics are doing anything specific with the rounding mode. The purpose of the intrinsics is to prevent the optimizer from doing things that would introduce compile-time rounding. However, the general assumption of the intrinsics is that whatever you set the rounding mode to at runtime is the rounding behavior you will get. I'd want to consult a front end expert as to exactly what should be happening. A quick glance at the C99 standard tells me that when a floating point number is converted to an integer it is truncated toward zero. I think that means the same thing as the LLVM language reference claim that fptoui and fptosi round the number toward zero. So I think that we don't want the runtime rounding mode to change the behavior of these intrinsics, and so we should document it as such.
				kpn (author): Our front end guy here confirms truncation. I've updated my working copy to state that fptoui and fptosi truncate.
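
The conclusion of this thread, that the conversion truncates no matter what the runtime rounding mode is, can be illustrated at the source level. The following is a minimal C++ sketch, assuming the platform defines `FE_UPWARD`; the helper names are hypothetical and nothing here is taken from the patch.

```cpp
#include <cassert>
#include <cfenv>

// Plain casts: C and C++ define float-to-integer conversion as truncation.
unsigned to_unsigned(double x) { return static_cast<unsigned>(x); }
int to_signed(double x) { return static_cast<int>(x); }

void check_truncation_ignores_rounding_mode() {
  const int saved = std::fegetround();
  std::fesetround(FE_UPWARD);       // runtime rounding mode: toward +infinity
  volatile double pos = 2.7;        // volatile keeps the conversion at runtime
  volatile double neg = -2.7;
  assert(to_unsigned(pos) == 2u);   // still truncated, not rounded up to 3
  assert(to_signed(neg) == -2);     // truncation is toward zero
  std::fesetround(saved);           // restore the caller's rounding mode
}
```

The sketch only asserts results that do not depend on the rounding mode, so it holds whether or not the compiler honors `FENV_ACCESS`.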
				'``llvm.experimental.constrained.fptosi``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fptosi(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fptosi``' intrinsic returns the result of a
				conversion of a floating point operand to a signed integer.

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptoui``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a signed integer converted from the floating
				point operand.

				'``llvm.experimental.constrained.fptrunc``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				andrew.w.kaylor: You need to do something more here to document the difference between the return type and the argument type. Also in fpext below.
				kpn (author): I hope this is what you mean. I've changed it to show that the result is a different type. I've followed the naming scheme used elsewhere in this document.
				@llvm.experimental.constrained.fptrunc(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""
				andrew.w.kaylor: This intrinsic is a bit troublesome. The llvm.round intrinsic says that it "returns the same values as the libm round functions would, and handles error conditions in the same way." The libm round function, in turn, is documented as rounding to the nearest integer (and away from zero in halfway cases) regardless of the current rounding mode. So what do we want the constrained form of the intrinsic to do? I think it needs to ignore the rounding mode. I'm not sure about exception behavior. If it doesn't respect exception behavior then we probably don't want to have the constrained form of this intrinsic at all.
				kpn (author): We'll still need the constrained intrinsic to avoid getting reordered. How about if I rename the intrinsic to be fptrunc instead of round? Then the rounding would be explicit.
				andrew.w.kaylor: No, fptrunc does something else, right? My concern is that if this is preserving the behavior of the round library function (and I think we need to) then the rounding mode argument isn't relevant, so maybe it should be omitted in this case. The key thing is to be explicit about its behavior in the documentation and then make sure the implementation actually does what we say it will.
				kpn (author): Well, if I'm reading SelectionDAGBuilder::visitFPTrunc() correctly then fptrunc gets turned into ISD::FP_ROUND. So, no, renaming this intrinsic to be fptrunc would not be wrong. Contrast this with llvm.round getting turned into ISD::FROUND. I think we need constrained intrinsics for both, so this one in this patch should be renamed after fptrunc. I haven't touched the ISD::FROUND node yet, and that's what llvm.round gets turned into from the looks of SelectionDAGBuilder::visitIntrinsicCall(). That could be a later patch. I agree that the rounding mode should be ignored and shouldn't be in the intrinsic's metadata. WDYT?
				andrew.w.kaylor: You shouldn't be thinking in terms of the SelectionDAG at this level. ISD::FP_ROUND is very poorly named and does not do the same thing that llvm.round does. ISD::FP_ROUND converts a floating point number to a smaller type, but not necessarily an integer. That's fptrunc. I suppose you are right that we do need a constrained version of fptrunc. I'm a bit concerned by the things that the LLVM language definition says are undefined. Those are the cases that will be of most interest for the constrained case and we should document the expected behavior, but I think we need to consider why the current spec says the result is undefined. In any event llvm.round, which is what I thought you were replacing here, is something completely different.
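
The distinction drawn in this thread, narrowing to a smaller floating-point type versus rounding to an integral value, has a direct source-level analogue. The following C++ sketch uses a `double` to `float` cast as a stand-in for fptrunc and `std::round` as a stand-in for llvm.round; that pairing and the helper names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Narrowing conversion: changes the type, keeps the value when representable.
float narrow(double x) { return static_cast<float>(x); }

// libm round: keeps the type, moves the value to the nearest integer,
// with halfway cases rounded away from zero.
double round_to_integer(double x) { return std::round(x); }

void check_fptrunc_vs_round() {
  assert(narrow(2.5) == 2.5f);             // 2.5 fits in a float unchanged
  assert(round_to_integer(2.5) == 3.0);    // halfway case goes away from zero
  assert(round_to_integer(-2.5) == -3.0);
  // 0.1 is not exactly representable; narrowing loses precision.
  assert(static_cast<double>(narrow(0.1)) != 0.1);
}
```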

				The '``llvm.experimental.constrained.fptrunc``' intrinsic returns the result of
				a truncating of a floating point operand into a smaller floating point result.
				craig.topper: This reads funny. I think it should maybe be "result of truncating a floating point"
				kpn (author): How about if I just copy the text used by the normal fptrunc instruction?
				craig.topper: Sure

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.round``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values. This argument must be larger in size
				than the result.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a floating point value truncated to be smaller in size
				than the operand.

				'``llvm.experimental.constrained.fpext``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <type>
				@llvm.experimental.constrained.fpext(<type> <op>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fpext``' intrinsic returns the result of
				andrew.w.kaylor: This seems to be leaking SelectionDAG implementation details into the IR space. How is this used?
				kpn (author): It seems to exist for the MVT::ppcf128 type. A quick grep doesn't show any other users. Test coverage is lacking. I added a test using the intrinsic, but I had to mark it expected fail since I couldn't get it to work. To avoid the risk of bugs getting introduced later I went ahead and implemented the intrinsic. Would it be better to not have the intrinsic and to instead have a pass that replaces the non-STRICT SDNode with a STRICT version? That would avoid said leaking into the IR space. It would, however, mean that llvm would have an opinion on when STRICT nodes should be used. I'm not sure that's a good thing.
				andrew.w.kaylor: I think it's better not to have the intrinsic for now. I don't understand what the SD node is doing well enough to say much more, but it looks to me like the SD node shouldn't be there either. It's a target-specific hack that leaked into the target-independent code if my understanding is correct.
				kpn (author): Done. It will be removed in the next diff.
				an enlarging of a floating point operand.
				craig.topper: This also reads funny
				kpn (author): Same as fptrunc. I could just copy the text from the fpext instruction?
				craig.topper: Sure

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fpext`'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values. This argument must be smaller in size
				than the result.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a floating point value extended to be larger in size
				than the operand. All restrictions that apply to the fpext instruction also
				apply to this intrinsic.
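
Widening to a larger type loses no information, which is why this intrinsic takes no rounding mode argument. The following is a minimal C++ sketch of that property, using a `float` to `double` cast as a stand-in for fpext; the helper names are hypothetical.

```cpp
#include <cassert>

// Widening conversion: every float value is exactly representable as a double.
double widen(float x) { return static_cast<double>(x); }

void check_fpext_is_exact() {
  const float samples[] = {0.1f, 1.0f / 3.0f, 16777216.0f, -2.5f};
  for (float f : samples) {
    // Narrowing the widened value recovers the original float exactly,
    // so no rounding decision is ever made on the way up.
    assert(static_cast<float>(widen(f)) == f);
  }
}
```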

	Constrained libm-equivalent Intrinsics			Constrained libm-equivalent Intrinsics
	--------------------------------------			--------------------------------------

	In addition to the basic floating-point operations for which constrained			In addition to the basic floating-point operations for which constrained
	intrinsics are described above, there are constrained versions of various			intrinsics are described above, there are constrained versions of various
	operations which provide equivalent behavior to a corresponding libm function.			operations which provide equivalent behavior to a corresponding libm function.
	These intrinsics allow the precise behavior of these operations with respect to			These intrinsics allow the precise behavior of these operations with respect to
	rounding mode and exception behavior to be controlled.			rounding mode and exception behavior to be controlled.

	As with the basic constrained floating-point intrinsics, the rounding mode			As with the basic constrained floating-point intrinsics, the rounding mode
	and exception behavior arguments only control the behavior of the optimizer.			and exception behavior arguments only control the behavior of the optimizer.
	They do not change the runtime floating-point environment.			They do not change the runtime floating-point environment.


	'``llvm.experimental.constrained.sqrt``' Intrinsic			'``llvm.experimental.constrained.sqrt``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

				andrew.w.kaylor: This is a replacement for fpext, right? I think you should say that somewhere.
				kpn (author): I think I picked names for the intrinsics that matched the SelectionDAG node enum. Perhaps it would be better to match the bitcode language names? In which case this intrinsic would be "fpext" instead of "extend". Either way that's a good idea for the documentation to at least mention fpext.
				andrew.w.kaylor: These intrinsics should be driven by what the front end needs. If no front end is generating an equivalent now then we don't want an intrinsic. So, yes, please match the bitcode language names.
				kpn (author): Will do.
				kpn (author): Say, now that I think about this, what does rounding have to do with extending a FP value? Shouldn't this intrinsic just plain not have rounding metadata?
				andrew.w.kaylor: That's a reasonable point. The same issue came up with frem. The rounding mode doesn't apply to frem, but I kept the argument just so that it could be handled internally the same way as the other constrained intrinsics and then documented in the language reference that the rounding mode has no effect. I'm not completely convinced that was a good decision on my part. If you want to look at what would have to change to support constrained intrinsics with no rounding mode argument, I would likely support that. I don't think it would be much extra handling. There are just some things where the class that is used to represent the intrinsics (ConstrainedFPIntrinsic) would need to be aware of the possibility that this argument is omitted.
	declare <type>			declare <type>
	@llvm.experimental.constrained.sqrt(<type> <op1>,			@llvm.experimental.constrained.sqrt(<type> <op1>,
	metadata <rounding mode>,			metadata <rounding mode>,
	metadata <exception behavior>)			metadata <exception behavior>)

	Overview:			Overview:
	"""""""""			"""""""""

	▲ Show 20 Lines • Show All 1,352 Lines • Show Last 20 Lines

include/llvm/CodeGen/ISDOpcodes.h

Show First 20 Lines • Show All 525 Lines • ▼ Show 20 Lines	enum NodeType {
/// in a register of the same size. This operation effectively just		/// in a register of the same size. This operation effectively just
/// discards excess precision. The type to round down to is specified by		/// discards excess precision. The type to round down to is specified by
/// the VT operand, a VTSDNode.		/// the VT operand, a VTSDNode.
FP_ROUND_INREG,		FP_ROUND_INREG,

/// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type.		/// X = FP_EXTEND(Y) - Extend a smaller FP type into a larger FP type.
FP_EXTEND,		FP_EXTEND,

		STRICT_FP_TO_SINT,
		craig.topper: These need comments. Does STRICT_FP_ROUND have the TRUNC argument that FP_ROUND has?
		kpn (author): I could rearrange and lump them in with the non-strict versions of each. Then I'd just need one extra line restating that the STRICT_ versions prevent optimizations. Yes, STRICT_FP_ROUND does have the TRUNC argument that FP_ROUND has, but it is currently always zero. Fixing this requires rerouting this one strict node to go through the same codepath as the non-strict node. That would make it different from all the other constrained nodes. Should I go ahead and make that change? I _think_ it is safe if the TRUNC argument really does work like it is documented. I also just noticed that I need to put back STRICT_FP_TO_UINT in at least one place. It should be everywhere STRICT_FP_TO_SINT is handled _except_ in the default lowering.
		cameron.mcinally: This ordering does not mesh with what is already in place. The STRICT_XXX opcodes are currently clustered together, where these new opcodes are grouped with their non-strict counterparts. I'm not opposed to grouping the corresponding non-strict and strict opcodes, but that would probably be better left to a separate patch. I think it makes sense to keep everything uniform until a final decision is made, for clarity's sake.
		STRICT_FP_TO_UINT,
		STRICT_FP_ROUND,
		STRICT_FP_EXTEND,

/// BITCAST - This operator converts between integer, vector and FP		/// BITCAST - This operator converts between integer, vector and FP
/// values, as if the value was stored to memory with one type and loaded		/// values, as if the value was stored to memory with one type and loaded
/// from the same address with the other type (or equivalently for vector		/// from the same address with the other type (or equivalently for vector
/// format conversions, etc). The source and result are required to have		/// format conversions, etc). The source and result are required to have
/// the same bit size (e.g. f32 <-> i32). This can also be used for		/// the same bit size (e.g. f32 <-> i32). This can also be used for
/// int-to-int or fp-to-fp conversions, but that is a noop, deleted by		/// int-to-int or fp-to-fp conversions, but that is a noop, deleted by
/// getNode().		/// getNode().
///		///
▲ Show 20 Lines • Show All 457 Lines • Show Last 20 Lines

include/llvm/CodeGen/Passes.h

Show First 20 Lines • Show All 431 Lines • ▼ Show 20 Lines	/// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.
FunctionPass *createBreakFalseDeps();		FunctionPass *createBreakFalseDeps();

// This pass expands indirectbr instructions.		// This pass expands indirectbr instructions.
FunctionPass *createIndirectBrExpandPass();		FunctionPass *createIndirectBrExpandPass();

/// Creates CFI Instruction Inserter pass. \see CFIInstrInserter.cpp		/// Creates CFI Instruction Inserter pass. \see CFIInstrInserter.cpp
FunctionPass *createCFIInstrInserter();		FunctionPass *createCFIInstrInserter();

		// Experimental pass with transforms needed for strict fp
		FunctionPass *createStrictFPPass(TargetMachine *);
} // End llvm namespace		} // End llvm namespace

#endif		#endif

include/llvm/CodeGen/SelectionDAGNodes.h

Show First 20 Lines • Show All 641 Lines • ▼ Show 20 Lines	switch (NodeType) {
case ISD::STRICT_FCOS:		case ISD::STRICT_FCOS:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
return true;		return true;
}		}
}		}

/// Test if this node has a post-isel opcode, directly		/// Test if this node has a post-isel opcode, directly
/// corresponding to a MachineInstr opcode.		/// corresponding to a MachineInstr opcode.
bool isMachineOpcode() const { return NodeType < 0; }		bool isMachineOpcode() const { return NodeType < 0; }

▲ Show 20 Lines • Show All 1,742 Lines • Show Last 20 Lines

include/llvm/IR/IntrinsicInst.h

Show First 20 Lines • Show All 186 Lines • ▼ Show 20 Lines	public:
static bool classof(const IntrinsicInst *I) {		static bool classof(const IntrinsicInst *I) {
switch (I->getIntrinsicID()) {		switch (I->getIntrinsicID()) {
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
▲ Show 20 Lines • Show All 531 Lines • Show Last 20 Lines

include/llvm/IR/Intrinsics.td

Show First 20 Lines • Show All 487 Lines • ▼ Show 20 Lines	let IntrProperties = [IntrInaccessibleMemOnly] in {
def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

def int_experimental_constrained_fma : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fma : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptosi : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptoui : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptrunc : Intrinsic<[ llvm_anyfloat_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fpext : Intrinsic<[ llvm_anyfloat_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

// These intrinsics are sensitive to the rounding mode so we need constrained		// These intrinsics are sensitive to the rounding mode so we need constrained
// versions of each of them. When strict rounding and exception control are		// versions of each of them. When strict rounding and exception control are
// not required the non-constrained versions of these intrinsics should be		// not required the non-constrained versions of these intrinsics should be
// used.		// used.
def int_experimental_constrained_sqrt : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_sqrt : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
▲ Show 20 Lines • Show All 483 Lines • Show Last 20 Lines

include/llvm/InitializePasses.h

	Show First 20 Lines • Show All 357 Lines • ▼ Show 20 Lines
	void initializeSlotIndexesPass(PassRegistry&);			void initializeSlotIndexesPass(PassRegistry&);
	void initializeSpeculativeExecutionLegacyPassPass(PassRegistry&);			void initializeSpeculativeExecutionLegacyPassPass(PassRegistry&);
	void initializeSpillPlacementPass(PassRegistry&);			void initializeSpillPlacementPass(PassRegistry&);
	void initializeStackColoringPass(PassRegistry&);			void initializeStackColoringPass(PassRegistry&);
	void initializeStackMapLivenessPass(PassRegistry&);			void initializeStackMapLivenessPass(PassRegistry&);
	void initializeStackProtectorPass(PassRegistry&);			void initializeStackProtectorPass(PassRegistry&);
	void initializeStackSlotColoringPass(PassRegistry&);			void initializeStackSlotColoringPass(PassRegistry&);
	void initializeStraightLineStrengthReducePass(PassRegistry&);			void initializeStraightLineStrengthReducePass(PassRegistry&);
				void initializeStrictFPPassPass(PassRegistry&);
	void initializeStripDeadDebugInfoPass(PassRegistry&);			void initializeStripDeadDebugInfoPass(PassRegistry&);
	void initializeStripDeadPrototypesLegacyPassPass(PassRegistry&);			void initializeStripDeadPrototypesLegacyPassPass(PassRegistry&);
	void initializeStripDebugDeclarePass(PassRegistry&);			void initializeStripDebugDeclarePass(PassRegistry&);
	void initializeStripGCRelocatesPass(PassRegistry&);			void initializeStripGCRelocatesPass(PassRegistry&);
	void initializeStripNonDebugSymbolsPass(PassRegistry&);			void initializeStripNonDebugSymbolsPass(PassRegistry&);
	void initializeStripNonLineTableDebugInfoPass(PassRegistry&);			void initializeStripNonLineTableDebugInfoPass(PassRegistry&);
	void initializeStripSymbolsPass(PassRegistry&);			void initializeStripSymbolsPass(PassRegistry&);
	void initializeStructurizeCFGPass(PassRegistry&);			void initializeStructurizeCFGPass(PassRegistry&);
	Show All 26 Lines

lib/CodeGen/CMakeLists.txt

Show First 20 Lines • Show All 137 Lines • ▼ Show 20 Lines	add_llvm_library(LLVMCodeGen
SlotIndexes.cpp		SlotIndexes.cpp
SpillPlacement.cpp		SpillPlacement.cpp
SplitKit.cpp		SplitKit.cpp
StackColoring.cpp		StackColoring.cpp
StackMapLivenessAnalysis.cpp		StackMapLivenessAnalysis.cpp
StackMaps.cpp		StackMaps.cpp
StackProtector.cpp		StackProtector.cpp
StackSlotColoring.cpp		StackSlotColoring.cpp
		StrictFP.cpp
TailDuplication.cpp		TailDuplication.cpp
TailDuplicator.cpp		TailDuplicator.cpp
TargetFrameLoweringImpl.cpp		TargetFrameLoweringImpl.cpp
TargetInstrInfo.cpp		TargetInstrInfo.cpp
TargetLoweringBase.cpp		TargetLoweringBase.cpp
TargetLoweringObjectFileImpl.cpp		TargetLoweringObjectFileImpl.cpp
TargetOptionsImpl.cpp		TargetOptionsImpl.cpp
TargetPassConfig.cpp		TargetPassConfig.cpp
Show All 24 Lines

lib/CodeGen/CodeGen.cpp

Show First 20 Lines • Show All 88 Lines • ▼ Show 20 Lines	void llvm::initializeCodeGen(PassRegistry &Registry) {
initializeSafeStackLegacyPassPass(Registry);		initializeSafeStackLegacyPassPass(Registry);
initializeScalarizeMaskedMemIntrinPass(Registry);		initializeScalarizeMaskedMemIntrinPass(Registry);
initializeShrinkWrapPass(Registry);		initializeShrinkWrapPass(Registry);
initializeSlotIndexesPass(Registry);		initializeSlotIndexesPass(Registry);
initializeStackColoringPass(Registry);		initializeStackColoringPass(Registry);
initializeStackMapLivenessPass(Registry);		initializeStackMapLivenessPass(Registry);
initializeStackProtectorPass(Registry);		initializeStackProtectorPass(Registry);
initializeStackSlotColoringPass(Registry);		initializeStackSlotColoringPass(Registry);
		initializeStrictFPPassPass(Registry);
initializeTailDuplicatePass(Registry);		initializeTailDuplicatePass(Registry);
initializeTargetPassConfigPass(Registry);		initializeTargetPassConfigPass(Registry);
initializeTwoAddressInstructionPassPass(Registry);		initializeTwoAddressInstructionPassPass(Registry);
initializeUnpackMachineBundlesPass(Registry);		initializeUnpackMachineBundlesPass(Registry);
initializeUnreachableBlockElimLegacyPassPass(Registry);		initializeUnreachableBlockElimLegacyPassPass(Registry);
initializeUnreachableMachineBlockElimPass(Registry);		initializeUnreachableMachineBlockElimPass(Registry);
initializeVirtRegMapPass(Registry);		initializeVirtRegMapPass(Registry);
initializeVirtRegRewriterPass(Registry);		initializeVirtRegRewriterPass(Registry);
initializeWinEHPreparePass(Registry);		initializeWinEHPreparePass(Registry);
initializeXRayInstrumentationPass(Registry);		initializeXRayInstrumentationPass(Registry);
initializeMIRCanonicalizerPass(Registry);		initializeMIRCanonicalizerPass(Registry);
}		}

void LLVMInitializeCodeGen(LLVMPassRegistryRef R) {		void LLVMInitializeCodeGen(LLVMPassRegistryRef R) {
initializeCodeGen(*unwrap(R));		initializeCodeGen(*unwrap(R));
}		}

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Show First 20 Lines • Show All 959 Lines • ▼ Show 20 Lines	switch (Opcode) {
case ISD::STRICT_FCOS: EqOpc = ISD::FCOS; break;		case ISD::STRICT_FCOS: EqOpc = ISD::FCOS; break;
case ISD::STRICT_FEXP: EqOpc = ISD::FEXP; break;		case ISD::STRICT_FEXP: EqOpc = ISD::FEXP; break;
case ISD::STRICT_FEXP2: EqOpc = ISD::FEXP2; break;		case ISD::STRICT_FEXP2: EqOpc = ISD::FEXP2; break;
case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break;		case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break;
case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break;		case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break;
case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break;		case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break;
case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;		case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;
case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;		case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;
		case ISD::STRICT_FP_TO_SINT: EqOpc = ISD::FP_TO_SINT; break;
		andrew.w.kaylor: STRICT_FP_TO_SINT needs to do something more than this. The default lowering of FP_TO_SINT is known to raise spurious FE_INEXACT exceptions because it involves speculative execution.
		kpn: I have seen FP_TO_SINT cause traps when it shouldn't. Would having that default lowering use the chain in the STRICT_ case solve that issue?
		andrew.w.kaylor: No, I don't think the chain will fix this. We need to implement strict lowering that does something different.
		kpn: Is there something we can do at this level to fix this? If there is then I'm all for it, but if there isn't then we should probably still put the intrinsic in. We'll need it eventually, and currently none of the constrained intrinsics solve the complete optimization problem. So I wonder if this intrinsic is really all that different from the other experimental constrained intrinsics. If a backend models FP side effects then wouldn't the existing default lowering work correctly?
		andrew.w.kaylor: To be honest I'm not entirely sure how to fix this in the current selection DAG model. The issue is that we need to introduce a branch to fix the problem, but by the time we're selecting instructions it's too late to do that. I think it needs to be addressed when we're building the DAG,
		case ISD::STRICT_FP_TO_UINT: EqOpc = ISD::FP_TO_UINT; break;
		case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;
		case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;
}		}

auto Action = TLI.getOperationAction(EqOpc, VT);		auto Action = TLI.getOperationAction(EqOpc, VT);

// We don't currently handle Custom or Promote for strict FP pseudo-ops.		// We don't currently handle Custom or Promote for strict FP pseudo-ops.
// For now, we just expand for those cases.		// For now, we just expand for those cases.
if (Action != TargetLowering::Legal)		if (Action != TargetLowering::Legal)
Action = TargetLowering::Expand;		Action = TargetLowering::Expand;
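
The fallback in this hunk can be modelled outside LLVM. The following is a minimal sketch, not the code from the patch: `Opcode`, `Action`, `equivalentOpcode` and `strictAction` are hypothetical stand-ins for the ISD opcodes, the `TargetLowering` legalize actions and `getStrictFPOpcodeAction`, and the `std::map` stands in for the target's operation-action table. It only illustrates the rule stated in the comments: look up the action of the non-strict equivalent and treat anything other than Legal as Expand.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for ISD opcodes and legalize actions (not LLVM types).
enum class Opcode { FSqrt, FpToSint, StrictFSqrt, StrictFpToSint };
enum class Action { Legal, Promote, Expand, Custom };

// Map a strict pseudo-op to its non-strict equivalent; other opcodes map to
// themselves.
Opcode equivalentOpcode(Opcode op) {
  switch (op) {
  case Opcode::StrictFSqrt:    return Opcode::FSqrt;
  case Opcode::StrictFpToSint: return Opcode::FpToSint;
  default:                     return op;
  }
}

// Query the (assumed) target table for the equivalent opcode. Custom and
// Promote are not handled for strict ops, so everything except Legal becomes
// Expand. An opcode missing from the table is also treated as Expand here,
// which is an assumption of this sketch.
Action strictAction(const std::map<Opcode, Action> &targetTable,
                    Opcode strictOp) {
  auto it = targetTable.find(equivalentOpcode(strictOp));
  Action action = (it != targetTable.end()) ? it->second : Action::Expand;
  return action == Action::Legal ? Action::Legal : Action::Expand;
}
```

With a table that marks the non-strict square root Legal and the non-strict signed conversion Custom, the strict square root stays Legal while the strict conversion is forced to Expand.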
▲ Show 20 Lines • Show All 154 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FCOS:		case ISD::STRICT_FCOS:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
// These pseudo-ops get legalized as if they were their non-strict		// These pseudo-ops get legalized as if they were their non-strict
// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT		// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
// is also legal, but if ISD::FSQRT requires expansion then so does		// is also legal, but if ISD::FSQRT requires expansion then so does
// ISD::STRICT_FSQRT.		// ISD::STRICT_FSQRT.
Action = getStrictFPOpcodeAction(TLI, Node->getOpcode(),		Action = getStrictFPOpcodeAction(TLI, Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
		andrew.w.kaylor: Since this gets unique handling why isn't it just a separate case from the others?
		kpn: Good point. And making it a separate case also takes care of the formatting issue in the else block.
break;		break;
default:		default:
if (Node->getOpcode() >= ISD::BUILTIN_OP_END) {		if (Node->getOpcode() >= ISD::BUILTIN_OP_END) {
Action = TargetLowering::Legal;		Action = TargetLowering::Legal;
} else {		} else {
		andrew.w.kaylor: The style/formatting is wrong here. I think you need curly braces around your else-clause and the "else" itself needs to be on the same line as the curly brace above it.
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
}		}
break;		break;
}		}

if (SimpleFinishLegalizing) {		if (SimpleFinishLegalizing) {
SDNode *NewNode = Node;		SDNode *NewNode = Node;
switch (Node->getOpcode()) {		switch (Node->getOpcode()) {
▲ Show 20 Lines • Show All 1,847 Lines • ▼ Show 20 Lines	if (VT.isInteger())
Results.push_back(DAG.getConstant(0, dl, VT));		Results.push_back(DAG.getConstant(0, dl, VT));
else {		else {
assert(VT.isFloatingPoint() && "Unknown value type!");		assert(VT.isFloatingPoint() && "Unknown value type!");
Results.push_back(DAG.getConstantFP(0, dl, VT));		Results.push_back(DAG.getConstantFP(0, dl, VT));
}		}
break;		break;
}		}
case ISD::FP_ROUND:		case ISD::FP_ROUND:
		case ISD::STRICT_FP_ROUND:
case ISD::BITCAST:		case ISD::BITCAST:
Tmp1 = EmitStackConvert(Node->getOperand(0), Node->getValueType(0),		Tmp1 = EmitStackConvert(Node->getOperand(0), Node->getValueType(0),
Node->getValueType(0), dl);		Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
		case ISD::STRICT_FP_EXTEND:
Tmp1 = EmitStackConvert(Node->getOperand(0),		Tmp1 = EmitStackConvert(Node->getOperand(0),
Node->getOperand(0).getValueType(),		Node->getOperand(0).getValueType(),
Node->getValueType(0), dl);		Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::SIGN_EXTEND_INREG: {		case ISD::SIGN_EXTEND_INREG: {
		cameron.mcinally: This doesn't seem correct. Shouldn't Node->getOperand(0) be the argument for FP_ROUND and the Chain for STRICT_FP_ROUND? Same for FP_EXTEND too. Also, does using a truncating store provide us with the same trapping behavior as an explicit trunc instruction? I don't know off the top of my head, but it may be different. To be fair, a user probably won't care too much about optimizing away a trap, but the purists might.
EVT ExtraVT = cast<VTSDNode>(Node->getOperand(1))->getVT();		EVT ExtraVT = cast<VTSDNode>(Node->getOperand(1))->getVT();
EVT VT = Node->getValueType(0);		EVT VT = Node->getValueType(0);

// An in-register sign-extend of a boolean is a negation:		// An in-register sign-extend of a boolean is a negation:
// 'true' (1) sign-extended is -1.		// 'true' (1) sign-extended is -1.
// 'false' (0) sign-extended is 0.		// 'false' (0) sign-extended is 0.
// However, we must mask the high bits of the source operand because the		// However, we must mask the high bits of the source operand because the
// SIGN_EXTEND_INREG does not guarantee that the high bits are already zero.		// SIGN_EXTEND_INREG does not guarantee that the high bits are already zero.
Show All 35 Lines	bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
}		}
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
Tmp1 = ExpandLegalINT_TO_FP(Node->getOpcode() == ISD::SINT_TO_FP,		Tmp1 = ExpandLegalINT_TO_FP(Node->getOpcode() == ISD::SINT_TO_FP,
Node->getOperand(0), Node->getValueType(0), dl);		Node->getOperand(0), Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
		case ISD::STRICT_FP_TO_SINT:
if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))		if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
		cameron.mcinally: This seems incorrect, unless I missed something. The first operand to FP_TO_SINT should be the argument, but the first operand to STRICT_FP_TO_SINT should be the Chain. It does not look like expandFP_TO_SINT accounts for that.
case ISD::FP_TO_UINT: {		case ISD::FP_TO_UINT: {
SDValue True, False;		SDValue True, False;
EVT VT = Node->getOperand(0).getValueType();		EVT VT = Node->getOperand(0).getValueType();
EVT NVT = Node->getValueType(0);		EVT NVT = Node->getValueType(0);
APFloat apf(DAG.EVTToAPFloatSemantics(VT),		APFloat apf(DAG.EVTToAPFloatSemantics(VT),
APInt::getNullValue(VT.getSizeInBits()));		APInt::getNullValue(VT.getSizeInBits()));
APInt x = APInt::getSignMask(NVT.getSizeInBits());		APInt x = APInt::getSignMask(NVT.getSizeInBits());
(void)apf.convertFromAPInt(x, false, APFloat::rmNearestTiesToEven);		(void)apf.convertFromAPInt(x, false, APFloat::rmNearestTiesToEven);
Tmp1 = DAG.getConstantFP(apf, dl, VT);		Tmp1 = DAG.getConstantFP(apf, dl, VT);
Tmp2 = DAG.getSetCC(dl, getSetCCResultType(VT),		Tmp2 = DAG.getSetCC(dl, getSetCCResultType(VT),
Node->getOperand(0),		Node->getOperand(0),
Tmp1, ISD::SETLT);		Tmp1, ISD::SETLT);
True = DAG.getNode(ISD::FP_TO_SINT, dl, NVT, Node->getOperand(0));		True = DAG.getNode(ISD::FP_TO_SINT, dl, NVT, Node->getOperand(0));
// TODO: Should any fast-math-flags be set for the FSUB?		// TODO: Should any fast-math-flags be set for the FSUB?
False = DAG.getNode(ISD::FP_TO_SINT, dl, NVT,		False = DAG.getNode(ISD::FP_TO_SINT, dl, NVT,
DAG.getNode(ISD::FSUB, dl, VT,		DAG.getNode(ISD::FSUB, dl, VT,
Node->getOperand(0), Tmp1));		Node->getOperand(0), Tmp1));
False = DAG.getNode(ISD::XOR, dl, NVT, False,		False = DAG.getNode(ISD::XOR, dl, NVT, False,
DAG.getConstant(x, dl, NVT));		DAG.getConstant(x, dl, NVT));
Tmp1 = DAG.getSelect(dl, NVT, Tmp2, True, False);		Tmp1 = DAG.getSelect(dl, NVT, Tmp2, True, False);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
}		}
		case ISD::STRICT_FP_TO_UINT:
		llvm_unreachable("Expansion of STRICT_FP_TO_UINT missed in earlier pass!");
		break;
case ISD::VAARG:		case ISD::VAARG:
Results.push_back(DAG.expandVAArg(Node));		Results.push_back(DAG.expandVAArg(Node));
Results.push_back(Results[0].getValue(1));		Results.push_back(Results[0].getValue(1));
break;		break;
case ISD::VACOPY:		case ISD::VACOPY:
Results.push_back(DAG.expandVACopy(Node));		Results.push_back(DAG.expandVACopy(Node));
break;		break;
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:

lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show First 20 Lines • Show All 7,080 Lines • ▼ Show 20 Lines	SDNode* SelectionDAG::mutateStrictFPToFP(SDNode *Node) {
case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; IsUnary = true; break;		case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; IsUnary = true; break;
case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; IsUnary = true; break;		case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; IsUnary = true; break;
case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; IsUnary = true; break;		case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; IsUnary = true; break;
case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; IsUnary = true; break;		case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; IsUnary = true; break;
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
NewOpc = ISD::FNEARBYINT;		NewOpc = ISD::FNEARBYINT;
IsUnary = true;		IsUnary = true;
break;		break;
		case ISD::STRICT_FP_TO_SINT: NewOpc = ISD::FP_TO_SINT; break;
		case ISD::STRICT_FP_TO_UINT: NewOpc = ISD::FP_TO_UINT; break;
		case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; IsUnary = true; break;
		case ISD::STRICT_FP_EXTEND: NewOpc = ISD::FP_EXTEND; IsUnary = true; break;
}		}

// We're taking this node out of the chain, so we need to re-link things.		// We're taking this node out of the chain, so we need to re-link things.
		if (OrigOpc != ISD::STRICT_FP_TO_SINT) {
SDValue InputChain = Node->getOperand(0);		SDValue InputChain = Node->getOperand(0);
SDValue OutputChain = SDValue(Node, 1);		SDValue OutputChain = SDValue(Node, 1);
ReplaceAllUsesOfValueWith(OutputChain, InputChain);		ReplaceAllUsesOfValueWith(OutputChain, InputChain);
		}

SDVTList VTs = getVTList(Node->getOperand(1).getValueType());		SDVTList VTs;
SDNode *Res = nullptr;		SDNode *Res = nullptr;
if (IsUnary)
		switch (OrigOpc) {
		cameron.mcinally: There are a lot of comments on this, so I may have missed something. Take with a grain of salt... I don't think these are correct. These can trap so can't be speculatively executed. They would need a chain.
		kpn: I couldn't figure out how to have these be chained but have the non-strict continue to not be chained. Too many things fell over if they didn't match.
		cameron.mcinally: I'm not an expert with this code, so cc @andrew.w.kaylor. This doesn't seem like the right direction though. Maybe these unchained operations should be left to a different patch until a proper solution is found.
		default:
		VTs = getVTList(Node->getOperand(1).getValueType());
		break;
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_EXTEND:
		VTs = getVTList(Node->ValueList[0]);
		break;
		}

		if (OrigOpc == ISD::STRICT_FP_TO_SINT)
		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(0) });
		craig.topper: This doesn't copy the second argument to FP_ROUND over, does it?
		kpn: No. It should.
		else if (IsUnary)
Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1) });		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1) });
else if (IsTernary)		else if (IsTernary)
Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),
Node->getOperand(2),		Node->getOperand(2),
Node->getOperand(3)});		Node->getOperand(3)});
else		else
Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),		Res = MorphNodeTo(Node, NewOpc, VTs, { Node->getOperand(1),
Node->getOperand(2) });		Node->getOperand(2) });

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 5,502 Lines • ▼ Show 20 Lines	setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(2))));		getValue(I.getArgOperand(2))));
return nullptr;		return nullptr;
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
▲ Show 20 Lines • Show All 615 Lines • ▼ Show 20 Lines	case Intrinsic::experimental_constrained_fdiv:
Opcode = ISD::STRICT_FDIV;		Opcode = ISD::STRICT_FDIV;
break;		break;
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
Opcode = ISD::STRICT_FREM;		Opcode = ISD::STRICT_FREM;
break;		break;
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
Opcode = ISD::STRICT_FMA;		Opcode = ISD::STRICT_FMA;
break;		break;
		case Intrinsic::experimental_constrained_fptosi:
		Opcode = ISD::STRICT_FP_TO_SINT;
		break;
		case Intrinsic::experimental_constrained_fptoui:
		Opcode = ISD::STRICT_FP_TO_UINT;
		break;
		case Intrinsic::experimental_constrained_fptrunc:
		Opcode = ISD::STRICT_FP_ROUND;
		break;
		case Intrinsic::experimental_constrained_fpext:
		Opcode = ISD::STRICT_FP_EXTEND;
		break;
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
Opcode = ISD::STRICT_FSQRT;		Opcode = ISD::STRICT_FSQRT;
break;		break;
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
Opcode = ISD::STRICT_FPOW;		Opcode = ISD::STRICT_FPOW;
break;		break;
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
Opcode = ISD::STRICT_FPOWI;		Opcode = ISD::STRICT_FPOWI;
Show All 25 Lines	void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
Opcode = ISD::STRICT_FNEARBYINT;		Opcode = ISD::STRICT_FNEARBYINT;
break;		break;
}		}
const TargetLowering &TLI = DAG.getTargetLoweringInfo();		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
SDValue Chain = getRoot();		SDValue Chain = getRoot();
SmallVector<EVT, 4> ValueVTs;		SmallVector<EVT, 4> ValueVTs;
ComputeValueVTs(TLI, DAG.getDataLayout(), FPI.getType(), ValueVTs);		ComputeValueVTs(TLI, DAG.getDataLayout(), FPI.getType(), ValueVTs);
		if (Opcode != ISD::STRICT_FP_TO_SINT)
ValueVTs.push_back(MVT::Other); // Out chain		ValueVTs.push_back(MVT::Other); // Out chain
		andrew.w.kaylor: Why are you not attaching these nodes to the chain?
		kpn: Because they need to match what the default lowering is expecting. Otherwise a variety of failures happen.
		andrew.w.kaylor: There is code that removes the chain when the strict node is mutated to a non-strict node. That should be preventing lowering problems. I believe the chain was necessary to prevent re-ordering prior to final instruction selection.
		kpn: Mutation happens too late in some cases. If it happened early enough then there wouldn't be any need for the strict intrinsics to be mentioned in SelectionDAGLegalize::ExpandNode(). Since it doesn't we can't use the chain. Should that mutation happen earlier?
		andrew.w.kaylor: We need the mutation to be put off as long as possible. I think it should be possible to unlink the chain in ExpandNode if necessary. Depending on what's happening there, we might even want the expanded node to make use of the chain.

SDVTList VTs = DAG.getVTList(ValueVTs);		SDVTList VTs = DAG.getVTList(ValueVTs);
SDValue Result;		SDValue Result;
if (FPI.isUnaryOp())		if (Opcode == ISD::STRICT_FP_TO_SINT)
		craig.topper: Is this adding a second argument to STRICT_FP_EXTEND as well? I don't think the non-strict FP_EXTEND has two arguments.
		kpn: Agreed. I'll fix it.
		Result = DAG.getNode(Opcode, sdl, VTs,
		{ getValue(FPI.getArgOperand(0)) });
		else if (FPI.isUnaryOp())
Result = DAG.getNode(Opcode, sdl, VTs,		Result = DAG.getNode(Opcode, sdl, VTs,
{ Chain, getValue(FPI.getArgOperand(0)) });		{ Chain, getValue(FPI.getArgOperand(0)) });
else if (FPI.isTernaryOp())		else if (FPI.isTernaryOp())
Result = DAG.getNode(Opcode, sdl, VTs,		Result = DAG.getNode(Opcode, sdl, VTs,
{ Chain, getValue(FPI.getArgOperand(0)),		{ Chain, getValue(FPI.getArgOperand(0)),
getValue(FPI.getArgOperand(1)),		getValue(FPI.getArgOperand(1)),
getValue(FPI.getArgOperand(2)) });		getValue(FPI.getArgOperand(2)) });
else		else
Result = DAG.getNode(Opcode, sdl, VTs,		Result = DAG.getNode(Opcode, sdl, VTs,
{ Chain, getValue(FPI.getArgOperand(0)),		{ Chain, getValue(FPI.getArgOperand(0)),
getValue(FPI.getArgOperand(1)) });		getValue(FPI.getArgOperand(1)) });

		if (Opcode != ISD::STRICT_FP_TO_SINT) {
assert(Result.getNode()->getNumValues() == 2);		assert(Result.getNode()->getNumValues() == 2);
SDValue OutChain = Result.getValue(1);		SDValue OutChain = Result.getValue(1);
DAG.setRoot(OutChain);		DAG.setRoot(OutChain);
		}
SDValue FPResult = Result.getValue(0);		SDValue FPResult = Result.getValue(0);
setValue(&FPI, FPResult);		setValue(&FPI, FPResult);
}		}

std::pair<SDValue, SDValue>		std::pair<SDValue, SDValue>
SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,		SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,
const BasicBlock *EHPadBB) {		const BasicBlock *EHPadBB) {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();

lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 284 Lines • ▼ Show 20 Lines	#endif
case ISD::ZERO_EXTEND: return "zero_extend";		case ISD::ZERO_EXTEND: return "zero_extend";
case ISD::ANY_EXTEND: return "any_extend";		case ISD::ANY_EXTEND: return "any_extend";
case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";		case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";
case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";		case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";
case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";		case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";
case ISD::ZERO_EXTEND_VECTOR_INREG: return "zero_extend_vector_inreg";		case ISD::ZERO_EXTEND_VECTOR_INREG: return "zero_extend_vector_inreg";
case ISD::TRUNCATE: return "truncate";		case ISD::TRUNCATE: return "truncate";
case ISD::FP_ROUND: return "fp_round";		case ISD::FP_ROUND: return "fp_round";
		case ISD::STRICT_FP_ROUND: return "strict_fp_round";
case ISD::FLT_ROUNDS_: return "flt_rounds";		case ISD::FLT_ROUNDS_: return "flt_rounds";
case ISD::FP_ROUND_INREG: return "fp_round_inreg";		case ISD::FP_ROUND_INREG: return "fp_round_inreg";
case ISD::FP_EXTEND: return "fp_extend";		case ISD::FP_EXTEND: return "fp_extend";
		case ISD::STRICT_FP_EXTEND: return "strict_fp_extend";

case ISD::SINT_TO_FP: return "sint_to_fp";		case ISD::SINT_TO_FP: return "sint_to_fp";
case ISD::UINT_TO_FP: return "uint_to_fp";		case ISD::UINT_TO_FP: return "uint_to_fp";
case ISD::FP_TO_SINT: return "fp_to_sint";		case ISD::FP_TO_SINT: return "fp_to_sint";
		case ISD::STRICT_FP_TO_SINT: return "strict_fp_to_sint";
case ISD::FP_TO_UINT: return "fp_to_uint";		case ISD::FP_TO_UINT: return "fp_to_uint";
		case ISD::STRICT_FP_TO_UINT: return "strict_fp_to_uint";
case ISD::BITCAST: return "bitcast";		case ISD::BITCAST: return "bitcast";
case ISD::ADDRSPACECAST: return "addrspacecast";		case ISD::ADDRSPACECAST: return "addrspacecast";
case ISD::FP16_TO_FP: return "fp16_to_fp";		case ISD::FP16_TO_FP: return "fp16_to_fp";
case ISD::FP_TO_FP16: return "fp_to_fp16";		case ISD::FP_TO_FP16: return "fp_to_fp16";

// Control flow instructions		// Control flow instructions
case ISD::BR: return "br";		case ISD::BR: return "br";
case ISD::BRIND: return "brind";		case ISD::BRIND: return "brind";

lib/CodeGen/StrictFP.cpp

				//===----- StrictFP.cpp - Required transforms for strict FP ---------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// This file contains transforms necessary for strict floating point
				/// operations.
				andrew.w.kaylor: Can you say more here about what these transformations are? It's clear that you intend this as a generic pass that currently does one thing but might have others added later. That's good, but I'd like to see the possibilities described here as they are implemented.
				///
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/CodeGen/TargetLowering.h"
				#include "llvm/CodeGen/TargetSubtargetInfo.h"
				#include "llvm/IR/InstrTypes.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/Intrinsics.h"
				#include "llvm/IR/Module.h"
				#include "llvm/Pass.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/raw_ostream.h"
				#include "llvm/Transforms/IPO.h"
				#include "llvm/Transforms/Utils/BasicBlockUtils.h"
				using namespace llvm;

				#define DEBUG_TYPE "constrained-fp-transforms"

				STATISTIC(NumStrictFPOps, "Number of strict floating point ops transformed");

				namespace {

				class StrictFPPass : public FunctionPass {
				public:
				static char ID;

				std::vector<IntrinsicInst *> IntrinsicWorkList;
				const DataLayout *DL;
				TargetMachine *TM = nullptr;

				StrictFPPass() : FunctionPass(ID) {
				initializeStrictFPPassPass(*PassRegistry::getPassRegistry());
				}

				StrictFPPass(TargetMachine *TM) : FunctionPass(ID), TM(TM) {
				initializeStrictFPPassPass(*PassRegistry::getPassRegistry());
				}

				bool runOnFunction(Function &) override;

				private:
				void inspectIntrinsicCall(IntrinsicInst *, const TargetLowering *);

				bool processIntrinsicCall(LLVMContext &, IntrinsicInst *);

				void replaceConstrainedFPToUI(LLVMContext &, IntrinsicInst *);
				};

				bool StrictFPPass::runOnFunction(Function &F) {
				bool Changed = false;
				DL = &F.getParent()->getDataLayout();

				LLVMContext &Context = F.getParent()->getContext();

				auto *TLI = TM->getSubtargetImpl(F)->getTargetLowering();

				for (auto &BB : F) {
				for (auto &I : BB) {
				if (auto *Call = dyn_cast<IntrinsicInst>(&I))
				andrew.w.kaylor: Per the LLVM Programmer's Manual (http://llvm.org/docs/ProgrammersManual.html#iterating-over-the-instruction-in-a-function) you should be using an inst_iterator here.
				inspectIntrinsicCall(Call, TLI);
				}
				craig.topper: I believe you should be getting TM by doing this: auto *TPC = getAnalysisIfAvailable<TargetPassConfig>(); if (!TPC) return false; auto &TM = TPC->getTM<TargetMachine>(); There is only one other pass that takes TargetMachine in its constructor. The others use what I've put above. So I believe that is the preferred way.
				}

				for (auto *I : IntrinsicWorkList) {
				Changed |= processIntrinsicCall(Context, I);
				}

				IntrinsicWorkList.clear();

				craig.topper: There is little reason to pass Context around. Value has a getContext as does Type. So you can get the context easily whenever you need it.
				return Changed;
				}

				void StrictFPPass::inspectIntrinsicCall(IntrinsicInst *I,
				const TargetLowering *TLI) {

				switch (Intrinsic::ID IID = I->getIntrinsicID()) {
				default:
				return;
				case Intrinsic::experimental_constrained_fptoui:
				Value *IntDst = cast<Value>(I);
				Type *IntDstType = IntDst->getType();
				EVT VT = EVT::getEVT(IntDstType, true);

				auto Action = TLI->getOperationAction(ISD::FP_TO_UINT, VT);
				craig.topper (Done): This cast is unnecessary. IntrinsicInst is a subclass of Value.

				// We don't currently handle Custom or Promote for strict FP pseudo-ops.
				craig.topper (Done): This should use TLI->getValueType.
				// For now, we just expand for those cases.
				if (Action != TargetLowering::Legal)
				Action = TargetLowering::Expand;

				if (Action == TargetLowering::Expand)
				IntrinsicWorkList.push_back(I);

				break;
				}
				return;
				}

				bool StrictFPPass::processIntrinsicCall(LLVMContext &Context,
				IntrinsicInst *Call) {
				switch (Intrinsic::ID IID = Call->getIntrinsicID()) {
				default:
				return false;
				case Intrinsic::experimental_constrained_fptoui:
				replaceConstrainedFPToUI(Context, Call);
				break;
				}
				return true;
				}

				void StrictFPPass::replaceConstrainedFPToUI(LLVMContext &Context,
				IntrinsicInst *I) {

				// Four blocks:
				// #1 Gets the compare instruction, is the original block
				// #2 Gets conversion instructions when in signed range
				// #3 Conversion instructions when out of signed range
				// #4 Gets the PHI plus the remainder of the original block
				//
				// The original call gets replaced with the PHI

				andrew.w.kaylor (Done): Could you add an example here of what the resulting IR will look like? It would make the code a lot easier to follow.
				SmallVector<Value *, 4> Operands(I->arg_operands());
				Value *IntDst = cast<Value>(I);
				Value *FPSrc = Operands[0];
				Value *ExBehavior = Operands[1];

				auto *t = cast<IntegerType>(I->getType());
				APInt IntMaxAP(DL->getTypeStoreSize(t) * 8, t->getSignBit());
				APFloat FPMaxAP((double)0);
				andrew.w.kaylor (Not Done): I think you can use "IntDst->getBitWidth()" here. Also, APInt::getSignedMinValue() does this same thing and is a bit more self-documenting.
				kpn (Author, Not Done): error: no member named 'getBitWidth' in 'llvm::Value'.
				FPMaxAP.convertFromAPInt(IntMaxAP, false, APFloat::rmNearestTiesToEven);
				Constant *FPMaxSIntV = ConstantFP::get(FPSrc->getType(), FPMaxAP);
				Constant *IntMaxSIntV = ConstantInt::get(IntDst->getType(), IntMaxAP);

				/* TODO: should this be FCMP_OLT? Ordered? */
				/* TODO: strict version of compare? */
				andrew.w.kaylor (Done): I believe conversion of a NaN to an integer should raise "INVALID" (which the fcmp will) and then the result is undefined, but the 'true' case does less so I think ULT is preferable.
				FCmpInst *FPCompare = new FCmpInst(I, FCmpInst::FCMP_ULT, FPSrc, FPMaxSIntV,
				andrew.w.kaylor (Done): We are going to need a constrained version of fcmp, and when we have it you should use it here. When the IRBuilder supports constrained floating point modes, it would be nice to use that here but I guess you can't do that yet, so maybe just a comment saying we should later?
				"not_too_high_sint");

				andrew.w.kaylor (Done): This is an odd name. How about "within.sint.range"? In any case, I think '.' is more common than '_' as a name space holder, probably because the name will automatically get '.<n>' appended if it's a duplicate.
				TerminatorInst *ThenTerm, *ElseTerm;
				SplitBlockAndInsertIfThenElse(FPCompare, I, &ThenTerm, &ElseTerm);

				BasicBlock *StraightConvBB = ThenTerm->getParent();
				BasicBlock *NotInRangeBB = ElseTerm->getParent();
				BasicBlock *ExitBB = I->getParent();
				Function *F2SI = Intrinsic::getDeclaration(
				ExitBB->getModule(), Intrinsic::experimental_constrained_fptosi,
				{I->getType(), FPSrc->getType()});
				Function *FSUB = Intrinsic::getDeclaration(
				ExitBB->getModule(), Intrinsic::experimental_constrained_fsub,
				FPSrc->getType());

				craig.topper (Done): Why do you need a SmallVector here? Why can't you just call getArgOperand?
				MDString *SubRoundingMDS = MDString::get(Context, "round.dynamic");
				craig.topper (Done): This cast is unnecessary.
				Value *SubRounding = MetadataAsValue::get(Context, SubRoundingMDS);

				CallInst *ThenSIntCall;
				CallInst *ElseSIntCall;
				craig.topper (Not Done): What if the intrinsic uses a vector type?
				kpn (Author, Not Done): It would have been caught by the IR verifier. A vector would have been rejected there.
				craig.topper (Not Done): I don't see where the IR verifier rejects vectors. I just see that it checks that element counts are equal. And why should it reject vectors? We need to support fptosi/fptoui for vectors.
				kpn (Author, Not Done): It doesn't reject them, now. This latest patch added support for vectors. My inexperience with Phabricator lost the comment that said that.
				Instruction *BiasedFPSrc;
				Instruction *ElseSIntResult;

				ThenSIntCall = CallInst::Create(F2SI, {FPSrc, ExBehavior}, "", ThenTerm);
				BranchInst::Create(ExitBB, StraightConvBB);
				ThenTerm->eraseFromParent();

				BiasedFPSrc = CallInst::Create(
				FSUB, {FPSrc, FPMaxSIntV, SubRounding, ExBehavior}, "", ElseTerm);
				ElseSIntCall =
				CallInst::Create(F2SI, {BiasedFPSrc, ExBehavior}, "", ElseTerm);
				ElseSIntResult = BinaryOperator::Create(BinaryOperator::Xor, ElseSIntCall,
				IntMaxSIntV, "", ElseTerm);
				BranchInst::Create(ExitBB, NotInRangeBB);
				ElseTerm->eraseFromParent();

				PHINode *PN = PHINode::Create(ElseSIntResult->getType(), 2, "", I);
				PN->addIncoming(ThenSIntCall, ThenSIntCall->getParent());
				PN->addIncoming(ElseSIntResult, ElseSIntResult->getParent());
				I->replaceAllUsesWith(PN);
				I->eraseFromParent();

				++NumStrictFPOps;
				}

				} // End anonymous namespace

				char StrictFPPass::ID = 0;
				INITIALIZE_PASS(StrictFPPass, DEBUG_TYPE, "Force constrained floating point",
				false, false)
				andrew.w.kaylor (Done): This description is wrong.

				FunctionPass *llvm::createStrictFPPass(TargetMachine *TM) {
				return new StrictFPPass(TM);
				}
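
The four-block expansion in replaceConstrainedFPToUI can be modeled on plain scalars. The following is a minimal sketch, not the pass itself: it assumes a 64-bit double source and a 32-bit integer result, and the names `fcmpULT` and `fptouiViaFptosi` are hypothetical helpers introduced only for illustration. It shows why the else branch subtracts 2^31 and then XORs the sign bit back in, and why FCMP_ULT sends a NaN down the cheaper 'then' path.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Model of FCmpInst::FCMP_ULT ("unordered or less than"):
// true when A < B, and also true when either operand is NaN.
bool fcmpULT(double A, double B) {
  return std::isnan(A) || std::isnan(B) || A < B;
}

// Scalar model of the fptoui expansion using only a signed conversion.
// Assumption of this sketch: X is not NaN and lies in [0, 2^32); anything
// else is outside what the model covers (and would be UB in C++).
std::uint32_t fptouiViaFptosi(double X) {
  assert(X >= 0.0 && X < 4294967296.0);

  const double SignedLimit = 2147483648.0;   // 2^31, plays the role of FPMaxSIntV
  const std::uint32_t SignBit = 0x80000000u; // plays the role of IntMaxSIntV

  // Block #1: the compare.
  if (fcmpULT(X, SignedLimit)) {
    // Block #2: the value fits in the signed range, so convert directly.
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(X));
  }

  // Block #3: bias the value into the signed range, convert, then restore
  // the top bit with XOR.
  const double Biased = X - SignedLimit;
  const std::int32_t Low = static_cast<std::int32_t>(Biased);
  return static_cast<std::uint32_t>(Low) ^ SignBit;
  // Block #4 (the PHI) corresponds to the two return statements merging.
}
```

For example, `fptouiViaFptosi(3000000000.0)` takes the else path: the biased value 852516352.0 converts to a signed integer whose top bit is clear, so the XOR acts as adding 2^31 back.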

lib/CodeGen/TargetPassConfig.cpp

Show First 20 Lines • Show All 156 Lines • ▼ Show 20 Lines	static cl::opt<CFLAAType> UseCFLAA(
cl::values(clEnumValN(CFLAAType::None, "none", "Disable CFL-AA"),		cl::values(clEnumValN(CFLAAType::None, "none", "Disable CFL-AA"),
clEnumValN(CFLAAType::Steensgaard, "steens",		clEnumValN(CFLAAType::Steensgaard, "steens",
"Enable unification-based CFL-AA"),		"Enable unification-based CFL-AA"),
clEnumValN(CFLAAType::Andersen, "anders",		clEnumValN(CFLAAType::Andersen, "anders",
"Enable inclusion-based CFL-AA"),		"Enable inclusion-based CFL-AA"),
clEnumValN(CFLAAType::Both, "both",		clEnumValN(CFLAAType::Both, "both",
"Enable both variants of CFL-AA")));		"Enable both variants of CFL-AA")));

		static cl::opt<bool>
		StrictFP("strict-fp-transforms", cl::init(true), cl::Hidden,
		cl::desc("Enable transformations needed for strict FP."));

/// Option names for limiting the codegen pipeline.		/// Option names for limiting the codegen pipeline.
/// Those are used in error reporting and we didn't want		/// Those are used in error reporting and we didn't want
/// to duplicate their names all over the place.		/// to duplicate their names all over the place.
const char *StartAfterOptName = "start-after";		const char *StartAfterOptName = "start-after";
const char *StartBeforeOptName = "start-before";		const char *StartBeforeOptName = "start-before";
const char *StopAfterOptName = "stop-after";		const char *StopAfterOptName = "stop-after";
const char *StopBeforeOptName = "stop-before";		const char *StopBeforeOptName = "stop-before";

▲ Show 20 Lines • Show All 379 Lines • ▼ Show 20 Lines
#endif		#endif
if (Verify)		if (Verify)
PM->add(createMachineVerifierPass(Banner));		PM->add(createMachineVerifierPass(Banner));
}		}

/// Add common target configurable passes that perform LLVM IR to IR transforms		/// Add common target configurable passes that perform LLVM IR to IR transforms
/// following machine independent optimization.		/// following machine independent optimization.
void TargetPassConfig::addIRPasses() {		void TargetPassConfig::addIRPasses() {
		// Experimental pass with transforms needed for strict FP.
		if (StrictFP)
		addPass(createStrictFPPass(TM));
		andrew.w.kaylor (Done): This doesn't seem like the right place to do this. Should it be happening much later, like around CodeGenPrepare?

switch (UseCFLAA) {		switch (UseCFLAA) {
case CFLAAType::Steensgaard:		case CFLAAType::Steensgaard:
addPass(createCFLSteensAAWrapperPass());		addPass(createCFLSteensAAWrapperPass());
break;		break;
case CFLAAType::Andersen:		case CFLAAType::Andersen:
addPass(createCFLAndersAAWrapperPass());		addPass(createCFLAndersAAWrapperPass());
break;		break;
case CFLAAType::Both:		case CFLAAType::Both:
▲ Show 20 Lines • Show All 583 Lines • Show Last 20 Lines

lib/IR/IntrinsicInst.cpp

Show First 20 Lines • Show All 128 Lines • ▼ Show 20 Lines	return StringSwitch<ExceptionBehavior>(ExceptionArg)
.Case("fpexcept.strict", ebStrict)		.Case("fpexcept.strict", ebStrict)
.Default(ebInvalid);		.Default(ebInvalid);
}		}

bool ConstrainedFPIntrinsic::isUnaryOp() const {		bool ConstrainedFPIntrinsic::isUnaryOp() const {
switch (getIntrinsicID()) {		switch (getIntrinsicID()) {
default:		default:
return false;		return false;
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
Show All 15 Lines

lib/IR/Verifier.cpp

Show First 20 Lines • Show All 4,033 Lines • ▼ Show 20 Lines	Assert(isa<ConstantInt>(CS.getArgOperand(1)),
CS);		CS);
break;		break;
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
▲ Show 20 Lines • Show All 387 Lines • ▼ Show 20 Lines	static DISubprogram *getSubprogram(Metadata *LocalScope) {

// Just return null; broken scope chains are checked elsewhere.		// Just return null; broken scope chains are checked elsewhere.
assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");		assert(!isa<DILocalScope>(LocalScope) && "Unknown type of local scope");
return nullptr;		return nullptr;
}		}

void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {		void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
unsigned NumOperands = FPI.getNumArgOperands();		unsigned NumOperands = FPI.getNumArgOperands();
		bool HasExceptionMD = false;
		bool HasRoundingMD = false;
		switch (FPI.getIntrinsicID())
		{
		case Intrinsic::experimental_constrained_fadd:
		case Intrinsic::experimental_constrained_fsub:
		case Intrinsic::experimental_constrained_fmul:
		case Intrinsic::experimental_constrained_fdiv:
		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_sqrt:
		case Intrinsic::experimental_constrained_pow:
		case Intrinsic::experimental_constrained_powi:
		case Intrinsic::experimental_constrained_sin:
		case Intrinsic::experimental_constrained_cos:
		case Intrinsic::experimental_constrained_exp:
		case Intrinsic::experimental_constrained_exp2:
		case Intrinsic::experimental_constrained_log:
		case Intrinsic::experimental_constrained_log10:
		case Intrinsic::experimental_constrained_log2:
		case Intrinsic::experimental_constrained_rint:
		case Intrinsic::experimental_constrained_nearbyint:
Assert(((NumOperands == 5 && FPI.isTernaryOp()) ||		Assert(((NumOperands == 5 && FPI.isTernaryOp()) ||
		andrew.w.kaylor (Done): Since you've broken this out into a switch statement, can you separate the unary and ternary ops and give them each the appropriate assert? I think that would be much more readable than this compound check (which I realize was my creation).
(NumOperands == 3 && FPI.isUnaryOp()) || (NumOperands == 4)),		(NumOperands == 3 && FPI.isUnaryOp()) || (NumOperands == 4)),
"invalid arguments for constrained FP intrinsic", &FPI);		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		HasRoundingMD = true;
		break;

		case Intrinsic::experimental_constrained_frem:
		Assert((NumOperands == 3),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		break;

		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui: {
		Assert((NumOperands == 2),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		Value *Operand = FPI.getArgOperand(0);
		Assert(Operand->getType()->isFloatingPointTy(),
		"Constrained FP intrinsic first argument must be floating point",
		&FPI);
		Operand = &FPI;
		Assert(Operand->getType()->isIntegerTy(),
		"Constrained FP intrinsic result must be an integer",
		&FPI);
		}
		break;

		case Intrinsic::experimental_constrained_fptrunc:
		case Intrinsic::experimental_constrained_fpext: {
		Assert((NumOperands == 2),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;
		Value *Operand = FPI.getArgOperand(0);
		Assert(Operand->getType()->isFloatingPointTy(),
		"Constrained FP intrinsic first argument must be floating point",
		&FPI);
		Operand = &FPI;
		Assert(Operand->getType()->isFloatingPointTy(),
		"Constrained FP intrinsic result must be floating point",
		&FPI);
		}
		break;
		kbarton (Not Done): Could you use the isFPOrFPVectorTy here to check both conditions, and then remove the assert in the else below?

		kbarton (Not Done): Can you use a dyn_cast here instead? `if (auto *OperandT = dyn_cast<VectorType>(Operand->getType())) { do vector stuff }`
		kpn (Author, Done): Yes, that's much more concise.
		default:
		break;
		andrew.w.kaylor (Done): The default should probably always be an error.
		}

		if (HasExceptionMD) {
Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-1)),		Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-1)),
"invalid exception behavior argument", &FPI);		"invalid exception behavior argument", &FPI);
Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-2)),		Assert(FPI.getExceptionBehavior() != ConstrainedFPIntrinsic::ebInvalid,
		"invalid exception behavior argument", &FPI);
		}
		if (HasRoundingMD) {
		int RoundingOffset = (HasExceptionMD ? 2 : 1);
		andrew.w.kaylor (Not Done): How about this? int RoundingIdx = (HasExceptionMD ? NumOperands - 2 : NumOperands - 1); On the other hand, are we ever going to have intrinsics that have a rounding mode but not exception behavior?
		kpn (Author, Not Done): Done. I don't know if we'll have a rounding mode but not exceptions. I doubt it, but I can't say for certain.
		Assert(isa<MetadataAsValue>(FPI.getArgOperand(NumOperands-RoundingOffset)),
"invalid rounding mode argument", &FPI);		"invalid rounding mode argument", &FPI);
Assert(FPI.getRoundingMode() != ConstrainedFPIntrinsic::rmInvalid,		Assert(FPI.getRoundingMode() != ConstrainedFPIntrinsic::rmInvalid,
		kbarton (Done): Similarly, could you use isIntOrIntVectorTy here and remove the assert in the if/else below?
"invalid rounding mode argument", &FPI);		"invalid rounding mode argument", &FPI);
Assert(FPI.getExceptionBehavior() != ConstrainedFPIntrinsic::ebInvalid,		}
"invalid exception behavior argument", &FPI);
}		}
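
The trailing-operand arithmetic discussed in the comments above (exception behavior last, rounding mode just before it) can be sketched without any LLVM types. This is only an illustration under that layout assumption; `roundingOperandIndex` is a hypothetical helper, not something in the patch. It keeps the index unsigned and rejects operand counts that would wrap around, which is the concern raised about passing a possibly negative int to getArgOperand.

```cpp
// Computes the index of the rounding-mode metadata operand, assuming the
// metadata operands trail the call's arguments: rounding mode first, then
// (optionally) exception behavior as the very last operand.
// Returns false when the call has too few operands, instead of letting the
// unsigned subtraction wrap around.
bool roundingOperandIndex(unsigned NumOperands, bool HasExceptionMD,
                          unsigned &Idx) {
  const unsigned Trailing = HasExceptionMD ? 2u : 1u;
  if (NumOperands < Trailing)
    return false;
  Idx = NumOperands - Trailing;
  return true;
}
```

With four operands and exception metadata present (the binary-operation case in the assert above), the rounding mode is operand 2 and the exception behavior is operand 3.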

void Verifier::visitDbgIntrinsic(StringRef Kind, DbgInfoIntrinsic &DII) {		void Verifier::visitDbgIntrinsic(StringRef Kind, DbgInfoIntrinsic &DII) {
auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();		auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();
AssertDI(isa<ValueAsMetadata>(MD) ||		AssertDI(isa<ValueAsMetadata>(MD) ||
(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),		(isa<MDNode>(MD) && !cast<MDNode>(MD)->getNumOperands()),
"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);		"invalid llvm.dbg." + Kind + " intrinsic address/value", &DII, MD);
AssertDI(isa<DILocalVariable>(DII.getRawVariable()),		AssertDI(isa<DILocalVariable>(DII.getRawVariable()),
"invalid llvm.dbg." + Kind + " intrinsic variable", &DII,		"invalid llvm.dbg." + Kind + " intrinsic variable", &DII,
DII.getRawVariable());		DII.getRawVariable());
AssertDI(isa<DIExpression>(DII.getRawExpression()),		AssertDI(isa<DIExpression>(DII.getRawExpression()),
"invalid llvm.dbg." + Kind + " intrinsic expression", &DII,		"invalid llvm.dbg." + Kind + " intrinsic expression", &DII,
DII.getRawExpression());		DII.getRawExpression());

// Ignore broken !dbg attachments; they're checked elsewhere.		// Ignore broken !dbg attachments; they're checked elsewhere.
if (MDNode *N = DII.getDebugLoc().getAsMDNode())		if (MDNode *N = DII.getDebugLoc().getAsMDNode())
if (!isa<DILocation>(N))		if (!isa<DILocation>(N))
return;		return;

BasicBlock *BB = DII.getParent();		BasicBlock *BB = DII.getParent();
Function *F = BB ? BB->getParent() : nullptr;		Function *F = BB ? BB->getParent() : nullptr;

		kbarton (Not Done): Same comment about commoning the asserts here.
		kpn (Author, Done): The fptrunc and fpext changes were split out into another ticket and updated there. That committed code is more concise, in what I suspect is the way you intended.
// The scopes for variables and !dbg attachments must agree.		// The scopes for variables and !dbg attachments must agree.
DILocalVariable *Var = DII.getVariable();		DILocalVariable *Var = DII.getVariable();
DILocation *Loc = DII.getDebugLoc();		DILocation *Loc = DII.getDebugLoc();
AssertDI(Loc, "llvm.dbg." + Kind + " intrinsic requires a !dbg attachment",		AssertDI(Loc, "llvm.dbg." + Kind + " intrinsic requires a !dbg attachment",
&DII, BB, F);		&DII, BB, F);

DISubprogram *VarSP = getSubprogram(Var->getRawScope());		DISubprogram *VarSP = getSubprogram(Var->getRawScope());
DISubprogram *LocSP = getSubprogram(Loc->getRawScope());		DISubprogram *LocSP = getSubprogram(Loc->getRawScope());
Show All 27 Lines	void Verifier::verifyFragmentExpression(const DbgInfoIntrinsic &I) {
// union, the overhang piece will be outside of the allotted space for the		// union, the overhang piece will be outside of the allotted space for the
// variable and this check fails.		// variable and this check fails.
// FIXME: Remove this check as soon as clang stops doing this; it hides bugs.		// FIXME: Remove this check as soon as clang stops doing this; it hides bugs.
if (V->isArtificial())		if (V->isArtificial())
return;		return;

verifyFragmentExpression(V, Fragment, &I);		verifyFragmentExpression(V, Fragment, &I);
}		}

		kbarton (Not Done): It looks like getArgOperand expects an unsigned. Is there a specific reason you are using an int here? If it's possible that RoundingIdx is negative, then you should probably add an assert here before passing it to getArgOperand.
template <typename ValueOrMetadata>		template <typename ValueOrMetadata>
void Verifier::verifyFragmentExpression(const DIVariable &V,		void Verifier::verifyFragmentExpression(const DIVariable &V,
DIExpression::FragmentInfo Fragment,		DIExpression::FragmentInfo Fragment,
ValueOrMetadata *Desc) {		ValueOrMetadata *Desc) {
// If there's no size, the type is broken, but that should be checked		// If there's no size, the type is broken, but that should be checked
// elsewhere.		// elsewhere.
auto VarSize = V.getSizeInBits();		auto VarSize = V.getSizeInBits();
if (!VarSize)		if (!VarSize)
▲ Show 20 Lines • Show All 528 Lines • Show Last 20 Lines

test/CodeGen/AArch64/O0-pipeline.ll

	Show All 10 Lines
	; CHECK-NEXT: Scoped NoAlias Alias Analysis			; CHECK-NEXT: Scoped NoAlias Alias Analysis
	; CHECK-NEXT: Assumption Cache Tracker			; CHECK-NEXT: Assumption Cache Tracker
	; CHECK-NEXT: Create Garbage Collector Module Metadata			; CHECK-NEXT: Create Garbage Collector Module Metadata
	; CHECK-NEXT: Machine Branch Probability Analysis			; CHECK-NEXT: Machine Branch Probability Analysis
	; CHECK-NEXT: ModulePass Manager			; CHECK-NEXT: ModulePass Manager
	; CHECK-NEXT: Pre-ISel Intrinsic Lowering			; CHECK-NEXT: Pre-ISel Intrinsic Lowering
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Expand Atomic instructions			; CHECK-NEXT: Expand Atomic instructions
				; CHECK-NEXT: Force constrained floating point
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Lower Garbage Collection Instructions			; CHECK-NEXT: Lower Garbage Collection Instructions
	; CHECK-NEXT: Shadow Stack GC Lowering			; CHECK-NEXT: Shadow Stack GC Lowering
	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	▲ Show 20 Lines • Show All 41 Lines • Show Last 20 Lines

test/CodeGen/AArch64/O3-pipeline.ll

	Show All 11 Lines
	; CHECK-NEXT: Scoped NoAlias Alias Analysis			; CHECK-NEXT: Scoped NoAlias Alias Analysis
	; CHECK-NEXT: Create Garbage Collector Module Metadata			; CHECK-NEXT: Create Garbage Collector Module Metadata
	; CHECK-NEXT: Profile summary info			; CHECK-NEXT: Profile summary info
	; CHECK-NEXT: Machine Branch Probability Analysis			; CHECK-NEXT: Machine Branch Probability Analysis
	; CHECK-NEXT: ModulePass Manager			; CHECK-NEXT: ModulePass Manager
	; CHECK-NEXT: Pre-ISel Intrinsic Lowering			; CHECK-NEXT: Pre-ISel Intrinsic Lowering
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Expand Atomic instructions			; CHECK-NEXT: Expand Atomic instructions
				; CHECK-NEXT: Force constrained floating point
	; CHECK-NEXT: Simplify the CFG			; CHECK-NEXT: Simplify the CFG
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Lazy Branch Probability Analysis			; CHECK-NEXT: Lazy Branch Probability Analysis
	; CHECK-NEXT: Lazy Block Frequency Analysis			; CHECK-NEXT: Lazy Block Frequency Analysis
	; CHECK-NEXT: Optimization Remark Emitter			; CHECK-NEXT: Optimization Remark Emitter
	; CHECK-NEXT: Scalar Evolution Analysis			; CHECK-NEXT: Scalar Evolution Analysis
	; CHECK-NEXT: Loop Data Prefetch			; CHECK-NEXT: Loop Data Prefetch
	▲ Show 20 Lines • Show All 140 Lines • Show Last 20 Lines

test/CodeGen/X86/O0-pipeline.ll

	Show All 10 Lines
	; CHECK-NEXT: Scoped NoAlias Alias Analysis			; CHECK-NEXT: Scoped NoAlias Alias Analysis
	; CHECK-NEXT: Assumption Cache Tracker			; CHECK-NEXT: Assumption Cache Tracker
	; CHECK-NEXT: Create Garbage Collector Module Metadata			; CHECK-NEXT: Create Garbage Collector Module Metadata
	; CHECK-NEXT: Machine Branch Probability Analysis			; CHECK-NEXT: Machine Branch Probability Analysis
	; CHECK-NEXT: ModulePass Manager			; CHECK-NEXT: ModulePass Manager
	; CHECK-NEXT: Pre-ISel Intrinsic Lowering			; CHECK-NEXT: Pre-ISel Intrinsic Lowering
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Expand Atomic instructions			; CHECK-NEXT: Expand Atomic instructions
				; CHECK-NEXT: Force constrained floating point
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Lower Garbage Collection Instructions			; CHECK-NEXT: Lower Garbage Collection Instructions
	; CHECK-NEXT: Shadow Stack GC Lowering			; CHECK-NEXT: Shadow Stack GC Lowering
	; CHECK-NEXT: Remove unreachable blocks from the CFG			; CHECK-NEXT: Remove unreachable blocks from the CFG
	; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)			; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
	; CHECK-NEXT: Scalarize Masked Memory Intrinsics			; CHECK-NEXT: Scalarize Masked Memory Intrinsics
	▲ Show 20 Lines • Show All 46 Lines • Show Last 20 Lines

test/CodeGen/X86/O3-pipeline.ll

	Show All 11 Lines
	; CHECK-NEXT: Assumption Cache Tracker			; CHECK-NEXT: Assumption Cache Tracker
	; CHECK-NEXT: Create Garbage Collector Module Metadata			; CHECK-NEXT: Create Garbage Collector Module Metadata
	; CHECK-NEXT: Profile summary info			; CHECK-NEXT: Profile summary info
	; CHECK-NEXT: Machine Branch Probability Analysis			; CHECK-NEXT: Machine Branch Probability Analysis
	; CHECK-NEXT: ModulePass Manager			; CHECK-NEXT: ModulePass Manager
	; CHECK-NEXT: Pre-ISel Intrinsic Lowering			; CHECK-NEXT: Pre-ISel Intrinsic Lowering
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Expand Atomic instructions			; CHECK-NEXT: Expand Atomic instructions
				; CHECK-NEXT: Force constrained floating point
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)			; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Natural Loop Information			; CHECK-NEXT: Natural Loop Information
	; CHECK-NEXT: Canonicalize natural loops			; CHECK-NEXT: Canonicalize natural loops
	; CHECK-NEXT: Scalar Evolution Analysis			; CHECK-NEXT: Scalar Evolution Analysis
	; CHECK-NEXT: Loop Pass Manager			; CHECK-NEXT: Loop Pass Manager
	; CHECK-NEXT: Induction Variable Users			; CHECK-NEXT: Induction Variable Users
	▲ Show 20 Lines • Show All 144 Lines • Show Last 20 Lines

test/CodeGen/X86/fp-intrinsics.ll

Show First 20 Lines • Show All 268 Lines • ▼ Show 20 Lines	%result = call double @llvm.experimental.constrained.fma.f64(
double 42.1,		double 42.1,
double 42.1,		double 42.1,
double 42.1,		double 42.1,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret double %result		ret double %result
}		}

		; Verify that fptoui(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f19
		; COMMON: movsd
		andrew.w.kaylor (Done): Can you check the entire expanded IR pattern here? It might be worth having a separate test that verifies the StrictFP pass in isolation.
		cameron.mcinally (Not Done): Same here as the vector version below. Do we want the truncating convert? That was surprising to me.
		kpn (Author, Not Done): The strict intrinsic results in the same instruction as the regular fptosi instruction. Having them be the same means the mutation from strict to non-strict is working correctly. And, yes, rounding towards zero is correct. That's why there's no rounding metadata.
		; COMMON: subsd
		define zeroext i32 @f19() {
		entry:
		%result = call zeroext i32 @llvm.experimental.constrained.fptoui.f64(
		double 42.1,
		metadata !"fpexcept.strict")
		ret i32 %result
		}

		; Verify that fptosi(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f20
		; COMMON: cvttsd2si
		define i32 @f20() {
		entry:
		%result = call i32 @llvm.experimental.constrained.fptosi.f64(double 42.1,
		metadata !"fpexcept.strict")
		ret i32 %result
		}

		; Verify that round(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f21
		; COMMON: cvtsd2ss
		define float @f21() {
		entry:
		%result = call float @llvm.experimental.constrained.fptrunc.f32(double 42.1,
		metadata !"fpexcept.strict")
		ret float %result
		}

		; Verify that fpext(42.1) isn't simplified when the rounding mode is
		; unknown.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f22
		; COMMON: cvtss2sd
		define double @f22(float %x) {
		entry:
		%result = call double @llvm.experimental.constrained.fpext.f32(float %x,
		metadata !"fpexcept.strict")
		ret double %result
		}

@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"		@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)		declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)		declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)		declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)		declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)		declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)		declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)
declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)		declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)
declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)		declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
		declare zeroext i32 @llvm.experimental.constrained.fptoui.f64(double, metadata)
		andrew.w.kaylor: I believe your decorated intrinsic names are incorrect in all of these cases. The name needs to specify both return type and argument type. Take a look at what opt produces if you give it these names as inputs.
		cameron.mcinally: Do we want unsigned convert tests too? fptoui? I see that there are SystemZ tests to cover them, so maybe that's sufficient? Just pointing this out so others can see.
		kpn (author): The SystemZ tests target hardware new enough to lower to a single instruction. The tests for fptoui on x86 use the default lowering, but the default lowering is disallowed since it does speculative execution and traps. I had support for fixing that in this patch but was asked to split it out into another patch. So this patch has no constrained fptoui test that uses the default lowering.
		declare i32 @llvm.experimental.constrained.fptosi.f64(double, metadata)
		declare float @llvm.experimental.constrained.fptrunc.f32(double, metadata)
		declare double @llvm.experimental.constrained.fpext.f32(float, metadata)
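
A minimal sketch of the naming point raised in the inline comment on these declarations, assuming the usual mangling rule for overloaded intrinsics (one type suffix per overloaded type, result type first). The argument lists are copied from this revision as-is; only the name suffixes differ, and these spellings are illustrative rather than taken from the patch:

```llvm
; Conversion intrinsics are overloaded on both the result type and the
; operand type, so the mangled name carries two suffixes: <result>.<operand>.
declare zeroext i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)
declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata)
declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
```

Feeding the single-suffix names through opt, as the comment suggests, is a quick way to check which spelling the tool actually emits for a given revision.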

test/Feature/fp-intrinsics.ll

	define double @f17() {			define double @f17() {
	entry:			entry:
	%result = call double @llvm.experimental.constrained.fma.f64(double 42.1, double 42.1, double 42.1,			%result = call double @llvm.experimental.constrained.fma.f64(double 42.1, double 42.1, double 42.1,
	metadata !"round.dynamic",			metadata !"round.dynamic",
	metadata !"fpexcept.strict")			metadata !"fpexcept.strict")
	ret double %result			ret double %result
	}			}

				; Verify that fptoui(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f18
				; CHECK: call zeroext i32 @llvm.experimental.constrained.fptoui
				define zeroext i32 @f18() {
				entry:
				%result = call zeroext i32 @llvm.experimental.constrained.fptoui.f64(
				double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptosi(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f19
				; CHECK: call i32 @llvm.experimental.constrained.fptosi
				define i32 @f19() {
				entry:
				%result = call i32 @llvm.experimental.constrained.fptosi.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptrunc(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f20
				; CHECK: call float @llvm.experimental.constrained.fptrunc
				define float @f20() {
				entry:
				%result = call float @llvm.experimental.constrained.fptrunc.f32(double 42.1,
				metadata !"fpexcept.strict")
				ret float %result
				}

				; Verify that fpext(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f21
				; CHECK: call double @llvm.experimental.constrained.fpext
				define double @f21() {
				entry:
				%result = call double @llvm.experimental.constrained.fpext.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret double %result
				}

	@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"			@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
	declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)			declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)
	declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
				declare zeroext i32 @llvm.experimental.constrained.fptoui.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.fptosi.f64(double, metadata)
				cameron.mcinally: I haven't been following along closely, so please forgive if this was already discussed... Should we have float->i32 casts too? Also double->i64?
				kpn (author): This is an opt test. I'm not sure we'd benefit from placing those extra tests here.
				declare float @llvm.experimental.constrained.fptrunc.f32(double, metadata)
				declare double @llvm.experimental.constrained.fpext.f64(double, metadata)
				cameron.mcinally: Should 'fpext.f64' have a float argument instead of a double?
				kpn (author): Probably, yes. Will fix.

Diff 144714
