This adds constrained intrinsics for the signed and unsigned conversions of integers to floating-point.
Line the arguments up with the l ine above. Same for the two below
Please add a FIXME
You can DAG.getMergeValues which takes less arguments
Don't we need to propagate the chain result too?
Pleas add a FIXME here
When given a strict FP node LowerOperation() will require that the resulting new node has the chain in the same place in the values. So, no, I don't think so unless we want to thread the std::pair all the way back up.
Oh this works because the node is either a STRICT_FP_EXTEND or STRICT_FP_ROUND so the node in first and second is the same. But we should probably use a MERGE_VALUES here to not rely on that.
Should document Chain here.
Or maybe it would be better to use MERGE_VALUES to return both components if appropriate?
If we pass Node, then all other arguments are really redundant, aren't they?
Hmm, wouldn't we also need to update this routine? Or can we say that promotion is not appropriate for handling strict semantics anyway?
Why do we have to do the Replace... thing here instead of just appending both to Results (like is done for other nodes with chain)?
This should now also use ValVT, I think.
Hmm. There's an UnrollVectorOp_StrictFP in LegalizeVectorTypes.cpp. If we change this routine to also handle some strict operations, maybe we should go all the way and just have merge the routines completely?
I believe the NumOperands check is now redunant with common tests handled above.
I don't understand. N2 is a constant that is either 0 or 1. What will happen if it is discarded here?
This code was lifted straight out of getNode() somewhere around line 5194. Without it the X86 target dies trying to lower a rounding of f64 to f64. This happens because getStrictFPExtendOrRound() returns a round when input and output are the same size. This mirrors the non-strict getFPExtendOrRound().
Oops you’re right about N2. But there is an issue with the chain. Which I think is what I was thinking N2 was when I wrote that. If the input chain didn’t come from N1 this is broken.
I don’t see any precedent for returning a MERGE_VALUES from getNode. I think we need to fix the caller of getStrictFPExtendOrRound to only call when necessary.
-Improve some of the X86 code.
-Add Promote support. Use it for i8/i16 on X86.
-Remove changes to UnrollVectorOp which seemed to be unexercised
-Some cleanup in ExpandLegalINT_TO_FP
-Drop the changes to getNode. We can't fold NOP conversions here and asserts were recently added in another patch.
I'm still working on this ticket daily! I'm trying to merge the two vector unrolling functions like Ulrich suggested. But I ran into problems that lead me to think we may have a serious issue lurking that we'll need to fix. That's what I've been working on: trying to understand the issue and see if it needs further investigation.
If you are in a hurry then you could have sent me an email and I would have uploaded the diffs I've got without further investigation.
I'm leaving attached the comments on my work that I've been adding but haven't submitted until now.
Documentation. Check. I'll have that in my next round.
I don't remember why the prototype for expandFP_TO_UINT() got reformatted. But this code is already in the tree. Having expandUINT_TO_FP() be consistent with the existing tree seems like a good idea?
Seems that way. I'll give it a try.
I don't have a test for it so I didn't change it.
Can we guarantee the result would be rounded back down? Seems like promotion would be invalid without that guarantee.
I've had a lot of trouble with this, but at the moment I'm unable to reproduce any issue here. Let's simplify and see how it goes.
Looks like it.
Done. I've also added an assert to document that getStrictFPExtendOrRound() wants the lengths to be different.
The calls DAG.UnrollVectorOp() can't be used in place of DAGTypeLegalizer::ReplaceValueWith() because then the replaced node doesn't get properly deleted. This then results in an assertion failure later because the replaced node wasn't legal. And having SelectionDAG call into DAGTypeLegalizer doesn't seem like a winning idea, at least not without some plumbing. So today I'm going to punt on merging those functions together and just post what I've got. After breakfast sometime.
I've looked over the common code changes again, and they now look good to me. I'll leave it to Craig to review the X86 changes ...
Right. In general promotion is not appropriate for strict semantics because you don't get the right exceptions (for overflow etc.).
NewChain seems superfluous?
Promotion is definitely bad for fp->int, but is there really an issue for int->fp? We're just going to use a bigger int for the input. If the small int was going to overflow, it should still overflow when its extended.
Can a strict node get here? The call site in vectorLegalizer::Expand only checks for the non-strict node.
When are these changes needed? The STRICT_UINT_TO_FP handling in LegalizeVectorOps always scalarizes and goes through ExpandStrictFPOp.
Can we merge the 2 strict fp ifs here? The only thing we do between them is declare a new variable.
Why is this not just DestVT != Sub.getValueType()
We probably do need a strict version of FHADD, but until we have that we should just go to the shuffle + STRICT_FADD code below rather than silently dropping the chain.
This looks out of date. I recently changed this to return a std::pair of Result and Chain so it was obvious that the chain result was intended as an output. So we need a merge values here for strict fp now. This should be reflected in https://reviews.llvm.org/D71130
Huh. I forgot to submit my comments earlier.
Anyway, I've merged in many changes from D71130, I've made a few changes as requested in LegalizeDAG.cpp, and I added the missing call to ExpandUINT_TO_FLOAT(). Now I'm seeing X86::VMULPDYrr die when running the vector-constrained-fp-intrinsics.ll test on the v4i32 type. It seems that in InstrEmitter.cpp it dies at the assertion at line 838 because NumOperands is 3. That's as far as I've gotten today.
Agreed. I'm working on LegalizeDAG.cpp right now. I notice X86 already has Promote for i8 and i16.
Eh, I figured it'd be more clear this way. I'll change it.
Ah, I must have lost the change to call here, possibly with D69887 going in.
Change lifted from D71130.
Yup, I just needed to rebase.
VectorLegalizer::ExpandUINT_TO_FLOAT() uses UnrollVectorOp() if the required instructions are not available. I have a change where it gets called from the top of ExpandStrictFPOp() and instead returns an empty value if it would have called UnrollVectorOp(). I think that may eliminate the need to modify UnrollVectorOp() since it will then fall through into the ExpandUINT_TO_FLOAT() logic instead. If it works it'll generate better code, and eliminating complexity would be nice.
I addressed this by eliminating the need to call DAG.UnrollVectorOp(). That function is now out of the discussion. The tests show that this way here usually results in shorter sequences of instructions as well.
General question for you and @uweigand that I realized today. Do we need to set the FPExcept bit in the flags for new nodes when we expand/promote operations?
Isn't SDValue(nullptr, 0) equivalent to SDValue() and more consistent with other code?
Can we just declare SDValue Res in the if condition so we don't need double parentheses?
Ah yes, we need to do that.
This reminds me of a general concern: for all other flag bits, omitting the flag is conservatively safe, it just may impact performance. But for FPExcept, omitting the flag impacts correctness. Maybe we should go ahead and invert the sense of the flag (i.e. use a FPNoExcept flag instead of FPExcept) after all ...