Page MenuHomePhabricator

[FPEnv][ARM] Add lowering of STRICT_FSETCC and STRICT_FSETCCS
ClosedPublic

Authored by john.brawn on Jan 22 2020, 7:07 AM.

Details

Summary

These can be lowered to code sequences using CMPFP and CMPFPE which then get selected to VCMP and VCMPE. The implementation isn't fully correct, as the chain operand isn't handled correctly, but resolving that looks like it would involve changes around FPSCR-handling instructions and how the FPSCR is modelled.

The fp-intrinsics test was already testing some of this but as the entire test was being XFAILed it wasn't noticed. Un-XFAIL the test and instead leave the cases where we aren't generating the right instruction sequences as FIXME.

Diff Detail

Event Timeline

john.brawn created this revision.Jan 22 2020, 7:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 22 2020, 7:07 AM
craig.topper added inline comments.Jan 22 2020, 8:38 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3672 ↗(On Diff #239584)

Why is it ok to turn a STRICT node into a non-strict node?

john.brawn marked an inline comment as done.Jan 22 2020, 9:35 AM
john.brawn added inline comments.
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3672 ↗(On Diff #239584)

Well, it's what we're doing already in the case that a STRICT_FSETCCS has a condition code that LegalizeSetCCCondCode can make legal by swapping the operands - or at least I think so from reading the code, the situation where I'm seeing FSETCCS is from the expansion of STRICT_FP_TO_UINT where that doesn't happen so I don't know of a way to check if that is the case.

If converting to non-strict isn't right, then what should be done instead?

john.brawn marked an inline comment as not done.Jan 22 2020, 9:36 AM
craig.topper added inline comments.Jan 22 2020, 9:40 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3672 ↗(On Diff #239584)

That does look like a bug, but probably not tested.

The ARM target will need to replicate whatever it does for floating point ISD::SETCC to also handle ISD::STRICT_FSETCC/FSETCCS. That may require fixing the SELECT_CC FIXME below if ARM uses that. You may need new STRICT_ target specific nodes that have chain input and output. And new isel patterns to match them.

john.brawn marked an inline comment as done.Jan 23 2020, 5:46 AM
john.brawn added inline comments.
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3672 ↗(On Diff #239584)

The ARM target has SETCC set to expand which means it gets converted to SELECT_CC. I would expect that a similar expansion for STRICT_FSETCC would require adding a STRICT_SELECT_CC node if we don't want to lose strictness.

I think probably it's easier for the ARM target to have lowering for STRICT_FSETCC and STRICT_FSETCCS, given that we have the VCMP and VCMPE instructions that correspond to the two, so I'll do that.

john.brawn retitled this revision from [FPEnv][ARM] Handle expansion of strict SETCC with legal condition code to [FPEnv][ARM] Add lowering of STRICT_FSETCC and STRICT_FSETCCS.
john.brawn edited the summary of this revision. (Show Details)
john.brawn added a reviewer: dmgreen.

Change this so we instead add handling to the ARM target.

Is it possible for arm_cmpfpe have both a chain and a glue? I presume it's the cmp that would need the chain, not anything else that gets expanded?

llvm/test/CodeGen/ARM/fp-intrinsics.ll
73

This is the thing that only works when dp is enabled? What was the deal there?

Am I right in saying that they are not incorrect, just inefficient?

samparker added a comment.EditedJan 30 2020, 5:15 AM

Is it possible for arm_cmpfpe have both a chain and a glue?

+1. The helper looks like it's barely used so hopefully this shouldn't be an invasion change?

carwil added a subscriber: carwil.Jan 30 2020, 7:03 AM

Is it possible for arm_cmpfpe have both a chain and a glue? I presume it's the cmp that would need the chain, not anything else that gets expanded?

Doing this gets an assertion failure during instruction selection:

Assertion `NumMIOperands >= II.getNumOperands() && NumMIOperands <= II.getNumOperands() + II.getNumImplicitDefs() + NumImpUses && "#operands for dag node doesn't match .td file!"'

VCMPES has one implicit def (FPSCR_NZCV), so having either chain or glue the numbers balance out, but if you have both you have one more operand than what's expected and you get the assertion.

Even when adding SDNPHasChain to the node description?

Even when adding SDNPHasChain to the node description?

That was with SDNPHasChain. After some more investigation though it turns out the problem there was when I was generating the CMPFPE I was putting the chain at the end when it should be at the start, which means when ISel was happening it wasn't being moved to the end, which meant InstrEmitter wasn't noticing it and handling it correctly.

It turns out though that having chain and glue appears to have a different problem. After ISel we currently have:

  t22: i32,glue = VCMPS t3, t6, TargetConstant:i32<14>, Register:i32 $noreg
t23: i32,glue = FMSTAT TargetConstant:i32<14>, Register:i32 $noreg, t22:1

but adding a chain we have:

  t20: i32,ch,glue = VCMPES t3, t6, TargetConstant:i32<14>, Register:i32 $noreg, t0
t21: i32,glue = FMSTAT TargetConstant:i32<14>, Register:i32 $noreg, t20

FMSTAT is using t20 which is the i32 output of the VCMPES, but it should be using t20:2, the glue output of the VCMPES, and this causes problems later. I'm not sure what even is responsible for making this happen - it looks like it's either in the ARMGenDAGISel.inc table, or the code which interprets the table.

dmgreen accepted this revision.Feb 2 2020, 7:59 AM

OK. It looks like that is at least not going to be simple to sort out immediately.

In the meantime, not crashing is certainly better than crashing, and I think the rest of this code looks good. It's just the strict ordering that we are missing. Please see if you can follow up with either adding a chain or not using glue.

LGTM otherwise.

This revision is now accepted and ready to land.Feb 2 2020, 7:59 AM
This revision was automatically updated to reflect the committed changes.