Page MenuHomePhabricator

[PPC] Peephole to remove extra fcmp that checks for NaN
AbandonedPublic

Authored by amehsan on Oct 26 2016, 8:09 AM.

Details

Summary

The functionality is explained in a comment in the patch. So I do not repeat that here.

This problem is exposed after a patch that I recently committed to generate +0.0 using xor, instead of loading it from constant area. Currently this is a targetted fix for this particular pattern. I tried a little to create other testcases with a similar problem, but that was not successful. If you have ideas for generalizing the patch, I can update it.

Diff Detail

Event Timeline

amehsan updated this revision to Diff 75895.Oct 26 2016, 8:09 AM
amehsan retitled this revision from to [PPC] Peephole to remove extra fcmp that checks for NaN.
amehsan updated this object.
amehsan added a subscriber: llvm-commits.
amehsan added inline comments.Oct 26 2016, 8:12 AM
test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll
4–7

I think my comment here is not quite accurate. I will update it to something like this:

; Some tests need to check two bits of the CR field, so we check for a
; pattern that indicates both CR bits are checked. Otherwise we look for a cmp followed
; by a jump. For O2 tests, we need to make sure that only one comparison is done.

hfinkel added inline comments.Oct 26 2016, 5:26 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1584

s/otherReg/OtherReg/

1585

What is the conditional here on CopyReg actually testing. How could it be zero (i.e. an invalid register) if hasOneUse(CopyReg) was true?

1599

This makes sense, but we should add a comment explaining why. The point here is that we're trying to figure out whether (vreg2 < 0 or isnan(vreg)). When we use FCMPU, the un bit will be set if either of the operands is nan. So this is only semantically valid when we know that the other operand of the compare is not nan. We do know this when we know it is zero.

1631

No need for this white-space change.

amehsan added inline comments.Oct 26 2016, 5:46 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1584

Will change

1585

The CROp has two source operands (Operands 1 and 2). If Operand(1) is the same Reg that was defined by Copy insn we just visited (CopyReg), then we look at Operand (2) and go to its definition. Otherwise, Operand(2) will be CopyReg and we should start from Operand(1) and go to its definition, etc.

1599

That's right. I will put your explanation in the code with some minor changes if needed.

echristo added inline comments.Oct 26 2016, 6:12 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1543

Actually, go ahead and put this in the commit message as well. That way there's an easier record of it there too. More detailed commit messages are often goodness.

amehsan added inline comments.Oct 26 2016, 6:20 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1543

Sure

amehsan added inline comments.Oct 26 2016, 6:22 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1631

OK. Will undo it.

hfinkel added inline comments.Oct 26 2016, 6:22 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1585

I still don't understand. CopyReg is defined above as:

unsigned CopyReg = CopyMI->getOperand(0).getReg();
if (!MRI->hasOneUse(CopyReg))
  return false;

under what conditions is CopyReg zero (i.e. an invalid register)?

amehsan added inline comments.Oct 26 2016, 6:27 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1585

I am probably missing something. CopyReg is the register that contains the output of COPY instruction. It should not be zero. Maybe there are edge cases that I am not aware of ?

nemanjai added inline comments.Oct 27 2016, 12:35 AM
lib/Target/PowerPC/PPCInstrInfo.cpp
1567

s/sub/Sub?

1585

But the condition is not for whether CopyReg is zero but rather whether it is the same register as the first operand of the CROR/CRNOR.

1589

I haven't thought about this enough to decipher whether this is possible so I'd rather ask...
Are all of these getUniqueVRegDef calls guaranteed to succeed? Namely, is it impossible that any of the arguments are physical registers at this point?

1619

This may be a naive question since I don't really know under which conditions we emit XSCMPUDP vs. FCMPUD, but can this not be done with XSCMPUDP as well? Or perhaps there is no need for this.

amehsan added inline comments.Oct 27 2016, 12:57 AM
lib/Target/PowerPC/PPCInstrInfo.cpp
1585

I just realized that the indentation, could cause it to be misinterpreted. I will check clang-format.

1619

That's a good question. The problem is PPCInstrInfo::analyzeCompare() does not return true for FCMPUD so from Peephole opt point of view XSCMPUD is not a compare opcode. Since there might be some finer points that has to be take into account, I opened a PR for this, rather than fixing it here.

amehsan added inline comments.Oct 27 2016, 1:00 AM
lib/Target/PowerPC/PPCInstrInfo.cpp
1619

Typo in the comment: PPCInstrInfo::analyzeCompare() does not return true for XSCMPUDP

There has been many comments here, but only a couple of minor issues has to be addressed. (check indentation, remove whitespace, change some variable names). If you don't mind to approve it, I will fix those before committing.

hfinkel accepted this revision.Oct 27 2016, 11:20 AM
hfinkel edited edge metadata.

There has been many comments here, but only a couple of minor issues has to be addressed. (check indentation, remove whitespace, change some variable names). If you don't mind to approve it, I will fix those before committing.

Yes, with the comments addressed, this LGTM.

lib/Target/PowerPC/PPCInstrInfo.cpp
1585

Ah, indeed. I misread it. Thanks!

1589

No. We should check the return code. As in the function below we have:

MachineInstr *MI = MRI->getUniqueVRegDef(SrcReg);
if (!MI) return false;
This revision is now accepted and ready to land.Oct 27 2016, 11:20 AM
amehsan added inline comments.Oct 27 2016, 11:46 AM
lib/Target/PowerPC/PPCInstrInfo.cpp
1589

Right. I even tested this, but then I forgot to check the return value.

amehsan abandoned this revision.Dec 15 2016, 12:36 PM

This patch, as is, is incorrect and requires some changes to proceed. Abandoning.