This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Require NSZ flag for c-a*b to FNMSUB
ClosedPublic

Authored by qiucf on Mar 22 2020, 8:57 PM.

Details

Summary

On PowerPC, FNMSUB (including VSX and non-VSX version) means -(a*b-c). But the backend now generates these instructions regardless whether nsz flag exists or not.

If a*b-c==0, such transformation changes sign of zero. This patch fixes this issue.

Diff Detail

Event Timeline

qiucf created this revision.Mar 22 2020, 8:57 PM
qiucf marked an inline comment as done.Mar 22 2020, 9:00 PM
qiucf added inline comments.
llvm/test/CodeGen/PowerPC/fma-negate.ll
290

This should be a regression. And the reason is SplitVectorRes did not pass SDNode flags to legalized ones. If that's truly a bug, I will cook a patch for it.

spatel added inline comments.Mar 23 2020, 6:45 AM
llvm/test/CodeGen/PowerPC/fma-negate.ll
290

This should not be a regression, right? If we dropped the flags during legalization, that is a bug.

qiucf marked 2 inline comments as done.
qiucf added inline comments.
llvm/test/CodeGen/PowerPC/fma-negate.ll
290

I posted the fix at D76832 :)

spatel added inline comments.
llvm/test/CodeGen/PowerPC/fma-negate.ll
290

Thanks! Please rebase this patch after that the other one is pushed.
PPC experts can judge whether we want to proceed with this one or wait for possible improvements in getNegatibleCost().
We have known bugs with that logic (D76638, D76439), but I'm not sure yet what the best solution is. @qshanz may have a better idea.

jsji added a comment.Mar 26 2020, 7:06 AM

@steven.zhang Can you please *Disable* or *Delete* the deprecated account @qshanz to avoid confusion. Thanks.

qiucf updated this revision to Diff 252852.Mar 26 2020, 8:18 AM
qiucf marked an inline comment as done.

Rebase to reflect changes after D76832

amyk added a subscriber: amyk.Mar 26 2020, 3:51 PM
amyk added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1485

Make the return statement inline with the rest of the return statements.

15654

This could be a silly question but I'm curious and wanting to learn the transformation you implemented. When I read the comment "choose the cheaper one to negate," it looks like in the first condition that N1 is the cheaper one but N0 is being negated. Could you elaborate on this a little more?

16008

Maybe good to add a comment to this function to briefly describe the transformation you are trying to do.

16020

Did you mean, we don't need to care about fnmadd?

16021

s/transform/transformation

@steven.zhang Can you please *Disable* or *Delete* the deprecated account @qshanz to avoid confusion. Thanks.

I have updated the profile. Thank you for reminder.

qiucf updated this revision to Diff 253496.Mar 29 2020, 10:35 PM
qiucf marked 6 inline comments as done.

Address comments

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1485

Thanks. clang-format always changes this line...

15654

Yes. It's a little bit confusing since NegatibleCost defines Expensive=0; Neutral=1; Cheaper=2. And code in DAG combiner also follows such convention.

16020

I think at least for now we don't really need fnmadd since dag combiner already handles it ((fneg (fma ...))). But if we drop code for fnmsub here, we would get worse code sequence.

qiucf marked 3 inline comments as done.Apr 14 2020, 1:25 AM
qiucf added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
15654

This should be fixed after D77993.

qiucf updated this revision to Diff 257233.Apr 14 2020, 1:57 AM

Rebase after the getNegatibleCost refactoring patch: D77319

steven.zhang added inline comments.Apr 14 2020, 6:32 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
15617

You should check the value of NegN2 instead of the cost though their semantics might be the same.

15633

Please refer to how we did for FMA. i.e. Your implementation has problems when A=neutral, B=neutral, and C=cheaper

qiucf updated this revision to Diff 257591.Apr 14 2020, 8:34 PM
qiucf marked 3 inline comments as done.

Address comments from Steven

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
15633

Thanks, I was not aware of (A=neutral && B=neutral && C=cheaper) => cheaper.

steven.zhang added inline comments.Apr 14 2020, 11:38 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
15630

As we already know that, N0Cost <= N1Cost, so, what we need to do is as:
Cost = std::min(N0Cost, N2Cost);
I will update the logic in DAGCombine there as it has similar issue.

qiucf updated this revision to Diff 257960.Apr 15 2020, 10:47 PM
qiucf marked an inline comment as done.

Simplify a cost calculation

This will need refactoring after D77319

@spatel @RKSimon Do you know that why we didn't have the generic node for FNMSUB/FMSUB as I see the duplicate logic between X86 and PowerPC ?

@spatel @RKSimon Do you know that why we didn't have the generic node for FNMSUB/FMSUB as I see the duplicate logic between X86 and PowerPC ?

I can't say for sure, but my guess is that target-specific logic could be added without incurring the overhead of general discussion/approval:
rG3cb3b0045c62bc6cc

There's also a question of deciding what name corresponds to each math operation. The x86 instructions don't match PPC. For example:
[PPC] fnmsub = -(A * B - C)
[x86] vfnmsub132sd = -(A * B) - C

It would be nice to have common nodes/optimizations if we can find enough code to share. If we do that, we also need to see if other targets can benefit (AArch64 has fnmsub machine IR).

@spatel @RKSimon Do you know that why we didn't have the generic node for FNMSUB/FMSUB as I see the duplicate logic between X86 and PowerPC ?

I can't say for sure, but my guess is that target-specific logic could be added without incurring the overhead of general discussion/approval:
rG3cb3b0045c62bc6cc

There's also a question of deciding what name corresponds to each math operation. The x86 instructions don't match PPC. For example:
[PPC] fnmsub = -(A * B - C)
[x86] vfnmsub132sd = -(A * B) - C

It would be nice to have common nodes/optimizations if we can find enough code to share. If we do that, we also need to see if other targets can benefit (AArch64 has fnmsub machine IR).

Thank you for the information. Maybe,we can add fmsub(A*B-C), fnmsub(-A*B-C), fnmadd(-A*B+C). But yes, we need to see if other targets can benefit from this.

qiucf updated this revision to Diff 265164.May 20 2020, 1:25 AM

Fix test failure after several td/case changes

steven.zhang added inline comments.May 20 2020, 6:05 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
16242

Can we move all these logic into getNegatedExpression so that more opportunity will be found?

qiucf marked an inline comment as done.May 24 2020, 8:39 AM
qiucf added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
16242

Removing main part of these logic into getNegatedExpression makes code cleaner. But FNMSUB nodes will never try negate unless we combine it?

steven.zhang added inline comments.May 24 2020, 7:59 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
16242

ok. I think, we still need to combine it as FNMSUB as what you did now to reduce FNEG. But we miss to do the negation between FMA and FNMSUB. But it could be a follow up patch.

16252

Do we need this transformation ? I didn't see the benefit of doing this as it didn't reduces the FNEG.

qiucf updated this revision to Diff 266448.May 27 2020, 1:15 AM
qiucf marked an inline comment as done.

Address Steven's comments. Fixed a crash due to type legalizing.

LGTM now overall with some minor comments. As you are adding the pattern for VNMSUBFP, I would suggest you do it in another patch.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
15818

Remove the InvertedOpc but use FMA directly as we know that it is FNMSUB now. And move the invertFMAOpcode as lambda of the getNegatedExpression.

15833

It will be great if you can add more wording on how it changes the sign of zeroes here.

qiucf updated this revision to Diff 266467.May 27 2020, 3:01 AM

Move vnmsubfp exploitation to another patch

steven.zhang accepted this revision.May 28 2020, 3:27 AM

LGTM now as far as you update the comments in getNegatedExpression about why we are changing the sign of zero. But please hold on several days in case others have more comments,

This revision is now accepted and ready to land.May 28 2020, 3:27 AM
This revision was automatically updated to reflect the committed changes.