This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Propagate FMF flags in FMA folding
ClosedPublic

Authored by qiucf on Sep 2 2020, 8:51 AM.

Details

Summary

DAG combiner folds (fma a 1.0 b) into (fadd a b) but the flag isn't propagated into new fadd. This patch fixes that.


However, besides this, SDNodeFlags are easily missed/passed wrongly. (D86871) Is there any 'rule' about whether pass flags or not? Roughly speaking, there are (or more than?) three types of transformation in SDAG:

  • The node is eliminated. Flags of it will never be passed to others.
  • A node is expanded/simplified into another node(s). In this case, flags of the node should be passed to all newly created node(s).
  • Some nodes are transformed to another group of nodes. Here new nodes should have intersection of original flags? Or decided by opcode type?

We can do some check against new nodes or pass flags automatically if the rule is clear. Any ideas?

Diff Detail

Event Timeline

qiucf created this revision.Sep 2 2020, 8:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2020, 8:51 AM
qiucf requested review of this revision.Sep 2 2020, 8:51 AM
spatel added a comment.Sep 3 2020, 4:43 AM

However, besides this, SDNodeFlags are easily missed/passed wrongly. (D86871) Is there any 'rule' about whether pass flags or not? Roughly speaking, there are (or more than?) three types of transformation in SDAG:

  1. The node is eliminated. Flags of it will never be passed to others.
  2. A node is expanded/simplified into another node(s). In this case, flags of the node should be passed to all newly created node(s).
  3. Some nodes are transformed to another group of nodes. Here new nodes should have intersection of original flags? Or decided by opcode type?

We can do some check against new nodes or pass flags automatically if the rule is clear. Any ideas?

This is out-of-scope for this particular patch review, but:
The first 2 cases seem straightforward, although there is a possibility that a node that carries FMF can be converted into some other node that does not have any meaningful use of that flag (for example fdiv arcp becomes fmul arcp - should always be harmless?).
The third case is where things are not specified clearly.
In IR, we have class FastMathFlagGuard to save/carry/restore state within IRBuilder. Maybe SDAG could use something like that too?

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13191

If this block is dead code, you can remove it independently of this patch.

13194

Add a test (maybe with vector types) for this code path?

nemanjai added inline comments.Sep 4 2020, 7:42 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13156

A couple of things are unclear to me (although not related to this patch):

  1. Why do we handle the 0.0 and 1.0 special cases before canonicalization and thereby have to check both ways?
  2. Why is the handling for -1.0 not in the same place?
13194

+1

qiucf updated this revision to Diff 290166.Sep 6 2020, 7:54 PM
qiucf marked 2 inline comments as done.
  • Update tests
  • Keep redundant code (irrelevant to this patch)
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13156

The canonicalization code was added years after original folding logic. I guess it's just missed carelessly. We can have an NFC patch later refactoring it.

spatel added inline comments.Sep 7 2020, 5:43 AM
llvm/test/CodeGen/PowerPC/fma-combine.ll
293

I can't tell if anything is changing on this test with this patch. Should this not reduce to 'xor' like the scalar code?

qiucf added inline comments.Sep 7 2020, 7:52 PM
llvm/test/CodeGen/PowerPC/fma-combine.ll
293

For the NOVSX check: v2f64 isn't legal, so result before this patch is fsub+fsub. Now it returns constants.

If VSX is enabled, combiner will not meet the condition, since ConstantFPSDNode not works for vectors. So result doesn't change.

spatel accepted this revision.Sep 8 2020, 6:01 AM

LGTM - This patch shows a minimal improvement, and I don't think it does any harm.
We should, however, follow-up with the clean-up of redundant/dead code and enhancement for vector support.

llvm/test/CodeGen/PowerPC/fma-combine.ll
293

Ah, ok. Please put a 'TODO' comment in the code and here in the test so we can remember that.

This revision is now accepted and ready to land.Sep 8 2020, 6:01 AM
This revision was automatically updated to reflect the committed changes.