This is an archive of the discontinued LLVM Phabricator instance.

Allow fneg + strict_fadd -> strict_fsub in DAGCombiner
ClosedPublic

Authored by ajwock on Aug 7 2020, 12:28 PM.

Details

Summary

This is the first of a set of DAGCombiner changes enabling strictfp optimizations. I want to test to waters with this to make sure changes like these are acceptable for the strictfp case- this particular change should preserve exception ordering and result precision perfectly, and many other possible changes appear to be able to as well.

Copied from regular fadd combines but modified to preserve ordering via the chain, this change allows strict_fadd x, (fneg y) to become strict_fsub x, y and strict_fadd (fneg x), y to become strict_fsub y, x.

Diff Detail

Event Timeline

ajwock created this revision.Aug 7 2020, 12:28 PM
ajwock requested review of this revision.Aug 7 2020, 12:28 PM
arsenm added inline comments.Aug 11 2020, 7:40 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12824

There is some strange legality predicate you are supposed to use for the strict cases, although I don't fully understand it

kpn added inline comments.Aug 14 2020, 12:23 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12824

Where is this predicate? Is there an existing case that can be modeled after?

My thinking is that this transform is probably safe because the FNeg will never trap and it will never round so it should therefore be safe to fold it into the add/sub. But I'm no expert on floating-point I admit.

arsenm added inline comments.Aug 14 2020, 12:25 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12824

This is totally valid. But for some reason, we have getStrictFPOperationAction which differs from checking the regular legalization action on the STRICT_ opcode

craig.topper added inline comments.Aug 14 2020, 12:30 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12824

I think getStrictFPOperationAction works along with the code in SelectionDAGISel that mutates strict nodes to non-strict nodes for targets that haven't done the work. It was intended to query whether the target will be able to handle the mutated node. It should go away when all targets have done the work.

Since I think we're now blocking clang from using strict fp on targets that haven't done the work. Then we probably can just use the normal legal operation check.

ajwock added inline comments.Aug 14 2020, 1:43 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12824

As far as I can tell, that comes into play later on as craig topper has stated. I don't think there's any way for that to apply during combination. Here we have to rely on operations that we can be sure are legal. As long as this operation is legal according to the standard (I'm also no expert, but I'm pretty sure based on my interpretation) exceptions should still occur as the original code prescribes, and in the same order as such.

Why wait until codegen to do this? IIUC, it's impossible for any backend to spontaneously create strict FP ops, so this pattern must be present in the original IR.
Ie, wouldn't we be better off reducing these in IR (InstCombinerImpl::visitCallInst())?

Why wait until codegen to do this? IIUC, it's impossible for any backend to spontaneously create strict FP ops, so this pattern must be present in the original IR.
Ie, wouldn't we be better off reducing these in IR (InstCombinerImpl::visitCallInst())?

It doesn't matter if instcombine does this or not, this pattern could still easily appear after legalization

Why wait until codegen to do this? IIUC, it's impossible for any backend to spontaneously create strict FP ops, so this pattern must be present in the original IR.
Ie, wouldn't we be better off reducing these in IR (InstCombinerImpl::visitCallInst())?

It doesn't matter if instcombine does this or not, this pattern could still easily appear after legalization

Ok, I haven't followed strict FP closely, so my assumption was wrong. Is there a test or example somewhere that shows how a strict fadd is created out of something else?

Why wait until codegen to do this? IIUC, it's impossible for any backend to spontaneously create strict FP ops, so this pattern must be present in the original IR.
Ie, wouldn't we be better off reducing these in IR (InstCombinerImpl::visitCallInst())?

It doesn't matter if instcombine does this or not, this pattern could still easily appear after legalization

Ok, I haven't followed strict FP closely, so my assumption was wrong. Is there a test or example somewhere that shows how a strict fadd is created out of something else?

You may need to expand strict FP operations in terms of other strict FP operations. If I were to seriously try to implement strictfp for AMDGPU, it would look like a lot of expansions with other strict operations. I think this is too much effort for SelectionDAG, but is more manageable for GlobalISel.

Also, in general, codegen may need more relaxed rules for mixing strict and non-strict FP operations (at least in GlobalISel where there's an explicit instruction ordering). For example, AMDGPU's non-strict fdiv lowering needs to introduce mode switches (which in the DAG we use custom strict FP nodes with glue)

This combine does create the strict_fsub in terms of strict_fadd while preserving the chain-protected ordering. Would it still be appropriate to have this transform and others like it in the DAGCombiner?

arsenm accepted this revision.Aug 26 2020, 11:55 AM

LGTM

This revision is now accepted and ready to land.Aug 26 2020, 11:55 AM
This revision was landed with ongoing or failed builds.Aug 27 2020, 5:17 AM
This revision was automatically updated to reflect the committed changes.