This is an archive of the discontinued LLVM Phabricator instance.

guard fsqrt with fmf sub flags
ClosedPublic

Authored by mcberg2017 on Jun 4 2018, 4:53 PM.

Details

Summary

This change uses fmf subflags to guard optimizations as well as unsafe. These changes originated from D46483.
It contains only context for fsqrt.

Diff Detail

Event Timeline

mcberg2017 created this revision.Jun 4 2018, 4:53 PM
spatel added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

Here, we're entering a bit of uncharted FMF territory. cc'ing some more potential reviewers for opinions - @wristow @efriedma @andrew.w.kaylor

I think we want to allow this transform when the node allows "approximate functions" (afn). I don't think we should care about 'arcp' - the transform of 1.0/sqrt(x) is handled on a different path AFAICT.

mcberg2017 added inline comments.Jun 5 2018, 10:36 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

Not quite true, I reference AMDGPU which uses 1.0/sqrt to map to AMDGPUISD::RSQ, which is rsqrt, although a partial condition. See SITargetLowering::lowerFastUnsafeFDIV. Adding Matt as this is a AMD topic.

wristow added inline comments.Jun 5 2018, 11:57 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

It does seem to me that 'afn' is what we want here, rather than 'arcp'. I think the point about SITargetLowering::lowerFastUnsafeFDIV means that the code there ought to be checking both 'afn' and 'arcp'.

arsenm added inline comments.Jun 5 2018, 12:01 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

The point of the code there is it doesn't need either. In some situations the instruction does the right thing without any need for a flag

arsenm added inline comments.Jun 5 2018, 12:08 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

For the cases where it's not legal, I suppose checking both makes sense

efriedma added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

I can't see how arcp is relevant to this patch at all; visitFSQRT doesn't deal with division.

The flags required for rsqrt were discussed here: https://reviews.llvm.org/D37686#877987 .

mcberg2017 updated this revision to Diff 150064.Jun 5 2018, 6:51 PM

My take on Eli's point from D37686 is afn or unsafe is what we need then. The code and tests are augmented for that change.

hfinkel added inline comments.Jun 6 2018, 5:49 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10897

I can't see how arcp is relevant to this patch at all; visitFSQRT doesn't deal with division.

I agree. This is just a question of how we compute sqrt(x). For an approximation of 1/sqrt(x), then I can see also needing arcp.

spatel added inline comments.Jun 6 2018, 8:19 AM
test/CodeGen/PowerPC/fmf-propagation.ll
368–387

Is there some reason for trimming the output here? I think it's important that we show the entire estimate sequence here to be consistent. Otherwise, it's misleading as it appears that we only need the raw estimate instruction.

mcberg2017 marked an inline comment as done.

So should we include arcp in the check so that buildSqrtEstimate fires for arcp opportunities(to produce rsqrt) as well or is it fine the way it is?

spatel accepted this revision.Jun 6 2018, 10:57 AM

So should we include arcp in the check so that buildSqrtEstimate fires for arcp opportunities(to produce rsqrt) as well or is it fine the way it is?

No - everyone agrees that this transform is predicated with 'afn' alone. The reciprocal sqrt transform is an independent problem/patch.

LGTM.

This revision is now accepted and ready to land.Jun 6 2018, 10:57 AM
This revision was automatically updated to reflect the committed changes.