This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] filter out denorm inputs when calculating sqrt estimate (PR34994)
ClosedPublic

Authored by spatel on Jan 19 2018, 3:18 PM.

Details

Summary

As shown in the example in PR34994:
https://bugs.llvm.org/show_bug.cgi?id=34994
...we can return a very wrong answer (inf instead of 0.0) for square root when using a reciprocal square root estimate instruction.

Here, I've conditionalized the filtering out of denorms based on the function having "denormal-fp-math"="ieee" in its attributes. The other options for this attribute are 'preserve-sign' and 'positive-zero'.

So we don't generate this extra code by default with just '-ffast-math' (because then there's no denormal attribute string at all), but it works if you specify '-ffast-math -fdenormal-fp-math=ieee' from clang.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Jan 19 2018, 3:18 PM

Ping.

Note that on x86 Linux, specifying -fdenormal-fp-math=ieee doesn't inhibit linking crtfastmath.o ( rL165240 / https://bugs.llvm.org/show_bug.cgi?id=14024 ), but I think that's a separate issue. (No matter what we do here in codegen, denormals will be off on that platform.)

spatel added a comment.Feb 1 2018, 7:57 AM

The patch was approved on the mailing list by @evandro , but he noted that AArch64 fails to pick up this change. That's because it duplicated the masking code as part of custom lowering for that target. I'll add a FIXME for that problem, commit this, and look for a fix as a follow-up.

RKSimon accepted this revision.Feb 1 2018, 8:37 AM

The patch was approved on the mailing list by @evandro , but he noted that AArch64 fails to pick up this change. That's because it duplicated the masking code as part of custom lowering for that target. I'll add a FIXME for that problem, commit this, and look for a fix as a follow-up.

LGTM - thanks

This revision is now accepted and ready to land.Feb 1 2018, 8:37 AM
This revision was automatically updated to reflect the committed changes.
arsenm added a subscriber: arsenm.Oct 29 2019, 6:13 PM
arsenm added inline comments.
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17465

Why doesn't this assume IEEE behavior if the attribute wasn't specified?

Herald added a project: Restricted Project. · View Herald TranscriptOct 29 2019, 6:13 PM
spatel marked an inline comment as done.Oct 30 2019, 8:27 AM
spatel added inline comments.
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17465

I had to go back through the comments in:
https://bugs.llvm.org/show_bug.cgi?id=34994

The concern was that adding runtime code to the estimate to deal with denorms would reduce performance on platforms (like x86 Linux or PlayStation) that would flush denorms anyway when building with -ffast-math.

So we offered the explicit use of clang's "-fdenormal-fp-math=ieee" as a way to specify the seemingly uncommon case. If we say that LLVM assumes an IEEE target by default, then we could change that. But that means we need to explicitly set the function attribute for all targets in clang?

arsenm added inline comments.Nov 1 2019, 10:17 AM
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17465

I think it's important to assume ieee behavior by default and there could be a denormal. Is there a comprehensive list of platforms that by default disable denorms for clang to emit flushing (and what type of flushing they use?)

spatel marked an inline comment as done.Nov 1 2019, 1:35 PM
spatel added inline comments.
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
17465

I don't object to assuming IEEE by default, but we'll want to at least get the known x86 platforms right in clang before fixing this. I don't know of any comprehensive list that would tell us which platforms are compliant or not (or vary based on some semi-external factor like x86).

I'm pretty sure there are out-of-tree GPU targets that are always non-compliant (always flush), so we probably need to post to llvm-dev and see what the various target owners want to do.