This is an archive of the discontinued LLVM Phabricator instance.

DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nsz
ClosedPublic

Authored by arsenm on Dec 30 2019, 2:25 PM.

Details

Summary

This was increasing the number of instructions when fsub was legalized
on AMDGPU with no signed zeros enabled. This fold should be guarded by
hasOneUse, and I don't think getNode should be doing that. The same
fold is already done as a regular combine through isNegatibleForFree.

This does require duplicating, even though isNegatibleForFree does
this combine already (and properly checks hasOneUse) to avoid one PPC
regression. In the regression, the outer fneg has nsz but the fsub
operand does not. isNegatibleForFree only sees the operand, and
doesn't see it's used from a nsz context. A nsz parameter needs to be
added and threaded through isNegatibleForFree to avoid this.

Diff Detail

Event Timeline

arsenm created this revision.Dec 30 2019, 2:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 30 2019, 2:25 PM

The transform doesn't necessarily add an instruction (depends on the target), but I agree that we don't want to replace an fneg generically with an fsub - even if that reduces the dependency chain, the fneg might be cheap/free relative to an independent FP op. I've added an AArch64 test here that should wiggle with this patch:
rGe6bdecf1cd6b

Please rebase (and would be better to have the baseline AMDGPU test as a preliminary commit too).

arsenm updated this revision to Diff 235726.Dec 31 2019, 11:57 AM

Rebase. New AArch64 test looks like an improvement

spatel accepted this revision.Dec 31 2019, 1:48 PM

LGTM

This revision is now accepted and ready to land.Dec 31 2019, 1:48 PM