This is an archive of the discontinued LLVM Phabricator instance.

DAGCombiner: don't duplicate (fmul x, c) when folding fneg if fneg is free
ClosedPublic

Authored by escha on Jun 5 2015, 7:21 AM.

Download Raw Diff

Details

Reviewers

t.p.northover
escha
arsenm
resistor

Summary

The fold (fneg (fmul x, c)) -> (fmul x, -c) duplicates the fmul if the fmul has multiple uses. If fneg is free, this is even worse, since we didn't actually gain anything by doing so. Check hasOneUse or fneg not being free before doing this fold.

Diff Detail

Repository: rL LLVM

Event Timeline

escha updated this revision to Diff 27200.Jun 5 2015, 7:21 AM

escha retitled this revision from to DAGCombiner: don't duplicate (fmul x, c) when folding fneg if fneg is free.

escha updated this object.

escha edited the test plan for this revision. (Show Details)

escha added reviewers: resistor, t.p.northover.

escha set the repository for this revision to rL LLVM.

escha added a subscriber: Unknown Object (MLST).

Please add a test case. Otherwise, seems reasonable.

I think the only in-tree target that uses isFNegFree is R600; should I try to come up with an R600 test case? I've never worked on that backend.

In D10277#184262, @escha wrote:

I think the only in-tree target that uses isFNegFree is R600; should I try to come up with an R600 test case? I've never worked on that backend.

Yes, it should be effected. It should be easy to copy one of the tests for this (probably fmul.ll is simplest)

I've updated one of the R600 tests to check that it doesn't duplicate the fmul-by-constant when it has one negated use and one regular use.

LGTM

This revision is now accepted and ready to land.Jun 5 2015, 10:46 AM

escha accepted this revision.Jun 5 2015, 10:57 AM

escha closed this revision.

escha added a reviewer: escha.

I would say the hasOneUse() check makes sense even in the TLI.isFNegFree() == false case (it's not TLI.NegIsWayMoreExpensiveThanMul() after all).

I'd totally agree with you, but I felt this was less invasive and risky for the moment.

Mainly, I noticed that some other folds (e.g. FMA combining) might not have been designed with an unfolded fneg in mind, so this change made independently (without looking at other folds and verifying things are okay) might worsen code on other platforms. Since I hadn't done that sort of performance verification for anything other than a GPU with neg-mods on inputs, I'm not that confident with that aggressive a change.

... plus, I think the isNegatableForFree code path already does something like this? So I kinda assumed that this fold existed for a reason and didn't want to muck with it too much.

But like, if you want to investigate making it entirely conditional on hasOneUse, that sounds totally okay!

Revision Contents

Path

Size

lib/

CodeGen/

SelectionDAG/

DAGCombiner.cpp

3 lines

test/

CodeGen/

R600/

fmul.ll

16 lines

Diff 27208

lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Context not available.
	}	}

	// (fneg (fmul c, x)) -> (fmul -c, x)	// (fneg (fmul c, x)) -> (fmul -c, x)
	if (N0.getOpcode() == ISD::FMUL) {	if (N0.getOpcode() == ISD::FMUL &&
		(N0.getNode()->hasOneUse() \|\| !TLI.isFNegFree(VT))) {
	ConstantFPSDNode *CFP1 = dyn_cast<ConstantFPSDNode>(N0.getOperand(1));	ConstantFPSDNode *CFP1 = dyn_cast<ConstantFPSDNode>(N0.getOperand(1));
	if (CFP1) {	if (CFP1) {
	APFloat CVal = CFP1->getValueAPF();	APFloat CVal = CFP1->getValueAPF();
Context not available.

test/CodeGen/R600/fmul.ll

Context not available.
	ret void	ret void
	}	}

		; There should be three multiplies here; %a should be used twice (once
		; negated), not duplicated into mul x, 5.0 and mul x, -5.0.
		; FUNC-LABEL: {{^}}test_mul_twouse:
		; SI: v_mul_f32
		; SI: v_mul_f32
		; SI: v_mul_f32
		; SI-NOT: v_mul_f32
		define void @test_mul_twouse(float addrspace(1)* %out, float %x, float %y) #0 {
		%a = fmul float %x, 5.0
		%b = fsub float -0.0, %a
		%c = fmul float %b, %y
		%d = fmul float %c, %a
		store float %d, float addrspace(1)* %out
		ret void
		}

	attributes #0 = { "less-precise-fpmad"="true" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "unsafe-fp-math"="true" }	attributes #0 = { "less-precise-fpmad"="true" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "unsafe-fp-math"="true" }
Context not available.