This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Combine fabs(fneg(x)) to fabs(x)
ClosedPublic

Authored by mbrkusanin on Oct 1 2021, 8:33 AM.

Diff Detail

Event Timeline

mbrkusanin created this revision.Oct 1 2021, 8:33 AM
mbrkusanin requested review of this revision.Oct 1 2021, 8:33 AM
foad added inline comments.Oct 1 2021, 8:37 AM
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
371–376

Would it be neater to combine these into a single "fabs of fabs-or-fneg" combine?

mbrkusanin added inline comments.Oct 1 2021, 8:56 AM
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
371–376

There are actually not that similar.
fabs(fabs(x)) to fabs(x) replaces outer abs with inner one and deletes the outer one.

I do not delete anything because fneg might have multiple uses.
My main motivation for this combine is to help AMD's instruction selection when it comes to source modifiers. After abs is selected into a modifier it will stop and it wont pick neg which will live on and be redundant.
Even with multiple uses of fneg(x) those other uses can still be replaced by a neg src modifier (but because of the redundant use it won't be DCE'd).

Here is an interesting test that shows benefits even with multple uses of fneg(x):

define amdgpu_vs float @test(float %x, float %y) {
.entry:
    %negx = fneg float %x
    %absnegx = call float @llvm.fabs.f32(float %negx)
    %add1 = fadd float %absnegx, %y
    %add2 = fadd float %add1, %negx
    ret float %add2
}

declare float @llvm.fabs.f32(float)

Before patch:

	v_xor_b32_e32 v2, 0x80000000, v0
	v_add_f32_e64 v1, |v2|, v1
	v_add_f32_e64 v0, v1, -v0

After patch:

	v_add_f32_e64 v1, |v0|, v1
	v_add_f32_e64 v0, v1, -v0
foad accepted this revision.Oct 4 2021, 8:08 AM

LGTM.

llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
371–376

OK.

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
2139

Could return false early.

This revision is now accepted and ready to land.Oct 4 2021, 8:08 AM
mbrkusanin marked an inline comment as done.Oct 5 2021, 4:44 AM
This revision was automatically updated to reflect the committed changes.