This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Override getNegatedExpression constant handling
ClosedPublic

Authored by arsenm on Feb 13 2023, 4:23 PM.

Details

Reviewers
rampitec
foad
sebastian-ne
Pierre-vh
Group Reviewers
Restricted Project
Summary

Ignore the multiple-use heuristics of the default
implementation and report the cost based on inline immediates. This
is mostly interesting for -0 vs. 0. This gives a few small improvements.
fneg_fadd_0_f16 is a small regression; we could probably avoid it
if we handled folding fneg into div_fixup.
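
The -0 vs. 0 point can be illustrated with a small Python model. This is only a sketch, not the code from the patch: it assumes the f32 inline immediates are 0.0, ±0.5, ±1.0, ±2.0 and ±4.0 (the real set depends on the target), and the helper names `f32_bits`, `is_inline_immediate` and `negation_cost` are hypothetical. The idea is that negating a constant is "cheaper" when the negated value becomes an inline immediate and "expensive" when it stops being one.

```python
import struct

# ASSUMPTION: illustrative set of f32 inline immediates; the real set is
# target-dependent and is not taken from the patch itself.
_INLINE_F32 = (0.0, 0.5, -0.5, 1.0, -1.0, 2.0, -2.0, 4.0, -4.0)


def f32_bits(x):
    """Return the IEEE-754 single-precision bit pattern of x as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


# Compare by bit pattern, because 0.0 == -0.0 as Python floats.
_INLINE_F32_BITS = frozenset(f32_bits(v) for v in _INLINE_F32)


def is_inline_immediate(x):
    """True if x is in the assumed inline-immediate set (hypothetical helper)."""
    return f32_bits(x) in _INLINE_F32_BITS


def negation_cost(c):
    """Classify the cost of replacing constant c with -c (hypothetical helper)."""
    before = is_inline_immediate(c)
    after = is_inline_immediate(-c)
    if after and not before:
        return "cheaper"
    if before and not after:
        return "expensive"
    return "neutral"
```

Under these assumptions, `negation_cost(0.0)` is "expensive" because -0.0 is not in the assumed set, while `negation_cost(1.0)` is "neutral" because both 1.0 and -1.0 are.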

Diff Detail

Event Timeline

arsenm created this revision. Feb 13 2023, 4:23 PM
Herald added a project: Restricted Project. Feb 13 2023, 4:23 PM
arsenm requested review of this revision. Feb 13 2023, 4:23 PM
Herald added a subscriber: wdng.
rampitec accepted this revision. Feb 14 2023, 10:10 AM
This revision is now accepted and ready to land. Feb 14 2023, 10:10 AM
foad added a comment. Feb 16 2023, 3:15 AM

Heads up, this is causing an infinite loop in the DAG combiner. I'm working on reducing a test case.

foad added a comment. Feb 16 2023, 3:27 AM

llc -march=amdgcn -mcpu=gfx1030 hangs on this test case:

define float @f(float %arg) {
bb:
  %i = fmul float %arg, 0.0
  %i1 = fsub float 0.0, %i
  ret float %i1
}

Could you please fix or revert?
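
For readers following along, the arithmetic in this test case can be checked with a short Python sketch. It is only an illustration: Python floats are doubles rather than f32, and the function names are hypothetical. `original` mirrors the IR above, and `rewritten` mirrors the fadd form that appears in the debug output later in this thread. The signed-zero behaviour is the same as for f32, so the sketch suggests both forms produce the same value, which is consistent with each rewrite being legal on its own.

```python
import math


def original(x):
    # Mirrors: %i = fmul float %arg, 0.0 ; %i1 = fsub float 0.0, %i
    return 0.0 - (x * 0.0)


def rewritten(x):
    # Mirrors the rewritten form: fadd (fmul %arg, -0.0), 0.0
    return (x * -0.0) + 0.0


def same_float(a, b):
    """Equal as floats, treating NaN == NaN and distinguishing -0.0 from 0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)
```

For finite inputs both forms evaluate to +0.0, and for infinite or NaN inputs both evaluate to NaN.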

foad added a comment. Feb 16 2023, 5:55 AM

An excerpt from the infinite debug output:

Combining: t7: ch,glue = CopyToReg # D:1 t0, Register:f32 $vgpr0, t3103

Combining: t3103: f32 = fsub # D:1 ConstantFP:f32<0.000000e+00>, t3102
Creating fp constant: t3104: f32 = ConstantFP<-0.000000e+00>
Creating new node: t3105: f32 = fmul # D:1 t2, ConstantFP:f32<-0.000000e+00>
Creating new node: t3106: f32 = fadd # D:1 t3105, ConstantFP:f32<0.000000e+00>
 ... into: t3106: f32 = fadd # D:1 t3105, ConstantFP:f32<0.000000e+00>

Combining: t7: ch,glue = CopyToReg # D:1 t0, Register:f32 $vgpr0, t3106

Combining: t3106: f32 = fadd # D:1 t3105, ConstantFP:f32<0.000000e+00>
Creating new node: t3107: f32 = fmul # D:1 t2, ConstantFP:f32<0.000000e+00>
Creating new node: t3108: f32 = fsub # D:1 ConstantFP:f32<0.000000e+00>, t3107
 ... into: t3108: f32 = fsub # D:1 ConstantFP:f32<0.000000e+00>, t3107
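
The oscillation in this excerpt can be replayed with a toy Python model. It is a sketch under stated assumptions, not the DAG combiner's actual logic: nodes are plain tuples, constants are strings so that "0.0" and "-0.0" stay distinct, and `step` hard-codes only the two rewrites visible in the log. The names `negate`, `step` and `find_cycle` are hypothetical.

```python
def negate(k):
    """Flip the sign of a constant written as a string, e.g. "0.0" <-> "-0.0"."""
    return k[1:] if k.startswith("-") else "-" + k


def step(node):
    """Apply one of the two rewrites seen in the log, or return node unchanged."""
    op, a, b = node
    if op == "fsub" and isinstance(b, tuple) and b[0] == "fmul":
        # fsub C, (fmul x, K)  ->  fadd (fmul x, -K), C
        return ("fadd", ("fmul", b[1], negate(b[2])), a)
    if op == "fadd" and isinstance(a, tuple) and a[0] == "fmul":
        # fadd (fmul x, K), C  ->  fsub C, (fmul x, -K)
        return ("fsub", b, ("fmul", a[1], negate(a[2])))
    return node


def find_cycle(node, limit=100):
    """Return the repeating sequence of nodes, [] on a fixpoint, None if undecided."""
    seen = {}
    history = []
    for i in range(limit):
        if node in seen:
            return history[seen[node]:]
        seen[node] = i
        history.append(node)
        nxt = step(node)
        if nxt == node:
            return []  # fixpoint: no rewrite applies any more
        node = nxt
    return None


start = ("fsub", "0.0", ("fmul", "x", "0.0"))
cycle = find_cycle(start)
```

In this model `cycle` holds two nodes, the fsub form and the fadd form, matching the alternation between t3103/t3108 and t3106 in the log. The model detects the repeat only because it records every node it has visited; the log shows the real combiner creating fresh nodes on every pass.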

foad added a comment. Feb 16 2023, 9:12 AM

I've reverted the patch and added this test case to test/CodeGen/AMDGPU/fneg-combines.new.ll