This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Try to unfold fneg source when matching legacy fmin/fmax
ClosedPublic

Authored by arsenm on Dec 15 2022, 11:23 AM.

Details

Reviewers
rampitec
bogner
Pierre-vh
foad
Group Reviewers
Restricted Project
Summary

This is NFC as it stands, since other combines will effectively
prevent this from being reachable. This will avoid regressions in a
future change which tries to make better use of select source
modifiers.

Didn't bother with the GlobalISel part for now, since the baseline
combine doesn't seem to work on the existing test.

This is the target specific partner to D140128

Diff Detail

Event Timeline

arsenm created this revision.Dec 15 2022, 11:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 11:23 AM
arsenm requested review of this revision.Dec 15 2022, 11:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 11:23 AM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm updated this revision to Diff 484347.Dec 20 2022, 12:03 PM

Rebase on some expanded test coverage

foad added inline comments.Dec 20 2022, 11:30 PM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1416

This function doesn't use False. Might be clearer to take a Swapped flag instead of True and False.

1502

It's very unclear what you're trying to match here. I think it's something like this, depending on whether NegTrue actually stripped a neg or not:

// LHS op RHS ? LHS : -RHS -> -min/max(LHS, RHS)
// LHS op RHS ? -LHS : -RHS -> -min/max(LHS, RHS)

The first one doesn't make any sense to me. For the second one don't you need to flip the condition code?

llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll
295–297

I think these three lines implement roughly max(s0, 0) so how can converting it to min(0, s0) be correct?

arsenm added inline comments.Dec 21 2022, 4:17 AM
llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll
295–297

It's ngt, not gt. It's more like !(v_cmp_le_f32)

foad added inline comments.Dec 21 2022, 4:31 AM
llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll
295–297

Right, the code is doing s0<=0?v0:v1 which is s0<=0?-0:s0 which is max.

foad added inline comments.Dec 21 2022, 4:39 AM
llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll
295–297

Oh sorry I think I've misremembered how cndmask works.

arsenm added inline comments.Jan 20 2023, 6:06 AM
llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll
295–297

Right, cndmask is backward from how you would expect

foad added inline comments.Jan 23 2023, 3:47 AM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1502

I still don't understand this

arsenm added inline comments.Jan 23 2023, 4:17 AM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1502

fmin/fmax legacy are not commutative so there's no swapping here. Your sample here missed the part where the constant was negated, this is just pulling a negate out of the two select operands

arsenm updated this revision to Diff 491310.Jan 23 2023, 4:42 AM

Add comment

bogner accepted this revision.Jan 31 2023, 10:54 AM

lgtm

This revision is now accepted and ready to land.Jan 31 2023, 10:54 AM
arsenm closed this revision.Feb 2 2023, 6:50 PM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
1416

I think it's harder to read if you're considering the caller context