This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Force sign operand of f64 fcopysign to f32
Closed, Public

Authored by arsenm on Jan 26 2023, 8:27 AM.

Details

Reviewers
rampitec
foad
sebastian-ne
Pierre-vh
Group Reviewers
Restricted Project
Summary

The fcopysign DAG operation, unlike the IR one, allows
different types for the sign and magnitude operands. We can reduce
the bit width of the sign operand, since only its sign bit matters.
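
The observation above can be illustrated outside the compiler with a minimal standalone C++ sketch. It is not the code from this patch: the helper names `highHalfAsF32` and `copySignNarrowed` are hypothetical, and the sketch assumes IEEE-754 binary64/binary32 layouts for `double` and `float`. It shows that the high 32 bits of an f64 sign operand, reinterpreted as an f32, carry all the information copysign needs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8 && sizeof(float) == 4,
              "sketch assumes 64-bit double and 32-bit float");

// Hypothetical helper: reinterpret the high 32 bits of a double as a float.
// The sign bit of the double (bit 63) becomes the sign bit of the float
// (bit 31); the remaining bits are meaningless as a float value.
static float highHalfAsF32(double Sign) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &Sign, sizeof(Bits));
  std::uint32_t Hi = static_cast<std::uint32_t>(Bits >> 32);
  float Result;
  std::memcpy(&Result, &Hi, sizeof(Result));
  return Result;
}

// Hypothetical helper: copysign for doubles that reads the sign only from
// the narrowed 32-bit operand. std::signbit is used so the result does not
// depend on whether the reinterpreted float happens to be a NaN pattern.
static double copySignNarrowed(double Mag, double Sign) {
  float Narrow = highHalfAsF32(Sign);
  return std::signbit(Narrow) ? -std::fabs(Mag) : std::fabs(Mag);
}
```

For any inputs, `copySignNarrowed(Mag, Sign)` should agree with `std::copysign(Mag, Sign)`, because the low 32 bits of `Sign` never contribute to the result.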

The default combine only introduces mixed fcopysign
operand types from fpext/fptrunc. We effectively do this
already during selection, but doing it earlier in the combiner
should expose new combine opportunities (e.g. the existing tests
now eliminate the load of the low half of the double). Unfortunately
this isn't enough to handle the case I'm interested in just yet.

Event Timeline

arsenm created this revision. Jan 26 2023, 8:27 AM
Herald added a project: Restricted Project. Jan 26 2023, 8:27 AM
arsenm requested review of this revision. Jan 26 2023, 8:27 AM
Herald added a project: Restricted Project. Jan 26 2023, 8:27 AM
Herald added a subscriber: wdng.
foad added inline comments. Jan 26 2023, 8:43 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
9445

Can't you bitcast f64 to v2f32, to avoid the second bitcast?

arsenm added inline comments. Jan 26 2023, 4:11 PM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
9445

Yes, but surprisingly this loses the load width reduction optimization.

arsenm updated this revision to Diff 492590. Jan 26 2023, 4:12 PM

Avoid second bitcast

foad accepted this revision. Jan 27 2023, 1:31 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
9445

Ugh.

This revision is now accepted and ready to land. Jan 27 2023, 1:31 AM