This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Combine down fcopysign f64 magnitude
ClosedPublic

Authored by arsenm on Jan 26 2023, 5:46 PM.

Details

Reviewers
foad
Pierre-vh
rampitec
Group Reviewers
Restricted Project
Summary

Copy through the low bits and only apply an f32
copysign to the high half.

Diff Detail

Event Timeline

arsenm created this revision.Jan 26 2023, 5:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2023, 5:46 PM
arsenm requested review of this revision.Jan 26 2023, 5:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2023, 5:46 PM
Herald added a subscriber: wdng. · View Herald Transcript
foad added a comment.Jan 27 2023, 1:35 AM

Looks OK but does it actually improve codegen? It is hard to tell from the diffs.

Looks OK but does it actually improve codegen? It is hard to tell from the diffs.

This mostly improves the cases involving constants. The actual width reduction is effectively what the codegen already, so there's a narrow range of second order improvements available from other combines

foad accepted this revision.Jan 27 2023, 5:09 AM

This mostly improves the cases involving constants.

It would be nice to see a test case that demonstrates that.

This revision is now accepted and ready to land.Jan 27 2023, 5:09 AM

This mostly improves the cases involving constants.

It would be nice to see a test case that demonstrates that.

See s_test_copysign_f64_0_mag, it let a constant fold into an operand and avoided a BFI (I'm adding this test in a precommit)