This is an archive of the discontinued LLVM Phabricator instance.

[X86][FP16] Do not split FP64->FP16 to FP64->FP32->FP16
ClosedPublic

Authored by pengfei on Jul 20 2022, 1:13 AM.

Details

Summary

Truncating double to half is not always identical to truncating to float first and then to half, because the intermediate rounding to float can change the final result (double rounding). https://godbolt.org/z/56s9517hd
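A minimal sketch of the double-rounding problem, written in Python rather than taken from the patch or the godbolt link. It assumes that `struct` formats `e` and `f` round to nearest-even, and the helper names `to_half` and `to_single` as well as the input value are chosen only for illustration. The input lies just above the midpoint between two adjacent half values; rounding to float first lands exactly on that midpoint, and the tie then resolves the other way.

```python
import struct


def to_half(x: float) -> float:
    """Round a binary64 value to binary16, then widen it back (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round a binary64 value to binary32, then widen it back (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Just above the midpoint between the half values 1.0 and 1.0 + 2**-10.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_half(x)               # one rounding step: double -> half
two_step = to_half(to_single(x))  # two rounding steps: double -> float -> half

print(direct, two_step)
```

Under these assumptions the direct conversion rounds up to 1.0 + 2**-10, while the two-step conversion drops the 2**-30 term when rounding to float and then rounds the resulting tie down to 1.0.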

On the other hand, extending half to float and then to double is always identical to extending half to double directly. https://godbolt.org/z/Ye8vbYPnY
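The widening claim can be checked exhaustively, since there are only 65536 half bit patterns. The sketch below is an illustration and not part of the patch: it decodes each pattern to a double, pushes that value through float, and counts bit-level differences. The function name `widen_mismatches` is hypothetical, and NaN inputs are skipped because payload handling is outside the scope of this check.

```python
import math
import struct


def widen_mismatches() -> int:
    """Count half values that change when routed through binary32 (hypothetical helper)."""
    mismatches = 0
    for bits in range(1 << 16):
        # half -> double directly
        half_as_double = struct.unpack("<e", bits.to_bytes(2, "little"))[0]
        if math.isnan(half_as_double):
            continue  # NaN payloads are not compared in this sketch
        # the same value routed through float before returning to double
        via_single = struct.unpack("<f", struct.pack("<f", half_as_double))[0]
        # compare the encoded doubles so that the sign of zero is checked as well
        if struct.pack("<d", via_single) != struct.pack("<d", half_as_double):
            mismatches += 1
    return mismatches


print(widen_mismatches())
```

A count of zero is what the summary predicts: every half value is exactly representable as a float, so no rounding happens at either widening step.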

Diff Detail

Event Timeline

pengfei created this revision. Jul 20 2022, 1:13 AM
Herald added a project: Restricted Project. Jul 20 2022, 1:13 AM
pengfei requested review of this revision. Jul 20 2022, 1:13 AM
LuoYuanke added inline comments. Jul 21 2022, 2:22 AM
llvm/test/CodeGen/X86/half-constrained.ll
208

Just curious: why are there 4 underscores? Is that the right function name?

llvm/test/CodeGen/X86/vector-half-conversions.ll
799–800

This still seems to convert from double to float and then from float to half. The same applies to AVX1.

pengfei added inline comments. Jul 21 2022, 5:11 AM
llvm/test/CodeGen/X86/half-constrained.ll
208

This existed before the FP16 patches. Note that the tests target the Darwin platform; I'm not sure whether it has special mangling.

llvm/test/CodeGen/X86/vector-half-conversions.ll
799–800

Good catch! Will investigate.

pengfei updated this revision to Diff 446459. Jul 21 2022, 5:52 AM

Fix the missing split part.

pengfei added inline comments. Jul 21 2022, 5:56 AM
llvm/test/CodeGen/X86/vector-half-conversions.ll
799–800

In the non-AVX512 case, v8f64->v8f16 is first split into v4f64->v4f16. The v4f16 result is then widened to v8f16 by another path, which I missed changing :(

RKSimon added inline comments. Jul 21 2022, 7:03 AM
llvm/test/CodeGen/X86/cvt16.ll
24

Apply the nounwind change separately to minimise the diff.

llvm/test/CodeGen/X86/fastmath-float-half-conversion.ll
93

apply separately?

pengfei updated this revision to Diff 446514. Jul 21 2022, 8:22 AM
pengfei marked 2 inline comments as done.

Rebase after rGf621e568f333.

RKSimon accepted this revision. Jul 21 2022, 8:31 AM

LGTM

This revision is now accepted and ready to land. Jul 21 2022, 8:31 AM
skan accepted this revision. Jul 21 2022, 5:30 PM
This revision was landed with ongoing or failed builds. Jul 21 2022, 5:36 PM
This revision was automatically updated to reflect the committed changes.