This is an archive of the discontinued LLVM Phabricator instance.

[X86] Support SAE for VCVTPS2PH from intrinsic.
ClosedPublic

Authored by FreddyYe on Aug 25 2022, 3:35 AM.

Details

Summary

For now, clang and gcc both failed to generate sae version from _mm512_cvt_roundps_ph:
https://godbolt.org/z/oh7eTGY5z. Intrinsic guide description is also wrong, which will be
update soon.

Diff Detail

Event Timeline

FreddyYe created this revision.Aug 25 2022, 3:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2022, 3:35 AM
FreddyYe requested review of this revision.Aug 25 2022, 3:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2022, 3:35 AM
FreddyYe edited the summary of this revision. (Show Details)Aug 25 2022, 3:52 AM
FreddyYe added reviewers: pengfei, LuoYuanke, skan, yubing.
pengfei added inline comments.Aug 25 2022, 6:02 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
27165–27170

We need to clear the SAE bit by replace the imm with the new RC.

llvm/test/CodeGen/X86/avx512-intrinsics.ll
1016–1017

This is not correct. The {%k1}/{%k1} {z} are missing. The same below.

The root reason is its intrinsic data is different from other mask ones: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86IntrinsicsInfo.h#L788-L793

Maybe we can consider define a new INTR_TYPE_1OP_IMM8_MASK_SAE?

FreddyYe added inline comments.Aug 25 2022, 7:25 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
27165–27170

I was to save code changes to implement, since hardware will ignore except the lower three bits.

llvm/test/CodeGen/X86/avx512-intrinsics.ll
1016–1017

Sorry didn't notice this fault. I would say to extend MCVTPS2PH to MCVTPS2PH_SAE?

FreddyYe updated this revision to Diff 455787.Aug 25 2022, 9:06 PM
FreddyYe edited the summary of this revision. (Show Details)

Address comments.

pengfei accepted this revision.Sep 5 2022, 6:35 PM

LGTM.

This revision is now accepted and ready to land.Sep 5 2022, 6:35 PM
This revision was landed with ongoing or failed builds.Sep 5 2022, 8:28 PM
This revision was automatically updated to reflect the committed changes.