For now, clang and gcc both fail to generate the SAE version from _mm512_cvt_roundps_ph:
https://godbolt.org/z/oh7eTGY5z. The Intrinsics Guide description is also wrong and will be
updated soon.
Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
27165–27170 | We need to clear the SAE bit by replacing the imm with the new RC. | |
llvm/test/CodeGen/X86/avx512-intrinsics.ll | ||
1016–1017 | This is not correct. The {%k1} / {%k1} {z} mask annotations are missing. The same applies below. The root cause is that its intrinsic data is different from that of the other masked ones: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86IntrinsicsInfo.h#L788-L793 Maybe we can consider defining a new INTR_TYPE_1OP_IMM8_MASK_SAE? |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
27165–27170 | I was trying to keep the code change small, since the hardware ignores everything except the lower three bits. | |
llvm/test/CodeGen/X86/avx512-intrinsics.ll | ||
1016–1017 | Sorry, I didn't notice this fault. I would suggest extending MCVTPS2PH to MCVTPS2PH_SAE? |
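
The imm handling discussed in the X86ISelLowering.cpp comments above can be illustrated with a small standalone sketch. It is not the code from this patch: `SplitImm`, `splitCvtps2phImm`, and the `k*` constants are hypothetical names made up for illustration. The constant values mirror `_MM_FROUND_NO_EXC` (0x08) from the x86 intrinsic headers, and the three-bit mask follows the remark above that the hardware only looks at the lower three bits.

```cpp
#include <cstdint>

// Hypothetical constants for illustration only.
// 0x08 matches _MM_FROUND_NO_EXC in the x86 intrinsic headers.
constexpr std::uint8_t kFroundNoExc = 0x08;
// Lower three bits: the rounding control (RC) that the instruction consumes.
constexpr std::uint8_t kRoundingMask = 0x07;

// Result of splitting the user-supplied imm8 (hypothetical helper type).
struct SplitImm {
  bool sae;        // true if the caller asked to suppress all exceptions
  std::uint8_t rc; // imm with the SAE bit (and any higher bits) cleared
};

// Separate the SAE request from the rounding control so the SAE form of the
// instruction can be selected while the emitted imm carries only the RC.
constexpr SplitImm splitCvtps2phImm(std::uint8_t imm) {
  return SplitImm{(imm & kFroundNoExc) != 0,
                  static_cast<std::uint8_t>(imm & kRoundingMask)};
}
```

With this split, an imm of `_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC` (0x0B) yields `sae == true` and `rc == 0x03`, which is the "new RC" that would replace the original imm.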