The title says it all: a few of the patterns were missing, which made us generate code like:
movq m64, xmm0
vpmovzxbd xmm0, ymm0
instead of:
vpmovzxbd m64, ymm0
I only glanced at the test results for non-AVX2, but they seemed correct.
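For readers unfamiliar with the instruction, here is a minimal scalar sketch in C of what the folded form `vpmovzxbd m64, ymm0` computes: it reads 64 bits from memory and zero-extends each of the 8 bytes into a 32-bit lane. This is only an illustrative model, not code from the patch; the type `ymm_u32x8` and the function `model_vpmovzxbd_m64` are hypothetical names made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 256-bit register viewed as 8 x u32 lanes. */
typedef struct { uint32_t lane[8]; } ymm_u32x8;

/* Scalar reference model (illustrative only, not LLVM code) of
   `vpmovzxbd m64, ymm0`: load 8 bytes, zero-extend each to 32 bits. */
static ymm_u32x8 model_vpmovzxbd_m64(const void *m64) {
    uint8_t bytes[8];
    ymm_u32x8 out;
    /* The 64-bit load that the missing patterns failed to fold
       into the extension instruction. */
    memcpy(bytes, m64, sizeof bytes);
    for (int i = 0; i < 8; ++i)
        out.lane[i] = (uint32_t)bytes[i]; /* zero extension, byte -> dword */
    return out;
}
```

The two-instruction sequence and the folded form should produce the same lanes; the difference is only that the folded form avoids the separate `movq` into a temporary register.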
Differential D6125
[X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns.
Closed, Public
Authored by ab on Nov 4 2014, 4:21 PM.
Event Timeline

ab updated this object.

I'd add these intrinsics to X86IntrinsicsInfo.h in the following form:
Please add Adam Nemet to reviewers.

That's pretty nifty, thanks for the heads up! That makes the *Yrm patterns from D6126 necessary though.
-Ahmed

ab retitled this revision from "[X86] Add patterns for AVX2 VPMOV[SZ]XYrm intrinsics." to "[X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns.".
ab edited edge metadata.

I followed Elena's advice to use X86IntrinsicsInfo, and got a bit carried away: I removed the existing PMOVX intrinsic patterns, and unified them into only X86v[sz]ext patterns. So let's just say this patch is about refactoring the PMOVX handling, which happens to add a few of the missing patterns.

Hi, in my opinion you have a lot of redundant patterns, and not every pattern is accompanied by a test. As far as I know, we don't fold an FP load into an integer operation. Am I wrong?

Remove a few useless patterns.

Hi Elena! To recap:

%0 = load i64* %p
%tmp2 = insertelement <2 x i64> zeroinitializer, i64 %0, i32 0
%1 = bitcast <2 x i64> %tmp2 to <8 x i16>
%2 = call <4 x i32> @llvm.x86.sse41.pmovsxwd(<8 x i16> %1)

%X = load <2 x i32>* %ptr
%Y = sext <2 x i32> %X to <2 x i64>

The other patterns cover the expected scalar load + extension, or vector load + extension (for intrinsics).
-Ahmed

I re-checked all patterns again. I think you can commit the patch. Thanks

Revision Contents
Diff 17012
llvm/trunk/lib/Target/X86/X86InstrSSE.td
llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h
llvm/trunk/test/CodeGen/X86/avx2-pmovxrm-intrinsics.ll
llvm/trunk/test/CodeGen/X86/sse41-pmovxrm-intrinsics.ll
llvm/trunk/test/CodeGen/X86/vector-sext.ll
llvm/trunk/test/CodeGen/X86/vector-zext.ll