This is an archive of the discontinued LLVM Phabricator instance.

Use vmovss to handle inserting an element into index 0 of a v8f32 vector of zeros.
ClosedPublic

Authored by craig.topper on Feb 25 2015, 11:22 PM.

Details

Reviewers
chandlerc
Summary

This fixes the following case from PR22685 so that it is handled by vmovss:

define <8 x float> @mov00(float* %ptr) {
  %val = load float* %ptr
  %vec = insertelement <8 x float> zeroinitializer, float %val, i32 0
  ret <8 x float> %vec
}

Instead of this:

vxorps %xmm0, %xmm0, %xmm0
vinsertps $0, (%rdi), %xmm0, %xmm0 ## xmm0 = mem[0],xmm0[1,2,3]
vxorps %ymm1, %ymm1, %ymm1
vinsertf128 $0, %xmm0, %ymm1, %ymm0
retq

Event Timeline

chandlerc accepted this revision. Feb 25 2015, 11:30 PM

Looks fine. Add the floating point test cases as well?

This revision is now accepted and ready to land. Feb 25 2015, 11:30 PM

spatel added a subscriber: spatel. Mar 3 2015, 8:59 AM

Can we move the mask check into lower256BitVectorShuffle()?

Otherwise, we'll need to duplicate the logic to catch the following cases:

define <4 x i64> @mov_v4i64(i64* %ptr) {
  %val = load i64, i64* %ptr
  %i0 = insertelement <4 x i64> zeroinitializer, i64 %val, i32 0
  ret <4 x i64> %i0
}

define <8 x i32> @mov_v8i32(i32* %ptr) {
  %val = load i32, i32* %ptr
  %i0 = insertelement <8 x i32> zeroinitializer, i32 %val, i32 0
  ret <8 x i32> %i0
}

define <16 x i16> @mov_v16i16(i16* %ptr) {
  %val = load i16, i16* %ptr
  %i0 = insertelement <16 x i16> zeroinitializer, i16 %val, i32 0
  ret <16 x i16> %i0
}

define <32 x i8> @mov_v32i8(i8* %ptr) {
  %val = load i8, i8* %ptr
  %i0 = insertelement <32 x i8> zeroinitializer, i8 %val, i32 0
  ret <32 x i8> %i0
}
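
A minimal C++ sketch of why one shared mask check could cover every 256-bit element type. It is a standalone model, not LLVM's code: the function name isLowElementInsertIntoZero, the mask encoding (negative = undef, 0..N-1 = first input, N..2N-1 = second input), and the Zeroable flags are all assumptions made for illustration. The check depends only on the lane count, so the same logic would apply to the v8f32 case in the summary and to the four integer cases above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not LLVM's API.
// Mask[i] < 0        : lane i is undef
// Mask[i] in [0, N)  : lane i comes from the first input
// Mask[i] in [N, 2N) : lane i comes from the second input
// Zeroable[i] is true when lane i is known to be zero.
bool isLowElementInsertIntoZero(const std::vector<int> &Mask,
                                const std::vector<bool> &Zeroable) {
  assert(Zeroable.size() == Mask.size() && "expected one flag per lane");
  const std::size_t N = Mask.size();
  if (N == 0)
    return false;
  // Lane 0 must take element 0 of the second input (the inserted scalar).
  if (Mask[0] != static_cast<int>(N))
    return false;
  // Every other lane must be undef or known zero.
  for (std::size_t I = 1; I < N; ++I)
    if (Mask[I] >= 0 && !Zeroable[I])
      return false;
  return true;
}
```

Under these assumptions, a caller would run the check once per shuffle before dispatching on the element type, rather than repeating it in each per-type lowering routine.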

spatel added a comment. Mar 3 2015, 9:52 AM

I added more test cases to the bug:
http://llvm.org/bugs/show_bug.cgi?id=22685#c4

This may require more than one patch to get right, but I think that we should be handling all subtypes of 256-bit vectors.

Committed my initial change in r23135. Still need to review the additional test cases.

craig.topper closed this revision. Oct 20 2015, 8:36 AM