This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use vmovq for v4i64/v4f64/v8i64/v8f64 vzmovl.
ClosedPublic

Authored by craig.topper on Jun 15 2019, 12:19 AM.

Details

Summary

We already use vmovq for v2i64/v2f64 vzmovl, but we were using a
blend with 0 for v4i64/v4f64 and vmovsd with 0 for v8i64/v8f64.

I think the blend with 0 or scalar movss/d is only needed for
vXi32 where we don't have an instruction that can move 32
bits from one xmm to another while zeroing upper bits.
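
As a rough illustration of why the two lowerings are interchangeable for 64-bit elements, here is a scalar C++ model of both. It is only a sketch, not code from the patch: `Vec`, `blend_with_zero` and `movq_zext` are hypothetical names, and the model assumes the usual behaviour that a VEX-encoded vmovq between xmm registers zeroes bits 64..127 and that writing an xmm register zeroes the upper bits of the containing ymm/zmm.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a vector of N 64-bit lanes
// (N = 2, 4, 8 for v2i64, v4i64, v8i64).
template <std::size_t N>
using Vec = std::array<std::uint64_t, N>;

// Model of the old lowering: blend the source with an all-zero vector,
// taking lane 0 from the source and every other lane from zero.
template <std::size_t N>
Vec<N> blend_with_zero(const Vec<N>& src) {
    Vec<N> zero{};  // value-initialized, all lanes are 0
    Vec<N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = (i == 0) ? src[i] : zero[i];
    return out;
}

// Model of the new lowering: vmovq on the low xmm part.
template <std::size_t N>
Vec<N> movq_zext(const Vec<N>& src) {
    static_assert(N >= 2, "an xmm register holds at least two 64-bit lanes");
    Vec<N> out{};
    out[0] = src[0];  // low 64 bits are copied
    out[1] = 0;       // bits 64..127 are zeroed by the move itself
    for (std::size_t i = 2; i < N; ++i)
        out[i] = 0;   // lanes above the xmm are zeroed by the VEX-encoded write
    return out;
}
```

Both functions return the vzmovl result (lane 0 preserved, all other lanes zero) for any lane count, which is the property the change relies on. The model says nothing about ports or latency.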

Event Timeline

craig.topper created this revision. Jun 15 2019, 12:19 AM
Herald added a project: Restricted Project. Jun 15 2019, 12:19 AM
Herald added a subscriber: hiraditya.

Don't PR34876 and PR34874 suggest that we should prefer BLEND over MOVSD/MOVQ etc.?

I think those titles were based on what I thought we were doing. I was probably confused by v4i64/v8i64 using a blend and didn't notice that v2i64 doesn't. I think movq is available on ports 0/1/5 on Sandy Bridge.

LGTM - I can't find a case where MOVQ is worse than VBLEND/VPBLEND - please can you update PR34876 and PR34874 to make that clear.

RKSimon accepted this revision. Jun 21 2019, 3:35 AM
This revision is now accepted and ready to land. Jun 21 2019, 3:35 AM
craig.topper closed this revision. Jun 23 2019, 11:38 AM

Committed in r364079