This is an archive of the discontinued LLVM Phabricator instance.

[AVX512] Bring back vector-shuffle lowering support through broadcasts
Needs RevisionPublic

Authored by rob.khasanov on Oct 21 2014, 11:02 AM.

Details

Reviewers
chandlerc
Summary

I found that after your commit at rev219046 512-bit broadcasts lowering become non-optimal. Most of tests on broadcasting and embedded broadcasting were changed and they doesn’t produce efficient code.

Example below is from your commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll):

define <16 x i32> @_inreg16xi32(i32 %a) {
; CHECK-LABEL: _inreg16xi32:
; CHECK: ## BB#0:
-; CHECK-NEXT: vpbroadcastd %edi, %zmm0
+; CHECK-NEXT: vmovd %edi, %xmm0
+; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
; CHECK-NEXT: retq
%b = insertelement <16 x i32> undef, i32 %a, i32 0

%c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
 ret <16 x i32> %c

}

Here, 256-bit broadcast was generated instead of 512-bit one.

I investigated the reason, found that in your version of vector-shuffle lowering there is no AVX-512 support.

In this patch

  1. I added vector-shuffle lowering through broadcasts
  2. Removed asserts and branches likes because this is incorrect
  3. assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
  4. Fixed lowering tests

Diff Detail

Event Timeline

rob.khasanov retitled this revision from to [AVX512] Bring back vector-shuffle lowering support through broadcasts.
rob.khasanov updated this object.
rob.khasanov edited the test plan for this revision. (Show Details)
rob.khasanov added a reviewer: chandlerc.
rob.khasanov added subscribers: Unknown Object (MLST), anemet, delena.

Why do you need so many functions:
lowerV16F32VectorShuffle
lowerV8I64VectorShuffle
lowerV16I32VectorShuffle
.. ?

I don't see a big diff between them.

  • Elena
chandlerc edited edge metadata.Mar 29 2015, 2:09 PM

Sorry for the long delay...

lib/Target/X86/X86ISelLowering.cpp
10235

(ignore this)

10313–10319

These changes don't make sense to me. My understanding was that DQI was needed to shuffle integers.

chandlerc requested changes to this revision.Mar 29 2015, 2:09 PM
chandlerc edited edge metadata.
This revision now requires changes to proceed.Mar 29 2015, 2:09 PM