Index: cfe/trunk/docs/ReleaseNotes.rst =================================================================== --- cfe/trunk/docs/ReleaseNotes.rst +++ cfe/trunk/docs/ReleaseNotes.rst @@ -56,8 +56,12 @@ Non-comprehensive list of changes in this release ------------------------------------------------- -- ... - +- For X86 target, -march=skylake-avx512, -march=icelake-client, + -march=icelake-server, -march=cascadelake, -march=cooperlake will default to + not using 512-bit zmm registers in vectorized code unless 512-bit intrinsics + are used in the source code. 512-bit operations are known to cause the CPUs + to run at a lower frequency which can impact performance. This behavior can be + changed by passing -mprefer-vector-width=512 on the command line. New Compiler Flags ------------------ Index: llvm/trunk/docs/ReleaseNotes.rst =================================================================== --- llvm/trunk/docs/ReleaseNotes.rst +++ llvm/trunk/docs/ReleaseNotes.rst @@ -96,6 +96,10 @@ be passed in ZMM registers for calls and returns. Previously they were passed in two YMM registers. Old behavior can be enabled by passing -x86-enable-old-knl-abi +* -mprefer-vector-width=256 is now the default behavior skylake-avx512 and later + Intel CPUs. This tries to limit the use of 512-bit registers which can cause a + decrease in CPU frequency on these CPUs. This can be re-enabled by passing + -mprefer-vector-width=512 to clang or passing -mattr=-prefer-256-bit to llc. Changes to the AMDGPU Target ----------------------------- Index: llvm/trunk/lib/Target/X86/X86.td =================================================================== --- llvm/trunk/lib/Target/X86/X86.td +++ llvm/trunk/lib/Target/X86/X86.td @@ -601,6 +601,7 @@ // Skylake-AVX512 list SKXAdditionalFeatures = [FeatureAVX512, + FeaturePrefer256Bit, FeatureCDI, FeatureDQI, FeatureBWI, @@ -634,6 +635,7 @@ // Cannonlake list CNLAdditionalFeatures = [FeatureAVX512, + FeaturePrefer256Bit, FeatureCDI, FeatureDQI, FeatureBWI, Index: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll =================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll @@ -1,6 +1,14 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit,avx512vbmi | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI +; Make sure CPUs default to prefer-256-bit. avx512vnni isn't interesting as it just adds an isel peephole for vpmaddwd+vpaddd +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512 +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cascadelake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512 +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cooperlake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512 +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=cannonlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-client | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-server | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=tigerlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI ; This file primarily contains tests for specific places in X86ISelLowering.cpp that needed be made aware of the legalizer not allowing 512-bit vectors due to prefer-256-bit even though AVX512 is enabled.