This is an archive of the discontinued LLVM Phabricator instance.

[X86][llvm-exegesis] Exploring vector insert/extract
Abandoned, Public

Authored by lebedev.ri on Mar 3 2020, 2:26 AM.

Details

Summary

This is a proof of concept. I'm not sure we want this,
so I didn't spend much time on it. If we do, I can improve it.

It's not very useful without something like D60000,
but even now I'm seeing somewhat confusing results:

---
mode:            latency
key:
  instructions:
    - 'PEXTRDrr R9D XMM6 i_0x0'
    - 'MOV64toPQIrr XMM6 R9'
  config:          ''
  register_initial_values:
    - 'XMM6=0x0'
    - 'R9=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 10.0292, per_snippet_value: 20.0584 }
error:           ''
info:            Repeating two instructions
assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F34244883C41049B9000000000000000066410F3A16F10066490F6EF166410F3A16F10066490F6EF166410F3A16F10066490F6EF166410F3A16F10066490F6EF166410F3A16F10066490F6EF166410F3A16F10066490F6EF166410F3A16F10066490F6EF166410F3A16F10066490F6EF1C3
...
---
mode:            latency
key:
  instructions:
    - 'PEXTRDrr EBP XMM8 i_0x0'
    - 'VMOV64toPQIrr XMM8 RBP'
  config:          ''
  register_initial_values:
    - 'XMM8=0x0'
    - 'RBP=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 10.0344, per_snippet_value: 20.0688 }
error:           ''
info:            Repeating two instructions
assembled_snippet: 554883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F04244883C41048BD000000000000000066440F3A16C500C461F96EC566440F3A16C500C461F96EC566440F3A16C500C461F96EC566440F3A16C500C461F96EC566440F3A16C500C461F96EC566440F3A16C500C461F96EC566440F3A16C500C461F96EC566440F3A16C500C461F96EC55DC3
...
---
mode:            latency
key:
  instructions:
    - 'EXTRACTPSrr EDI XMM2 i_0x0'
    - 'VPINSRWrr XMM2 XMM7 EDI i_0x1'
  config:          ''
  register_initial_values:
    - 'XMM2=0x0'
    - 'XMM7=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 11.0299, per_snippet_value: 22.0598 }
error:           ''
info:            Repeating two instructions
assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F14244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F3C244883C410660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701660F3A17D700C5C1C4D701C3
...
---
mode:            latency
key:
  instructions:
    - 'EXTRACTPSrr ESI XMM6 i_0x0'
    - 'PINSRDrr XMM6 XMM6 ESI i_0x1'
  config:          ''
  register_initial_values:
    - 'XMM6=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 11.0328, per_snippet_value: 22.0656 }
error:           ''
info:            Repeating two instructions
assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F34244883C410660F3A17F600660F3A22F601660F3A17F600660F3A22F601660F3A17F600660F3A22F601660F3A17F600660F3A22F601660F3A17F600660F3A22F601660F3A17F600660F3A22F601660F3A17F600660F3A22F601660F3A17F600660F3A22F601C3
...

So extraction from the 0th lane isn't actually any faster?
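
For reading the numbers above, here is a minimal sketch of how the two reported fields appear to relate. It assumes that `per_snippet_value` is simply `value` multiplied by the number of instructions in the snippet (two here, per "Repeating two instructions"), which is consistent with all four runs but is inferred from the data rather than taken from the llvm-exegesis sources. The `RUNS` table and `roundtrip_cycles` helper are hypothetical names used only for illustration.

```python
# Figures copied from the four bdver2 latency runs above.
# Each tuple: (extract opcode, insert opcode, value, per_snippet_value)
RUNS = [
    ("PEXTRDrr", "MOV64toPQIrr", 10.0292, 20.0584),
    ("PEXTRDrr", "VMOV64toPQIrr", 10.0344, 20.0688),
    ("EXTRACTPSrr", "VPINSRWrr", 11.0299, 22.0598),
    ("EXTRACTPSrr", "PINSRDrr", 11.0328, 22.0656),
]

# Each snippet is an extract/insert pair ("Repeating two instructions").
INSTRUCTIONS_PER_SNIPPET = 2


def roundtrip_cycles(value, n=INSTRUCTIONS_PER_SNIPPET):
    """Assumed relation: per_snippet_value == value * instructions per snippet."""
    return value * n


for extract, insert, value, per_snippet in RUNS:
    total = roundtrip_cycles(value)
    # Check the assumed relation against the reported per_snippet_value.
    assert abs(total - per_snippet) < 1e-6
    print(f"{extract} -> {insert}: ~{total:.1f} cycles per round trip")
```

Under that assumption, each measurement is the latency of the whole extract/insert dependency chain, so the split between the extract and the insert cannot be read off these numbers alone.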

Event Timeline

lebedev.ri created this revision. Mar 3 2020, 2:26 AM
RKSimon added inline comments. Mar 3 2020, 10:18 AM
llvm/tools/llvm-exegesis/lib/X86/Target.cpp
789

The VPINSR*Z* variants still work on 128-bit vectors, they just use EVEX encoding (supporting predicate masks etc.)

lebedev.ri marked 2 inline comments as done.

Addressed review note: only AVX512 instructions work on 256-bit subvectors.
Also, tentatively added AVX512 instructions.

llvm/tools/llvm-exegesis/lib/X86/Target.cpp
789

Hm, and the same goes for VINSERTPSZ, as far as I can tell.
Must be a copy-paste gone wrong.

lebedev.ri abandoned this revision. Aug 12 2020, 2:23 PM