This allows us to use PSHUFB for v8i16 (and v4i32 after D42308) and VPERMD/VPERMPS for v4i64/v4f64 variable shuffles.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Target/X86/X86ISelLowering.cpp | |
---|---|
7986 | This could maybe use a comment. These adds are conceptually ORs, right? There shouldn't be any carries into the next element. Can you use Scale in place of 1ull in the loop instead of multiplying afterwards? |
test/CodeGen/X86/var-permute-128.ll | |
47 | Super annoying that padd doesn't print its constant pool. We should finish D37184. |
test/CodeGen/X86/var-permute-256.ll | |
1290 | Add a DQI command line, since I think this math would be able to use VPMULLQ? |
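
For readers following the inline comment on X86ISelLowering.cpp, here is a minimal standalone sketch of the index-scaling arithmetic under discussion. It is not the code from the patch: `IndexConstants`, `makeIndexConstants`, and `widenIndex` are hypothetical names, and the sketch assumes the scale (narrow lanes per wide element) is a power of two and that the index is small enough not to overflow a narrow lane. Under those assumptions the final add never carries between lanes, which is why it behaves like an OR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the two constants applied to each wide index lane.
struct IndexConstants {
  uint64_t scale;  // 'scale' replicated into every narrow lane
  uint64_t offset; // lane i holds the value i
};

// Build both constants in one loop, placing 'scale' directly into each lane
// rather than building a pattern of ones and multiplying afterwards.
// 'scale' is the number of narrow lanes per wide element (2 for i16 -> i8),
// 'dstBits' is the narrow lane width in bits (8 for PSHUFB byte indices).
IndexConstants makeIndexConstants(unsigned scale, unsigned dstBits) {
  IndexConstants c{0, 0};
  for (uint64_t i = 0; i != scale; ++i) {
    c.scale |= uint64_t(scale) << (i * dstBits);
    c.offset |= i << (i * dstBits);
  }
  return c;
}

// Turn one wide-element index into the narrow-lane indices that select the
// same element. Each lane of idx * c.scale is a multiple of 'scale', so its
// low bits are clear and adding the per-lane offset cannot carry.
uint64_t widenIndex(uint64_t idx, IndexConstants c) {
  assert(((idx * c.scale) & c.offset) == 0 && "add would not act as an OR");
  return idx * c.scale + c.offset;
}
```

With `makeIndexConstants(2, 8)` the constants are 0x0202 and 0x0100, so a v8i16 index of 3 becomes 0x0706, i.e. source bytes 6 and 7. With `makeIndexConstants(2, 32)` the same arithmetic maps a 64-bit element index onto a pair of 32-bit lane indices.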
Added AVX512DQ tests. I've raised PR36191, as the failure to fold the broadcasts into the zmm for the 'fake' ymm vpmullq is rather annoying.