The movzbl instruction can be folded into vpinsrb or vmovd when it was actually lowered from anyext.
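The reasoning can be sketched with a small model. This is only an illustration, not the patch itself: the helper names (`movzbl`, `vpinsrb`, `vmovd_low_byte`) are hypothetical Python stand-ins for the instructions, and the model assumes that vpinsrb reads only the low 8 bits of its 32-bit source register and that an anyext value has unspecified upper bits.

```python
from typing import List


def movzbl(reg32: int) -> int:
    """Model of movzbl: keep the low byte, zero the upper 24 bits."""
    return reg32 & 0xFF


def vpinsrb(vec: List[int], reg32: int, idx: int) -> List[int]:
    """Model of vpinsrb: replace byte lane `idx` with the low byte of reg32.

    Assumption: only the low 8 bits of the 32-bit source are read.
    """
    out = list(vec)
    out[idx & 0xF] = reg32 & 0xFF
    return out


def vmovd_low_byte(reg32: int) -> int:
    """Model of the only part of a vmovd result that an anyext user may rely on.

    vmovd copies all 32 bits, but for an anyext source the upper 24 bits
    are unspecified, so only the low byte is meaningful.
    """
    return (reg32 & 0xFFFFFFFF) & 0xFF


def fold_is_safe(reg32: int, idx: int) -> bool:
    """Check that dropping movzbl does not change the observable result."""
    vec = list(range(16))
    same_insert = vpinsrb(vec, movzbl(reg32), idx) == vpinsrb(vec, reg32, idx)
    same_move = vmovd_low_byte(movzbl(reg32)) == vmovd_low_byte(reg32)
    return same_insert and same_move
```

In this model the result is identical with or without the preceding movzbl for any upper-bit garbage, which is why the extension is redundant in the anyext case. It would not be redundant if the value came from a real zext feeding vmovd, since the upper bits would then be observable.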
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/X86/avx-intrinsics-fast-isel.ll | ||
---|---|---|
2141 | @craig.topper, I am wondering why we keep movzbl before vpinsrb and vmovd. Is it there deliberately to avoid partial register stalls? |
llvm/test/CodeGen/X86/avx-intrinsics-fast-isel.ll | ||
---|---|---|
2141 | Yes, I think that was the original reason. With 64-bit we almost never use ah, bh, ch, dh. And Intel CPUs since SNB don’t have merges unless the H registers have been written. So it probably doesn’t matter much anymore. |
We've lost some checks, probably because the shared check-prefixes no longer work.
llvm/test/CodeGen/X86/load-scalar-as-vector.ll | ||
---|---|---|
5 | I think you need to split the AVX into: --check-prefixes=AVX,AVX1 --check-prefixes=AVX,AVX2 | |
llvm/test/CodeGen/X86/sse41-intrinsics-fast-isel.ll | ||
---|---|---|
6 | You may need to add X86-AVX1 (and X64-AVX512 below) |
Have you seen any cases where we need movzwl?
llvm/lib/Target/X86/X86InstrAVX512.td | ||
---|---|---|
11720 | I don't think you need HasAVX? |
LGTM (it doesn't look like movzwl is an issue given we try so hard to avoid 16-bit ops)