Improves the 8-byte case from PR42674.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
35451 ↗ | (On Diff #214576) | We can easily support v2i8/v4i8 as well by replacing this with an insertion into a zero v16i8 vector. |
That doesn't seem profitable for v2i8; we'd be better off extracting both elements and doing a scalar add. For v4i8, I'm not sure. PSADBW is 5 cycles on some CPUs, if I remember right, so the normal expansion is probably faster on those CPUs.
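For anyone following the thread, here is a minimal scalar sketch of the idea under discussion: placing a short byte vector into an otherwise zero 16-byte vector and taking PSADBW against zero, which yields the byte sum in the low 64-bit lane. This is not the code from the patch. `psadbw_model` and `reduce_add_via_psadbw` are hypothetical names used only for illustration, and the model emulates the instruction in plain C++ rather than emitting it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Scalar model of 128-bit PSADBW: for each 8-byte half, sum the absolute
// differences of the corresponding bytes and store the result in that
// half's 64-bit lane. (Illustrative emulation, not the real instruction.)
std::array<std::uint64_t, 2> psadbw_model(const std::array<std::uint8_t, 16> &a,
                                          const std::array<std::uint8_t, 16> &b) {
  std::array<std::uint64_t, 2> out{{0, 0}};
  for (std::size_t i = 0; i < 16; ++i) {
    int diff = static_cast<int>(a[i]) - static_cast<int>(b[i]);
    out[i / 8] += static_cast<std::uint64_t>(std::abs(diff));
  }
  return out;
}

// Hypothetical helper: add-reduce N bytes (N <= 8) by inserting them into a
// zero v16i8-style vector and taking PSADBW against zero. The result is
// truncated to 8 bits, matching an i8 add reduction that wraps on overflow.
template <std::size_t N>
std::uint8_t reduce_add_via_psadbw(const std::array<std::uint8_t, N> &v) {
  static_assert(N <= 8, "only the low 64-bit lane is used in this sketch");
  std::array<std::uint8_t, 16> wide{};  // all elements start at zero
  for (std::size_t i = 0; i < N; ++i)
    wide[i] = v[i];
  const std::array<std::uint8_t, 16> zero{};
  return static_cast<std::uint8_t>(psadbw_model(wide, zero)[0]);
}
```

With a 4-byte input of 1, 2, 3, 4 the helper returns 10. Whether this beats the normal expansion for v4i8 depends on PSADBW latency on the target CPU, which the sketch does not model.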