We already perform horizontal add/sub if we extract from elements 0 and 1; this patch extends that to extraction indices other than 0/1 (as long as they are from the lowest 128-bit vector).
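For readers less familiar with the instruction, here is a minimal scalar sketch of why the extension works, assuming the usual 128-bit HADDPS semantics (adjacent pairs of the first operand, then of the second). The names `V4`, `hadd` and `addAdjacentViaHadd` are hypothetical and only illustrate the arithmetic; this is not the code in X86ISelLowering.cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 128-bit vector of four floats.
using V4 = std::array<float, 4>;

// Scalar model of a 128-bit horizontal add: sums adjacent pairs of a,
// then adjacent pairs of b.
V4 hadd(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Illustrative helper (an assumption, not the patch itself): computes
// x[i] + x[i + 1] for an even i by reading element i / 2 of hadd(x, x).
float addAdjacentViaHadd(const V4 &x, std::size_t i) {
  assert(i % 2 == 0 && i + 1 < x.size() && "pair must start at an even index");
  return hadd(x, x)[i / 2];
}
```

In this model, a pair starting at index 0 is read back from element 0 of the result, while a pair starting at index 2 is read back from element 1, so the non-zero case needs one extra element move.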
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I might've overlooked it - do we have test coverage for a 256-bit source vector where we extract from the upper elements?
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
19039–19043 | Need to update this comment to something like: |
test/CodeGen/X86/haddsub.ll | ||
---|---|---|
1012 | We still miss folding to extractf128+hadd+permilps - but the cost-benefit isn't great. |
test/CodeGen/X86/haddsub.ll | ||
---|---|---|
1012 | FYI - I have a follow up mini-patch that will fix this. |
Need to update this comment to something like:
This is a shuffle or free if the left index is 0.